opentelemetry-collector logs

collecting logs with otel

SEAN K.H. LIAO

logs

The formalized version of printf debugging. Anyway, you run a lot of applications written by a lot of different people. Their logs are all over the place. What can you do?

Recently the OpenTelemetry Collector, with the donation of stanza / opentelemetry-log-collection, gained the ability to collect logs, do some light processing, and ship them elsewhere. It's still early, so it's somewhat limited in what it can do.

Let's look at log outputs from some Go logging libraries and how you'd parse them.

std log

While the log package does have flags to control its output, the default config just adds the date and time. The messages themselves have no structure:

2021/06/17 22:37:44 Something happened foo=bar
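For reference, a line of that shape could come from a program like the sketch below. The helper name stdLogLine is made up for illustration; log.New and log.LstdFlags are the standard library's, and LstdFlags is what the default logger uses.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// stdLogLine (hypothetical helper) writes one message through a logger
// configured with log.LstdFlags, i.e. date and time only, and returns
// the raw line that was written.
func stdLogLine(msg string) string {
	var buf bytes.Buffer
	logger := log.New(&buf, "", log.LstdFlags)
	logger.Println(msg)
	return buf.String()
}

func main() {
	// Prints something like: 2021/06/17 22:37:44 Something happened foo=bar
	// (the timestamp is whatever the local clock says).
	fmt.Print(stdLogLine("Something happened foo=bar"))
}
```

The timestamp prefix is always 19 characters wide, which is what the parser config relies on.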

This can be parsed with:

receivers:
  filelog/std:
    include:
      - std.log
    operators:
      - type: regex_parser
        regex: "^(?P<timestamp_field>.{19}) (?P<message>.*)$"
        timestamp:
          parse_from: timestamp_field
          layout_type: strptime
          layout: "%Y/%m/%d %T"
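If you want to sanity check the regex before pointing the collector at real files, a small Go program is enough. This is a sketch that assumes the regex_parser accepts Go regexp syntax (the (?P<name>...) groups suggest it does); parseStdLine is a hypothetical helper, and the Go layout "2006/01/02 15:04:05" is my equivalent of the strptime layout "%Y/%m/%d %T".

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// stdLine is the same pattern as in the filelog/std receiver config.
var stdLine = regexp.MustCompile(`^(?P<timestamp_field>.{19}) (?P<message>.*)$`)

// parseStdLine (hypothetical helper) splits a std log line into its
// timestamp and message using the named capture groups.
func parseStdLine(line string) (time.Time, string, error) {
	m := stdLine.FindStringSubmatch(line)
	if m == nil {
		return time.Time{}, "", fmt.Errorf("line does not match: %q", line)
	}
	ts, err := time.Parse("2006/01/02 15:04:05", m[stdLine.SubexpIndex("timestamp_field")])
	if err != nil {
		return time.Time{}, "", err
	}
	return ts, m[stdLine.SubexpIndex("message")], nil
}

func main() {
	ts, msg, err := parseStdLine("2021/06/17 22:37:44 Something happened foo=bar")
	if err != nil {
		panic(err)
	}
	fmt.Println(ts)
	fmt.Println(msg)
}
```

The log line carries no timezone, so time.Parse treats it as UTC here.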

Resulting in (otel logging exporter output):

Resource labels:
     -> host.name: STRING(eevee)
     -> os.type: STRING(LINUX)
InstrumentationLibraryLogs #0
InstrumentationLibrary
LogRecord #0
Timestamp: 2021-06-17 22:37:44 +0000 UTC
Severity: Undefined
ShortName:
Body: {
     -> message: STRING(Something happened foo=bar)
}

json

There are a lot of structured loggers out there; I happen to like zerolog.

{"level":"error","error":"oops","foo":"bar","time":"2021-06-17T22:38:02+02:00","message":"something bad happened"}

JSON is well supported.

Note: the strptime parser seems to take issue with the timezone for some reason, so I'm using the Go time parser (layout_type: gotime) instead.

receivers:
  filelog/json:
    include:
      - json.log
    include_file_name: false
    operators:
      - type: json_parser
        timestamp:
          parse_from: time
          layout_type: gotime
          layout: 2006-01-02T15:04:05Z07:00
        severity:
          parse_from: level
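The gotime layout is the same reference-time layout that Go's time.Parse takes, so you can try it against the sample line directly. A minimal sketch, with parseRecordTime as a hypothetical helper that only looks at the time field:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// record holds only the field this sketch cares about.
type record struct {
	Time string `json:"time"`
}

// parseRecordTime (hypothetical helper) decodes one JSON log line and
// parses its "time" field with the layout from the collector config,
// returning the result normalized to UTC.
func parseRecordTime(line string) (time.Time, error) {
	var rec record
	if err := json.Unmarshal([]byte(line), &rec); err != nil {
		return time.Time{}, err
	}
	ts, err := time.Parse("2006-01-02T15:04:05Z07:00", rec.Time)
	if err != nil {
		return time.Time{}, err
	}
	return ts.UTC(), nil
}

func main() {
	line := `{"level":"error","error":"oops","foo":"bar","time":"2021-06-17T22:38:02+02:00","message":"something bad happened"}`
	ts, err := parseRecordTime(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(ts) // 2021-06-17 20:38:02 +0000 UTC
}
```

The +02:00 offset is why a record logged at 22:38:02 local time ends up as 20:38:02 UTC.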

output:

Resource labels:
     -> host.name: STRING(eevee)
     -> os.type: STRING(LINUX)
InstrumentationLibraryLogs #0
InstrumentationLibrary
LogRecord #0
Timestamp: 2021-06-17 20:38:02 +0000 UTC
Severity: Error
ShortName:
Body: {
     -> error: STRING(oops)
     -> foo: STRING(bar)
     -> message: STRING(something bad happened)
}

klog / glog

klog is Kubernetes' standard logger, mostly based on glog, and recent versions have gained support for structured logging. But this is where we reach the limits of the current log parser.

Unlike Loki, it doesn't have a logfmt parser, meaning your key=value pairs are just stuck inside the message as one string. All the more reason to use JSON loggers then...

E0617 22:38:02.013247   76356 main.go:57]  "msg"="something bad happened" "error"="oops"

config:

receivers:
  filelog/klog:
    include:
      - klog.log
    include_file_name: false
    operators:
      - type: regex_parser
        # Lmmdd hh:mm:ss.uuuuuu threadid file:line]
        regex: '^(?P<level>[EI])(?P<timestamp_field>.{20})\s+(?P<threadid>\d+)\s(?P<file>\w+\.go):(?P<line>\d+)]\s+(?P<message>.*)$'
        timestamp:
          parse_from: timestamp_field
          layout: "%m%d %H:%M:%S.%f"
        severity:
          parse_from: level
          mapping:
            error: E
            info: I
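A longer regex like this one is easier to get right if you try it on a sample line first. The sketch below again assumes Go regexp syntax; parseKlogLine is a hypothetical helper that returns every named group as a map entry.

```go
package main

import (
	"fmt"
	"regexp"
)

// klogLine is the same pattern as in the filelog/klog receiver config.
var klogLine = regexp.MustCompile(`^(?P<level>[EI])(?P<timestamp_field>.{20})\s+(?P<threadid>\d+)\s(?P<file>\w+\.go):(?P<line>\d+)]\s+(?P<message>.*)$`)

// parseKlogLine (hypothetical helper) returns the named capture groups
// of a klog header line, or false if the line does not match.
func parseKlogLine(line string) (map[string]string, bool) {
	m := klogLine.FindStringSubmatch(line)
	if m == nil {
		return nil, false
	}
	fields := make(map[string]string)
	for i, name := range klogLine.SubexpNames() {
		if name != "" {
			fields[name] = m[i]
		}
	}
	return fields, true
}

func main() {
	line := `E0617 22:38:02.013247   76356 main.go:57]  "msg"="something bad happened" "error"="oops"`
	fields, ok := parseKlogLine(line)
	if !ok {
		panic("no match")
	}
	for _, k := range []string{"level", "timestamp_field", "threadid", "file", "line", "message"} {
		fmt.Printf("%s: %s\n", k, fields[k])
	}
}
```

Note that the level group only accepts E and I, so warning (W) and fatal (F) lines would not match this pattern as written.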

result

Resource labels:
     -> host.name: STRING(eevee)
     -> os.type: STRING(LINUX)
InstrumentationLibraryLogs #0
InstrumentationLibrary
LogRecord #0
Timestamp: 2021-06-17 22:38:02.013247 +0000 UTC
Severity: Error
ShortName:
Body: {
     -> file: STRING(main.go)
     -> line: STRING(57)
     -> msg: STRING("msg"="something bad happened" "error"="oops"  )
     -> threadid: STRING(73779)
}
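The key=value pairs are still one opaque string in the body. If you control whatever consumes these records, you could split them there. This is a rough sketch outside the collector, not something the log parser does: splitPairs is a hypothetical helper, and it assumes every key and value is double quoted with no escaped quotes inside.

```go
package main

import (
	"fmt"
	"regexp"
)

// pair matches one "key"="value" pair as written by klog's structured
// output. Assumption: values contain no escaped double quotes.
var pair = regexp.MustCompile(`"([^"]+)"="([^"]*)"`)

// splitPairs (hypothetical helper) turns the raw message string into a
// map of keys to values.
func splitPairs(msg string) map[string]string {
	out := make(map[string]string)
	for _, m := range pair.FindAllStringSubmatch(msg, -1) {
		out[m[1]] = m[2]
	}
	return out
}

func main() {
	kv := splitPairs(`"msg"="something bad happened" "error"="oops"`)
	fmt.Println(kv["msg"])
	fmt.Println(kv["error"])
}
```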