opentelemetry-collector logs

collecting logs with otel

SEAN K.H. LIAO

logs

The formalized version of printf debugging. Anyway, you run a lot of applications written by a lot of different people. Their logs are all over the place; what can you do?

Recently the OpenTelemetry Collector, with the donation of stanza / opentelemetry-log-collection, gained the ability to collect logs, do some light processing, and ship them elsewhere. It's still early days, so it's somewhat limited in what it can do.

Let's look at log outputs from some Go logging libraries and how you'd parse them.

std log

While the log package does have flags to control its output, in the default configuration it just prepends the time. It also has no structure in its messages:

2021/06/17 22:37:44 Something happened foo=bar

This can be parsed with:

receivers:
  filelog/std:
    include:
      - std.log
    operators:
      - type: regex_parser
        regex: "^(?P<timestamp_field>.{19}) (?P<message>.*)$"
        timestamp:
          parse_from: timestamp_field
          layout_type: strptime
          layout: "%Y/%m/%d %T"

Resulting in (otel logging exporter output):

Resource labels:
     -> host.name: STRING(eevee)
     -> os.type: STRING(LINUX)
InstrumentationLibraryLogs #0
InstrumentationLibrary
LogRecord #0
Timestamp: 2021-06-17 22:37:44 +0000 UTC
Severity: Undefined
ShortName:
Body: {
     -> message: STRING(Something happened foo=bar)
}

json

There are a lot of structured loggers out there, I happen to like zerolog.

{"level":"error","error":"oops","foo":"bar","time":"2021-06-17T22:38:02+02:00","message":"something bad happened"}

JSON is well supported.

Note: the strptime parser seems to take issue with the timezone for some reason, so I'm using the Go time parser instead.

receivers:
  filelog/json:
    include:
      - json.log
    include_file_name: false
    operators:
      - type: json_parser
        timestamp:
          parse_from: time
          layout_type: gotime
          layout: 2006-01-02T15:04:05Z07:00
        severity:
          parse_from: level

output:

Resource labels:
     -> host.name: STRING(eevee)
     -> os.type: STRING(LINUX)
InstrumentationLibraryLogs #0
InstrumentationLibrary
LogRecord #0
Timestamp: 2021-06-17 20:38:02 +0000 UTC
Severity: Error
ShortName:
Body: {
     -> error: STRING(oops)
     -> foo: STRING(bar)
     -> message: STRING(something bad happened)
}

klog / glog

klog is Kubernetes' standard logger, mostly based on glog. Recent versions have gained support for structured logging, but this is where we reach the limits of the current log parser.

Unlike Loki, it doesn't have a logfmt parser, meaning your key=value pairs are just stuck there as a single string. All the more reason to use JSON loggers then...

E0617 22:38:02.013247   76356 main.go:57]  "msg"="something bad happened" "error"="oops"

config:

receivers:
  filelog/klog:
    include:
      - klog.log
    include_file_name: false
    operators:
      - type: regex_parser
        # Lmmdd hh:mm:ss.uuuuuu threadid file:line]
        regex: '^(?P<level>[EI])(?P<timestamp_field>.{20})\s+(?P<threadid>\d+)\s(?P<file>\w+\.go):(?P<line>\d+)]\s+(?P<message>.*)$'
        timestamp:
          parse_from: timestamp_field
          layout: "%m%d %H:%M:%S.%f"
        severity:
          parse_from: level
          mapping:
            error: E
            info: I

result

Resource labels:
     -> host.name: STRING(eevee)
     -> os.type: STRING(LINUX)
InstrumentationLibraryLogs #0
InstrumentationLibrary
LogRecord #0
Timestamp: 2021-06-17 22:38:02.013247 +0000 UTC
Severity: Error
ShortName:
Body: {
     -> file: STRING(main.go)
     -> line: STRING(57)
     -> msg: STRING("msg"="something bad happened" "error"="oops"  )
     -> threadid: STRING(76356)
}