go profiling

performance issues? profile with pprof

SEAN K.H. LIAO

profiling

Got a performance issue? Other than staring at your code really hard, you can profile it: in other words, measure it as it does real work.

Go has native (runtime) integration for profiling a few low level things: the CPU (cpu, trace), memory (heap, allocs), sync (block, mutex), concurrency (goroutine, threadcreate), as well as custom profiles. These are exposed via runtime/pprof and net/http/pprof, with the output parseable by pprof, which is vendored into the go tool as go tool pprof.
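A minimal sketch of exposing the endpoints over HTTP (the listen address is just a placeholder): importing net/http/pprof for its side effects registers handlers under /debug/pprof/ on http.DefaultServeMux.

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/ handlers on the default mux
)

func main() {
	// serve the profiling endpoints; any mux that falls back to
	// http.DefaultServeMux will also work.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```

The profiles can then be pulled straight into the tool, e.g. go tool pprof http://localhost:6060/debug/pprof/heap.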

profile types

custom

The available API for custom profiles is really simple. It serves a single use case: keep a count of live references by the stacktrace it took to reach the Add call. That's really all there is: a count of stacktraces (unique execution paths), which can be queried / graphed later. The corollary is that profiling this way is only likely to be interesting for relatively low level things that are used across a range of paths, else you'd just get a linear path.
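A minimal sketch of what that looks like, using a made up profile name and connection type:

```go
package main

import (
	"os"
	"runtime/pprof"
)

// one profile per name, usually created at package init
var connProfile = pprof.NewProfile("example.com/conns")

type conn struct{}

func open() *conn {
	c := &conn{}
	connProfile.Add(c, 0) // count this value under the current call stack
	return c
}

func (c *conn) close() {
	connProfile.Remove(c) // drop the value once it's no longer live
}

func main() {
	c := open()
	defer c.close()
	connProfile.WriteTo(os.Stdout, 1) // debug=1 writes a human readable form
}
```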

cpu

The CPU profile shows you where time is spent in terms of function calls, while a trace gives you a low level view of processors, goroutines, and scheduling. Both are aided by the kernel, which provides some of the info when it sends SIGPROF.
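A minimal sketch of collecting a CPU profile around a chunk of work with runtime/pprof (the output filename and the work function are placeholders):

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	f, err := os.Create("cpu.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	doWork() // the code you actually want to measure
}

func doWork() {
	for i := 0; i < 1e7; i++ {
		_ = i * i
	}
}
```

Analyze the result with go tool pprof cpu.pprof; for an execution trace, runtime/trace.Start / trace.Stop collect the data and go tool trace views it.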

memory

heap is a view of where the live things are, while allocs is where memory is being allocated (and also a likely candidate for churn / pressure on the GC). Memory related information is only collected during / after a GC cycle.
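A minimal sketch of writing a heap profile on demand (the filename is a placeholder); forcing a GC first keeps the data up to date:

```go
package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {
	f, err := os.Create("heap.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	runtime.GC() // run a GC so the profile reflects current live memory
	if err := pprof.WriteHeapProfile(f); err != nil {
		log.Fatal(err)
	}
}
```

The allocs profile is available via pprof.Lookup("allocs"); it carries the same data as heap but defaults to showing allocated space rather than in-use space.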

sync

block points to the waiting things, while mutex points to the things holding the locks (causing others to wait).
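Both are disabled by default; a minimal sketch of turning them on and dumping the profiles (the sampling rates here are arbitrary):

```go
package main

import (
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {
	// rate=1 records every blocking event; larger values sample roughly
	// one event per rate nanoseconds spent blocked; 0 disables.
	runtime.SetBlockProfileRate(1)
	// on average 1 in rate mutex contention events are reported; 0 disables.
	runtime.SetMutexProfileFraction(100)

	// ... run the contended code ...

	// dump the profiles; debug=1 writes a human readable form.
	pprof.Lookup("block").WriteTo(os.Stdout, 1)
	pprof.Lookup("mutex").WriteTo(os.Stdout, 1)
}
```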

concurrency

goroutine dumps the stacktrace of every running goroutine; as you'd expect, it's an expensive operation. threadcreate is apparently broken, as the trigger to create new threads has shifted.
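A minimal sketch of dumping goroutine stacks to stderr, e.g. when a program looks stuck (the sleeping goroutines are only there so there's something to see):

```go
package main

import (
	"os"
	"runtime/pprof"
	"time"
)

func main() {
	// a couple of example goroutines so the dump isn't empty
	go func() { time.Sleep(time.Minute) }()
	go func() { time.Sleep(time.Minute) }()

	// debug=2 prints full goroutine stacks in the same format as an
	// unrecovered panic; debug=1 aggregates identical stacks.
	pprof.Lookup("goroutine").WriteTo(os.Stderr, 2)
}
```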