SEANK.H.LIAO

metrics push or pull

yeeting metrics data from many internet boxen to one

metrics

Things happen: in the world, in computers, in the software you write. Those are events. In a perfect world, you would be able to time travel and inspect those events. In a less perfect world, you'd have a record of each event that you can inspect after the fact. In our world, doing that quickly becomes expensive and infeasible at scale, so you need to downsample. Group events together by a few important labels, bucket them into distinct time intervals, and now you have metrics: a low(er) cost, low resolution view into the state of things, hopefully the useful parts of the state.
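As a rough sketch of that downsampling step: the snippet below counts events per label set per time interval. The `Event` shape, the `path` and `status` labels, and the 60 second interval are all made up for illustration, not taken from any particular metrics library.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    """A hypothetical raw event, e.g. one handled HTTP request."""
    timestamp: float  # seconds since some epoch
    path: str         # label we keep
    status: int       # label we keep
    user_id: str      # high-cardinality detail we throw away


def downsample(events: Iterable[Event], interval: int = 60) -> Counter:
    """Count events per (interval start, path, status)."""
    counts: Counter = Counter()
    for e in events:
        # Snap the timestamp down to the start of its interval.
        bucket = int(e.timestamp // interval) * interval
        counts[(bucket, e.path, e.status)] += 1
    return counts


events = [
    Event(0.5, "/home", 200, "alice"),
    Event(12.0, "/home", 200, "bob"),
    Event(59.9, "/home", 500, "carol"),
    Event(61.0, "/home", 200, "alice"),
]
metrics = downsample(events)
```

Four events collapse into three counters, and who made each request is gone. That is the trade: cheaper to store and ship, but you can only ask questions about the labels you chose to keep.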

shipping metrics

Your code has done its calculations and given you a number, or more likely a new set of numbers every few seconds. And you have many copies of your code running, so many sets of numbers. You want to put them all in the same place so you can watch those numbers, and make dashboards.

pulling metrics

One way of doing it is the pull model, most commonly seen in prometheus-compatible software: each instance exposes a /metrics endpoint, and a central server scrapes it on a schedule.
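A minimal sketch of the instance side of the pull model, using only the Python standard library. The `COUNTERS` dict, the metric name, and port 8000 are placeholders, and `render` only approximates the prometheus text exposition format (no `# HELP` or `# TYPE` lines, no label escaping). A real service would more likely use an existing client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters, keyed by (metric name, label pairs).
COUNTERS = {
    ("http_requests_total", (("path", "/home"), ("status", "200"))): 3,
    ("http_requests_total", (("path", "/home"), ("status", "500"))): 1,
}


def render(counters: dict) -> str:
    """Render counters as one `name{label="value",...} value` line each."""
    lines = []
    for (name, labels), value in sorted(counters.items()):
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render(COUNTERS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, answering scrapes on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and fetching `http://localhost:8000/metrics` returns the current counters. The instance never needs to know where the numbers end up; it just answers whoever asks.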

pull pros
pull cons

push metrics

What is old is new again. Google built borgmon and the world got prometheus. Google then built monarch and the world... is still waiting for a few more engineers to leave Google and clone it.

push pros
push cons

summary

Does it matter which model you go with? Not really. Each one will sort of work, and you still have more work to do to align the disparate systems.

If you're a SaaS, you probably want the push model. There's a reason why approx. no one offered hosted prometheus until recently, and even that is either based on remote_write API compatibility (e.g. grafana cloud), or an agent converting to some other protocol and pushing (everyone else).

If you're looking to the future, hopefully more people will adopt the OpenTelemetry model. It's configurable, but for now primarily push based, with the greatest value being a single standard to rule them all...