PromQL, the query language for Prometheus, has a histogram_quantile function to calculate φ-quantiles from a histogram. In other words, it estimates where a particular quantile lies from your partially aggregated data.
All the examples show it used like histogram_quantile(0.99, rate(some_metric_query_bucket[5m])), and someone asked: is the rate() necessary?
Let's start from basics
Prometheus records data in time series. An instant-vector gives you a single data point per timestamp, per time series:
time (min)  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
metric_one  1   1   2   2   4   4   8   8   8   8   16  16  32  32  64  64
A range-vector gives you a slice of data (covering the lookback period up to now), for each point in time. For the above with a range of 2 minutes ([2m]):
time (min)      0    1      2        3        4        5        6        7        8        9        10        11         12          13          14          15
metric_one[2m]  [1]  [1 1]  [1 1 2]  [1 2 2]  [2 2 4]  [2 4 4]  [4 4 8]  [4 8 8]  [8 8 8]  [8 8 8]  [8 8 16]  [8 16 16]  [16 16 32]  [16 32 32]  [32 32 64]  [32 64 64]
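As a rough illustration of the slicing above, here is a minimal Python sketch that builds the same windows from a list of one-per-minute samples. The helper name range_vector is made up for this example, and it assumes evenly spaced samples, which real scrapes only approximate.

```python
def range_vector(samples, lookback):
    """For each index, return the samples from `lookback` steps back up to now."""
    return [samples[max(0, i - lookback): i + 1] for i in range(len(samples))]


# One sample per minute, as in the metric_one table.
metric_one = [1, 1, 2, 2, 4, 4, 8, 8, 8, 8, 16, 16, 32, 32, 64, 64]

# A 2 minute range at a 1 minute interval looks back 2 samples.
windows = range_vector(metric_one, 2)

for minute, window in enumerate(windows):
    print(minute, window)
```

The first two windows are shorter because there is no earlier data to look back on.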
rate() takes a range vector and calculates the per-second average increase over the time period. For example, for the range [8 16 16], the increase is 8 over 2m (120s), giving a rate of 8 / 120 = approx. 0.067.
time (min)            0      1      2      3      4      5      6      7      8      9      10     11     12     13     14     15
rate(metric_one[2m])  0      0      0.008  0.008  0.017  0.017  0.033  0.033  0      0      0.067  0.067  0.133  0.133  0.267  0.267
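The arithmetic behind that table can be sketched in a few lines of Python. This is a deliberately simplified model: the real rate() also extrapolates to the edges of the window and compensates for counter resets, neither of which is attempted here. The name simple_rate is hypothetical.

```python
def simple_rate(window, range_seconds):
    """Increase from the first to the last sample, divided by the range length.

    Simplified: no extrapolation and no counter-reset handling,
    unlike the real PromQL rate().
    """
    if len(window) < 2:
        return 0.0
    return (window[-1] - window[0]) / range_seconds


metric_one = [1, 1, 2, 2, 4, 4, 8, 8, 8, 8, 16, 16, 32, 32, 64, 64]

# Same 2 minute windows as the metric_one[2m] table (one sample per minute).
windows = [metric_one[max(0, i - 2): i + 1] for i in range(len(metric_one))]

rates = [round(simple_rate(w, 120), 3) for w in windows]
print(rates)
```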
The histograms produced by Prometheus are a collection of counters, each having a le label denoting its bucket's upper bound. Each counter records the count of events up to now (monotonic over time) and up to le in size (monotonic over the buckets).
time (min)                    0  1  2  3  4  5
metric_two_bucket{le="10"}    0  1  1  4  4  4
metric_two_bucket{le="50"}    0  1  2  6  6  8
metric_two_bucket{le="100"}   0  2  3  7  7  9
metric_two_bucket{le="+Inf"}  0  2  4  8  8  10
---
represents the following events:
time (min)  0  1  2  3  4  5
0 - 10      0  1  0  3  0  0
10 - 50     0  0  1  1  0  2
50 - 100    0  1  0  0  0  0
100 - Inf   0  0  1  0  0  0
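To see how the cumulative counters map back to individual events, the following Python sketch undoes both kinds of accumulation: first across buckets, then over time. The function name events_per_interval is invented for this example, and the data is copied from the metric_two_bucket table.

```python
# Cumulative counters keyed by their le label, one value per minute.
buckets = {
    "10":   [0, 1, 1, 4, 4, 4],
    "50":   [0, 1, 2, 6, 6, 8],
    "100":  [0, 2, 3, 7, 7, 9],
    "+Inf": [0, 2, 4, 8, 8, 10],
}


def events_per_interval(buckets):
    """Turn cumulative bucket counters into new events per bucket per minute."""
    result = {}
    previous = [0] * len(next(iter(buckets.values())))
    for le, cumulative in buckets.items():
        # Undo the accumulation across buckets: events in this bucket only.
        in_bucket = [c - p for c, p in zip(cumulative, previous)]
        # Undo the accumulation over time: events that arrived each minute.
        new = [in_bucket[0]] + [b - a for a, b in zip(in_bucket, in_bucket[1:])]
        result[le] = new
        previous = cumulative
    return result


events = events_per_interval(buckets)
for le, counts in events.items():
    print(le, counts)
```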
For a given instant, histogram_quantile looks at the increases between buckets to get a distribution of events. It then calculates the quantile from this distribution, interpolating if necessary.
ref: quantile.go
                              cumulative     per bucket
metric_two_bucket{le="10"}    40             40
metric_two_bucket{le="50"}    80         ->  40
metric_two_bucket{le="100"}   90             10
metric_two_bucket{le="+Inf"}  100            10
---
quantile = 0.85 is event 85 of 100, which sits between the counts of bucket 2 (80) and bucket 3 (90),
so: bucket_2_bound + (bucket_3_bound - bucket_2_bound) * (quantile * all_events - bucket_2_events) / events_in_bucket_3
so: 50 + (100 - 50) * (0.85 * 100 - 80) / (90 - 80) = 75
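The interpolation can be sketched in Python as follows. This is a simplified illustration, not the code in quantile.go, which covers more edge cases. It assumes 0 < q < 1, a non-empty histogram, buckets sorted by bound with +Inf last, and a lower bound of 0 for the first bucket.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    Simplified sketch: assumes 0 < q < 1, sorted buckets ending in +Inf,
    and at least one observation.
    """
    total = buckets[-1][1]
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if index == len(buckets) - 1:
        # The +Inf bucket has no upper bound to interpolate towards,
        # so fall back to the highest finite bound.
        return buckets[-2][0]
    upper, count = buckets[index]
    lower, below = buckets[index - 1] if index > 0 else (0.0, 0.0)
    # Linear interpolation within the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


example = [(10, 40), (50, 80), (100, 90), (math.inf, 100)]
print(histogram_quantile(0.85, example))

# Scaling every bucket by the same factor leaves the result unchanged.
scaled = [(bound, count / 120) for bound, count in example]
print(histogram_quantile(0.85, scaled))
```

Both calls give the same answer up to floating point error, since only the ratios between the bucket counts enter the calculation.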
From here, we can see that the absolute values of the buckets don't matter, only their relative sizes. histogram_quantile has no intrinsic requirement that the argument passed to it has passed through rate; it will happily calculate the quantile for any set of buckets represented by instant vectors. For most histograms, that means a quantile over all of the data recorded since the counters were last reset. However, in most cases, we'll want rate or increase, bounding our quantile calculations to fresh data.
rate(metric_three_bucket[2m]) gives us the per-second increase of each bucket over the last 2 mins. Because only the relative bucket sizes matter, our quantile calculations then cover all the requests in the past 2 mins, rather than all of time.
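To make the difference concrete, here is a small Python sketch comparing the median over all recorded data with the median over only the last 2 minutes, using the metric_two_bucket counters at minutes 3 and 5. The quantile helper is a simplified stand-in for histogram_quantile (assuming 0 < q < 1 and a non-empty histogram), and the increase and rate are computed by plain subtraction, without the extrapolation PromQL applies.

```python
import math


def quantile(q, buckets):
    """Simplified histogram_quantile over (upper_bound, cumulative_value) pairs."""
    rank = q * buckets[-1][1]
    i = next(i for i, (_, value) in enumerate(buckets) if value >= rank)
    if i == len(buckets) - 1:
        return buckets[-2][0]
    lower, below = buckets[i - 1] if i > 0 else (0.0, 0.0)
    upper, value = buckets[i]
    return lower + (upper - lower) * (rank - below) / (value - below)


bounds = [10, 50, 100, math.inf]
at_3m = [4, 6, 7, 8]    # cumulative bucket counts at minute 3
at_5m = [4, 8, 9, 10]   # cumulative bucket counts at minute 5

# Raw counters: the median over everything recorded so far.
all_time = quantile(0.5, list(zip(bounds, at_5m)))

# Increase over the last 2 minutes: the median of recent events only.
increase = [now - then for now, then in zip(at_5m, at_3m)]
recent = quantile(0.5, list(zip(bounds, increase)))

# Dividing by the window length (a per-second rate) keeps the ratios intact.
rate = [value / 120 for value in increase]
recent_from_rate = quantile(0.5, list(zip(bounds, rate)))

print(all_time, recent, recent_from_rate)
```

The raw counters are still weighed down by the early burst of small events, while the windowed versions reflect only the two recent events in the 10 - 50 bucket. rate and increase agree with each other because they differ only by a constant factor.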