Prometheus configuration and http_requests_total

Question

I have installed Prometheus with the default configuration.

In its web interface, at http://localhost:9090/metrics, I am trying to get the time series corresponding to the total number of HTTP requests.

Filtering by the name http_requests_total returns multiple time series with different labels, e.g.

http_requests_total{code="200",handler="targets",instance="localhost:9090",job="prometheus",method="get"}
http_requests_total{code="200",handler="static",instance="localhost:9090",job="prometheus",method="get"}
http_requests_total{code="200",handler="graph",instance="localhost:9090",job="prometheus",method="get"}
[...]

What are these time series? How can I find the semantics behind each label?

Answer 1

First, if you visit http://localhost:9090/metrics in your browser, you should see something like the following:

# HELP prometheus_http_request_duration_seconds Histogram of latencies for HTTP requests.
# TYPE prometheus_http_request_duration_seconds histogram
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.1"} 3
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.2"} 3
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.4"} 3
...

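The # HELP line explains what the metric measures, the # TYPE line gives its type, and together with the label names this should make the intent of the labels clear. If you don't know what a counter, gauge, or histogram is, you should probably RTFM.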
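If you would rather pull those HELP and TYPE lines out programmatically than scroll through the page, something like the following might do as a starting point. It is only a sketch: it works on a sample string copied from the output above rather than fetching the page, and the function name describe_metrics is made up for illustration.

```python
# Sample exposition text copied from the output above (not fetched live).
SAMPLE = """\
# HELP prometheus_http_request_duration_seconds Histogram of latencies for HTTP requests.
# TYPE prometheus_http_request_duration_seconds histogram
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.1"} 3
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.2"} 3
"""


def describe_metrics(text):
    """Collect the HELP text and TYPE of each metric family (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        # Metadata lines look like: "# HELP <name> <text>" or "# TYPE <name> <type>".
        parts = line.split(" ", 3)
        if len(parts) == 4 and parts[0] == "#" and parts[1] in ("HELP", "TYPE"):
            _, kind, name, value = parts
            info.setdefault(name, {})[kind.lower()] = value
    return info


metrics = describe_metrics(SAMPLE)
print(metrics["prometheus_http_request_duration_seconds"])
```

To run it against a live server you would replace SAMPLE with the body of http://localhost:9090/metrics, fetched however you prefer.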

If you want to dig deeper (and have access to the source code of the monitored service, as is the case with Prometheus itself), you can search that source code for the metric name. Note that the metric name in the code may be a substring of the final metric name, as a namespace may be prepended to it (the prometheus_ part in my example above), and for histograms and summaries _count, _bucket, or something else may be appended. So in the case of the metric above you should search the code for "http_request_duration_seconds" rather than "prometheus_http_request_duration_seconds_bucket".
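As a rough illustration of that trimming, here is a small sketch that turns a full series name into candidate strings to grep for. The helper name search_terms is hypothetical, the suffix list (_bucket, _count, _sum) only covers the usual histogram and summary suffixes, and the namespace has to be supplied by you because it cannot be reliably guessed from the name alone.

```python
# Suffixes commonly appended to histogram and summary series names.
SUFFIXES = ("_bucket", "_count", "_sum")


def search_terms(metric_name, namespace=""):
    """Return candidate strings to search the source code for (hypothetical helper)."""
    base = metric_name
    # Drop one trailing histogram/summary suffix, if present.
    for suffix in SUFFIXES:
        if base.endswith(suffix):
            base = base[: -len(suffix)]
            break
    terms = [base]
    # If the caller knows the namespace, also try the name without it.
    prefix = namespace + "_"
    if namespace and base.startswith(prefix):
        terms.append(base[len(prefix):])
    return terms


print(search_terms("prometheus_http_request_duration_seconds_bucket",
                   namespace="prometheus"))
```

The last entry in the returned list is the shortest candidate and the one most likely to appear verbatim in the code. A metric whose real name happens to end in one of these suffixes would be trimmed too far, so treat the result as a hint and not as the definitive name.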
