Grok exporter count doesn't decrease even when there are currently no errors

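We have configured Grok exporter to monitor a web service log for errors. We see that even when there are no new errors, it still reports the error count from the past.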

We use "gauge" as the metric type and poll the log file every 5 seconds.

Please see the config.yml below:

global:
  config_version: 2
input:
  type: file
  path: /ZAMBAS/logs/Healthcheck/AI/ai_17_grafana.log
  readall: true
  poll_interval_seconds: 5

grok:
  patterns_dir: ./patterns

metrics:
    - type: counter
      name: OutOfThreads
      help: Counter metric example with labels.
      match: '%{GREEDYDATA} WARN!! OUT OF THREADS: %{GREEDYDATA}'

    - type: counter
      name: OutOfMemory
      help: Counter metric example with labels.
      match: '%{GREEDYDATA}: Java heap space'

    - type: gauge
      name: NoMoreEndpointPrefix
      help: Counter metric example with labels.
      match: '%{GREEDYDATA}: APPL%{NUMBER:val1}: IO Exception: Connection refused %{GREEDYDATA}'
      value: '{{.val1}}'
      cumulative: false


    - type: gauge
      name: IOExceptionConnectionReset
      help: Counter metric example with labels.
      match: '   <faultstring>APPL%{NUMBER:val3}: IO Exception: Connection reset'
      value: '{{.val3}}'
      cumulative: false


    - type: gauge
      name: IOExceptionReadTimedOut
      help: Counter metric example with labels.
      match: '   <faultstring>APPL%{NUMBER:val4}: IO Exception: Read timed out'
      value: '{{.val4}}'
      cumulative: false


    - type: gauge
      name: FailedToConnectTo
      help: Counter metric example with labels.
      match: "   <faultstring>RUNTIME0013: Failed to connect to '%{URI:val5}"
      value: '{{.val5}}'
      cumulative: false

server:
  port: 9244



Output:

grok_exporter_lines_matching_total{metric="FailedToConnectTo"} 0
grok_exporter_lines_matching_total{metric="IOExceptionConnectionReset"} 0
grok_exporter_lines_matching_total{metric="IOExceptionReadTimedOut"} 3
grok_exporter_lines_matching_total{metric="NoMoreEndpointPrefix"} 0
grok_exporter_lines_matching_total{metric="OutOfMemory"} 0
grok_exporter_lines_matching_total{metric="OutOfThreads"} 0

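For example, there may be no errors at all for an hour, yet it still shows "3" errors, and when a new error does occur, it is added on top, so the total becomes 4, and so on. It keeps adding up :(

I want Grok exporter to show only the current data, without adding the previous values.

Please help us figure out what I am doing wrong.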

Thanks,
Priyotosh

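This is the correct behavior: a counter is cumulative, so it only ever goes up. What you want to do is use the rate() function in Prometheus to calculate how many matching log lines occur per second over a time window. For example: rate(OutOfThreads[5m]).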
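
To illustrate why a windowed query shows "current" errors while the raw counter keeps its old value, here is a minimal Python sketch of the arithmetic. It is a simplification, not Prometheus's actual implementation: it takes the difference between consecutive samples inside a window and does not do the extrapolation that the real rate() and increase() functions perform. The sample data, the 60-second scrape interval, and the function names are assumptions made up for this example.

```python
from typing import List, Tuple

# (timestamp in seconds, cumulative counter value) -- hypothetical scrape results
Sample = Tuple[float, float]


def increase(samples: List[Sample], start: float, end: float) -> float:
    """Growth of a cumulative counter between start and end (inclusive).

    A drop between two consecutive samples is treated as a counter reset
    (for example an exporter restart), so the value after the reset is
    counted in full. Simplified: no extrapolation to the window edges.
    """
    in_window = [s for s in samples if start <= s[0] <= end]
    total = 0.0
    for (_, prev), (_, curr) in zip(in_window, in_window[1:]):
        total += (curr - prev) if curr >= prev else curr
    return total


def per_second_rate(samples: List[Sample], start: float, end: float) -> float:
    """Average per-second growth of the counter over the window."""
    return increase(samples, start, end) / (end - start)


# The counter sits at 3 for four minutes, then one new error arrives.
samples = [(0, 3.0), (60, 3.0), (120, 3.0), (180, 3.0), (240, 3.0), (300, 4.0)]

print(increase(samples, 0, 240))         # quiet period: no growth
print(increase(samples, 0, 300))         # window that includes the new error
print(per_second_rate(samples, 0, 300))  # same growth expressed per second
```

The raw value never drops below 3, but the growth inside a window is 0 while the log is quiet and becomes non-zero only when new matching lines arrive, which is the "current data" the question asks for.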