Using regex to retrieve the Prometheus metric name from Grafana expressions
I have tried many different regex patterns to get this, but without much success.
The pattern for this problem:
<method_name(> metric_name <{filter_condition}> <[time_duration]> <)> <by (some members)>
^------------------------------------------------------^
method_name(...) can be multiple
As you can see, the <...> parts are optional, while metric_name is mandatory, and that is what I want to retrieve from the expression.
Case # 1
input: sum(log_search_by_service_total {service_name!~\"\"}) by (service_name, operator)
output: log_search_by_service_total
Case # 2
input: log_request_total
output: log_request_total
Case # 3
input: sum(delta(log_request_total[5m])) by (args, user_id)
output: log_request_total
Case # 4
input: log_request_total{methodName=~\"getAppDynamicsGraphMetrics|getAppDynamicsMetrics\"}
output: log_request_total
Case # 5
input: sum(delta(log_request_total{className=~\".*ProductDashboardController\",methodName=~\"getDashboardConfig|updateMaintainers|addQuickLink|deleteQuickLink|addDependentMiddleware|addDependentService|updateErrorThreshold\"}[5m])) by (user_id)
output: log_request_total
Case # 6
input: count_scalar(sum(log_query_request_total) by (user_id))
output: log_query_request_total
Here is the demo I tried in Java. I still cannot find a pattern that extracts the right answer for the format described above.
Please share some ideas if you can.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; the original snippet showed only the two methods.
public class MetricNameRegexDemo {

    public static void main(String... args) {
        String[] exprs = {
            "sum(log_query_task_cache_hit_rate_bucket)by(le)",
            "sum(log_search_by_service_total {service_name!~\"\"}) by (service_name, operator)",
            "log_request_total",
            " sum(delta(log_request_total[5m])) by (args, user_id)",
            "log_request_total{methodName=~\"getAppDynamicsGraphMetrics|getAppDynamicsMetrics\"}",
            "sum(delta(log_request_total{className=~\".*ProductDashboardController\",methodName=~\"getDashboardConfig|updateMaintainers|addQuickLink|deleteQuickLink|addDependentMiddleware|addDependentService|updateErrorThreshold\"}[5m])) by (user_id)",
            "sum(log_request_total{methodName=\"getInstanceNames\"}) by (user_id)",
            "sum(log_request_total{methodName=\"getVpcCardInfo\",user_id!~\"${user}\"}) by (envName)",
            "count_scalar(sum(log_query_request_total) by (user_id))",
            "avg(log_waiting_time_average) by (exported_tenant, exported_landscape)",
            "avg(task_processing_time_average{app=\"athena\"})",
            "avg(log_queue_time_average) by (log_type)",
            "sum(delta(product_dashboard_service_sum[2m]))",
            "ceil(delta(product_dashboard_service_count[5m]))]"
        };
        String[] expected = {
            "log_query_task_cache_hit_rate_bucket",
            "log_search_by_service_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_query_request_total",
            "log_waiting_time_average",
            "task_processing_time_average",
            "log_queue_time_average",
            "product_dashboard_service_sum",
            "product_dashboard_service_count"
        };
        // Backslashes must be doubled inside Java string literals.
        Pattern pattern = Pattern.compile(".*?\\(?([\\w|_]+)\\{?\\[?.*");
        testPattern(exprs, expected, pattern);
        pattern = Pattern.compile(".*\\(?([\\w|_]+)\\{?\\[?.*");
        testPattern(exprs, expected, pattern);
        pattern = Pattern.compile(".*?\\(?([\\w|_]+)\\{?\\[?.*");
        testPattern(exprs, expected, pattern);
    }

    // Prints the expected metric name next to whatever group 1 captured.
    private static void testPattern(String[] exprs, String[] expected, Pattern pattern) {
        System.out.println("\n********** Pattern Match Test *********\n");
        for (int i = 0; i < exprs.length; ++i) {
            String expr = exprs[i];
            Matcher matcher = pattern.matcher(expr);
            if (matcher.find()) {
                System.out.println("\nThe Original Expr: " + expr);
                System.out.println(String.format("Expected:\t %-40s Matched:\t %-40s", expected[i], matcher.group(1)));
            } else {
                System.out.println("expected: " + expected[i] + " not matched");
            }
        }
    }
}
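Update 2018-08-06
Thanks to Bohemian for the help; it really enlightened me (I have always believed regex can work magic with a clean solution).
Later I found that the exprs are more complicated than I expected, with cases like the following: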
Case # 7
input: topk(10,autoindex_online_consume_time_total_sum{app=~"$app", DTO_Name=~"$c_class"})
expected: autoindex_online_consume_time_total_sum
// to get the metric name: autoindex_online_consume_time_total_sum
// still I can make it work with small modifications as ^(?:\w+\()*(?:\d+,)*(\w+)
But the following case, and even more complex combinations, pushed me toward a more reliable approach:
Case # 8
input: sum(hue_mail_sent_attachment_bytes_total) by (app) / sum(hue_mail_sent_mails_with_attachment_total) by (app)
Expected: [hue_mail_sent_attachment_bytes_total, hue_mail_sent_mails_with_attachment_total]
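To make the limitation concrete, here is a minimal sketch that runs the Case 7 pattern against both inputs. The class and method names (TopkAwarePattern, firstMetric) are made up for illustration, and trimming the input first is my own assumption. Because the pattern is anchored at the start, it can only ever return the first metric of an expression like Case 8.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TopkAwarePattern {
    // Pattern from the Case 7 note: leading "name(" prefixes,
    // then optional "digits," arguments, then the metric name in group 1.
    private static final Pattern P = Pattern.compile("^(?:\\w+\\()*(?:\\d+,)*(\\w+)");

    /** Returns the first captured name, or null when nothing matches at the start. */
    static String firstMetric(String expr) {
        Matcher m = P.matcher(expr.trim()); // trim() is an assumption, to tolerate leading spaces
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String case7 = "topk(10,autoindex_online_consume_time_total_sum{app=~\"$app\", DTO_Name=~\"$c_class\"})";
        String case8 = "sum(hue_mail_sent_attachment_bytes_total) by (app) / sum(hue_mail_sent_mails_with_attachment_total) by (app)";
        System.out.println(firstMetric(case7));
        System.out.println(firstMetric(case8)); // only the first of the two metrics
    }
}
```

For Case 8 this yields hue_mail_sent_attachment_bytes_total and never sees the second metric after the division.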
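Now it is more complicated, and even unpredictable, since there is no control over the expr input coming from users.
So I reached the same goal with a more reliable and simpler solution:
- first, store the distinct metric names in the database;
- when an expr comes in, check it in memory using contains(String s);
- one issue may remain: over-matching, if some metric names contain other metric names.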
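The lookup described in the list could be sketched roughly as follows. This is not the code I actually run: the class and method names (KnownMetricFinder, findMetrics) are hypothetical, and the known names are passed in as a plain collection rather than loaded from a database. As one possible way around the over-matching issue, the sketch swaps plain contains(String s) for a regex check that the name is not surrounded by other identifier characters.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.regex.Pattern;

class KnownMetricFinder {
    /**
     * Returns every known metric name that occurs in expr as a whole identifier,
     * in the iteration order of knownNames.
     */
    static List<String> findMetrics(String expr, Collection<String> knownNames) {
        List<String> found = new ArrayList<>();
        for (String name : knownNames) {
            // Assumption: a name counts only if it is not directly preceded or
            // followed by a word character or ':' (so "log_request" does not
            // match inside "log_request_total").
            Pattern whole = Pattern.compile("(?<![\\w:])" + Pattern.quote(name) + "(?![\\w:])");
            if (whole.matcher(expr).find()) {
                found.add(name);
            }
        }
        return found;
    }
}
```

With known names log_request and log_request_total, the expr sum(delta(log_request_total[5m])) reports only log_request_total. Compiling one pattern per name on every call is wasteful for a large catalogue; caching the compiled patterns would be the obvious next step.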
This regex captures your target in group 1:
^(?:\w+\()*(\w+)
See the live demo.
In Java, to get the target:
String metricName = input.replaceAll("^(?:\\w+\\()*(\\w+).*", "$1");
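If you would rather read group 1 directly than rewrite the string, a small sketch with Matcher looks like this. The class and method names (FirstMetricName, extract) are illustrative only, and the trim() call is an assumption added because one of the question's test expressions starts with a space, which the ^ anchor would otherwise reject.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstMetricName {
    // Skip any number of leading "name(" prefixes, then capture the next word.
    private static final Pattern LEADING_METRIC = Pattern.compile("^(?:\\w+\\()*(\\w+)");

    /** Returns the captured metric name, or empty when the expression does not start with a word. */
    static Optional<String> extract(String expr) {
        Matcher m = LEADING_METRIC.matcher(expr.trim());
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] exprs = {
            "sum(log_query_task_cache_hit_rate_bucket)by(le)",
            " sum(delta(log_request_total[5m])) by (args, user_id)",
            "count_scalar(sum(log_query_request_total) by (user_id))",
            "log_request_total{methodName=\"getInstanceNames\"}"
        };
        for (String e : exprs) {
            System.out.println(extract(e).orElse("<no match>") + " <- " + e);
        }
    }
}
```

Returning an Optional makes the no-match case explicit, whereas replaceAll silently returns the input unchanged when the pattern does not match.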