NEST Elastic query works for a few hours and then stops

I'm running into a very strange situation with Elasticsearch 5.4 and NEST 5.4.0. I wrote a simple C# console application that queries Elastic once per minute, returns the hits/documents, and stores them in a Postgres database for further processing. It works fine for a few hours, and then the queries start coming back with valid .DebugInformation but zero documents, even though I can copy the same query into Kibana Dev Tools, run it, and get results. When I stop the console application and restart it, the query succeeds and returns hits again, and everything is fine. Code samples and log entries are below. I'm trying to figure out why it stops working after a while. This is a .NET Core C# console application using NEST.

I'm not sure whether .DebugInformation returns any information about ES health, so I could tell whether the ES cluster had a problem at that moment, such as 429s. I looked at elasticsearch.log and it only shows the inserts. I don't know whether there is somewhere else to look for query problems.
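
For what it's worth, the NEST response object exposes more than .DebugInformation. This is only a minimal sketch, assuming `searchResponse` is the `ISearchResponse<FilebeatModel>` returned by the query further down, of the properties that would surface a 429 or other transport-level problem:

    if (!searchResponse.IsValid)
    {
        // Low-level call details: HTTP status (e.g. 429), any transport exception.
        _logger.LogWarning("HTTP status: " + searchResponse.ApiCall.HttpStatusCode);
        _logger.LogWarning("Exception: " + searchResponse.ApiCall.OriginalException?.Message);
        // Structured error returned by the cluster, if any.
        _logger.LogWarning("Server error: " + searchResponse.ServerError?.Error?.Reason);
        _logger.LogWarning(searchResponse.DebugInformation);
    }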

Has anyone run into an issue where NEST works fine for a while and then stops?

Here is the query log for two runs. The first run works fine and returns 9 rows (I removed all but one of them from the sample because of sensitive data), then it runs again and returns zero hits. Every query after that returns zero hits until I restart the C# code. With the same start and end date inputs, there is real data in Elastic....

2017-09-12 16:41:59.799 -05:00 [Information] Dates: Start 9/12/2017 4:41:00 PM End 9/12/2017 4:42:00 PM
2017-09-12 16:41:59.800 -05:00 [Debug] AlertService._queryErrors: 9/12/2017 4:41:00 PM End 9/12/2017 4:42:00 PM
2017-09-12 16:41:59.811 -05:00 [Debug] AlertService._elasticQueryLogErrors: elasticQuery {
                    "bool": {
                        "filter":
                            [ {
                                "range":
                                { "@timestamp": { "gte": "2017-09-12T21:41:00Z",
                                                    "lte": "2017-09-12T21:42:00Z" }
                                }
                              },
                              {
                                "exists" : { "field" : "error_data" }
                              }
                            ]
                        } }
2017-09-12 16:41:59.811 -05:00 [Debug] AlertService._elasticQueryLogErrors: searchResponse 9 : Valid NEST response built from a successful low level call on POST: /filebeat-%2A/_search
# Audit trail of this API call:
 - [1] HealthyResponse: Node: http://servername:9200/ Took: 00:00:00.0112120
# Request:
{"from":0,"query":{
                    "bool": {
                        "filter":
                            [ {
                                "range":
                                { "@timestamp": { "gte": "2017-09-12T21:41:00Z",
                                                    "lte": "2017-09-12T21:42:00Z" }
                                }
                              },
                              {
                                "exists" : { "field" : "error_data" }
                              }
                            ]
                        } }
# Response:
{"took":7,"timed_out":false,"_shards":{"total":215,"successful":215,"failed":0},"hits":{"total":9,"max_score":0.0,"hits":[{"_index":"filebeat-2017.09.12","_type":"log","_id":"AV54Cdl2yay890uCUru4","_score":0.0,"_source":{"offset":237474,"target_url":"...url...","input_type":"log","source":"....source....","type":"log","tags":["xxx-001","beats_input_codec_plain_applied","@timestamp":"2017-09-12T21:41:02.000Z","@version":"1","beat":{"hostname":"xxx-001","name":"xxx-001","version":"5.4.3"},"host":"xxx-001","timestamp":"09/12/2017 16:41:02","error_data":"EXCEPTION, see detail log"}]}

2017-09-12 16:41:59.811 -05:00 [Debug] AlertService._queryErrors: (result) System.Collections.Generic.List`1[XX.Alerts.Core.Models.FilebeatModel]
2017-09-12 16:41:59.811 -05:00 [Information] ErrorCount: 9

2017-09-12 16:42:00.222 -05:00 [Information] Dates: Start 9/12/2017 4:42:00 PM End 9/12/2017 4:43:00 PM
2017-09-12 16:42:00.222 -05:00 [Debug] AlertService._queryErrors: 9/12/2017 4:42:00 PM End 9/12/2017 4:43:00 PM
2017-09-12 16:42:00.229 -05:00 [Debug] AlertService._elasticQueryLogErrors: elasticQuery {
                    "bool": {
                        "filter":
                            [ {
                                "range":
                                { "@timestamp": { "gte": "2017-09-12T21:42:00Z",
                                                    "lte": "2017-09-12T21:43:00Z" }
                                }
                              },
                              {
                                "exists" : { "field" : "error_data" }
                              }
                            ]
                        } }
2017-09-12 16:42:00.229 -05:00 [Debug] AlertService._elasticQueryLogErrors: searchResponse 0 : Valid NEST response built from a successful low level call on POST: /filebeat-%2A/_search
# Audit trail of this API call:
 - [1] HealthyResponse: Node: http://servername:9200/ Took: 00:00:00.0066742
# Request:
{"from":0,"query":{
                    "bool": {
                        "filter":
                            [ {
                                "range":
                                { "@timestamp": { "gte": "2017-09-12T21:42:00Z",
                                                    "lte": "2017-09-12T21:43:00Z" }
                                }
                              },
                              {
                                "exists" : { "field" : "error_data" }
                              }
                            ]
                        } }
# Response:
{"took":4,"timed_out":false,"_shards":{"total":215,"successful":215,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

2017-09-12 16:42:00.229 -05:00 [Debug] AlertService._queryErrors: (result) System.Collections.Generic.List`1[Q2.Alerts.Core.Models.FilebeatModel]
2017-09-12 16:42:00.229 -05:00 [Information] ErrorCount: 0

Here is my NEST query

    public IEnumerable<FilebeatModel> _elasticQueryLogErrors(DateTime startDate, DateTime endDate)
    {
        //var startDateString = startDate.Kind;
        //var endDateString = endDate.Kind;

        var searchQuery = @"{
                ""bool"": {
                    ""filter"":
                        [ {
                            ""range"":
                            { ""@timestamp"": { ""gte"": """ + string.Format("{0:yyyy-MM-ddTHH:mm:ssZ}", startDate.ToUniversalTime()) +
                    @""",
                                                ""lte"": """ + string.Format("{0:yyyy-MM-ddTHH:mm:ssZ}", endDate.ToUniversalTime()) + @""" }
                            }
                          },
                          {
                            ""exists"" : { ""field"" : ""error_data"" }
                          }
                        ]
                    } }";

        var searchResponse = _es.Search<FilebeatModel>(s => s
            .AllTypes()
            .From(0)
            .Query(query => query.Raw(searchQuery)));

        _logger.LogDebug("AlertService._elasticQueryLogErrors: elasticQuery " + searchQuery);

        _logger.LogDebug("AlertService._elasticQueryLogErrors: searchResponse " + searchResponse.Hits.Count + " : " + searchResponse.DebugInformation);

        foreach (var searchResponseHit in searchResponse.Hits)
        {
            searchResponseHit.Source.Id = searchResponseHit.Id;
        }

        return searchResponse.Documents.ToList();
    }
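
For completeness, the same filter could also be written with NEST's fluent DSL instead of a raw JSON string, which avoids hand-formatting the UTC timestamps. This is only an illustrative sketch of an equivalent query (the field names match the raw query above), not what I was running:

    var searchResponse = _es.Search<FilebeatModel>(s => s
        .AllTypes()
        .From(0)
        .Query(q => q
            .Bool(b => b
                .Filter(
                    // range filter on @timestamp, same window as the raw query
                    f => f.DateRange(r => r
                        .Field("@timestamp")
                        .GreaterThanOrEquals(startDate.ToUniversalTime())
                        .LessThanOrEquals(endDate.ToUniversalTime())),
                    // only documents that actually carry error_data
                    f => f.Exists(e => e.Field("error_data"))))));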

Here is the constructor of the class that runs the code above in a loop. The loop may run for hours or days, so this may be where my problem lies, i.e. how the connection holds up long term. When I close and reopen the application, the queries that were missed during that time run fine.

    public AlertService(IOptions<ElasticConfig> elasticConfig, AlertsDbContext context, ILogger<AlertService> logger)
    {
        _logger = logger;

        _logger.LogDebug(" *** Entering AlertService");
        string elasticConnectionString = elasticConfig.Value.ConnectionString;
        string defaultIndex = elasticConfig.Value.IndexName;

        var settings = new ConnectionSettings(
                new Uri(elasticConnectionString))
            .ConnectionLimit(-1)
            .DisableDirectStreaming()
            .DefaultIndex(defaultIndex);

        _es = new ElasticClient(settings);
        _context = context;
    }

I have confirmed this was a race condition of my own making: as Val pointed out in the comments, an internal timer was gradually drifting relative to the calls to Elastic. It is not a bug in NEST, just my code and its timing. I have aligned the calls into a single callback per elapsed tick using a System.Threading.Timer, and it works correctly. Thanks to Val for the assist.
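
For anyone hitting the same thing, this is roughly the shape of the fix, as a minimal sketch only (the field and method names here are illustrative, not my actual code): a single System.Threading.Timer callback does the query, the next window starts exactly where the previous one ended instead of being recomputed from the wall clock, and the timer only re-arms after the work finishes, so ticks can never drift apart or overlap.

    private Timer _pollTimer;
    private DateTime _windowStart;

    public void StartPolling()
    {
        _windowStart = DateTime.Now;
        // One-shot timer; the callback re-arms itself when it is done.
        _pollTimer = new Timer(Poll, null, TimeSpan.FromMinutes(1), Timeout.InfiniteTimeSpan);
    }

    private void Poll(object state)
    {
        var windowEnd = _windowStart.AddMinutes(1);
        try
        {
            var errors = _elasticQueryLogErrors(_windowStart, windowEnd);
            // ... store hits in Postgres ...
        }
        finally
        {
            _windowStart = windowEnd; // next window starts exactly where this one ended
            _pollTimer.Change(TimeSpan.FromMinutes(1), Timeout.InfiniteTimeSpan);
        }
    }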