Elasticsearch Java API fuzzy search test

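I'm running into a problem with the native Elasticsearch Java API. I want to write a method that searches for objects by their name attribute. That part was simple enough, but then I wanted to write a JUnit test for this method, and that is where the trouble started.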

@Test
public void nameSearchTest() throws ElasticSearchUnavailableException, IOException{
    String nameToSearch = "fuzzyText";
    TrainingToCreate t = new TrainingToCreate();
    t.setName(nameToSearch);
    //Create two Trainings to find sth
    String id1 = ElasticIndexer.index(t);
    String id2 = ElasticIndexer.index(t);
    //For creating delay, throws Exception if id doesn't exist
    ElasticGetter.getTrainingById(id1);
    ElasticGetter.getTrainingById(id2);

    int hits = 0;
    ArrayList<Training> trainings = ElasticSearch.fuzzySearchTrainingByName(nameToSearch, Integer.MAX_VALUE, 0);
    System.out.println("First id: " + id1);
    System.out.println("Second id: " + id2);
    String idOfTraining;
    if(trainings.size() == 0){
        System.out.println("Zero hits could be found.");
    }
    //just for printing id's of results
    //-------------------------------------------------
    for (int i = 0; i < trainings.size(); i++) {
        idOfTraining = trainings.get(i).getId();
        System.out.println("Training: "+i+" id: "+ idOfTraining);
    }
    //-------------------------------------------------
    for (Training training : trainings) {
        if(training.getId().equals(id1)||training.getId().equals(id2)){
            hits++;
        }
    }
    assertTrue(hits>=2);
    ElasticDelete.deleteTrainingById(id1);
    ElasticDelete.deleteTrainingById(id2);
}

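Sometimes this test passes, and sometimes the search returns nothing at all, even though I indexed documents beforehand to make sure there is something to find. If I look in Elasticsearch directly, the documents exist, so I assume either my implementation is wrong or the search API has a significant delay.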

Here is the code under test:

public static ArrayList<Training> fuzzySearchTrainingByName(String name, int size, int offset) throws ElasticSearchUnavailableException, JsonParseException, JsonMappingException, IOException {
    Client client = clientFactory.getClient(configService.getConfig().getElasticSearchIp(), configService
            .getConfig().getElasticSearchPort());
    return ElasticSearch.fuzzySearchDocument(client, "trainings", "training", "name", name, size, offset);
}

private static ArrayList<Training> fuzzySearchDocument(Client client, String index, String type, String field, String value, int size, int offset) throws JsonParseException, JsonMappingException, IOException {
    QueryBuilder query = fuzzyQuery(field, value);

    SearchResponse response = client.prepareSearch(index).setTypes(type)
            .setQuery(query).setSize(size).setFrom(offset).execute().actionGet();

    SearchHits hits = response.getHits();

    TrainingToCreate source = null;
    ObjectMapper mapper = new ObjectMapper();
    ArrayList<Training> trainings = new ArrayList<Training>();

    for (SearchHit searchHit : hits) {
        source = mapper.readValue(searchHit.getSourceAsString(), TrainingToCreate.class);
        trainings.add(TrainingFactory.getTraining(searchHit.getId(), source));
    }
    return trainings;

}

I'm using Elasticsearch 1.7.0 on Java 8. Does anyone recognize where the problem is? If anyone needs more information, feel free to ask.

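Elasticsearch is near real time, which means there is some delay (1 second by default) between indexing a document and that document becoming searchable. Fetching the documents by ID first does not help, because get-by-ID is real time and can return a document that is not yet visible to search. You can work around this by simply refreshing the index before running your query.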

So I would do this right after you index the sample documents...

public void nameSearchTest() throws ElasticSearchUnavailableException, IOException{
    String nameToSearch = "fuzzyText";
    TrainingToCreate t = new TrainingToCreate();
    t.setName(nameToSearch);
    //Create two Trainings to find sth
    String id1 = ElasticIndexer.index(t);
    String id2 = ElasticIndexer.index(t);

    // REFRESH YOUR INDICES (just after indexing)
    // client() stands for however your test obtains its Client instance
    client().admin().indices().prepareRefresh().execute().actionGet();
    ...

... or right at the start of fuzzySearchDocument:

private static ArrayList<Training> fuzzySearchDocument(Client client, String index, String type, String field, String value, int size, int offset) throws JsonParseException, JsonMappingException, IOException {
    // REFRESH YOUR INDICES (just before searching)
    // client is the method parameter here, so it is used directly
    client.admin().indices().prepareRefresh().execute().actionGet();

    QueryBuilder query = fuzzyQuery(field, value);
    ...

If you run multiple test cases against the same sample documents, I would go with the first option; otherwise either option works.
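
To make the near-real-time behavior easier to picture, here is a rough toy model in plain Java. It is not how Elasticsearch is implemented and uses none of its API: the class `NearRealTimeIndex` and its methods `index`, `getById`, `refresh`, and `searchByName` are hypothetical names invented for this sketch, and the search is an exact match rather than a fuzzy one. It only illustrates why a get-by-ID can succeed while a search issued at the same moment still finds nothing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of near-real-time search (an assumption-laden sketch, not Elasticsearch itself). */
final class NearRealTimeIndex {
    // Documents that were indexed but not yet refreshed: invisible to search.
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Documents that a refresh has made visible to search.
    private final Map<String, String> searchable = new LinkedHashMap<>();
    private int nextId = 1;

    /** Stores a document name and returns its generated id. */
    public String index(String name) {
        String id = "doc-" + nextId++;
        pending.put(id, name);
        return id;
    }

    /** Real-time lookup by id: also sees documents that are still pending. */
    public String getById(String id) {
        String name = pending.get(id);
        return name != null ? name : searchable.get(id);
    }

    /** Makes every pending document visible to search. */
    public void refresh() {
        searchable.putAll(pending);
        pending.clear();
    }

    /** Exact-match search by name: only sees refreshed documents. */
    public List<String> searchByName(String name) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : searchable.entrySet()) {
            if (entry.getValue().equals(name)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        NearRealTimeIndex idx = new NearRealTimeIndex();
        String id1 = idx.index("fuzzyText");
        String id2 = idx.index("fuzzyText");

        System.out.println(idx.getById(id1));              // fuzzyText
        System.out.println(idx.searchByName("fuzzyText")); // []
        idx.refresh();
        System.out.println(idx.searchByName("fuzzyText")); // [doc-1, doc-2]
    }
}
```

In the real system a refresh also happens automatically on an interval, which the toy model leaves out. That would explain why the original test passes on some runs and fails on others: the outcome depends on whether a refresh happened to occur between indexing and searching. Calling refresh explicitly removes that race.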