Elasticsearch delay in store and search immediately
I am using elasticsearch with Python, through the dsl driver.
My script is as below.
import logging
import time

from elasticsearch_dsl import DocType, String
from elasticsearch import exceptions as es_exceptions
from elasticsearch_dsl.connections import connections

LOGGER = logging.getLogger(__name__)
ELASTICSEARCH_INDEX = 'test'

class StudentDoc(DocType):
    student_id = String(required=True)
    tags = String(null_value=[])

    class Meta:
        index = ELASTICSEARCH_INDEX

    def save(self, **kwargs):
        '''
        Override to set metadata id
        '''
        self.meta.id = self.student_id
        return super(StudentDoc, self).save(**kwargs)

# Define a default Elasticsearch client
connections.create_connection(hosts=['localhost:9200'])

# create the mappings in elasticsearch
StudentDoc.init()

student_doc_obj = \
    StudentDoc(
        student_id=str(1),
        tags=['test'])

try:
    student_doc_obj.save()
except es_exceptions.SerializationError as ex:
    # catch serialization errors raised by elasticsearch
    LOGGER.error('Error while creating elasticsearch data')
    LOGGER.exception(ex)
else:
    print "*"*80
    print "Student Created:", student_doc_obj
    print "*"*80

# first search, immediately after save
search_docs = \
    StudentDoc \
    .search().query('ids',
                    values=["1"])
try:
    student_docs = search_docs.execute()
except es_exceptions.NotFoundError as ex:
    LOGGER.error('Unable to get data from elasticsearch')
    LOGGER.exception(ex)
else:
    print "$"*80
    print student_docs
    print "$"*80

time.sleep(2)

# second search, after a 2 second sleep
search_docs = \
    StudentDoc \
    .search().query('ids',
                    values=["1"])
try:
    student_docs = search_docs.execute()
except es_exceptions.NotFoundError as ex:
    LOGGER.error('Unable to get data from elasticsearch')
    LOGGER.exception(ex)
else:
    print "$"*80
    print student_docs
    print "$"*80
In this script, I create a StudentDoc and try to access the same document right after creating it. I get an empty response when I search for the record.
Output
********************************************************************************
Student Created: {'student_id': '1', 'tags': ['test']}
********************************************************************************
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
<Response: []>
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
<Response: [{u'student_id': u'1', u'tags': [u'test']}]>
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
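The save command executes and stores the data, so why does search not return that data? After a 2 second sleep, it returns the data. :(
I tried the same with curl commands and got the same output.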
echo "Create Data"
curl http://localhost:9200/test/student_doc/2 -X PUT -d '{"student_id": "2", "tags": ["test"]}' -H 'Content-type: application/json'
echo
echo "Search ID"
curl http://localhost:9200/test/student_doc/_search -X POST -d '{"query": {"ids": {"values": ["2"]}}}' -H 'Content-type: application/json'
echo
Is there any delay in storing data to elasticsearch?
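Yes, once you index a new document, it is not available for search until the index is refreshed. You have a few options, though; the main ones are:
A. You can refresh the test index using the underlying connection, after saving student_doc_obj and before searching for it: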
connections.get_connection().indices.refresh(index=ELASTICSEARCH_INDEX)
B. You can get the document instead of searching for it, since get is completely realtime and does not need to wait for a refresh:
student_docs = StudentDoc.get("1")
Similarly, using curl, you can simply add the refresh query string parameter to your PUT call
echo "Create Data"
curl 'http://localhost:9200/test/student_doc/2?refresh=true' -X PUT -d '{"student_id": "2", "tags": ["test"]}' -H 'Content-type: application/json'
Or you can simply get the document by id
echo "GET ID"
curl -XGET http://localhost:9200/test/student_doc/2
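To make the difference between search and get more concrete, here is a toy in-memory model in Python. It is only a sketch of the behaviour described above, not how Elasticsearch is actually implemented: the class ToyIndex and its methods (index, refresh, get, search_ids) are hypothetical names made up for this illustration, and the two dictionaries merely stand in for "written but not yet refreshed" and "visible to search".

```python
class ToyIndex(object):
    """Toy model of near-real-time search (illustration only, not Elasticsearch code)."""

    def __init__(self):
        self._pending = {}     # documents written but not yet visible to search
        self._searchable = {}  # documents made visible by the last refresh

    def index(self, doc_id, doc, refresh=False):
        # A write lands in the pending buffer first.
        self._pending[doc_id] = doc
        if refresh:
            # Stand-in for asking for a refresh as part of the write.
            self.refresh()

    def refresh(self):
        # A refresh makes all pending writes visible to search.
        self._searchable.update(self._pending)
        self._pending.clear()

    def get(self, doc_id):
        # get is realtime: it also sees pending writes.
        if doc_id in self._pending:
            return self._pending[doc_id]
        return self._searchable.get(doc_id)

    def search_ids(self, values):
        # search only sees what the last refresh made visible.
        return [self._searchable[v] for v in values if v in self._searchable]


toy = ToyIndex()
toy.index("1", {"student_id": "1", "tags": ["test"]})

before_refresh = toy.search_ids(["1"])  # the write is not searchable yet
realtime = toy.get("1")                 # but get already returns it

toy.refresh()
after_refresh = toy.search_ids(["1"])   # searchable after the refresh

toy.index("2", {"student_id": "2", "tags": ["test"]}, refresh=True)
immediate = toy.search_ids(["2"])       # refresh on write, searchable at once

print(before_refresh)
print(realtime)
print(after_refresh)
print(immediate)
```

In this toy model, passing refresh=True to index plays the role of the refresh query string parameter in the curl example, and calling refresh() plays the role of option A.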