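I am indexing data into Elasticsearch and use the bulk helper to reduce indexing time. The problem is that after indexing with the bulk method, my earlier queries fail (that is, they return 0 hits); even a simple match query returns zero matches.
Elasticsearch version: 6.3. Language: Python. Library: the Python Elasticsearch client.
Initially, I used this code to index the data into Elasticsearch: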

    temp_entities_list = []
    for each_row in master_entities:
        entity_data = {}
        entity_data['entity_id'] = each_row.id
        entity_data['createdat'] = each_row.createdat
        entity_data['updatedat'] = each_row.updatedat
        entity_data['individual_business_tag'] = each_row.individual_business_tag
        temp_entities_list.append(entity_data)

    def indexing(entity_list):
        for entity in entity_list:
            index_name = "demo"
            yield {
                "_index": index_name,
                "_type": "businesses",
                "_source": {
                    "body": entity
                }
            }

    try:
        helpers.bulk(es, indexing(temp_entities_list))
    except Exception as exe:
        indexing_logger.exception("Error:" + str(exe))

This is my old query, which worked fine when I indexed a single object at a time:

    {
        "query": {
            "match": {
                "entity_name": {
                    "query": "Premium Market",
                    "operator": "and"
                }
            }
        }
    }

Following the documentation at https://elasticsearch-py.readthedocs.io/en/master/helpers.html#example, I tried the following code:

    def indexing(entity_list):
        for entity in entity_list:
            index_name = "demo"
            yield {
                "_index": index_name,
                "_type": "businesses",
                "doc": {entity
                }
            }

and I get this error:

    Traceback (most recent call last):
      File "sql-to-elasticsearch.py", line 90, in <module>
        helpers.bulk(es,indexing(temp_entities_list),chunk_size=500,)
      File "C:\Users\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\helpers\__init__.py", line 257, in bulk
        for ok, item in streaming_bulk(client, actions, *args, **kwargs):
      File "C:\Users\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\helpers\__init__.py", line 180, in streaming_bulk
        client.transport.serializer):
      File "C:\Users\AppData\Local\Programs\Python\Python37-32\lib\site-packages\elasticsearch\helpers\__init__.py", line 58, in _chunk_actions
        for action, data in actions:
      File "sql-to-elasticsearch.py", line 81, in indexing
        index_name = "demo"
    TypeError: unhashable type: 'dict'

Answered 2019-08-30 07:39:24
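I think this is what causes the error:

    "doc" :{entity}

Your entity appears to be a dictionary, and you are trying to put it inside a set. In Python, only hashable objects can be stored in a set, which in practice means immutable ones such as strings, integers, floats and tuples. A dict is mutable and therefore not hashable.
Note that the {} notation used here builds a set.
If you want to put it in a container, I suggest using a list:

    "doc" : [entity]

Or, if you just want doc to refer to entity, use:

    "doc" : entity

Hope this helps.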
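
As a minimal sketch of the point about set literals, the snippet below reproduces the TypeError without needing a cluster and shows one way the generator could be written instead. The sample row and the broken_action helper are hypothetical and only stand in for the question's data. Using "_source": entity rather than "doc": entity is a suggestion and an assumption about the bulk helper: in the linked documentation the "doc" key appears together with '_op_type': 'update', and for a plain index action the helper appears to take either "_source" or the remaining keys as the document, so "doc": entity or "_source": {"body": entity} would likely nest the fields under doc or body, and a match on entity_name would then find nothing.

```python
def broken_action(entity):
    # {entity} is a set literal, so Python tries to hash the dict and fails.
    return {"_index": "demo", "_type": "businesses", "doc": {entity}}


def indexing(entity_list, index_name="demo"):
    """Yield one bulk action per entity, using the entity itself as the document."""
    for entity in entity_list:
        yield {
            "_index": index_name,
            "_type": "businesses",
            "_source": entity,  # fields such as entity_name stay at the top level
        }


# Hypothetical sample row, standing in for temp_entities_list from the question.
sample_entities = [{"entity_id": 1, "entity_name": "Premium Market"}]

try:
    broken_action(sample_entities[0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

actions = list(indexing(sample_entities))
print(error_message)
print(actions[0]["_source"]["entity_name"])

# With a live cluster, the generator would be passed to the helper as in the question:
# helpers.bulk(es, indexing(temp_entities_list), chunk_size=500)
```

If the documents have already been indexed with the "body" wrapper, querying body.entity_name instead of entity_name should match them, assuming the default dynamic mapping is in use.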
https://stackoverflow.com/questions/57722270