I want to run a query that shows output if and only if all the words in the query appear in the given string, either as whole words or as prefixes. For example:
let text = "garbage can"
So if I query
"garb"
it should return "garbage can".
If I query
"garbage c"
it should return "garbage can".
But if I query
"garbage b"
it should not return anything.
I have tried substring and match queries, but neither of them did the job for me.
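The behavior being asked for amounts to a prefix check; a minimal Python sketch (the function name `matches` and the sample strings are illustrative, not part of Elasticsearch):

```python
def matches(text: str, query: str) -> bool:
    """Return True only when the query string is a prefix of the text."""
    return text.lower().startswith(query.lower())

text = "garbage can"
print(matches(text, "garb"))       # True  -> should return "garbage can"
print(matches(text, "garbage c"))  # True  -> should return "garbage can"
print(matches(text, "garbage b"))  # False -> should return nothing
```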
Posted on 2020-08-04 04:12:49
I think you want to do a prefix query. Try the following prefix query:
GET /test_index/_search
{
"query": {
"prefix": {
"my_keyword": {
"value": "garbage b"
}
}
}
}

However, the performance of this prefix query is not good.
You can also try the following approach with a custom prefix analyzer. First, create a new index:
PUT /test_index
{
"settings": {
"index": {
"number_of_shards": "1",
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": "1",
"max_gram": "20"
}
},
"analyzer": {
"autocomplete": {
"filter": [
"lowercase",
"autocomplete_filter"
],
"type": "custom",
"tokenizer": "keyword"
}
}
},
"number_of_replicas": "1"
}
},
"mappings": {
"properties": {
"my_text": {
"analyzer": "autocomplete",
"type": "text"
},
"my_keyword": {
"type": "keyword"
}
}
}
}
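To see why this index works, here is a rough Python simulation of what the keyword tokenizer plus the edge_ngram filter above produce for a field value (a sketch, not the actual Lucene implementation):

```python
def edge_ngrams(text: str, min_gram: int = 1, max_gram: int = 20):
    """Simulate keyword tokenizer + lowercase + edge_ngram filter:
    the whole field is a single token, and the filter emits its
    prefixes with lengths from min_gram to max_gram."""
    token = text.lower()
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]

terms = edge_ngrams("garbage can")
print("garbage c" in terms)  # True  -> a term query for "garbage c" matches
print("garbage b" in terms)  # False -> a term query for "garbage b" does not
```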
Second, insert data into this index:

PUT /test_index/_doc/1
{
"my_text": "garbage can",
"my_keyword": "garbage can"
}

Query with "garbage c":
GET /test_index/_search
{
"query": {
"term": {
"my_text": "garbage c"
}
}
}
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.45802015,
"hits" : [
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.45802015,
"_source" : {
"my_text" : "garbage can",
"my_keyword" : "garbage can"
}
}
]
}
}

Query with "garbage b":
GET /test_index/_search
{
"query": {
"term": {
"my_text": "garbage b"
}
}
}
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}

If you don't want to use a prefix query, you can try the following wildcard query. Keep in mind that its performance is not good; you can also try an n-gram analyzer to optimize it.
GET /test_index/_search
{
"query": {
"wildcard": {
"my_keyword": {
"value": "*garbage c*"
}
}
}
}
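The wildcard pattern above behaves like shell-style globbing, where `*` matches any sequence of characters; Python's `fnmatchcase` can sketch the check (a simulation only; Elasticsearch evaluates the pattern against each keyword term):

```python
from fnmatch import fnmatchcase

# "*garbage c*" matches any keyword containing the substring "garbage c".
print(fnmatchcase("garbage can", "*garbage c*"))       # True
print(fnmatchcase("some garbage can", "*garbage c*"))  # True
print(fnmatchcase("garbage bin", "*garbage c*"))       # False
```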
New edited part

I'm not sure whether this is what you really want this time...
Anyway, try the following mapping and queries:
1. Create the index
PUT /test_index
{
"settings": {
"index": {
"max_ngram_diff": 50,
"number_of_shards": "1",
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "ngram",
"min_gram": 1,
"max_gram": 51,
"token_chars": [
"letter",
"digit"
]
}
},
"analyzer": {
"autocomplete": {
"filter": [
"lowercase",
"autocomplete_filter"
],
"type": "custom",
"tokenizer": "keyword"
}
}
},
"number_of_replicas": "1"
}
},
"mappings": {
"properties": {
"my_text": {
"analyzer": "autocomplete",
"type": "text"
},
"my_keyword": {
"type": "keyword"
}
}
}
}

2. Insert some samples
PUT /test_index/_doc/1
{
"my_text": "test garbage can",
"my_keyword": "test garbage can"
}
PUT /test_index/_doc/2
{
"my_text": "garbage",
"my_keyword": "garbage"
}

3. Query
GET /test_index/_search
{
"query": {
"term": {
"my_text": "bage c"
}
}
}
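A rough Python simulation of what the ngram filter above indexes shows why the term query for "bage c" hits document 1 but not document 2 (a sketch under the min_gram/max_gram settings above; the filter's token_chars option is ignored here):

```python
def ngrams(text: str, min_gram: int = 1, max_gram: int = 51):
    """Simulate keyword tokenizer + lowercase + ngram filter:
    emit every substring whose length is in [min_gram, max_gram]."""
    token = text.lower()
    return {
        token[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)
    }

print("bage c" in ngrams("test garbage can"))  # True  -> doc 1 matches
print("bage c" in ngrams("garbage"))           # False -> doc 2 does not
```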
Note the max_ngram_diff, min_gram, and max_gram settings.

Posted on 2020-08-04 05:40:00
You can index your data using the edge n-gram tokenizer. You can also use custom token_chars, available in the latest 7.8 release!
See the documentation for more details: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html
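A simplified sketch of what the edge_ngram tokenizer does: it first splits the text wherever a character is outside token_chars (here, letters and digits), then emits prefixes of each resulting token (illustrative simulation, not the Lucene code; gram lengths are example values):

```python
import re

def edge_ngram_tokenize(text: str, min_gram: int = 1, max_gram: int = 10):
    """Split on characters that are not letters/digits (token_chars),
    then emit edge n-grams of each resulting token."""
    tokens = re.findall(r"[A-Za-z0-9]+", text.lower())
    return [
        tok[:n]
        for tok in tokens
        for n in range(min_gram, min(max_gram, len(tok)) + 1)
    ]

print(edge_ngram_tokenize("garbage can", max_gram=4))
# ['g', 'ga', 'gar', 'garb', 'c', 'ca', 'can']
```

Unlike the keyword-tokenizer setup earlier in this thread, this produces per-word prefixes, so "can" alone would also match a prefix query on the second word.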
https://stackoverflow.com/questions/63237202