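I'm using ES with the river plugin, since my data lives in CouchDB, and I'm trying to use nGrams at query time. I have basically everything I need working, except that the query stops behaving as soon as someone types a space. This is because ES tokenizes the query, splitting it on whitespace.
Here is what I need it to do:
Here is what I've done so far, following this question: How to search for a part of a word with ElasticSearch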
curl -X PUT 'localhost:9200/_river/myDB/_meta' -d '
{
  "type" : "couchdb",
  "couchdb" : {
    "host" : "localhost",
    "port" : 5984,
    "db" : "myDB",
    "filter" : null
  },
  "index" : {
    "index" : "myDB",
    "type" : "myDB",
    "bulk_size" : "100",
    "bulk_timeout" : "10ms",
    "analysis" : {
      "index_analyzer" : {
        "my_index_analyzer" : {
          "type" : "custom",
          "tokenizer" : "standard",
          "filter" : ["lowercase", "mynGram"]
        }
      },
      "search_analyzer" : {
        "my_search_analyzer" : {
          "type" : "custom",
          "tokenizer" : "standard",
          "filter" : ["standard", "lowercase", "mynGram"]
        }
      },
      "filter" : {
        "mynGram" : {
          "type" : "nGram",
          "min_gram" : 2,
          "max_gram" : 50
        }
      }
    }
  }
}
'
Then I add a mapping for sorting:
curl -s -XGET 'localhost:9200/myDB/myDB/_mapping'
{
  "sorting": {
    "Title": {
      "fields": {
        "Title": {
          "type": "string"
        },
        "untouched": {
          "include_in_all": false,
          "index": "not_analyzed",
          "type": "string"
        }
      },
      "type": "multi_field"
    },
    "Year": {
      "fields": {
        "Year": {
          "type": "string"
        },
        "untouched": {
          "include_in_all": false,
          "index": "not_analyzed",
          "type": "string"
        }
      },
      "type": "multi_field"
    }
  }
}
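}'
I've included all of this information just for completeness. Anyway, with this setup I thought it should work, but whenever I try to get some results, the space is still used to split my query. For example: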
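http://localhost:9200/myDB/myDB/_search?q=Title:(Hello%20Wor)&pretty=true
returns anything that contains "Hello" and "Wor" (I don't usually use the parentheses, but I saw them in an example, and the results seem very similar either way).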
Any help is really appreciated, since this has been bugging me a lot.
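To see why the space matters, here is a rough Python sketch of what an analyzer chain like the one configured above does to a query. It is only an approximation: splitting on whitespace stands in for the standard tokenizer, the function names `ngrams` and `analyze` are made up for illustration, and the order in which real Elasticsearch emits the grams may differ.

```python
def ngrams(token, min_gram=2, max_gram=50):
    """Return every substring of `token` with length in [min_gram, max_gram]."""
    grams = []
    for size in range(min_gram, min(max_gram, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def analyze(text, min_gram=2, max_gram=50):
    """Approximate the chain: tokenize, lowercase, then nGram each token."""
    terms = []
    # Assumption: whitespace splitting is a simplified stand-in for the
    # standard tokenizer, which also splits on punctuation.
    for token in text.split():
        terms.extend(ngrams(token.lower(), min_gram, max_gram))
    return terms


if __name__ == "__main__":
    # Each word is expanded on its own, so no term ever spans the space.
    print(analyze("Hello Wor"))
```

Because the words are split before the nGram filter runs, "Hello Wor" turns into grams of "hello" plus grams of "wor", and no single term covers the phrase across the space. If those terms are then combined with OR, any document sharing even one gram can match.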
UPDATE: In the end, I realized I didn't need nGrams at all. A normal index does the job; it is enough to replace the spaces in the query with 'AND'.
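A minimal sketch of that rewriting step in Python, assuming the search is still sent through the `q=` URL parameter used earlier. The helper names `rewrite_query` and `search_url` are hypothetical, the wrapping in `(*` and `*)` simply mirrors the example in this update, and the operator is written in uppercase because Lucene query syntax only recognizes uppercase boolean operators.

```python
from urllib.parse import quote


def rewrite_query(user_input):
    """Join the words of the user's input with AND and wrap them in (* ... *)."""
    words = user_input.split()  # collapses runs of whitespace
    return "(*" + " AND ".join(words) + "*)"


def search_url(user_input, field="Title",
               base="http://localhost:9200/myDB/myDB/_search"):
    """Build a URL-encoded search URL for the rewritten query."""
    query = field + ":" + rewrite_query(user_input)
    return base + "?q=" + quote(query) + "&pretty=true"


if __name__ == "__main__":
    print(rewrite_query("Hello World"))
    print(search_url("Hello World"))
```

Empty input is not handled here; a real caller would probably want to skip the search in that case.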
Example:
Query: "Hello World" ---> Replaced as "(*Hello AND World*)"
Answered on 2012-10-27 20:32:19
I don't have an Elasticsearch setup at hand right now, but maybe this part of the docs helps?
http://www.elasticsearch.org/guide/reference/query-dsl/match-query.html
Types of Match Queries
boolean
The default match query is of type boolean. It means that the text provided is analyzed and the analysis process constructs a boolean query from the provided text. The operator flag can be set to "or" or "and" to control the boolean clauses (defaults to "or").
The analyzer can be set to control which analyzer will perform the analysis process on the text. It defaults to the field's explicit mapping definition, or the default search analyzer.
fuzziness can be set to a value (depending on the relevant type; for string types it should be a value between 0.0 and 1.0) to construct fuzzy queries for each term analyzed. The prefix_length and max_expansions can be set in this case to control the fuzzy process. If the fuzzy option is set, the query will use constant_score_rewrite as its rewrite method; the rewrite parameter controls how the query gets rewritten.
Here is an example when providing additional parameters (note the slight change in structure, message is the field name):
{
  "match" : {
    "message" : {
      "query" : "this is a test",
      "operator" : "and"
    }
  }
}
https://stackoverflow.com/questions/13103605
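Applied to the question, the match query with the operator set to "and" could be built along these lines. This is only a sketch: the helper name `build_match_query` is hypothetical, the `Title` field and the "Hello Wor" text come from the question, and the resulting JSON would be sent as the request body to the index's `_search` endpoint.

```python
import json


def build_match_query(field, text, operator="and"):
    """Return a search body whose match query requires all analyzed terms."""
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    "operator": operator,
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body for the question's example input.
    print(json.dumps(build_match_query("Title", "Hello Wor"), indent=2))
```

With "and", every term produced by analyzing the text has to match, instead of any one of them as with the default "or".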