Q: Elasticsearch - River and nGrams

Stack Overflow user

Asked 2012-10-27 19:49:43

Answers: 1, Views: 700, Followers: 0, Votes: 1

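I am using ES with the river plugin because my data lives in CouchDB, and I am trying to use nGrams at query time. I have basically everything I need working, except that the query stops working properly as soon as someone types a space. This is because ES tokenizes every element of the query, splitting it on whitespace.

What I need is:

Match part of the text in a string: query: "Hello", response: "Hello, Hello" / exclude "Hello, World, Word";
Sort the results by the criteria I specify;
Be case-insensitive.

Here is what I have done, following this question: How to search for a part of a word with ElasticSearch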

curl -X PUT  'localhost:9200/_river/myDB/_meta' -d '
{
"type" : "couchdb",
"couchdb" : {
    "host" : "localhost",
    "port" : 5984,
    "db" : "myDB",
    "filter" : null
},
   "index" : {
    "index" : "myDB",
    "type" : "myDB",
    "bulk_size" : "100",
    "bulk_timeout" : "10ms",
    "analysis" : {
               "index_analyzer" : {
                          "my_index_analyzer" : {
                                        "type" : "custom",
                                        "tokenizer" : "standard",
                                        "filter" : ["lowercase", "mynGram"]
                          }
               },
               "search_analyzer" : {
                          "my_search_analyzer" : {
                                        "type" : "custom",
                                        "tokenizer" : "standard",
                                        "filter" : ["standard", "lowercase", "mynGram"]
                          }
               },
               "filter" : {
                        "mynGram" : {
                                   "type" : "nGram",
                                   "min_gram" : 2,
                                   "max_gram" : 50
                        }
               }
    }
}
}
'
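To see why a space still splits the query with settings like the ones above, here is a rough Python simulation of the analyzer chain (tokenize, lowercase, nGram). It is only a sketch: `str.split()` stands in for the standard tokenizer, which is an assumption (the real tokenizer also strips punctuation), and the function names `ngrams` and `analyze` are made up for illustration.

```python
def ngrams(token, min_gram=2, max_gram=50):
    """Return every substring of `token` whose length is in [min_gram, max_gram]."""
    grams = []
    for size in range(min_gram, min(max_gram, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def analyze(text, min_gram=2, max_gram=50):
    """Simplified stand-in for the chain: whitespace split, lowercase, nGram."""
    tokens = text.lower().split()  # assumption: whitespace split approximates the tokenizer
    return [gram for token in tokens for gram in ngrams(token, min_gram, max_gram)]


if __name__ == "__main__":
    # Grams are produced per token, so none of them spans the space.
    print(analyze("Hello Wor"))
```

Because the grams are generated per token, "Hello Wor" yields grams of "hello" and grams of "wor" separately; no single gram contains the space, which matches the behaviour described in the question.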

Then I add a mapping for sorting:

curl -s -XGET 'localhost:9200/myDB/myDB/_mapping' 
{
"sorting": {
       "Title": {
            "fields": {
                "Title": {
                     "type": "string"
                  }, 
                "untouched": {
                    "include_in_all": false, 
                    "index": "not_analyzed", 
                    "type": "string"
                    }
               }, 
              "type": "multi_field"
         },
        "Year": {
              "fields": {
                   "Year": {
                       "type": "string"
                       }, 
                       "untouched": {
                           "include_in_all": false, 
                           "index": "not_analyzed", 
                           "type": "string"
                         }
                     }, 
                    "type": "multi_field"
        }
     }
    }
   }'
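For reference, a search body that sorts on the not_analyzed sub-field of a multi_field mapping like the one above might be built as follows. This is a hedged sketch: the helper name `sorted_search_body` is hypothetical, and it assumes the sub-field is addressable as `Title.untouched`.

```python
import json


def sorted_search_body(query_text, sort_field="Title.untouched", order="asc"):
    """Build a search body that sorts on the raw (not_analyzed) sub-field."""
    return {
        "query": {"query_string": {"query": query_text}},
        # assumption: the multi_field sub-field is reachable as "<field>.untouched"
        "sort": [{sort_field: {"order": order}}],
    }


if __name__ == "__main__":
    print(json.dumps(sorted_search_body("Title:Hello"), indent=2))
```

Sorting on the untouched sub-field rather than the analyzed one keeps the order based on the whole original value instead of on individual tokens.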

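I have included everything I use just for completeness. With this setup I would expect it to work, but whenever I try to get results, the space is still used to split my query. For example: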

  http://localhost:9200/myDB/myDB/_search?q=Title:(Hello%20Wor)&pretty=true

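returns anything that contains "Hello" and "Wor" (I normally don't use the parentheses, but I saw them in an example; the results seem very similar either way).

Any help is really appreciated, as this has been bothering me a lot.

Update: In the end, I realized that I didn't need nGrams. A normal index is enough; simply replacing the spaces in the query with 'AND' does the job.

Example: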

 Query: "Hello World"  --->  Replaced as "(*Hello And World*)"
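The replacement described in the update could be sketched like this in Python. The helper names `build_and_query` and `search_url` are hypothetical, and the sketch writes the operator in upper case (`AND`), on the understanding that the Lucene query syntax only treats upper-case operators as boolean operators.

```python
from urllib.parse import quote


def build_and_query(field, user_input):
    """Join the whitespace-separated terms with AND and wrap them in wildcards."""
    terms = user_input.split()
    if not terms:
        raise ValueError("empty query")
    return f"{field}:(*{' AND '.join(terms)}*)"


def search_url(base, field, user_input):
    """Build a URI search request, percent-encoding the query string."""
    return f"{base}/_search?q={quote(build_and_query(field, user_input))}&pretty=true"


if __name__ == "__main__":
    print(build_and_query("Title", "Hello World"))
    print(search_url("http://localhost:9200/myDB/myDB", "Title", "Hello World"))
```

With this, the two terms are both required, so the space no longer turns the search into an either-or match.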

Tags: lucene, couchdb, elasticsearch, n-gram, database

1 Answer

Stack Overflow user

Answered 2012-10-27 20:32:19

I don't have an Elasticsearch setup at hand right now, but might this from the docs help?

http://www.elasticsearch.org/guide/reference/query-dsl/match-query.html

Types of Match Queries

boolean

The default match query is of type boolean. It means that the text provided is analyzed and the analysis process constructs a boolean query from the provided text. The operator flag can be set to or or and to control the boolean clauses (defaults to or).

The analyzer can be set to control which analyzer will perform the analysis process on the text. It defaults to the field's explicit mapping definition, or the default search analyzer.

fuzziness can be set to a value (depending on the relevant type; for string types it should be a value between 0.0 and 1.0) to construct fuzzy queries for each term analyzed. The prefix_length and max_expansions can be set in this case to control the fuzzy process. If the fuzzy option is set, the query will use constant_score_rewrite as its rewrite method; the rewrite parameter allows control over how the query will get rewritten.

Here is an example when providing additional parameters (note the slight change in structure, message is the field name):

{
    "match" : {
        "message" : {
            "query" : "this is a test",
            "operator" : "and"
        }
    }
}
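Applied to the question, the match query above could be assembled programmatically along these lines. This is a minimal sketch, not a tested setup: the helper `match_all_terms` is hypothetical, and the field name `Title` is taken from the question.

```python
import json


def match_all_terms(field, text, analyzer=None):
    """Build a match query that requires every analyzed term (operator "and")."""
    clause = {"query": text, "operator": "and"}
    if analyzer is not None:
        # optional: override the analyzer applied to the query text
        clause["analyzer"] = analyzer
    return {"query": {"match": {field: clause}}}


if __name__ == "__main__":
    # The printed JSON could be sent as the body of a _search request.
    print(json.dumps(match_all_terms("Title", "Hello World"), indent=2))
```

Setting the operator to "and" means a document has to contain every term produced from the query text, rather than any one of them as with the default "or".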

Votes: 1

Original page content provided by Stack Overflow; translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/13103605
