Elasticsearch fuzzy query ignores boost factor?

Stack Overflow user

Asked 2015-02-21 14:18:04

2 answers, 2.6K views, 0 followers, 0 votes

When I run this query:

GET /index_for_test/_search
{
    "query": {
        "multi_match": {
            "query":       "Italian",
            "type":        "most_fields",
            "fields":      [ "name^2", "categories" ]
        }
    }
}

It returns this result:

{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 0.04012554,
      "hits": [
         {
            "_index": "index_for_test",
            "_type": "business",
            "_id": "1269493995",
            "_score": 0.04012554,
            "_source": {
               "name": "Bono Italian Restaurant",
               "categories": [
                  "Pizza"
               ]
            }
         },
         {
            "_index": "index_for_test",
            "_type": "business",
            "_id": "2017788160",
            "_score": 0.014542127,
            "_source": {
               "name": "Pizza Perperook",
               "categories": [
                  "Italian Food"
               ]
            }
         }
      ]
   }
}

However, when I add fuzziness to this query:

GET /index_for_test/_search
{
    "query": {
        "multi_match": {
            "query":       "Italian",
            "type":        "most_fields",
            "fields":      [ "name^2", "categories" ],
            "fuzziness":2
        }
    }
}

It ignores the boost factor and returns this result:

{
   "took": 28,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 0.095891505,
      "hits": [
         {
            "_index": "index_for_test",
            "_type": "business",
            "_id": "2017788160",
            "_score": 0.095891505,
            "_source": {
               "name": "Pizza Perperook",
               "categories": [
                  "Italian Food"
               ]
            }
         },
         {
            "_index": "index_for_test",
            "_type": "business",
            "_id": "1269493995",
            "_score": 0.076713204,
            "_source": {
               "name": "Bono Italian Restaurant",
               "categories": [
                  "Pizza"
               ]
            }
         }
      ]
   }
}

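Since I boost the name field by a factor of 2 (using name^2 in fields), it should return the same ordering as the first query, but it seems to ignore the boost factor.

I tried other query types (query_string, fuzzy_like_this) and ran into the same problem.

Edit: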

GET /index_for_test/_search?explain=true
{
    "query": {
        "multi_match": {
            "query":       "پیتزا",
            "type":        "most_fields",
            "fields":      [ "name^2", "categories" ]
        }
    }
}

Result of the fuzzy search with ?explain=true:

{
   "took": 25,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 0.05015693,
      "hits": [
         {
            "_shard": 1,
            "_node": "ZTZ37EpAR1W9e4Qqwk0O5Q",
            "_index": "index_for_test",
            "_type": "business",
            "_id": "2017788160",
            "_score": 0.05015693,
            "_source": {
               "name": "پیتزا پرپروک",
               "categories": [
                  "غذای ایتالیایی"
               ]
            },
            "_explanation": {
               "value": 0.05015693,
               "description": "product of:",
               "details": [
                  {
                     "value": 0.10031386,
                     "description": "sum of:",
                     "details": [
                        {
                           "value": 0.10031386,
                           "description": "weight(name:پیتزا^2.0 in 0) [PerFieldSimilarity], result of:",
                           "details": [
                              {
                                 "value": 0.10031386,
                                 "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
                                 "details": [
                                    {
                                       "value": 0.5230591,
                                       "description": "queryWeight, product of:",
                                       "details": [
                                          {
                                             "value": 2,
                                             "description": "boost"
                                          },
                                          {
                                             "value": 0.30685282,
                                             "description": "idf(docFreq=1, maxDocs=1)"
                                          },
                                          {
                                             "value": 0.8522964,
                                             "description": "queryNorm"
                                          }
                                       ]
                                    },
                                    {
                                       "value": 0.19178301,
                                       "description": "fieldWeight in 0, product of:",
                                       "details": [
                                          {
                                             "value": 1,
                                             "description": "tf(freq=1.0), with freq of:",
                                             "details": [
                                                {
                                                   "value": 1,
                                                   "description": "termFreq=1.0"
                                                }
                                             ]
                                          },
                                          {
                                             "value": 0.30685282,
                                             "description": "idf(docFreq=1, maxDocs=1)"
                                          },
                                          {
                                             "value": 0.625,
                                             "description": "fieldNorm(doc=0)"
                                          }
                                       ]
                                    }
                                 ]
                              }
                           ]
                        }
                     ]
                  },
                  {
                     "value": 0.5,
                     "description": "coord(1/2)"
                  }
               ]
            }
         },
         {
            "_shard": 2,
            "_node": "ZTZ37EpAR1W9e4Qqwk0O5Q",
            "_index": "index_for_test",
            "_type": "business",
            "_id": "1269493995",
            "_score": 0.023267403,
            "_source": {
               "name": "رستوران ایتالیایی بونو",
               "categories": [
                  "پیتزا"
               ]
            },
            "_explanation": {
               "value": 0.023267403,
               "description": "product of:",
               "details": [
                  {
                     "value": 0.046534806,
                     "description": "sum of:",
                     "details": [
                        {
                           "value": 0.046534806,
                           "description": "weight(categories:پیتزا in 0) [PerFieldSimilarity], result of:",
                           "details": [
                              {
                                 "value": 0.046534806,
                                 "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
                                 "details": [
                                    {
                                       "value": 0.15165187,
                                       "description": "queryWeight, product of:",
                                       "details": [
                                          {
                                             "value": 0.30685282,
                                             "description": "idf(docFreq=1, maxDocs=1)"
                                          },
                                          {
                                             "value": 0.49421698,
                                             "description": "queryNorm"
                                          }
                                       ]
                                    },
                                    {
                                       "value": 0.30685282,
                                       "description": "fieldWeight in 0, product of:",
                                       "details": [
                                          {
                                             "value": 1,
                                             "description": "tf(freq=1.0), with freq of:",
                                             "details": [
                                                {
                                                   "value": 1,
                                                   "description": "termFreq=1.0"
                                                }
                                             ]
                                          },
                                          {
                                             "value": 0.30685282,
                                             "description": "idf(docFreq=1, maxDocs=1)"
                                          },
                                          {
                                             "value": 1,
                                             "description": "fieldNorm(doc=0)"
                                          }
                                       ]
                                    }
                                 ]
                              }
                           ]
                        }
                     ]
                  },
                  {
                     "value": 0.5,
                     "description": "coord(1/2)"
                  }
               ]
            }
         },
         {
            "_shard": 3,
            "_node": "ZTZ37EpAR1W9e4Qqwk0O5Q",
            "_index": "index_for_test",
            "_type": "business",
            "_id": "1203656733",
            "_score": 0.023267403,
            "_source": {
               "name": "چمن",
               "categories": [
                  "پیتزا"
               ]
            },
            "_explanation": {
               "value": 0.023267403,
               "description": "product of:",
               "details": [
                  {
                     "value": 0.046534806,
                     "description": "sum of:",
                     "details": [
                        {
                           "value": 0.046534806,
                           "description": "weight(categories:پیتزا in 0) [PerFieldSimilarity], result of:",
                           "details": [
                              {
                                 "value": 0.046534806,
                                 "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
                                 "details": [
                                    {
                                       "value": 0.15165187,
                                       "description": "queryWeight, product of:",
                                       "details": [
                                          {
                                             "value": 0.30685282,
                                             "description": "idf(docFreq=1, maxDocs=1)"
                                          },
                                          {
                                             "value": 0.49421698,
                                             "description": "queryNorm"
                                          }
                                       ]
                                    },
                                    {
                                       "value": 0.30685282,
                                       "description": "fieldWeight in 0, product of:",
                                       "details": [
                                          {
                                             "value": 1,
                                             "description": "tf(freq=1.0), with freq of:",
                                             "details": [
                                                {
                                                   "value": 1,
                                                   "description": "termFreq=1.0"
                                                }
                                             ]
                                          },
                                          {
                                             "value": 0.30685282,
                                             "description": "idf(docFreq=1, maxDocs=1)"
                                          },
                                          {
                                             "value": 1,
                                             "description": "fieldNorm(doc=0)"
                                          }
                                       ]
                                    }
                                 ]
                              }
                           ]
                        }
                     ]
                  },
                  {
                     "value": 0.5,
                     "description": "coord(1/2)"
                  }
               ]
            }
         }
      ]
   }
}

elasticsearch

fuzzy-search

2 Answers

Stack Overflow user

Accepted answer

Answered 2015-02-21 15:59:01

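The boost is not being ignored. You have simply added a fuzzy component to the score, which changes the overall ranking. If you run the query with ?explain=true, you get a debug dump of how each score is constructed.

The first query requires exact matches. Combined with most_fields, the scoring is relatively simple: find the documents with the most exact matches in the most fields.

Your second query introduces a fuzziness of two edits. This means that any word within two character edits of the query term will match, which can drastically change the number of matching tokens.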
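To make "within two character edits" concrete, here is a minimal Python sketch of plain Levenshtein distance. It is only an illustration: the function names `edit_distance` and `fuzzy_matches` and the sample tokens are hypothetical, and Elasticsearch's real fuzzy matching runs inside Lucene and may also count a swap of two adjacent characters as a single edit, which this sketch does not.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_matches(query: str, terms: List[str], max_edits: int = 2) -> List[str]:
    """Return the terms that lie within max_edits of the query."""
    return [t for t in terms if edit_distance(query, t) <= max_edits]


# Hypothetical lowercased tokens, roughly as an analyzer might produce them.
terms = ["italian", "italia", "itlaian", "pizza", "restaurant"]
print(fuzzy_matches("italian", terms))
```

With a budget of two edits, misspellings such as "italia" and "itlaian" match alongside the exact term, so more tokens can contribute to each document's score.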

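If you post the explain debug output, I can help analyze it and give you a more precise explanation, but basically the answer is: the boost still works, and your scores have changed only because of the fuzzy matching.

Votes: 1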
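The explain output posted in the question can be checked by hand. The Python sketch below recombines the factors listed there (boost, idf, queryNorm, tf, fieldNorm, coord) for two of the hits. The numbers are copied from that output; the helper name `explain_score` is hypothetical, and treating a missing boost entry as 1 is an assumption. It suggests the ^2 boost does enter the score, as a factor of queryWeight for the name field.

```python
import math


def explain_score(boost, idf, query_norm, tf, field_norm, coord):
    """Recombine the factors listed in the explain output into a final score."""
    query_weight = boost * idf * query_norm   # "queryWeight, product of:"
    field_weight = tf * idf * field_norm      # "fieldWeight in 0, product of:"
    return query_weight * field_weight * coord


# Hit 2017788160: matched on name, which lists a boost of 2.
name_hit = explain_score(boost=2, idf=0.30685282, query_norm=0.8522964,
                         tf=1, field_norm=0.625, coord=0.5)

# Hit 1269493995: matched on categories, no boost entry (assumed to be 1).
categories_hit = explain_score(boost=1, idf=0.30685282, query_norm=0.49421698,
                               tf=1, field_norm=1, coord=0.5)

# Both agree with the reported _score values up to float rounding.
assert math.isclose(name_hit, 0.05015693, rel_tol=1e-5)
assert math.isclose(categories_hit, 0.023267403, rel_tol=1e-5)
print(name_hit, categories_hit)
```

Note that queryNorm and fieldNorm also differ between the two hits, so the boost is one of several factors that decide the final ordering.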

Stack Overflow user

Answered 2015-02-21 17:25:50

As Zach suggested, I changed my query to the following to get the results I wanted:

GET /index_for_test/_search
{
    "query": {
      "bool": {
        "should": [
          {
            "multi_match": {
              "query":     "Italian",
              "type":      "most_fields",
              "fields":    [ "name^2", "categories" ],
              "boost":     10
            }
          },
          {
            "multi_match": {
              "query":     "Italian",
              "type":      "most_fields",
              "fields":    [ "name^2", "categories" ],
              "fuzziness": 2
            }
          }
        ]
      }
    }
}
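
For anyone who builds this request in code, here is a rough Python sketch that assembles the same body as a dictionary. The helper name `build_exact_plus_fuzzy_query` and its parameters are hypothetical, and nothing is sent to a cluster. The idea, as far as I can tell, is that the boost of 10 on the exact clause lets exact matches outweigh fuzzy-only matches, while the fuzzy clause still brings in near misses.

```python
import json


def build_exact_plus_fuzzy_query(text, fields, exact_boost=10, fuzziness=2):
    """Build a bool/should body with one exact and one fuzzy multi_match clause."""
    def clause(**extra):
        # Shared multi_match settings; extra carries either boost or fuzziness.
        body = {"query": text, "type": "most_fields", "fields": list(fields)}
        body.update(extra)
        return {"multi_match": body}

    return {
        "query": {
            "bool": {
                "should": [
                    clause(boost=exact_boost),    # exact matches, weighted up
                    clause(fuzziness=fuzziness),  # fuzzy matches, default weight
                ]
            }
        }
    }


body = build_exact_plus_fuzzy_query("Italian", ["name^2", "categories"])
print(json.dumps(body, indent=2))
```

The value 10 is just what worked on this small test index; a different data set may need a different ratio between the two clauses.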

Votes: 2

Original content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/28646800
