Q: How to sort a result set by the order of matched words

Stack Overflow user

Asked 2019-09-11 14:46:44

Answers: 1, Views: 126, Followers: 0, Votes: 1

How can I sort the result set by the order in which the search words match?

I have a couple of words: "heinz meyer"

My query returns:

Heinz A. Meyer
Heinz Meyer GmbH Heizung-Sanitär
Heinz Meyer
Karl-Heinz Meyer GmbH

But I need the results sorted by the position of the matched words, like this:

Heinz Meyer
Heinz Meyer GmbH Heizung-Sanitär
Heinz A. Meyer
Karl-Heinz Meyer GmbH

My query is:

    {
        "query": {
            "bool": {
                "must": [{
                    "wildcard": {
                        "name": "heinz*"
                    }
                }, {
                    "wildcard": {
                        "name": "meyer*"
                    }
                }],
                "must_not": [],
                "should": [],
                "filter": {
                    "bool": {
                        "must": [{
                            "range": {
                                "latestRevenueStatistics.revenue": {
                                    "gte": "0",
                                    "lte": "40000000"
                                }
                            }
                        }, {
                            "range": {
                                "latestRevenueStatistics.number_of_employees": {
                                    "gte": "0",
                                    "lte": "300"
                                }
                            }
                        }, {
                            "term": {
                                "addresses.postal_code_length": 5
                            }
                        }]
                    }
                }
            }
        },
        "from": 0,
        "size": 10
    }

Final solution:

{
    "query": {
        "bool": {
            "must": [{
                "wildcard": {
                    "name": "heinz*"
                }
            }, {
                "wildcard": {
                    "name": "meyer*"
                }
            }, {
                "span_near": {
                    "clauses": [{
                        "span_term": {
                            "name": {
                                "value": "heinz"
                            }
                        }
                    }, {
                        "span_term": {
                            "name": {
                                "value": "meyer"
                            }
                        }
                    }],
                    "slop": 4,
                    "in_order": true
                }
            }],
            "must_not": [],
            "should": [{
                "span_first": {
                    "match": {
                        "span_term": {
                            "name": "heinz"
                        }
                    },
                    "end": 1
                }
            }, {
                "span_first": {
                    "match": {
                        "span_term": {
                            "name": "meyer"
                        }
                    },
                    "end": 2
                }
            }],
            "filter": {
                "bool": {
                    "must": [{
                        "range": {
                            "latestRevenueStatistics.revenue": {
                                "gte": "0",
                                "lte": "40000000"
                            }
                        }
                    }, {
                        "range": {
                            "latestRevenueStatistics.number_of_employees": {
                                "gte": "0",
                                "lte": "300"
                            }
                        }
                    }, {
                        "term": {
                            "addresses.postal_code_length": 5
                        }
                    }]
                }
            }
        }
    },
    "from": 0,
    "size": 10
}

elasticsearch

1 Answer

Stack Overflow user

Accepted answer

Answered 2019-09-11 17:17:27

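You can get this ordering by combining the Span First, Span Term, and Span Near queries.

For simplicity, I created a sample index with a single field called name, of type text, and the documents below.

Documents: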

POST sortindex/_doc/1
{
  "name": "Heinz A. Meyer"
}

POST sortindex/_doc/2
{
  "name": "Heinz Meyer GmbH Heizung-Sanitär"
}

POST sortindex/_doc/3
{
  "name": "Heinz Meyer"
}

POST sortindex/_doc/4
{
  "name": "Karl-Heinz Meyer GmbH"
}

Query:

POST sortindex/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "span_near": {               <---- Span Near Query
            "clauses": [
              {
                "span_term": {         <---- Span Term Query
                  "name": {
                    "value": "heinz"
                  }
                }
              },
              {
                "span_term": {
                  "name": {
                    "value": "meyer"
                  }
                }
              }
            ],
            "slop": 4,                 <---- Retrieve all docs having both heinz and meyer with distance of <= 4 words
            "in_order": true           <---- Heinz must always come before Meyer 
          }     
        }
      ],
      "should": [
        {
          "span_first": {              <---- Span First Query
            "match": {
              "span_term": {           <---- Span Term Query
                "name": "heinz"
              }
            },
            "end": 1                   <---- Boosts docs where heinz ends at position <= 1, i.e. it is the first word
          }
        }
      ]
    }
  }
}

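Note that Span Near is placed in the must clause, while Span First is placed in the should clause. That way, documents that also match the should clause get a higher score than documents that do not.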
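The must/should split described above can also be generated from a list of search words. The following is only a minimal sketch: the function name `build_ordered_name_query` and its parameters are hypothetical, and it only builds the request body as a Python dict without sending it to a cluster.

```python
import json


def build_ordered_name_query(terms, field="name", slop=4):
    """Build a bool query with span_near in must and span_first in should.

    Hypothetical helper for illustration only. Span Term is not analyzed,
    so the terms are lowercased here on the assumption that the field uses
    the standard analyzer, which lowercases tokens at index time.
    """
    if not terms:
        raise ValueError("at least one term is required")
    lowered = [t.lower() for t in terms]
    span_terms = [{"span_term": {field: {"value": t}}} for t in lowered]
    return {
        "query": {
            "bool": {
                # Required: all terms, in the given order, within `slop` positions.
                "must": [
                    {
                        "span_near": {
                            "clauses": span_terms,
                            "slop": slop,
                            "in_order": True,
                        }
                    }
                ],
                # Optional boost: the first term is the first token of the field.
                "should": [
                    {
                        "span_first": {
                            "match": {"span_term": {field: lowered[0]}},
                            "end": 1,
                        }
                    }
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ordered_name_query(["Heinz", "Meyer"]), indent=2))
```

The printed JSON has the same shape as the query above and can be pasted into a `_search` request body.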

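Internally, the search uses Span Term, which is essentially a term query, but one made specifically for use inside Span queries.

If you want to learn more about Span queries, I suggest going through the linked documentation.

From the link:

Span queries are low-level positional queries which provide expert control over the order and proximity of the specified terms. These are typically used to implement very specific queries on legal documents or patents.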
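To make the positional behaviour more concrete, here is a rough Python simulation of the two checks on the four sample names. It is only a sketch: the regex tokenizer is an approximation of the standard analyzer, the helper names are made up for this example, and it says nothing about how Elasticsearch computes the actual scores.

```python
import re

NAMES = [
    "Heinz A. Meyer",
    "Heinz Meyer GmbH Heizung-Sanitär",
    "Heinz Meyer",
    "Karl-Heinz Meyer GmbH",
]


def tokenize(text):
    """Approximate the standard analyzer: lowercase and split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def span_first(tokens, term, end):
    """True if `term` occurs at a position whose span ends at or before `end`."""
    return any(tok == term and pos + 1 <= end for pos, tok in enumerate(tokens))


def span_near_in_order(tokens, first, second, slop):
    """True if `first` comes before `second` with at most `slop` tokens in between."""
    firsts = [i for i, tok in enumerate(tokens) if tok == first]
    seconds = [i for i, tok in enumerate(tokens) if tok == second]
    return any(0 <= j - i - 1 <= slop for i in firsts for j in seconds)


if __name__ == "__main__":
    for name in NAMES:
        tokens = tokenize(name)
        print(
            name,
            tokens,
            "near:", span_near_in_order(tokens, "heinz", "meyer", 4),
            "first:", span_first(tokens, "heinz", 1),
        )
```

Under these assumptions all four names pass the Span Near check, and only "Karl-Heinz Meyer GmbH" fails the Span First check, because "heinz" is its second token.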

Response:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 0.38327998,
    "hits" : [
      {
        "_index" : "sortindex",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.38327998,
        "_source" : {
          "name" : "Heinz Meyer"
        }
      },
      {
        "_index" : "sortindex",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.26893127,
        "_source" : {
          "name" : "Heinz Meyer GmbH Heizung-Sanitär"
        }
      },
      {
        "_index" : "sortindex",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.25940484,
        "_source" : {
          "name" : "Heinz A. Meyer"
        }
      },
      {
        "_index" : "sortindex",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.19908611,
        "_source" : {
          "name" : "Karl-Heinz Meyer GmbH"
        }
      }
    ]
  }
}

You can go ahead and add the above query to the query you already have.
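One possible way to do that merge programmatically is sketched below. The helper `add_span_clauses` is hypothetical, the existing query is abbreviated to a single filter clause, and the sketch assumes that `must` and `should` are lists, as they are in the question's query.

```python
import copy


def add_span_clauses(existing, must_clauses, should_clauses):
    """Return a copy of `existing` with extra must/should clauses appended.

    Hypothetical helper: assumes `existing["query"]["bool"]` exists and that
    its must/should entries, if present, are lists.
    """
    merged = copy.deepcopy(existing)
    bool_query = merged["query"]["bool"]
    bool_query.setdefault("must", []).extend(must_clauses)
    bool_query.setdefault("should", []).extend(should_clauses)
    return merged


# Abbreviated version of the question's query (only one filter clause kept).
existing = {
    "query": {
        "bool": {
            "must": [
                {"wildcard": {"name": "heinz*"}},
                {"wildcard": {"name": "meyer*"}},
            ],
            "should": [],
            "filter": {
                "bool": {"must": [{"term": {"addresses.postal_code_length": 5}}]}
            },
        }
    },
    "from": 0,
    "size": 10,
}

span_near = {
    "span_near": {
        "clauses": [
            {"span_term": {"name": {"value": "heinz"}}},
            {"span_term": {"name": {"value": "meyer"}}},
        ],
        "slop": 4,
        "in_order": True,
    }
}
span_first = {"span_first": {"match": {"span_term": {"name": "heinz"}}, "end": 1}}

merged = add_span_clauses(existing, [span_near], [span_first])
```

The filter section is carried over unchanged, so the range and term restrictions of the original query still apply.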

Hope this helps!

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei engine for the IT domain.

Original link:

https://stackoverflow.com/questions/57891689
