Q: How to sort prefix matches so that matches on the leftmost word come first

Stack Overflow user

Asked on 2015-11-30 11:47:05

Answers: 1, Views: 46, Followers: 0, Votes: 0

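Explanation

I want to sort the results of a prefix query by the word that matched, giving priority to matches in the words further to the left.

Tests I ran

Data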

DELETE /test
PUT /test

PUT /test/person/_mapping
{
  "properties": {
    "name": {
      "type": "multi_field",
      "fields": {
        "name": {"type": "string"},
        "original": {
          "type": "string", 
          "index": "not_analyzed"
        }
      }
    }
  }
}

PUT /test/person/1
{"name": "Berta Kassulke"}

PUT /test/person/2
{"name": "Kaley Bartoletti"}

PUT /test/person/3
{"name": "Kali Hahn"}

PUT /test/person/4
{"name": "Karolann Klein"}

PUT /test/person/5
{"name": "Sofia Mandez Kaloo"}

The mapping was added for the "sorting on the original value" test.

Simple query

Query

POST /test/person/_search
{
  "query": {
    "prefix": {"name": {"value": "ka"}}
  }
}

Result

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "test",
        "_type": "person",
        "_id": "4",
        "_score": 1,
        "_source": {
          "name": "Karolann Klein"
        }
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "5",
        "_score": 1,
        "_source": {
          "name": "Sofia Mandez Kaloo"
        }
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "Berta Kassulke"
        }
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "Kaley Bartoletti"
        }
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "Kali Hahn"
        }
      }
    ]
  }
}

Sorting

Request

POST /test/person/_search
{
  "query": {
    "prefix": {"name": {"value": "ka"}}
  },
  "sort": {"name": {"order": "asc"}}
}

Result

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": null,
    "hits": [
      {
        "_index": "test",
        "_type": "person",
        "_id": "2",
        "_score": null,
        "_source": {
          "name": "Kaley Bartoletti"
        },
        "sort": [
          "bartoletti"
        ]
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "1",
        "_score": null,
        "_source": {
          "name": "Berta Kassulke"
        },
        "sort": [
          "berta"
        ]
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "3",
        "_score": null,
        "_source": {
          "name": "Kali Hahn"
        },
        "sort": [
          "hahn"
        ]
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "5",
        "_score": null,
        "_source": {
           "name": "Sofia Mandez Kaloo"
        },
        "sort": [
          "kaloo"
        ]
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "4",
        "_score": null,
        "_source": {
          "name": "Karolann Klein"
        },
        "sort": [
          "karolann"
        ]
      }
    ]
  }
}

Sorting on the original value

Query

POST /test/person/_search
{
  "query": {
    "prefix": {"name": {"value": "ka"}}
  },
  "sort": {"name.original": {"order": "asc"}}
}

Result

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": null,
    "hits": [
      {
        "_index": "test",
        "_type": "person",
        "_id": "1",
        "_score": null,
        "_source": {
          "name": "Berta Kassulke"
        },
        "sort": [
          "Berta Kassulke"
        ]
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "2",
        "_score": null,
        "_source": {
          "name": "Kaley Bartoletti"
        },
        "sort": [
          "Kaley Bartoletti"
        ]
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "3",
        "_score": null,
        "_source": {
          "name": "Kali Hahn"
        },
        "sort": [
          "Kali Hahn"
        ]
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "4",
        "_score": null,
        "_source": {
          "name": "Karolann Klein"
        },
        "sort": [
          "Karolann Klein"
        ]
      },
      {
        "_index": "test",
        "_type": "person",
        "_id": "5",
        "_score": null,
        "_source": {
           "name": "Sofia Mandez Kaloo"
        },
        "sort": [
          "Sofia Mandez Kaloo"
        ]
      }
    ]
  }
}

Expected result

Sorted by name ASC, but prioritizing matches on the leftmost words:

Kaley Bartoletti
Kali Hahn
Karolann Klein
Berta Kassulke
Sofia Mandez Kaloo
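
To make the expected ordering concrete, here is a minimal Python sketch of the ranking rule outside Elasticsearch. It is only an illustration of the desired behaviour, not a query: the helper name `leftmost_match_key` is made up for this example, and words are split on whitespace rather than with a real analyzer.

```python
def leftmost_match_key(name, prefix):
    """Sort key: (index of the first word starting with prefix, lowercased name)."""
    words = name.lower().split()
    positions = [i for i, w in enumerate(words) if w.startswith(prefix.lower())]
    # Names with no matching word sort last; a prefix query would exclude them anyway.
    first = positions[0] if positions else len(words)
    return (first, name.lower())


names = [
    "Berta Kassulke",
    "Kaley Bartoletti",
    "Kali Hahn",
    "Karolann Klein",
    "Sofia Mandez Kaloo",
]

# Matches on the first word come first, then the second word, and so on;
# ties are broken alphabetically by the full name.
ranked = sorted(names, key=lambda n: leftmost_match_key(n, "ka"))
for name in ranked:
    print(name)
```

Applied to the five sample names with the prefix "ka", this key reproduces the order listed above.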

sorting

elasticsearch

prefix

1 Answer

Stack Overflow user

Accepted answer

Answered on 2015-12-01 04:20:06

Good question. One way to achieve this is to combine an edge n-gram filter with span_first queries.

These are my settings

{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_custom_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase",
                        "edge_filter",
                        "asciifolding"
                    ]
                }
            },
            "filter": {
                "edge_filter": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 8
                }

            }

        }
    },
    "mappings": {
        "person": {
            "properties": {
                "name": {
                    "type": "string",
                    "analyzer": "my_custom_analyzer",
                    "search_analyzer": "standard",
                    "fields": {
                        "standard": {
                            "type": "string"
                        }
                    }
                }
            }
        }

    }
}
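
As a rough illustration of what these settings index, the Python sketch below approximates the analyzer: it splits on whitespace (standing in for the standard tokenizer), lowercases, and emits edge n-grams of 2 to 8 characters. It assumes that every n-gram keeps the position of the word it came from, which is what lets a position-based span query tell the first word from the second; asciifolding is left out because the sample names are plain ASCII. The function names are hypothetical.

```python
def edge_ngrams(token, min_gram=2, max_gram=8):
    """Return the leading substrings of token with lengths min_gram..max_gram."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def analyze(text):
    """Approximate my_custom_analyzer as a list of (position, gram) pairs."""
    tokens = []
    for position, word in enumerate(text.lower().split()):
        for gram in edge_ngrams(word):
            # Assumption: each gram shares the position of its source word.
            tokens.append((position, gram))
    return tokens


for name in ["Kali Hahn", "Berta Kassulke"]:
    print(name, analyze(name))
```

Because the search analyzer is standard, the search text "ka" is not n-grammed and is looked up as the single term "ka", which is indexed at position 0 for "Kali Hahn" and at position 1 for "Berta Kassulke".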

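After that, I inserted your sample documents. Then I wrote the following query using dis_max. Note that the end parameter of the first span_first query is 1, so matches on the leftmost word get priority (a higher score). I sort by _score first and then by name.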

{
  "query": {
    "dis_max": {
      "tie_breaker": 0.7,
      "boost": 1.2,
      "queries": [
        {
          "match": {
            "name": "ka"
          }
        },
        {
          "span_first": {
            "match": {
              "span_term": {
                "name": "ka"
              }
            },
            "end": 1
          }
        },
        {
          "span_first": {
            "match": {
              "span_term": {
                "name": "ka"
              }
            },
            "end": 2
          }
        }
      ]
    }
  },
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    },
    {
      "name.standard": {
        "order": "asc"
      }
    }
  ]
}
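
The sketch below is a simplified Python model of why this query favours leftmost matches. It counts how many of the three clauses a name satisfies (a span_first clause with end N only accepts a term whose span ends at or before position N, that is, a word at 0-based index below N) and then combines per-clause scores the way dis_max does: the best score plus tie_breaker times the others, multiplied by boost. The per-clause score of 1.0 is a made-up placeholder, so the numbers will not match real Elasticsearch scores; only the relative order is meaningful.

```python
def first_match_position(name, prefix):
    """0-based index of the first word starting with prefix, or None."""
    for i, word in enumerate(name.lower().split()):
        if word.startswith(prefix):
            return i
    return None


def matching_clauses(name, prefix="ka", ends=(1, 2)):
    """Count matching clauses: the match query plus each satisfied span_first."""
    pos = first_match_position(name, prefix)
    if pos is None:
        return 0
    # A term at index pos occupies the span [pos, pos + 1).
    return 1 + sum(1 for end in ends if pos + 1 <= end)


def dis_max(scores, tie_breaker=0.7, boost=1.2):
    """Best clause score plus tie_breaker times the rest, scaled by boost."""
    if not scores:
        return 0.0
    best = max(scores)
    return boost * (best + tie_breaker * (sum(scores) - best))


for name in ["Kali Hahn", "Berta Kassulke", "Sofia Mandez Kaloo"]:
    n = matching_clauses(name)
    # Placeholder: every matching clause is given the same score of 1.0.
    print(name, n, dis_max([1.0] * n))
```

In this model a first-word match satisfies all three clauses, a second-word match satisfies two, and a third-word match satisfies only the match clause, so the combined score decreases as the match moves to the right.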

The results I get

"hits": [
         {
            "_index": "esedge",
            "_type": "policy_data",
            "_id": "2",
            "_score": 0.72272325,
            "_source": {
               "name": "Kaley Bartoletti"
            },
            "sort": [
               0.72272325,
               "bartoletti"
            ]
         },
         {
            "_index": "esedge",
            "_type": "policy_data",
            "_id": "3",
            "_score": 0.72272325,
            "_source": {
               "name": "Kali Hahn"
            },
            "sort": [
               0.72272325,
               "hahn"
            ]
         },
         {
            "_index": "esedge",
            "_type": "policy_data",
            "_id": "4",
            "_score": 0.72272325,
            "_source": {
               "name": "Karolann Klein"
            },
            "sort": [
               0.72272325,
               "karolann"
            ]
         },
         {
            "_index": "esedge",
            "_type": "policy_data",
            "_id": "1",
            "_score": 0.54295504,
            "_source": {
               "name": "Berta Kassulke"
            },
            "sort": [
               0.54295504,
               "berta"
            ]
         },
         {
            "_index": "esedge",
            "_type": "policy_data",
            "_id": "5",
            "_score": 0.2905494,
            "_source": {
               "name": "Sofia Mandez Kaloo"
            },
            "sort": [
               0.2905494,
               "kaloo"
            ]
         }
      ]

I hope this helps.

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/33997915
