Filtering and sorting based on properties in the terms lookup document in Elasticsearch

Stack Overflow user
Asked 2019-09-19 22:46:55
Answers: 1, views: 253, followers: 0, votes: 0

I have some documents in my index:

POST /index/thing/_bulk
{ "index":{ "_id": 1 } }
{ "title":"One thing"}
{ "index":{ "_id": 2 } }
{ "title":"Second thing"}
{ "index":{ "_id": 3 } }
{ "title":"Three things"}
{ "index":{ "_id": 4 } }
{ "title":"And so fourth"}
{ "index":{ "_id": 5 } }
{ "title":"Five things"}

I also have documents that hold a user's collection. Each one links to the other documents (the content) through their document ids, like this:

PUT /index/collection/1
{
    "items": [
        {"id": 1, "time_added": "2017-08-07T09:07:15.000Z", "condition": "fair"},
        {"id": 3, "time_added": "2019-08-07T09:07:15.000Z", "condition": "good"},
        {"id": 4, "time_added": "2016-08-07T09:07:15.000Z", "condition": "poor"}
    ]
}

I then use a terms lookup to fetch everything in a user's collection, like this:

GET /documents/_search
{
    "query" : {
        "terms" : {
            "_id" : {
                "index" : "index",
                "type" : "collection",
                "id" : 1,
                "path" : "items.id"
            }
        }
    }
}

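This works fine. I get the three documents in the collection and can search, sort, and aggregate them as I like.

But is there a way to aggregate, filter, and sort these documents based on properties in the collection document (time_added and condition in this case)? Say I want to sort by time_added, or filter on condition == "good", using the values from the collection.

Maybe a script could be applied to the collection to sort or filter its items? This feels very close to SQL, like a left join, so maybe Elasticsearch is the wrong tool?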
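To make the limitation behind the question concrete, here is a minimal Python sketch of what a terms lookup does conceptually: it reads only the values found at `path` in the lookup document and uses them as terms. This is a toy model, not Elasticsearch code; `lookup_terms`, `things`, and `collection_doc` are hypothetical names introduced for illustration.

```python
# Conceptual model of a terms lookup (not Elasticsearch internals).
collection_doc = {
    "items": [
        {"id": 1, "time_added": "2017-08-07T09:07:15.000Z", "condition": "fair"},
        {"id": 3, "time_added": "2019-08-07T09:07:15.000Z", "condition": "good"},
        {"id": 4, "time_added": "2016-08-07T09:07:15.000Z", "condition": "poor"},
    ]
}

# The "thing" documents, keyed by _id.
things = {
    1: "One thing",
    2: "Second thing",
    3: "Three things",
    4: "And so fourth",
    5: "Five things",
}


def lookup_terms(doc, path):
    """Collect the values found at a two-part dotted path such as 'items.id'."""
    field, leaf = path.split(".")
    return [item[leaf] for item in doc[field]]


ids = lookup_terms(collection_doc, "items.id")
hits = [{"_id": i, "title": things[i]} for i in ids if i in things]

print(ids)
print(hits)
```

In this model only the ids cross over from the collection document. `time_added` and `condition` never reach the hits, which is why they are not available for sorting or filtering on the search side.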

1 Answer

Stack Overflow user

Answered 2019-09-20 01:25:43

It looks like you need the nested data type.

Using your data as an example:

Without the nested type

POST collection/_bulk?filter_path=_
{"index":{}}
{"items":[{"id":11,"time_added":"2017-08-07T09:07:15.000Z","condition":"fair"},{"id":13,"time_added":"2019-08-07T09:07:15.000Z","condition":"good"},{"id":14,"time_added":"2016-08-07T09:07:15.000Z","condition":"poor"}]}
{"index":{}}
{"items":[{"id":21,"time_added":"2017-09-07T09:07:15.000Z","condition":"fair"},{"id":23,"time_added":"2019-09-07T09:07:15.000Z","condition":"good"},{"id":24,"time_added":"2016-09-07T09:07:15.000Z","condition":"poor"}]}
{"index":{}}
{"items":[{"id":31,"time_added":"2017-10-07T09:07:15.000Z","condition":"fair"},{"id":33,"time_added":"2019-10-07T09:07:15.000Z","condition":"good"},{"id":34,"time_added":"2016-10-07T09:07:15.000Z","condition":"poor"}]}
{"index":{}}
{"items":[{"id":41,"time_added":"2017-11-07T09:07:15.000Z","condition":"fair"},{"id":43,"time_added":"2019-11-07T09:07:15.000Z","condition":"good"},{"id":44,"time_added":"2016-11-07T09:07:15.000Z","condition":"poor"}]}
{"index":{}}
{"items":[{"id":51,"time_added":"2017-12-07T09:07:15.000Z","condition":"fair"},{"id":53,"time_added":"2019-12-07T09:07:15.000Z","condition":"good"},{"id":54,"time_added":"2016-12-07T09:07:15.000Z","condition":"poor"}]}

Query (you will get incorrect results: one hit expected, five returned):

GET collection/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "items.condition": {
              "value": "good"
            }
          }
        },
        {
          "range": {
            "items.time_added": {
              "lte": "2019-09-01"
            }
          }
        }
      ]
    }
  }
}
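The five hits come from how arrays of objects are indexed without the nested type: each field's values are collected per document, so the pairing of `condition` and `time_added` within a single item is lost. The Python sketch below models that behavior on the same five documents. It is a simplified illustration, not Elasticsearch code; `make_doc`, `matches_flattened`, `matches_nested`, and `CUTOFF` are hypothetical names, and the cutoff approximates `"lte": "2019-09-01"` as "before 2019-09-02".

```python
from datetime import datetime


def make_doc(month):
    """Build one collection document like those in the bulk request."""
    return {
        "items": [
            {"condition": "fair", "time_added": datetime(2017, month, 7, 9, 7, 15)},
            {"condition": "good", "time_added": datetime(2019, month, 7, 9, 7, 15)},
            {"condition": "poor", "time_added": datetime(2016, month, 7, 9, 7, 15)},
        ]
    }


docs = [make_doc(m) for m in range(8, 13)]  # August through December
CUTOFF = datetime(2019, 9, 2)  # assumption: stands in for "lte": "2019-09-01"


def matches_flattened(doc):
    """Each field is checked against all values in the document independently."""
    conditions = [item["condition"] for item in doc["items"]]
    times = [item["time_added"] for item in doc["items"]]
    return "good" in conditions and any(t < CUTOFF for t in times)


def matches_nested(doc):
    """Both clauses must hold for the same item."""
    return any(
        item["condition"] == "good" and item["time_added"] < CUTOFF
        for item in doc["items"]
    )


flattened_hits = sum(matches_flattened(d) for d in docs)
nested_hits = sum(matches_nested(d) for d in docs)
print(flattened_hits, nested_hits)  # prints: 5 1
```

Every document has some "good" item and some item from 2016, so the flattened check accepts all five. Only the August document has a single item that satisfies both clauses.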

Aggregation (incorrect results: look at the first bucket, "2016-08-01T00:00:00.000Z"; it contains three CONDITION sub-buckets, one for each condition value):

GET collection/_search
{
  "size": 0,
  "aggs": {
    "DATE": {
      "date_histogram": {
        "field": "items.time_added",
        "calendar_interval": "month"
      },
      "aggs": {
        "CONDITION": {
          "terms": {
            "field": "items.condition.keyword",
            "size": 10
          }
        }
      }
    }
  }
}
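The extra sub-buckets have the same cause. Without the nested type, a document lands in a date bucket if any of its `time_added` values falls in that month, and the sub-aggregation then sees all of that document's `condition` values. A rough Python model of the two behaviors follows, assuming simplified month strings instead of real dates; `histogram_flattened` and `histogram_nested` are hypothetical helpers, not Elasticsearch APIs.

```python
from collections import defaultdict


def make_doc(month):
    """Build one collection document, keeping only the month of time_added."""
    mm = f"{month:02d}"
    return {
        "items": [
            {"condition": "fair", "month": f"2017-{mm}"},
            {"condition": "good", "month": f"2019-{mm}"},
            {"condition": "poor", "month": f"2016-{mm}"},
        ]
    }


docs = [make_doc(m) for m in range(8, 13)]


def histogram_flattened(docs):
    """Bucket whole documents: every month value pairs with every condition value."""
    buckets = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        months = {item["month"] for item in doc["items"]}
        conditions = {item["condition"] for item in doc["items"]}
        for month in months:
            for condition in conditions:
                buckets[month][condition] += 1
    return buckets


def histogram_nested(docs):
    """Bucket individual items: each item contributes its own month and condition."""
    buckets = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        for item in doc["items"]:
            buckets[item["month"]][item["condition"]] += 1
    return buckets


flat = histogram_flattened(docs)
nested = histogram_nested(docs)
print(dict(flat["2016-08"]))
print(dict(nested["2016-08"]))
```

In the flattened model the 2016-08 bucket reports all three conditions, while the per-item model reports only "poor", which is the item that was actually added that month.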

With the nested type

DELETE collection

PUT collection
{
  "mappings": {
    "properties": {
      "items": {
        "type": "nested"
      }
    }
  }
}

# and POST the same data from above

Query (returns only one result):

GET collection/_search
{
  "query": {
    "nested": {
      "path": "items",
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "items.condition": {
                  "value": "good"
                }
              }
            },
            {
              "range": {
                "items.time_added": {
                  "lte": "2019-09-01"
                }
              }
            }
          ]
        }
      }
    }
  }
}

Aggregation (the first date bucket contains only one CONDITION sub-bucket):

GET collection/_search
{
  "size": 0,
  "aggs": {
    "ITEMS": {
      "nested": {
        "path": "items"
      },
      "aggs": {
        "DATE": {
          "date_histogram": {
            "field": "items.time_added",
            "calendar_interval": "month"
          },
          "aggs": {
            "CONDITION": {
              "terms": {
                "field": "items.condition.keyword",
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
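The question also asked about sorting. Assuming the nested mapping above, a nested sort with a filter may be one way to order collection documents by the `time_added` of their "good" items. The sketch below only builds the request body as a Python dict and prints it as JSON; it does not call any client library, and `nested_sort_request` is a hypothetical helper name.

```python
import json


def nested_sort_request(condition, order="desc"):
    """Build a search body that filters and sorts on the same nested items."""
    item_filter = {"term": {"items.condition": condition}}
    return {
        "query": {"nested": {"path": "items", "query": item_filter}},
        "sort": [
            {
                "items.time_added": {
                    "order": order,
                    # Only items passing the filter contribute sort values.
                    "nested": {"path": "items", "filter": item_filter},
                }
            }
        ],
    }


body = nested_sort_request("good")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of `GET collection/_search`. Note that this orders the collection documents themselves; it does not reorder the results of a terms lookup against another index.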

Hope this helps :)

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/58013654
