I have some data from a provider: very large, structured JSON data:
"mappings": {
"properties": {
"field_a": { .. },
"field_b": { .. },
"field_c": { .. },
"field_d": {
"properties": {
"subfield_a": {...},
"subfield_b": {...},
"subfield_c": {...},
"subfield_d": {...},
"subfield_e": {
"properties": {
"myfield": {
"type": "keyword"
},
"another_a": {...},
"another_b": {...},
}
}
}
}
}
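}
subfield_e is an array of objects with many fields; the one I am interested in is "myfield".
I need an aggregation over "myfield" that includes only the values containing a certain string.
So what I am doing right now gives the wrong (though logical) result: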
GET /index/_search
{
"query": {
"wildcard": {
"field_d.subfield_e.myfield": "*string*"
}
},
"aggs": {
"interest": {
"terms": {
"field": "field_d.subfield_e.myfield",
"size": 10
}
}
},
"size": 0
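}
The problem with this query is that it selects every document whose "subfield_e" array contains an object whose myfield contains the string, and then aggregates over those documents. As a result, I get every "myfield" value found in those documents, not just the myfield values that contain the string.
I tried adding a bucket_selector aggregation after the main aggregation, but I got the error: "buckets_path must reference either a number value or a single value numeric metric aggregation, got: String at _key".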
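As far as I understand the error, bucket_selector only accepts numeric paths such as _count, which is why a string _key is rejected. A minimal sketch of a request it does accept, where the sub-aggregation name min_count and the threshold of 5 are arbitrary choices for illustration and not part of my real query:

```
# Illustrative only: "min_count" and the threshold 5 are hypothetical
GET /index/_search
{
  "size": 0,
  "aggs": {
    "interest": {
      "terms": {
        "field": "field_d.subfield_e.myfield",
        "size": 10
      },
      "aggs": {
        "min_count": {
          "bucket_selector": {
            "buckets_path": {
              "count": "_count"
            },
            "script": "params.count >= 5"
          }
        }
      }
    }
  }
}
```

This keeps only buckets with at least 5 documents; it filters on a number, so it does not help with filtering on the key text.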
My code, inspired by "Filter Elasticsearch Aggregation by Bucket Key Value", now looks like this:
GET /index/_search
{
"query": {
"wildcard": {
"field_d.subfield_e.myfield": "*string*"
}
},
"aggs": {
"interest": {
"terms": {
"field": "field_d.subfield_e.myfield",
"size": 10
},
"aggs": {
"buckets": {
"bucket_selector": {
"buckets_path": {
"key": "_key"
},
"script": "params.key.contains('string')"
}
}
}
}
},
"size": 0
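}
So how can I filter aggregation buckets (terms aggs) by a string key?
Posted on 2021-03-18 18:34:25
I solved this by converting subfield_e to a nested object, rather than leaving it as an untyped array of objects, and re-importing all the data into the new mapping.
The current mapping looks like this: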
"mappings": {
"properties": {
"field_a": { .. },
"field_b": { .. },
"field_c": { .. },
"field_d": {
"properties": {
"subfield_a": {...},
"subfield_b": {...},
"subfield_c": {...},
"subfield_d": {...},
"subfield_e": {
"type": "nested" <======= This line added
"properties": {
"myfield": {
"type": "keyword"
},
"another_a": {...},
"another_b": {...},
}
}
}
}
}
}
The final working query is:
GET /index/_search
{
"query": {
"nested": {
"path": "field_d.subfield_e",
"query": {
"wildcard": {
"field_d.subfield_e.myfield": {
"value": "*string*"
}
}
}
}
},
"aggs": {
"agg": {
"nested": {
"path": "field_d.subfield_e"
},
"aggs": {
"inner": {
"filter": {
"wildcard": {
"field_d.subfield_e.myfield": "*string*"
}
}, "aggs": {
"interest": {
"terms": {
"field": "field_d.subfield_e.myfield",
"size": 10
}
}
}
}
}
}
},
"size": 0
}
In my case, this query is much faster than using a terms aggregation with include/exclude.
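For comparison, the include/exclude alternative could be sketched roughly as follows. This assumes the original, non-nested mapping, and the regular expression ".*string.*" is only an illustrative stand-in for the real pattern. The "include" parameter of the terms aggregation restricts the buckets to keys matching the pattern:

```
# Sketch only: assumes the original (non-nested) mapping; the regex is illustrative
GET /index/_search
{
  "query": {
    "wildcard": {
      "field_d.subfield_e.myfield": "*string*"
    }
  },
  "aggs": {
    "interest": {
      "terms": {
        "field": "field_d.subfield_e.myfield",
        "include": ".*string.*",
        "size": 10
      }
    }
  },
  "size": 0
}
```

The wildcard query still narrows the set of documents, while "include" drops the unrelated myfield values that those documents also carry.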
https://stackoverflow.com/questions/66687769