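I have data indexed in Elasticsearch like the sample below. I need output grouped by sku_id that contains the average rank over the whole date range, plus the first value of last_7days_avg_rank and the last value of last_7days_avg_rank within that date range as two separate fields, as shown below.
Can anyone tell me whether this is possible in Elasticsearch? The calculation is currently done in the service layer, but response times have become unacceptable, so I want to move this logic into ES itself. I can't figure out how to do that.
Input: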
date sku_id last_7days_avg_rank rank
20180101 S1 200 200
20180102 S1 210 200
20180105 S1 220 200
20180108 S1 230 200
20180101 S2 180 300
20180103 S2 200 300
20180106 S2 250 300
20180107 S2 300 300
Expected output:
sku first_val_last7day_avg last_val_last7days_avg avg(rank)
S1 200 230 200
S2 180 300 300
Thanks!
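Posted on 2018-03-19 07:57:25
You can get the desired result with aggregations: a terms aggregation on sku_id, with an avg sub-aggregation for the rank and two top_hits sub-aggregations (sorted ascending and descending by date) for the first and last documents.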
{
  "size": 0,
  "aggs": {
    "GROUP": {
      "terms": {
        "field": "sku_id"
      },
      "aggs": {
        "AVG_RANK": {
          "avg": {
            "field": "rank"
          }
        },
        "FIRST_7_RANK": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "my_date": {
                  "order": "asc"
                }
              }
            ]
          }
        },
        "LAST_7_RANK": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "my_date": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
This returns a result like the following:
"aggregations": {
  "GROUP": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "S1",
        "doc_count": 40,
        "LAST_7_RANK": {
          "hits": {
            "total": 40,
            "max_score": null,
            "hits": [
              {
                "_index": "index_name",
                "_type": "type_name",
                "_id": "AWI9MU6JeKRzn3ttxGOr",
                "_score": null,
                "_source": {
                  "my_date": "2018-01-08",
                  "sku_id": "S1",
                  "last_7days_avg_rank": 230,
                  "rank": 200
                },
                "sort": [
                  1515369600000
                ]
              }
            ]
          }
        },
        "AVG_RANK": {
          "value": 200
        },
        "FIRST_7_RANK": {
          "hits": {
            "total": 40,
            "max_score": null,
            "hits": [
              {
                "_index": "index_name",
                "_type": "type_name",
                "_id": "AWI9LYVpeKRzn3ttxGOQ",
                "_score": null,
                "_source": {
                  "my_date": "20180101",
                  "sku_id": "S1",
                  "last_7days_avg_rank": 200,
                  "rank": 200
                },
                "sort": [
                  20180101
                ]
              }
            ]
          }
        }
      },
      {
        "key": "S2",
        "doc_count": 40,
        "LAST_7_RANK": {
          "hits": {
            "total": 40,
            "max_score": null,
            "hits": [
              {
                "_index": "index_name",
                "_type": "type_name",
                "_id": "AWI9MU6JeKRzn3ttxGOv",
                "_score": null,
                "_source": {
                  "my_date": "2018-01-07",
                  "sku_id": "S2",
                  "last_7days_avg_rank": 300,
                  "rank": 300
                },
                "sort": [
                  1515283200000
                ]
              }
            ]
          }
        },
        "AVG_RANK": {
          "value": 300
        },
        "FIRST_7_RANK": {
          "hits": {
            "total": 40,
            "max_score": null,
            "hits": [
              {
                "_index": "index_name",
                "_type": "type_name",
                "_id": "AWI9LYVpeKRzn3ttxGOU",
                "_score": null,
                "_source": {
                  "my_date": "20180101",
                  "sku_id": "S2",
                  "last_7days_avg_rank": 180,
                  "rank": 300
                },
                "sort": [
                  20180101
                ]
              }
            ]
          }
        }
      }
    ]
  }
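}
The result above contains two buckets (groups), one for S1 and one for S2. In each bucket, the average rank for that group is in "AVG_RANK" -> "value". For first_val_last7day_avg, follow "FIRST_7_RANK" -> "hits" -> "hits" -> "_source" -> "last_7days_avg_rank"; for last_val_last7days_avg, follow "LAST_7_RANK" -> "hits" -> "hits" -> "_source" -> "last_7days_avg_rank". I hope this helps.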
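As an illustration of how the service layer might use this, here is a minimal Python sketch. It is not part of the original answer. The function names `build_query` and `flatten` are hypothetical. The `range` filter on `my_date` is an added assumption to cover the date range mentioned in the question (the query above has no such filter), and the `_source` include list inside `top_hits` is an optional trimming step. The output keys mirror the column names of the expected output, with `avg_rank` standing in for avg(rank).

```python
from typing import Any, Dict, List


def build_query(date_from: str, date_to: str) -> Dict[str, Any]:
    """Build the aggregation request body, limited to a date range.

    The range filter is an assumption added for illustration; the date
    strings must match whatever format the my_date field is mapped with.
    """

    def edge_hit(order: str) -> Dict[str, Any]:
        # One document per bucket: earliest ("asc") or latest ("desc") by my_date.
        return {
            "top_hits": {
                "size": 1,
                "sort": [{"my_date": {"order": order}}],
                # Optional: return only the field that is actually needed.
                "_source": {"includes": ["last_7days_avg_rank"]},
            }
        }

    return {
        "size": 0,
        "query": {"range": {"my_date": {"gte": date_from, "lte": date_to}}},
        "aggs": {
            "GROUP": {
                "terms": {"field": "sku_id"},
                "aggs": {
                    "AVG_RANK": {"avg": {"field": "rank"}},
                    "FIRST_7_RANK": edge_hit("asc"),
                    "LAST_7_RANK": edge_hit("desc"),
                },
            }
        },
    }


def flatten(aggregations: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn the "aggregations" part of the response into one row per sku."""
    rows = []
    for bucket in aggregations["GROUP"]["buckets"]:
        # Each top_hits aggregation was asked for exactly one hit.
        first = bucket["FIRST_7_RANK"]["hits"]["hits"][0]["_source"]
        last = bucket["LAST_7_RANK"]["hits"]["hits"][0]["_source"]
        rows.append(
            {
                "sku": bucket["key"],
                "first_val_last7day_avg": first["last_7days_avg_rank"],
                "last_val_last7days_avg": last["last_7days_avg_rank"],
                "avg_rank": bucket["AVG_RANK"]["value"],
            }
        )
    return rows
```

The dictionary returned by `build_query` can be sent as the search request body with whichever client is already in use, and `flatten` is then applied to the `"aggregations"` object of the response. Note that a terms aggregation returns only the top 10 buckets by default, so its `size` would likely need to be raised if there are more distinct sku_id values than that.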
https://stackoverflow.com/questions/49325741