Bucket count on Elasticsearch

Stack Overflow user
Asked on 2020-07-04 21:33:51
Answers 1, Views 915, Followers 0, Votes 2

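I'm trying to extract usage statistics for a wearable device. A loyal user is one who has used the wearable on more than 20 of the past 30 days and averages more than 4 hours of use per day. In short: loyal user = (at least 20 days of use) + (average daily use > 4 hours).

In Elasticsearch, the usage documents are indexed by date and hours of use.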

{
  "id": "AL-2930",
  "usage_duration": 4.5,
  "sessionDate": "2020-05-01"
},
{
  "id": "AL-2930",
  "usage_duration": 5.5,
  "sessionDate": "2020-05-02"
},
{
  "id": "AL-2931",
  "usage_duration": 3.5,
  "sessionDate": "2020-05-01"
},
{
  "id": "AL-2931",
  "usage_duration": 3.0,
  "sessionDate": "2020-05-02"
}

The query I'm running gives the correct results.

{
  "aggs": {
    "users": {
      "terms": {
        "field": "id",
        "min_doc_count": 20,
        "order": { "_key": "asc" }
      },
      "aggs": {
        "avg_usage": {
          "avg": {
            "field": "usage_duration"
          }
        },
        "usage_filter": {
          "bucket_selector": {
            "buckets_path": {
              "avgUsage": "avg_usage"
            },
            "script": "params.avgUsage > 4.0"
          }
        }
      }
    }
  }
}

The result I get is as follows:

{
    "took": 15,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2139,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "patients": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 1926,
            "buckets": [
                {
                    "key": "BG-P-A100CR",
                    "doc_count": 24,
                    "avg_usage": {
                        "value": 4.5
                    }
                },
                {
                    "key": "BG-P-A102XF",
                    "doc_count": 24,
                    "avg_usage": {
                        "value": 5.5
                    }
                },
                {
                    "key": "BG-P-A103ZU",
                    "doc_count": 24,
                    "avg_usage": {
                        "value": 5.0
                    }
                },
                {
                    "key": "BG-P-A104IA",
                    "doc_count": 24,
                    "avg_usage": {
                        "value": 6.5
                    }
                },
                {
                    "key": "BG-P-A104ZL",
                    "doc_count": 24,
                    "avg_usage": {
                        "value": 4.5
                    }
                },
                {
                    "key": "BG-P-A106BT",
                    "doc_count": 24,
                    "avg_usage": {
                        "value": 5.0
                    }
                },
                {
                    "key": "BG-P-A110VY",
                    "doc_count": 24,
                    "avg_usage": {
                        "value": 5.5
                    }
                }
            ]
        }
    }
}

What I actually need is a query that returns the total number of buckets found. I tried the answer to a similar question (Count buckets returned by sub aggregation), but it didn't help.


1 Answer

Stack Overflow user

Accepted answer

Posted on 2020-07-05 01:47:43

Would the following help?

POST <your_index_name>/_search
{
  "size": 0,
  "aggs": {
    "users": {
     "terms": {
        "field": "id",
        "min_doc_count": 20,
        "order" : { "_key" : "asc" },
        "size": 100,                       <----- Added this
        "show_term_doc_count_error": true  <----- Added this 
      },
      "aggs": {
        "avg_usage": {
          "avg": {
            "field": "usage_duration"
          }
        },
        "usage_filter": {
          "bucket_selector": {
            "buckets_path": {
              "avgUsage": "avg_usage"
            },
            "script": "params.avgUsage > 4.0"
          }
        },
        "bucket_count":{
          "bucket_script": {
            "buckets_path": {
              "count": "_count"
            },
            "script": "return params.count"
          }
        }
      }
    },
    "mybucketcount":{
      "stats_bucket": {
        "buckets_path":"users._count"
      }
    }
  }
}
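
For anyone sending this request from a script, the body above can be built as a plain Python dict with the two thresholds passed in. This is only a sketch: the function name `build_bucket_count_query` and its parameters are made up for illustration, and the dict simply mirrors the JSON above without talking to a cluster.

```python
import json


def build_bucket_count_query(min_doc_count=20, min_avg_usage=4.0, size=100):
    """Return the search body from the answer, with the thresholds as arguments.

    The function name and parameters are illustrative, not part of any client API.
    """
    return {
        "size": 0,
        "aggs": {
            "users": {
                "terms": {
                    "field": "id",
                    "min_doc_count": min_doc_count,
                    "order": {"_key": "asc"},
                    "size": size,
                    "show_term_doc_count_error": True,
                },
                "aggs": {
                    "avg_usage": {"avg": {"field": "usage_duration"}},
                    # Drops buckets whose average usage is not above the threshold.
                    "usage_filter": {
                        "bucket_selector": {
                            "buckets_path": {"avgUsage": "avg_usage"},
                            "script": f"params.avgUsage > {min_avg_usage}",
                        }
                    },
                    "bucket_count": {
                        "bucket_script": {
                            "buckets_path": {"count": "_count"},
                            "script": "return params.count",
                        }
                    },
                },
            },
            # Sibling pipeline aggregation over the doc counts of the "users" buckets.
            "mybucketcount": {"stats_bucket": {"buckets_path": "users._count"}},
        },
    }


body = build_bucket_count_query(min_doc_count=2, min_avg_usage=3.0)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the request body of `POST <your_index_name>/_search`.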

I ran the query above against the set of documents you mentioned, with "script": "params.avgUsage > 4.0" replaced by "script": "params.avgUsage > 3.0" and min_doc_count set to 2, and I see the following response:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "users" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "AL-2930",
          "doc_count" : 2,
          "avg_usage" : {
            "value" : 5.0
          },
          "bucket_count" : {
            "value" : 2.0
          }
        },
        {
          "key" : "AL-2931",
          "doc_count" : 2,
          "avg_usage" : {
            "value" : 3.25
          },
          "bucket_count" : {
            "value" : 2.0
          }
        }
      ]
    },
    "mybucketcount" : {
      "count" : 2,             <---- Note this.
      "min" : 2.0,
      "max" : 2.0,
      "avg" : 2.0,
      "sum" : 4.0
    }
  }
}
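
If the response is handled in code, the number you are after is the `count` field of `mybucketcount`. The other fields (`min`, `max`, `avg`, `sum`) are statistics over the per-bucket document counts, so `sum` is 4.0 here because the two buckets hold two documents each. A small sketch, assuming the response has already been parsed into a Python dict; the helper name `loyal_user_count` is hypothetical:

```python
def loyal_user_count(response):
    """Return the number of buckets reported by the stats_bucket aggregation."""
    return response["aggregations"]["mybucketcount"]["count"]


# Trimmed copy of the response shown above.
response = {
    "aggregations": {
        "users": {
            "buckets": [
                {"key": "AL-2930", "doc_count": 2, "avg_usage": {"value": 5.0}},
                {"key": "AL-2931", "doc_count": 2, "avg_usage": {"value": 3.25}},
            ]
        },
        "mybucketcount": {"count": 2, "min": 2.0, "max": 2.0, "avg": 2.0, "sum": 4.0},
    }
}

print(loyal_user_count(response))
```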

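I'm assuming you want the total number of buckets returned by the Terms Aggregation, e.g. for users. For that I've just added a Stats Bucket Aggregation (stats_bucket) next to what you already have; its count field is the number of buckets.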
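
To sanity-check the expected count without a cluster, the same steps can be replayed in plain Python on the four sample documents: group by `id`, drop groups with fewer than `min_doc_count` documents, keep groups whose average `usage_duration` is above the threshold, then count what is left. This is only a local stand-in for the aggregation logic (no shards, no `size` limit), and the function name `count_selected_buckets` is made up for illustration.

```python
from collections import defaultdict

# The four sample documents from the question.
docs = [
    {"id": "AL-2930", "usage_duration": 4.5, "sessionDate": "2020-05-01"},
    {"id": "AL-2930", "usage_duration": 5.5, "sessionDate": "2020-05-02"},
    {"id": "AL-2931", "usage_duration": 3.5, "sessionDate": "2020-05-01"},
    {"id": "AL-2931", "usage_duration": 3.0, "sessionDate": "2020-05-02"},
]


def count_selected_buckets(docs, min_doc_count, min_avg_usage):
    """Mimic terms + avg + bucket_selector, then count the surviving buckets."""
    durations = defaultdict(list)
    for doc in docs:
        durations[doc["id"]].append(doc["usage_duration"])

    selected = {}
    for key in sorted(durations):  # same ordering as "_key": "asc"
        values = durations[key]
        if len(values) < min_doc_count:  # terms: min_doc_count
            continue
        avg_usage = sum(values) / len(values)  # avg sub-aggregation
        if avg_usage > min_avg_usage:  # bucket_selector script
            selected[key] = {"doc_count": len(values), "avg_usage": avg_usage}
    return len(selected), selected


count, buckets = count_selected_buckets(docs, min_doc_count=2, min_avg_usage=3.0)
print(count, buckets)
```

With the relaxed thresholds used for the test run (2 documents, average above 3.0) both users survive; with the original 4.0 threshold only AL-2930 would.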

Let me know if this helps!

Votes 4
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/62729791
