Question: too_many_buckets_exception in Elasticsearch

Stack Overflow user

Asked on 2019-10-04 10:40:16

1 answer, 7.1K views, 0 followers, 3 votes

I am facing an issue with an Elasticsearch aggregation. We use RestHighLevelClient to query Elasticsearch.

The exception is -

ElasticsearchStatusException[Elasticsearch exception [type=search_phase_execution_exception, reason=]]; nested: ElasticsearchException[Elasticsearch exception [type=too_many_buckets_exception, reason=Trying to create too many buckets. Must be less than or equal to: [20000] but was [20001]. This limit can be set by changing the [search.max_buckets] cluster level setting.]];

I have already changed search.max_buckets with a PUT request, but I am still facing the issue.

PUT /_cluster/settings
{ "persistent": { "search.max_buckets": 20000 } }

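As per our requirement, we first have to aggregate the data on a per-day basis, then on a per-hour basis, and then on a per-ruleId basis. The aggregation looks like this -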

Day {
    1:00 [
        {
            ruleId : 1,
            count : 20
        },
        {
            ruleId : 2,
            count : 25
        }
    ],
    2:00 [
        {
            ruleId : 1,
            count : 20
        },
        {
            ruleId : 2,
            count : 25
        }
    ]
}
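
To get a feel for why this nesting can exceed the limit, here is a rough worst-case estimate. It is a sketch, not a description of how Elasticsearch counts internally: it assumes every bucket at every level (day, hour, and ruleId) counts toward search.max_buckets, and the class and method names are hypothetical. The 14-day window, the terms size of 10000, and the limit of 20000 are taken from the code and the error message in this question.

```java
final class BucketEstimate {

    // Worst-case bucket count for a day -> hour -> terms nesting,
    // ASSUMING buckets at all three levels count toward the limit.
    static long worstCaseBuckets(int days, int hoursPerDay, int termsPerHour) {
        long dayBuckets = days;
        long hourBuckets = dayBuckets * hoursPerDay;
        long termBuckets = hourBuckets * termsPerHour;
        return dayBuckets + hourBuckets + termBuckets;
    }

    public static void main(String[] args) {
        int days = 14;        // default window used by daysTimeRangeQueryBuilder(14, ...)
        long limit = 20_000;  // value reported in the exception message

        // Distinct ruleId values per hour; 10_000 is the terms size requested in the query.
        for (int terms : new int[] {10, 58, 59, 10_000}) {
            long total = worstCaseBuckets(days, 24, terms);
            System.out.println(terms + " ruleIds/hour -> " + total
                + " buckets, over limit: " + (total > limit));
        }
    }
}
```

Under these assumptions, 14 days of hourly buckets leave room for only 58 distinct ruleId values per hour before the total passes 20000, so the limit can be reached with far fewer terms than the requested size of 10000.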

My current code is -

    final List<DTO> violationCaseMgmtDtos = new ArrayList<>();
        try {
            RangeQueryBuilder queryBuilders =
                (end_timestmp > 0 ? customTimeRangeQueryBuilder(start_timestmp, end_timestmp, generationTime)
                    : daysTimeRangeQueryBuilder(14, generationTime));

            BoolQueryBuilder boolQuery = new BoolQueryBuilder();
            boolQuery.must(queryBuilders);
            boolQuery.must(QueryBuilders.matchQuery("pvGroupBy", true));
            boolQuery.must(QueryBuilders.matchQuery("pvInformation", false));
            TopHitsAggregationBuilder topHitsAggregationBuilder =
                AggregationBuilders.topHits("topHits").docValueField(policyId).sort(generationTime, SortOrder.DESC);

            TermsAggregationBuilder termsAggregation = AggregationBuilders.terms("distinct").field(policyId).size(10000)
                .subAggregation(topHitsAggregationBuilder);

            DateHistogramAggregationBuilder timeHistogramAggregationBuilder =
                AggregationBuilders.dateHistogram("by_hour").field("eventDateTime")
                    .fixedInterval(DateHistogramInterval.HOUR).subAggregation(termsAggregation);

            DateHistogramAggregationBuilder dateHistogramAggregationBuilder =
                AggregationBuilders.dateHistogram("by_day").field("eventDateTime")
                    .fixedInterval(DateHistogramInterval.DAY).subAggregation(timeHistogramAggregationBuilder);

            SearchRequest searchRequest = new SearchRequest(violationDataModel);
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            searchSourceBuilder.aggregation(dateHistogramAggregationBuilder);
            searchSourceBuilder.query(boolQuery);
            searchSourceBuilder.from(offset);
            searchSourceBuilder.size(10000);
            searchRequest.source(searchSourceBuilder);
            SearchResponse searchResponse = null;

            searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

            ParsedDateHistogram parsedDateHistogram = searchResponse.getAggregations().get("by_day");

            parsedDateHistogram.getBuckets().parallelStream().forEach(dayBucket -> {


                ParsedDateHistogram hourBasedData = dayBucket.getAggregations().get("by_hour");

                hourBasedData.getBuckets().parallelStream().forEach(hourBucket -> {

                    // TimeLine timeLine = new TimeLine();
                    String dateTime = hourBucket.getKeyAsString();
                    // long dateInLong = DateUtil.getMiliSecondFromStringDate(dateTime);
                    // timeLine.setViolationEventTime(dateTime);

                    ParsedLongTerms distinctPolicys = hourBucket.getAggregations().get("distinct");
                    distinctPolicys.getBuckets().parallelStream().forEach(policyBucket -> {

                        DTO violationCaseManagementDTO = new DTO();
                        violationCaseManagementDTO.setDataAggregated(true);
                        violationCaseManagementDTO.setEventDateTime(dateTime);
                        violationCaseManagementDTO.setRuleId(Long.valueOf(policyBucket.getKey().toString()));

                        ParsedTopHits parsedTopHits = policyBucket.getAggregations().get("topHits");
                        SearchHit[] searchHits = parsedTopHits.getHits().getHits();
                        SearchHit searchHit = searchHits[0];

                        String source = searchHit.getSourceAsString();
                        ViolationDataModel violationModel = null;
                        try {
                            violationModel = objectMapper.readValue(source, ViolationDataModel.class);
                        } catch (Exception e) {
                            e.printStackTrace();
                        }

                        violationCaseManagementDTO.setRuleName(violationModel.getRuleName());
                        violationCaseManagementDTO.setGenerationTime(violationModel.getGenerationTime());
                        violationCaseManagementDTO.setPriority(violationModel.getPriority());
                        violationCaseManagementDTO.setStatus(violationModel.getViolationStatus());
                        violationCaseManagementDTO.setViolationId(violationModel.getId());
                        violationCaseManagementDTO.setEntity(violationModel.getViolator());
                        violationCaseManagementDTO.setViolationType(violationModel.getViolationEntityType());
                        violationCaseManagementDTO.setIndicatorsOfAttack( (int)
                            (policyBucket.getDocCount() * violationModel.getNoOfViolatedEvents()));
                        violationCaseMgmtDtos.add(violationCaseManagementDTO);

                    });
                  //  violationCaseMgmtDtos.sort((d1,d2) -> d1.getEventDateTime().compareTo(d2.getEventDateTime()));
                });

            });

            List<DTO> realtimeViolation = findViolationWithoutGrouping(start_timestmp,  end_timestmp,  offset,  size);
            realtimeViolation.stream().forEach(action -> violationCaseMgmtDtos.add(action)); 
        } catch (Exception e) {
            e.printStackTrace();
        }

        if (Objects.nonNull(violationCaseMgmtDtos) && violationCaseMgmtDtos.size() > 0) {
            return violationCaseMgmtDtos.stream()
                .filter(violationDto -> Objects.nonNull(violationDto))
                .sorted((d1,d2) -> d2.getEventDateTime().compareTo(d1.getEventDateTime()))
                .collect(Collectors.toList());
        }
        return violationCaseMgmtDtos;
}

Please help me resolve this issue.

elasticsearch-rest-client

java

elasticsearch

1 Answer

Stack Overflow user

Answered on 2021-05-17 22:46:30

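If you are using ES version 7.x.x, you can add a terminate_after clause to your query. It caps the number of documents collected per shard, which in turn limits how many buckets your data can be divided into. This error mostly occurs when the data you are trying to aggregate has a high degree of randomness.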

If your data contains text, it is better to aggregate on the .keyword field (assuming you are using the default settings).

POST your_index/_search
{
  "from": 0,
  "query": {
    "match_all": {}
  },
  "size": 0,
  "sort": [
    {
      "your_target_field": {
        "order": "desc"
      }
    }
  ],
  "terminate_after": 10000,
  "version": true,
  "aggs": {
    "title": {
      "terms": {
        "field": "your_target_field.keyword",
        "size": 10000
      }
    }
  }
}

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/58234784
