Q: Handling an unexpected disk capacity problem with an Elasticsearch cluster on EKS

Stack Overflow user

Asked 2022-01-28 13:27:37

1 answer, 52 views, 0 followers, 1 vote

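I have an Elasticsearch cluster set up in a Kubernetes cluster (EKS). The Elasticsearch cluster has 3 nodes, and I have provisioned an 8E disk for the nodes to store data (so I assumed I would not run into any space problems for the time being).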

[root@es-cluster-0 elasticsearch]# curl -s -XGET http://localhost:9200/_cat/allocation?v
shards disk.indices disk.used disk.avail disk.total disk.percent host         ip           node
    36       66.7gb   966.1gb   8191.9pb   8191.9pb            0 10.65.32.184 10.65.32.184 es-cluster-0
    33       82.6gb   966.1gb   8191.9pb   8191.9pb            0 10.65.32.202 10.65.32.202 es-cluster-2
    37         76gb   966.1gb   8191.9pb   8191.9pb            0 10.65.32.178 10.65.32.178 es-cluster-1
    14                                                                                     UNASSIGNED

The current health of the cluster is:

[root@es-cluster-0 elasticsearch]# curl -s -XGET http://localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "k8s-logs",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 56,
  "active_shards" : 106,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 14,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 88.33333333333333
}

I can see that I have 14 "unassigned_shards", which matches exactly the last row of the /_cat/allocation output above.
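
To see which shards those 14 are and why they are unassigned, something along these lines may help. This is only a sketch: `summarize_unassigned` is a hypothetical helper name, and the `h=` column list is just one selection that the `_cat/shards` API accepts. It assumes the same `localhost:9200` endpoint used in the commands above.

```shell
# Hypothetical helper: reads `_cat/shards` rows on stdin in the column order
#   index shard prirep state unassigned.reason
# and prints how many UNASSIGNED shards there are per reason.
summarize_unassigned() {
  awk '$4 == "UNASSIGNED" { count[$5]++ }
       END { for (r in count) print r, count[r] }' | sort
}

# Usage against a live cluster (assumed endpoint, same as above):
# curl -s 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | summarize_unassigned
```

If every row reports the same reason, the problem is probably a single underlying failure rather than 14 unrelated ones.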

When I started digging into what was going on, I found this:

[root@es-cluster-0 elasticsearch]# curl -s -XGET http://localhost:9200/_cluster/allocation/explain?pretty
{
  "index" : "logstash-2022.01.22",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "ALLOCATION_FAILED",
    "at" : "2022-01-22T00:00:11.254Z",
    "failed_allocation_attempts" : 5,
    "details" : "failed shard on node [bf_GjmcUQGuCTk-_voh4Xw]: failed recovery, failure RecoveryFailedException[[logstash-2022.01.22][0]: Recovery failed from {es-cluster-0}{hYJ4ifx7R7yWJq6VFP3Drw}{jjAAtdcmQXeVpJXxj4DYcA}{10.65.32.184}{10.65.32.184:9300}{dilmrt}{ml.machine_memory=15878057984, ml.max_open_jobs=20, xpack.installed=true, transform.node=true} into {es-cluster-1}{bf_GjmcUQGuCTk-_voh4Xw}{QNp4DD51TQa716D4TjMFPg}{10.65.32.178}{10.65.32.178:9300}{dilmrt}{ml.machine_memory=15878057984, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}]; nested: RemoteTransportException[[es-cluster-0][10.65.32.184:9300][internal:index/shard/recovery/start_recovery]]; nested: RemoteTransportException[[es-cluster-1][10.65.32.178:9300][internal:index/shard/recovery/clean_files]]; nested: UncategorizedExecutionException[Failed execution]; nested: NotSerializableExceptionWrapper[execution_exception: java.io.IOException: Disk quota exceeded]; nested: IOException[Disk quota exceeded]; ",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "7WHft5LVTYCEWvwKM64A-w",
      "node_name" : "es-cluster-2",
      "transport_address" : "10.65.32.202:9300",
      "node_attributes" : {
        "ml.machine_memory" : "15878057984",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true",
        "transform.node" : "true"
      },         
--- TRUNCATED ---
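
The "details" string is long, and the part that matters is the last "nested:" entry. As a rough sketch (the helper name `innermost_cause` is hypothetical, and it assumes the "details" field arrives as a single line, as it does with `?pretty`), the innermost cause can be pulled out like this:

```shell
# Hypothetical helper: prints the innermost "nested:" cause from an
# allocation-explain "details" line read on stdin.
innermost_cause() {
  # Split the line on "nested: ", keep the last piece, then trim the
  # trailing "; " (plus the closing quote and comma of the JSON line, if any).
  awk -F'nested: ' '{ cause = $NF }
       END { sub(/;[ \t]*"?,?[ \t]*$/, "", cause); print cause }'
}

# Usage against a live cluster (assumed endpoint, same as above):
# curl -s 'http://localhost:9200/_cluster/allocation/explain?pretty' | grep '"details"' | innermost_cause
```

For the output shown above, this would reduce the whole chain to the final IOException entry.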

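I don't understand why it reports Disk quota exceeded when, according to /_cat/allocation, the Elasticsearch cluster is correctly reporting its available capacity. Is there any additional configuration I need to set to tell Elasticsearch that there is plenty of space available to use?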

elasticsearch

amazon-eks

amazon-efs

1 answer

Stack Overflow user

Answered 2022-01-28 15:39:46

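See the EFS limits for the constraints that can cause a disk quota error; the error has nothing to do with disk size. In general, EFS does not cope well with a fairly large ES stack. For example, Elasticsearch expects 64K file descriptors per data node instance, but EFS currently supports only 32K. If you look through your Elasticsearch logs, you may be able to find which limit was violated.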
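
As a rough way to check the file descriptor angle, here is a minimal sketch. The 32K figure is taken from the answer text above, `flag_fd_pressure` is a hypothetical helper name, and the 80 percent default threshold is an arbitrary assumption. Open file descriptors are only one of the EFS limits that could be involved, so a clean result here does not rule the others out.

```shell
# Figure taken from the answer above: EFS supports 32K open files per instance.
EFS_OPEN_FILE_LIMIT=32768

# Hypothetical helper: reads "node_name open_fds" pairs on stdin and flags
# nodes whose open file descriptor count is at or above a percentage of
# the limit (the percentage defaults to 80, an arbitrary choice).
flag_fd_pressure() {
  threshold_pct="${1:-80}"
  awk -v limit="$EFS_OPEN_FILE_LIMIT" -v pct="$threshold_pct" '
    { status = ($2 * 100 >= limit * pct) ? "NEAR_LIMIT" : "ok"
      print $1, $2, status }'
}

# Usage against a live cluster (assumed endpoint, same as in the question):
# curl -s 'http://localhost:9200/_cat/nodes?h=name,file_desc.current' | flag_fd_pressure 80
```

The `file_desc.current` column of `_cat/nodes` reports the file descriptors currently used by each node, which is what gets compared to the limit here.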

Votes 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/70894538
