Elasticsearch index write error - EW帮帮网

Elasticsearch index write error

ERROR - {'took': 0, 'errors': True, 'items': [{'create': {'_index': 'xxxx-log-2.0-2022.01.11-000001', '_type': '_doc', '_id': 'wdrDR5YBNUJot4R74noE', 'status': 429, 'error': {'type': 'cluster_block_exception', 'reason': 'index [xxxx-log-2.0-2022.01.11-000001] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];'}}}]}

Key point: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block]

This is a write-block error: Elasticsearch rejected the write because a data node is running out of disk space.

Solution

Root cause analysis

{
  "error": {
    "type": "cluster_block_exception",
    "reason": "index [...] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark..."
  }
}

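• Trigger: Elasticsearch's default disk watermark protection
• flood-stage (the red line): disk usage ≥ 95%
• Once a node crosses this threshold, ES automatically puts every index that has a shard on that node into read-only mode (deletes only) by setting index.blocks.read_only_allow_delete. Since 7.4 the block is released automatically when disk usage drops back below the high watermark.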
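The watermark logic described above can be sketched in a few lines of Python. This is a simplified illustration, not Elasticsearch's implementation: the 85/90/95 thresholds are the defaults quoted in this article, and `watermark_state` is a hypothetical helper name.

```python
# Default thresholds quoted in this article (percent of disk used).
LOW, HIGH, FLOOD_STAGE = 85.0, 90.0, 95.0


def watermark_state(used_percent: float) -> str:
    """Return the highest watermark that the given disk usage has reached."""
    if used_percent >= FLOOD_STAGE:
        return "flood_stage"  # indices on the node become read-only-allow-delete
    if used_percent >= HIGH:
        return "high"         # Elasticsearch tries to relocate shards off the node
    if used_percent >= LOW:
        return "low"          # new shards are generally no longer allocated here
    return "ok"


for usage in (50, 86, 92, 96):
    print(usage, watermark_state(usage))
```

A node at 96% therefore lands in the `flood_stage` branch, which is the state that produces the error shown at the top of this article.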

Emergency steps

1. Check the current disk status

# Show disk usage for every node
GET _cat/allocation?v&h=node,disk.percent,disk.avail,disk.total,disk.indices

# Check the watermark configuration
GET _cluster/settings?include_defaults=true&filter_path=*.cluster.routing.allocation.disk*
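If the check needs to be scripted, the `_cat/allocation` output can be requested as JSON (`format=json`) and filtered. The sketch below only parses a response body. The sample data is made up for illustration, and `nodes_over_threshold` is a hypothetical helper, not part of any Elasticsearch client.

```python
import json


def nodes_over_threshold(cat_allocation_json: str, threshold: float = 90.0) -> list:
    """Return (node, disk.percent) pairs at or above the threshold, worst first."""
    flagged = []
    for row in json.loads(cat_allocation_json):
        percent = row.get("disk.percent")
        if percent is None:  # the UNASSIGNED row carries no disk figures
            continue
        if float(percent) >= threshold:
            flagged.append((row["node"], float(percent)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical response body, shaped like the output of
# GET _cat/allocation?format=json&h=node,disk.percent
sample = (
    '[{"node": "node-1", "disk.percent": "96"},'
    ' {"node": "node-2", "disk.percent": "71"},'
    ' {"node": "UNASSIGNED", "disk.percent": null}]'
)
print(nodes_over_threshold(sample))  # [('node-1', 96.0)]
```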

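2. Temporarily restore writes (emergency)

# Disable the disk threshold check (use with caution: the disk can fill up completely)
# Re-enable it afterwards by setting the value back to null
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.threshold_enabled": false
  }
}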

# Remove the read-only block from the index (replace your_index_name)
PUT your_index_name/_settings
{
  "index.blocks.read_only_allow_delete": null
}
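The same unblock request can be issued from a script. The following is a minimal sketch using only the Python standard library. It assumes an unauthenticated cluster at `http://localhost:9200`, and `build_unblock_request` is a hypothetical helper name. The request is built but not sent, so the snippet is safe to run anywhere.

```python
import json
import urllib.request


def build_unblock_request(base_url: str, index: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that clears the read-only block."""
    body = json.dumps({"index.blocks.read_only_allow_delete": None}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed local, unauthenticated cluster and a placeholder index name.
req = build_unblock_request("http://localhost:9200", "your_index_name")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Clearing the block only helps once disk space has been freed. While the node stays above the flood-stage watermark and the disk check is enabled, Elasticsearch applies the block again.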

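3. Long-term solutions

Approach	Measures
Clean up data	- Delete old indices: `DELETE /old_index_*` (wildcard deletes require `action.destructive_requires_name: false`, which is not the default from 8.0 on) - Use ILM to manage the index lifecycle automatically
Expand storage	- Increase disk capacity - Add new data nodes
Optimize storage	- Enable stronger compression: `"index.codec": "best_compression"` - Disable indexing for fields that are never searched
Adjust watermarks	Change the threshold in elasticsearch.yml (requires a restart) or dynamically through `PUT _cluster/settings` (no restart): `cluster.routing.allocation.disk.watermark.flood_stage: 97%`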

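Preventive configuration recommendations

# elasticsearch.yml
# -------------------------------
# Disk watermark thresholds (adjust as needed)
cluster.routing.allocation.disk.watermark.low: 85%
cluster.routing.allocation.disk.watermark.high: 90%
cluster.routing.allocation.disk.watermark.flood_stage: 95%

# Delete old data automatically (example: an index is deleted 7 days after it rolls over)
# This is a REST request (Kibana Dev Tools), not part of elasticsearch.yml
PUT _ilm/policy/log_retention_policy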
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {"rollover": {"max_size": "50gb", "max_age": "7d"}}
      },
      "delete": {
        "min_age": "7d",
        "actions": {"delete": {}}
      }
    }
  }
}
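One detail of this policy is easy to miss: when a rollover action is used, the delete phase's `min_age` is counted from the rollover, not from index creation. The short sketch below works through that arithmetic under the assumption that rollover is triggered by age rather than by size, ignoring ILM's polling delay. `retention_bounds` is a hypothetical helper.

```python
from datetime import timedelta

# Values taken from the policy above.
ROLLOVER_MAX_AGE = timedelta(days=7)  # hot phase: rollover.max_age
DELETE_MIN_AGE = timedelta(days=7)    # delete phase: min_age, counted from rollover


def retention_bounds(rollover_max_age: timedelta, delete_min_age: timedelta):
    """Return (shortest, longest) time a document can stay on disk.

    A document written just before rollover waits only delete_min_age;
    the first document written into the index also waits for the rollover.
    """
    return delete_min_age, rollover_max_age + delete_min_age


shortest, longest = retention_bounds(ROLLOVER_MAX_AGE, DELETE_MIN_AGE)
print(shortest.days, longest.days)  # 7 14
```

So the policy keeps each document between 7 and 14 days, which matters when sizing disks for the retention window.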

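Monitoring example

# Monitoring with Prometheus (example alerting rule)
# Metric and label names assume the prometheus-community elasticsearch_exporter; adjust them for your exporter
- alert: ElasticsearchDiskFull
  expr: (1 - elasticsearch_filesystem_data_available_bytes / elasticsearch_filesystem_data_size_bytes) * 100 > 90
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "ES node disk is almost full ({{ $value }}% used)"
    description: "Disk usage on node {{ $labels.name }} is above 90%"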

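Notes

After force-disabling the disk check, finish cleaning up data or expanding capacity within 4 hours, then re-enable the check.
The best_compression codec raises CPU load by roughly 10% (an estimate; the actual overhead depends on the workload).
When changing the watermark thresholds, keep at least 5% of buffer space.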
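A proposed set of thresholds can be sanity-checked before it is applied. The sketch below is one possible reading of the advice above: it treats the 5% buffer as free space left above flood_stage, which is an assumption, and `check_watermarks` is a hypothetical helper.

```python
def check_watermarks(low: float, high: float, flood_stage: float,
                     min_headroom: float = 5.0) -> list:
    """Return a list of problems found in a set of percentage watermarks."""
    problems = []
    if not (low < high < flood_stage):
        problems.append("watermarks must satisfy low < high < flood_stage")
    if 100.0 - flood_stage < min_headroom:
        problems.append(
            f"flood_stage leaves only {100.0 - flood_stage:.0f}% headroom; "
            f"at least {min_headroom:.0f}% is recommended"
        )
    return problems


print(check_watermarks(85, 90, 95))  # []
print(check_watermarks(85, 90, 97))  # one headroom warning
```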

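These steps restore service quickly, but the underlying storage capacity problem still has to be solved to keep the protection mechanism from triggering again.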
