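Elasticsearch handles backup and recovery mainly through its Snapshot and Restore features.
1. Core concepts
What is a snapshot?
A backup of a running cluster: index data plus, optionally, the cluster state
Stored in a repository, which can live on several kinds of storage
Incremental by design: the first snapshot copies everything, and each later one copies only the data the repository does not already hold
Can cover the whole cluster or only selected indices
Supported storage backends:
Shared file system (fs): the usual choice for self-managed clusters
AWS S3
Google Cloud Storage
Azure Blob Storage
HDFS (Hadoop), through the repository-hdfs plugin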
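2. Configuring a backup repository
2.1 Shared file system (most common)
List the backup directories in elasticsearch.yml on every master and data node, then restart the nodes (path.repo is a static setting):
yaml
path.repo: ["/mnt/elasticsearch_backups", "/another/backup/dir"]
Create the file system repository. The location must sit inside one of the path.repo directories, and that directory must be a shared mount that every node can reach:
json
PUT /_snapshot/my_backup_repo
{
  "type": "fs",
  "settings": {
    "location": "/mnt/elasticsearch_backups",
    "compress": true,
    "max_snapshot_bytes_per_sec": "50mb",
    "max_restore_bytes_per_sec": "50mb"
  }
}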
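An fs repository is only accepted when its location sits inside one of the directories listed in path.repo. The Python sketch below checks a planned location against that rule before the request is sent. The function name `is_allowed_location` is hypothetical, and the check only compares path strings: it does not resolve symlinks or `..` segments and does not contact Elasticsearch.

```python
from pathlib import PurePosixPath


def is_allowed_location(location, path_repo):
    """Return True if location equals, or lies below, one of the path_repo directories."""
    loc = PurePosixPath(location)
    for root in path_repo:
        root_path = PurePosixPath(root)
        # loc.parents holds every ancestor directory of loc
        if loc == root_path or root_path in loc.parents:
            return True
    return False


print(is_allowed_location("/mnt/elasticsearch_backups/daily",
                          ["/mnt/elasticsearch_backups", "/another/backup/dir"]))
```

Comparing whole path components rather than string prefixes keeps a sibling directory such as `/mnt/elasticsearch_backups_old` from passing by accident.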
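2.2 AWS S3 (needs the repository-s3 plugin before 8.0; bundled with Elasticsearch from 8.0 onward)
Current versions reject credentials placed in the request body. Store them in the Elasticsearch keystore on each node as s3.client.default.access_key and s3.client.default.secret_key, then register the repository with only the bucket and path:
json
PUT /_snapshot/my_s3_backup
{
  "type": "s3",
  "settings": {
    "bucket": "my-elasticsearch-backups",
    "base_path": "production"
  }
}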
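3. Backup operations (Snapshot)
3.1 Creating a snapshot
json
# indices: the indices to back up (wildcards allowed)
# ignore_unavailable: skip indices that do not exist
# include_global_state: whether to store the cluster-wide state
PUT /_snapshot/my_backup_repo/snapshot_20231001
{
  "indices": "index1,index2,logstash-*",
  "ignore_unavailable": true,
  "include_global_state": false,
  "metadata": {
    "taken_by": "admin",
    "purpose": "monthly_backup"
  }
}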
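Scripted backups usually need a fresh snapshot name on each run. The Python sketch below assembles the same create-snapshot request with a date-stamped name. The helper `build_snapshot_request` is hypothetical and only builds the method, path, and JSON body; sending it with an HTTP client of your choice is left out.

```python
import json
from datetime import date


def build_snapshot_request(repo, indices, day, include_global_state=False):
    """Build (method, path, body) for a create-snapshot call named after the given day."""
    name = "snapshot_" + day.strftime("%Y%m%d")
    path = f"/_snapshot/{repo}/{name}"
    body = {
        "indices": ",".join(indices),          # comma-separated, wildcards allowed
        "ignore_unavailable": True,            # skip indices that do not exist
        "include_global_state": include_global_state,
    }
    return "PUT", path, json.dumps(body)


method, path, body = build_snapshot_request(
    "my_backup_repo", ["index1", "index2", "logstash-*"], date(2023, 10, 1)
)
print(method, path)
print(body)
```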
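3.2 Inspecting snapshots
json
# List every snapshot in the repository
GET /_snapshot/my_backup_repo/_all

# Show the detailed status of one snapshot
GET /_snapshot/my_backup_repo/snapshot_20231001/_status

# Show the repository definition
GET /_snapshot/my_backup_repo
3.3 Deleting a snapshot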
json
DELETE /_snapshot/my_backup_repo/snapshot_20231001
4. Restore operations (Restore)
4.1 Basic restore
json
# indices: the indices to restore
# rename_pattern / rename_replacement: restore index1 as restored_index1, and so on
POST /_snapshot/my_backup_repo/snapshot_20231001/_restore
{
  "indices": "index1,index2",
  "ignore_unavailable": true,
  "include_global_state": false,
  "rename_pattern": "index(.+)",
  "rename_replacement": "restored_index$1",
  "include_aliases": false
}
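A rename pattern that does not match an index name leaves that name unchanged, which is easy to miss until the restore collides with an existing index. The Python sketch below previews the mapping locally. It assumes the pattern is simple enough to behave the same in Python's `re` module as in the Java regex engine Elasticsearch uses; the function `preview_rename` and the index names are hypothetical.

```python
import re


def preview_rename(index_names, rename_pattern, rename_replacement):
    """Map each index name to the name it would get after the rename rule is applied."""
    # Java writes group references as $1; Python's re.sub expects \g<1>.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return {
        name: re.sub(rename_pattern, py_replacement, name)
        for name in index_names
    }


result = preview_rename(["index_1", "index2"], "index_(.+)", "restored_index_$1")
for old, new in result.items():
    print(old, "->", new)
```

Here `index2` contains no underscore, so the pattern does not match and the index would be restored under its original name.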
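4.2 Restoring into a different cluster
json
# Register the same repository on the new cluster.
# readonly keeps the second cluster from writing to a repository the first one still owns.
PUT /_snapshot/my_backup_repo
{
  "type": "fs",
  "settings": {
    "location": "/mnt/elasticsearch_backups",
    "readonly": true
  }
}

# Then run the restore
POST /_snapshot/my_backup_repo/snapshot_20231001/_restore
{
  "indices": "*",
  "include_global_state": false
}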
5. Automating backups
5.1 Automating with Curator
Install Curator:
bash
pip install elasticsearch-curator
Create curator.yml (this layout follows the Curator 5.x client configuration; newer releases use a different schema):
yaml
client:
  hosts:
    - localhost
  port: 9200
  use_ssl: False
Create action.yml:
yaml
actions:
  1:
    action: snapshot
    description: "Create monthly snapshot"
    options:
      repository: my_backup_repo
      name: monthly-snapshot-%Y.%m.%d
      ignore_unavailable: False
      include_global_state: False
      wait_for_completion: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: logstash-
  2:
    action: delete_snapshots
    description: "Delete snapshots older than 30 days"
    options:
      repository: my_backup_repo
      disable_action: False
    filters:
      - filtertype: age
        source: creation_date
        direction: older
        unit: days
        unit_count: 30
5.2 Using Elasticsearch SLM (Snapshot Lifecycle Management)
Create a lifecycle policy:
json
# schedule: run at 01:30 every day
PUT /_slm/policy/daily-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snapshot-{now/d}>",
  "repository": "my_backup_repo",
  "config": {
    "indices": ["*"],
    "include_global_state": false
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}
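The three retention settings interact: expire_after marks old snapshots for deletion, min_count protects the newest ones even when they have expired, and max_count caps the total. The Python sketch below is a simplified model of those rules, not the SLM implementation; it ignores failed snapshots and other details, and the function name `snapshots_to_delete` is hypothetical.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(start_times, now, expire_after, min_count, max_count):
    """Return the start times of snapshots a retention run would remove (simplified)."""
    newest_first = sorted(start_times, reverse=True)
    doomed = []
    for rank, started in enumerate(newest_first):
        if rank < min_count:
            continue  # the newest min_count snapshots are always kept
        too_many = rank >= max_count            # beyond the max_count cap
        too_old = now - started > expire_after  # older than expire_after
        if too_many or too_old:
            doomed.append(started)
    return doomed


now = datetime(2023, 10, 31)
taken = [now - timedelta(days=d) for d in (1, 10, 40, 50)]
print(snapshots_to_delete(taken, now, timedelta(days=30), min_count=3, max_count=50))
```

With min_count set to 3, the 40-day-old snapshot survives although it is past expire_after, and only the 50-day-old one is selected.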
Run the policy immediately:
json
POST /_slm/policy/daily-snapshots/_execute
6. End-to-end backup and restore example
Scenario: migrating data to a new cluster
Back up on the source cluster:
json
# 1. Create the repository
PUT /_snapshot/migration_repo
{
  "type": "fs",
  "settings": {
    "location": "/shared_backups/migration"
  }
}

# 2. Create the snapshot
PUT /_snapshot/migration_repo/full_migration_snapshot
{
  "indices": "products,users,orders",
  "include_global_state": false
}
Restore on the target cluster:
json
# 1. Configure the same repository path.
#    Make sure elasticsearch.yml on the target cluster contains:
#    path.repo: ["/shared_backups/migration"]

# 2. Register the repository
PUT /_snapshot/migration_repo
{
  "type": "fs",
  "settings": {
    "location": "/shared_backups/migration"
  }
}

# 3. Restore the data
POST /_snapshot/migration_repo/full_migration_snapshot/_restore
{
  "indices": "products,users,orders",
  "ignore_unavailable": true,
  "include_global_state": false
}
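7. Monitoring and management
Checking snapshot status
json
# Snapshots currently running, across all repositories
GET /_snapshot/_status

# Snapshots currently running in one repository
GET /_snapshot/my_backup_repo/_status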
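The status response reports shard counts per snapshot, which is enough to derive a rough completion percentage. The Python sketch below works on an already parsed response. The `sample` dictionary is a hand-written, trimmed illustration of the response shape rather than real cluster output, and the function name `snapshot_progress` is hypothetical.

```python
def snapshot_progress(status_response):
    """Map each snapshot name to (state, percent of shards finished)."""
    progress = {}
    for snap in status_response.get("snapshots", []):
        shards = snap["shards_stats"]
        total = shards["total"]
        percent = 100.0 * shards["done"] / total if total else 100.0
        progress[snap["snapshot"]] = (snap["state"], round(percent, 1))
    return progress


# Hand-written sample, trimmed to the fields used above
sample = {
    "snapshots": [
        {
            "snapshot": "snapshot_20231001",
            "repository": "my_backup_repo",
            "state": "STARTED",
            "shards_stats": {
                "initializing": 0, "started": 2, "finalizing": 0,
                "done": 6, "failed": 0, "total": 8,
            },
        }
    ]
}
print(snapshot_progress(sample))
```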
Checking restore progress
json
GET /_recovery?human&detailed=true
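The recovery response covers every kind of shard recovery, so restores have to be filtered out by type. The Python sketch below lists snapshot-based shard recoveries that have not finished yet. It assumes the response has been parsed into a dictionary keyed by index name, the `sample` data is hand-written for illustration, and the function name `pending_snapshot_recoveries` is hypothetical.

```python
def pending_snapshot_recoveries(recovery_response):
    """Return (index, shard id, stage) for snapshot restores that are still running."""
    pending = []
    for index_name, info in recovery_response.items():
        for shard in info.get("shards", []):
            is_restore = shard.get("type") == "SNAPSHOT"
            if is_restore and shard.get("stage") != "DONE":
                pending.append((index_name, shard["id"], shard["stage"]))
    return sorted(pending)


# Hand-written sample, trimmed to the fields used above
sample_recovery = {
    "index1": {"shards": [
        {"id": 0, "type": "SNAPSHOT", "stage": "DONE"},
        {"id": 1, "type": "SNAPSHOT", "stage": "INDEX"},
    ]},
    "index2": {"shards": [
        {"id": 0, "type": "PEER", "stage": "TRANSLOG"},
    ]},
}
print(pending_snapshot_recoveries(sample_recovery))
```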
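Cancelling a restore
There is no dedicated cancel endpoint. Deleting the indices that are being restored aborts the restore for them:
json
DELETE /index1,index2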
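8. Best practices and recommendations
Test restores regularly: a backup is only proven once a restore from it has succeeded
Monitor storage: snapshots consume space in the repository
Rely on incremental snapshots: Elasticsearch makes every snapshot incremental automatically
Separate storage: keep the repository on different storage from the cluster's data
Control access: restrict who can read and write the repository
Document the procedure: write down the backup and restore steps
Check version compatibility: a snapshot cannot be restored into an older Elasticsearch version, so confirm the source and target versions are compatible before migrating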
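The version rule can be turned into a quick pre-flight check. The Python sketch below encodes a simplified reading of it: the target cluster must not be older than the cluster that took the snapshot, and an index can move forward by at most one major version from the one it was created in. This is an approximation under those stated assumptions; it ignores special cases such as archive indices, and the official compatibility matrix remains the authority. The function name `can_restore` is hypothetical.

```python
def can_restore(index_created_major, snapshot_cluster_version, target_cluster_version):
    """Versions are (major, minor, patch) tuples; returns True if the restore looks compatible."""
    not_older = target_cluster_version >= snapshot_cluster_version
    within_one_major = target_cluster_version[0] - index_created_major <= 1
    return not_older and within_one_major


print(can_restore(7, (7, 17, 0), (8, 10, 0)))  # index created in 7.x, restored into 8.x
print(can_restore(6, (7, 17, 0), (8, 10, 0)))  # index created in 6.x, restored into 8.x
```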
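9. Troubleshooting common problems
Problem 1: repository path not configured
Error: path.repo is not set
Fix: add path.repo: ["/your/backup/path"] to elasticsearch.yml and restart the node
Problem 2: insufficient permissions
Error: Permission denied
Fix: make sure the user that runs Elasticsearch can read and write the backup directory
Problem 3: not enough storage space
Monitoring: check the disk usage of the backup directory regularly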
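A periodic disk check is easy to script for a file system repository. The Python sketch below reads the usage of the volume that holds the backup directory with the standard library. The 85% threshold and the function names are illustrative assumptions, and the `"."` path in the usage line stands in for the real backup directory.

```python
import shutil


def backup_disk_usage(path):
    """Return the used fraction (0.0 to 1.0) of the volume that holds path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def needs_attention(path, threshold=0.85):
    """True when the volume is at least as full as the threshold."""
    return backup_disk_usage(path) >= threshold


# "." is a placeholder; point this at the backup directory in practice
print(f"{backup_disk_usage('.'):.0%} used, alert: {needs_attention('.')}")
```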
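Problem 4: the index already exists at restore time
Fix: delete the conflicting index first, or restore under a new name with the rename options
json
POST /_snapshot/my_backup_repo/snapshot_20231001/_restore
{
  "indices": "old_index",
  "rename_pattern": "old_index",
  "rename_replacement": "restored_old_index"
}
With these snapshot and restore procedures in place, Elasticsearch data stays protected and recoverable.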