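Ansible Patch Management Solution (Windows & Linux)
1. Introduction to Ansible
Ansible is an open-source IT automation tool maintained by Red Hat. It describes automation tasks in declarative YAML and has the following core characteristics:
- Agentless architecture: communicates over SSH (Linux) or WinRM (Windows), so no client has to be installed on the managed hosts
- Idempotency: running the same playbook repeatedly converges on the same end state
- Modular design: ships with a rich set of built-in modules (more than 750)
- Cross-platform support: manages Linux, Windows, network devices, and more
- Easy to use: YAML syntax is easy to read and write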
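2. Patch Management Design
2.1 System Requirements
Control node: the host that runs Ansible (Linux/macOS)
- Python 3.9+ (required by Ansible 8)
- Ansible 8.0+
- Required dependencies (pywinrm for Windows management; packaged as python3-winrm on Debian/Ubuntu)
Managed nodes:
- Linux: SSH service, Python 2.7+/3.5+
- Windows: WinRM configured, PowerShell 5.1+
2.2 Directory Structure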
patch_management/
├── inventory/
│   ├── production/
│   │   ├── linux_hosts
│   │   └── windows_hosts
│   └── staging/
│       ├── linux_hosts
│       └── windows_hosts
├── group_vars/
│   ├── linux.yml
│   └── windows.yml
├── roles/
│   ├── linux-patching/
│   │   ├── tasks/
│   │   │   ├── main.yml
│   │   │   ├── pre_reboot.yml
│   │   │   └── post_reboot.yml
│   │   └── vars/
│   │       └── main.yml
│   └── windows-patching/
│       ├── tasks/
│       │   ├── main.yml
│       │   ├── pre_reboot.yml
│       │   └── post_reboot.yml
│       └── vars/
│           └── main.yml
├── playbooks/
│   ├── linux-patch.yml
│   └── windows-patch.yml
└── README.md
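3. Implementation Steps
3.1 Environment Preparation
Control node installation
# Install Ansible
python3 -m pip install --user ansible
# Install Windows support
python3 -m pip install --user pywinrm
# Verify the installation
ansible --version
Managed node configuration
Linux nodes:
- Make sure the SSH service is running
- Configure SSH key authentication
Windows nodes:
- Configure WinRM (run PowerShell as Administrator):
# Quick setup (test environments only)
Enable-PSRemoting -Force
Set-Item WSMan:\localhost\Client\TrustedHosts * -Force
Restart-Service WinRM
- For a detailed production configuration, see the Ansible Windows setup guide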
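Once the nodes are configured, a short playbook can confirm that the control node reaches them before any patching is attempted. The following is a minimal sketch, assuming the group names linux_servers and windows_servers from the inventory files in section 3.2; the file name playbooks/check-connectivity.yml is hypothetical.

```yaml
---
# playbooks/check-connectivity.yml (hypothetical file name)
- name: Check SSH connectivity to Linux hosts
  hosts: linux_servers
  gather_facts: false
  tasks:
    - name: Ping over SSH (also confirms a usable Python interpreter)
      ansible.builtin.ping:

- name: Check WinRM connectivity to Windows hosts
  hosts: windows_servers
  gather_facts: false
  tasks:
    - name: Ping over WinRM
      ansible.windows.win_ping:
```

It could be run with ansible-playbook -i inventory/production playbooks/check-connectivity.yml, which loads both inventory files from that directory. If only the quick setup above was applied, the Windows hosts probably expose just an HTTP listener on port 5985, so the inventory may also need ansible_port=5985.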
3.2 Inventory Configuration
inventory/production/linux_hosts:
[linux_servers]
web01.example.com ansible_user=admin
db01.example.com ansible_user=admin
[linux_servers:vars]
ansible_python_interpreter=/usr/bin/python3
inventory/production/windows_hosts:
[windows_servers]
win01.example.com
win02.example.com
[windows_servers:vars]
ansible_user=Administrator
# Example value only: keep real passwords in Ansible Vault, not in plain text
ansible_password=SecurePass123!
ansible_connection=winrm
ansible_winrm_transport=ntlm
# Skips certificate validation; acceptable only in test environments
ansible_winrm_server_cert_validation=ignore
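A plain-text password in the inventory is convenient for a demo but risky in practice. One common alternative, sketched here under assumptions, is to write the inventory in YAML and reference a vaulted variable. The variable name vault_windows_password and the file name inventory/production/windows_hosts.yml are hypothetical; the actual value would live in a vars file encrypted with ansible-vault.

```yaml
---
# inventory/production/windows_hosts.yml (hypothetical YAML equivalent)
all:
  children:
    windows_servers:
      hosts:
        win01.example.com:
        win02.example.com:
      vars:
        ansible_user: Administrator
        # Assumption: vault_windows_password is defined in a vaulted vars file
        ansible_password: "{{ vault_windows_password }}"
        ansible_connection: winrm
        ansible_winrm_transport: ntlm
        ansible_winrm_server_cert_validation: ignore
```

With this layout the playbook would be started with --ask-vault-pass (or a vault password file) so that Ansible can decrypt the variable at run time.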
3.3 Variable Configuration
group_vars/linux.yml:
---
# Linux patch management settings
patch_reboot: true
patch_security_only: true
exclude_packages:
  - "kernel*"
  - "*-devel"  # quoted: a bare leading * would be parsed as a YAML alias
pre_patch_scripts:
  - "/usr/local/scripts/pre-patch.sh"
post_patch_scripts:
  - "/usr/local/scripts/post-patch.sh"
group_vars/windows.yml:
---
# Windows patch management settings
windows_update_categories:
  - SecurityUpdates
  - CriticalUpdates
windows_reboot: true
windows_reboot_timeout: 1800  # 30 minutes
blacklist_kbs:
  - KB1234567
  - KB8912345
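Ansible applies a group_vars file to the inventory group whose name matches the file name, and it looks for the group_vars directory next to the inventory source or the playbook. In the layout from section 2.2, linux.yml does not match the linux_servers group and the directory sits at the project root, so these variables may not be picked up automatically. One option is to load the file explicitly from the playbook, as in the sketch below; renaming the files after the groups and moving group_vars beside the inventory would be another. The debug task is only there to illustrate the result.

```yaml
---
# playbooks/linux-patch.yml (excerpt): load the variables explicitly
- name: Apply patches to Linux servers
  hosts: linux_servers
  become: true
  vars_files:
    # Relative paths are resolved from the playbook's directory
    - ../group_vars/linux.yml
  tasks:
    - name: Show the effective patch settings
      ansible.builtin.debug:
        msg: "security only: {{ patch_security_only }}, reboot: {{ patch_reboot }}"
```

Values passed with --extra-vars still take precedence over anything loaded through vars_files, so the command-line overrides shown in section 4.2 keep working.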
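3.4 Role Development
Linux patching role
roles/linux-patching/tasks/main.yml:
---
- name: Verify that tasks run as root
  assert:
    that: ansible_user_id == "root"
    fail_msg: "Patch tasks must run as root"
- name: Run pre-patch scripts
  include_tasks: pre_reboot.yml
  when: pre_patch_scripts is defined
# The generic package module needs a package name, so the cache refresh uses apt;
# the yum tasks below refresh their own cache with update_cache
- name: Refresh the package cache (Debian/Ubuntu)
  apt:
    update_cache: yes
  when: ansible_os_family == 'Debian'
- name: Check for available security updates
  command: "yum check-update --security"
  register: yum_updates
  changed_when: false
  check_mode: false  # read-only, so safe to run under --check
  # yum check-update exits with 100 when updates are available
  failed_when: yum_updates.rc not in [0, 100]
  when: ansible_os_family == 'RedHat' and patch_security_only | bool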
- name: Install updates (RHEL/CentOS)
  yum:
    name: "*"
    # Security updates only when patch_security_only is true, all updates otherwise
    security: "{{ patch_security_only | bool }}"
    exclude: "{{ exclude_packages }}"
    update_cache: yes
    state: latest
  # A single task, so a skipped twin task cannot overwrite yum_result
  register: yum_result
  when: ansible_os_family == 'RedHat'
# The apt module has no security-only switch, so this applies all pending upgrades
- name: Install updates (Debian/Ubuntu)
  apt:
    upgrade: dist
    update_cache: yes
    autoremove: yes
  register: apt_result
  when: ansible_os_family == 'Debian'
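- name: Reboot the system (if needed)
  reboot:
    msg: "Rebooting after applying system updates"
    connect_timeout: 5
    reboot_timeout: 600
    pre_reboot_delay: 30
    post_reboot_delay: 30
  when: patch_reboot | bool and (yum_result is changed or apt_result is changed)
# Included after the reboot so that post_reboot.yml runs on the patched system
- name: Run post-patch scripts
  include_tasks: post_reboot.yml
  when: post_patch_scripts is defined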
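main.yml includes pre_reboot.yml and post_reboot.yml, but their contents are not shown. The following is a minimal sketch of what pre_reboot.yml might contain, assuming each entry in pre_patch_scripts is an executable script that already exists on the managed host. It is an illustration, not the original role's implementation.

```yaml
---
# roles/linux-patching/tasks/pre_reboot.yml (illustrative sketch)
- name: Check that each pre-patch script exists
  ansible.builtin.stat:
    path: "{{ item }}"
  loop: "{{ pre_patch_scripts }}"
  register: pre_patch_stat

- name: Run the pre-patch scripts that were found
  ansible.builtin.command:
    cmd: "{{ item.item }}"
  loop: "{{ pre_patch_stat.results }}"
  loop_control:
    label: "{{ item.item }}"
  when: item.stat.exists
```

A post_reboot.yml could follow the same pattern with post_patch_scripts. Checking with stat first means a missing script is skipped rather than failing the whole patch run; whether that is the desired behavior depends on how critical the scripts are.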
Windows patching role
roles/windows-patching/tasks/main.yml:
---
- name: Ensure the Windows Update service is running
  win_service:
    name: wuauserv
    state: started
- name: Download Windows updates
  win_updates:
    category_names: "{{ windows_update_categories }}"
    state: downloaded
    # reject_list is the current name of the option formerly called blacklist
    reject_list: "{{ blacklist_kbs }}"
    log_path: C:\Windows\Temp\ansible_wu.log
  register: win_updates_result
- name: Install Windows updates
  win_updates:
    category_names: "{{ windows_update_categories }}"
    state: installed
    reject_list: "{{ blacklist_kbs }}"
    log_path: C:\Windows\Temp\ansible_wu.log
  register: win_updates_install
  when: win_updates_result.found_update_count > 0
- name: Run pre-reboot checks
  include_tasks: pre_reboot.yml
  when: windows_reboot | bool and win_updates_install is changed
- name: Reboot the Windows server
  win_reboot:
    msg: "Rebooting after applying Windows updates"
    connect_timeout: 5
    reboot_timeout: "{{ windows_reboot_timeout }}"
    pre_reboot_delay: 60
    post_reboot_delay: 30
  when: windows_reboot | bool and win_updates_install is changed
- name: Verify the post-update state
  win_updates:
    category_names: "{{ windows_update_categories }}"
    state: searched
  register: post_update_check
  when: win_updates_install is changed
- name: Run post-reboot tasks
  include_tasks: post_reboot.yml
  when: windows_reboot | bool and win_updates_install is changed
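The Windows role also includes a post_reboot.yml whose contents are not shown. A minimal sketch of what it might do, assuming the post_update_check variable registered in main.yml is available, is to confirm that the update service came back and to report what is still pending:

```yaml
---
# roles/windows-patching/tasks/post_reboot.yml (illustrative sketch)
- name: Make sure the Windows Update service is running again
  ansible.windows.win_service:
    name: wuauserv
    state: started

- name: Report how many updates are still pending
  ansible.builtin.debug:
    # default() covers the case where the search task was skipped
    msg: "Updates still pending: {{ post_update_check.found_update_count | default('unknown') }}"
```

A non-zero count after the reboot usually means another patch round is needed, since some updates only become applicable once earlier ones are installed.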
3.5 Playbook Development
playbooks/linux-patch.yml:
---
- name: Apply patches to Linux servers
  hosts: linux_servers
  serial: "30%"  # rolling update, 30% of the hosts per batch
  become: yes
  gather_facts: yes
  roles:
    - linux-patching
  post_tasks:
    - name: Check system status
      command: uptime
      register: uptime
      changed_when: false
    - name: Show system uptime
      debug:
        msg: "System uptime: {{ uptime.stdout }}"
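playbooks/windows-patch.yml:
---
- name: Apply patches to Windows servers
  hosts: windows_servers
  serial: 2  # update 2 hosts at a time
  gather_facts: yes
  roles:
    - windows-patching
  post_tasks:
    # Queried through CIM so the result does not depend on the OS display language
    - name: Get the last boot time
      win_shell: (Get-CimInstance Win32_OperatingSystem).LastBootUpTime
      register: last_boot
      changed_when: false
    - name: Show the last boot time
      debug:
        msg: "Last boot time: {{ last_boot.stdout | trim }}"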
4. Running Patch Management
4.1 Test Run
# Test Linux patching (check mode)
ansible-playbook -i inventory/production/linux_hosts playbooks/linux-patch.yml --check
# Test Windows patching (check mode)
ansible-playbook -i inventory/production/windows_hosts playbooks/windows-patch.yml --check
4.2 Production Run
# Apply Linux patches (--limit takes a host or group that exists in the inventory)
ansible-playbook -i inventory/production/linux_hosts playbooks/linux-patch.yml \
  --limit "web01.example.com" \
  --extra-vars "patch_security_only=false"
# Apply Windows patches
ansible-playbook -i inventory/production/windows_hosts playbooks/windows-patch.yml \
  --extra-vars "windows_reboot_timeout=3600"
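4.3 Advanced Options
# Run only tasks with specific tags (the tasks must carry these tags; the roles above define none)
ansible-playbook -i inventory/production/linux_hosts playbooks/linux-patch.yml \
  --tags "pre_patch,update"
# Resume from a given task (the value must match the task's name exactly)
ansible-playbook -i inventory/production/linux_hosts playbooks/linux-patch.yml \
  --start-at-task "Install updates (RHEL/CentOS)"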
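5. Reporting and Verification
5.1 Generating Reports
# Collect Linux patch status (choose the tool explicitly; "yum ... | head || apt ..." never reaches apt because head exits with 0)
ansible linux_servers -i inventory/production/linux_hosts -m shell -a \
  "if command -v yum >/dev/null 2>&1; then yum history list | head -n 10; else apt list --upgradable; fi"
# Collect Windows patch status
ansible windows_servers -i inventory/production/windows_hosts -m win_shell -a \
  "Get-HotFix | Sort-Object InstalledOn -Descending | Select-Object -First 10"
5.2 Using Callback Plugins
Enable the plugins in ansible.cfg:
[defaults]
callbacks_enabled = community.general.mail, community.general.slack, ansible.builtin.junit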
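6. Best Practices
- Test first: validate patches in a non-production environment before touching production
- Staged rollout: use Ansible's serial keyword for rolling updates
- Maintenance windows: use the --limit option to control which hosts are patched
- Backup strategy: make sure systems are backed up before applying patches
- Monitoring and verification: verify the state of critical services after patching
- Documentation: record the changes and issues from every patch run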
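The verification step can be automated on Linux with the service_facts module. The sketch below is illustrative: critical_services is a hypothetical variable, and the unit names are RHEL-style examples that would differ on other distributions.

```yaml
---
- name: Verify critical services after patching
  hosts: linux_servers
  become: true
  vars:
    # Hypothetical list; replace with the units that matter on your hosts
    critical_services:
      - sshd.service
      - crond.service
  tasks:
    - name: Collect service state
      ansible.builtin.service_facts:

    - name: Assert that every critical service is running
      ansible.builtin.assert:
        that:
          - item in ansible_facts.services
          - ansible_facts.services[item].state == 'running'
        fail_msg: "{{ item }} is not running after patching"
      loop: "{{ critical_services }}"
```

Combined with serial, a failed assertion stops the play for that batch, which keeps a bad patch from rolling on to the remaining hosts.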
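7. Troubleshooting
Common Linux issues
Problem: YUM/DNF lock errors
Solution:
# Only remove the lock file when no other yum/dnf process is still running
- name: Remove a stale YUM lock
  file:
    path: /var/run/yum.pid
    state: absent
  ignore_errors: yes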
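Deleting the PID file can interfere with a transaction if another package manager process is in fact still active. A gentler alternative, sketched here, is to let the module wait for the lock through its lock_timeout parameter; the 300-second value is an arbitrary assumption.

```yaml
- name: Install updates, waiting up to 5 minutes for the package lock
  ansible.builtin.yum:
    name: "*"
    state: latest
    # Seconds to wait for an existing yum/dnf lock to be released
    lock_timeout: 300
```

If the lock is still held after the timeout, the task fails, which is usually a safer signal to investigate than silently removing the lock.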
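Common Windows issues
Problem: WinRM connection failures
Solution:
# Run on the Windows host
winrm quickconfig
# Test environments only: the next two settings allow unencrypted traffic and Basic authentication
winrm set winrm/config/service '@{AllowUnencrypted="true"}'
winrm set winrm/config/service/auth '@{Basic="true"}'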
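8. Extension Ideas
- CMDB integration: build the inventory dynamically
- Monitoring integration: check system metrics before and after patching
- Automated approval workflow: integrate with ITSM tools
- Patch compliance reporting: generate PDF/HTML reports
- Containerized Ansible: use AWX or Ansible Tower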
With Ansible, you can automate patch management across platforms, keeping systems secure and stable while reducing manual intervention and operator error.