1 High-Availability Clusters
1.1 Cluster types
- LB: Load Balancing
  - LVS/HAProxy/nginx (http/upstream, stream/upstream)
- HA: High Availability
  - Databases, Redis
  - SPoF: Single Point of Failure; an HA cluster exists to eliminate single points of failure
- HPC: High-Performance Computing
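1.2 System availability
SLA: Service-Level Agreement, a contract that the service provider and the customer both accept and that covers the quality, level, and performance of the service.
A = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair
99.95%: (60*24*30)*(1-0.9995) = 21.6 minutes #downtime is usually counted per month
Targets: 99.9%, 99.99%, 99.999%, 99.9999% (higher is better)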
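The downtime arithmetic above can be reproduced for other targets. The following is a minimal sketch, assuming a 30-day month as in the 99.95% example; the function name `downtime_minutes` is made up for illustration.

```shell
#!/bin/bash
# Allowed downtime per month, in minutes, for an availability target in percent.
# Assumes a 30-day month: 60*24*30 = 43200 minutes.
downtime_minutes() {
    awk -v a="$1" 'BEGIN { printf "%.2f\n", 60 * 24 * 30 * (1 - a / 100) }'
}

for sla in 99.9 99.95 99.99 99.999; do
    printf '%s%% -> %s minutes per month\n' "$sla" "$(downtime_minutes "$sla")"
done
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why the step from 99.9% to 99.99% is expensive in practice.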
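1.3 System failures
Hardware failures: design defects, wear-out, and force majeure beyond human control
Software failures: design defects and bugs
1.4 Achieving high availability
Availability improves when MTTR (Mean Time To Repair) goes down. The way to get there is redundancy:
active/passive: master/backup
active/active: dual master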
active --> HEARTBEAT --> passive
active <--> HEARTBEAT <--> active
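1.5 VRRP: Virtual Router Redundancy Protocol
VRRP removes the single point of failure of a statically configured gateway.
Hardware implementations: routers, layer-3 switches
Software implementation: keepalived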
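1.5.1 VRRP terminology
Virtual router: Virtual Router
Virtual router identifier: VRID (1-255), uniquely identifies a virtual router
VIP: Virtual IP
VMAC: Virtual MAC (00-00-5e-00-01-VRID, with the VRID written as two hex digits)
Physical routers:
master: the active device
backup: the standby device
priority: the value used to elect the master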
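The VMAC format listed above is simply the fixed prefix followed by the VRID in hexadecimal. A minimal sketch, with the helper name `vrrp_vmac` chosen only for illustration:

```shell
#!/bin/bash
# Build the IPv4 VRRP virtual MAC for a VRID: 00-00-5e-00-01-<VRID as two hex digits>.
vrrp_vmac() {
    printf '00-00-5e-00-01-%02x\n' "$1"
}

vrrp_vmac 51    # 00-00-5e-00-01-33
vrrp_vmac 52    # 00-00-5e-00-01-34
```

Two virtual routers on the same segment therefore need different VRIDs, otherwise they would claim the same virtual MAC.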
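1.5.2 VRRP mechanisms
Advertisements: periodic heartbeats that carry the priority and other state
Operation: preemptive or non-preemptive
Authentication:
no authentication
simple text authentication: a pre-shared key
MD5
Working modes:
master/backup: a single virtual router
master/master: master/backup for virtual router 1, backup/master for virtual router 2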
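The election behind these modes comes down to comparing priorities per virtual router. The sketch below is a simplification that assumes distinct priorities and ignores the tie-break rule; `elect_master` and the "name priority" input format are hypothetical.

```shell
#!/bin/bash
# Pick the master among candidates given as "name priority" lines on stdin.
# Simplified: the highest priority wins; ties are not handled.
elect_master() {
    sort -k2,2nr | head -n 1 | awk '{ print $1 }'
}

# Virtual router 1: KA1 has the higher priority.
printf 'KA1 100\nKA2 80\n' | elect_master    # KA1
# Virtual router 2 in a master/master setup: the priorities are swapped.
printf 'KA1 80\nKA2 100\n' | elect_master    # KA2
```

In master/master mode each node wins the election for one virtual router and stands by for the other, so both carry traffic during normal operation.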
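2 Deploying Keepalived
2.1 About keepalived
keepalived is a software implementation of the VRRP protocol, originally designed to make the ipvs service highly available.
Features:
Moves the VIP between nodes using the VRRP protocol
Generates ipvs rules (predefined in the configuration file) on the node that holds the VIP
Health-checks each RS (real server) of the ipvs cluster
Calls user scripts through a script interface to influence cluster behavior, which is how it supports services such as nginx and haproxy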
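2.2 Keepalived architecture
Official documentation:
Keepalived User Guide (Keepalived 1.4.3 documentation), Keepalived for Linux
Core user-space components:
vrrp stack: sends VIP advertisements
checkers: monitor the real servers
system call: runs scripts when the VRRP state changes
SMTP: mail component
IPVS wrapper: generates IPVS rules
Netlink Reflector: network interface handling
WatchDog: monitors the keepalived processes
Control component: the keepalived.conf parser, which applies the Keepalived configuration
I/O multiplexer: keepalived's own thread abstraction, optimized for networking
Memory management component: gives access to common memory management functions (allocation, reallocation, release, and so on)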
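2.3 Keepalived environment
client eth0: 172.25.254.111
KA1 eth0: 172.25.254.50
KA2 eth0: 172.25.254.60
RS1 eth0: 172.25.254.11
RS2 eth0: 172.25.254.22
Clocks on all nodes must be synchronized: ntp or chrony
Disable the firewall and SELinux
Nodes can reach each other by hostname: optional
Using /etc/hosts for name resolution is recommended: optional
The root users on all nodes can reach each other over key-based ssh: optional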
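2.4 Keepalived files
- Package name: keepalived
- Main program: /usr/sbin/keepalived
- Main configuration file: /etc/keepalived/keepalived.conf
- Example configuration files: /usr/share/doc/keepalived/
- Unit file: /lib/systemd/system/keepalived.service
- Environment file for the unit: /etc/sysconfig/keepalived
Warning:
The following bug may show up on RHEL7; RHEL9 is not affected
systemctl restart keepalived #the new configuration may not take effect
systemctl stop keepalived;systemctl start keepalived #the process cannot be stopped this way and has to be killed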
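2.5 Configuration syntax
2.5.1 Global configuration
! Configuration File for keepalived
global_defs {
    notification_email {
        timiniglee-zln@163.com    #recipient of the mail sent on failover; list several addresses one per line
    }
    notification_email_from keepalived@KA1.timinglee.org    #sender address
    smtp_server 127.0.0.1    #mail server address
    smtp_connect_timeout 30    #mail server connection timeout
    router_id KA1.timinglee.org    #identifier of this keepalived host
                                   #the current hostname is recommended; identical names on several nodes do no harm
    vrrp_skip_check_adv_addr    #checking every advertisement in full costs performance
                                #with this option, an advertisement from the same router as the previous one skips the check
                                #the default is to check every advertisement
    vrrp_strict    #follow the VRRP protocol strictly
                   #with this option the service does not start in these cases:
                   #1. no VIP address
                   #2. unicast peers are configured
                   #3. IPv6 addresses are used with VRRP version 2
                   #leaving this option out is recommended
    vrrp_garp_interval 1    #interval between gratuitous ARP messages
                            #gratuitous ARP tells other devices that the MAC address behind an IP address has changed
                            #so that they refresh their ARP caches and forward traffic to the new master
    vrrp_gna_interval 1    #interval between gratuitous NA (unsolicited Neighbor Advertisement) messages
                           #they tell other devices that the link-layer (MAC) address behind an IPv6 address has changed
                           #so that they refresh their neighbor caches
                           #and forward IPv6 packets to the new master
    vrrp_mcast_group4 224.0.0.44    #IPv4 multicast group for VRRP advertisements; must lie in 224.0.0.0 to 239.255.255.255 (default 224.0.0.18)
}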
3 Keepalived examples
3.1 Deploying master/backup mode
#KA2
[root@KA2 ~]# vim /etc/chrony.conf
....omitted....
pool 172.25.254.50 iburst
....omitted....
[root@KA2 ~]# systemctl restart chronyd.service
#KA1 and KA2
dnf install keepalived -y
#KA1
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
timinglee@timinglee.org
}
notification_email_from timinglee@timinglee.org
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id KA1
vrrp_skip_check_adv_addr
#vrrp_strict
vrrp_garp_interval 1
vrrp_gna_interval 1
vrrp_mcast_group4 224.0.0.44
}
vrrp_instance WEBVIP {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:0
}
}
....omitted....
[root@KA1 ~]# systemctl restart keepalived.service
#KA2: only the lines that differ from KA1
global_defs {
router_id KA2
}
vrrp_instance WEBVIP {
state BACKUP
priority 80
}
[root@KA2 ~]# systemctl restart keepalived.service
Test:
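A quick way to see which node currently holds the VIP is to look for it in the output of `ip -o -4 addr show`. The following is a minimal sketch; the helper name `vip_present` is an assumption, while the interface and address are the ones used in this section.

```shell
#!/bin/bash
# Succeed if the IPv4 address given as $1 appears in `ip -o -4 addr show` output read from stdin.
# In that output the fourth field is "address/prefix".
vip_present() {
    awk -v vip="$1" '{ split($4, a, "/"); if (a[1] == vip) found = 1 } END { exit !found }'
}

# Run on KA1 and on KA2; only the current master should report the VIP.
if ip -o -4 addr show dev eth0 2>/dev/null | vip_present 172.25.254.100; then
    echo "VIP 172.25.254.100 is on this node"
else
    echo "VIP 172.25.254.100 is not on this node"
fi
```

Stopping keepalived on KA1 and running the check again on both nodes should show the VIP on KA2; starting it again on KA1 should bring the VIP back, since this example uses the default preemptive behavior.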
3.2 Separate log file
[root@KA1 ~]# vim /etc/sysconfig/keepalived
KEEPALIVED_OPTIONS="-D -S 6" #-D enables detailed logging; -S selects the syslog facility (0-7, meaning local0 to local7), here local6
[root@KA1 ~]# vim /etc/rsyslog.conf
local6.* /var/log/keepalived.log
[root@KA1 ~]# systemctl restart keepalived.service rsyslog.service
Test:
3.3 Separate sub-configuration files
#KA1
[root@KA1 ~]# mkdir /etc/keepalived/conf.d
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
....omitted....
include "/etc/keepalived/conf.d/*.conf"
[root@KA1 ~]# systemctl restart keepalived.service
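3.4 Non-preemptive mode and delayed preemption
Non-preemptive (when the node holding the VIP fails and later recovers, it does not take the VIP back)
#KA1
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
....omitted....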
vrrp_instance WEBVIP {
state BACKUP
interface eth0
nopreempt
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:0
}
}
[root@KA1 ~]# systemctl restart keepalived.service
#KA2
[root@KA2 ~]# vim /etc/keepalived/keepalived.conf
....omitted....
vrrp_instance WEBVIP {
state BACKUP
interface eth0
virtual_router_id 51
priority 80
nopreempt
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:0
}
}
[root@KA2 ~]# systemctl restart keepalived.service
Delayed preemption (after the higher-priority node recovers, it waits for a while before taking the VIP back)
#KA1
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
....omitted....
vrrp_instance WEBVIP {
state BACKUP
interface eth0
preempt_delay 10 #wait 10s before preempting
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:0
}
}
[root@KA1 ~]# systemctl restart keepalived.service
#KA2
[root@KA2 ~]# vim /etc/keepalived/keepalived.conf
....omitted....
vrrp_instance WEBVIP {
state BACKUP
interface eth0
preempt_delay 10 #wait 10s before preempting
virtual_router_id 51
priority 80
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:0
}
}
[root@KA2 ~]# systemctl restart keepalived.service
3.5 Unicast mode
#KA1
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
....omitted....
vrrp_instance WEBVIP {
state BACKUP
interface eth0
virtual_router_id 51
priority 100
preempt_delay 10
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:0
}
unicast_src_ip 172.25.254.50 #source: KA1 (.50)
unicast_peer {
172.25.254.60 #peer: KA2 (.60)
}
}
[root@KA1 ~]# systemctl restart keepalived.service
#KA2
[root@KA2 ~]# vim /etc/keepalived/keepalived.conf
....omitted....
vrrp_instance WEBVIP {
state BACKUP
interface eth0
virtual_router_id 51
priority 80
preempt_delay 10
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:0
}
unicast_src_ip 172.25.254.60 #source: KA2 (.60)
unicast_peer {
172.25.254.50 #peer: KA1 (.50)
}
}
[root@KA2 ~]# systemctl restart keepalived.service
Test:
3.6 Using scripts to send VIP failover notifications
#KA1
global_defs {
notification_email {
timinglee@timinglee.org
}
notification_email_from timinglee@timinglee.org
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id KA1
vrrp_skip_check_adv_addr
#vrrp_strict
vrrp_garp_interval 1
vrrp_gna_interval 1
vrrp_mcast_group4 224.0.0.44
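enable_script_security #enable script security checks (root-run scripts must not sit in a path writable by non-root users)
script_user root #user that runs the scripts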
}
vrrp_instance WEBVIP {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:0
}
unicast_src_ip 172.25.254.50
unicast_peer {
172.25.254.60
}
notify_master "/etc/keepalived/scripts/mail.sh master" #script path plus its argument
notify_backup "/etc/keepalived/scripts/mail.sh backup"
notify_fault "/etc/keepalived/scripts/mail.sh fault"
}
[root@KA1 ~]# dnf install s-nail sendmail -y
[root@KA1 ~]# systemctl enable --now sendmail.service
[root@KA1 ~]# echo test message |mail -s test 4411****4@qq.com
[root@KA1 ~]# mkdir /etc/keepalived/scripts
[root@KA1 ~]# vim /etc/keepalived/scripts/mail.sh
#!/bin/bash
# Mail the new VRRP state; keepalived passes the state name as $1.
case "$1" in
    master)
        echo master | mailx -s test 441158154@qq.com
        ;;
    backup)
        echo backup | mailx -s test 441158154@qq.com
        ;;
    fault)
        echo fault | mailx -s test 441158154@qq.com
        ;;
    *)
        exit 1
        ;;
esac
[root@KA1 ~]# chmod +x /etc/keepalived/scripts/mail.sh
Test:
Restart keepalived on KA1
3.7 Dual-master mode
#KA1
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
....omitted (leave the existing configuration unchanged and add a new vrrp_instance)....
vrrp_instance DBVIP {
state BACKUP
interface eth0
virtual_router_id 52
priority 80
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.200/24 dev eth0 label eth0:1
}
unicast_src_ip 172.25.254.50
unicast_peer {
172.25.254.60
}
}
[root@KA1 ~]# systemctl restart keepalived.service
#KA2
[root@KA2 ~]# vim /etc/keepalived/keepalived.conf
....omitted (same as above)....
vrrp_instance DBVIP {
state MASTER
interface eth0
virtual_router_id 52
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.200/24 dev eth0 label eth0:1
}
unicast_src_ip 172.25.254.60
unicast_peer {
172.25.254.50
}
}
[root@KA2 ~]# systemctl restart keepalived.service
Test:
3.8 Integrating with LVS for automatic health checks
#RS1 (repeat on RS2)
[root@RS1 ~]# ip a a 172.25.254.100/32 dev lo
[root@RS1 ~]# echo net.ipv4.conf.all.arp_announce = 2 >> /etc/sysctl.conf
[root@RS1 ~]# echo net.ipv4.conf.lo.arp_announce = 2 >> /etc/sysctl.conf
[root@RS1 ~]# echo net.ipv4.conf.lo.arp_ignore = 1 >> /etc/sysctl.conf
[root@RS1 ~]# echo net.ipv4.conf.all.arp_ignore = 1 >> /etc/sysctl.conf
[root@RS1 ~]# sysctl -p #apply the settings now
#KA1 (repeat on KA2)
[root@KA1 ~]# dnf install ipvsadm-1.31-6.el9.x86_64 -y
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
....omitted....
virtual_server 172.25.254.100 80 {
delay_loop 6
lb_algo rr
lb_kind DR
protocol TCP
real_server 172.25.254.11 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 2
retry 3
delay_before_retry 3
}
}
real_server 172.25.254.22 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 2
retry 3
delay_before_retry 3
}
}
}
[root@KA1 ~]# systemctl restart keepalived.service
[root@KA1 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.25.254.100:80 rr
-> 172.25.254.11:80 Route 1 0 0
-> 172.25.254.22:80 Route 1 0 0
Test:
#client
[root@client ~]# curl 172.25.254.100
RS2 - 172.25.254.22
[root@client ~]# curl 172.25.254.100
RS1 - 172.25.254.11
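Two requests are enough to see the alternation, but a longer run makes the distribution easier to judge. The following is a minimal sketch; `count_backends` is a made-up helper, and it assumes each real server answers with a single identifying line as in the output above.

```shell
#!/bin/bash
# Count identical response lines read from stdin and print "<response>: <count>".
count_backends() {
    sort | uniq -c | awk '{ count = $1; $1 = ""; sub(/^ +/, ""); print $0 ": " count }'
}

# On the client, for example:
#   for i in $(seq 10); do curl -s 172.25.254.100; done | count_backends
printf 'RS2 - 172.25.254.22\nRS1 - 172.25.254.11\nRS2 - 172.25.254.22\n' | count_backends
```

With `lb_algo rr` and both real servers healthy, the two counts should come out roughly equal. Stopping the web service on one real server and repeating the run should, once the HTTP_GET check has failed, show only the remaining server.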
3.9 Handling multiple services in dual-master mode
#client
[root@client ~]# dnf install mariadb-server -y
#RS1
[root@RS1 ~]# dnf install mariadb-server -y
[root@RS1 ~]# vim /etc/my.cnf
[mysqld]
server-id=1
[root@RS1 ~]# systemctl restart mariadb
[root@RS1 ~]# mysql -e "grant all on *.* to lee@'%' identified by 'lee';"
[root@RS1 ~]# ip a a 172.25.254.200/32 dev lo
#RS2
[root@RS2 ~]# dnf install mariadb-server -y
[root@RS2 ~]# vim /etc/my.cnf
[mysqld]
server-id=2
[root@RS2 ~]# systemctl restart mariadb
[root@RS2 ~]# mysql -e "grant all on *.* to lee@'%' identified by 'lee';"
[root@RS2 ~]# ip a a 172.25.254.200/32 dev lo
#KA1 (repeat on KA2)
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
...omitted (the block below is new)...
virtual_server 172.25.254.200 3306 {
delay_loop 6
lb_algo rr
lb_kind DR
protocol TCP
real_server 172.25.254.11 3306 {
weight 1
TCP_CHECK {
connect_port 3306
connect_timeout 10
retry 3
delay_before_retry 5
}
}
real_server 172.25.254.22 3306 {
weight 1
TCP_CHECK {
connect_port 3306
connect_timeout 10
retry 3
delay_before_retry 5
}
}
}
[root@KA1 ~]# systemctl restart keepalived.service
Test:
3.10 Integrating keepalived with HAProxy
#KA1 (repeat on KA2)
[root@KA1 ~]# dnf install haproxy -y
[root@KA1 ~]# vim /etc/haproxy/haproxy.cfg
...omitted...
listen webcluster
bind *:80
mode http
balance roundrobin
server web1 172.25.254.11:80 check inter 3s fall 2 rise 3
server web2 172.25.254.22:80 check inter 3s fall 2 rise 3
...comment out the other frontend and backend sections...
[root@KA1 ~]# echo net.ipv4.ip_nonlocal_bind=1 >> /etc/sysctl.conf
[root@KA1 ~]# sysctl -p #apply the setting now
[root@KA1 ~]# systemctl enable --now haproxy.service
[root@KA1 ~]# vim /etc/keepalived/scripts/haproxy.sh
#!/bin/bash
killall -0 haproxy &> /dev/null
[root@KA1 ~]# chmod +x /etc/keepalived/scripts/haproxy.sh
[root@KA1 ~]# vim /etc/keepalived/keepalived.conf
global_defs {
notification_email {
timinglee@timinglee.org
}
notification_email_from timinglee@timinglee.org
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id KA1
vrrp_skip_check_adv_addr
#vrrp_strict
vrrp_garp_interval 1
vrrp_gna_interval 1
vrrp_mcast_group4 224.0.0.44
enable_script_security
script_user root
}
vrrp_script CHECK_HAPROXY {
script "/etc/keepalived/scripts/haproxy.sh"
interval 1
weight -30
fall 2
rise 2
timeout 2
}
vrrp_instance WEBVIP {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:0
}
unicast_src_ip 172.25.254.50
unicast_peer {
172.25.254.60
}
track_script {
CHECK_HAPROXY #track this script
}
}
....comment out the earlier virtual_server blocks....
[root@KA1 ~]# systemctl restart keepalived.service
Test:
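The failover in this setup depends on how `weight -30` changes the priority of KA1 once the check script fails. The following is a minimal sketch of that arithmetic; `effective_priority` is a hypothetical helper, and the priority of 80 for KA2 is assumed from section 3.1 because the KA2 configuration is not shown here.

```shell
#!/bin/bash
# Priority of a node after applying one tracked script.
# Arguments: base priority, script weight, script result (1 = passing, 0 = failing).
# A negative weight lowers the priority while the script fails;
# a positive weight raises it while the script passes.
effective_priority() {
    local base=$1 weight=$2 ok=$3
    if [ "$weight" -lt 0 ] && [ "$ok" -eq 0 ]; then
        echo $((base + weight))
    elif [ "$weight" -gt 0 ] && [ "$ok" -eq 1 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

effective_priority 100 -30 1    # haproxy running on KA1: 100
effective_priority 100 -30 0    # haproxy stopped on KA1: 70
```

With haproxy stopped, KA1 drops to 70, which is below the assumed 80 on KA2, so the VIP should move to KA2 after `fall 2` consecutive failed checks. A weight smaller in magnitude than the 20-point gap between the two nodes would not trigger a failover.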