Preface
With Oracle RAC, make sure the configuration is identical across all nodes during the system setup phase: the OS version, NIC names, public and private network subnets, memory, disk space, sysctl settings, resource limits, and so on. Keeping these consistent up front saves a great deal of time later.
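One low-effort way to catch drift is to snapshot the settings that must match into a file on each node and diff the files. A minimal sketch (the `snapshot` helper and file paths are hypothetical, not from any Oracle tool; only a few representative values are sampled):

```shell
# Hypothetical helper: capture a few settings that must match across
# RAC nodes into one file per node, then diff the files to spot drift.
snapshot() {
  out="$1"
  {
    uname -r                                  # kernel / OS level
    cat /proc/sys/kernel/shmmax 2>/dev/null   # sample sysctl value
    ulimit -n                                 # open-file limit
  } > "$out"
}

# Run on each node, e.g. snapshot /tmp/cfg_$(hostname).txt, then copy
# the files to one node and compare:
# diff /tmp/cfg_master.txt /tmp/cfg_standby.txt && echo "configs match"
```

Extend the list inside `snapshot` with whatever else your checklist covers (NIC names, mount sizes, limits.conf entries).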
Error message
During the Grid Infrastructure installation, the root.sh script must be run on every node so that each node joins the cluster. On the standby node it failed as follows:
[root@standby grid]# ./root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /data/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /data/app/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to oracle-ohasd.service
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node master, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Start of resource "ora.cssd" failed
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'standby'
CRS-2672: Attempting to start 'ora.gipcd' on 'standby'
CRS-2676: Start of 'ora.cssdmonitor' on 'standby' succeeded
CRS-2676: Start of 'ora.gipcd' on 'standby' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'standby'
CRS-2672: Attempting to start 'ora.diskmon' on 'standby'
CRS-2676: Start of 'ora.diskmon' on 'standby' succeeded
CRS-2674: Start of 'ora.cssd' on 'standby' failed
CRS-2679: Attempting to clean 'ora.cssd' on 'standby'
CRS-2681: Clean of 'ora.cssd' on 'standby' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'standby'
CRS-2677: Stop of 'ora.gipcd' on 'standby' succeeded
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'standby'
CRS-2677: Stop of 'ora.cssdmonitor' on 'standby' succeeded
CRS-5804: Communication error with agent process
CRS-4000: Command Start failed, or completed with errors.
Failed to start Oracle Grid Infrastructure stack
Failed to start Cluster Synchorinisation Service in clustered mode at /data/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 1278.
/data/app/11.2.0/grid/perl/bin/perl -I/data/app/11.2.0/grid/perl/lib -I/data/app/11.2.0/grid/crs/install /data/app/11.2.0/grid/crs/install/rootcrs.pl execution failed
Problem
The failure is caused by the private NIC's IP address not matching the address configured in the hosts file. RAC nodes address each other by hostname rather than by a hard-coded IP, and the runcluvfy.sh pre-check only verifies that the private addresses are mutually pingable, so a mismatch like this slips through the checks and only surfaces when root.sh tries to join the cluster.
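The mismatch is easy to catch by hand before running root.sh. A hedged sketch (the hostnames and NIC name are taken from this article; the `check_priv_ip` helper is hypothetical):

```shell
# On the standby node, the two values to compare would come from:
#   getent hosts standby-priv | awk '{print $1}'
#   ip -4 -o addr show eth1 | awk '{print $4}' | cut -d/ -f1
# Hypothetical helper comparing a hosts entry against the NIC address:
check_priv_ip() {
  hosts_ip="$1"; nic_ip="$2"
  if [ "$hosts_ip" = "$nic_ip" ]; then
    echo "OK: NIC matches hosts entry ($nic_ip)"
  else
    echo "MISMATCH: hosts has $hosts_ip but the NIC has $nic_ip"
  fi
}

# The values from this failure:
check_priv_ip 192.168.1.55 192.168.1.155
```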
Private NIC configuration
[root@standby network-scripts]# cat ifcfg-eth1
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=no
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=eth1
#UUID=8d3b8381-d9cc-43d0-a27f-ae75bea5dc03
DEVICE=eth1
ONBOOT=yes
NM_CONTROLLED=yes
IPADDR=192.168.1.155 -- the incorrect address
NETMASK=255.255.255.0
GATEWAY=192.168.1.11
hosts configuration on the master node
Note: we look at the master node's hosts file because the Grid software only needs to be installed from the master node; at the end of the installation it automatically copies the clusterware to all nodes. Once the copy completes, root.sh must be run on every node to join them all to the cluster.
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
# public ip
192.168.211.154 master.hek.cn master
192.168.211.155 standby.hek.cn standby
# private ip
192.168.1.54 master-priv
192.168.1.55 standby-priv -- the private IP specified here does not match the IP on that node's private NIC, so root.sh fails on that node
# VIP
192.168.211.156 master-vip
192.168.211.157 standby-vip
# scan ip
192.168.211.158 scan-ip
Solution
1. Forcibly remove the current node's configuration from the cluster
[root@standby install]# cd /data/app/11.2.0/grid/crs/install/
[root@standby install]# ./roothas.pl -verbose -deconfig -force
Using configuration parameter file: ./crsconfig_params
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Delete failed, or completed with errors.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'standby'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'standby'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'standby'
CRS-2677: Stop of 'ora.mdnsd' on 'standby' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'standby' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'standby' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully deconfigured Oracle Restart stack
2. Correct the private IP
[root@standby network-scripts]# cat ifcfg-eth1
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=no
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=eth1
#UUID=8d3b8381-d9cc-43d0-a27f-ae75bea5dc03
DEVICE=eth1
ONBOOT=yes
NM_CONTROLLED=yes
IPADDR=192.168.1.55 -- now consistent with the hosts entry on the master node
NETMASK=255.255.255.0
GATEWAY=192.168.1.11
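Before rebooting, it is worth a quick sanity check that the corrected address sits in the same subnet as the master's private IP. A sketch assuming the /24 mask shown above (`NETMASK=255.255.255.0`); the `same_subnet24` helper is hypothetical:

```shell
# Hypothetical /24 check: with a 255.255.255.0 mask, both private IPs
# must share their first three octets.
same_subnet24() {
  [ "${1%.*}" = "${2%.*}" ] && echo "same /24" || echo "different /24"
}

same_subnet24 192.168.1.54 192.168.1.55   # master-priv vs corrected standby-priv
```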
3. Reboot the current server
reboot
4. Re-run the root.sh script; this time it succeeds
[root@standby ~]# cd /data/app/11.2.0/grid/
[root@standby grid]# ./root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /data/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /data/app/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to oracle-ohasd.service
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node master, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
sh: /bin/netstat: No such file or directory
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
[root@standby grid]#