Oracle故障处理-ORA-01578-Oracle坏块处理案例

发布于:2023-09-22 ⋅ 阅读:(143) ⋅ 点赞:(0)


问题描述

迪斯杰工程师吴奇,数据库同步库服务器10.74.132.188,报该错。
迪斯杰系统进入方法:

su - dsg
cds
cdl

目录下面有ro-r4的日志,现在只有r0和r0有这样的情况。
直接vi r0和r1看具体的报错日志信息情况。
数据装载正常就行了。
cds
执行监控**./mon*** ,查看数据同步的实时状态。

一、查找oracle报错信息

1.1、找到 alter<sid>.log

show parameter_dump_dest;
/home/oracle/app/diag/rdbms/orcl/ORCL/trace

进入linux服务器,
服务器要安装了rz,sz
yum install lrzsz
当然你的本地windows主机也通过ssh连接了linux服务器
运行rz,会将windows的文件传到linux服务器
运行sz filename,会将文件下载到windows本地

ls alert*log
sz alert_ORCL.log

SecureCRT直接下载到本地:
c:\Users\Administrator\Downloads
发现报错是:文件号5,块号1395985

1.2、找到坏块详情

SELECT *
  FROM dba_extents
 WHERE file_id = 5
   AND 1395985 BETWEEN block_id AND block_id + blocks - 1;
SEGMENT_NAME SEGMENT_TYPE
IDX_DRV_LOG_CLRQ_NEW INDEX

发现索引有坏块。
重建索引:

alter index IDX_DRV_LOG_CLRQ_NEW rebuild online;

1.3、Linux检查磁盘物理坏块。

root下,查找出所有disk ,

sudo fdisk -l
disk /dev/sda: 299.0GB
disk /dev/sdb: 1998.0GB
disk /dev/mapper/vg_orcldb-lv_swap: 68.7GB
disk /dev/mapper/vg_orcldb-Logvol01: 229.8GB

然后一个一个磁盘检查:

sudo badblocks -v /dev/sda
0 bad blocks found.
sudo badblocks -v /dev/sdb
sudo badblocks -v /dev/mapper/vg_orcldb-lv_swap
sudo badblocks -v /dev/mapper/vg_orcldb-Logvol01

二、分发库重建索引

[oracle@orcldb ~]$ sqlplus trff_app/trff_app as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue Nov 13 03:44:58 2018

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> show user
USER is "SYS"
SQL> conn trff_app/trff_app as syydba
SP2-0306: Invalid option.
Usage: CONN[ECT] [{logon|/|proxy} [AS {SYSDBA|SYSOPER|SYSASM}] [edition=value]]
where <logon> ::= <username>[/<password>][@<connect_identifier>]
      <proxy> ::= <proxyuser>[<username>][/<password>][@<connect_identifier>]
SQL> conn trff_app/trff_app
Connected.
SQL> show user
USER is "TRFF_APP"
SQL> alter index IDX_DRV_LOG_CLRQ_NEW rebuild online;

Index altered.

二、查询所有坏块

select * from v$database_block_corruption where file#=5;

结果如下:

FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTION_TYPE
1 5 1207155 1 0 FRACTURED
2 5 1395985 1 0 FRACTURED
3 5 140013 1 0 FRACTURED
4 5 156443 1 0 FRACTURED
5 5 134810 1 0 FRACTURED
6 5 462393 1 0 FRACTURED
7 5 119537 1 0 FRACTURED
8 5 981166 1 0 FRACTURED

三、分别查出单独的坏块类型

SELECT *
  FROM dba_extents
 WHERE file_id = 5
   AND 462393 BETWEEN block_id AND block_id + blocks - 1;

结果:

SEGMENT_NAME SEGMENT_TYPE
IDX_DRV_PREASIGN_ZT_NEW INDEX
SELECT *
  FROM dba_extents
 WHERE file_id = 5
   AND 1207155 BETWEEN block_id AND block_id + blocks - 1;

结果:

SEGMENT_NAME SEGMENT_TYPE
VEHICLE TABLE
--结果
1207155 VEHICLE TABLE
1395985 IDX_DRV_LOG_CLRQ_NEW INDEX ok
140013 I_VEHICLE_CLSBDH INDEX
156443 I_VEHICLE_JYHGBZBH INDEX
134810 I_VEHICLE_XSZBH INDEX
462393 IDX_DRV_PREASIGN_ZT_NEW INDEX DRV_PREASIGN
119537 VEHICLE TABLE
981166 PK_VEH_PRINT INDEX VEH_PRINT

四、尝试导出,坏块在VEHICLE表中的数据,报错,如果表有坏块无法导出。

exp trff_app/trff_app@orcl file=/home/oracle/trff_app_201811141133.dmp log=/home/oracle/exp_trff_app_201811141133.log tables=VEHICLE

五、如果是数据坏块,10231事件指定数据库在进行全表扫描时跳过损坏的块

alter system set events='10231 trace name context forever,level 10';

六、然后导出

exit
exp trff_app/trff_app@orcl file=/home/oracle/trff_app_201811141133.dmp log=/home/oracle/exp_trff_app_201811141133.log tables=VEHICLE

七、取消events设置

alter system set events='10231 trace name context off';

八、重建表的索引

conn trff_app/trff_app
--alter index IDX_DRV_LOG_CLRQ_NEW rebuild online;
alter index I_VEHICLE_CLSBDH rebuild online;
alter index I_VEHICLE_JYHGBZBH rebuild online;
alter index I_VEHICLE_XSZBH rebuild online;
alter index IDX_DRV_PREASIGN_ZT_NEW rebuild online;
alter index PK_VEH_PRINT rebuild online;

八、重建索引报错.需要清空表数据。

alter index I_VEHICLE_CLSBDH rebuild online;

九、清空表(truncate)、重建索引

truncate table VEHICLE;
alter index I_VEHICLE_CLSBDH rebuild online;
alter index I_VEHICLE_JYHGBZBH rebuild online;
alter index I_VEHICLE_XSZBH rebuild online;

九、尝试把数据导入

imp trff_app/trff_app@orcl file=/home/oracle/trff_app_201811141133.dmp log=/home/oracle/imp_trff_app_201811141524.log tables=VEHICLE indexes=n grants=n constraints=n statistics=none data_only=y fromuser=trff_app touser=trff_app buffer=5400000 ignore=y

十、后续操作

10.1、查看扩展的块信息:

select * from dba_extents where file_id=5 and 462393 between block_id and block_id+blocks-1;

10.2、察看游标数:

select count(*) from v$open_cursor

10.3、查看等待事件:

select 
p.SPID pid,
s.SID,
s.SERIAL#,
s.USERNAME,
w.event,
w.P1,
w.P1TEXT,
w.P2,
w.P2TEXT,
w.P3,
w.P3TEXT,
sq.SQL_TEXT,
w.WAIT_TIME,
w.SECONDS_IN_WAIT,
w.STATE
from v$session_wait w,v$session s,v$process p,v$sql sq
where w.event not like 'SQL%' and w.sid=s.sid and s.PADDR=p.ADDR and
s.SQL_ADDRESS = sq.ADDRESS and s.SQL_HASH_VALUE=sq.HASH_VALUE

10.4、导入卡住,无法导入,查看锁表

select s.sid,
       s.serial#,
       l.oracle_username,
       l.os_user_name,
       a.object_name,
       l.locked_mode
from v$locked_object l,
     dba_objects a,
     v$session s
  where a.object_id = l.object_id
    and l.session_id = s.sid;

10.5、解锁表

--alter system kill session 'sid,serial#';
alter system kill session '469,6539';

10.6、无法解锁,迪斯杰服务不停无法导入。

SID SERIAL# ORACLE_USERNAME OS_USER_NAME OBJECT_NAME LOCKED_MODE
1 703 19 DSG dsg DRV_LOG 3
2 469 6539 TRFF_APP oracle VEHICLE 3
3 729 19 DSG dsg VEHICLE_TEMP 3
4 703 19 DSG dsg DRV_TEMP 3
5 729 19 DSG dsg VIO_SURVEIL 3

10.7、停止迪斯杰服务:

su -sdg
cds
./stop_vagentd to_jz1
./check to_jz1
./check all(检查所有)
--启动迪斯杰服务
./start_vagentd to_jz1
./mon*(监视情况)
ctrl+c停止监控

10.8、进数据库杀死进程,杀不掉

SQL> alter system kill session '469,6539';
alter system kill session '469,6539'
*
ERROR at line 1:
ORA-00031: session marked for kill

10.9、直接linux操作系统下杀

--查出session的spid
select * from v$session s,v$process p where s.PADDR = p.ADDR and s.sid='469';
    SPID OSUSER PROGRAM
1 35867 oracle imp@orcldb (TNS V1-V3)
-- -9 杀死spid
 kill -9 35867

10.10、准备重建普通索引

select count(*) from DRV_PREASIGN
161799
--
select * from dba_extents where file_id=5 and 462393 between block_id and block_id+blocks-1;
alter index IDX_DRV_PREASIGN_ZT_NEW rebuild online;

10.11、准备重建主键索引

select count(*) from VEH_PRINT
8382343
PK_VEH_PRINT

10.12、查看约束

SELECT * FROM DBA_CONSTRAINTS A
WHERE A.owner='TRFF_APP' 
AND A.constraint_name='PK_VEH_PRINT'

10.13、重建主键索引成功:

alter index PK_VEH_PRINT rebuild online;
select * from dba_extents where file_id=5 and 981166 between block_id and block_id+blocks-1;
--居然干了
index altered.

10.14、在迪斯杰里面启动服务,察看状态

cds
./start_vagentd to_jz1
./mon*

10.15、停止迪斯杰服务

./stop_vagentd to_jz1
./check to_jz1

10.16、删有坏块的表

drop table VEHICLE;

10.17、分析数据文件、标记坏块为损坏

su - oracle
dbv file=/home/oracle/app/oradata/orcl/data01.dbf

[root@orcldb ~]# su - oracle
[oracle@orcldb ~]$ dbv file=/home/oracle/app/oradata/orcl/data01.dbf

DBVERIFY: Release 11.2.0.4.0 - Production on Wed Nov 14 09:45:57 2018

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /home/oracle/app/oradata/orcl/data01.dbf

DBV-00200: Block, DBA 21111533, already marked corrupt
csc(0x0001.f81f900e) higher than block scn(0x0000.00000000)
Page 140013 failed with check code 6054

DBV-00200: Block, DBA 21127963, already marked corrupt
csc(0x0001.f81f9007) higher than block scn(0x0000.00000000)
Page 156443 failed with check code 6054

DBV-00200: Block, DBA 21952686, already marked corrupt
csc(0x0001.f81f873f) higher than block scn(0x0000.00000000)
Page 981166 failed with check code 6054

DBV-00200: Block, DBA 22178675, already marked corrupt
csc(0x0001.f81f9075) higher than block scn(0x0000.00000000)
Page 1207155 failed with check code 6054

DBV-00200: Block, DBA 22367505, already marked corrupt
csc(0x0001.f81f9040) higher than block scn(0x0000.00000000)
Page 1395985 failed with check code 6054


DBVERIFY - Verification complete

Total Pages Examined         : 2435072
Total Pages Processed (Data) : 1379736
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 916053
Total Pages Failing   (Index): 4
Total Pages Processed (Other): 19952
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 119331
Total Pages Marked Corrupt   : 5
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 4163919293 (1.4163919293)
[oracle@orcldb ~]$ 

10.18、启动迪斯杰监控进程后,报ora01410,无效的rowid错误,至迪斯杰吴奇解决即可。

tail -5 /home/dsg/supersync/log/to_jz1/log.r1
2 transactions rollbacked (oci: 1410)
TRFF_APP.VEHICLE (objn 96758, objd 96758) cleaned from cache
OCI Error -1 occurred at File xf1_to_oracle_upd_rid.c:1207.
ORA-01410: 无效的 ROWID
fail to process /home/dsg/supersync/xldr/to_jz1/real1/2994525.xf1 xf1_to_oracle.c:5910, retry it.fail to process /home/dsg/supersync/xldr/to_jz1/real1/2994525.xf1 xf1_to_oracle.c:5910, retry it.

10.19、直接到分发库里面删表

可以正常删掉。

Oracle故障处理-ORA-01578-Oracle坏块处理案例完成,至此案例结束!

本文含有隐藏内容,请 开通VIP 后查看

网站公告

今日签到

点亮在社区的每一天
去签到