空块的逻辑损坏处理一例
给一个DB部署备份,老是在备份一个文件的时候alert日志报错:
Fri Aug 1 01:59:30 2014
The input backup piece /oradata/backup/3apegrof_1_1 is in compressed format.
No other copies of piece /oradata/backup/3apegrof_1_1 to read, reading current piece until end.
Fri Aug 1 01:59:46 2014
The input backup piece /oradata/backup/2qpdelim_1_1 is in compressed format.
No other copies of piece /oradata/backup/2qpdelim_1_1 to read, reading current piece until end.
Fri Aug 1 02:01:58 2014
Errors in file /oracle/admin/pi/udump/pi_ora_14664.trc:
ORA-00600: internal error code, arguments: [729], [32], [space leak], [], [], [], [], []
Fri Aug 1 02:22:59 2014
ALTER SYSTEM: Flushing buffer cache
Fri Aug 1 02:38:23 2014
Error backing up file 8, block 14: logical corruption
Fri Aug 1 02:47:03 2014
alter database datafile ‘/oradata/pi/pi/users01.dbf’ autoextend off
Fri Aug 1 02:47:03 2014
Completed: alter database datafile ‘/oradata/pi/pi/users01.dbf’ autoextend off
Fri Aug 1 02:48:19 2014
ALTER SYSTEM: Flushing buffer cache
Fri Aug 1 02:50:37 2014
Error backing up file 8, block 14: logical corruption
可以使用DBV和RMAN进行坏块检查:
-
使用DBV进行检查,发现有一个逻辑坏块,但Total Pages Marked Corrupt : 0显示这个坏块是个空数据块:
[oracle@tmstest udump]$ dbv file=’/oradata/test1.dbf’
DBVERIFY: Release 10.2.0.4.0 – Production on Fri Aug 1 02:56:23 2014
Copyright (c) 1982, 2007, Oracle. All rights reserved.
DBVERIFY – Verification starting : FILE = /oradata/test1.dbf
Block Checking: DBA = 33554446, Block Type = KTB-managed data block
data header at 0x2b04ab004264
kdbchk: row locked by non-existent transaction
table=0 slot=5
lockid=2 ktbbhitc=2
Page 14 failed with check code 6101
DBVERIFY – Verification complete
Total Pages Examined : 640
Total Pages Processed (Data) : 5
Total Pages Failing (Data) : 1
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 578
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 57
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Highest block SCN : 1934656 (0.1934656)
DBA = 33554446转化为16进制就是0X0200000E(file#=8,block#=14).
-
使用RMAN脚本检查坏块:
检查单个数据文件:
RMAN> run {
allocate channel d1 type disk;
backup check logical validate datafile 4;
release channel d1;
}
检查所有数据文件:
run {
allocate channel d1 type disk;
backup check logical validate database;
release channel d1;
}
运行以上任一脚本之后,发现的坏块为记录到视V$DATABASE_BLOCK_CORRUPTION中:
SQL> select * from V$DATABASE_BLOCK_CORRUPTION;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
———- ———- ———- —————— ———
8 14 1 1772717 LOGICAL
并运行如下脚本查看坏块位于哪个段中:
SQL>SELECT tablespace_name, segment_type, owner, segment_name FROM dba_extents
2 WHERE file_id = 8 and 14 between block_id AND block_id + blocks – 1;
no rows selected
没有返回行,证明了DBV的检查结果—-该坏块是一个空块,结合alert中的报错ORA-00600:[729], [32], [space leak],猜测可能是segment重建时回收extent遇到异常所致,属于Oracle的内部错误。
由于在块损坏之前这个DB没有任何备份,如果是数据坏块的话意味着可能需要通过一些非常规方式去恢复块中的数据,但现在可以松一口气了,空闲坏块说明不会存在数据丢失,基本上可以忽略这个报错。
但如果我们不想看到alert中有这样的报错呢?那么我们可以格式化这个空闲坏块,具体步骤如下:
-
查出该坏块所在的空闲extent对应的大小:
SQL> Select BYTES from dba_free_space where file_id=8 and 14 between block_id and block_id + blocks -1;
BYTES
—————- ———- ———- ———- ———- ——
65536
-
手动对某个表多执行几次空间分配操作,分配的单元为上述空闲extent大小(需确保数据文件为autoextend off):
SQL>alter table pio.s allocate extent (DATAFILE ‘/oradata/pi/pi/users01.dbf ‘ SIZE 64K);
-
验证该空闲坏块是否已经被格式化:
SQL>SELECT segment_name FROM dba_extents
2 WHERE file_id = 8 and 14 between block_id AND block_id + blocks – 1;
SEGMENT_NAME
—————-
S
-
使用DBV验证,坏块已经被修复了:
[oracle@tmstest udump]$ dbv file=’/oradata/test1.dbf’
DBVERIFY: Release 10.2.0.4.0 – Production on Fri Aug 1 02:56:23 2014
Copyright (c) 1982, 2007, Oracle. All rights reserved.
DBVERIFY – Verification starting : FILE = /oradata/test1.dbf
DBVERIFY – Verification complete
Total Pages Examined : 640
Total Pages Processed (Data) : 5
Total Pages Failing (Data) : 1
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 578
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 57
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Highest block SCN : 1934656 (0.1934656)
至此这个问题已经得到解决,alert日志中已经没有报错信息了。
建议:
1、如果有备份,用rman做blockrecover:
RMAN>run {
allocate channel d1 type disk;
blockrecover corruption list;
release channel d1;
}
2、如果没有备份,该坏块会在这个块重新被分配时被format掉,所以:
1)可以参考下列文档去手工格式化这个坏块:
2)也可以等到块重新被分配时自动格式化,这样为了使RMAN备份成功进行,需要对rman指定skip坏块,例如:
run {
set maxcorrupt for datafile 8 to 1;
backup database;
}
物理坏块和逻辑坏块的区别在于块的数据内容是否损坏。
物理坏块通常伴随ORA-1578之类坏块的错误报出来,产生的原因有:
1、块头损坏
2、写入不完全(会导致块头和块尾的校验位不一致)
3、块校验无效
4、块错位(跑到别的对象下了)
5、块被清零(通常是第三方软件导致,因为oracle是不会写出全零块的)
逻辑坏块是有个有效的校验位,块的物理结构没问题。但块内的数据结构乱了,即块内容有问题,通常由ORA-00600之类的异常导致。
未经允许不得转载:SRE空间 » 空块的逻辑损坏处理一例
评论前必须登录!
注册