One Can Succeed at Almost Anything For Which He Has Enthusiasm...: Rman Data Recovery Advisor in Oracle 11g

The Data Recovery Advisor automatically diagnoses corruption or loss of persistent data on disk, determines the appropriate repair options, and executes repairs at the user's request. This reduces the complexity of recovery process, thereby reducing the Mean Time To Recover (MTTR). The advisor comes in two flavors: command line mode and as a screen in Oracle Enterprise Manager Database Control.

Below are the following command used in Data Recovery Advisor

1. LIST FAILURE

2. LIST FAILURE DETAILS

3. ADVISE FAILURE

4. REPAIR FAILURE

Before we can start identifying and repairing failures, we need to damage a datafile.In this scenario, I have shut my database and open one of the datafile(user01.dbf) with wordpad(os utility ) and edit two letter and save it , and then open the database and got following error message.

C:\>sqlplus sys/ramtech@noida as sysdba

SQL*Plus: Release 11.1.0.6.0 - Production on Fri May 6 14:11:23 2011

Connected to an idle instance.

SQL> startup

ORACLE instance started.

Total System Global Area 426852352 bytes

Fixed Size 1333648 bytes

Variable Size 318768752 bytes

Database Buffers 100663296 bytes

Redo Buffers 6086656 bytes

Database mounted.

ORA-01157 : cannot identify/lock data file 4 - see DBWR trace file

ORA-01110 : data file 4: 'D:\ORACLE\ORADATA\NOIDA\USERS01.DBF'

Since the error has occurred , so we want to find out what happened. So we connect to RMAN and check the failure.

C:\>rman target sys/ramtech@noida

Recovery Manager: Release 11.1.0.6.0 - Production on Fri May 6 14:16:06 2011

connected to target database: NOIDA (DBID=1503672566, not open)

LIST FAILURE : If there is no error, this command will come back with the message: "no failures found that match specification " and if there is an error, a more explanatory message will follow:

RMAN> list failure;

using target database control file instead of recovery catalog

List of Database Failures

========================

Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

382 HIGH OPEN 06-MAY-11 One or more non-system datafiles are corrupt

This message shows that some datafiles are corrupt . As the datafiles belong to a tablespace other than SYSTEM, the database stays up with that tablespace being offline. This error is fairly critical, so the priority is set to HIGH. Each failure gets a Failure ID, which makes it easier to identify and address individual failures. For instance we can issue the following command to get the details of Failure 382.

LIST FAILURE DETAILS : This command will show us the exact cause of the error.This command will give the details about the failure id i.e, 382

RMAN> list failure 382 detail;

List of Database Failures

=========================

Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

382 HIGH OPEN 06-MAY-11 One or more non-system datafiles are corrupt

Impact: See impact for individual child failures

List of child failures for parent failure ID 382

Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

385 HIGH OPEN 06-MAY-11 Datafile 4: 'D:\ORACLE\ORADATA\NOIDA\USERS01.DBF' is corrupt

Impact: Some objects in tablespace USERS might be unavailable

ADVISE FAILURE : It responds with a detailed explanation of the error and how to correct it:

RMAN> advise failure;

List of Database Failures

=========================

Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

382 HIGH OPEN 06-MAY-11 One or more non-system datafiles are corrupt

Impact: See impact for individual child failures

List of child failures for parent failure ID 382

Failure ID Priority Status Time Detected Summary

---------- -------- --------- ------------- -------

385 HIGH OPEN 06-MAY-11 Datafile 4: 'D:\ORACLE\ORADATA\NOIDA\USERS01.DBF' is corrupt

Impact: Some objects in tablespace USERS might be unavailable

analyzing automatic repair options; this may take some time

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=155 device type=DISK

analyzing automatic repair options complete

Mandatory Manual Actions

========================

no manual actions available

Optional Manual Actions

=======================

no manual actions available

Automated Repair Options

========================

Option Repair Description

------ ------------------

1 Restore and recover datafile 4

Strategy: The repair includes complete media recovery with no data loss

Repair script: d:\oracle\diag\rdbms\noida\noida\hm\reco_1928090031.hm

This output has several important parts. First, the advisor analyzes the error. In this case, it's pretty obvious: the datafile is corrupt . Next, it suggests a strategy. In this case, this is fairly simple as well: restore and recover the file. The dynamic performance view V$IR_MANUAL_CHECKLIST also shows this information.

However, the most useful task Data Recovery Advisor does is shown in the very last line: it generates a script that can be used to repair the datafile or resolve the issue. The script does all the work; we don't have to write a single line of code.

Sometimes the advisor doesn't have all the information it needs. For instance, in this case, it does not know if someone moved the file to a different location or renamed it. In that case, it advises to move the file back to the original location and name (under Optional Manual Actions).

So the script is prepared for us . I would verify what the script actually does first. So, I issue the following command to "preview" the actions the repair task will execute:

RMAN> repair failure preview ;

Strategy: The repair includes complete media recovery with no data loss

Repair script: d:\oracle\diag\rdbms\noida\noida\hm\reco_1928090031.hm

contents of repair script:

# restore and recover datafile

restore datafile 4;

recover datafile 4;

This is good; the repair seems to be doing the same thing I would have done myself using RMAN. Now I can execute the actual repair by issuing:

REPAIR FAILURE : This command will execute the above script. After recovery the tablespace it is prompt for opening the database .

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss

Repair script: d:\oracle\diag\rdbms\noida\noida\hm\reco_1928090031.hm

contents of repair script:

# restore and recover datafile

restore datafile 4;

recover datafile 4;

Do you really want to execute the above repair (enter YES or NO)? yes

executing repair script

Starting restore at 06-MAY-11

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00004 to D:\ORACLE\ORADATA\NOIDA\USERS01.DBF

channel ORA_DISK_1: reading from backup piece D:\RMAN_BKP\03MBDHJ7_1_1

channel ORA_DISK_1: piece handle=D:\RMAN_BKP\03MBDHJ7_1_1 tag=TAG20110503T141047

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:03

Finished restore at 06-MAY-11

Starting recover at 06-MAY-11

using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 20 is already on disk as file D:\ARCHIVE\NOIDA_20_1_749730106.ARC

archived log for thread 1 with sequence 1 is already on disk as file D:\ARCHIVE\NOIDA_1_1_750184743.ARC

archived log for thread 1 with sequence 2 is already on disk as file D:\ARCHIVE\NOIDA_2_1_750184743.ARC

archived log for thread 1 with sequence 3 is already on disk as file D:\ARCHIVE\NOIDA_3_1_750184743.ARC

archived log for thread 1 with sequence 4 is already on disk as file D:\ARCHIVE\NOIDA_4_1_750184743.ARC

archived log for thread 1 with sequence 5 is already on disk as file D:\ARCHIVE\NOIDA_5_1_750184743.ARC

archived log file name=D:\ARCHIVE\NOIDA_20_1_749730106.ARC thread=1 sequence=20

archived log file name=D:\ARCHIVE\NOIDA_1_1_750184743.ARC thread=1 sequence=1

archived log file name=D:\ARCHIVE\NOIDA_2_1_750184743.ARC thread=1 sequence=2

archived log file name=D:\ARCHIVE\NOIDA_3_1_750184743.ARC thread=1 sequence=3

media recovery complete, elapsed time: 00:00:13

Finished recover at 06-MAY-11

repair failure complete

Do you want to open the database (enter YES or NO)? Y

database opened

Note how RMAN prompts us before attempting to repair. In a scripting case, we may not want to do that; rather, we would want to just go ahead and repair it without an additional prompt. In such a case, just use repair failure noprompt at the RMAN prompt.

Several views have been added to Oracle 11g to support the Data Recovery Advisor. The following views are available:

V$IR_FAILURE - This view provides information on the failure. Note that records in this view can be hierarchal.
V$IR_FAILURE_SET - This view provides a list of the various advice records associated with the failure. we can use this view to join the V$IR_FAILURE to the V$IR_MANUAL_CHECKLIST view.
V$IR_ MANUAL_CHECKLIST - This view provides detailed informational messages related to the failure. These messages provide information on how to manually correct the problem.
V$IR_REPAIR - This view, when joined with V$IR_FAILURE and V$IR_FAILURE_SET, can be used to provide a pointer to the physical file created by Oracle that contains the repair steps required to correct a detected error.

Enjoy J J J

One Can Succeed at Almost Anything For Which He Has Enthusiasm...

Friday, May 6, 2011

Rman Data Recovery Advisor in Oracle 11g

No comments: