Friday, May 6, 2011

Rman Data Recovery Advisor in Oracle 11g

The Data Recovery Advisor automatically diagnoses corruption or loss of persistent data on disk, determines the appropriate repair options, and executes repairs at the user's request. This reduces the complexity of recovery process, thereby reducing the Mean Time To Recover (MTTR). The advisor comes in two flavors: command line mode and as a screen in Oracle Enterprise Manager Database Control.

Below are the following command used in  Data  Recovery Advisor

1.     LIST FAILURE
2.     LIST FAILURE DETAILS
3.     ADVISE FAILURE
4.     REPAIR FAILURE

Before we can start identifying and repairing failures, we need to damage a datafile.In  this scenario, I have  shut my database and open one of the datafile(user01.dbf) with wordpad(os utility )  and edit two letter and save it , and then open the database and got following error message.

C:\>sqlplus sys/ramtech@noida as sysdba
SQL*Plus: Release 11.1.0.6.0 - Production on Fri May 6 14:11:23 2011
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
Connected to an idle instance.

SQL> startup
ORACLE instance started.
Total System Global Area  426852352 bytes
Fixed Size                  1333648 bytes
Variable Size             318768752 bytes
Database Buffers          100663296 bytes
Redo Buffers                6086656 bytes
Database mounted.
ORA-01157 : cannot identify/lock data file 4 - see DBWR trace file
ORA-01110 : data file 4: 'D:\ORACLE\ORADATA\NOIDA\USERS01.DBF'

Since the   error has occurred , so we want to find out what happened. So we connect to RMAN and check the failure.

C:\>rman  target  sys/ramtech@noida
Recovery Manager: Release 11.1.0.6.0 - Production on Fri May 6 14:16:06 2011
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
connected to target database: NOIDA (DBID=1503672566, not open)

LIST FAILURE   : If there is no error, this command will come back with the message: "no failures found that match specification " and if there is an error, a more explanatory message will follow:

RMAN> list failure;
using target database control file instead of recovery catalog
List of Database Failures
========================
Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
382        HIGH     OPEN      06-MAY-11     One or more non-system datafiles are corrupt

This message shows that some datafiles are corrupt . As the datafiles belong to a tablespace other than SYSTEM, the database stays up with that tablespace being offline. This error is fairly critical, so the priority is set to HIGH. Each failure gets a Failure ID, which makes it easier to identify and address individual failures. For instance we can issue the following command to get the details of Failure 382.

LIST FAILURE DETAILS :   This command will show us the exact cause of the error.This command will give the details about the failure id i.e, 382  

RMAN> list failure 382 detail;
List of Database Failures
=========================
Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
382        HIGH     OPEN      06-MAY-11     One or more non-system datafiles are corrupt
  Impact: See impact for individual child failures
  List of child failures for parent failure ID 382
  Failure ID Priority Status    Time Detected Summary
  ---------- -------- --------- ------------- -------
  385        HIGH     OPEN      06-MAY-11     Datafile 4: 'D:\ORACLE\ORADATA\NOIDA\USERS01.DBF' is corrupt
 Impact: Some objects in tablespace USERS might be unavailable

ADVISE FAILURE  :  It responds with a detailed explanation of the error and how to correct it:

RMAN> advise failure;
List of Database Failures
=========================
Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
382        HIGH     OPEN      06-MAY-11     One or more non-system datafiles are corrupt
  Impact: See impact for individual child failures
  List of child failures for parent failure ID 382
  Failure ID Priority Status    Time Detected Summary
  ---------- -------- --------- ------------- -------
  385        HIGH     OPEN      06-MAY-11     Datafile 4: 'D:\ORACLE\ORADATA\NOIDA\USERS01.DBF' is corrupt
    Impact: Some objects in tablespace USERS might be unavailable

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=155 device type=DISK
analyzing automatic repair options complete
Mandatory Manual Actions
========================
no manual actions available
Optional Manual Actions
=======================
no manual actions available
Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Restore and recover datafile 4
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: d:\oracle\diag\rdbms\noida\noida\hm\reco_1928090031.hm

This output has several important parts. First, the advisor analyzes the error. In this case, it's pretty obvious: the datafile is corrupt . Next, it suggests a strategy. In this case, this is fairly simple as well: restore and recover the file. The dynamic performance view V$IR_MANUAL_CHECKLIST also shows this information.
However, the most useful task Data Recovery Advisor does is shown in the very last line: it generates a script that can be used to repair the datafile or resolve the issue. The script does all the work; we don't have to write a single line of code.
Sometimes the advisor doesn't have all the information it needs. For instance, in this case, it does not know if someone moved the file to a different location or renamed it. In that case, it advises to move the file back to the original location and name (under Optional Manual Actions).
So the script is prepared for us  . I would verify what the script actually does first. So, I issue the following command to "preview" the actions the repair task will execute:

RMAN> repair failure preview ;

Strategy: The repair includes complete media recovery with no data loss
Repair script: d:\oracle\diag\rdbms\noida\noida\hm\reco_1928090031.hm
contents of repair script:
   # restore and recover datafile
   restore datafile 4;
   recover datafile 4;

This is good; the repair seems to be doing the same thing I would have done myself using RMAN. Now I can execute the actual repair by issuing:

REPAIR FAILURE  :  This command will execute the above script. After recovery the tablespace it is prompt for  opening  the database .

RMAN> repair failure;
Strategy: The repair includes complete media recovery with no data loss
Repair script: d:\oracle\diag\rdbms\noida\noida\hm\reco_1928090031.hm
contents of repair script:
   # restore and recover datafile
   restore datafile 4;
   recover datafile 4;

Do you really want to execute the above repair (enter YES or NO)? yes
executing repair script
Starting restore at 06-MAY-11
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to D:\ORACLE\ORADATA\NOIDA\USERS01.DBF
channel ORA_DISK_1: reading from backup piece D:\RMAN_BKP\03MBDHJ7_1_1
channel ORA_DISK_1: piece handle=D:\RMAN_BKP\03MBDHJ7_1_1 tag=TAG20110503T141047
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
Finished restore at 06-MAY-11
Starting recover at 06-MAY-11
using channel ORA_DISK_1
starting media recovery
archived log for thread 1 with sequence 20 is already on disk as file D:\ARCHIVE\NOIDA_20_1_749730106.ARC
archived log for thread 1 with sequence 1 is already on disk as file D:\ARCHIVE\NOIDA_1_1_750184743.ARC
archived log for thread 1 with sequence 2 is already on disk as file D:\ARCHIVE\NOIDA_2_1_750184743.ARC
archived log for thread 1 with sequence 3 is already on disk as file D:\ARCHIVE\NOIDA_3_1_750184743.ARC
archived log for thread 1 with sequence 4 is already on disk as file D:\ARCHIVE\NOIDA_4_1_750184743.ARC
archived log for thread 1 with sequence 5 is already on disk as file D:\ARCHIVE\NOIDA_5_1_750184743.ARC
archived log file name=D:\ARCHIVE\NOIDA_20_1_749730106.ARC thread=1 sequence=20
archived log file name=D:\ARCHIVE\NOIDA_1_1_750184743.ARC thread=1 sequence=1
archived log file name=D:\ARCHIVE\NOIDA_2_1_750184743.ARC thread=1 sequence=2
archived log file name=D:\ARCHIVE\NOIDA_3_1_750184743.ARC thread=1 sequence=3
media recovery complete, elapsed time: 00:00:13
Finished recover at 06-MAY-11
repair failure complete
Do you want to open the database (enter YES or NO)? Y
database opened

Note how RMAN prompts us before attempting to repair. In a scripting case, we may not want to do that; rather, we would want to just go ahead and repair it without an additional prompt. In such a case, just use repair failure noprompt at the RMAN prompt.

Several views have been added to Oracle 11g to support the Data Recovery Advisor. The following views are available:
  • V$IR_FAILURE  - This view provides information on the failure. Note that records in this view can be hierarchal.
  • V$IR_FAILURE_SET -  This view provides a list of the various advice records associated with the failure. we can use this view to join the V$IR_FAILURE to the V$IR_MANUAL_CHECKLIST view.
  • V$IR_ MANUAL_CHECKLIST - This view provides detailed informational messages related to the failure. These messages provide information on how to manually correct the problem.
  • V$IR_REPAIR - This view, when joined with V$IR_FAILURE and V$IR_FAILURE_SET, can be used to provide a pointer to the physical file created by Oracle that contains the repair steps required to correct a detected error.
Enjoy    J J J

Tuesday, May 3, 2011

Complete loss of all oracle datafiles, redologs and controlfiles (Disaster Recovery )

In this post, we will cover the disaster recovery situation where the oracle database server has been destroyed and all the oracle database files (datafiles,controlfiles,redologs) are lost . In such scenario, a  database can be recover, if we have  valid backup of the database and then it is possible to recover all the data up to the last full backup. Here, in  this testing environment, we take rman full backup and then delete the database(through dbca) and hence finally having only rman full backup. Let's have a look on the below steps : 

1.) Create Directory structure for datafile and for diagonistics files
2.) Create oracle services.
3.) Configure listener and tns (service)
4.) Restore spfile 
5.)  Restore controlfile
6.) Restore datafile
7.) Recover database and open the database.

1.) Create Directory structure :  Create all directories required for datafiles, (online and archived) logs, control files and backups. All directory paths should match those on the original server. Though,this is  not mandatory.If we do not know where the database files are located even though we can recover database ( i will come back later on ).

2.) Create oracle services : In case of window, we have to create an oracle service and password file in case of linux and unix  .
C:\>oradim  -new  -sid  noida  -intpwd  noida  -startmode  m
Instance created
For Password file 
$orapwd   file=filename  password=noida  entries=5 force=y   

3.) Configure listener and tns (service name) : Configure the listener through net mgr and reload the listener
LSNRCTL> reload
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1521)))
The command completed successfully
LSNRCTL> status
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for 32-bit Windows: Version 11.1.0.6.0 - Production
Start Date                03-MAY-2011 13:45:31
Uptime                    0 days 2 hr. 19 min. 34 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   D:\oracle\product\11.1.0\db_1\network\admin\listener.ora
Listener Log File         d:\oracle\diag\tnslsnr\xxxx\listener\alert\log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\EXTPROC1521ipc)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xxxx)(PORT=1521)))
Services Summary...
Service "noida" has 1 instance(s).
  Instance "noida", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
LSNRCTL> exit

Now  Configure  the  tns  through NETCA  and check it by using tnsping .
C:\>tnsping noida
TNS Ping Utility for 32-bit Windows: Version 11.1.0.6.0 - Production on 03-MAY-2011 16:05:12
Copyright (c) 1997, 2007, Oracle.  All rights reserved.
Used parameter files:
D:\oracle\product\11.1.0\db_1\network\admin\sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = xxxx)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = noida)))
OK (40 msec)

4.) Restore spfile : Here will restore spfile from the rman backup . But before that we have to set dbid and startup the database in nomount  stage with the dummy pfile .

C:\> rman target sys/noida@noida
Recovery Manager: Release 11.1.0.6.0 - Production on Tue May 3 16:05:23 2011
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
connected to target database (not started)
RMAN> set dbid=1503672566
executing command: SET DBID
RMAN> startup force nomount
startup failed: ORA-01078: failure in processing system parameters
LRM-00109: could not open parameter file 'D:\ORACLE\PRODUCT\11.1.0\DB_1\DATABASE\INITNOIDA.ORA'
starting Oracle instance without parameter file for retrieval of spfile
Oracle instance started
Total System Global Area     159019008 bytes
Fixed Size                     1331852 bytes
Variable Size                 67112308 bytes
Database Buffers              83886080 bytes
Redo Buffers                   6688768 bytes

RMAN> restore spfile from 'D:\rman_bkp\cf\C-1503672566-20110503-00';
Starting restore at 03-MAY-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=98 device type=DISK
channel ORA_DISK_1: restoring spfile from AUTOBACKUP D:\rman_bkp\cf\C-1503672566-20110503-00
channel ORA_DISK_1: SPFILE restore from AUTOBACKUP complete
Finished restore at 03-MAY-11
RMAN> exit
Recovery Manager complete.

5.) Restore control file :  Now we will shut  the database and startup database with spfile which we have restore in above step. After startup with spfile we connect to rman and restore the controlfile.

C:\>sqlplus sys/noida@noida as sysdba
SQL*Plus: Release 11.1.0.6.0 - Production on Tue May 3 16:07:04 2011
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> shut immediate
ORA-01507: database not mounted
ORACLE instance shut down.
SQL> exit

C:\>sqlplus sys/noida@noida as sysdba
SQL*Plus: Release 11.1.0.6.0 - Production on Tue May 3 16:07:25 2011
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
Connected.
SQL> startup nomount
ORACLE instance started.
Total System Global Area  426852352 bytes
Fixed Size                  1333648 bytes
Variable Size             369100400 bytes
Database Buffers           50331648 bytes
Redo Buffers                6086656 bytes
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

C:\>rman target sys/noida@noida
Recovery Manager: Release 11.1.0.6.0 - Production on Tue May 3 16:07:44 2011
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
connected to target database: NOIDA (not mounted)
RMAN> restore controlfile  from  'D:\rman_bkp\cf\C-1503672566-20110503-00';
Starting restore at 03-MAY-11
using channel ORA_DISK_1
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
output file name=D:\ORACLE\ORADATA\NOIDA\CONTROL01.CTL
output file name=D:\ORACLE\ORADATA\NOIDA\CONTROL02.CTL
output file name=D:\ORACLE\ORADATA\NOIDA\CONTROL03.CTL
Finished restore at 03-MAY-11
RMAN> alter database mount ; 
database mounted
released channel: ORA_DISK_1

6.) Restore datafile : As  we have restore the controlfile ,we restore all  the datafiles .

RMAN> alter database mount;
database mounted
released channel: ORA_DISK_1
RMAN> restore database;
Starting restore at 03-MAY-11
Starting implicit crosscheck backup at 03-MAY-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=153 device type=DISK
Crosschecked 3 objects
Finished implicit crosscheck backup at 03-MAY-11
Starting implicit crosscheck copy at 03-MAY-11
using channel ORA_DISK_1
Crosschecked 2 objects
Finished implicit crosscheck copy at 03-MAY-11
searching for all files in the recovery area
cataloging files...
no files cataloged
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to D:\ORACLE\ORADATA\NOIDA\SYSTEM01.DBF
channel ORA_DISK_1: restoring datafile 00002 to D:\ORACLE\ORADATA\NOIDA\SYSAUX01.DBF
channel ORA_DISK_1: restoring datafile 00003 to D:\ORACLE\ORADATA\NOIDA\UNDOTBS01.DBF
channel ORA_DISK_1: restoring datafile 00004 to D:\ORACLE\ORADATA\NOIDA\USERS01.DBF
channel ORA_DISK_1: restoring datafile 00005 to D:\ORACLE\ORADATA\NOIDA\EXAMPLE01.DBF
channel ORA_DISK_1: reading from backup piece D:\RMAN_BKP\03MBDHJ7_1_1
channel ORA_DISK_1: piece handle=D:\RMAN_BKP\03MBDHJ7_1_1 tag=TAG20110503T141047
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:01:35
Finished restore at 03-MAY-11

7.) Recover database and open the database :  While recovering the database , an error occur related to next log sequence , so find the log sequence and recover until last logseq and open the database in resetlogs mode.
RMAN> recover database ; 
Starting recover at 03-MAY-11
using channel ORA_DISK_1
starting media recovery
archived log for thread 1 with sequence 20 is already on disk as file D:\ARCHIVE\NOIDA_20_1_749730106.
archived log file name=D:\ARCHIVE\NOIDA_20_1_749730106.ARC thread=1 sequence=20
unable to find archived log
archived log thread=1 sequence=21
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 05/03/2011 16:38:03
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 21 and starting

RMAN> recover database until logseq 21 ;
Starting recover at 03-MAY-11
using channel ORA_DISK_1
starting media recovery
media recovery complete, elapsed time: 00:00:03
Finished recover at 03-MAY-11

RMAN> alter database open resetlogs ;
database opened


Hence, we restore the database successfully .



Enjoy     :-)


Monday, May 2, 2011

Important points to be noticed when diagnosing export/import issues

There are various points should be considered while performing or diagnosing the imp/exp operation . Here are some important point which we should while diagnosing the import/export issues.

1.) Check if we are using correct syntax or not . Use exp help=y or imp help=y to get the correct syntax.

2.) Check if we are using correct exp/imp utility (version) against the database. Use following metalink note for compatibility matrix .Note 132904.1 – Compatibility Matrix for Export And Import Between Different Oracle Versions .

3.) Check  DBA_REGISTRY  view to see if the components are in valid state .

4.) Try  setting  errorstack  if  we are facing  “ORA-”  error .

5.) Try  setting  trace=y  parameter  in  export/import. This  will  generate  a trace  file  ( similar to sql trace ) which can be used to diagnose the problems .
e.g;  exp  system/passwd  full=y  file=expdat.dmp  log=exp.log  trace=y 

Reference  :  http://askdba.org


Enjoy      :-)