Friday, August 5, 2011

Does the Optimizer need a hint?

Today I have gone through a very useful post regarding the use of the hints in Oracle.Though there are lots of articles on the internet but Oracle Corporation has not documented the vast majority of hints.Here i am providing the link of Jonathan Lewis blog who is one of the Master in the Oracle Tunning .

http://jonathanlewis.wordpress.com/2008/05/02/rules-for-hinting/

I’ve written a number of notes about hinting in fact, by using at the “Select Category” list to the right, I see that I have (so far) tagged 26 different articles (and this will be the 27th) with the hints tag. So I’ve decided it was time that I made clear my basic guidelines on safe hinting, as follows:

1.) Don’t
2.) If you must use hints, then assume you’ve used them incorrectly.
3.) On every patch or upgrade to Oracle, assume every piece of hinted SQL is going to do the wrong thing. Because of (2) above; you’ve been lucky so far, but the patch/upgrade lets you discover your mistake.
4.) Every time you apply some DDL to an object that appears in a piece of hinted SQL assume that the hinted SQL is going to do the wrong thing. Because of (2) above; you’ve been lucky so far, but the structural change lets you discover your mistake.

You will appreciate from these guidelines that I don’t really approve of using hints. The only reason that I leave them in place on a production system is when I’m sure that there is no alternative safe mechanism for making the optimizer do what I want. (And that does mean that I will use hints sometimes on a production system.)

What I use them for on test systems is to check whether a particular execution plan is actually possible, and to track down bugs in the optimizer.

Finally, for the purposes of education, I use them to demonstrate execution plans without first having to craft data sets and set database parameters to make a plan appear ‘spontaneously’.

Always be cautious about adding hints to production systems.

For More References click below hints  : 


Enjoy     :-) 

Thursday, August 4, 2011

Rman Retention Policy Based On Redundancy Policy


I found many of us uses the Rman Recovery window Retention Policy . I have not find the rman Redundancy policy used . Here we will discuss the Disadvantages of the Redundancy Policy.   

The REDUNDANCY parameter of the CONFIGURE RETENTION POLICY command specifies how many backups of each datafile and control file that RMAN should keep. In other words, if the number of backups for a specific datafile or control file exceeds the REDUNDANCY setting, then RMAN considers the extra backups as obsolete.

Suppose we have a RMAN retention policy of "REDUNDANCY 2". This means that as long as we have at least two backups of the same datafile, controlfile/spfile or archivelog the other older backups become obsolete and RMAN is allowed to safely remove them.

Now, let's also suppose that every night we backup our database using the following script:
SQL> CONFIGURE CONTROLFILE AUTOBACKUP ON;
SQL> rman {
                        backup database plus archivelog;
                        delete noprompt obsolete redundancy 2;
                     }

The backup task is quite simple    :    first of all it ensures that we have the controlfile autobackup feature on, then it backups the database and archives and, at the end, it deletes all obsolete backups using the REDUNDANCY 2 retention policy.
Using the above approach we might think that we can restore our database as it was two days ago, right? For example, if we have a backup taken on Monday and another one taken on Tuesday we may restore our database as it was within the (Monday_last_backup - Today) time interval. Well, that's wrong!

Consider the following scenario :
1.)  On Monday night we backup the database using the above script;
2.) On Tuesday, during the day, we drop a tablespace. Because this is a structural database change a controlfile autobackup will be triggered. Ieeei, we have a new controlfile backup.
3.) On Tuesday night we backup again the database... nothing unusual, right? 

Well, the tricky part is regarding the DELETE OBSOLETE command. When the backup script will run this command, RMAN finds out three controlfile backups: One is originating from the Monday backup, One is from the structural change and the third is from our just finished Tuesday backup database command. Now according to the retention policy of "REDUNDANCY 2", RMAN will assume that it is safe to delete the backup of the controlfile taken on Monday night backup because it's out of our retention policy and because this backup is the oldest one. Uuups... this means that we gonna have a big problem restoring the database as it was before our structural change because we don't have a controlfile backup from that time.

So, if we intend to incomplete recover our database to a previous time in the past it's really a good idea to switch to a retention policy based on a "RECOVERY WINDOW" instead. In our case a RECOVERY WINDOW OF 2 DAYS would be more appropriate.

Reference : Click Here


 
Enjoy   :-) 


Monday, August 1, 2011

Parallel Query Option in Oracle


The Parallel Query Option is a mechanism that allows a large query to be split up (transparently to the end-user) into a number of smaller queries that can all be run simultaneously.

The separate sections of the query are operated by processes that can run virtually independently of each other, so that under suitable conditions the performance of the base query can scale almost linearly with the requested degree of parallelism.

The Parallel Query Option (PQO) should not be confused with the Oracle Parallel Server (OPS) which allows multiple instances of Oracle (usually running on a number of separate machines) to address a single set of database files and uses a high-speed ‘distributed lock manager’ to synchronise the ownership of blocks and locks across the separate SGAs.  Parallel Query Option does work on Oracle Parallel Server, but for the rest of this article I shall assume that we are running only a single instance.

Parallel Query Option: the purpose  :  The Parallel Query Option is based on the premises that: 
  • Disc is very much slower than CPU
  • Seek time is much larger than read time
  • An inefficient job that ‘wastes’ CPU may give better overall response time than an efficient job that leaves the CPU idle most of the time.
As a simple example, this means that a very large tablescan can produce results more quickly than even a good indexed access path if:
  •  the table is split over many discs
  •  different discs can be accessed in parallel with minimal contention
  • there are multiple CPUs in use so that there is no contention for CPU (although PQO can help even on single CPU systems).

  For example : 
   SQL> select count(*) from BIGT  where COLX = 99;

The above statement would be split down internally into a number of separately executed queries of the form:

AQL> select /*+ rowid(A1) PIV_SSF */ count(*)
from “BIGT” A1
where ROWID between :1 and :2
and COLX = 99;

The predicate
ROWID between :1 and :2
combined with the hint:
/*+ rowid (alias) */
is the real key to the operation of the PQO.  The predicate and hint ensure that the query operates only on a selected set of contiguous blocks from the table

Since it is possible to restrict a query to a range of blocks, Oracle need only determine that different blocks can be found on different devices to be able to run multiple copies of the query in parallel in such a way that they don’t interfere with each other;  and this separation of I/O is how a near linear performance gain can be achieved by PQO.

Clearly, the PQO is of most benefit when planning fairly large database systems (since these are the ones likely to have a significant number of discs and CPUs).  As a very rough guideline, I probably wouldn’t ‘design for PQO’ on a database of less than 25 GB. 

Implementing PQO:
There are three main steps to using PQO.  First, make sure that it is linked into our Oracle server (this usually happens by default);  second, set up the init.ora to allow for parallel query slaves and to control the side-effects of parallel query slaves;  third change the definitions of tables, views, or application SQL statements to allow the optimiser to invoke parallelism.

Init.ora Parameters
To use PQO, we need to have a number of ‘floating’ processes available in the system that can be called by our normal shadow process (dedicated server). These processes, called the ‘parallel query slaves’  will have names like ora_p000_{SID}, and we can have up to 256 of them.

The parameter parallel_min_servers specifies the number of such processes that we want permanently running and waiting for calls.  Since such processes take up machine resources we may want to keep this number low for normal running:  if we set  parallel_max_servers  Oracle will then spawn further processes on demand up to this limit.  One odd feature of dynamically spawned processes, however, is that they are owned at the O/S level by the user running the session’s shadow process, and not by any special ‘Oracle id.

To ensure that the dynamically spawned processes die when they are no longer needed, we  can set
parallel_server_idle_time to the number of minutes  a process is allowed to be idle before it dies.

Once we have the capability of running parallel queries, there are two other parameters which we should consider.  These relate to the way the system performance will degrade as the load increases.

Consider the problem of a query that takes 3 minutes to complete when running with a degree of parallelism of 15.  What happens if the database is so busy that no parallel slaves are free when someone starts the query.  Is it acceptable for the query to run serially and take 45 minutes, or should it simply die with a warning message ?

Set parallel_min_percent to a number between 0 and 100, and if Oracle is unable to get that percentage of the demanded degree of parallelism  it returns an error (ORA-12827 insufficient parallel query slaves available).  For example, we want a tablescan at degree 6.  We have parallel_min_percent set to 50.  If the query can run at least at degree 3 with the current number of slaves available it will run, but it there are insufficient slaves available for running at degree 3 the error is raised.

Associated with this parameter is optimizer_percent_parallel which affects the assumptions that the optimiser makes about the degree of parallelism at which a query is expected to run. 

Imagine a query which works best at a high degree of parallelism (20 say) by using hash-joins, but gives an acceptable performance at a lower level of parallelism (5 to 15 say) by switching to nested loops.  If the optimiser decides that the default degree of the query is 20, then it will always generate the hashing path, even when the system is busy and there are not enough free slaves around to run the query at degree 20.  Setting this parameter to a number between 0 and 100, however, tells the optimiser to reduce the assumed degree of parallelism - setting it to 50 in our example would make the optimiser choose the path that was appropriate to degree 10, hence taking the nested loop path.

Just for the record, I have only found a few extreme examples where a change in the value of this parameter has affected the path, and even then the actual performance effect was not hugely significant.  It is worth noting that the option exists though, as there are likely to be some types of application where it can be used to flatten out response times.

Both these parameters can be set at the session level using:

    SQL> alter session set parm = nnn;

which is very convenient both for testing and in production running.

Invoking PQO
After setting our parameters and restarting our database, what makes it happen ?

The first possibility is to give a table an explicit (or implicit) degree of parallelism:

alter table PX parallel (degree 4);
alter table PX parallel (degree default)

Whenever a query involves this table, the optimiser will try to find a path that uses parallel techniques on the table. 
Two points that are specific to 7.3.2:
First, the Cost Based Optimiser will always take over in queries where there are parallel tables available even if the current optimiser goal is Rule.  Second, the default keyword translates into ‘the smaller of the number of CPUs and the number of devices across which the table appears to be spread’. 

The second possibility for introducing parallelism into a query is to put an explicit hint into it:

SQL> select /*+ parallel (px, 4)*/
count(*) from PX;

SQL> select /*+ parallel (px, default) */
count(*) from PX;

The third option is to define views which contain hints, so that application code does not have to worry about degrees of parallelism,

create or replace view big_px as
SQL> select
/*+
    parallel (px1, 4)
    parallel (px2, 6)
*/
from px1, px2
where
px1.id = px2.x_id ;

An interesting, and possibly irritating, feature of the view definition above is the question of what happens when there are several parallel tables in a query.

In general it is not possible to predict the degree of parallelism that such a query will use, as it depends on the usage, the driving table, and the join paths used.  In the view above I could only say that, if no other parallel tables are introduced, the final query will run with degree 1, 4 or 6.

How Many Slaves:
So far in this article I have repeatedly used the expression ‘degree of parallelism’ and avoided saying anything explicitly about the number of parallel slave processes involved in a parallel query.

It is now time to come clean, and admit that the degree and the number of slaves are not always the same.  Specifically, a query of degree N may take 2N slaves.  Basically this is to handle sorting. 

Consider a query like:

SQL> select /*+ parallel (fa,3) */
prod_id, count(*)
from fa
group by prod_id ;

When the PQO cuts in, the degree (3)  specifies the number of slaves to be used to scan the table.   By looking at the other and other_tag columns of the 7.3 Explain Plan utility, it is possible to determine that each slave will perform the required aggregation on each subset of the table that it is required to scan.  The partial results are then shared out to another set of 3 slaves using some form of hashing algorithm to achieve a complete and even share.  Each slave in this second set sorts and aggregates the results it has received before forwarding them to the normal shadow process (which is given the title ‘query co-ordinator’.  

There are four things we can infer :

1) If we fire off a query that involves sorting, then half the slaves will be I/O intensive as they read the disc, and the others will be CPU intensive in brief bursts and quiet the rest of the time.

2) There will be large demands on memory as every slave will require its own sort_area_size, as will the normal shadow.

3) There must be some memory used up in the ‘n-squared’ lines of communication going on between the two layers of slaves:  this is actually in the SGA, and a guideline I have seen suggests an addition to the SGA of:
 parallel_max_servers * (largest degree used + 2) * 3K

4) If we have a query that returns a lot of rows, then the ultimate bottleneck on response may be the link between the normal shadow and the front-end, which is often the only serialised part of the process.  This is a good reason for not using PQO through the multi-threaded server.

One interesting little point that is not obvious is that we cannot rely on a group by to do an implicit order by.  In the example, slave p000 may return results about prod_ids 1,4,7,11 whilst slave p001 returns results about prod_ids 2,5,8,12:  unless the query does an explicit order, the result set returned to SQL*Plus will be in the order 1,4,7,11,2,5,8,12. etc

Conclusion:
The parallel query option is in principle a very simple concept which in practice means reviewing the assumptions we may have made about optimising Oracle over the last few years since it relies very much on a huge amount of table-scanning.

The main benefit of the PQO is in large databases, where a significant number of disks is available for spreading I/O;  but behind this it also requires lots of memory and multiple CPUs.


Enjoy  :-)