Saturday, April 20, 2013

Largest Databases of the World


Greetings everyone! I am back after a long break. I was busy with some other aspects of my life, and now I have decided to keep updating this blog to share my knowledge and experience with you all. It always gives me a great feeling when I share something with you all, and I learn a lot from you through your mails, comments and chats.

One day, I was thinking about some large-database scenarios when a question clicked in my mind: "What is the size of the largest database in this world, and what database software is it using?" I guessed it might be Google, then searched the net and found some interesting facts which I would like to share with you all. Let's get started, and just keep in mind that size alone does not determine how big a database is; it's the information it contains in the form of fields and records, and eventually it all depends on the technology being employed for storage and management.


No 1. The World Data Centre for Climate (WDCC) :
Operated by the Max Planck Institute for Meteorology and the German Climate Computing Centre, the World Data Centre for Climate is the largest database in the world, with 220 terabytes of data readily available on the internet. Add to that 110 terabytes of climate simulation data and 6 petabytes of data stored on magnetic tapes.

The WDCC is included in the CERA database system. Access to the CERA database is possible over the Internet using a Java-based browser. The CERA (Climate and Environmental Retrieving and Archiving) data archive is realised on an ORACLE database connected to an STK silo system. Thus large data sets may be stored under control of this system, while the metadata associated with CERA make it easy to locate the data that has to be retrieved.


No 2. National Energy Research Scientific Computing Center (NERSC) :
Based in Oakland, California, the National Energy Research Scientific Computing Center, or NERSC, is owned and operated by the Lawrence Berkeley National Laboratory and the U.S. Department of Energy. Included in its database of 2.8 petabytes is information on atomic energy research, high-energy physics experiments and simulations of the early universe.

The High Performance Storage System (HPSS) is a modern, flexible, performance-oriented mass storage system.  It has been used at NERSC for archival storage since 1998.

No 3.  AT&T :
Bigger than Sprint, AT&T boasts 1.9 trillion calling records which contribute to 323 terabytes worth of information. One factor behind the massiveness of its database is that AT&T has been maintaining databases from the time when the technology to store terabytes wasn't even available.

The Daytona® data management system is used by AT&T to solve a wide spectrum of data management problems. For example, Daytona is managing over 312 terabytes of data in a 7x24 production data warehouse whose largest table contains over 743 billion records as of Sept 2005. Indeed, for this database, Daytona is managing over 1.924 trillion records; it could easily manage more but we ran out of data. Update: as of June 2007, Daytona is managing over 2.8 trillion records in this same data warehouse, with over 938 billion records in the largest table.

AT&T is the sole source for the Daytona product, service and support and is the only company authorized to use the Daytona trademark for a database product.


No 4. Google  :
The list wouldn't be complete without Google. Subjected to around 91 million searches per day, Google is one of the largest databases in the world, with over 33 trillion database entries. Although the exact size of Google's database is unknown, it is said that Google records every single search made each day in its database, and it builds patterns from previous searches so that the user can be directed more easily. Google also collects information about its users and stores it as entries in its database. On top of that, Google has simply expanded its database with Gmail and Google Ads and with acquisitions like YouTube.

Bigtable is a distributed storage system (built by Google) for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving).
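As described in Google's Bigtable paper, a Bigtable is essentially a sparse, distributed, persistent, multi-dimensional sorted map, indexed by a row key, a column key and a timestamp:

(row:string, column:string, time:int64)  ->  string

Each value in the map is just an uninterpreted array of bytes; it is this simple, sorted-by-row-key model that lets the data be split into tablets and spread across thousands of servers.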


No 5. Sprint  :
The third largest wireless telecommunications network in the US has a database of over 55 million users. Sprint processes over 365 million call detail records per day. Making up its huge database are 2.85 trillion rows of information.


No 6. LexisNexis  :
LexisNexis is a company providing computer-assisted legal research services which bought ChoicePoint in 2008. ChoicePoint was in the business of acquiring information about the American population, including everything from phone numbers to criminal histories, and had over 250 terabytes of data on the American population by the time it was bought by LexisNexis.


No 7. YouTube  :
With over 60 hours of video uploaded per minute, yes, per minute, YouTube has a video database of around 45 terabytes. Reports say that about 100 million videos are watched on YouTube every day, which is about 60% of the overall number of videos watched online.


No 8. Amazon  :
Containing records of more than 60 million active users, Amazon also has more than 250,000 full-text books available online and allows users to comment and interact on virtually every page of the website. Overall, the Amazon database is over 42 terabytes in size.

Amazon SimpleDB is a highly available and flexible non-relational data store that offloads the work of database administration. Developers simply store and query data items via web services requests and Amazon SimpleDB does the rest.
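SimpleDB also exposes a SQL-like Select operation over a domain. A minimal sketch, with a purely hypothetical domain and attribute names (SimpleDB stores every value as a string, so comparisons are lexicographic and numbers are usually zero-padded):

select * from products where category = 'book' and price < '00019.99'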


No 9. Central Intelligence Agency  :
With its exact size understandably not made public, the CIA has comprehensive statistics on more than 250 countries of the world and also collects and distributes information on people, places and things. Although the database is not open to the public, portions of it are made available. The Freedom of Information Act (FOIA) Electronic Reading Room is one such example, where hundreds of items from the database are added monthly.

The ARC maintains an automated system containing information concerning each individual accession ("job"). The ARC database includes detailed information at the file-folder level for each accession retired after 1978, including the job number, box and file number, file title, level of security classification, inclusive dates, and disposition instructions, including the date when disposition action will be taken. Less detailed information is maintained for accessions retired before 1978.


No 10.  Library of Congress  :
With over 130 million items including 29 million books, photographs and maps, 10,000 new items added each day and nearly 530 miles of shelves, the Library of Congress is a wonder to behold in itself. The text portion of the library alone would take up 20 terabytes of space. If the internet isn't helping you in your research, head to the oldest federal cultural institution in the United States, in DC.

The Library of Congress offers a wide variety of online databases and Internet resources to the public via the Web, including its own online catalog. In addition, LC provides an easy-to-use gateway for searching other institutions' online catalogs and extensive links to resources on the Internet.

Source ::  http://globtech24x7.com/10-largest-databases-of-the-world/


Enjoy   :-) 



Tuesday, August 14, 2012

Manual Upgradation From Oracle 9i to 10g

Upgradation is the process of replacing our existing software with a newer version of the same product, for example, replacing an Oracle 9i release with an Oracle 10g release. Upgrading our applications usually does not require special tools; our existing reports should look and behave the same in both products, though minor changes may sometimes be seen in the product. Upgradation is done at the software level.

I received a mail from a reader regarding the upgradation of his database from 9i to 10g. Here I would like to advise that it is better to upgrade from 9i to 11g rather than from 9i to 10g, because Oracle extended support for 10gR2 ends on 31-Jul-2013 and there are more features available in Oracle 11g. We can upgrade directly to Oracle 11g if our current database is 9.2.0.4 or newer; direct upgrades to 11g are supported from versions 9.2.0.4, 10.1 and 10.2. Older releases have to step through an intermediate version first:
  • 7.3.3 -> 7.3.4 -> 9.2.0.8 -> 11.1
  • 8.0.5 -> 8.0.6 -> 9.2.0.8 -> 11.1
  • 8.1.7 -> 8.1.7.4 -> 9.2.0.8 -> 11.1
  • 9.0.1.3-> 9.0.1.4 -> 9.2.0.8 -> 11.1
  • 9.2.0.3 (or lower) -> 9.2.0.8 -> 11.1
An Oracle 11g client can access Oracle databases of versions 8i, 9i and 10g.

There are generally four methods to upgrade an Oracle database:
1.) Manual upgradation
2.) Upgradation using the DBUA (Database Upgrade Assistant)
3.) Export/Import
4.) Data copying

Let's have a look at manual upgradation.


Manual Upgradation :  A manual upgrade consists of running SQL scripts and utilities from a command line to upgrade a database to the new Oracle Database 10g release. While a manual upgrade gives us finer control over the upgrade process, it is more susceptible to error if any of the upgrade or pre-upgrade steps are either not followed or are performed out of order. Below are the steps:

1.) Install the Oracle 10g software : Invoke the installer (.exe on Windows or runInstaller on Unix) and select "Install software only" to install the Oracle software.

2.) Take a full database backup : Take a full backup of the database which is to be upgraded.
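A minimal sketch with RMAN (assuming the database runs in ARCHIVELOG mode; the backup destination comes from your RMAN configuration):

$ rman target /
RMAN> backup database plus archivelog;

A consistent cold backup of the datafiles, control files and redo logs after a clean shutdown works equally well.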

3.) Check the invalid objects : Check the invalid objects by running the utlrp.sql script as
SQL> @ORACLE_HOME/rdbms/admin/utlrp.sql
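It is also worth recording how many objects are invalid at this point, so the count can be compared after the upgrade in step 8:

SQL> select count(*) from dba_objects where status = 'INVALID';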

4.) Log in to the 9i home and run utlu102i.sql : This script is located in the Oracle 10g home.
SQL> spool  pre_upgrd.sql
SQL> @<ORACLE_10G_HOME>/rdbms/admin/utlu102i.sql
SQL> spool off

The above script checks a number of areas to make sure the instance is suitable for upgrade, including:
  • Database version
  • Log file sizes 
  • Tablespace sizes 
  • Server options
  • Initialization parameters (updated, deprecated and obsolete)
  • Database components
  • Miscellaneous Warnings 
  • SYSAUX tablespace present
  • Cluster information
The issues indicated by this script should be resolved before a manual upgrade is attempted. Once we have resolved the above warnings, re-run the script once more to cross-check.

5.) Check for the TIMESTAMP WITH TIME ZONE datatype : The time zone files that are supplied with Oracle Database 10g have been updated from version 1 to version 2 to reflect changes in transition rules for some time zone regions. The changes may affect existing data of the TIMESTAMP WITH TIME ZONE datatype. To preserve this TIMESTAMP data for updating according to the new time zone transition rules, we must run the utltzuv2.sql script on the database before upgrading. This script analyzes our database for TIMESTAMP WITH TIME ZONE columns that are affected by the updated time zone transition rules.
SQL> @ORACLE_10G_HOME/rdbms/admin/utltzuv2.sql
SQL> select * from sys.sys_tzuv2_temptab;

If the utltzuv2.sql script identifies columns with time zone data affected by a database upgrade, then back up the data in character format before we upgrade the database. After the upgrade, we must update the tables to ensure that the data is stored based on the new rules. If we export the tables before upgrading and import them after the upgrade, the conversion will happen automatically during the import.
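A minimal sketch with the original export utility (the schema and table name here are purely illustrative):

$ exp system/<password> tables=scott.bookings file=tz_tables.dmp

After the upgrade, importing the same dump file with imp re-inserts the rows, and the time zone conversion happens automatically during the import.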

6.) Shutdown the database : Shut down the database and copy the spfile (or pfile) and password file from the 9i home to the 10g home.
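On a Unix-style system this is something like the following ($OLD_HOME and $NEW_HOME are illustrative variables; the file names follow the usual $ORACLE_HOME/dbs conventions and differ slightly on Windows):

$ cp $OLD_HOME/dbs/spfile<sid>.ora $NEW_HOME/dbs/
$ cp $OLD_HOME/dbs/orapw<sid> $NEW_HOME/dbs/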

7.) Upgrade the database : Set the following environment variables for 10g and log in as the SYS user. The upgrade takes roughly half an hour to complete. Spool the output to a file so that you can review it afterward.
ORACLE_SID=<sid>
ORACLE_HOME=<10g home>
PATH=<10g path>
sqlplus / as sysdba

SQL> startup upgrade
SQL> spool upgrd_log.sql
SQL> @?/rdbms/admin/catupgrd.sql
SQL> spool off

8.) Recompile any invalid objects : Compare the number of invalid objects with the number noted in step 3; it should hopefully be the same or less.

SQL> @ORACLE_HOME/rdbms/admin/utlrp.sql

9.) Check the status of the upgrade :
SQL> @ORACLE_HOME/rdbms/admin/utlu102s.sql 
The above script queries the DBA_SERVER_REGISTRY to determine upgrade status and provides information about invalid or incorrect component upgrades. It also provides names of scripts to rerun to fix the errors.
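The component list can also be checked directly with a simple query:

SQL> select comp_name, version, status from dba_server_registry;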

10.) Edit the spfile : Create a pfile from the spfile as
SQL> create pfile from spfile;

Open the pfile and set the compatible parameter to 10.2.0.0.0 (see the sketch below). Then shut down the database and create the new, modified spfile.
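The edited entry in the pfile should look something like this:

*.compatible='10.2.0.0.0'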
SQL> shut immediate
SQL> create spfile from pfile;

11.) Start the database normally
SQL> startup
and finally configure Oracle Net and drop the old Oracle database software, i.e. 9i, using the OUI.

Reference :: http://docs.oracle.com/cd/B19306_01/server.102/b14238/upgrade.htm


Enjoy    :-)


Wednesday, July 11, 2012

Oracle Pro on OTN

Today I got the status of Oracle Pro on the OTN site. We always feel good when we achieve some award or appreciation for good work, and this time it was the same for me.

Working with Oracle is always exciting, challenging and great fun. I am passionate about Oracle and try to learn as much as I can.








Enjoy     :-)