/***************************************************************************/ /* Document : Quick Intro: Oracle 10g/11g RAC */ /* Doc. Versie : 3 */ /* File : ora10g11gRAC.txt */ /* Date : 11/10/2008 */ /* Content : Elementary Introduction on Real Application Clusters */ /* Compiled by : Albert van der Sel */ /***************************************************************************/ ----------------------------------------------------- REMARK: This is a very simple, and incomplete, document, providing a birds-eye view on Oracle RAC technology. ----------------------------------------------------- An "Oracle Real Application Cluster" (RAC), is about a clustered Oracle database. If a RAC is properly set up, all the nodes (Servers) are active at the same time, acting on the same one Database. This is very different from a failover Cluster. Let's first take on a discussion about Cluster systems in general. See Section 1. The example system used here is Linux, but the discussion about Cluster systems (Not targeted at RAC) here is "general" enough to be of use. As from section 2, we will discuss RAC. =============================================== 1. Discussion about Cluster systems in general: =============================================== 1.1 Cluster Overview (in general): ---------------------------------- To set up a cluster, an administrator must connect the cluster systems (often referred to as member systems) to the cluster hardware, and configure the systems into the cluster environment. The foundation of a cluster is an advanced host membership algorithm. This algorithm ensures that the cluster maintains complete data integrity at all times by using the following methods of inter-node communication: • Quorum partitions on shared disk storage to hold system status • Ethernet (and optional serial or other type of connections) between the cluster systems for heartbeat channels To make an application and data highly available in a cluster, the administrator must configure a "cluster service" — a discrete group of service properties and resources, such as an application and shared disk storage. A service can be assigned an IP address to provide transparent client access to the service. For example, an administrator can set up a cluster service that provides clients with access to highly-available database application data. Both cluster systems can run any service and access the service data on shared disk storage. However, each service can run on only one cluster system at a time, in order to maintain data integrity. Administrators can set up - an "active-active" configuration in which both cluster systems run different services, or - an "active-passive" (hot-standby) configuration in which a primary cluster system runs all the services, and a backupcluster system takes over only if the primary system fails. NOTE: So this is actually a difference from Oracle 10g Real Application Cluster (RAC), where both instances, or multiple instances (from 2 - 100), accesses the single database on shared storage, at the same time ! 
Scetch of a 2-node Linux cluster ------------------------------------------ public network | | | | ------------ ------------- |cluster | |cluster | |system |Ethernet |system | | |--------------------| | | |heartbeat | | | | | | | |____________ | | |ServiceA | ----- -|--- | | |ServiceB |--|PWR| |PWR|----|ServiceC | | | ----- ----- | | | | |_______________| | | | | | ------------ ------------- | SCSI bus or Fible Channel | ------------------ -------------- Interconnect | | | | Fig 1.1 ----------- |Shared | - has Quorum partitions (or disks) |Disk | - has partitions (or disks) for ServiceA, B, C |Storage | ----------- Figure 1–1, shows an example of a cluster in an active-active configuration. If a hardware or software failure occurs, the cluster will automatically restart the failed system’s services on the functional cluster system. This service failover capability ensures that no data is lost, and there is little disruption to users. When the failed system recovers, the cluster can re-balance the services across the two systems. In addition, a cluster administrator can cleanly stop the services running on a cluster system and then restart them on the other system. This service relocation capability enables the administrator to maintain application and data availability when a cluster system requires maintenance. -- Service configuration framework: Clusters enable an administrator to easily configure individual services to make data and applications highly available. To create a service, an administrator specifies the resources used in the service and properties for the service, including the service name, application start and stop script, disk partitions, mount points, and the cluster system on which an administrator prefers to run the service. After the administrator adds a service, the cluster enters the information into the cluster database on shared storage, where it can be accessed by both cluster systems. The cluster provides an easy-to-use framework for database applications. For example, a database service serves highly-available data to a database application. The application running on a cluster system provides network access to database client systems, such as Web servers. If the service fails over to another cluster system, the application can still access the shared database data. A network-accessible database service is usually assigned an IP address, which is failed over along with the service to maintain transparent access for clients. The cluster service framework can be easily extended to other applications, as well. -- Multiple cluster communication methods: To monitor the health of the other cluster system, each cluster system monitors the health of the remote power switch, if any, and issues heartbeat pings over network and serial channels to monitor the health of the other cluster system. In addition, each cluster system periodically writes a timestamp and cluster state information to two (or more) quorum partitions located on shared disk storage. System state information includes whether the system is an active cluster member. Service state information includes whether the service is running and which cluster system is running the service. Each cluster system checks to ensure that the other system’s status is up to date. To ensure correct cluster operation, if a system is unable to write to both quorum partitions at startup time, it will not be allowed to join the cluster. 
In addition, if a cluster system is not updating its timestamp, and if heartbeats to the system fail, the cluster system will be removed from the cluster. If a hardware or software failure occurs, the cluster will take the appropriate action to maintain application availability and data integrity. For example, if a cluster system completely fails, the other cluster system will restart its services. Services already running on this system are not disrupted. When the failed system reboots and is able to write to the quorum partitions, it can rejoin the cluster and run services. Depending on how the services are configured, the cluster can re-balance the services across the two cluster systems. -- Manual service relocation capability: In addition to automatic service failover, a cluster enables administrators to cleanly stop services on one cluster system and restart them on the other system. This allows administrators to perform planned maintenance on a cluster system, while providing application and data availability. -- Event logging facility: To ensure that problems are detected and resolved before they affect service availability, the cluster daemons log messages by using the conventional Linux syslog subsystem. Administrators can customize the severity level of the logged messages. -- Application Monitoring: The cluster services infrastructure can optionally monitor the state and health of an application. In this manner, should an application-specific failure occur, the cluster will automatically restart the application. In response to the application failure, the application will attempt to be restarted on the member it was initially running on; failing that, it will restart on the other cluster member. -- Status Monitoring Agent: A cluster status monitoring agent is used to gather vital cluster and application state information. This information is then accessible both locally on the cluster member as well as remotely. A graphical user interface can then display status information from multiple clusters in a manner which does not degrade system performance. 
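Just to make the event logging facility mentioned above a bit more concrete: on a typical Linux member system the cluster daemons end up in the normal syslog files, and the severity that gets recorded is governed by the ordinary syslog rules. This is only a generic sketch; the daemon names and file names below are examples, not specific to any particular cluster product:

  # Look for recent messages from the cluster daemons (names vary per product).
  grep -iE 'cluster|quorum|heartbeat' /var/log/messages | tail -20

  # Example /etc/syslog.conf line that sends daemon messages of severity
  # "info" and higher to a separate file (restart syslogd afterwards):
  #   daemon.info                                   /var/log/cluster.log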
1.2 Just an example of a more detailed view of an almost "No single point of failure" 2-Node Clustered System: -------------------------------------------------------------------------------------------------------------- ---------- |NETWORK | -------------------|SWITCH |----------------------- | ---------- | public network | | | --------------------- | --------------------- |network interface | ---------- |network interface | |-------------------- |terminal| |-------------------- |serial port |------|server |--------------|serial port | |-------------------- ---------- |-------------------- |CLUSTER | |CLUSTER | |SYSTEM | |SYSTEM | |-------------------- private network |-------------------- |network interface |------------------------------|network interface | |-------------------- |-------------------- |serial port |------------------------------|serial port | |-------------------- |-------------------- |serial port |-----------------\ | | |-------------------- ----- ----- |-------------------- |power plug |---|PWR| |PWR|----------|power plug | |-------------------- ----- ----- |-------------------- | | | |-------------------| | | -------------------------|serial port | |-------------------- |-------------------- |SCSI adapter (T) | |SCSI adapter (T) | --------------------- --------------------- | | | | ----------- ----------------------------------- | | | | (T) (T) ------------------------------------------------------- | Port A/in | Port B/in | | Port A/Out| Port B/Out | |------------------------------------------------------ | | | | | | | ------------------- -------------------- | | |controller 1 | |controller 2 | | | ------------------- -------------------- | | | | | | | | | RAID | ( ) ( ) | | | | | | ( ) ( ) | | | | mirrored shared disks | ------------------------------------------------------- ===================================================================================== 2. Overview of the architecture of a Single Oracle Instance compared to a RAC system: ===================================================================================== Let's first take a birds-eye overview of a single Instance architecture, compared to RAC architecture. 2.1 Single Instance: -------------------- If you look at a (traditional) Single Server where a single Oracle Instance (8i, 9i, 10g, 11g) is involved, you would see the following situation. >>>>> Files: You can find a number of database files, residing on a disksystem, amongst others are: . system.dbf: # this contains the dictionary (users, grants, table properties, packages etc..) . undo.dbf: # this contains "undo/rollback" information about all modifying SQL statements, and thus containing the "former situation" before transactions are committed to the DB. . redo logs: # in case of a crash, these write ahead logs can be used to redo committed transactions that which were not written to the datafiles yet, but were logged in the redo logs. . user defined # These are data files, organized in the logical concept of "tablespaces". data and index These tablespaces contain the tables (and indexes) tablespaces: Note: a tablespace consist of one or more files. To the Operating system, there are only files to be concerned of, but from the Database perspective, the DBA can create a logical entity called "tablespace", consisting of possibly multiple files, possibly distributed over multiple disks. 
Then, if the DBA then creates a table (or index), he or she should specify a tablespace, and thereby distributing the (future) tablecontent over multiple files, which might increase I/O performance. So, the DBA might create tablespaces with names like for example "DATA_BIG", "DATA_SMALL", "INDEX_BIG" etc.. >>>>> Memory structure and processes: The Instance gets created in memory when the DBA (or the system) "starts the database". Starting the database means that a number of processes gets active, and that a rather complex shared memory area gets created. This memory area is called the SGA (System Global Area) and contains some buffers and other pools, of which the following are most noticable: SGA contains: buffer cache : datablocks from disk, are cached in this buffer. Most of this cached data are blocks from tables and indexes. log buffer : small memory area which contains modified data which is about to be written to the redologs Shared pool : All used SQL queries and procedures are cached in this pool Library cache : The systems metadata is cached in this structure By the way, an Oracle Instance can be highly configured by a configuration file (traditionally that is the file "init.ora" which is an ascii file and can be edited to adjust values. The modern variant of "init.ora" is a binary file "spfile.ora".). Some of the parameters in that file, determine the sizes of the different caches and pools. For example, here is a section that determines the SGA of a small database: db_cache_size =268435456 java_pool_size = 67108864 shared_pool_size = 67108864 java_pool_size = 67108864 streams_pool_size = 67108864 So, "an instance" is ofcourse not synonym to the database files on disk, but is really the "stuff" that gets loaded or get created in memory. After an Oracle Database has started, a number of processes are running, among which the most notable are: pmon : process monitor smon : system monitor chkpt : checkpoint process dbwr : database writer process lgwr : the process that writes the redologs 2.2 Scetch of a 2-node RAC Architecture: ---------------------------------------- +: network (with example IP addresses in picture) user pc's/terminals -: Fiber, or other Storage connection [ ] [ ] [ ] + + + + + + (subnet 1: 192.168.1 ) Public Network + + + ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + + +192.168.1.1 (vip=192.168.1.100) 192.168.1.2 + (vip=192.168.1.101) + + Server Node: + + ================== Server Node: + |Server A | ================== |--> SGA, and |--------------- | |Server B | | processes like |Oracle | Fiber |--------------- | | pmon, smon etc.. |Instance A |----- --|Oracle |<-- |----------------| | | |Instance B | |Listener A | | | |----------------| |--------------- | | | |Listener B | If ASM is used as a shared |-local rdbms | | | |--------------- | storage for the database, | binaries etc.in| | | |-local rdbms | each node also has an | (ORACLE_HOME) | | (subnet 2: 192.168.2) | | binaries etc.in| ASM Instance. 
--|-CRS: Cluster | Private Network | (ORACLE_HOME) | | | ready services |++++++++++++++++++++++++++++++++++++++|-CRS: Cluster |--- | | (CRS_HOME) | 192.168.2.1 192.168.2.2 | ready services | | | ----------------- | (CRS_HOME) | | | |possibly vendor | | | |----------------- | | |Clusterware | | | |possibly vendor | | | ================== | Shared Disks: | |Clusterware | | | vendor | ---------------------- | ================== | | clusterware |--<>-- ( 1 Shared Database: )<>| | | is not needed | (system.dbf ) | | | but might be in | (temp.dbf ) | | | Extended | (users.dbf ) | | | (long distance) RAC | (data tablespaces ) | | | | (index tablespaces ) | | | | (etc.. ) | | | | (------------------- ) | | | |---<>--(private redolog(s) A) | | | |---<>--(private undo.dbf A) | | | (private redolog(s) B)<>| | | (private undo.dbf B)<>| | | (====================) | ----------------------<>--------( OCR:Oracle Cluster )--------<>-------------- ( Registry ) (- Voting Disk ) ---------------------- 2.3 Overview RAC: ----------------- RAC Architecture Overview Let's begin by reviewing the structure of a Real Applications Cluster. Physically, a RAC consists of several nodes (servers), connected to each other by a private interconnect, which most of the time will be a "private" Ethernet. The database files are kept on a shared storage subsystem, where they're accessible to all nodes. And each node has a public network connection. A cluster is a set of 2 or more machines (nodes) that share or coordinate resources to perform the same task. A RAC system is 2 or more instances running on a set of clustered nodes, with all instances accessing a shared set of database files (one Database). Depending on the O/S platform, a RAC database may be deployed on a cluster that uses vendor clusterware plus Oracle's own clusterware (Cluster Ready Services, CRS), or on a cluster that solely uses Oracle's own clusterware. Thus, every RAC sits on a cluster that is running Cluster Ready Services. srvctl is the primary tool DBAs use to configure CRS for their RAC database and processes. -- Cache Fushion: Each cluster database instance in an Oracle RAC cluster uses its own memory structures and background processes. Oracle RAC uses Cache Fusion to synchronize the data stored in the buffer cache of each cluster database instance. Cache Fusion moves current data blocks (which reside in memory) between database instances, rather than having one database instance write the data blocks to disk and requiring another database instance to reread the data blocks from disk. When a data block located in the buffer cache of one instance is required by another instance, Cache Fusion transfers the data block directly between the instances using the interconnect, enabling the Oracle RAC database to access and modify data as if the data resided in a single buffer cache. The CRS processes deal with cluster management, failover, services, OCR, Voting Disk etc.. But "the real cache fushion", that is Inter-instance block management, and lock management, that is dealt with by a number of specialized Oracle background processes. So, if an Instance reads a block from disk, then maybe a usersession will modify rows within that block. Another Instance may request the same block, because an other user session connected to this second Instance, wants to read (or modify) rows from that same block as well. Single Instance: > You probably know that a Single Instance Oracle database, consists of shared memory, where we can distinquish certain areas. 
This shared memory is called the SGA (System Global Area), which "contains" the buffer cache (disk blocks cached from disk), the shared pool (parsed SQL and PL/SQL code), and a number of other areas, like the large pool and the java pool.

> Next to this shared memory, a number of background processes make up the Instance, most notably pmon, smon, dbw0, lgwr and ckpt; a number of other processes run in a Single Instance as well.

If you have installed a RAC database, you have two or more Instances running on two or more Nodes. In this case, a number of additional structures and background processes deal with all the aspects of resource and lock management that arise when two or more instances want to access the same blocks. So, it is NOT the CRS processes that deal with this; here we are talking about processes belonging to the Oracle kernel itself. In RAC, because more than one instance is accessing the same resources, the instances require coordination at the resource level (database objects, blocks, locks). After all, in RAC the buffer cache of one Instance may contain data that is requested by an Instance on another node.

In former versions of RAC, namely Oracle Parallel Server (e.g. OPS in Oracle 8i), this management framework was called DLM (Distributed Lock Management). In RAC 10g and 11g, this framework is made up of the Global Cache Service (GCS), the Global Enqueue Service (GES), and the Global Resource Directory (GRD). The in-memory data structures distributed over all nodes (the GRD), along with the specialized RAC background processes, collaborate to enable "Cache Fusion".

In RAC Instances, we can expect, among others, to see the following additional background processes:

lms    Global Cache Service process
lmon   Global Enqueue Service Monitor
lmd    Global Enqueue Service daemon
lck0   Instance Enqueue process

When data blocks are requested by users attached to different instances, a lot of "tracking" and "synchronization" needs to be done by, or with the aid of, the GCS, GES and GRD.

GCS:
The Global Cache Service (GCS) is the main component of Oracle Cache Fusion technology. It is represented by the background processes LMSn; there can be at most 10 LMS processes per instance. The main function of GCS is to track the status and location of data blocks, where "status" means the mode and role of a data block. GCS is the main mechanism by which cache coherency among the multiple buffer caches is maintained, and it is also responsible for block transfers between the instances.

GES:
The Global Enqueue Service (GES) tracks the status of all Oracle enqueuing mechanisms. This involves all non-Cache Fusion inter-instance operations. GES performs concurrency control on dictionary cache locks, library cache locks and transactions. It performs this work for resources that are accessed by more than one instance. Enqueue services are also present in a single-instance database, where they are responsible for locking rows in a table using the different locking modes.

GRD:
GES and GCS together maintain the Global Resource Directory (GRD). The GRD is like an in-memory database which contains details about all the blocks that are present in the caches: the location of the latest version of a block, the mode of the block, the role of the block, and so on. Whenever a user asks for a data block, GCS gets the required information from the GRD. The GRD is a distributed resource, meaning that each instance maintains a part of it. This distributed nature of the GRD is key to the fault tolerance of RAC. The GRD is stored in the SGA of each Instance.
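If you want to actually see these extra background processes, and the instances they belong to, you can query the standard dynamic performance views from any RAC instance. The following is only a sketch (run as SYSDBA on one of the nodes); the exact list of LM* processes you get back can differ per version and configuration:

  sqlplus -s "/ as sysdba" <<'EOF'
  -- All instances that currently make up this RAC database.
  SELECT inst_id, instance_name, host_name, status
  FROM   gv$instance;

  -- RAC-related background processes on the local instance:
  -- LMSn (Global Cache Service), LMON/LMDn (Global Enqueue Service), LCK0.
  SELECT name, description
  FROM   v$bgprocess
  WHERE  paddr <> '00'
  AND    (name LIKE 'LMS%' OR name IN ('LMON','LMD0','LCK0'));
  EOF

At the operating system level the same processes are visible as ora_lms0_<SID>, ora_lmon_<SID>, and so on, in the output of ps -ef.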
-- Cluster Ready Services and the OCR

Cluster Ready Services, or CRS, is a new feature for 10g RAC. Essentially, it is Oracle's own clusterware. On most platforms, Oracle supports vendor clusterware; in these cases, CRS interoperates with the vendor clusterware, providing high availability support and service and workload management. On Linux and Windows clusters, CRS serves as the sole clusterware. In all cases, CRS provides a standard cluster interface that is consistent across all platforms.

CRS consists of four processes (crsd, ocssd, evmd, and evmlogger) and two disks (partitions): the Oracle Cluster Registry (OCR), and the voting disk. The CRSD manages the HA functionality by starting, stopping, and failing over the application resources, and by maintaining the profiles and current states in the Oracle Cluster Registry (OCR), whereas the OCSSD manages the participating nodes in the cluster by using the voting disk. The OCSSD also protects against the data corruption potentially caused by "split brain" syndrome by forcing a machine to reboot.

-- CRS Processes

About those processes, we can show how they run, and how they are started, on a unix system. CRS consists of four processes (on most platforms: oprocd, crsd, ocssd, evmd) and two disks: the Oracle Cluster Registry (OCR), and the voting disk. On most platforms, you may see the following processes:

oprocd  the Process Monitor Daemon
crsd    Cluster Ready Services Daemon (CRSD)
ocssd   Oracle Cluster Synchronization Services Daemon (OCSSD)
evmd    Event Manager Daemon (EVMD)

Oracle CRS is Oracle's own clusterware, tightly coupled with Oracle Real Application Clusters (RAC). CRS must be installed prior to the installation of Oracle RAC. It can also work on top of third-party clustering software, but there is no longer a requirement to buy and deploy such software. In short, Oracle CRS is primarily responsible for managing the high-availability (HA) architecture of Oracle RAC with the help of the Cluster Ready Services Daemon (CRSD), the Oracle Cluster Synchronization Services Daemon (OCSSD) and the Event Manager Daemon (EVMD), as described above.

Although Oracle CRS replaces the Oracle Cluster Manager (ORACM) of Oracle9i RAC, it continues to support the Global Services Daemon (GSD), which in Oracle9i is responsible for communicating with the Oracle RAC database. In Oracle 10g, GSD's sole purpose is to serve Oracle9i clients (such as SRVCTL, Database Configuration Assistant, and Oracle Enterprise Manager). Financially, this is a very positive benefit since one is not bound to buy new client licenses and hardware to support an Oracle 10g database.

To start and stop CRS when the machine starts or shuts down, on unix there are rc scripts in place. You can also, as root, manually start, stop, enable or disable the services with:

/etc/init.d/init.crs start
/etc/init.d/init.crs stop
/etc/init.d/init.crs enable
/etc/init.d/init.crs disable

Or with

# crsctl start crs
# crsctl stop crs
# crsctl enable crs
# crsctl disable crs

On a unix system, you may find the following in the /etc/inittab file.
# cat /etc/inittab | grep crs
h3:35:respawn:/etc/init.d/init.crsd run > /dev/null 2>&1 < /dev/null

# cd /etc/init.d
# ls -al *init*
init.crs
init.crsd
init.cssd
init.evmd

# cat /etc/inittab
..
..
h1:35:respawn:/etc/init.d/init.evmd run > /dev/null 2>&1 < /dev/null
h2:35:respawn:/etc/init.d/init.cssd fatal > /dev/null 2>&1 < /dev/null
h3:35:respawn:/etc/init.d/init.crsd run > /dev/null 2>&1 < /dev/null

>> The log files for the CRS daemon, crsd, can be found in the following directory:
CRS_home/log/hostname/crsd/

>> The log files for the CSS daemon, cssd, can be found in the following directory:
CRS_home/log/hostname/cssd/

>> The log files for the EVM daemon, evmd, can be found in the following directory:
CRS_home/log/hostname/evmd/

>> The log files for the Oracle Cluster Registry (OCR) can be found in the following directory:
CRS_home/log/hostname/client/

>> The log files for the Oracle RAC high availability component can be found in the following directories:
CRS_home/log/hostname/racg/
$ORACLE_HOME/log/hostname/racg

-- Enabling Debugging for an Oracle Clusterware Resource

You can use crsctl commands to enable resource debugging using the following syntax, where resource_name is the name of an Oracle Clusterware resource, such as ora.docrac1.vip, and debugging_level is a number from 1 to 5:

# crsctl debug log res resource_name:debugging_level

-- Running the Oracle Clusterware Diagnostics Collection Script

Run the diagcollection.pl script as the root user to collect diagnostic information from an Oracle Clusterware installation. The diagnostics provide additional information so that Oracle Support Services can resolve problems. Run this script from the operating system prompt as follows, where CRS_home is the home directory of your Oracle Clusterware installation:

# CRS_home/bin/diagcollection.pl --collect

To check the health of the daemons themselves, you can run "crsctl check crs", which displays the status of the Cluster Synchronization Services (CSS), Event Manager (EVM), and Cluster Ready Services (CRS) daemons.

-- CRS manages the following resources:

. The ASM instances on each node (for an explanation of ASM, see section 5)
. Databases
. The instances on each node
. Oracle Services on each node
. The cluster nodes themselves, including the following processes, or "nodeapps":
  . VIP
  . GSD
  . The listener
  . The ONS daemon

CRS stores information about these resources in the OCR. If the information in the OCR for one of these resources becomes damaged or inconsistent, then CRS is no longer able to manage that resource. Fortunately, the OCR automatically backs itself up regularly and frequently.

10g RAC (10.2) uses, or depends on:

- Oracle Clusterware (10.2), formerly referred to as CRS "Cluster Ready Services" (10.1).
- Optionally, Oracle's Cluster File System OCFS; alternatively ASM or raw devices can be used.
- Oracle Database extensions

RAC is "scale out" technology: just add commodity nodes to the system. The key component is "cache fusion". Data is transferred from one node to another via very fast interconnects. Essential to 10g RAC is this "shared cache" technology.

The Automatic Workload Repository (AWR) plays a role as well. The Fast Application Notification (FAN) mechanism that is part of RAC publishes events that describe the current service level being provided by each instance to AWR. The load balancing advisory information is then used to determine the best instance to serve a new request.

. With RAC, ALL Instances on ALL nodes in a cluster access a SINGLE database.
. But every instance has its own UNDO tablespace and its own REDO logs (see the short sketch just below).
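To verify those last two points on a running RAC database, you can check which redo thread and which undo tablespace each instance uses. A minimal sketch, run from any of the instances as SYSDBA; the views are standard, the output of course depends on your own configuration:

  sqlplus -s "/ as sysdba" <<'EOF'
  -- Redo: one thread of redo log groups per instance; all groups are
  -- visible from any node because they are recorded in the controlfile.
  SELECT thread#, group#, bytes/1024/1024 AS mb, status
  FROM   v$log
  ORDER  BY thread#, group#;

  -- Undo: every instance points to its own undo tablespace.
  SELECT inst_id, value AS undo_tablespace
  FROM   gv$parameter
  WHERE  name = 'undo_tablespace';
  EOF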
The Oracle Clusterware comprises several background processes that facilitate cluster operations. The Cluster Synchronization Service (CSS), Event Management (EVM), and Oracle Cluster components communicate with the corresponding cluster component layers in the other instances within the same cluster database environment.

Per implementation, questions arise on the following points:

. Storage
. Computer Systems / Storage-Interconnect
. Database
. Application Server
. Public and Private networks
. Application Control & Display

On the Storage level, it can be said that 10g RAC supports

- Automatic Storage Management (ASM)
- Oracle Cluster File System (OCFS)
- Network File System (NFS) - limited (only theoretical actually, except for 11g)
- Disk raw partitions
- Third party cluster file systems, like GPFS

For application control and tools, it can be said that 10g RAC supports

- OEM Grid Control     http://hostname:5500/em
  OEM Database Control http://hostname:1158/em
- "srvctl", a command line interface to manage the cluster configuration, for example, starting and stopping all nodes in one command.
- The Cluster Verification Utility (cluvfy), which can be used for installation and sanity checks.

Failure in Client connections:
Depending on the Net configuration, type of connection, type of transaction etc., Oracle Net Services provides a feature called "Transparent Application Failover" which can fail over a client session to a backup connection.

About HA and DR:
- RAC is HA, High Availability: it keeps things up and running within one site.
- Data Guard is DR, Disaster Recovery: it is able to mirror one site to another, remote, site.

2.4 Storage with RAC:
---------------------

We have the following Database storage options:

Raw              Raw devices, no filesystem present
ASM              Automatic Storage Management
Third party CFS  Vendor's Cluster File System
OCFS             Oracle Cluster File System
LVM              Logical Volume Manager
NFS              Network File System (must be on a certified NAS device)

Storage                              Oracle Clusterware          Database   Recovery area
                                     (OCR and Voting Disk)
-----------------------------------  --------------------------  ---------  -------------
Automatic Storage Management         No                          Yes        Yes
Cluster file system (OCFS or other)  Yes                         Yes        Yes
Shared raw storage                   Yes                         Yes        No

Here is a description of the file types involved. A regular single-instance database has three basic types of files:

1. database software and dump files (alert log, trace files and so on);
2. datafiles, spfile, control files and log files, often referred to as "database files";
3. and it may have recovery files, if using RMAN.

and, in case of RAC:

4. A RAC database has an additional type of file referred to as "CRS files". These consist of the Oracle Cluster Registry (OCR) and the voting disk.

Not all of these files have to be on the shared storage subsystem. The database files and CRS files must be accessible to all instances, so these *must be* on the shared storage subsystem. The database software can be on the shared subsystem and shared between nodes; or each node can have its own ORACLE_HOME. The flash recovery area must be shared by all instances, if used.

Some storage options can't handle all of these file types. To take an obvious example, the database software and dump files can't be stored on raw devices. This isn't important for the dump files, but it does mean that choosing raw devices precludes having a shared ORACLE_HOME on the shared storage device.

Remarks:

1. On a particular platform, there might exist a vendor specific solution for shared storage. For example, on AIX it is usually IBM GPFS that is used as a shared file system. But for this platform you might also use SFRAC of Veritas.
VERITAS Storage Foundation for Oracle Real Application Clusters (SFRAC) provides an integrated solution stack for using clustered filesystems with Oracle RAC on AIX, as an alternative to using raw logical volumes, Automatic Storage Management (ASM) or the AIX General Parallel Filesystem (GPFS). If your OS is Linux Redhat, then investigate your options with the Redhat Global FileSystem GFS. 2. SAN solutions: And as far as SAN, there's no inherent SAN protocol that allows for block-level locking between hosts. Your clustered filesystem is responsible for providing that. ======================================================== 3. Oracle 10g RAC Installation example on Redhat Linux: ======================================================== This section shows how to install 10g RAC on a couple of Linux machines. But the method used, represents an installation on any platform. For the most part, on all platforms the installation is the same. 3.1 Prepare your nodes: ----------------------- 3.1.1 Scetch of a 2-node Linux cluster 192.168.2.0 ---------------------------------------------- public network | | Server A | Server B | ------------ ------------- |InstanceA |Private network |InstanceB | | |Ethernet (interconnect)| | | |-----------------------| | | |192.168.1.0 | | | | | | | |____________ | | | | ----- -|--- | | | |--|PWR| |PWR|-------| | | | ----- ----- | | | | |__________________| | | | | | ------------ ------------- | SCSI bus or Fible Channel | ------------------ --------------- | | | | ----------- |Shared | - has Single DB on ASM, or OCFS (or other Cluster FS), or RAW |Disk | - has OCR and Voting disk on OCFS (or other Cluster FS), or RAW |Storage | ----------- 3.1.2 Storage Options Storage Oracle Clusterware Database Recovery area -------------- ------------------ -------- ------------- Automatic Storage Management No Yes Yes Cluster file system (OCFS) Yes Yes Yes Shared raw storage Yes Yes No In the following, we will do an example installation on 3 nodes. 3.1.3 Install Redhat on all nodes with all options. 3.1.4 create oracle user and groups dba, oinstall on all nodes. Make sure they all have the same UID and GUI. 3.1.5 Make sure the user oracle has an appropriate .profile or .bash_profile 3.1.6 Every node needs a private network connection and a public network connection (at least two networkcards). 3.1.7 Linux kernel parameters: Most out of the box kernel parameters (of RHELS 3,4,5) are set correctly for Oracle except a few. You should have the following minimal configuration: net.ipv4.ip_local_port_range 1024 65000 kernel.sem 250 32000 100 128 kernel.shmmni 4096 kernel.shmall 2097152 kernel.shmmax 2147483648 fs.file-max 65536 You can check the most important parameters using the following command: # /sbin/sysctl -a | egrep 'sem|shm|file-max|ip_local' net.ipv4.ip_local_port_range = 1024 65000 kernel.sem = 250 32000 100 128 kernel.shmmni = 4096 kernel.shmall = 2097152 kernel.shmmax = 2147483648 fs.file-max = 65536 If some value should be changed, you can change the "/etc/sysctl.conf" file and run the "/sbin/sysctl -p" command to change the value immediately. Every time the system boots, the init program runs the /etc/rc.d/rc.sysinit script. This script contains a command to execute sysctl using /etc/sysctl.conf to dictate the values passed to the kernel. Any values added to /etc/sysctl.conf will take effect each time the system boots. 3.1.8 make sure ssh and scp are working on all nodes without asking for a password. Use shh-keygen to arrange that. 
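A minimal sketch of step 3.1.8, assuming the example node names oc1, oc2 and oc3 used further below, and the oracle user on every node (the key type and file names are simply the ssh defaults):

  # As oracle, on EVERY node: generate a key pair without a passphrase.
  ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

  # Then, from one node (here oc1), gather all public keys into one
  # authorized_keys file and distribute it to the other nodes.
  cat ~/.ssh/id_rsa.pub         >> ~/.ssh/authorized_keys
  ssh oc2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  ssh oc3 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  scp ~/.ssh/authorized_keys oc2:~/.ssh/authorized_keys
  scp ~/.ssh/authorized_keys oc3:~/.ssh/authorized_keys
  chmod 600 ~/.ssh/authorized_keys

  # Test from each node; these should return a date without prompting:
  ssh oc2 date
  ssh oc3 date

Also test the private hostnames (poc1, poc2, poc3) and each node's own name, because the installer will use ssh and scp over those names as well.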
3.1.9 Example "/etc/host" on the nodes: Suppose you have the following 3 hosts, with their associated public and private names: public private oc1 poc1 oc2 poc2 oc3 poc3 Then this could be a valid "/etc/hosts" file on the nodes: 127.0.0.1 localhost.localdomain localhost 192.168.2.99 rhes30 192.168.2.166 oltp 192.168.2.167 mw 192.168.2.101 oc1 #public1 192.168.2.179 voc1 #virtual1 192.168.1.101 poc1 #private1 192.168.2.102 oc2 #public2 192.168.2.177 voc2 #virtual2 192.168.1.102 poc2 #private2 192.168.2.103 oc3 #public3 192.168.2.178 voc3 #virtual3 192.168.1.103 poc3 #private3 3.1.10 Example disk devices On all nodes, the shared disk devices should be accessible through the same devices names. Raw Device Name Physical Device Name Purpose /dev/raw/raw1 /dev/sda1 ASM Disk 1: +DATA1 /dev/raw/raw2 /dev/sdb1 ASM Disk 1: +DATA1 /dev/raw/raw3 /dev/sdc1 ASM Disk 2: +RECOV1 /dev/raw/raw4 /dev/sdd1 ASM Disk 2: +RECOV1 /dev/raw/raw5 /dev/sde1 OCR Disk (on RAW device) /dev/raw/raw6 /dev/sdf1 Voting Disk (on RAW device) So as you can see, we use a combination of ASM (for database files and recovery area), and RAW (for the OCR and Voting Disk). 3.2 CRS installation: --------------------- 3.2.1 First install CRS in its own home directory First install CRS in its own home directory, e.g. CRS10gHome, apart from the Oracle home dir. In fact, you NEED to install CRS first, before installing any Oacle RDBMS software. The special "thing" here, is that you perform the installation from one node, and that the setup program will in fact install CRS on all three nodes. Ofcourse, given that scp and shh works OK on all nodes, and that the accounts are all the same. But, you still need to run a few scripts on all individual nodes. As Oracle user: ./runInstaller --------------------------------------------------- | | Screen 1 |Specify File LOcations | | | |Source | |Path: /install/crs10g/Disk1/stage/products.xml | | | |Destination | |Name: CRS10gHome | |Path: /u01/app/oracle/product/10.1.0/CRS10gHome | | | --------------------------------------------------- --------------------------------------------------- | | Screen 2 |Cluster Configuration | | | |Cluster Name: lec1 | | | | Public Node Name Private Node Name | | --------------------------------------------- | | |oc1 | p0c1 | | | |-------------------------------------------- | | |oc2 | p0c2 | | | |-------------------------------------------- | | |oc3 | poc3 | | | |-------------------------------------------- | --------------------------------------------------- In the next screen, you specify which of your networks is to be used as the public interface (to connect to the public network) and which will be used for the private interconnect to support cache fushion and the cluster heartbeat. --------------------------------------------------- | | Screen 3 |Private Interconnect Enforcement | | | | | | | | Interface Name Subnet Interface type | | --------------------------------------------- | | |eth0 |192.168.2.0 |Public | | | |-------------------------------------------- | | |eth1 |192.168.1.0 |Private | | | |-------------------------------------------- | | | --------------------------------------------------- In the next screen, you specify /dev/raw/raw5 as the raw disk for the Oracle Cluster Registry. 
--------------------------------------------------- | | Screen 4 |Oracle Cluster Registry | | | |Specify OCR Location: /dev/raw/raw5 | (you are able to specify one extra mirror location of the OCR) | | --------------------------------------------------- In a similar fashion you specify the location of the Voting Disk. --------------------------------------------------- | | Screen 5 |Voting Disk | | | |Specify Voting Disk: /dev/raw/raw6 | (you are able to specify two extra locations | | for copies of the VD) --------------------------------------------------- You now have to execute the /u01/app/oracle/orainventory/orainstRoot.sh script on all Cluster Nodes as the root user. After this, you can continue with the other window, and see an "Install Summary" screen. No you click "Install" and the installation begins. Apart from the node you work on, the software will also be copied to the other nodes as well. After the installation is complete, you are once again prompted to run a script as root on each node of the Cluster. This is the script "/u01/app/oracle/product/10.1.0/CRS10gHome/root.sh". -- The olsnodes command. After finishing the CSR installation, you can verify that the installation completed successfully by running on any node the following command: # cd /u01/app/oracle/product/10.1.0/CRS10gHome/bin # olsnodes -n oc1 1 oc2 2 oc3 3 3.3 Database software installation: ----------------------------------- You can install the database software into the same directory in each node. With OCFS2, you might do one install in a common shared directory for all nodes. Because CSR is already running, the OUI detects that, and because its cluster aware, it provides you with the options to install a clustered implementation. You start the installation by running ./runInstaller as the oracle user on one node. For most part, it looks the same as a single-instance installation. After the file location screen, that is source and destination, you will see this screen: --------------------------------------------------- | | |Specify Hardware Cluster Installation Mode | | | | o Cluster installation mode | | | | Node name | | --------------------------------------------- | | | [] oc1 | | | | [] oc2 | | | | [] oc3 | | | --------------------------------------------- | | | | o Local installation (non cluster) | | | |-------------------------------------------------| Most of the time, you will do a "software only" installation, and create the database later with the DBCA. For the first node only, after some time, the Virtual IP Configuration Assistant, VIPCA, will start. Here you can configure the Virtual IP adresses you will use for application failover and the Enterprise Manager Agent. Here you will select the Virtual IP's for all nodes. VIPCA only needs to run once per Cluster. 3.4 Creating the RAC database with DBCA: ---------------------------------------- Launching the DBCA for installing a RAC database is much the same as launching DBCA for a single instance. If DBCA detects cluster software installed, it gives you the option to install a RAC database or a single instance. 
as oracle user: % dbca & --------------------------------------------------- | | |Welcome to the database configuration assistant | | | | | | | | o Oracle Real Application Cluster database | | | | o Oracle single instance database | | | |-------------------------------------------------| After selecting RAC, the next screen gives you the option to select nodes: --------------------------------------------------- | | |Select the nodes on which you want to create | |the cluster database. The local node oc1 will | |always be used whether or not it is selected. | | | | Node name | | --------------------------------------------- | | | [] oc1 | | | | [] oc2 | | | | [] oc3 | | | --------------------------------------------- | | | | | |-------------------------------------------------| In the next screens, you can choose the type of database (oltp, dw etc..), and all other items, just like a single instance install. At a cetain point, you can choose to use ASM diskgroups or RAW etc.., choose a flash-recovery area etc.. The way you install the database really resembles a normal single instance install, so we won't discuss that here. ============================== 5. ASM and RAC in Oracle 10g: ============================== A number of notes will explain ASM, and the integration of ASM into a RAC system. ======== Note 1: ======== Automatic Storage Management (ASM) in Oracle Database 10g With ASM, Automatic Storage Management, there is a separate lightweight 10g database involved. This ASM database (+ASM), contains all metadata about the ASM system. It also acts as the interface between the regular database and the filesystems. ASM will provide for presentation and implementation of a special filesystem, on which a number of redundancy/availability and performance features are implemented. In addition to the normal database background processes like CKPT, DBWR, LGWR, SMON, and PMON, an ASM instance uses at least two additional background processes to manage data storage operations. The Rebalancer process, RBAL, coordinates the rebalance activity for ASM disk groups, and the Actual ReBalance processes, ARBn, handle the actual rebalance of data extent movements. There are usually several ARB background processes (ARB0, ARB1, and so forth). Every database instance that uses ASM for file storage, will also need the two new processes. The Rebalancer background process (RBAL) handles global opens of all ASM disks in the ASM Disk Groups, while the ASM Bridge process (ASMB) connects as a foreground process into the ASM instance when the regular database instance starts. ASMB facilitates communication between the ASM instance and the regular database, including handling physical file changes like data file creation and deletion. ASMB exchanges messages between both servers for statistics update and instance health validation. These two processes are automatically started by the database instance when a new Oracle file type - for example, a tablespace's datafile -- is created on an ASM disk group. When an ASM instance mounts a disk group, it registers the disk group and connect string with Group Services. The database instance knows the name of the disk group, and can therefore use it to locate connect information for the correct ASM instance. ======== Note 2: ======== Some terminology in RAC: CRS cluster ready services - Clusterware: For Oracle10g on Linux and Windows-based platforms, CRS co-exists with, but does not inter-operate, with vendor clusterware. 
You may use vendor clusterware for all UNIX-based operating systems except for Linux. Even though, many of the Unix platforms have their own clusterware products, you *need to use* the CRS software to provide the RAC HA support services. CRS (cluster ready services) supports services and workload management and helps to maintain the continuous availability of the services. CRS also manages resources such as virtual IP (VIP) address for the node and the global services daemon. Note that the "Voting disks" and the "Oracle Cluster Registry", are regarded as part of the CRS. OCR: The Oracle Cluster Registry (OCR) contains cluster and database configuration information for Real Application Clusters Cluster Ready Services (CRS), including the list of nodes in the cluster database, the CRS application, resource profiles, and the authorizations for the Event Manager (EVM). The OCR can reside in a file on a cluster file system or on a shared raw device. When you install Real Application Clusters, you specify the location of the OCR. OCFS (not used often): OCFS is a shared disk cluster filesystem. Version 1 released for Linux is specifically designed to alleviate the need for manag-ing raw devices. It can contain all the oracle datafiles, archive log files and controlfiles. It is however not designed as a general purpose filesystem. OCFS2 is the next generation of the Oracle Cluster File System for Linux. It is an extent based, POSIX compliant file system. Unlike the previous release (OCFS), OCFS2 is a general-purpose file system that can be used for shared Oracle home installations making management of Oracle Real Application Cluster (RAC) installations even easier. Among the new features and benefits are: Node and architecture local files using Context Dependent Symbolic Links (CDSL) Network based pluggable DLM Improved journaling / node recovery using the Linux Kernel "JBD" subsystem Improved performance of meta-data operations (space allocation, locking, etc). Improved data caching / locking (for files such as oracle binaries, libraries, etc) - OCFS1 does NOT support a shared Oracle Home - OCFS2 does support a shared Oracle Home Though ASM appears to be the intended replacement for Oracle Cluster File System (OCFS) for the Real Applications Cluster (RAC). ASM supports Oracle Real Application Clusters (RAC), so there is no need for a separate Cluster LVM or a Cluster File System. So it boils down to: - You use or OCFS2, or RAW, or ASM (preferrably) for your database files. Storage Option Oracle Clusterware Database Recovery area -------------- ------------------ -------- ------------- Automatic Storage Management No Yes Yes Cluster file system (OCFS) Yes Yes Yes Shared raw storage Yes Yes No ======== Note 3: ======== Automatic Storage Management (ASM) simplifies database administration. It eliminates the need for you, as a DBA, to directly manage potentially thousands of Oracle database files. It does this by enabling you to create disk groups, which are comprised of disks and the files that reside on them. You only need to manage a small number of disk groups. In the SQL statements that you use for creating database structures such as tablespaces, redo log and archive log files, and control files, you specify file location in terms of disk groups. Automatic Storage Management then creates and manages the associated underlying files for you. Automatic Storage Management extends the power of Oracle-managed files. 
With Oracle-managed files, files are created and managed automatically for you, but with Automatic Storage Management you get the additional benefits of features such as mirroring and striping. The primary component of Automatic Storage Management is the disk group. You configure Automatic Storage Management by creating disk groups, which, in your database instance, can then be specified as the default location for files created in the database. Oracle provides SQL statements that create and manage disk groups, their contents, and their metadata. A disk group consists of a grouping of disks that are managed together as a unit. These disks are referred to as ASM disks. Files written on ASM disks are ASM files, whose names are automatically generated by Automatic Storage Management. You can specify user-friendly alias names for ASM files, but you must create a hierarchical directory structure for these alias names. You can affect how Automatic Storage Management places files on disks by specifying failure groups. Failure groups define disks that share components, such that if one fails then other disks sharing the component might also fail. An example of what you might define as a failure group would be a set of SCSI disks sharing the same SCSI controller. Failure groups are used to determine which ASM disks to use for storing redundant data. For example, if two-way mirroring is specified for a file, then redundant copies of file extents must be stored in separate failure groups. If you would take a look at the v$datafile, v$logfile, and v$controlfile of the regular Database, you would see information like in the following example: SQL> select file#, name from v$datafile; 1 +DATA1/rac0/datafile/system.256.1 2 +DATA1/rac0/datafile/undotbs.258.1 3 +DATA1/rac0/datafile/sysaux.257.1 4 +DATA1/rac0/datafile/users.259.1 5 +DATA1/rac0/datafile/example.269.1 SQL> select name from v$controlfile; +DATA1/rac0/controlfile/current.261.3 +DATA1/rac0/controlfile/current.260.3 -- Initialization Parameters (init.ora or SPFILE) for ASM Instances The following initialization parameters relate to an ASM instance. Parameters that start with ASM_ cannot be set in database instances. Name Description INSTANCE_TYPE Must be set to INSTANCE_TYPE = ASM. Note: This is the only required parameter. All other parameters take suitable defaults for most environments. DB_UNIQUE_NAME Unique name for this group of ASM instances within the cluster or on a node. Default: +ASM (Needs to be modified only if trying to run multiple ASM instances on the same node) ASM_POWER_LIMIT The maximum power on an ASM instance for disk rebalancing. Default: 1 Can range from 1 to 11. 1 is the lowest priority. See Also: "Tuning Rebalance Operations" ASM_DISKSTRING Limits the set of disks that Automatic Storage Management considers for discovery. Default: NULL (This default causes ASM to find all of the disks in a platform-specific location to which it has read/write access.). Example: /dev/raw/* ASM_DISKGROUPS Lists the names of disk groups to be mounted by an ASM instance at startup, or when the ALTER DISKGROUP ALL MOUNT statement is used. Default: NULL (If this parameter is not specified, then no disk groups are mounted.) Note: This parameter is dynamic and if you are using a server parameter file (SPFILE), then you should rarely need to manually alter this value. 
Automatic Storage Management automatically adds a disk group to this parameter when a disk group is successfully mounted, and automatically removes a disk group that is specifically dismounted. However, when using a traditional text initialization parameter file, remember to edit the initialization parameter file to add the name of any disk group that you want automatically mounted at instance startup, and remove the name of any disk group that you no longer want automatically mounted. -- ASM Views: The ASM configuration can be viewed using the V$ASM_% views, which often contain different information depending on whether they are queried from the ASM instance, or a dependant database instance. Viewing ASM Instance Information Via SQL Queries There are several dynamic and data dictionary views available to view an ASM configuration from within the ASM instance itself: ASM Dynamic Views: FROM ASM Instance Information View Name Description V$ASM_ALIAS Shows every alias for every disk group mounted by the ASM instance V$ASM_CLIENT Shows which database instance(s) are using any ASM disk groups that are being mounted by this ASM instance V$ASM_DISK Lists each disk discovered by the ASM instance, including disks that are not part of any ASM disk group V$ASM_DISKGROUP Describes information about ASM disk groups mounted by the ASM instance V$ASM_FILE Lists each ASM file in every ASM disk group mounted by the ASM instance V$ASM_OPERATION Like its counterpart, V$SESSION_LONGOPS, it shows each long-running ASM operation in the ASM instance V$ASM_TEMPLATE Lists each template present in every ASM disk group mounted by the ASM instance -- Managing disk groups The SQL statements introduced in this section are only available in an ASM instance. You must first start the ASM instance. Creating disk group examples: Example 1: ---------- Creating a Disk Group: Example The following examples assume that the ASM_DISKSTRING is set to '/devices/*'. Assume the following: ASM disk discovery identifies the following disks in directory /devices. /devices/diska1 /devices/diska2 /devices/diska3 /devices/diska4 /devices/diskb1 /devices/diskb2 /devices/diskb3 /devices/diskb4 The disks diska1 - diska4 are on a separate SCSI controller from disks diskb1 - diskb4. The following SQL*Plus session illustrates starting an ASM instance and creating a disk group named dgroup1. % SQLPLUS /NOLOG SQL> CONNECT / AS SYSDBA SQL> CREATE DISKGROUP dgroup1 NORMAL REDUNDANCY 2 FAILGROUP controller1 DISK 3 '/devices/diska1', 4 '/devices/diska2', 5 '/devices/diska3', 6 '/devices/diska4', 7 FAILGROUP controller2 DISK 8 '/devices/diskb1', 9 '/devices/diskb2', 10 '/devices/diskb3', 11 '/devices/diskb4'; In this example, dgroup1 is composed of eight disks that are defined as belonging to either failure group controller1 or controller2. Since NORMAL REDUNDANCY level is specified for the disk group, then Automatic Storage Management provides redundancy for all files created in dgroup1 according to the attributes specified in the disk group templates. For example, in the system default template shown in the table in "Managing Disk Group Templates", normal redundancy for the online redo log files (ONLINELOG template) is two-way mirroring. This means that when one copy of a redo log file extent is written to a disk in failure group controller1, a mirrored copy of the file extent is written to a disk in failure group controller2. You can see that to support normal redundancy level, at least two failure groups must be defined. 
Since no NAME clauses are provided for any of the disks being included in the disk group, the disks are assigned the names of dgroup1_0001, dgroup1_0002, ..., dgroup1_0008. Example 2: ---------- CREATE DISKGROUP disk_group_1 NORMAL REDUNDANCY FAILGROUP failure_group_1 DISK '/devices/diska1' NAME diska1, '/devices/diska2' NAME diska2, FAILGROUP failure_group_2 DISK '/devices/diskb1' NAME diskb1, '/devices/diskb2' NAME diskb2; Example 3: ---------- At some point in using OUI in installing the software, and creating a database, you will see the following screen: ---------------------------------------------------- |SPECIFY Database File Storage Option | | | | o File system | | Specify Database file location: ######### | | | | o Automatic Storage Management (ASM) | | | | o Raw Devices | | | | Specify Raw Devices mapping file: ########## | ---------------------------------------------------- Suppose that you have on a Linux machine the following raw disk devices: /dev/raw/raw1 8GB /dev/raw/raw2 8GB /dev/raw/raw3 6GB /dev/raw/raw4 6GB /dev/raw/raw5 6GB /dev/raw/raw6 6GB Then you can choose ASM in the upper screen, and see the following screen, where you can create the initial diskgroup and assign disks to it: ----------------------------------------------------- | Configure Automatic Storage Management | | | | Disk Group Name: data1 | | | | Redundancy | | o High o Normal o External | | | | Add member Disks | | |-------------------------------- | | | select Disk Path | | | |[#] /dev/raw/raw1 | | | |[#] /dev/raw/raw2 | | | |[ ] /dev/raw/raw3 | | | |[ ] /dev/raw/raw4 | | | -------------------------------- | | | ----------------------------------------------------- -- Mounting and Dismounting Disk Groups Disk groups that are specified in the ASM_DISKGROUPS initialization parameter are mounted automatically at ASM instance startup. This makes them available to all database instances running on the same node as Automatic Storage Management. The disk groups are dismounted at ASM instance shutdown. Automatic Storage Management also automatically mounts a disk group when you initially create it, and dismounts a disk group if you drop it. There may be times that you want to mount or dismount disk groups manually. For these actions use the ALTER DISKGROUP ... MOUNT or ALTER DISKGROUP ... DISMOUNT statement. You can mount or dismount disk groups by name, or specify ALL. If you try to dismount a disk group that contains open files, the statement will fail, unless you also specify the FORCE clause. Example The following statement dismounts all disk groups that are currently mounted to the ASM instance: ALTER DISKGROUP ALL DISMOUNT; The following statement mounts disk group dgroup1: ALTER DISKGROUP dgroup1 MOUNT; ======== Note 4: ======== -- Installing Oracle ASMLib for Linux: ASMLib is a support library for the Automatic Storage Management feature of Oracle Database 10g. This document is a set of tips for installing the Linux specific ASM library and its assocated driver. This library is provide to enable ASM I/O to Linux disks without the limitations of the standard Unix I/O API. The steps below are steps that the system administrator must follow. The ASMLib software is available from the Oracle Technology Network. Go to ASMLib download page and follow the link for your platform. You will see 4-6 packages for your Linux platform. -The oracleasmlib package provides the actual ASM library. -The oracleasm-support package provides the utilities used to get the ASM driver up and running. 
Both of these packages need to be installed.
-The remaining packages provide the kernel driver for the ASM library. Each package provides the driver
 for a different kernel. You must install the appropriate package for the kernel you are running.
 Use the "uname -r" command to determine the version of the kernel. The oracleasm kernel driver package
 will have that version string in its name. For example, if you were running Red Hat Enterprise Linux 4 AS,
 and the kernel you were using was the 2.6.9-5.0.5.ELsmp kernel, you would choose the
 oracleasm-2.6.9-5.0.5-ELsmp package.

So, for example, to install these packages on RHEL4 on an Intel x86 machine, you might use the command:

rpm -Uvh oracleasm-support-2.0.0-1.i386.rpm \
         oracleasmlib-2.0.0-1.i386.rpm \
         oracleasm-2.6.9-5.0.5-ELsmp-2.0.0-1.i686.rpm

Once the command completes, ASMLib is installed on the system.

-- Configuring ASMLib:

Now that the ASMLib software is installed, a few steps have to be taken by the system administrator to
make the ASM driver available. The ASM driver needs to be loaded, and the driver filesystem needs to be
mounted. This is taken care of by the initialization script, "/etc/init.d/oracleasm".

Run the "/etc/init.d/oracleasm" script with the "configure" option. It will ask for the user and group
that default to owning the ASM driver access point. If the database was running as the 'oracle' user and
the 'dba' group, the output would look like this:

[root@ca-test1 /]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.

This will configure the on-boot properties of the Oracle ASM library driver. The following questions
will determine whether the driver is loaded on boot and what permissions it will have. The current
values will be shown in brackets ('[]'). Hitting ENTER without typing an answer will keep that current
value. Ctrl-C will abort.

Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration [ OK ]
Creating /dev/oracleasm mount point [ OK ]
Loading module "oracleasm" [ OK ]
Mounting ASMlib driver filesystem [ OK ]
Scanning system for ASM disks [ OK ]

This should load the oracleasm.o driver module and mount the ASM driver filesystem. By selecting
enabled = 'y' during the configuration, the system will always load the module and mount the filesystem
on boot.

The automatic start can be enabled or disabled with the 'enable' and 'disable' options to
/etc/init.d/oracleasm:

[root@ca-test1 /]# /etc/init.d/oracleasm disable
Writing Oracle ASM library driver configuration [ OK ]
Unmounting ASMlib driver filesystem [ OK ]
Unloading module "oracleasm" [ OK ]

[root@ca-test1 /]# /etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration [ OK ]
Loading module "oracleasm" [ OK ]
Mounting ASMlib driver filesystem [ OK ]
Scanning system for ASM disks [ OK ]

-- Making Disks Available to ASMLib:

The system administrator has one last task. Every disk that ASMLib is going to be accessing needs to be
made available. This is accomplished by creating an ASM disk. The /etc/init.d/oracleasm script is again
used for this task:

[root@ca-test1 /]# /etc/init.d/oracleasm createdisk VOL1 /dev/sdg1
Creating Oracle ASM disk "VOL1" [ OK ]

Disk names are ASCII capital letters, numbers, and underscores. They must start with a letter.
Disks that are no longer used by ASM can be unmarked as well:

[root@ca-test1 /]# /etc/init.d/oracleasm deletedisk VOL1
Deleting Oracle ASM disk "VOL1" [ OK ]

Any operating system disk can be queried to see if it is used by ASM:

[root@ca-test1 /]# /etc/init.d/oracleasm querydisk /dev/sdg1
Checking if device "/dev/sdg1" is an Oracle ASM disk [ OK ]
[root@ca-test1 /]# /etc/init.d/oracleasm querydisk /dev/sdh1
Checking if device "/dev/sdh1" is an Oracle ASM disk [FAILED]

Existing disks can be listed and queried:

[root@ca-test1 /]# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
[root@ca-test1 /]# /etc/init.d/oracleasm querydisk VOL1
Checking for ASM disk "VOL1" [ OK ]

When a disk is added to a RAC setup, the other nodes need to be notified about it. Run the 'createdisk'
command on one node, and then run 'scandisks' on every other node:

[root@ca-test1 /]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks [ OK ]

-- Discovery Strings for Linux ASMLib:

ASMLib uses discovery strings to determine what disks ASM is asking for. The generic Linux ASMLib uses
glob strings. The string must be prefixed with "ORCL:". Disks are specified by name. A disk created with
the name "VOL1" can be discovered in ASM via the discovery string "ORCL:VOL1". Similarly, all disks that
start with the string "VOL" can be queried with the discovery string "ORCL:VOL*".

Disks cannot be discovered with path names in the discovery string. If the prefix is missing, the
generic Linux ASMLib will ignore the discovery string completely, expecting that it is intended for a
different ASMLib. The only exception is the empty string (""), which is considered a full wildcard.
This is precisely equivalent to the discovery string "ORCL:*".

NOTE: Once you mark your disks with Linux ASMLib, Oracle Database 10g R1 (10.1) OUI will not be able to
discover your disks. It is recommended that you complete a Software Only install and then use DBCA to
create your database (or use the custom install).

========
Note 5:
========

Automatic Storage Management (ASM) is a new feature that has been introduced in Oracle 10g to simplify
the storage of Oracle datafiles, controlfiles and logfiles.

- Overview of Automatic Storage Management (ASM)
- Initialization Parameters and ASM Instance Creation
- Startup and Shutdown of ASM Instances
- Administering ASM Disk Groups
- Disks
- Templates
- Directories
- Aliases
- Files
- Checking Metadata
- ASM Filenames
- ASM Views
- SQL and ASM
- Migrating to ASM Using RMAN

Overview of Automatic Storage Management (ASM)

Automatic Storage Management (ASM) simplifies administration of Oracle related files by allowing the
administrator to reference disk groups rather than individual disks and files, which are managed by ASM.
The ASM functionality is an extension of the Oracle Managed Files (OMF) functionality that also includes
striping and mirroring to provide balanced and secure storage. The new ASM functionality can be used in
combination with existing raw and cooked file systems, along with OMF and manually managed files.

The ASM functionality is controlled by an ASM instance. This is not a full database instance, just the
memory structures, and as such it is very small and lightweight.

The main components of ASM are disk groups, each of which comprises several physical disks that are
controlled as a single unit. The physical disks are known as ASM disks, while the files that reside on
the disks are known as ASM files.
The locations and names for the files are controlled by ASM, but user-friendly aliases and directory
structures can be defined for ease of reference. The level of redundancy and the granularity of the
striping can be controlled using templates. Default templates are provided for each file type stored by
ASM, but additional templates can be defined as needed.

Failure groups are defined within a disk group to support the required level of redundancy. For two-way
mirroring you would expect a disk group to contain two failure groups, so individual files are written
to two locations.

In summary, ASM provides the following functionality:
- Manages groups of disks, called disk groups.
- Manages disk redundancy within a disk group.
- Provides near-optimal I/O balancing without any manual tuning.
- Enables management of database objects without specifying mount points and filenames.
- Supports large files.

Initialization Parameters and ASM Instance Creation

The init.ora / spfile initialization parameters that are of specific interest for an ASM instance are:

INSTANCE_TYPE   - Set to ASM or RDBMS depending on the instance type. The default is RDBMS.
DB_UNIQUE_NAME  - Specifies a globally unique name for the database. This defaults to +ASM but must be
                  altered if you intend to run multiple ASM instances.
ASM_POWER_LIMIT - The maximum power for a rebalancing operation on an ASM instance. The valid values
                  range from 1 to 11, with 1 being the default. The higher the limit, the more resources
                  are allocated, resulting in faster rebalancing operations. This value is also used as
                  the default when the POWER clause is omitted from a rebalance operation.
ASM_DISKGROUPS  - The list of disk groups that should be mounted by an ASM instance during instance
                  startup, or by the ALTER DISKGROUP ALL MOUNT statement. ASM configuration changes are
                  automatically reflected in this parameter.
ASM_DISKSTRING  - Specifies a value that can be used to limit the disks considered for discovery.
                  Altering the default value may improve the speed of disk group mount time and the
                  speed of adding a disk to a disk group. Changing the parameter to a value which
                  prevents the discovery of already mounted disks results in an error. The default value
                  is NULL, allowing all suitable disks to be considered.

Incorrect usage of parameters in ASM or RDBMS instances results in ORA-15021 errors.

To create an ASM instance, first create a file called init+ASM.ora in the /tmp directory containing the
following information.

INSTANCE_TYPE=ASM

Next, using SQL*Plus, connect to the idle instance.

export ORACLE_SID=+ASM
sqlplus / as sysdba

Create an spfile using the contents of the init+ASM.ora file.

SQL> CREATE SPFILE FROM PFILE='/tmp/init+ASM.ora';

File created.

Finally, start the instance with the NOMOUNT option.

SQL> startup nomount
ASM instance started

Total System Global Area  125829120 bytes
Fixed Size                  1301456 bytes
Variable Size             124527664 bytes
Database Buffers                  0 bytes
Redo Buffers                      0 bytes
SQL>

The ASM instance is now ready to use for creating and mounting disk groups. To shut down the ASM
instance, issue the following command.
SQL> shutdown
ASM instance shutdown
SQL>

Once an ASM instance is present, disk groups can be used for the following parameters in database
instances (INSTANCE_TYPE=RDBMS) to allow ASM file creation:

DB_CREATE_FILE_DEST
DB_CREATE_ONLINE_LOG_DEST_n
DB_RECOVERY_FILE_DEST
CONTROL_FILES
LOG_ARCHIVE_DEST_n
LOG_ARCHIVE_DEST
STANDBY_ARCHIVE_DEST

Here is an example of how to create a datafile using a default disk group specified by an initialization
parameter setting. Suppose the database initialization parameter file is set as follows:

DB_CREATE_FILE_DEST = '+dskgrp01'

If you now create a tablespace

SQL> CREATE TABLESPACE SALESDATA;

it will be stored in +dskgrp01.

Startup and Shutdown of ASM Instances

ASM instances are started and stopped in a similar way to normal database instances.
The options for the STARTUP command are:

FORCE   - Performs a SHUTDOWN ABORT before restarting the ASM instance.
MOUNT   - Starts the ASM instance and mounts the disk groups specified by the ASM_DISKGROUPS parameter.
NOMOUNT - Starts the ASM instance without mounting any disk groups.
OPEN    - This is not a valid option for an ASM instance.

The options for the SHUTDOWN command are:

NORMAL        - The ASM instance waits for all connected ASM instances and SQL sessions to exit, then
                shuts down.
IMMEDIATE     - The ASM instance waits for any SQL transactions to complete, then shuts down. It doesn't
                wait for sessions to exit.
TRANSACTIONAL - Same as IMMEDIATE.
ABORT         - The ASM instance shuts down instantly.

Administering ASM Disk Groups

Disk groups are created using the CREATE DISKGROUP statement. This statement allows you to specify the
level of redundancy:

NORMAL REDUNDANCY   - Two-way mirroring, requiring two failure groups.
HIGH REDUNDANCY     - Three-way mirroring, requiring three failure groups.
EXTERNAL REDUNDANCY - No mirroring for disks that are already protected using hardware mirroring or RAID.

In addition, failure groups and preferred names for disks can be defined. If the NAME clause is omitted,
the disks are given a system generated name like "disk_group_1_0001". The FORCE option can be used to
move a disk from another disk group into this one.

CREATE DISKGROUP disk_group_1 NORMAL REDUNDANCY
  FAILGROUP failure_group_1 DISK
    '/devices/diska1' NAME diska1,
    '/devices/diska2' NAME diska2,
  FAILGROUP failure_group_2 DISK
    '/devices/diskb1' NAME diskb1,
    '/devices/diskb2' NAME diskb2;

Disk groups can be deleted using the DROP DISKGROUP statement.

DROP DISKGROUP disk_group_1 INCLUDING CONTENTS;

Disks can be added or removed from disk groups using the ALTER DISKGROUP statement. Remember that the
wildcard "*" can be used to reference disks, so long as the resulting string does not match a disk
already used by an existing disk group.

-- Add disks.
ALTER DISKGROUP disk_group_1 ADD DISK
  '/devices/disk*3',
  '/devices/disk*4';

-- Drop a disk.
ALTER DISKGROUP disk_group_1 DROP DISK diska2;

Disks can be resized using the RESIZE clause of the ALTER DISKGROUP statement. The statement can be used
to resize individual disks, all disks in a failure group, or all disks in the disk group. If the SIZE
clause is omitted, the disks are resized to the size of the disk returned by the OS.

-- Resize a specific disk.
ALTER DISKGROUP disk_group_1 RESIZE DISK diska1 SIZE 100G;

-- Resize all disks in a failure group.
ALTER DISKGROUP disk_group_1 RESIZE DISKS IN FAILGROUP failure_group_1 SIZE 100G;

-- Resize all disks in a disk group.
ALTER DISKGROUP disk_group_1 RESIZE ALL SIZE 100G;

The UNDROP DISKS clause of the ALTER DISKGROUP statement allows pending disk drops to be undone.
It will not revert drops that have completed, or disk drops associated with the dropping of a disk group.

ALTER DISKGROUP disk_group_1 UNDROP DISKS;

Disk groups can be rebalanced manually using the REBALANCE clause of the ALTER DISKGROUP statement.
If the POWER clause is omitted, the ASM_POWER_LIMIT parameter value is used. Manual rebalancing is only
needed when the speed of the automatic rebalancing is not appropriate.

ALTER DISKGROUP disk_group_1 REBALANCE POWER 5;

Disk groups are mounted at ASM instance startup and unmounted at ASM instance shutdown. Manual mounting
and dismounting can be accomplished using the ALTER DISKGROUP statement as seen below.

ALTER DISKGROUP ALL DISMOUNT;
ALTER DISKGROUP ALL MOUNT;
ALTER DISKGROUP disk_group_1 DISMOUNT;
ALTER DISKGROUP disk_group_1 MOUNT;

Templates

Templates are named groups of attributes that can be applied to the files within a disk group.
The following examples show how templates can be created, altered and dropped.

-- Create a new template.
ALTER DISKGROUP disk_group_1 ADD TEMPLATE my_template ATTRIBUTES (MIRROR FINE);

-- Modify template.
ALTER DISKGROUP disk_group_1 ALTER TEMPLATE my_template ATTRIBUTES (COARSE);

-- Drop template.
ALTER DISKGROUP disk_group_1 DROP TEMPLATE my_template;

Available attributes include:

UNPROTECTED - No mirroring or striping regardless of the redundancy setting.
MIRROR      - Two-way mirroring for normal redundancy and three-way mirroring for high redundancy.
              This attribute cannot be set for external redundancy.
COARSE      - Specifies lower granularity for striping. This attribute cannot be set for external
              redundancy.
FINE        - Specifies higher granularity for striping. This attribute cannot be set for external
              redundancy.

Directories

A directory hierarchy can be defined using the ALTER DISKGROUP statement to support ASM file aliasing.
The following examples show how ASM directories can be created, modified and deleted.

-- Create a directory.
ALTER DISKGROUP disk_group_1 ADD DIRECTORY '+disk_group_1/my_dir';

-- Rename a directory.
ALTER DISKGROUP disk_group_1 RENAME DIRECTORY '+disk_group_1/my_dir' TO '+disk_group_1/my_dir_2';

-- Delete a directory and all its contents.
ALTER DISKGROUP disk_group_1 DROP DIRECTORY '+disk_group_1/my_dir_2' FORCE;

Aliases

Aliases allow you to reference ASM files using user-friendly names, rather than the fully qualified ASM
filenames.

-- Create an alias using the fully qualified filename.
ALTER DISKGROUP disk_group_1 ADD ALIAS '+disk_group_1/my_dir/my_file.dbf'
  FOR '+disk_group_1/mydb/datafile/my_ts.342.3';

-- Create an alias using the numeric form filename.
ALTER DISKGROUP disk_group_1 ADD ALIAS '+disk_group_1/my_dir/my_file.dbf'
  FOR '+disk_group_1.342.3';

-- Rename an alias.
ALTER DISKGROUP disk_group_1 RENAME ALIAS '+disk_group_1/my_dir/my_file.dbf'
  TO '+disk_group_1/my_dir/my_file2.dbf';

-- Delete an alias.
ALTER DISKGROUP disk_group_1 DELETE ALIAS '+disk_group_1/my_dir/my_file.dbf';

Attempting to drop a system alias results in an error.

Files

Files are not deleted automatically if they are created using aliases, as they are not Oracle Managed
Files (OMF), or if a recovery is done to a point-in-time before the file was created. For these
circumstances it is necessary to manually delete the files, as shown below.

-- Drop file using an alias.
ALTER DISKGROUP disk_group_1 DROP FILE '+disk_group_1/my_dir/my_file.dbf';

-- Drop file using a numeric form filename.
ALTER DISKGROUP disk_group_1 DROP FILE '+disk_group_1.342.3';

-- Drop file using a fully qualified filename.
ALTER DISKGROUP disk_group_1 DROP FILE '+disk_group_1/mydb/datafile/my_ts.342.3'; Checking Metadata The internal consistency of disk group metadata can be checked in a number of ways using the CHECK clause of the ALTER DISKGROUP statement. -- Check metadata for a specific file. ALTER DISKGROUP disk_group_1 CHECK FILE '+disk_group_1/my_dir/my_file.dbf' -- Check metadata for a specific failure group in the disk group. ALTER DISKGROUP disk_group_1 CHECK FAILGROUP failure_group_1; -- Check metadata for a specific disk in the disk group. ALTER DISKGROUP disk_group_1 CHECK DISK diska1; -- Check metadata for all disks in the disk group. ALTER DISKGROUP disk_group_1 CHECK ALL; ASM Views The ASM configuration can be viewed using the V$ASM_% views, which often contain different information depending on whether they are queried from the ASM instance, or a dependant database instance. Viewing ASM Instance Information Via SQL Queries Finally, there are several dynamic and data dictionary views available to view an ASM configuration from within the ASM instance itself: -- ASM Dynamic Views: FROM ASM Instance Information View Name Description V$ASM_ALIAS Shows every alias for every disk group mounted by the ASM instance V$ASM_CLIENT Shows which database instance(s) are using any ASM disk groups that are being mounted by this ASM instance V$ASM_DISK Lists each disk discovered by the ASM instance, including disks that are not part of any ASM disk group V$ASM_DISKGROUP Describes information about ASM disk groups mounted by the ASM instance V$ASM_FILE Lists each ASM file in every ASM disk group mounted by the ASM instance V$ASM_OPERATION Like its counterpart, V$SESSION_LONGOPS, it shows each long-running ASM operation in the ASM instance V$ASM_TEMPLATE Lists each template present in every ASM disk group mounted by the ASM instance I was also able to query the following dynamic views against my database instance to view the related ASM storage components of that instance: -- ASM Dynamic Views: FROM Database Instance Information View Name Description V$ASM_DISKGROUP Shows one row per each ASM disk group that's mounted by the local ASM instance V$ASM_DISK Displays one row per each disk in each ASM disk group that are in use by the database instance V$ASM_CLIENT Lists one row per each ASM instance for which the database instance has any open ASM files ASM Filenames There are several ways to reference ASM file. Some forms are used during creation and some for referencing ASM files. The forms for file creation are incomplete, relying on ASM to create the fully qualified name, which can be retrieved from the supporting views. The forms of the ASM filenames are summarised below. Filename Type Format Fully Qualified ASM Filename +dgroup/dbname/file_type/file_type_tag.file.incarnation Numeric ASM Filename +dgroup.file.incarnation Alias ASM Filenames +dgroup/directory/filename Alias ASM Filename with Template +dgroup(template)/alias Incomplete ASM Filename +dgroup Incomplete ASM Filename with Template +dgroup(template) SQL and ASM ASM filenames can be used in place of conventional filenames for most Oracle file types, including controlfiles, datafiles, logfiles etc. For example, the following command creates a new tablespace with a datafile in the disk_group_1 disk group. 
CREATE TABLESPACE my_ts DATAFILE '+disk_group_1' SIZE 100M AUTOEXTEND ON;

Migrating to ASM Using RMAN

The following method shows how a primary database can be migrated to ASM from a disk based backup:

Disable change tracking (only available in Enterprise Edition) if it is currently being used.

SQL> ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;

Shutdown the database.

SQL> SHUTDOWN IMMEDIATE

Modify the parameter file of the target database as follows:
Set the DB_CREATE_FILE_DEST and DB_CREATE_ONLINE_LOG_DEST_n parameters to the relevant ASM disk groups.
Remove the CONTROL_FILES parameter from the spfile so the control files will be moved to the
DB_CREATE_* destination and the spfile gets updated automatically. If you are using a pfile, the
CONTROL_FILES parameter must be set to the appropriate ASM files or aliases.

Start the database in nomount mode.

RMAN> STARTUP NOMOUNT

Restore the controlfile into the new location from the old location.

RMAN> RESTORE CONTROLFILE FROM 'old_control_file_name';

Mount the database.

RMAN> ALTER DATABASE MOUNT;

Copy the database into the ASM disk group.

RMAN> BACKUP AS COPY DATABASE FORMAT '+disk_group';

Switch all datafiles to the new ASM location.

RMAN> SWITCH DATABASE TO COPY;

Open the database.

RMAN> ALTER DATABASE OPEN;

Create new redo logs in ASM and delete the old ones. Enable change tracking if it was being used.

SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING;

For more information see: Using Automatic Storage Management, Migrating a Database into ASM.

Note 6:
=======

How to Use Oracle10g release 2 ASM on Linux:

[root@danaly etc]# fdisk /dev/cciss/c0d0

The number of cylinders for this disk is set to 8854.
There is nothing wrong with that, but this is larger than 1024, and could in certain setups cause
problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/cciss/c0d0: 72.8 GB, 72833679360 bytes
255 heads, 63 sectors/track, 8854 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot        Start       End      Blocks   Id  System
/dev/cciss/c0d0p1   *      1        33      265041   83  Linux
/dev/cciss/c0d0p2         34       555     4192965   82  Linux swap
/dev/cciss/c0d0p3        556       686    1052257+   83  Linux
/dev/cciss/c0d0p4        687      8854    65609460    5  Extended
/dev/cciss/c0d0p5        687      1730    8385898+   83  Linux
/dev/cciss/c0d0p6       1731      2774    8385898+   83  Linux
/dev/cciss/c0d0p7       2775      3818    8385898+   83  Linux
/dev/cciss/c0d0p8       3819      4601     6289416   83  Linux

Command (m for help): n
First cylinder (4602-8854, default 4602):
Using default value 4602
Last cylinder or +size or +sizeM or +sizeK (4602-8854, default 8854): +20000M

Command (m for help): n
First cylinder (7035-8854, default 7035):
Using default value 7035
Last cylinder or +size or +sizeM or +sizeK (7035-8854, default 8854): +3000M

Command (m for help): n
First cylinder (7401-8854, default 7401):
Using default value 7401
Last cylinder or +size or +sizeM or +sizeK (7401-8854, default 8854): +3000M

Command (m for help): p

Disk /dev/cciss/c0d0: 72.8 GB, 72833679360 bytes
255 heads, 63 sectors/track, 8854 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot        Start       End      Blocks   Id  System
/dev/cciss/c0d0p1   *      1        33      265041   83  Linux
/dev/cciss/c0d0p2         34       555     4192965   82  Linux swap
/dev/cciss/c0d0p3        556       686    1052257+   83  Linux
/dev/cciss/c0d0p4        687      8854    65609460    5  Extended
/dev/cciss/c0d0p5        687      1730    8385898+   83  Linux
/dev/cciss/c0d0p6       1731      2774    8385898+   83  Linux
/dev/cciss/c0d0p7       2775      3818    8385898+   83  Linux
/dev/cciss/c0d0p8 3819 4601 6289416 83 Linux /dev/cciss/c0d0p9 4602 7034 19543041 83 Linux /dev/cciss/c0d0p10 7035 7400 2939863+ 83 Linux /dev/cciss/c0d0p11 7401 7766 2939863+ 83 Linux Command (m for help): w The partition table has been altered! Calling ioctl() to re-read partition table. WARNING: Re-reading the partition table failed with error 16: Device or resource busy. The kernel still uses the old table. The new table will be used at the next reboot. Syncing disks. [root@danaly data1]# /etc/init.d/oracleasm createdisk VOL5 /dev/cciss/c0d0p10 Marking disk "/dev/cciss/c0d0p10" as an ASM disk: [ OK ] [root@danaly data1]# /etc/init.d/oracleasm createdisk VOL6 /dev/cciss/c0d0p11 Marking disk "/dev/cciss/c0d0p11" as an ASM disk: [ OK ] [root@danaly data1]# /etc/init.d/oracleasm listdisks VOL1 VOL2 VOL3 VOL4 VOL5 VOL6 (THE FOLLOWING QUERIES ARE ISSUED FROM THE ASM INSTANCE.) [oracle@danaly ~]$ export ORACLE_SID=+ASM [oracle@danaly ~]$ sqlplus "/ as sysdba" SQL*Plus: Release 10.2.0.1.0 - Production on Sun Sep 3 00:28:09 2006 Copyright (c) 1982, 2005, Oracle. All rights reserved. Connected to an idle instance. SQL> startup ASM instance started Total System Global Area 83886080 bytes Fixed Size 1217836 bytes Variable Size 57502420 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> select group_number,disk_number,mode_status from v$asm_disk; GROUP_NUMBER DISK_NUMBER MODE_STATUS ------------ ----------- -------------- 0 4 ONLINE 0 5 ONLINE 1 0 ONLINE 1 1 ONLINE 1 2 ONLINE 1 3 ONLINE 6 rows selected. SQL> select group_number,disk_number,mode_status,name from v$asm_disk; GROUP_NUMBER DISK_NUMBER MODE_STATUS NAME ------------ ----------- -------------- --------------------------------- 0 4 ONLINE 0 5 ONLINE 1 0 ONLINE VOL1 1 1 ONLINE VOL2 1 2 ONLINE VOL3 1 3 ONLINE VOL4 6 rows selected. SQL> create diskgroup orag2 external redundancy disk 'ORCL:VOL5'; Diskgroup created. SQL> select group_number,disk_number,mode_status,name from v$asm_disk; GROUP_NUMBER DISK_NUMBER MODE_STATUS NAME ------------ ----------- -------------- ------------------------------------- 0 5 ONLINE 1 0 ONLINE VOL1 1 1 ONLINE VOL2 1 2 ONLINE VOL3 1 3 ONLINE VOL4 2 0 ONLINE VOL5 6 rows selected. (THE FOLLOWING QUERIES ARE ISSUED FROM THE DATABASE INSTANCE.) [oracle@danaly ~]$ export ORACLE_SID=danaly [oracle@danaly ~]$ sqlplus "/ as sysdba" SQL*Plus: Release 10.2.0.1.0 - Production on Sun Sep 3 00:47:04 2006 Copyright (c) 1982, 2005, Oracle. All rights reserved. Connected to an idle instance. SQL> startup ORACLE instance started. Total System Global Area 943718400 bytes Fixed Size 1222744 bytes Variable Size 281020328 bytes Database Buffers 654311424 bytes Redo Buffers 7163904 bytes Database mounted. Database opened. SQL> select name from v$datafile; NAME -------------------------------------------------------------------------------- +ORADG/danaly/datafile/system.264.600016955 +ORADG/danaly/datafile/undotbs1.265.600016969 +ORADG/danaly/datafile/sysaux.266.600016977 +ORADG/danaly/datafile/users.268.600016987 SQL> create tablespace eygle datafile '+ORAG2' ; Tablespace created. 
SQL> select name from v$datafile; NAME --------------------------------------------------------------------------------- +ORADG/danaly/datafile/system.264.600016955 +ORADG/danaly/datafile/undotbs1.265.600016969 +ORADG/danaly/datafile/sysaux.266.600016977 +ORADG/danaly/datafile/users.268.600016987 +ORAG2/danaly/datafile/eygle.256.600137647 oracle@danaly log]$ export ORACLE_SID=+ASM [oracle@danaly log]$ sqlplus "/ as sysdba" SQL*Plus: Release 10.2.0.1.0 - Production on Sun Sep 3 01:36:37 2006 Copyright (c) 1982, 2005, Oracle. All rights reserved. Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production With the Partitioning, Oracle Label Security, OLAP and Data Mining Scoring Engine options SQL> alter diskgroup orag2 add disk 'ORCL:VOL6'; Diskgroup altered. ============ 7: OMF ============ In a way, OMF is partly a predeccessor for ASM, on Oracle 9i. The OMF feature only provides for easy management of tablespace files, like in create statements, where you do not have to "worry" anymore on filelocations. Using Oracle-managed files simplifies the administration of an Oracle database. Oracle-managed files eliminate the need for you, the DBA, to directly manage the operating system files comprising an Oracle database. You specify operations in terms of database objects rather than filenames. Oracle internally uses standard file system interfaces to create and delete files as needed for the following database structures: Tablespaces Online redo log files Control files The following initialization parameters init.ora/spfile.ora allow the database server to use the Oracle Managed Files feature: - DB_CREATE_FILE_DEST Defines the location of the default file system directory where Oracle creates datafiles or tempfiles when no file specification is given in the creation operation. Also used as the default file system directory for online redo log and control files if DB_CREATE_ONLINE_LOG_DEST_n is not specified. - DB_CREATE_ONLINE_LOG_DEST_n Defines the location of the default file system directory for online redo log files and control file creation when no file specification is given in the creation operation. You can use this initialization parameter multiple times, where n specifies a multiplexed copy of the online redo log or control file. You can specify up to five multiplexed copies Example: DB_CREATE_FILE_DEST = '/u01/oradata/payroll' DB_CREATE_ONLINE_LOG_DEST_1 = '/u02/oradata/payroll' DB_CREATE_ONLINE_LOG_DEST_2 = '/u03/oradata/payroll' ============================================= 8: Installation notes 10g RAC on Windows ============================================== 8.1 Before you install: ----------------------- Each node in a cluster requires the following: > One private internet protocol (IP) address for each node to serve as the private interconnect. The following must be true for each private IP address: -It must be separate from the public network -It must be accessible on the same network interface on each node -It must have a unique address on each node The private interconnect is used for inter-node communication by both Oracle Clusterware and RAC. If the private address is available from a network name server (DNS), then you can use that name. Otherwise, the private IP address must be available in each node's C:\WINNT\system32\drivers\etc\hosts file. > One public IP address for each node, to be used as the Virtual IP (VIP) address for client connections and for connection failover. 
The name associated with the VIP must be different from the default host name. This VIP must be associated with the same interface name on every node that is part of your cluster. In addition, the IP addresses that you use for all of the nodes that are part of a cluster must be from the same subnet. > One public fixed hostname address for each node, typically assigned by the system administrator during operating system installation. If you have a DNS, then register both the fixed IP and the VIP address with DNS. If you do not have DNS, then you must make sure that the public IP and VIP addresses for all nodes are in each node's host file. For example, with a two node cluster where each node has one public and one private interface, you might have the configuration shown in the following table for your network interfaces, where the hosts file is %SystemRoot%\system32\drivers\etc\hosts: Node Interface Name Type IP Address Registered In rac1 rac1 Public 143.46.43.100 DNS (if available, else the hosts file) rac1 rac1-vip Virtual 143.46.43.104 DNS (if available, else the hosts file) rac1 rac1-priv Private 10.0.0.1 Hosts file rac2 rac2 Public 143.46.43.101 DNS (if available, else the hosts file) rac2 rac2-vip Virtual 143.46.43.105 DNS (if available, else the hosts file) rac2 rac2-priv Private 10.0.0.2 Hosts file The virtual IP addresses are assigned to the listener process. To enable VIP failover, the configuration shown in the preceding table defines the public and VIP addresses of both nodes on the same subnet, 143.46.43. When a node or interconnect fails, then the associated VIP is relocated to the surviving instance, enabling fast notification of the failure to the clients connecting through that VIP. If the application and client are configured with transparent application failover options, then the client is reconnected to the surviving instance. To disable Windows Media Sensing for TCP/IP, you must set the value of the DisableDHCPMediaSense parameter to 1 on each node. Disable Media Sensing by completing the following steps on each node of your cluster: Use Registry Editor (Regedt32.exe) to view the following key in the registry: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters Add the following registry value: Value Name: DisableDHCPMediaSense Data Type: REG_DWORD -Boolean Value: 1 - External shared disks for storing Oracle Clusterware and database files. The disk configuration options available to you are described in Chapter 3, "Storage Pre-Installation Tasks". Review these options before you decide which storage option to use in your RAC environment. However, note that when Database Configuration Assistant (DBCA) configures automatic disk backup, it uses a database recovery area which must be shared. The database files and recovery files do not necessarily have to be located on the same type of storage. Determine the storage option for your system and configure the shared disk. Oracle recommends that you use Automatic Storage Management (ASM) and Oracle Managed Files (OMF), or a cluster file system. If you use ASM or a cluster file system, then you can also take advantage of OMF and other Oracle Database 10g storage features. If you use RAC on Oracle Database 10g Standard Edition, then you must use ASM. If you use ASM, then Oracle recommends that you install ASM in a separate home from the Oracle Clusterware home and the Oracle home. Oracle Database 10g Real Application Clusters installation is a two-phase installation. 
In phase one, use Oracle Universal Installer (OUI) to install Oracle Clusterware. In phase two, install the database software using OUI. When you install Oracle Clusterware or RAC, OUI copies the Oracle software onto the node from which you are running it. If your Oracle home is not on a cluster file system, then OUI propagates the software onto the other nodes that you have selected to be part of your OUI installation session. - Shared Storage for Database Recovery Area When you configure a database recovery area in a RAC environment, the database recovery area must be on shared storage. When Database Configuration Assistant (DBCA) configures automatic disk backup, it uses a database recovery area that must be shared. If the database files are stored on a cluster file system, then the recovery area can also be shared through the cluster file system. If the database files are stored on an Automatic Storage Management (ASM) disk group, then the recovery area can also be shared through ASM. If the database files are stored on raw devices, then you must use either a cluster file system or ASM for the recovery area. Note: ASM disk groups are always valid recovery areas, as are cluster file systems. Recovery area files do not have to be in the same location where datafiles are stored. For instance, you can store datafiles on raw devices, but use ASM for the recovery area. Data files are not placed on NTFS partitions, because they cannot be shared. Data files can be placed on Oracle Cluster File System (OCFS), on raw disks using ASM, or on raw disks. - Oracle Clusterware You must provide OUI with the names of the nodes on which you want to install Oracle Clusterware. The Oracle Clusterware home can be either shared by all nodes, or private to each node, depending on your responses when you run OUI. The home that you select for Oracle Clusterware must be different from the RAC-enabled Oracle home. Versions of cluster manager previous to Oracle Database 10g were sometimes referred to as "Cluster Manager". In Oracle Database 10g, this function is performed by a Oracle Clusterware component known as Cluster Synchronization Services (CSS). The OracleCSService, OracleCRService, and OracleEVMService replace the service known previous to Oracle Database 10g as OracleCMService9i. 8.2 cluvfy or runcluvfy.bat: ---------------------------- Once you have installed Oracle Clusterware, you can use CVU by entering cluvfy commands on the command line. To use CVU before you install Oracle Clusterware, you must run the commands using a command file available on the Oracle Clusterware installation media. Use the following syntax to run a CVU command run from the installation media, where media is the location of the Oracle Clusterware installation media and options is a list of one or more CVU command options: media\clusterware\cluvfy\runcluvfy.bat options You do not have to be the root user to use the CVU and the CVU assumes that the current user is the oracle user. 
The following code example is of a CVU help command, run from a staged copy of the Oracle Clusterware directory downloaded from OTN into a directory called stage on your C: drive: C:\stage\clusterware\cluvfy> runcluvfy.bat comp nodereach -n node1,node2 -verbose For a quick test, you can run the following CVU command that you would normally use after you have completed the basic hardware and software configuration: prompt> media\clusterware\cluvfy\runcluvfy.bat stage –post hwos –n node_list Use the location of your Oracle Clusterware installation media for the media value and a list of the nodes, separated by commas, in your cluster for node_list. Expect to see many errors if you run this command before you or your system administrator complete the cluster pre-installation steps. On Oracle Real Application Clusters systems, each member node of the cluster must have user equivalency for the Administrative privileges account that installs the database. This means that the administrative privileges user account and password must be the same on all nodes. - Checking the Hardware and Operating System Setup with CVU You can use two different CVU commands to check your hardware and operating system configuration. The first is a general check of the configuration, and the second specifically checks for the components required to install Oracle Clusterware. The syntax of the more general CVU command is: cluvfy stage –post hwos –n node_list [-verbose] where node_list is the names of the nodes in your cluster, separated by commas. However, because you have not yet installed Oracle Clusterware, you must execute the CVU command from the installation media using a command like the one following. In this example, the command checks the hardware and operating system of a two-node cluster with nodes named node1 and node2, using a staged copy of the installation media in a directory called stage on the C: drive: C:\stage\clusterware\cluvfy> runcluvfy.bat stage –post hwos –n node1,node2 -verbose You can omit the -verbose keyword if you do not wish to see detailed results listed as CVU performs each individual test. The following example is a command, without the -verbose keyword, to check for the readiness of the cluster for installing Oracle Clusterware: C:\stage\clusterware\cluvfy> runcluvfy.bat comp sys -n node1,node2 -p crs - Checking the Network Setup Enter a command using the following syntax to verify node connectivity between all of the nodes for which your cluster is configured: cluvfy comp nodecon -n node_list [-verbose] - Verifying Cluster Privileges Before running Oracle Universal Installer, from the node where you intend to run the Installer, verify that you have administrative privileges on the other nodes. To do this, enter the following command for each node that is a part of the cluster: net use \\node_name\C$ where node_name is the node name. If your installation will access drives in addition to the C: drive, repeat this command for every node in the cluster, substituting the drive letter for each drive you plan to use. For the installation to be successful, you must use the same user name and password on each node in a cluster or use a domain user name. If you use a domain user name, then log on under a domain with a user name and password to which you have explicitly granted local administrative privileges on all nodes. 
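As an illustration (node2 is just a placeholder for one of your other node names, not a name used
elsewhere in this document), a successful privilege check on the C$ share would look roughly like this:

C:\> net use \\node2\C$
The command completed successfully.

If the command prompts for credentials or returns an access error instead, the account you are using
does not yet have the required administrative privileges on that node.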
To verify your configuration BEFORE installing CRS: [oracle] $ cd /staging_area/clusterware/cluvfy [oracle] $ ./runcluvfy.sh stage -pre crsinst -n docrac1,docrac2 -verbose To verify your configuration AFTER installing CRS: $ ./runcluvfy.sh stage -post crsinst -n docrac1,docrac2 8.3 Shared disk considerations: ------------------------------- Preliminary Shared Disk Preparation Complete the following steps to prepare shared disks for storage: -- Disabling Write Caching You must disable write caching on all disks that will be used to share data between nodes in your cluster. To disable write caching, perform these steps: Click Start, then click Settings, then Control Panel, then Administrative Tools, then Computer Management, then Device Manager, and then Disk drives Expand the Disk drives and double-click the first drive listed Under the Disk Properties tab for the selected drive, uncheck the option that enables the write cache Double-click each of the other drives listed in the Disk drives hive and disable the write cache as described in the previous step Caution: Any disks that you use to store files, including database files, that will be shared between nodes, must have write caching disabled. -- Enabling Automounting for Windows 2003 If you are using Windows 2003, then you must enable disk automounting, depending on the Oracle products you are installing and on other conditions. You must enable automounting when using: Raw partitions for Oracle Real Application Clusters (RAC) Cluster file system for Oracle Real Application Clusters Oracle Clusterware Raw partitions for a single-node database installation Logical drives for Automatic Storage Management (ASM) To enable automounting: Enter the following commands at a command prompt: c:\> diskpart DISKPART> automount enable Automatic mounting of new volumes enabled. Type exit to end the diskpart session Repeat steps 1 and 2 for each node in the cluster. 8.4 Reviewing Storage Options for Oracle Clusterware, Database, and Recovery Files: ----------------------------------------------------------------------------------- This section describes supported options for storing Oracle Clusterware files, Oracle Database software, and database files. -- Overview of Oracle Clusterware Storage Options Note that Oracle Clusterware files include the Oracle Cluster Registry (OCR) and the Oracle Clusterware voting disk. There are two ways to store Oracle Clusterware files: 1. Oracle Cluster File System (OCFS): The cluster file system Oracle provides for the Windows and Linux communities. If you intend to store Oracle Clusterware files on OCFS, then you must ensure that OCFS volume sizes are at least 500 MB each. 2. Raw storage: Raw logical volumes or raw partitions are created and managed by Microsoft Windows disk management tools or by tools provided by third party vendors. Note that you must provide disk space for one mirrored Oracle Cluster Registry (OCR) file, and two mirrored voting disk files. -- Overview of Oracle Database and Recovery File Options There are three ways to store Oracle Database and recovery files on shared disks: 1. Automatic Storage Management (database files only): Automatic Storage Management (ASM) is an integrated, high-performance database file system and disk manager for Oracle files. Because ASM requires an Oracle Database instance, it cannot contain Oracle software, but you can use ASM to manage database and recovery files. 2. 
Oracle Cluster File System (OCFS): Note that if you intend to use OCFS for your database files, then you
   should create partitions large enough for the database files when you create partitions for Oracle
   Clusterware.
   Note: If you want to have a shared Oracle home directory for all nodes, then you must use OCFS.
3. Raw storage: Note that you cannot use raw storage to store Oracle database recovery files.

The storage option that you choose for recovery files can be the same as, or different from, the option
you choose for the database files.

Storage Option                Oracle Clusterware   Database   Recovery area
--------------                ------------------   --------   -------------
Automatic Storage Management  No                   Yes        Yes
Cluster file system (OCFS)    Yes                  Yes        Yes
Shared raw storage            Yes                  Yes        No

-- Checking for Available Shared Storage with CVU

To check for all shared file systems available across all nodes on the cluster, use the following CVU
command:

cluvfy comp ssa -n node_list

Remember to use the full path name and the runcluvfy.bat command on the installation media and include
the list of nodes in your cluster, separated by commas, for the node_list. The following example is for
a system with two nodes, node1 and node2, and the installation media on drive F:

F:\clusterware\cluvfy> runcluvfy.bat comp ssa -n node1,node2

If you want to check the shared accessibility of a specific shared storage type to specific nodes in
your cluster, then use the following command syntax:

cluvfy comp ssa -n node_list -s storageID_list

In the preceding syntax, the variable node_list is the list of nodes you want to check, separated by
commas, and the variable storageID_list is the list of storage device IDs for the storage devices
managed by the file system type that you want to check.

===========================================
9. RAC tools and utilities.
===========================================

9.1: removing and adding a failed node:
=======================================

Suppose, using the above example, that instance rac3 on node oc3 fails. Suppose that you need to repair
the node (e.g. a harddisk crash).

-- Remove the instance:

% srvctl remove instance -d rac -i rac3
Remove instance rac3 for the database rac (y/n)? y

or use DBCA to delete the rac3 instance.

-- Remove the node from the cluster:

# cd /u01/app/oracle/product/10.1.0/CRS10gHome/bin
# ./olsnodes -n
oc1 1
oc2 2
oc3 3
# cd ../install
# ./rootdeletenode.sh oc3,3
# cd ../bin
# ./olsnodes -n
oc1 1
oc2 2
#

Suppose that you have repaired host oc3. We now want to add it back into the cluster. Host oc3 has the
OS newly installed, and its /etc/hosts file is just like it is on the other nodes.

-- Add the node at the clusterware layer:

From oc1 or oc2, go to the $CRS_Home/oui/bin directory, and run

# ./addNode.sh

A graphical screen pops up, and you are able to add oc3 to the cluster. All CRS files are copied to the
new node. To start the services on the new node, you are then prompted to run "rootaddnode.sh" on the
active node and "root.sh" on the new node.

# ./rootaddnode.sh
# ssh oc3
# cd /u01/app/oracle/product/10.1.0/CRS10gHome
# ./root.sh

9.2 CRSCTL, OCRDUMP, OCRCONFIG, OCRCHECK:
=========================================

These are the primary tools to view status and to manipulate the OCR and the Voting Disk.

Example: Checking the status of the Voting Disks:

$ crsctl query css votedisk

[root@node1-pub ~]# crsctl query css votedisk
0. 0 /u02/ocfs2/vote/VDFile_0
1. 0 /u02/ocfs2/vote/VDFile_1
2. 0 /u02/ocfs2/vote/VDFile_2
Located 3 voting disk(s).
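Section 9.2 names OCRCHECK and OCRDUMP but gives no examples for them; as a minimal sketch (the space
figures and device name below are only illustrative, not taken from a real system), you can inspect the
OCR like this:

Example: Checking the integrity and space usage of the OCR:

# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3848
         Available space (kbytes) :     258272
         Device/File Name         : /dev/raw/raw1
                                    Device/File integrity check succeeded
         Cluster registry integrity check succeeded

Example: Dumping the OCR contents to a text file for inspection:

# ocrdump /tmp/ocr_dump.txt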
Example:
You can dynamically add and remove voting disks after installing Oracle RAC. Do this using the following
commands, where path is the fully qualified path for the additional voting disk.
Run the following command as the root user to add a voting disk:

# crsctl add css votedisk /dev/raw/raw9

Example:
Run the following command as the root user to remove a voting disk:

# crsctl delete css votedisk path

Example: Adding an extra OCR file:

# ocrconfig -replace ocr /dev/raw/raw2

Example: show the automatic OCR backups:

# ocrconfig -showbackup

9.3 Some other commandline tools:
=================================

- olsnodes

Example: If you want to list the nodes in the cluster running CRS

$ olsnodes
oranode1
oranode2

- The "crs_" tools:

There are a few other tools that provide information about your cluster. These tools are targeted at
registered applications like the nodeapps or other apps. For example, to list the status of the apps in
the cluster, use crs_stat:

/home/oracle-->$CRS_HOME/bin/crs_stat -t
Name           Type         Target    State     Host
------------------------------------------------------------
ora....SM1.asm application  ONLINE    ONLINE    aix1
ora....x1.lsnr application  ONLINE    ONLINE    aix1
ora....ix1.gsd application  ONLINE    ONLINE    aix1
ora....ix1.ons application  ONLINE    ONLINE    aix1
ora....ix1.vip application  ONLINE    ONLINE    aix1
ora....SM2.asm application  ONLINE    ONLINE    aix2
ora....x2.lsnr application  ONLINE    ONLINE    aix2
ora....ix2.gsd application  ONLINE    ONLINE    aix2
ora....ix2.ons application  ONLINE    ONLINE    aix2
ora....ix2.vip application  ONLINE    ONLINE    aix2
ora....test.db application  ONLINE    ONLINE    aix1
ora....x1.inst application  ONLINE    ONLINE    aix1
ora....x2.inst application  ONLINE    ONLINE    aix2
/home/oracle-->

- Showing all nodes from a node:

# lsnodes -v

# cd /u01/app/oracle/product/10.1.0/CRS10gHome/bin
# ./olsnodes -n
oc1 1
oc2 2
oc3 3

9.4: using srvctl:
==================

The Server Control (SRVCTL) utility is installed on each node by default. You can use SRVCTL to start
and stop the database and instances, manage configuration information, and to move or remove instances
and services. Some SRVCTL operations store configuration information in the OCR. SRVCTL performs other
operations, such as starting and stopping instances, by sending requests to the Oracle Clusterware
process CRSD, which then starts or stops the Oracle Clusterware resources.

srvctl must be run from the $ORACLE_HOME of the RAC you are administering. The basic format of a srvctl
command is

srvctl command target [options]

where command is one of
enable|disable|start|stop|relocate|status|add|remove|modify|getenv|setenv|unsetenv|config
and the target, or object, can be a database, instance, service, ASM instance, or the nodeapps.

1. SRVCTL:
srvctl is the primary tool DBAs use to configure and manipulate instances, RAC databases and processes.

Example 1. Bring up the MYSID1 instance of the MYSID database.

$ srvctl start instance -d MYSID -i MYSID1

Example 2. Stop the MYSID database and all its instances, on all nodes:

$ srvctl stop database -d MYSID
$ srvctl stop database -d MYSID -o immediate

Start the MYSID database and all instances:

$ srvctl start database -d MYSID

Example 3. Stop the orcl3 and orcl4 instances, associated with the orcl database, using the immediate
option.

$ srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate -c "sysback/oracle as sysoper"

Example 4. Stop the nodeapps (listener and other apps) on node1

$ srvctl stop nodeapps -n node1

Example 5.
Check the status of a RAC database:

$ srvctl status database -d orcl

srvctl [options]
command: enable|disable|start|stop|relocate|status|add|remove|modify|getenv|setenv|unsetenv|config
target : database, instance, service, ASM instance, or the nodeapps

srvctl commands will affect the state information in the OCR.

-- Example 1: To view help:

% srvctl -h
% srvctl command -h

-- Example 2: To see the SRVCTL version number, enter

% srvctl -V

-- Example 3. Bring up the MYSID1 instance of the MYSID database.

% srvctl start instance -d MYSID -i MYSID1

-- Example 4. Stop the MYSID database: all its instances and all its services, on all nodes.

% srvctl stop database -d MYSID

The following command mounts all of the non-running instances, using the default connection information:

% srvctl start database -d orcl -o mount

-- Example 5. Stop the nodeapps on the myserver node. NB: Instances and services also stop.

% srvctl stop nodeapps -n myserver

-- Example 6. Add the MYSID3 instance, which runs on the myserver node, to the MYSID clustered database.

% srvctl add instance -d MYSID -i MYSID3 -n myserver

-- Example 7. Add a new node, the mynewserver node, to a cluster.

% srvctl add nodeapps -n mynewserver -o $ORACLE_HOME -A 149.181.201.1/255.255.255.0/eth1

(The -A flag precedes an address specification.)

-- Example 8. To change the VIP (virtual IP) on a RAC node, use the command

% srvctl modify nodeapps -A new_address

-- Example 9. Status of components

. Find out whether the nodeapps on mynewserver are up.

% srvctl status nodeapps -n mynewserver
VIP is running on node: mynewserver
GSD is running on node: mynewserver
Listener is not running on node: mynewserver
ONS daemon is running on node: mynewserver

. Find out whether the ASM instance is running:

% srvctl status asm -n docrac1
ASM instance +ASM1 is running on node docrac1.

. Find the status of a cluster database

% srvctl status database -d EOPP
Instance EOPP1 is running on node dbq0201
Instance EOPP2 is running on node dbq0102

% srvctl config database -d EOPP
dbq0201 EOPP1 /ora/product/10.2.0/db
dbq0102 EOPP2 /ora/product/10.2.0/db

% srvctl config service -d EOPP
opp.et.supp PREF: EOPP1 AVAIL: EOPP2
opp.et.grid PREF: EOPP1 AVAIL: EOPP2

-- Example 10. The following command and output show the expected configuration for a three node
database called ORCL.

% srvctl config database -d ORCL
server01 ORCL1 /u01/app/oracle/product/10.1.0/db_1
server02 ORCL2 /u01/app/oracle/product/10.1.0/db_1
server03 ORCL3 /u01/app/oracle/product/10.1.0/db_1

-- Example 11. Disable the ASM instance on myserver for maintenance.

% srvctl disable asm -n myserver

-- Example 12. Debugging srvctl

Debugging srvctl in 10g couldn't be easier. Simply set the SRVM_TRACE environment variable.

% export SRVM_TRACE=true

-- Example 13. Question (10g RAC)

Q: How do you add a listener to the nodeapps using the srvctl command, or can it be added using srvctl
   at all?
A: Just edit listener.ora on all concerned nodes and add the entries (the usual way). srvctl will
   automatically make use of it. For example

% srvctl start database -d SAMPLE

will start database SAMPLE and its associated listener LSNR_SAMPLE.

-- Example 14. Adding services.

% srvctl add database -d ORCL -o /u01/app/oracle/product/10.1.0/db_1
% srvctl add instance -d ORCL -i ORCL1 -n server01
% srvctl add instance -d ORCL -i ORCL2 -n server02
% srvctl add instance -d ORCL -i ORCL3 -n server03

% srvctl remove instance -d rac -i rac3
% srvctl disable instance -d orcl -i orcl2
% srvctl enable instance -d orcl -i orcl2

-- Example 15.
Administering ASM Instances with SRVCTL in RAC

You can use SRVCTL to add, remove, enable, and disable an ASM instance, as described in the following
procedure:

Use the following to add configuration information about an existing ASM instance:

% srvctl add asm -n node_name -i asm_instance_name -o oracle_home

Use the following to remove an ASM instance:

% srvctl remove asm -n node_name [-i asm_instance_name]

-- Example 16. Stop multiple instances.

The following command provides its own connection information to shut down the two instances orcl3 and
orcl4 using the IMMEDIATE option:

% srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate -c "sysback/oracle as sysoper"

-- Example 17. Showing policies.

Clusterware can automatically start your RAC database when the system restarts. You can use the
Automatic or Manual "policies" to control whether clusterware restarts RAC.

To display the current policy:
% srvctl config database -d database_name -a

To change to another policy:
% srvctl modify database -d database_name -y policy_name

-- Example 18.

% srvctl start service -d DITOB

-- Example 19. Relocate a service from one instance to another

% srvctl relocate service -d ORACLE -s CRM -i RAC04 -t RAC01

-- Example 20.

Suppose you defined the HR service using the following command:

% srvctl add service -d RACDB -s HR -r RAC02,RAC03 -a RAC01

After a while you realize that there is a workload peak on the HR service, and you decide to temporarily
start HR on the RAC04 instance.

% srvctl modify service -d RACDB -s HR -i RAC02,RAC03,RAC04 -a RAC01
% srvctl stop service -d RACDB -s HR
% srvctl start service -d RACDB -s HR

-- Example 21. How to disable the oracle autostart function in RAC?

You should be able to modify the database with:

srvctl modify database -d database_name -y MANUAL

9.6: crsctl:
============

crsctl is the primary tool to manipulate CRS.

Example 1: Where is the Voting Disk (part of CRS) located?

$ crsctl query css votedisk
0. 0 /dev/raw/raw2

Example 2: Do you want to check the health of the Clusterware?

$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

Example 3: start or stop or enable or disable CRS

$ crsctl start crs
$ crsctl stop crs
$ crsctl enable crs
$ crsctl disable crs

Example 4: Checking CRS Status on the local node:

[root@node1]# crsctl check crs
Cluster Synchronization Services appears healthy
Cluster Ready Services appears healthy
Event Manager appears healthy

Example 5: Checking CRS Status in the cluster

[root@node1]# crsctl check cluster
node1-pub ONLINE
node2-pub ONLINE

Use CRSCTL to Control Your Clusterware

Oracle Clusterware enables servers in an Oracle Real Application Clusters database to coordinate
simultaneous workload on the same database files. The crsctl command provides administrators many useful
capabilities. For example, with crsctl, you can check Clusterware health, disable/enable Oracle
Clusterware startup on boot, find information on the voting disk, check the Clusterware version, and
more.

>>> 1. Check the health of the Clusterware

# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

You can also check the status of an individual daemon using the following syntax, where daemon is one of
crsd, cssd, or evmd:

# crsctl check daemon

>>> 2. Do you want to reboot a node for maintenance without Clusterware coming up on boot?
## Disable clusterware on machine2 bootup:
# crsctl disable crs

## Stop the database, then stop the clusterware processes:
# srvctl stop instance -d db -i db2
# crsctl stop crs
# reboot

## Enable clusterware on machine2 bootup:
# crsctl enable crs
# crsctl start crs
# srvctl start instance -d db -i db2

>>> 3. Check the location of the voting disk:

# crsctl query css votedisk
0. 0 /dev/raw/raw2

>>> 4. Do you need to find out what clusterware version is running on a server?

# crsctl query crs softwareversion
CRS software version on node [db2] is [10.2.0.2.0]

9.7: Adding and Removing Voting Disks:
======================================

You can dynamically add and remove voting disks after installing Oracle RAC. Do this using the following commands, where path is the fully qualified path for the additional voting disk.

Run the following command as the root user to add a voting disk:

# crsctl add css votedisk path

Run the following command as the root user to remove a voting disk:

# crsctl delete css votedisk path

9.8 CLUVFY:
===========

The Cluster Verification Utility (CVU) validates an Oracle Clusterware environment or configuration, either before ("pre") or after ("post") a given stage. We found the CVU utility to be very useful for checking a cluster server environment for RAC. The CVU can check shared storage, interconnects, server systems and user permissions. The Universal Installer runs the verification utility at the end of the Clusterware install.

The utility can also be run from the command line with parameters and options to validate components. For example, a script that verifies a cluster using cluvfy is named runcluvfy.sh and is located in the /clusterware/cluvfy directory in the installation area. This script unpacks the utility, sets environment variables and executes the verification command. The following command verifies that the hosts atlanta1, atlanta2 and atlanta3 are ready for a clustered database install of release 2:

./runcluvfy.sh stage -pre dbinst -n atlanta1,atlanta2,atlanta3 -r 10gR2 -osdba dba -verbose

The results of the command above check user and group equivalence across machines, connectivity, interface settings, system requirements like memory, disk space, kernel settings and versions, required Linux package existence, and so on. Any problems are reported as errors; all successful checks are marked as passed. Many other aspects of the cluster can be verified with this utility, for Release 2 or Release 1.

Some more examples:

-- Checking for Available Shared Storage with CVU

To check for all shared file systems available across all nodes on the cluster, use the following CVU command:

% cluvfy comp ssa -n node_list

Remember to use the full path name (use the runcluvfy.bat command when running from the installation media on Windows) and include the list of nodes in your cluster, separated by commas, for the node_list. The following example is for a Windows system with two nodes, node1 and node2, and the installation media on drive F:

% runcluvfy.bat comp ssa -n node1,node2

If you want to check the shared accessibility of a specific shared storage type to specific nodes in your cluster, then use the following command syntax:

% cluvfy comp ssa -n node_list -s storageID_list

In the preceding syntax, the variable node_list is the list of nodes you want to check, separated by commas, and the variable storageID_list is the list of storage device IDs for the storage devices managed by the file system type that you want to check.
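As an illustration of the ssa checks above, here is a small shell sketch (not taken from the Oracle documentation) that runs the shared-storage verification from one node and keeps the output. The node names node1 and node2, the example device /dev/sdc, and the log location are assumptions; substitute your own values.

#!/bin/bash
# Hedged sketch: run the CVU shared-storage checks and keep a timestamped log.
# Assumptions: the nodes are node1 and node2, cluvfy is in the PATH of the
# oracle user, /tmp is writable, and /dev/sdc is only a placeholder device name.

NODES="node1,node2"                                   # comma-separated, no spaces
LOG=/tmp/cluvfy_ssa_$(date +%Y%m%d_%H%M%S).log

{
  echo "=== shared storage discovery on: $NODES ==="
  cluvfy comp ssa -n "$NODES" -verbose

  echo "=== shared accessibility of /dev/sdc ==="
  cluvfy comp ssa -n "$NODES" -s /dev/sdc -verbose
} 2>&1 | tee "$LOG"

echo "cluvfy output saved in $LOG"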
More on Using CLUVFY: --------------------- Using the Cluster Verification Utility to Diagnose Problems: The Cluster Verification Utility (CVU) can assist you in diagnosing a wide variety of configuration problems. This section contains the following topics: -- Enabling Tracing -- Checking the Settings for the Interconnect -- Troubleshooting a Node with Status of UNKNOWN -- Verifying the Existence of Node Applications -- Verifying the Integrity of Oracle Clusterware Components -- Verifying the Integrity of the Oracle Cluster Registry -- Verifying the Integrity of Your Entire Cluster Enabling Tracing: You can enable tracing by setting the environment variable SRVM_TRACE to true. After setting this variable to true, run the command that you want to trace. The CVU trace files are created in the CRS_HOME/cv/log directory. Oracle RAC automatically rotates the log files, and the most recently created log file has the name cvutrace.log.0. You should remove unwanted log files or archive them to reclaim disk space, if needed. The CVU does not generate trace files unless you enable tracing. Checking the Settings for the Interconnect: Cache Fusion enhances the performance of Oracle RAC by utilizing a high-speed interconnect to send data blocks to another instance's buffer cache. The high-speed interconnect should be a private network with the highest bandwidth to maximize performance. For network connectivity verification, the CVU discovers all the available network interfaces if you do not specify an interface on the CVU command line. To verify the accessibility of the cluster nodes from the local node or from any other cluster node, use the component verification command nodereach as follows: $ cluvfy comp nodereach -n node_list [ -srcnode node ] [-verbose] To verify that the other cluster nodes can be reached from the local node through all the available network interfaces or through specific network interfaces, use the component verification command nodecon as follows: $ cluvfy comp nodecon -n node_list [ -i interface_list ] [-verbose] You can also use the nodecon command without the -i option, as shown in the following example: $ cluvfy comp nodecon -n all [-verbose] When you issue the nodecom command as shown in the previous example, it instructs the CVU to perform the following tasks: . Discover all the network interfaces that are available on the cluster nodes. . Review the corresponding IP addresses and subnets for the interfaces. . Obtain the list of interfaces that are suitable for use as VIPs and the list of interfaces to private interconnects. . Verify the connectivity among all the nodes through those interfaces. You can run the nodecon command in verbose mode to identify the mappings between the interfaces, IP addresses, and subnets. To verify the connectivity among the nodes through specific network interfaces, use the comp nodecon command with the -i option. For example, you can verify the connectivity among the nodes docrac1, docrac2, and docrac3, through interface eth0 by running the following command: $ cluvfy comp nodecon -n docrac1, docrac2, docrac3 -i eth0 -verbose Troubleshooting a Node with Status of UNKNOWN: If you run the cluvfy command using the -verbose argument and the CVU responds with UNKNOWN for a particular node, then this is because the CVU cannot determine whether a check passed or failed. The cause of this could be because a node is not reachable, or as a result of any system problem that was occurring on that node at the time that the CVU was performing a check. 
The following is a list of possible causes for an UNKNOWN response:

. The node is down.
. Executable files that the CVU requires are missing in the CRS_home/bin directory or the $ORACLE_HOME/bin directory.
. The user account that ran the CVU does not have privileges to run common operating system executable files on the node.
. The node is missing an operating system patch or required package.
. The kernel parameters on that node were not configured correctly and the CVU cannot obtain the operating system resources required to perform its checks.

Verifying the Existence of Node Applications:

To verify the existence of node applications, namely the virtual IP (VIP), Oracle Notification Services (ONS), and Global Service Daemon (GSD), on all the nodes, use the CVU comp nodeapp command, using the following syntax:

$ cluvfy comp nodeapp [-n node_list] [-verbose]

Verifying the Integrity of Oracle Clusterware Components:

To verify the existence of all the Oracle Clusterware components, use the component verification comp crs command, using the following syntax:

$ cluvfy comp crs [-n node_list] [-verbose]

Verifying the Integrity of the Oracle Cluster Registry:

To verify the integrity of the Oracle Cluster Registry, use the component verification comp ocr command, using the following syntax:

$ cluvfy comp ocr [-n node_list] [-verbose]

Verifying the Integrity of Your Entire Cluster:

To verify that all nodes in the cluster have the same view of the cluster configuration, use the component verification comp clu command, as follows:

$ cluvfy comp clu

-- cluvfy: The Cluster Verification Utility

This tool has a very extended syntax and usage. You can use it to verify your systems before, during, and after installations as a sanity check. Some examples that should give you an impression:

$ ./runcluvfy.sh stage -post hwos -n node1,node2 -verbose
$ ./runcluvfy.sh stage -pre crsinst -n docrac1,docrac2 -verbose

You could easily spend a whole day studying all the ins and outs of this tool.
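To tie the component checks above together, the following shell sketch simply loops over the checks listed in this section. It is only one possible way of scripting it; the node names docrac1 and docrac2 are taken from the examples above, and cluvfy is assumed to be in the PATH.

#!/bin/bash
# Hedged sketch: run a battery of CVU component checks after an install or a change.
# Assumption: the cluster consists of the nodes docrac1 and docrac2.

NODES="docrac1,docrac2"

for comp in nodereach nodecon nodeapp crs ocr; do
  echo "=== cluvfy comp $comp ==="
  cluvfy comp "$comp" -n "$NODES" -verbose
done

# Finally, check that all nodes have the same view of the cluster configuration.
echo "=== cluvfy comp clu ==="
cluvfy comp clu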
9.9 Further checking:
=====================

Use the following command to obtain component names, where module_name is crs, evm, css or the name of the module:

# crsctl lsmodules module_name

For example, viewing the components of the css module might return the following results:

# crsctl lsmodules css
The following are the CSS modules :: CSSD COMMCRS COMMNS

>>> Enabling Debugging of Oracle Clusterware Components:

You can enable debugging for the Oracle Clusterware daemons, Event Manager (EVM), and their modules by running crsctl commands as follows, where component_name is the name of an Oracle Clusterware component for which you want to enable debugging (such as crs, evm, or css), module is the name of the module as it appears in the output of the crsctl lsmodules command, and debugging_level is a number from 1 to 5:

# crsctl debug log component module:debugging_level

For example, to enable tracing for the CSSD module of the css component, you could use the following command:

# crsctl debug log css CSSD:1

To enable extra debugging on the currently running CRS daemons as well as those that will run in the future:

# crsctl debug log crs

>>> Enabling Debugging for an Oracle Clusterware Resource:

You can use crsctl commands to enable resource debugging using the following syntax, where resource_name is the name of an Oracle Clusterware resource, such as ora.docrac1.vip, and debugging_level is a number from 1 to 5:

# crsctl debug log res resource_name:debugging_level

To obtain a list of the resources available for debugging, use the following command:

# crs_stat

By default, Oracle enables traces for DBCA and the Database Upgrade Assistant (DBUA). For the CVU, GSDCTL, SRVCTL, and VIPCA, you can set the SRVM_TRACE environment variable to TRUE to make Oracle generate traces. Oracle writes traces to log files: for example, in Oracle_home/cfgtoollogs/dbca and Oracle_home/cfgtoollogs/dbua for DBCA and DBUA, respectively.

=================================
10: Example tnsnames.ora in RAC
=================================

Example 1:
----------

Let's take an example of a RAC service called SVC with two instances SVC1 and SVC2 running on host1 and host2 (with virtual addresses host1_vip and host2_vip). The client tnsnames would look something like this:

SVC =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = host1_vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = host2_vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = SVC)
      (FAILOVER_MODE=(TYPE=select)(METHOD=basic)(RETRIES=10)(DELAY=1))))

Example 2:
----------

tnsnames.ora file:

RAC =
  (DESCRIPTION =
    (LOAD_BALANCE = ON)
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = linux1_vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = linux2_vip)(PORT = 1521)))
    (CONNECT_DATA = (SERVICE_NAME = RAC)))

RAC1 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = linux1_vip)(PORT = 1521)))
    (CONNECT_DATA = (SERVICE_NAME = TEST)(INSTANCE_NAME = RAC1)))

RAC2 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = linux2_vip)(PORT = 1521)))
    (CONNECT_DATA = (SERVICE_NAME = TEST)(INSTANCE_NAME = RAC2)))

The entries RAC1 and RAC2 are optional.
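If you want to see the client-side load balancing of the RAC alias from Example 2 in action, the sketch below opens a handful of connections and prints which instance each session landed on. This is only an illustration: the scott/tiger account is a placeholder (use any account that can query v$instance), and the alias name must match your own tnsnames.ora.

#!/bin/bash
# Hedged sketch: check name resolution, then watch sessions spread over the instances.
ALIAS=RAC                      # the load-balanced alias defined above

tnsping "$ALIAS"               # verify the alias resolves and a listener answers

for i in 1 2 3 4 5 6; do
  echo "connect attempt $i:"
  sqlplus -S scott/tiger@"$ALIAS" <<'EOF'
set heading off feedback off
select instance_name || ' on ' || host_name from v$instance;
exit
EOF
done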
Example 3: ---------- TEST = (DESCRIPTION = (LOAD_BALANCE = ON) (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = testlinux1)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = testlinux2)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = TEST)))) TEST1 = (DESCRIPTION = (ADDRESS_LIST = (LOAD_BALANCE = ON) (ADDRESS = (PROTOCOL = TCP)(HOST = testlinux1)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = TEST)(INSTANCE_NAME = TEST1))) TEST2 = (DESCRIPTION = (ADDRESS_LIST = (LOAD_BALANCE = ON) (ADDRESS = (PROTOCOL = TCP)(HOST = testlinux2)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = TEST)(INSTANCE_NAME = TEST2))) EXTPROC_CONNECTION_DATA = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))) (CONNECT_DATA = (SID=PLSExtProc)(PRESENTATION = RO))) LISTENERS_TEST = (ADDRESS = (PROTOCOL = TCP)(HOST = testlinux1)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = testlinux2)(PORT = 1521)) Example 4: ---------- TAF (Transparent Application Failover) Transparent Application Failover actually refers to a failover that occurs when a node or instance is unavailable due to an outage or other reason that prohibits a connection to be established on that node. This can be set to on with the following parameter FAILOVER. Setting it to ON will activate the TAF. It is turned on by default unless you set it to OFF to disable it. Now, when you turn it on you have two types of connections available by the means of the FAILOVER_MODE parameter. The type can be session, which is default or select. When the type is SESSION, if the instance fails, then the user is automatically connected to the next available node without the user’s manual intervention. The SQL statements need to be carried out again on the next node. However, when you set the TYPE to SELECT, then if you are connected and are in the middle of your query, then your query will be restarted after you have been failed over to the next available node. Take this example of our tnsnames.ora file, (go to the section beginning with CONNECT_DATA): (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = fokerac.wolga.com) (FAILOVER_MODE = (TYPE = SELECT) (METHOD = BASIC) (RETRIES = 180) (DELAY = 5) ) ) ============================================== 11: Notes about Backup and Restore of RAC ============================================== 11.1 Backing up Voting Disk: --------------------------- Run the following command to backup a voting disk. Perform this operation on every voting disk as needed where 'voting_disk_name' is the name of the active voting disk, and 'backup_file_name' is the name of the file to which you want to backup the voting disk contents: # dd if=voting_disk_name of=backup_file_name When you use the dd command for making backups of the voting disk, the backup can be performed while the Cluster Ready Services (CRS) process is active; you do not need to stop the crsd.bin process before taking a backup of the voting disk. -- Adding and Removing Voting Disks You can dynamically add and remove voting disks after installing Oracle RAC. Do this using the following commands where path is the fully qualified path for the additional voting disk. 
Run the following command as the root user to add a voting disk:

# crsctl add css votedisk path

Run the following command as the root user to remove a voting disk:

# crsctl delete css votedisk path

11.2 Recovering Voting Disk:
----------------------------

Run the following command to recover a voting disk, where 'backup_file_name' is the name of the voting disk backup file, and 'voting_disk_name' is the name of the active voting disk:

# dd if=backup_file_name of=voting_disk_name

11.3 Backup and Recovery OCR:
-----------------------------

Oracle Clusterware includes two important components: the voting disk and the OCR. The voting disk is a file that manages information about node membership, and the OCR is a file that manages cluster and Oracle RAC database configuration information.

Oracle Clusterware automatically creates OCR backups every 4 hours. At any one time, Oracle Clusterware always retains the latest 3 backup copies of the OCR, which are 4 hours old, 1 day old, and 1 week old. These automatic backups are made on only one node of the cluster (in this installation, the second node) and are stored in the CRS_HOME/cdata directory. You cannot customize the backup frequencies or the number of files that Oracle Clusterware retains.

You can use any backup software to copy the automatically generated backup files at least once daily to a different device from where the primary OCR file resides. The default location for generating backups on Red Hat Linux systems is "CRS_home/cdata/cluster_name", where cluster_name is the name of your cluster and CRS_home is the home directory of your Oracle Clusterware installation.

-- Viewing Available OCR Backups

To find the most recent backup of the OCR, on any node in the cluster, use the following command:

# ocrconfig -showbackup

-- Backing Up the OCR

Because of the importance of OCR information, Oracle recommends that you use the ocrconfig tool to make copies of the automatically created backup files at least once a day.

In addition to using the automatically created OCR backup files, you should also export the OCR contents to a file before and after making significant configuration changes, such as adding or deleting nodes from your environment, modifying Oracle Clusterware resources, or creating a database. Exporting the OCR contents to a file lets you restore the OCR if your configuration changes cause errors. For example, if you have unresolvable configuration problems, or if you are unable to restart your cluster database after such changes, then you can restore your configuration by importing the saved OCR content from the valid configuration.

>>> Export the contents of the OCR to a file:
---------------------------------------------

To export the contents of the OCR to a file, use the following command, where backup_file_name is the name of the OCR backup file you want to create:

# ocrconfig -export backup_file_name

>>> Recovering the OCR:
-----------------------

This section describes two methods for recovering the OCR. The first method uses automatically generated OCR file copies, and the second method uses manually created OCR export files.

In the event of a failure, before you attempt to restore the OCR, ensure that the OCR is unavailable. Run the following command to check the status of the OCR:

# ocrcheck

This command performs an integrity check of the OCR. If this command does not display the message 'Device/File integrity check succeeded' for at least one copy of the OCR, then both the primary OCR and the OCR mirror have failed. You must restore the OCR from a backup.
Example: CRS > ocrcheck Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 102508 Used space (kbytes) : 5360 Available space (kbytes) : 97148 ID : 1255672694 Device/File Name : /dev/dba_ocr_disk Device/File integrity check succeeded Device/File not configured Cluster registry integrity check succeeded >>> Restoring the Oracle Cluster Registry from Automatically Generated OCR Backups: ----------------------------------------------------------------------------------- When restoring the OCR from automatically generated backups, you first have to determine which backup file you will use for the recovery. To restore the OCR from an automatically generated backup on a Red Hat Linux system: Identify the available OCR backups using the ocrconfig command: # ocrconfig -showbackup Note: You must be logged in as the root user to run the ocrconfig command. Review the contents of the backup using the following ocrdump command, where file_name is the name of the OCR backup file: # ocrdump -backupfile file_name As the root user, stop Oracle Clusterware on all the nodes in your Oracle RAC cluster by executing the following command: # crsctl stop crs Repeat this command on each node in your Oracle RAC cluster. As the root user, restore the OCR by applying an OCR backup file that you identified in step 1 using the following command, where file_name is the name of the OCR that you want to restore. Make sure that the OCR devices that you specify in the OCR configuration exist, and that these OCR devices are valid before running this command. # ocrconfig -restore file_name As the root user, restart Oracle Clusterware on all the nodes in your cluster by restarting each node, or by running the following command: # crsctl start crs Repeat this command on each node in your Oracle RAC cluster. Use the Cluster Verify Utility (CVU) to verify the OCR integrity. Run the following command, where the -n all argument retrieves a list of all the cluster nodes that are configured as part of your cluster: $ cluvfy comp ocr -n all [-verbose] >>> Recovering the OCR from an OCR Export File: ----------------------------------------------- Using the ocrconfig -export command enables you to restore the OCR using the -import option if your configuration changes cause errors. To restore the previous configuration stored in the OCR from an OCR export file: Place the OCR export file that you created previously with the ocrconfig -export command in an accessible directory on disk. As the root user, stop Oracle Clusterware on all the nodes in your Oracle RAC cluster by executing the following command: # crsctl stop crs Repeat this command on each node in your Oracle RAC cluster. As the root user, restore the OCR data by importing the contents of the OCR export file using the following command, where file_name is the name of the OCR export file: # ocrconfig -import file_name As the root user, restart Oracle Clusterware on all the nodes in your cluster by restarting each node, or by running the following command: # crsctl start crs Repeat this command on each node in your Oracle RAC cluster. >> >Replacing an OCR: --------------------- If you need to change the location of an existing OCR, or change the location of a failed OCR to the location of a working one, you can use the following procedure as long as one OCR file remains online. 
To change the location of an OCR: Use the OCRCHECK utility to verify that a copy of the OCR other than the one you are going to replace is online using the following command: $ ocrcheck Note: The OCR that you are replacing can be either online or offline. Verify that Oracle Clusterware is running on the node on which the you are going to perform the replace operation using the following command: $ crsctl check crs Run the following command to replace the OCR using either destination_file or disk to indicate the target OCR: # ocrconfig -replace ocr destination_file # ocrconfig -replace ocr disk Run the following command to replace an OCR mirror location using either destination_file or disk to indicate the target OCR: # ocrconfig -replace ocrmirror destination_file # ocrconfig -replace ocrmirror disk If any node that is part of your current Oracle RAC environment is shut down, then run the following command on the stopped node to let that node rejoin the cluster after the node is restarted: # ocrconfig -repair You may need to repair an OCR configuration on a particular node if your OCR configuration changes while that node is stopped. For example, you may need to repair the OCR on a node that was not up while you were adding, replacing, or removing an OCR. To repair an OCR configuration, run the following command on the node on which you have stopped the Oracle Clusterware daemon: ocrconfig -repair ocrmirror device_name This operation only changes the OCR configuration on the node from which you run this command. For example, if the OCR mirror device name is /dev/raw1, then use the command syntax ocrconfig -repair ocrmirror /dev/raw1 on this node to repair its OCR configuration. >>> Add or remove Voting Disk: ------------------------------ You can dynamically add and remove voting disks after installing Oracle RAC. Do this using the following commands where path is the fully qualified path for the additional voting disk. Run the following command as the root user to add a voting disk: # crsctl add css votedisk path Run the following command as the root user to remove a voting disk: # crsctl delete css votedisk path Use the CVU to verify the OCR integrity. Run the following command, where the -n all argument retrieves a list of all the cluster nodes that are configured as part of your cluster: $ cluvfy comp ocr -n all [-verbose] =================================== 12. Adding Nodes and Instances: =================================== 12.1 Adding a node: =================== This chapter describes how to add nodes and instances in Oracle Real Application Clusters (Oracle RAC) environments. You can use these methods when configuring a new Oracle RAC cluster, or when scaling up an existing Oracle RAC cluster. This chapter includes the following sections: A: Preparing Access to the New Node B: Extending the Oracle Clusterware Home Directory C: Extending the Oracle Automatic Storage Management Home Directory D: Extending the Oracle RAC Software Home Directory E: Creating a Listener on the New Node F: Adding a New Cluster Instance on the New Node For this chapter, it is very important that you perform each step in the order shown. Suppose we want to add the node docrac3 to the existing cluster of the systems docrac1 en docrac2. >>> A: Preparing Access to the New Node To prepare the new node prior to installing the Oracle software, refer to Chapter 2, "Preparing Your Cluster". It is critical that you follow the configuration steps for the following procedures to work. 
These steps include, but are not limited to the following: Adding the public and private node names for the new node to the /etc/hosts file on the existing nodes, docrac1 and docrac2 Verifying the new node can be accessed (using the ping command) from the existing nodes Running the following command on either docrac1 or docrac2 to verify the new node has been properly configured: # cluvfy stage -pre crsinst -n docrac3 >>> B: Extending the Oracle Clusterware Home Directory Now that the new node has been configured to support Oracle Clusterware, you use Oracle Universal Installer (OUI) to add an Oracle Clusterware home to the node being added to your Oracle RAC cluster. This chapter assumes that you are adding a node named docrac3 and that you have already successfully installed Oracle Clusterware on docrac1 in a nonshared home, where CRS_home represents the successfully installed Oracle Clusterware home. To extend the Oracle Clusterware installation to include the new node: 1. Verify the $ORACLE_HOME environment variable on docrac1 directs you to the successfully installed Oracle Clusterware home on that node. 2. Go to CRS_home/oui/bin and run the addNode.sh script. cd /opt/oracle/crs/oui/bin ./addNode.sh OUI starts and first displays the Welcome window. 3. Click Next. The Specify Cluster Nodes to Add to Installation window appears. 4. Select the node or nodes that you want to add. After selecting docrac3, click Next. 5. Verify the entries that OUI displays on the Summary Page and click Next. 6. Run the rootaddNode.sh script from the CRS_home/install/ directory on docrac1 when prompted to do so. Basically, this script adds the node applications of the new node to the OCR configuration. 7. Run the orainstRoot.sh script on the node docrac3 if OUI prompts you to do so. 8. Run the CRS_home/root.sh script on the node docrac3 to start Oracle Clusterware on the new node. 9. Add the new node's Oracle Notification Services (ONS) configuration information to the shared Oracle Cluster Registry (OCR). Obtain the ONS port identifier used by the new node, which you need to know for the next step, by running the following command from the CRS_home/opmn/conf directory on the docrac1 node: cat ons.config After you locate the ONS port number for the new node, you must make sure that the ONS on docrac1 can communicate with the ONS on the new node, docrac3. 10. From the CRS_home/bin directory on the node docrac1, run the Oracle Notification Services configuration utility as shown in the following example, where remote_port is the port number from step 9, and docrac3 is the name of the node that you are adding: ./racgons add_config docrac3:remote_port At the end of the cloning process, you should have Oracle Clusterware running on the new node. To verify the installation of Oracle Clusterware on the new node, you can run the following command as the root user on the newly configured node, docrac3: CRS_home/bin/cluvfy stage -post crsinst -n docrac3 -verbose >>> C: Extending the Oracle Automatic Storage Management Home Directory To extend an existing Oracle RAC database to a new node, you must configure the shared storage for the new database instances that will be created on new node. You must configure access to the same shared storage that is already used by the existing database instances in the cluster. For example, the sales cluster database in this guide uses Oracle Automatic Storage Management (ASM) for the database shared storage, so you must configure ASM on the node being added to the cluster. 
Because you installed ASM in its own home directory, you must configure an ASM home on the new node using OUI. The procedure for adding an ASM home to the new node is very similar to the procedure you just completed for extending Oracle Clusterware to the new node. To extend the ASM installation to include the new node: 1. Ensure that you have successfully installed the ASM software on at least one node in your cluster environment. To use these procedures as shown, your $ASM_HOME environment variable must identify your successfully installed ASM home directory. 2. Go to the $ASM_HOME/oui/bin directory on docrac1 and run the addNode.sh script. 3. When OUI displays the Node Selection window, select the node to be added (docrac3), then click Next. 4. Verify the entries that OUI displays on the Summary window, then click Next. 5. Run the root.sh script on the new node, docrac3, from the ASM home directory on that node when OUI prompts you to do so. You now have a copy of the ASM software on the new node. >>> D: Extending the Oracle RAC Software Home Directory Now that you have extended the Oracle Clusterware and ASM homes to the new node, you must extend the Oracle Database home on docrac1 to docrac3. The following steps assume that you have already completed the previous tasks described in this chapter, and that docrac3 is already a member node of the cluster to which docrac1 belongs. The procedure for adding an Oracle RAC home to the new node is very similar to the procedure you just completed for extending ASM to the new node. To extend the Oracle RAC installation to include the new node: 1. Ensure that you have successfully installed the Oracle RAC software on at least one node in your cluster environment. To use these procedures as shown, your $ORACLE_HOME environment variable must identify your successfully installed Oracle RAC home directory. 2. Go to the $ORACLE_HOME/oui/bin directory on docrac1 and run the addNode.sh script. 3. When OUI displays the Specify Cluster Nodes to Add to Installation window, select the node to be added (docrac3), then click Next. 4. Verify the entries that OUI displays in the Cluster Node Addition Summary window, then click Next. 5. Run the root.sh script on the new node, docrac3, from the $ORACLE_HOME directory on that node when OUI prompts you to do so. After completing these steps, you should have an installed Oracle RAC home on the new node. >>> E: Creating a Listener on the New Node To service database instance connection requests on the new node, you must create a Listener on that node. Use the Oracle Net Configuration Assistant (NETCA) to create a Listener on the new node. Before beginning this procedure, ensure that your existing nodes have the $ORACLE_HOME environment variable set correctly. To create a new Listener on the new node using Oracle Net Configuration Assistant: 1. Start the Oracle Net Configuration Assistant by entering netca at the system prompt from the $ORACLE_HOME/bin directory. NETCA displays the Welcome window. Click Help on any NETCA window for additional information. 2. Select Listener configuration, and click Next. NETCA displays the Listener Configuration, Listener window. 3. Select Add to create a new Listener, then click Next. NETCA displays the Listener Configuration, Listener Name window. 4. Accept the default value of LISTENER for the Listener name by clicking Next. NETCA displays the Listener Configuration, Select Protocols window. 5. Choose TCP and move it to the Selected Protocols area, then click Next. 
NETCA displays the Listener Configuration, TCP/IP Protocol window. 6. Choose Use the standard port number of 1521, then click Next. NETCA displays the Real Application Clusters window. 7. Select Cluster configuration for the type of configuration to perform, then click Next. NETCA displays the Real Application Clusters, Active Nodes window. 8. Select the name of the node you are adding, for example docrac3, then click Next. NETCA creates a Listener using the configuration information provided. You can now exit NETCA. You should now have a Listener named LISTENER running on the new node. >>> F: Adding a New Cluster Instance on the New Node You can use the Oracle Database Configuration Assistant (DBCA) to add database instances to new nodes. Before beginning this procedure, ensure that your existing nodes have the $ORACLE_HOME environment variable set correctly. To create a new cluster instance on the new node using DBCA: 1. Start DBCA by entering dbca at the system prompt from the $ORACLE_HOME/bin directory. DBCA displays the Welcome window for Oracle RAC. Click Help on any DBCA page for additional information. 2. Select Oracle Real Application Clusters database, and then click Next. DBCA displays the Operations window. 3. Select Instance Management, and then click Next. DBCA displays the Instance Management window. 4. Select Add an Instance, then click Next. DBCA displays the List of Cluster Databases window, which shows the databases and their current status, such as ACTIVE or INACTIVE. 5. In the List of Cluster Databases window, select the active Oracle RAC database to which you want to add an instance, for example sales. Enter the user name and password for the database user that has SYSDBA privileges. Click Next. DBCA will spend a few minutes performing tasks in the background, then it will display the Instance naming and node selection window. 6. In the Instance naming and node selection window, enter the instance name in the field at the top of this window if the default instance name provided by DBCA does not match your existing instance naming scheme. For example, instead of the sales3 instance, you might want to create the sales_03 instance. Click Next to accept the default instance name of sales3. DBCA displays the Instance Storage window. 7. In the Instance Storage window, you have the option of changing the default storage options and file locations for the new database instance. In this example, you accept all the default values and click Finish. DBCA displays the Summary window. 8. Review the information in the Summary window, then click OK to start the database instance addition operation. DBCA displays a progress dialog box showing DBCA performing the instance addition operation. 9. During the instance addition operation, if you are using ASM for your cluster database storage, DBCA detects the need for a new ASM instance on the new node. When DBCA displays a dialog box, asking if you want to ASM to be extended, click Yes. After DBCA extends ASM on the new node and completes the instance addition operation, DBCA displays a dialog box asking whether or not you want to perform another operation. Click No to exit DBCA. You should now have a new cluster database instance and ASM instance running on the new node. 
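Before moving on, it can be useful to confirm with srvctl that everything you just created is actually running. Here is a minimal sketch using the sales database and the docrac3 node from this chapter; adjust the names for your environment and run it as the Oracle software owner.

#!/bin/bash
# Hedged sketch: post-addition sanity check for the new node.
# "sales" and "docrac3" are the database and node names used in this chapter.
DB=sales
NODE=docrac3

echo "--- instances of $DB ---"
srvctl status database -d "$DB"

echo "--- ASM on $NODE ---"
srvctl status asm -n "$NODE"

echo "--- nodeapps (VIP, GSD, ONS, listener) on $NODE ---"
srvctl status nodeapps -n "$NODE"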
After you terminate your DBCA session, you should run the following command to verify the administrative privileges on the new node and obtain detailed information about these privileges: # CRS_home/bin/cluvfy comp admprv -o db_config -d oracle_home -n docrac3 -verbose ============================== 13: Starting and Stopping RAC: ============================== 13.1 Stopping the Cluster: ========================== Before you shut down any processes that are monitored by Enterprise Manager Grid Control, set a blackout in Grid Control for the processes that you intend to shut down. This is necessary so that the availability records for these processes indicate that the shutdown was planned downtime, rather than an unplanned system outage. Shut down all Oracle RAC instances on all nodes. To shut down all Oracle RAC instances for a database, enter the following command, where db_name is the name of the database: $ oracle_home/bin/srvctl stop database -d db_name Shut down all ASM instances on all nodes. To shut down an ASM instance, enter the following command, where node is the name of the node where the ASM instance is running: $ oracle_home/bin/srvctl stop asm -n node Stop all node applications on all nodes. To stop node applications running on a node, enter the following command, where node is the name of the node where the applications are running $ oracle_home/bin/srvctl stop nodeapps -n node Log in as the root user, and shut down the Oracle Clusterware or CRS process by entering the following command on all nodes: # CRS_home/bin/crsctl stop crs # as root 13.2 Starting the Cluster: ========================== # CRS_home/bin/crsctl start crs # as root $ oracle_home/bin/srvctl start nodeapps -n node $ oracle_home/bin/srvctl start asm -n node $ oracle_home/bin/srvctl start database -d db_name # will start all instances of the Database Example of stopping or starting Services on a RAC node: # shutdown services on a RAC node sudo -u oracle /opt/oracle/product/10.2.0/crs/bin/srvctl stop instance -d p01cfd -i pl01cfd2 -o immediate sudo -u oracle /opt/oracle/product/10.2.0/crs/bin/srvctl stop asm -n t-prod-oranode-11 sudo -u oracle /opt/oracle/product/10.2.0/crs/bin/srvctl stop nodeapps -n t-prod-oranode-11 # startup services on a RAC node sudo -u oracle /opt/oracle/product/10.2.0/crs/bin/srvctl start nodeapps -n t-prod-oranode-11 sudo -u oracle /opt/oracle/product/10.2.0/crs/bin/srvctl start asm -n t-prod-oranode-11 sudo -u oracle /opt/oracle/product/10.2.0/crs/bin/srvctl start instance -d p01cfd -i pl01cfd2 ==================================== 14: Other Noticable items in 10g RAC ==================================== 14.1 SPFILE: ============ If an initialization parameter applies to all instances, use *. notation, otherwise prefix the parameter with the name of the instance. For example: *.OPEN_CURSORS=500 prod1.OPEN_CURSORS=1000 Assume that you start an instance with an SPFILE containing the following entries: *.OPEN_CURSORS=500 prod1.OPEN_CURSORS=1000 For the instance with the Oracle system identifier (sid) prod1, the OPEN_CURSORS parameter remains set to 1000 even though it has a database-wide setting of 500. The instance-specific parameter setting in the parameter file for an instance prevents database-wide alterations of the setting. This gives you control over parameter settings for instance prod1. These two types of settings can appear in any order in the parameter file. 
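As a side note, the *.OPEN_CURSORS and prod1.OPEN_CURSORS entries shown above are normally created with ALTER SYSTEM ... SCOPE=SPFILE rather than by editing the server parameter file (an SPFILE is binary and should never be edited by hand). A hedged sketch, connecting locally as SYSDBA and reusing the values from this example:

#!/bin/bash
# Hedged sketch: create the database-wide and the instance-specific setting,
# then show what is recorded in the SPFILE per sid.
sqlplus -S / as sysdba <<'EOF'
-- database-wide default
ALTER SYSTEM SET open_cursors=500 SID='*' SCOPE=SPFILE;
-- override for instance prod1 only
ALTER SYSTEM SET open_cursors=1000 SID='prod1' SCOPE=SPFILE;

-- what is now stored in the SPFILE, per sid
SET LINESIZE 100
COLUMN sid   FORMAT a8
COLUMN name  FORMAT a20
COLUMN value FORMAT a15
SELECT sid, name, value FROM v$spparameter WHERE name = 'open_cursors';
EXIT
EOF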
If another DBA runs the following statement, then Oracle updates the setting on all instances except the instance with sid prod1: ALTER SYSTEM SET OPEN_CURSORS=1500 sid='*' SCOPE=MEMORY; In the example instance with sid prod1, the parameter begins accepting ALTER SYSTEM values set by other instances if you change the parameter setting by running the following statement: ALTER SYSTEM RESET OPEN_CURSORS SCOPE=MEMORY sid='prod1'; Then if you execute the following statement on another instance, the instance with sid prod1 also assumes the new setting of 2000: ALTER SYSTEM SET OPEN_CURSORS=2000 sid='*' SCOPE=MEMORY; 14.2 Rolling upgrades, opatch : =============================== >>> Patch: Oracle issues product fixes for its software called patches. When you apply the patch to your Oracle software installation, a small collection of files are replaced to fix certain bugs. OPatch is an Oracle supplied utility that facilitates Oracle software patching. The opatch binary file is located in the $ORACLE_HOME/OPatch directory. You can either specify this path when executing OPatch, or you can update the PATH environment variable to include the OPatch directory. For example, on RedHat Linux systems you would use a shell command similar to the following: $ export PATH=$PATH:/opt/oracle/10gR2/db_1/OPatch >>> Patchset: A group of patches form a patch set. When you apply a patch set, many different files and utilities are modified. This results in a version change for your Oracle software, for example, from Oracle Database 10.2.0.1.0 to Oracle Database 10.2.0.2.0. To apply a patch set you use the Oracle Universal Installer (OUI). >>> RAC Rolling Upgrade "One-off" patches or interim patches to database software are usually applied to implement known fixes for software problems an installation has encountered or to apply diagnostic patches to gather information regarding a problem. Such patch application is often carried out during a scheduled maintenance outage. Oracle now provides the capability to do rolling patch upgrades with Real Application Clusters with little or no database downtime. The tool used to achieve this is the "opatch" command-line utility. The advantage of a RAC rolling upgrade is that it enables at least some instances of the RAC installation to be available during the scheduled outage required for patch upgrades. Only the RAC instance that is currently being patched needs to be brought down. The other instances can continue to remain available. This means that the impact on the application downtime required for such scheduled outages is further minimized. Oracle's opatch utility enables the user to apply the patch successively to the different instances of the RAC installation. Rolling upgrade is available only for patches that have been certified by Oracle to be eligible for rolling upgrades. Typically, patches that can be installed in a rolling upgrade include: -Patches that do not affect the contents of the database such as the data dictionary -Patches not related to RAC internode communication -Patches related to client-side tools such as SQL*PLUS, Oracle utilities, development libraries, and Oracle Net -Patches that do not change shared database resources such as datafile headers, control files, and common header definitions of kernel modules Rolling upgrade of patches is currently available for one-off patches only. It is not available for patch sets. Rolling patch upgrades are not available for deployments where the Oracle Database software is shared across the different nodes. 
This is the case where the Oracle home is on Cluster File System (CFS) or on shared volumes provided by file servers or NFS-mounted drives. The feature is only available where each node has its own copy of the Oracle Database software. The opatch utility applies a patch successively to nodes of the RAC cluster. The nature of the patch enables a RAC installation to run in a mixed environment. Different instances of the database may be operating at the same time, and the patch may have been applied to some instances and not others. The opatch utility automatically detects the nodes of the cluster on which a specific RAC deployment has been implemented. The patch is applied to each node, one at a time. For each node, the DBA is prompted to shut down the instance. The patch is applied to the database software install on that node. After the current node has been patched, the instance can be restarted. After the patch is applied on the current node, the DBA is allowed to choose the next RAC node to apply the patch to. The cycle of instance shutdown, patch application, and instance startup is repeated. Thus, at any time during the patch application, only one node needs to be down. To check if a patch is a rolling patch, execute the following on UNIX platforms: $ opatch query -is_rolling You can check to see whether a patch has been marked rolling upgradeable by running "$ opatch query –is_rolling" or by checking the online_rac_installable flag for a value of TRUE within "etc/config/inventory" directory. Usually, the patch README file should detail whether the patch is rolling upgradeable, but when in doubt, you can use one of the aforementioned methods. Patches can be rolled back with the opatch utility. This enables the DBA to remove a troublesome patch or a patch that is no longer required. This can be done as a rolling procedure. To roll back a patch across all nodes of a RAC cluster, execute the following command: $ opatch rollback -id patch_id -ph patch_location To roll back a patch on the local node only, enter the following command: $ opatch rollback -local -id patch_id -ph patch_location To check the results of a patch rollback, check the logs in the following location: $ORACLE_HOME/.patch_storage/patch_id/patch_id_RollBack_timestamp.log 14.3 Redo Thread in RAC: ======================== Example: -------- Suppose you have one RAC database, and two instances: RAC01 and RAC02. Right now, you have 3 enabled threads of 2 group each: - thread 3 consisting of log groups 5 and 6 is currently being used by instance RAC01 - thread 2 consisting of log groups 3 and 4 is currently being used by instance RAC02 - thread 1 consisting of log groups 1 and 2 is not being used Now we want RAC01 to use thread 1 with log groups 1 and 2: SQL> alter system set thread=1 scope=spfile sid='RAC01'; ============================================ 15. TROUBLESHOOTING RAC ============================================ Note 1: VIP on wrong Interface: =============================== I was installing RAC, and during the clusterware install I picked up the wrong interfaces for public and private. I had 10.x.x.x for the public IP on eth0 and 192.x.x.x for the private IP (interconnect) on eth1. I also had 10.x.x.x for the VIP. During the install I choose eth1 to be the public interface. Right after the install I lost connection to the machine via the 10.x.x.x IP. What had happened was I had a 10.x.x.x IP on both eth0 and eth1, which was messing up the routing. The solution? Simply modify the VIP in the cluster configuration. 
There's actually a Metalink article about this. Here are the essential commands:

srvctl stop nodeapps -n NODE1
srvctl stop nodeapps -n NODE2
srvctl modify nodeapps -n NODE1 -A 10.5.5.101/255.255.255.0/eth0
srvctl modify nodeapps -n NODE2 -A 10.5.5.102/255.255.255.0/eth0
srvctl start nodeapps -n NODE1
srvctl start nodeapps -n NODE2

They worked just fine. So if you ever mess up the interfaces, this is how you fix it.

If you need to change the private interface, then you need to use oifcfg. To verify your current settings, use:

oifcfg getif
eth0 10.5.5.0 global public
eth1 192.168.0.0 global cluster_interconnect

And use delif/addif to remove and re-create your private interface.

Note 2: Troubleshooting and debugging:
======================================

You can enable debugging for the CRS, OCR, CSS, and EVM modules and their components by setting environment variables or by issuing crsctl debug commands using the following syntax:

# crsctl debug log module_name component:debugging_level

Run the following command to obtain component names, where module_name is the name of the module (crs, evm, or css):

# crsctl lsmodules module_name

You must issue the crsctl debug command as the root user, and supply the following information:

- module_name: the name of the module: CRS, EVM, or CSS.
- component: the name of a component for the CRS, OCR, EVM, or CSS module (see the Oracle Clusterware documentation for a list of all of the components).
- debugging_level: a number from 1 to 5 to indicate the level of detail you want the debug command to return, where 1 is the least amount of debugging output and 5 provides the most detailed debugging output.

For example, to enable tracing for the CSSD module of the css component, you could use the following command:

# crsctl debug log css CSSD:1

Other examples:

# crsctl debug log crs "CRSRTI:1,CRSCOMM:2"
# crsctl debug log evm "EVMCOMM:1"
# crsctl debug log crs "CRSRTI:1,CRSCOMM:2,OCRSRV:4"

You can use crsctl commands to enable resource debugging using the following syntax:

# crsctl debug log res "ora.node1.vip:1"

This has the effect of setting the environment variable USER_ORA_DEBUG to 1 before running the start, stop, or check action scripts for the ora.node1.vip resource.

# crsctl debug statedump evm   - dumps state info for evm objects
# crsctl debug statedump crs   - dumps state info for crs objects
# crsctl debug statedump css   - dumps state info for css objects

To enable extra debugging on the currently running CRS daemons as well as those that will run in the future, use the following command. Debugging information remains in the Oracle Cluster Registry (OCR) for use during the next startup.

# crsctl debug log crs

-- Running the Oracle Clusterware Diagnostics Collection Script

Run the diagcollection.pl script as the root user to collect diagnostic information from an Oracle Clusterware installation. The diagnostics provide additional information so that Oracle Support Services can resolve problems. Run this script from the operating system prompt as follows, where CRS_home is the home directory of your Oracle Clusterware installation:

# CRS_home/bin/diagcollection.pl --collect

This script collects diagnostic information from the Cluster Synchronization Services (CSS), Event Manager (EVM), and Cluster Ready Services (CRS) daemons.

Debugging srvctl in 10g couldn't be easier. Simply set the SRVM_TRACE environment variable.

% export SRVM_TRACE=true

============================================
16. Most important wait events in 10gR2 RAC:
============================================