Those of us that have dealt with RAC environments for a while are familiar with the behavior of Oracle Services in an Oracle Cluster. Services are an essential component for managing workload in a RAC environment. If you’re not defining any non-default services in your RAC database, you’re making a mistake. To learn more about services, I strongly recommend reading the definitive whitepaper by Jeremy Schneider on the topic.
In an Oracle RAC cluster, services can be started, stopped, and relocated from one instance to another. However, if you have multiple services for your database, then it becomes difficult to start them at a cold start. Due to dependencies in Oracle Clusterware, services are stopped when all instances of a database are stopped. However, when you start instances, services are left OFFLINE and you’re forced to start them manually. If you have just one additional service (first of all, you should probably use more than one unless you have a very small user community), you can just start the service and that will start the instances necessary to bring the service up via dependencies in Oracle Clusterware. If you have multiple services, though, you still have to start the rest of them manually.
There is a solution to this problem that uses the normal Oracle Clusterware operations to automatically start any services for the instance that was started on the local node automatically. This is done with a small perl script that is run as an Oracle Clusterware cluster callout. Cluster callouts are scripts or programs that Oracle Clusterware will run whenever a cluster “event” (like starting or stopping an instance) occurs. It calls these programs with many command-line arguments that can be parsed to determine what the event is about and why it was fired. The program can use these arguments to filter out the events that aren’t interesting to it and act on the events that it wishes to act upon. You create a new “callout” by placing an executable (script or binary) in the $ORA_CRS_HOME/racg/usrco
directory. Each executable file found in that directory will be executed for each event that happens locally.
In our case, we want to look for instance startup events and then query Oracle Clusterware to determine which services should be started and start them. A script to do exactly that is provided by the Oracle RAC and Clusterware team on their sample code page titled “Start Services on Instance Up“. There you will find a small zip archive that has a couple of perl scripts and a short readme file explaining how to install and use them. In my case, I had to modify the script to handle some hostname parsing. In particular, I had to modify line 54 in startMatchedSRVonUp.pl to read like this “$myHost = (split(/\./,$_))[0];
“. That is because my “/bin/hostname” command returns the fully-qualified domain name (node1.dannorris.net) instead of just the short hostname (node1) as the script expects by default. So, a small change there may help you too. It logs by default in the /tmp directory and doesn’t delete its log, so you should be able to easily debug it using that information.
Coincidentally, I had occasion to install this script at a new customer implementation of Oracle RAC recently and it also came up in my session at Hotsos Symposium 09 earlier this week as well. If you’ve found a better way to handle this issue, please fire off a comment below for the good of all mankind.
Thank you for your idea.
I understood little of a service.
After I knew a service helpful about performance management. I try to learn and test about it.
Thank You again for stuff from your session at Hotsos and the definitive whitepaper by Jeremy Schneider.
surachart: Thanks for stopping by. There's also a video highlight reel of Hotsos sessions by Marco Gralike on YouTube at http://is.gd/n5Cp if you'd like to see more.
So this script would run when the event “instance_startup” happens?
Could you possibly do an instance startup trigger that would call DBMS_SERVICES?
Not saying it is a better way, but just another way : )
wow! thank you for your stuff š
The script runs for any cluster event. The only ones it cares about are instance startup events. All other events are discarded and the script exits unless it sees INSTANCE as the first argument to the script (which is how the callouts pass the args).
I had to test your idea as I hadn't tried it yet. Unfortunately, DBMS_SERVICE doesn't actually change Oracle Clusterware's status. So, I don't think it'd be a complete solution. I also like the thought of placing one generic script in Oracle Clusterware and avoid maintaining a database startup trigger in each of the databases I have in the cluster (thinking of the case where there are 10+ databases in the RAC cluster).
Good idea–I didn't know until I tried that a database startup trigger fires each time a new instance opens the database. So, a two-node cluster would fire the trigger twice (once in each instance). That could mess things up if you didn't know about it ahead of time and could be a problem for those environments that are counting on having it fire exactly once at startup time. Glad you got me thinking enough to test that out!
So if you do an lsnrctl status, do you see the service? It just may not show up in crs, but maybe there is something srvctl could do to “probe”. I will have to play with it also.
I think it would (I didn't check and already shut down my cluster for the night). Not sure if CRS would ever determine that it is there, though. I doubt it.
Yes, since it is based on what is in the OCR. You could do a external call to srvctl though : )… I wonder if this has changed in 11g.
Is your RAC cluster 10g or 11g? Maybe some beta tester could check who is doing 11gR2 :)…. hmmm
My cluster is on 11gR1. Agreed, you could do an external callout from the DB to srvctl, but that seems like a lot of work when you ultimately do just the same thing that the clusterware callout script does :).
This got me to thinking. The MAA guide does not specify any mount options when adding a standby using srvctl:
http://www.oracle.com/technology/deploy/availab…
I am thinking that there is a way to have CRS put the standby in managed recovery mode, but not finding it easy to find in the documentation. Why is it not made more clear, more obvious? Especially if Oracle is touting this thing called “Best Practices” for “Maximum Availability”? =)
Hi Charles,
I haven't seen anything like that in the docs either. I don't think there's any way for CRS to handle it as the “agent” for RAC instances isn't extensible as far as I know. So, to offer this functionality, Clusterware development would have to get involved to support it (at least that's how I see it–not knowing how things are organized internally there).
Your question is a good one and I pinged a few friends that might have some more thoughts. If they don't post them here (assuming I get responses at all), I'll summarize and post results.
I got some feedback from one of my contacts (suffice it to say this is authoritative information). Basically, they said that they don't recommend using CRS to put the DB in standby mode. There is a “-r physical_standby” mode that was created, but not used, so avoid it. Instead, DG Broker is the recommended way to manage RAC standbys and it is documented too: http://download.oracle.com/docs/cd/B19306_01/se…
Hope that helps.
Thanks, I appreciate that. So this really begs the question; “why do they recommend against having CRS put the standby in standby mode?” I am asking from the angle that I have more to learn, and I would like to understand the ramifications of this action.
The DG Broker is another beast. I have heard so many good things about it that I am recommending to our group that we start adopting it. We tried it out in 9i and were burned badly (too complicated, too esoteric, not very friendly).
PS – In light of this post, I am passing out Jeremy's whitepaper like candy to the SunGard crowd, because currently they do not support Services which makes my job as a DBA a little more difficult than it needs to be. Funny how the support staff I am talking has no clue what Oracle Services even are. =) Ah well….
I think it isn't recommended because there's no reliable way to do it. The -r physical_standby option wasn't really developed (apparently), so all CRS knows how to do is “startup open” and nothing else.
In 10g R2, I had good experiences with DG Broker–not sure about versions before then, but I can guess it may not have been so good back in V1 or V2 (DG was introduced in 9i, so that's basically V1 in my eyes).
Hi.
rac: oracle $ srvctl config db -d DB -a
…
DB_ROLE: PHYSICAL_STANDBY
START_OPTIONS: MOUNT
…
So You need to change startup option to MOUNT to not OPEN DB
there were some bugs that CRS ignored startup options and just opened(read only) DB
but they fixed in 10.2.0.4.
To start MRP user collout may be used,
but it will be much better if it was coded at CRS level
Denix
Hi Dan,
I was the guy who asked about the autostart during your excellent hotsos symposium presentation. Only now I saw that you also this blog post. I also found the OTN sample scripts in the meantime and blogged about it, but you were faster… Thanks again.
Martin
Hi Martin,
I'm glad you posted as I was thinking of you specifically when I wrote this. Since I didn't get your contact information, I'd hoped you would be watching :). Thanks for the question and getting me thinking…
This is done with a small perl script that is run as an Oracle Clusterware cluster callout. Cluster callouts are scripts or programs that Oracle Clusterware will run whenever a cluster āeventā (like starting or stopping an instance) occurs. It calls these programs with many command-line arguments that can be parsed to determine what the event is about and why it was fired. The program can use these arguments to filter out the events that arenāt interesting to it and act on the events that it wishes to act upon.
Find more jobs: http://www.staffingpower.com/