Tuesday, October 23, 2007

The AppsDBA Blog Has Moved!

The AppsDBA Blog has moved! You can find the blog here.

Wednesday, January 03, 2007

Throttling RMAN

Oracle’s Recovery Manager product has been around since Oracle 8.0 and has always included limits on channel I/O rates. Prior to 9.0.1 the parameter was called READRATE and was an option on the SET LIMIT CHANNEL command. In 9.0.1 Oracle introduced the RATE option as part of the ALLOCATE CHANNEL command. The problem, however, was that it was difficult to determine exactly what effect the throttling had.

Typically the desire to throttle RMAN stems from not wanting a backup to consume all of the database server’s I/O bandwidth; the idea is generally to perform the backup in the background. In 10g Release 1 Oracle made this a little easier. I actually stumbled across this only recently. Like a lot of additions Oracle makes, this one goes pretty much unnoticed until you need it. Then it seems pretty cool, since it makes the job of quantifying I/O rates a little easier.

The following shows example output from a throttled RMAN backup. It appears that if no throttling takes place, no messages are produced; I believe this includes setting the RATE parameter so high that it has no effect.

RMAN> 2> 3> 4> 5> 6> 7> 8> 9> 10> 11> 12> 13> 14> 15> 16> 17> 18>

old RMAN configuration parameters:

CONFIGURE CHANNEL DEVICE TYPE DISK MAXPIECESIZE 2000000 K FORMAT '/u01/backups/DBSID/DBSID.bkup.%U';

new RMAN configuration parameters:

CONFIGURE CHANNEL DEVICE TYPE DISK MAXPIECESIZE 2000000 K FORMAT '/u01/backups/DBSID/DBSID.bkup.%U' RATE 2000000;

new RMAN configuration parameters are successfully stored

channel ORA_DISK_1: backup set complete, elapsed time: 00:12:26

channel ORA_DISK_1: throttle time: 0:09:01

channel ORA_DISK_2: finished piece 1 at 08-DEC-06

channel ORA_DISK_2: backup set complete, elapsed time: 00:12:26

channel ORA_DISK_2: throttle time: 0:10:39

Finished backup at 08-DEC-06
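For reference, here is a minimal sketch of a throttled backup along the lines of the output above. The ORACLE_SID and the connection are hypothetical; the MAXPIECESIZE, FORMAT and RATE values are taken directly from the configuration shown in the output.

# Sketch: configure throttled disk channels, then run the backup
export ORACLE_SID=DBSID
rman target / <<'EOF'
CONFIGURE CHANNEL DEVICE TYPE DISK MAXPIECESIZE 2000000 K FORMAT '/u01/backups/DBSID/DBSID.bkup.%U' RATE 2000000;
BACKUP DATABASE;
EOF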

Sunday, December 03, 2006

Patching OEM Grid Control Release 2

I recently had the pleasure of patching two separate OEM Grid Control Release 2 installations and wanted to pass on what I found. It was not at all obvious how to apply the current Critical Patch Updates, since Grid Control is a hodgepodge of different versions in different ORACLE_HOMEs. The following details what I believe are the proper patches and the order in which to apply them.

The base release of Grid Control Release 2 is 10.2.0.1.0, and the installation creates three separate ORACLE_HOMEs: one for the database repository (db10g), one for the management server (oms10g) and one for the agent (agent10g).

The bundled Oracle products are version 10.1.0.2.0 of the database and version 10.1.2.0.2 of the 10g Application Server. The following lists the steps and patches needed to bring Grid Control Release 2 up to the October 2006 CPU level:

1) Install Grid Control Release 2, version 10.2.0.1.0, first. This creates the three ORACLE_HOMEs, named by default db10g for the database repository, oms10g for the management server and agent10g for the agent.

2) The next step is to patch all three ORACLE_HOMEs to Grid Control Release 10.2.0.2.0. This is done with patch 3731593.

3) Apply the 10.1.0.5.0 database patch set to the db10g and oms10g ORACLE_HOMEs.

4) Now the Critical Patch Update for database version 10.1.0.5.0 can be applied to the db10g ORACLE_HOME. This is patch 5490845.

5) Next the Critical Patch Update for Oracle Application Server 10.1.2.0.2 should be applied to the oms10g ORACLE_HOME. The patch number is 5483346.

6) No Critical Patch Updates need to be applied to the agent10g ORACLE_HOME. When the Grid Control patch was run it updated the OEM agent to 10.2.0.2.0, and there are no CPU patches for this version yet.
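For those who prefer a checklist, the CPU steps above boil down to running OPatch out of each ORACLE_HOME in turn. Here is a minimal sketch, assuming the patches have been unzipped under a hypothetical /stage directory and the homes live under a hypothetical /u01/app/oracle/OracleHomes; always follow each patch README for the exact pre- and post-install steps.

# Step 4: October 2006 CPU (5490845) against the 10.1.0.5.0 repository database
export ORACLE_HOME=/u01/app/oracle/OracleHomes/db10g
cd /stage/5490845
$ORACLE_HOME/OPatch/opatch apply

# Step 5: October 2006 CPU (5483346) against Application Server 10.1.2.0.2
export ORACLE_HOME=/u01/app/oracle/OracleHomes/oms10g
cd /stage/5483346
$ORACLE_HOME/OPatch/opatch apply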

Friday, December 01, 2006

oratab, dbstart & dbshut

The oratab, dbstart and dbshut files don’t get much attention, and to the best of my knowledge Oracle hasn’t changed them in a long time. But from what I can tell, somewhere in 10g Release 1 Oracle added the ability to start and stop a listener in dbstart and dbshut.

Oracle also added ASM support in the base release of 10g Release 1, and in the 10.2.0.2.0 patch set for 10g Release 2 they further enhanced that support by recognizing an additional “W” flag in the third field for database entries in the oratab file, with matching handling in dbstart and dbshut.
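For reference, here is a hypothetical oratab excerpt showing the three flag values; the SIDs and paths are made up.

# Format: $ORACLE_SID:$ORACLE_HOME:<Y|N|W>
+ASM:/u01/app/oracle/product/10.2.0/db_1:N   # ASM itself auto-started by CRS
PROD:/u01/app/oracle/product/10.2.0/db_1:W   # wait for ASM, then start via dbstart
TEST:/u01/app/oracle/product/10.2.0/db_1:Y   # start normally via dbstart
DEV:/u01/app/oracle/product/10.2.0/db_1:N    # never started by dbstart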

The support for listener startup and shutdown is pretty crude. Both dbstart and dbshut accept an input parameter designating the listener ORACLE_HOME. This parameter is used to set an $ORACLE_HOME_LISTNER variable (note that is not a typo; that is how it is spelled in the script), which is used to determine if there is a tnslsnr executable in the $ORACLE_HOME_LISTNER/bin directory. If so, the version is checked to ensure that it is a “Version 10 listener”. If all of this succeeds, a $ORACLE_HOME_LISTNER/bin/lsnrctl start command is issued. This is of course better than nothing, but probably doesn’t help sites that run multiple listeners or earlier versions.
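Paraphrased in shell, the logic looks something like this. This is a sketch of the behavior just described, not the literal dbstart source.

ORACLE_HOME_LISTNER=$1                       # note Oracle's spelling
if [ -x "$ORACLE_HOME_LISTNER/bin/tnslsnr" ]; then
    # dbstart also verifies this is a "Version 10 listener" before starting it
    "$ORACLE_HOME_LISTNER/bin/lsnrctl" start
fi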

I think the more interesting developments are the ASM support and the changes to the oratab file. I was initially hopeful that the “W” flag would help solve some of the startup problems with ASM-dependent databases. Unfortunately it is only meant for single instance databases that depend on “an ASM server that is auto-started by CRS”. It doesn’t help the site that is running ASM-dependent databases and an ASM instance on a single node. In fact, I have still found no good way to differentiate ASM-dependent databases from non-dependent ones, if, for example, you ran multiple databases on a single node and not all of them used ASM features.

Oracle has had code in dbstart since 10g Release 1 to “delay” startup until the ASM instance is up before starting the other databases listed in the oratab file. Now in 10.2.0.2.0 they have added code that checks an ASM server to see whether it is up before attempting to start any ASM-dependent databases with a “W” flag in their third field. The code uses a call to the srvctl utility to determine the ASM status and waits up to 5 minutes for the ASM service to start.
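The wait amounts to something like the following sketch, again paraphrased from the behavior rather than taken from the literal script; the node name is hypothetical.

# Poll ASM via srvctl for up to 5 minutes (30 tries x 10 seconds)
for try in $(seq 1 30); do
    if srvctl status asm -n mynode | grep -q "is running"; then
        break                                # ASM is up; start the W-flagged databases
    fi
    sleep 10
done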

One of the byproducts of all of this is that databases with a “W” flag need to be treated as if they had a “Y” flag during environment setup. This really means that a utility needs to set the ORACLE_SID environment variable. Utilities like “orasetup” treat database entries with an “N” flag as “remote” databases and therefore set TWO_TASK rather than ORACLE_SID. This has all sorts of advantages over just setting ORACLE_SID, but it also means that the utility has to be updated. Custom startup scripts, like dbcontrol, also need to be aware of what the “W” flag means so those databases are treated appropriately. Both orasetup and dbcontrol have been updated to fully support these new changes.
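To make the distinction concrete, here is a hypothetical sketch of the flag handling just described; orasetup itself is an appsdba.com utility, and this is not its actual code.

# Set the environment for one oratab entry based on its third field
set_env_for_sid() {
    local entry
    entry=$(grep "^$1:" /etc/oratab) || return 1
    export ORACLE_HOME=$(echo "$entry" | cut -d: -f2)
    case "${entry##*:}" in
        Y|W) export ORACLE_SID=$1 ;;         # local database: treat W like Y
        N)   export TWO_TASK=$1 ;;           # "remote" database: go through the listener
    esac
}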

Friday, November 03, 2006

Open World and NoCOUG

Two presentations in two weeks - and on different topics. I thought my Queue Overlap Analysis presentation went well at Open World even though it was in the last time slot of the conference. I also presented Oracle Workload Characterization at the Northern California Oracle Users Group conference yesterday. That was actually a lot of fun. Cary Millsap was there and even though we barely had time to say hello it was nice to see him.

Tuesday, December 27, 2005

The Business Major Approach

We as technical people often get too caught up in the technology and precision of our industry and forget that sometimes "good enough" is the right answer. I call this "the business major approach," a term coined by my good friend and colleague Neil Jensen. Many times there is no economic justification to process the extra 20% (aka the 80-20 rule).

There was a question recently on the Oracle-L list about the appsdba.com script cpu_test.sql and whether its variation of testing LIOs was a valid approach. A couple of people responded, and two main objections were raised: first, since not all LIOs are the same, one should test across a broader spectrum of LIO types; and second, since CPU speed is just a small part of a "system", one should really take into account all the other factors of system capacity and scalability.

Of course this is all true, and even echoes the disclaimer in the actual script, but back to my main point. The script takes the business major approach of providing a quantifiable first pass at judging relative CPU speed. The beauty of the script is its simplicity. Yes, it only measures the simplest of Logical I/Os, and yes, it does so in a PL/SQL loop. That's not the point. The point is that it does so consistently, uses CPU exclusively, and has proven over the years to be a pretty reliable predictor of relative CPU speed between platforms or even database releases. Would you base your hardware acquisition decisions on it? Of course not, but it does give an indication of the relative differences between systems, and it's very easy to use.
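For the curious, the idea can be sketched in a few lines. This is a minimal example in the spirit of cpu_test.sql, not the actual appsdba.com script: drive a fixed number of the simplest possible logical reads from a PL/SQL loop and report the elapsed time, connecting here as an OS-authenticated user.

sqlplus -s / <<'EOF'
set serveroutput on
declare
    l_start number := dbms_utility.get_time;    -- centiseconds
    l_dummy number;
begin
    for i in 1 .. 1000000 loop
        select count(*) into l_dummy from dual; -- one trivial LIO per iteration
    end loop;
    dbms_output.put_line('elapsed cs: ' || (dbms_utility.get_time - l_start));
end;
/
exit
EOF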

So, when do we use the business major approach and when do we take the scientific approach? There is a place for both, and ultimately I think the decision is based on cost and risk. In fact, that probably means the business major always wins.

Saturday, December 17, 2005

New SQL Trace Data?

I found a new extended SQL trace line, or class of lines, this week while working on an Oracle Applications system. Many may have already run across this, but I hadn't, and I found it very interesting. Apparently when Oracle Forms makes RPC calls now, they get logged in the extended SQL trace file. Here's the format of the lines:

RPC CALL:APPS.FND_TRACE.GET_TRACE_FILENAME;

RPC EXEC:c=70000,e=68146

So, I updated the AppsDBA Interval Resource Profiler to account for these new lines. Instead of adding the CPU time directly to CPU service, though, I decided to create a new "event" called RPC EXEC and put the CPU time there. We'll see how this works, but for now I thought this was a pretty cool discovery.
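If you want a quick total before running a profiler, a one-line sketch like the following sums the CPU time reported on the new lines (assuming, as in recent trace formats, that c= is reported in microseconds; the trace file name is hypothetical):

awk -F'[=,]' '/^RPC EXEC:/ { total += $2 }
    END { printf "RPC EXEC cpu: %.6f s\n", total/1000000 }' prod_ora_1234.trc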

System Level Data

I was scanning Tom Kyte's blog and found a comment referring to Mogens Nørgaard's blog, We Do Not Use Blogs. I took a look, and his latest discussions have concerned system-wide performance and whether Statspack is useful. There is a general consensus among the enlightened that Statspack, and system-wide data in general, is not useful for performance tuning and actually hurts the effort to fix poorly performing jobs. I don't disagree, but I think we're missing the point. As I've pointed out for a long time now, Statspack, or the data it's based on, is not performance tuning data; it's capacity data. In other words, use system level data for capacity planning and workload measurement. It's not useless, it's just misused! If you're really interested in this, see appsdba.com. By the way, I wholeheartedly agree with using extended SQL trace data for performance tuning, or response time optimization.