
Conference eps::oracle

Title:Oracle
Notice:For product status see topics: UNIX 1008, OpenVMS 1009, NT 1010
Moderator:EPS::VANDENHEUVEL
Created:Fri Aug 10 1990
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1574
Total number of notes:4428

1551.0. "drd devices are 300% slower than ADVFS on the same device" by GUIDUK::SOMER (Turgan Somer - SEO06 - DTN 548-6439) Fri Apr 25 1997 06:01

Here are some data we gathered while testing DRD performance:

1. DRD device reads from shared (SCSI) devices take about 300% longer than 
   comparable/identical ADVFS file system reads from the same shared device
   (a quick arithmetic check follows this list)
	
2. Raw (rrzxx) device reads from a local (non-shared) device take essentially 
   the same time as an ADVFS file system read on comparable/identical files
	
3. Subsequent repeat ADVFS file system reads are 1500% faster than the 
   initial read (i.e., the entire read was cached)
	
4. Raw device reads do not take advantage of any hardware or UNIX OS 
   caching, irrespective of whether the device is local (rrzxx) or shared 
   (DRD). None of the reads were cached.
	
5. DRD device reads performed by Node2 on a DRD device allocated to Node1 
   took about 10% longer than the same reads performed on Node1
	
6. Mirroring and/or striping of the data had no significant impact on raw 
   device reads and yielded only marginal improvement on ADVFS file system 
   reads
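
The headline percentages can be sanity-checked against the dd timings in the 
transcript below (every timed pass moves 100 MB).  A quick bc(1) check (not 
part of the original test runs) converts the measured elapsed times into 
ratios and throughput:

    echo "scale=2; 46.1 / 15.4" | bc   # DRD vs. local raw elapsed time  -> 2.99
    echo "scale=2; 46.1 / 19.7" | bc   # DRD vs. AdvFS elapsed time      -> 2.34
    echo "scale=1; 100 / 46.1" | bc    # DRD read throughput, MB/s       -> 2.1
    echo "scale=1; 100 / 15.4" | bc    # local raw read throughput, MB/s -> 6.4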

Test Environment:

2 AlphaServer 8200s, each with dual processors and 512 MB of memory
   (node names carewise1 and carewise2)
2 KZPSA adapters in each 8200, connecting to two shared SCSI buses with
   dual-redundant HSZ50s on each shared SCSI bus
10 RZ29s behind each pair of dual-redundant HSZ50s
Dual Memory Channel
DU 4.0A and TruCluster 1.4
Oracle Parallel Server (OPS)

Test Parameters:

The tests yielding the above summary findings were performed on Alpha 
8200s that were being used solely by the tester.  All tests were run 
one at a time, and each test was allowed to complete before the next 
one was started.

All timings involved a 100 MB (1024*1024*100 = 104,857,600 bytes) file or 
an equivalent section of the same raw device.  The timings used in 
developing the above findings were read-only and taken from a single 
device, so as to minimize potential I/O and CPU bandwidth bottlenecks.  
Where parameters were changed (e.g., block size), the effect was 
negligible.
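
For anyone who wants to repeat the measurements, the individual commands 
appear in the transcript below.  A minimal wrapper along the following lines 
would run the same read-only timings across several block sizes (a sketch 
only; the device and file names are the ones used in these tests and will 
differ on other systems):

    #!/bin/sh
    # Sketch of the read-only timing passes: 100 MB per pass, timed with
    # /usr/bin/time, varying only the block size.
    DRD=/dev/rdrd/drd1            # distributed raw device under test
    FILE=/u01/oradata/test/t1     # 100 MB file on the shared AdvFS domain
    for BS in 4096 8192 32768 65536
    do
        COUNT=`expr 104857600 / $BS`
        echo "=== bs=$BS count=$COUNT ==="
        /usr/bin/time dd if=$DRD  of=/dev/null bs=$BS count=$COUNT
        /usr/bin/time dd if=$FILE of=/dev/null bs=$BS
    done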


Test Results

The following test results are edited script output of the actual tests 
performed:

Script started on Wed Apr 23 10:43:30 1997
# csh
--->		Create Test Files by reading from distributed raw device.

root@carewise1 /u01/oradata/test 41# /usr/bin/time dd if=/dev/rdrd/drd1 of=t1 
                                                      bs=32768 count=3200
3200+0 records in
3200+0 records out

real   65.0
user   0.0
sys    3.3

root@carewise1 /u01/oradata/test 43# /usr/bin/time dd if=/dev/rdrd/drd1 of=t2 
                                                      bs=32768 count=3200
3200+0 records in
3200+0 records out

real   65.3
user   0.0
sys    3.5

root@carewise1 /u01/oradata/test 45# ls -l

total 307216
drwxr-xr-x   2 root     dba           8192 Apr 23 10:53 .
drwxr-xr-x   8 oracle   dba           8192 Apr 23 10:42 ..
-rw-r--r--   1 root     dba      104857600 Apr 23 10:45 t1
-rw-r--r--   1 root     dba      104857600 Apr 23 10:53 t2


--->		Copy file system object to null device (Read of file on 
                                                    shared AdvFS device)

root@carewise1 /u01/oradata/test 48# /usr/bin/time dd if=t2 of=/dev/null 
                                                            bs=32768
3200+0 records in
3200+0 records out

real   19.7
user   0.0
sys    1.8

root@carewise1 /u01/oradata/test 49# /usr/bin/time dd if=t1 of=/dev/null 
                                                            bs=32768
3200+0 records in
3200+0 records out

real   19.7
user   0.0
sys    1.8

--->		Repeat reads show effects of hardware/UNIX OS caching

root@carewise1 /u01/oradata/test 50# /usr/bin/time dd if=t1 of=/dev/null 
                                                            bs=32768
3200+0 records in
3200+0 records out

real   1.3
user   0.0
sys    1.3

--->		Change in blocksize used during reads

root@carewise1 /u01/oradata/test 51# /usr/bin/time dd if=t1 of=/dev/null bs=4096

25600+0 records in
25600+0 records out

real   1.7
user   0.1
sys    1.6

root@carewise1 /u01/oradata/test 52# /usr/bin/time dd if=t2 of=/dev/null bs=4096

25600+0 records in
25600+0 records out

real   19.9
user   0.1
sys    2.1

root@carewise1 /u01/oradata/test 53# /usr/bin/time dd if=t2 of=/dev/null bs=8192

12800+0 records in
12800+0 records out

real   1.4
user   0.0
sys    1.4

root@carewise1 /u01/oradata/test 54# /usr/bin/time dd if=t1 of=/dev/null bs=8192

12800+0 records in
12800+0 records out

real   19.9
user   0.0
sys    1.8

--->		Perform raw read from DRD to null device

root@carewise1 /u01/oradata/test 55# /usr/bin/time dd if=/dev/rdrd/drd1 
                                         of=/dev/null bs=32768 count=3200
3200+0 records in
3200+0 records out

real   46.1
user   0.0
sys    0.7

--->		Repeat reads show no effects from hardware/UNIX OS caching

root@carewise1 /u01/oradata/test 56# /usr/bin/time dd if=/dev/rdrd/drd1 
                                         of=/dev/null bs=32768 count=3200
3200+0 records in
3200+0 records out

real   46.1
user   0.0
sys    0.7

--->		Perform raw read from Local (non-shared) Device to null device

root@carewise1 /u01/oradata/test 57# /usr/bin/time dd if=/dev/rrz33c 
                                         of=/dev/null bs=32768 count=3200
3200+0 records in
3200+0 records out

real   15.4
user   0.0
sys    0.5

--->		Repeat reads show no effect from hardware/UNIX OS caching

root@carewise1 /u01/oradata/test 58# /usr/bin/time dd if=/dev/rrz33c 
                                         of=/dev/null bs=32768 count=3200
3200+0 records in
3200+0 records out

real   15.4
user   0.0
sys    0.5

--->		Another local disk performs as expected

root@carewise1 /u01/oradata/test 59# /usr/bin/time dd if=/dev/rrz35c 
                                         of=/dev/null bs=32768 count=3200
3200+0 records in
3200+0 records out

real   15.4
user   0.0
sys    0.6

root@carewise1 /u01/oradata/test 61# # exit

--->		Tests on second node show slight performance hit (approx. 10%)

root@carewise2 /u02/oradata/test 60# /usr/bin/time dd if=/dev/rdrd/drd1 of=t2 
                                                      bs=32768 count=3200
3200+0 records in
3200+0 records out

real   71.2
user   0.0
sys    3.9

root@carewise2 /u02/oradata/test 61# ls -l

total 204816
drwxr-xr-x   2 root     dba           8192 Apr 23 10:49 .
drwxr-xr-x   8 oracle   dba           8192 Apr 23 10:45 ..
-rw-r--r--   1 root     dba      104857600 Apr 23 10:48 t1
-rw-r--r--   1 root     dba      104857600 Apr 23 10:51 t2


---> 		Read of file on shared AdvFS device

root@carewise2 /u02/oradata/test 63# /usr/bin/time dd if=t1 of=/dev/null 
                                                            bs=32768
3200+0 records in
3200+0 records out

real   19.8
user   0.0
sys    1.9

---> 		Cached read

root@carewise2 /u02/oradata/test 64# /usr/bin/time dd if=t1 of=/dev/null 
                                                            bs=32768
3200+0 records in
3200+0 records out

real   1.4
user   0.0
sys    1.4

root@carewise2 /u02/oradata/test 65# /usr/bin/time dd if=t2 of=/dev/null 
                                                            bs=32768
3200+0 records in
3200+0 records out

real   17.4
user   0.0
sys    1.8

root@carewise2 /u02/oradata/test 66# /usr/bin/time dd if=t2 of=/dev/null 
                                                            bs=32768
3200+0 records in
3200+0 records out

real   1.4
user   0.0
sys    1.4

--->		Blocksize change

root@carewise2 /u02/oradata/test 67# /usr/bin/time dd if=t1 of=/dev/null 
                                                            bs=65536
1600+0 records in
1600+0 records out

real   20.0
user   0.0
sys    1.8

root@carewise2 /u02/oradata/test 68# /usr/bin/time dd if=t2 of=/dev/null 
                                                            bs=8192
12800+0 records in
12800+0 records out

real   17.5
user   0.0
sys    1.9


--->		Reads of raw device over memory channel showing 10% performance hit

root@carewise2 /u02/oradata/test 70# /usr/bin/time dd if=/dev/rdrd/drd1 
                                         of=/dev/null bs=32768 count=3200

3200+0 records in
3200+0 records out
real   52.0
user   0.0
sys    1.1

root@carewise2 /u02/oradata/test 71# /usr/bin/time dd if=/dev/rdrd/drd1 
                                         of=/dev/null bs=32768 count=3200
3200+0 records in
3200+0 records out
real   51.6
user   0.0
sys    1.2

root@carewise2 /u02/oradata/test 72# /usr/bin/time dd if=/dev/rdrd/drd1 
                                         of=/dev/null bs=32768 count=3200
3200+0 records in
3200+0 records out
real   51.0
user   0.0
sys    1.0
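
As a convenience, the elapsed times can be pulled out of the script(1) log 
mechanically.  A possible one-liner (assuming the log is in script's default 
output file "typescript", and remembering that every timed pass moved 100 MB):

    awk '/^real/ { printf "%6.1f s  ->  %4.1f MB/s\n", $2, 100/$2 }' typescript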


Conclusions

These tests, in my opinion, isolate the poor Distributed Raw Device I/O 
performance from other influencing and obscuring factors (e.g., 
Oracle, disk striping, mirroring).  


DRD's are required to support OPS. In my opinion, a 300% I/O performance 
hit is an unacceptable compromise for this high-availability feature.  The 
distributed disk infrastructure has adequate system support for distributed 
AdvFS.  The distributed I/O performance for DRD's should be as good as or 
better than that for distributed AdvFS.  The hardware cache on the HSZ 
controllers and UNIX OS I/O caching should work for both DRD and 
distributed AdvFS I/O.

One of the motivating factors for using raw devices with Oracle in general 
is to improve I/O performance.  In an OPS/TruCluster environment the opposite 
seems to hold true. 

Any comments on our test methodology and findings?


X-Posted in SMURF::ASE and EPS::ORACLE

    
1551.1. "Is CFS the solution ?" by NNTPD::"srinath@qca.mts.dec.com" Sat Apr 26 1997 09:48
Hi

This looks to be very good info for all the performance tuning folks.
Is the Cluster File System the answer to all this low performance?

/srinath
[Posted by WWW Notes gateway]
1551.2. "please check SMURF::ASE 1996.* too" by ALFAM7::GOSEJACOB Mon Apr 28 1997 08:48
    May I suggest that you also have a look at the replies to entry 1996
    (the same one as the base note here) in SMURF::ASE.
    
    	Martin