
Conference decwet::advfs_support

Title:AdvFS Support/Info/Questions Notefile
Notice:note 187 is Freq Asked Questions;note 7 is support policy
Moderator:DECWET::DADDAMIO
Created:Wed Jun 02 1993
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1077
Total number of notes:4417

1017.0. "Temp hang" by NETRIX::"mcdonald@decatl.alf.dec.com" (John McDonald) Fri Mar 14 1997 15:27

I have a customer that has a 4100 with 3GB of memory, running Oracle. They are using
advfs for their filesystems. It's a development system, so they create
tablespaces and such quite often. They have found a minor problem -
if they create a 500MB tablespace, it takes about 4 minutes, which is
fine, but when it finishes, all of the filesystems on the system seem
to hang for about 20-30 seconds. They can't do an 'ls' and no one can
log in.

I thought that maybe they were skimping on ubc, but ubc-maxpercent is
at 100%.

Any ideas?

John McDonald
Atlanta CSC
mcdonald@decatl.alf.dec.com
 
[Posted by WWW Notes gateway]
1017.1. "Sounds like an I/O backlog being cleared" by UNIFIX::HARRIS (Juggling has its ups and downs) Fri Mar 14 1997 15:57
    If I were to guess, this is a problem with the UBC being flushed to
    disk: because there are most likely hundreds of megabytes worth of
    unflushed UBC data, the I/O subsystem is backed up.  All new
    requests for service get put onto the end of the I/O queue, and when
    all the stuff in front of them finishes, the new requests get to
    happen (such as reading the executable code for the 'ls' command from
    disk, or opening the /etc/passwd file on disk, etc...).
    
    					Bob Harris
1017.2. "Thanx." by NETRIX::"mcdonald@decatl.alf.dec.com" (John) Fri Mar 14 1997 16:35
I figured it might be something to do with UBC being flushed, but
I wasn't sure. I thought there might be a known problem I wasn't
aware of.

Thanx for the reply.

john

[Posted by WWW Notes gateway]
1017.3. "IO bottleneck" by NETRIX::"tsao@zk3.dec.com" Fri Mar 14 1997 19:25
The poor response time is most likely due to synchronous read or write
requests that must wait not only for their own IO, but also for all other
IO requests currently issued to the target disk.
One of these synchronous requests is AdvFS flushing its own domain
log.

Determine how many IO requests the disk is busy processing:

"advfsstat -i 1 -v 2 domain_name" will display IO activity on each domain
volume every second. Look for the column titled "dev",
which is short for the AdvFS volume's active IO device queue. It lets you
monitor the number of active IO requests to a disk.
You want to know if this number is high (say > 1000).

Also run "iostat" on the disk device to track the number of bytes per second
and transactions per second. (bps*1024)/tps will tell you the average
IO request size in bytes. The transactions per second (tps) should roughly
correlate to the advfsstat "dev" count.
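That arithmetic can be sketched as a small helper. The numbers in the example call are purely illustrative, not figures from the customer's system:

```python
def avg_io_size_bytes(bps_kb, tps):
    """Average IO request size in bytes, from iostat output.

    bps_kb: throughput in KB/s (iostat's bps column)
    tps:    transfers per second (iostat's tps column)
    """
    return (bps_kb * 1024) / tps

# Illustrative figures only: 4096 KB/s at 512 transfers/s
# works out to 8 KB per request.
print(avg_io_size_bytes(4096, 512))  # 8192.0
```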

You might see if the number of active IO's correlates to the user response
time. Let's say a user request is delayed because of an AdvFS log flush
and that delay is 30 seconds (30000 milliseconds.)
Let's assume the average IO request takes 10 milliseconds, then
 30000 ms / (10 ms/request) = 3000 requests.

If the advfsstat -v command shows the number of active IO requests is
in the thousands, then the user response issue is probably related to the
synchronous log flush.
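The queue-depth estimate above can be written out the same way. The 30-second delay and 10 ms per-request service time are the example figures from the paragraph, not measurements:

```python
def queued_requests(delay_ms, ms_per_request):
    """Roughly how many IO requests must drain ahead of a new
    synchronous request to produce the observed delay."""
    return delay_ms / ms_per_request

# A 30-second stall at ~10 ms per IO implies a queue of
# about 3000 outstanding requests.
print(queued_requests(30_000, 10))  # 3000.0
```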

In that case, you can have the customer try moving the AdvFS log 
to another AdvFS volume to minimize or eliminate bottlenecks with
asynchronous data IO. Choose a small disk partition such as an 
"a" partition for the new volume.
To totally eliminate data IO to this log volume, you need to allocate
all data blocks to a file. 

The following instructions assume the customer has installed the
Advanced Utilities to use the addvol and migrate commands and the
customer has a spare disk partition available.

VVVVVVVVVVVVV
MOVING LOG TO SEPARATE VOLUME
To move the log and have only the log on a disk volume:
Add a small disk partition to the domain to move the log onto.
# addvol /dev/rz9a dmn1
Switch the log to the second volume.
# /sbin/advfs/switchlog dmn1 2
Run showfdmn to obtain the total free blocks on the volume with the log.
# showfdmn dmn1

               Id              Date Created  LogPgs  Domain Name
312387a9.000b049f  Thu Feb 15 14:21:13 1996     512  dmn1

  Vol   512-Blks        Free  % Used  Cmode  Rblks  Wblks  Vol Name 
   1     2050860     1908016      7%     on    128    128  /dev/rz10c
   2L     131072      122752      6%     on    128    128  /dev/rz9a
      ----------  ----------  ------
         2181932     2030768      7%

The dd below allocates all free blocks to a file, /adv1/foo (dd's
default block size is 512 bytes, matching showfdmn's 512-block counts).

# dd if=/dev/zero of=/adv1/foo count=122752
122752+0 records in
122752+0 records out
Below moves all of /adv1/foo's data to the other non-log volume.
# migrate -d 2 /adv1/foo
Now no one else can allocate storage on the volume containing the log.
^^^^^^^^^^^^^^^^^^^
Rerun the database create and see if this helps the hang at completion.

If the above helps, then instead of wasting the disk space by
allocating the blocks to /adv1/foo, they can migrate
some rarely referenced historical data files to fill up
the space.



[Posted by WWW Notes gateway]
1017.4. "Thanx." by NETRIX::"mcdonald@decatl.alf.dec.com" (John McDonald) Mon Mar 17 1997 10:41
Thanx for all of the detail! I passed this along to the customer and
he's going to check it out next time he creates a tablespace.

John McDonald

[Posted by WWW Notes gateway]
1017.5. by VIRGIN::SUTTER (Who are you ??? - I'm BATMAN !!!) Thu Mar 20 1997 02:51
Re: .3	

Would it make sense to Prestoserve that partition? 

As far as I know Prestoserve is meant for sync. writes.
Would Prestoserve introduce new risks?

Regards, 

Arnold
1017.6. "Presto" by NETRIX::"tsao@zk3.dec.com" Thu Mar 20 1997 11:36
Presto will help make synchronous writes faster including
AdvFS log writes that I mentioned. 
Presto does not help when an application is writing a
very large amount of data to disk and that data will
not be read or written again in the near future.

If the customer already has a presto board and they have
a multiple AdvFS volume domain, then activate presto
on the volume containing the AdvFS log. Use showfdmn
command to find the log volume.
Prestoizing the other volumes that the database is writing
its data onto would probably show little performance
improvement. The reason is that presto's 16 MB (or less) of
NVRAM would soon be overloaded and would need to start
flushing data to disk. The situation would quickly
wind up with the same performance problem that the customer
is currently experiencing.

Of course, with that said, if they already have a
presto board and only a one-volume domain, then they
should try activating presto on it to see if it improves
their particular situation anyway.

[Posted by WWW Notes gateway]