
Conference spezko::cluster

Title:+ OpenVMS Clusters - The best clusters in the world! +
Notice:This conference is COMPANY CONFIDENTIAL. See #1.3
Moderator:PROXY::MOORE
Created:Fri Aug 26 1988
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:5320
Total number of notes:23384

5273.0. "Impact to system disk of adding additional node(s)" by DV780::BAILEYR (Randy Baileyr@mail.dec.com) Tue Apr 01 1997 22:53

OpenVMS VAX 6.1
VAX 7830 (*5)
RZ29 = 2-vol. shadowed system disk

My customer site has 5 nodes in a homogeneous CI-based VAXCluster, 
single system disk.  Due to increasing demand, they are adding a 
6th and 7th node to the cluster.  Their question: What general 
impact on the system disk is adding a 6th and 7th node going to
have?  They are considering 2 system disks, but are also concerned
with the extra management effort involved.

Are there any metrics that Digital has researched that would
suggest when adding additional nodes would *not* help?
I'm thinking of something like a graph of disk queue length
or disk throughput - something similar to:

Sys    |                    x
Disk   |
       |                x
       |
       |            x
       |
Queue  |         x
Length |    x
       |x
       ------------------------------------------
        1   2    3   4   5   6
              Number of nodes

I realize all of this depends on the current system load/configuration,
etc., but is this something that the performance team has developed as
a "guideline"?  Is the same rule of thumb (strive for <20 I/Os)
still accurate?

I'm currently generating graphs (using PSPA) of disk queue length and
throughput on the system disk from all 5 existing nodes.  That might
be the best indicator I have to go by for now.   I didn't see any
way to use DECps modelling (v2.1) for adding an additional node.
 
TIA
Randy
5273.1. "See _Guidelines for VMScluster Configuration_" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Wed Apr 02 1997 14:36
   A queue length averaging .5 or higher is a "rule of thumb" that indicates
   the spindle is effectively saturated -- half (or better) of all I/Os are
   waiting for a previous I/O.  (One needs to figure out why the spindle is
   so active to rule out "hot files" or "hot applications".)  There are some
   typical configuration solutions here that involve relocating the pagefiles
   and swapfiles to other spindles, as well as relocating the queue and system
   authorization databases and any other "hot files" else-spindle.
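   As a sketch, one way to measure this from DCL (the node names below
   are placeholders for the other cluster members; the interval is
   arbitrary):

      $ MONITOR DISK/ITEM=QUEUE_LENGTH /NODE=(NODEA,NODEB,NODEC) -
            /INTERVAL=10 /AVERAGE
      $ MONITOR DISK/ITEM=OPERATION_RATE /NODE=(NODEA,NODEB,NODEC) -
            /INTERVAL=10 /AVERAGE

   Compare the system disk's average queue length against the .5 figure
   above, and the operation rate against the "<20 I/Os" rule of thumb
   asked about in .0.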

   Extra physical memory can also help reduce I/O loading.

   We're running several large VMSclusters off a relatively small number of
   shadowset system disks here in OpenVMS engineering -- usually one to two
   system disks in use at any one time, for all systems of an architecture
   present in the VMScluster.

   I'd tend to add a second system disk to any configuration, just for any
   rolling upgrades, for near-online prototyping, and for emergencies.

   There are guidelines in the _Guidelines for VMScluster Configuration_
   manual around processor throughput, I/O throughput, etc.

5273.2. "Do not forget locking!" by ESSB::JNOLAN (John Nolan) Mon Apr 07 1997 11:49
    
      I would not be too concerned about the I/O impact, as this can be
    reduced with the likes of RAMDISK and pagefile relocation. However,
    more nodes mean your locking rates will increase, and this may impact
    you, especially if you are a TP-type site.
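    As a sketch, one way to baseline the lock traffic before and after
    the new nodes arrive (the interval is arbitrary):

       $ MONITOR DLOCK /INTERVAL=10 /AVERAGE    ! distributed lock traffic
       $ MONITOR LOCK  /INTERVAL=10 /AVERAGE    ! overall lock manager rates

    Watch how the incoming/outgoing ENQ rates in DLOCK grow as members
    are added; that is the growth that bites a TP-type site.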
5273.3. "tune your system disk" by COL01::VSEMUSCHIN (Duck and Recover !) Mon Apr 07 1997 15:25
    I don't think such a graph (system disk saturation vs. number of
    nodes) is possible, because there are so many things you can do to
    optimize the performance of your system disk (or to degrade it ...)
    
    First of all, as the previous noter said, remove all page/swap files
    from the system disk. Nowadays, when >100 MB memory configurations
    aren't scarce, it is rather unlikely that you'll exhaust the modified
    page list before SYPAGSWPFILES.COM is executed.
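    As a sketch, a SYPAGSWPFILES.COM fragment that does this (the device,
    directory and file names are placeholders; create the files first
    with SYSGEN CREATE /SIZE):

       $ MCR SYSGEN INSTALL $1$DUA10:[PAGESWAP]PAGEFILE_1.SYS /PAGEFILE
       $ MCR SYSGEN INSTALL $1$DUA10:[PAGESWAP]SWAPFILE_1.SYS /SWAPFILE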
    
    Then you could improve performance (see Steve's note) using caching
    (VIOC) and shadowing. To shadow your system disk effectively, move
    off as many write-intensive files as possible. OPERATOR.LOG,
    ACCOUNTNG.DAT, the QMAN* files and the audit server log would be the
    first candidates. Then look at which other .log files are candidates.
    I assume the PATHWORKS root and the login directories for the UCX$*
    accounts have already been moved off the system disk (have they?).
    Almost all layered products (like Rdb or DECnet) offer ways to
    configure where their working data files live. Once you reduce the
    amount of write I/O to your system disk, both shadowing and caching
    will pay off in performance.
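    For illustration, a few of the usual relocation knobs (device and
    directory names are placeholders; the logical names belong in
    SYS$MANAGER:SYLOGICALS.COM so they are defined at boot):

       $ DEFINE/SYSTEM/EXEC QMAN$MASTER      $1$DUA10:[QUEUES]
       $ DEFINE/SYSTEM/EXEC OPC$LOGFILE_NAME $1$DUA10:[LOGS]OPERATOR.LOG
       $ START/QUEUE/MANAGER $1$DUA10:[QUEUES]   ! queue + journal files
       $ SET AUDIT/JOURNAL=SECURITY -
             /DESTINATION=$1$DUA10:[LOGS]SECURITY.AUDIT$JOURNAL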
    
    SYSUAF.DAT and RIGHTSLIST.DAT can be moved to another disk to
    increase login performance. DECwindows fonts, too: I'm not sure
    about it, but if there are a lot of DECwindows users you could ask
    in the DECwindows notes conference whether it is possible to move
    the fonts off the system disk.
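    The UAF/rightslist move is just a pair of system logicals, e.g. in
    SYLOGICALS.COM on every node (device and directory are placeholders):

       $ DEFINE/SYSTEM/EXEC SYSUAF     $1$DUA10:[VMS$COMMON]SYSUAF.DAT
       $ DEFINE/SYSTEM/EXEC RIGHTSLIST $1$DUA10:[VMS$COMMON]RIGHTSLIST.DAT
       $ ! NETPROXY, VMSMAIL_PROFILE, etc. can be pointed off the
       $ ! system disk the same way.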
    
    HtH Seva
5273.4. by EVMS::MORONEY (Hit <CTRL><ALT><DEL> to continue ->) Thu Apr 10 1997 23:19
If you use the DOSD (dump off system disk) to move all the SYSDUMP.DMP files to
other disks you can enable minimerge on the system disk.  Without minimerge, a
node crash will result in a full merge of the system disk, and this can really
slow things down. 
5273.5. "Good info! Does DOSD work with 6.1?" by DV780::BAILEYR (Randy Baileyr@mail.dec.com) Tue Apr 15 1997 15:37
Re: -.all

They've actually taken the steps that Steve in .1 suggested - moved the 
page/swapfiles to other disks, put hot system files on a solid-state
disk, and shadowed the system disk.

I ran some reports & graphs from DECps data and there's really no
queue at all (<.02 max from all 5 nodes) and I/Os are very low (<2
max).

Each one of the 5 nodes has 2.0 GB of memory, so the lack of disk space
to hold the dump files will be one "bargaining" chip.

With this version of OpenVMS (6.1), can we use DOSD?  I read an
article recently that said it was a goal to "retrofit" DOSD back
to v5.5-x.

Thanks for the responses.  Yup, RTFM did help, especially 
the Guidelines for VMSCluster Configuration.

Randy
5273.6. "Shared Dump Documented; Check Network and Lock Traffic; DECamds" by XDELTA::HOFFMAN (Steve, OpenVMS Engineering) Tue Apr 15 1997 18:09
:They've actually taken the steps that Steve in .1 suggested - moved the 
:page/swapfiles to other disks, put hot system files on a solid-state
:disk, and shadowed the system disk.

    Ok.

:I ran some reports & graphs from DECps data and there's really no
:queue at all (<.02 max from all 5 nodes) and I/Os are very low (<2
:max).

    Then they appear to have a good idea of what would be
    involved when adding another node.

:Each one of the 5 nodes has 2.0 GB of memory, so the lack of disk space
:to hold the dump files will be one "bargaining" chip.

    Dumpfiles can be set up for selective (partial) dumps, and
    dumpfiles themselves can be set up to be shared.  Between
    the two, one can economize on the disk space.

    (One creates a SYS$COMMON:[SYSEXE]SYSDUMP-COMMON.DMP, and
    then sets up SYS$SPECIFIC:[SYSEXE]SYSDUMP.DMP aliases in
    each system root.  This is documented.)
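    A sketch of the mechanics, with a placeholder size in blocks and
    SYS0 standing in for each node's root:

       $ MCR SYSGEN CREATE SYS$COMMON:[SYSEXE]SYSDUMP-COMMON.DMP -
             /SIZE=100000
       $ SET FILE SYS$COMMON:[SYSEXE]SYSDUMP-COMMON.DMP -
             /ENTER=SYS$SYSDEVICE:[SYS0.SYSEXE]SYSDUMP.DMP

    Setting DUMPSTYLE to 1 in MODPARAMS.DAT and running AUTOGEN takes
    care of the selective dump.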

:With this version of OpenVMS (6.1), can we use DOSD?  I read an
:article recently that said it was a goal to "retrofit" DOSD back
:to v5.5-x.

    I'd assume "no", at least for the time being.  And if I
    were to believe that I needed DOSD, I'd upgrade.  (And
    given this situation, I don't think DOSD is necessary
    -- these systems should not be crashing particularly
    often, and a shared dump file can fit nicely onto a
    4 GB RZ29 or a 9 GB RZ40 disk...)

    --

    I'd load DECamds, and use that -- it's one of the best
    VMScluster monitoring tools around, and it's part of
    OpenVMS.  (Older DECamds versions require a VMScluster
    PAK, while newer versions will require an OpenVMS PAK.)

    --

    After you've finished looking at the disk I/O activity,
    start looking at the network I/O rates, and start looking
    at the distributed lock traffic.
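    As a starting point for that (the interval is arbitrary):

       $ MONITOR SCS /INTERVAL=10 /AVERAGE     ! CI/SCS traffic per remote node
       $ MONITOR DLOCK /INTERVAL=10 /AVERAGE   ! distributed lock traffic (.2)
       $ MONITOR CLUSTER                       ! cluster-wide summary display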