
Conference msgaxp::optical

Title:	Optical Products

Moderator:	TAPE::SENEKER

Created:	Wed May 04 1988
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	841
Total number of notes:	3218

770.0. "Jukebox devices MVTIMEOUT" by HGOVC::CSCHAN () Tue Feb 04 1997 04:58

	This note has been cross-posted in the VMScluster notes conference.

	There is a 3-node CI/NI cluster: Alpha 4000, DEC7730 and DEC7750,
	with an RW534 directly connected. After a node leaves and rejoins the
	cluster, many jukebox devices go into the mount verification timeout
	state. The customer had to reboot the whole cluster to regain access
	to those jukebox devices.

	According to the customer, one such event happened in the
	following sequence:
	- The customer shut down and rebooted the Alpha 4000 system.
	- Many disks went into mount verification.
	- Some time later, many mounted jukebox devices (13 out of 20
	  mounted devices) went into mount verification timeout on the
	  Alpha 4000.
	- The same thing happened on the DEC7750, which has the RW534
	  directly connected.
	- The DEC7730 system was different: all of its mounted jukebox
	  devices went into mount verification timeout. This is the
	  busiest system in the cluster.
	- The customer tried to dismount the timed-out devices, and the
	  process hung.
	- The customer had to reboot the whole cluster.


	VMS6.2, OSMS 3.3-1

	Sysgen parameters: MVTIMEOUT:      3600
	                   MSCP_LOAD:      1
	                   MSCP_SERVE_ALL: 1
	
	All the jukebox devices are served cluster-wide. There are 88
	cartridges (176 logical units) in the optical disk library, and
	usually only 20 are mounted.
 

	Questions:

	1. Is this normal behavior for a cluster with jukebox devices?

	2. How can we prevent this problem from happening?

	I plan to increase the MVTIMEOUT value, but I am not sure whether
	it will fix the problem or whether there are any side effects.


	
	Regards, C.S.

T.R	Title	User	Personal Name	Date	Lines
770.1	check MSCP_CREDITS/MSCP_BUFFERS	TAPE::SENEKER	Head banging causes brain mush	`Tue Feb 04 1997 13:14`	11
	The following verbiage is in the OSMS User's Manual:

	The value of the sysgen parameters, MSCP_CREDITS and MSCP_BUFFERS,
	should be increased on the node serving optical disks that are
	mounted cluster wide. The MSCP_CREDITS value should be increased by
	one for each volume being served to the cluster. The MSCP_BUFFER
	value should allow for a minimum of 16 pages of buffer area per
	MSCP_CREDITS value.

	Rob
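	As a rough illustration of the manual's rule quoted above, the arithmetic can be sketched in Python. This is only a sketch: the function name is hypothetical, the starting MSCP_CREDITS value of 8 is an assumed example (check the actual current value with SYSGEN), and it reads "16 pages per MSCP_CREDITS value" as 16 pages multiplied by the new credit count.

```python
def recommended_mscp_settings(current_credits, served_volumes, pages_per_credit=16):
    """Apply the quoted OSMS guidance (hypothetical helper, not a DEC tool).

    MSCP_CREDITS: add one credit per volume served to the cluster.
    Buffer area:  at least `pages_per_credit` pages per credit.
    Returns (new_credits, minimum_buffer_pages).
    """
    if current_credits < 0 or served_volumes < 0:
        raise ValueError("credits and volume count must be non-negative")
    new_credits = current_credits + served_volumes
    min_buffer_pages = new_credits * pages_per_credit
    return new_credits, min_buffer_pages


if __name__ == "__main__":
    # Assumed example: 8 credits currently set, 20 volumes usually mounted.
    credits, pages = recommended_mscp_settings(current_credits=8, served_volumes=20)
    print(f"MSCP_CREDITS >= {credits}, buffer area >= {pages} pages")
```

	With those example inputs the result is 28 credits and at least 448 pages of buffer area.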
770.2		HGOVC::CSCHAN		`Wed Feb 05 1997 05:42`	10
	Re .-1

	We found that both parameters, MSCP_CREDITS and MSCP_BUFFERS, are
	set to their default values. The field will ask the customer to
	increase them to the correct values. However, these two parameters
	seem only to improve I/O throughput. Any additional ideas?

	Regards, C.S.
770.3		TAPE::SENEKER	Head banging causes brain mush	`Wed Feb 05 1997 13:49`	12
	Summarized a lot...

	VMS (cluster code) uses an MSCP_CREDIT to manage/track I/O for each
	MSCP-served device. If/when VMS has to share MSCP_CREDITS between
	multiple devices, the possibility exists that VMS will detect a
	timeout of I/O completion from some device. When this happens it
	will place all devices on that node into mount verification.

	I don't remember where the associated value of 16 came from for
	MSCP_BUFFERS.

	Rob
770.4	Can mount/cluster without DNS/DFS?	HGTIMA::CSCHAN		`Sun Feb 09 1997 23:14`	57
	I found an article in TIMA, "[RW5XX] Document...Use of Clusters with
	Optical Software". I think it is a good reference, but I have some
	queries:

	5.1.D VMS MOUNT and DISMOUNT operations for each optical volume will
	consume time equivalent to the traditional magnetic disk volume VMS
	MOUNT and DISMOUNT, plus a per-volume swap time of fifteen seconds
	plus the specified MINSWAP delay.

	> Based on this information, should I set the sysgen parameter:
	>
	> MVTIMEOUT > (15 + MINSWAP) * number of mounted jukebox devices ?
	>

	5.2 Use of OSMS and OSDS without decDFS and decDNS

	5.2.B. Platters must not be mounted /SERVED nor /CLUSTER nor /SYSTEM
	(which implies /SERVED). Served platters trigger OpenVMS MSCP
	services which currently are not compatible with removable storage.

	> The cluster does not use decDFS and decDNS. Could those platters
	> be mounted /cluster or /system?

	5.2.C. MOUNT and DISMOUNT operations may consume several hours on
	larger autochangers with as many as 288 volumes. This time may be
	reduced by mounting only as many volumes as are absolutely required,
	by mounting volumes as /NOWRITE (readonly) volumes where possible,
	by using MCR JBUTIL SET PARAMETER /MINSWAP=5 to set the platter hold
	time to a minimum, and by keeping as few files open as is necessary.

	5.2.D. Cluster transitions must be avoided. Processors joining an
	active cluster do not respect the "special" nature of optical disks.
	This limitation is particularly true when the node rejoining the
	cluster contains the interface adapter to the autochanger and
	drives. Any cluster transition will cause all disks with
	outstanding I/O (optical as well as magnetic)
	> clusterwide to begin a mount-verification, which may not
	> complete before an operation timeout occurs which causes another
	> mount verification to begin, ad infinitum.
	> The customer often has to shut down and reboot one of the systems.
	> In this case, how can we prevent the "operation timeout" situation
	> from happening?
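	To put numbers on the bound proposed in the question above, here is a small Python sketch. The function name is hypothetical; the 15-second swap time comes from section 5.1.D as quoted, MINSWAP=5 is the minimum setting mentioned in section 5.2.C, and 20 is the usual number of mounted volumes from the base note. Whether this bound is the right way to size MVTIMEOUT is exactly what is being asked, so treat it as arithmetic only.

```python
def proposed_mvtimeout_floor(mounted_volumes, minswap_seconds, swap_seconds=15):
    """Evaluate (15 + MINSWAP) * number of mounted jukebox devices.

    Hypothetical helper: it only computes the bound suggested in the
    question, using the per-volume swap time quoted from the article.
    """
    if mounted_volumes < 0 or minswap_seconds < 0 or swap_seconds < 0:
        raise ValueError("inputs must be non-negative")
    return (swap_seconds + minswap_seconds) * mounted_volumes


if __name__ == "__main__":
    current_mvtimeout = 3600  # value reported in the base note
    floor = proposed_mvtimeout_floor(mounted_volumes=20, minswap_seconds=5)
    print(f"proposed floor: {floor} s, current MVTIMEOUT: {current_mvtimeout} s")
    print("current setting exceeds the floor:", current_mvtimeout > floor)
```

	For 20 mounted volumes and MINSWAP=5 the bound is 400 seconds, which the current MVTIMEOUT of 3600 already exceeds.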
770.5	update of old information	TAPE::SENEKER	Head banging causes brain mush	`Mon Feb 10 1997 13:16`	60
	The information in the TIMA article is somewhat out of date. I will
	correct various parts here.

	5.1.D VMS MOUNT and DISMOUNT operations for each optical volume will
	consume at least the specified MINSWAP time plus the amount of time
	required by the jukebox to physically move the optical disk to or
	from its storage slot. When the time required for the VMS MOUNT and
	DISMOUNT operations is greater than the specified MINSWAP time then
	these operations will consume the time required for the MOUNT or
	DISMOUNT operation plus the amount of time required by the jukebox
	to physically move the optical disk to or from its storage slot.

	>
	> There is no reason (that I am aware of) to set MVTIMEOUT to any
	> value larger than the default value of 3600 seconds (1 hour).
	>

	5.2 Use of OSMS and OSDS without decDFS and decDNS

	5.2.B. Platters must not be mounted /SERVED nor /CLUSTER nor /SYSTEM
	(which implies /SERVED). Served platters trigger OpenVMS MSCP
	services which currently are not compatible with removable storage.

	>
	> This information is out of date. The last two releases of OSMS
	> have allowed rewritable disks to be used in a cluster. The /SYSTEM
	> and /CLUSTER qualifiers are allowed.
	>

	5.2.C. MOUNT and DISMOUNT operations may consume several hours on
	larger autochangers with as many as 288 volumes. This time may be
	reduced by mounting only as many volumes as are absolutely required,
	by mounting volumes as /NOWRITE (readonly) volumes where possible,
	by using MCR JBUTIL SET PARAMETER /MINSWAP=5 to set the platter hold
	time to a minimum, and by keeping as few files open as is necessary.

	>
	> This is still true.
	>

	5.2.D. Cluster transitions must be avoided. Processors joining an
	active cluster do not respect the "special" nature of optical disks.
	This limitation is particularly true when the node rejoining the
	cluster contains the interface adapter to the autochanger and
	drives. Any cluster transition will cause all disks with
	outstanding I/O (optical as well as magnetic) clusterwide to begin
	a mount-verification, which may not complete before an operation
	timeout occurs which causes another mount verification to begin,
	ad infinitum.

	>
	> This information is out of date. The last two releases of OSMS
	> have allowed rewritable disks to be used in a cluster, and a
	> mechanism was added to allow mount-verification to complete before
	> it caused another timeout.
	>