
Conference humane::scheduler

Title:SCHEDULER
Notice:Welcome to the Scheduler Conference on node HUMANE
Moderator:RUMOR::FALEK
Created:Sat Mar 20 1993
Last Modified:Tue Jun 03 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1240
Total number of notes:5017

1123.0. "Load Balancing problem - is it ON or OFF or both!" by KERNEL::TITCOMBER () Fri Jun 28 1996 21:23

    Can anybody shed light on the following load balancing problem?

    A customer has a 4 node cluster of VAX systems all running Scheduler. 
    As one node is running V5.5-2 while the others are running V6.1, they
    have set up Scheduler with the logical NSCHED$ pointing to a
    search-listed logical, as one does for mixed architecture (VAX & Alpha)
    clusters.  So for instance on the V5.5-2 node:

$ sh log nsched$
   "NSCHED$" = "DISK$DATA:[NSCHED]" (LNM$SYSTEM_TABLE)
        = "DISK$DATA:[NSCHED.V552]"

    and for V6.1:

$ sh log nsched$
   "NSCHED$" = "DISK$DATA:[NSCHED]" (LNM$SYSTEM_TABLE)
        = "DISK$DATA:[NSCHED.V61]"

    The V5 and V6 images for Scheduler are then located in the appropriate
    directory.  Having said all this, it may not be relevant, but for the
    sake of completeness...

    Anyway, the big question is why one of the nodes has a rating when none
    of the others do?

    We get the following information returned, identical on all 4 nodes in
    the cluster:

$ sched show stat
Node   Version  Started              Jobs  Jmax   Log  Pri Rating
PORTOS V2.1B-7  22-JUN-1996 18:02:41    0    15     5    4 <-- Default
KERMIT V2.1B-7  22-JUN-1996 22:40:00    1    15     5    4    871
ARAMIS V2.1B-7  23-JUN-1996 02:42:16    0    15     5    4
GONZO  V2.1B-7  23-JUN-1996 03:02:44    0    15     5    4

$ sched show load
Load Balancing is OFF


    The interesting thing is that each node believes that load balancing is
    turned OFF, in which case there should be no rating value displayed for
    any nodes in the cluster.  The one that does have a rating value
    displayed is one of the V6.1 nodes.
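
    The inconsistency described above can be checked mechanically.  The
    following is only a rough sketch in Python, assuming the output of
    "sched show stat" and "sched show load" always has the layout pasted
    above.  The function names are made up for illustration, and the rule
    "OFF means no ratings" is my expectation rather than documented
    Scheduler behaviour.

```python
import re

def nodes_with_rating(show_stat_output):
    """Return {node: rating} for rows that end in a numeric rating.

    Assumes whitespace-separated rows laid out as in the listing above:
    node, version, start date, start time, jobs, jmax, log, pri, [rating].
    """
    rated = {}
    for line in show_stat_output.splitlines():
        fields = line.split()
        # Skip blank lines, short lines and the column header row.
        if len(fields) < 8 or fields[0] == "Node":
            continue
        # A ninth field counts as a rating only if it is numeric; the
        # "<-- Default" marker on the default node is therefore ignored.
        if len(fields) >= 9 and fields[8].isdigit():
            rated[fields[0]] = int(fields[8])
    return rated

def load_balancing_is_on(show_load_output):
    """Parse the 'Load Balancing is ON|OFF' line (assumed format)."""
    match = re.search(r"Load Balancing is (ON|OFF)", show_load_output)
    if match is None:
        raise ValueError("unrecognised 'sched show load' output")
    return match.group(1) == "ON"

# Sample data copied from the output shown above.
STAT = """\
Node   Version  Started              Jobs  Jmax   Log  Pri Rating
PORTOS V2.1B-7  22-JUN-1996 18:02:41    0    15     5    4 <-- Default
KERMIT V2.1B-7  22-JUN-1996 22:40:00    1    15     5    4    871
ARAMIS V2.1B-7  23-JUN-1996 02:42:16    0    15     5    4
GONZO  V2.1B-7  23-JUN-1996 03:02:44    0    15     5    4
"""
LOAD = "Load Balancing is OFF\n"

rated = nodes_with_rating(STAT)
if rated and not load_balancing_is_on(LOAD):
    print("Inconsistent: load balancing is OFF but ratings shown for",
          sorted(rated))
```

    Run against the output captured on each node, a check like this would
    flag any node that shows a rating while load balancing is reported as
    OFF.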

    Furthermore, to add to the confusion, each node's log file shows load
    balancing being turned ON at startup, but never OFF:

	.
	.
	.
$ Run NSCHED$:NSCHED.EXE
Nsched Version V2.1B-7  starting...
Setting Debugging OFF
Setting Job Max to  15
Setting Logging to  5
Setting Default Job Priority to  4
Setting Load Balancing ON            <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Setting Restart Params to CLEAR on job completition
Setting Remote Jobs DISABLED
Setting brkthru/notify wait to  300  seconds
%DCL-I-SUPERSEDE, previous value of NSCHED$MAILBOX has been superseded
%DCL-I-SUPERSEDE, previous value of NSCHED$TERM_MAILBOX has been superseded
CPUtype= 19  CPU_count= 2  Total pages= 669897   Meg= 342   VUPS= 64
Remote jobs not enabled
	.
	.
	.



    This problem has been uncovered in the process of investigating why
    some jobs in the Scheduler database have been scheduled and left in the
    requested state "Run" for hours (and in some cases days) while other
    jobs on the same nodes run OK.

    What is going on here?

    If anybody has seen this before or has any ideas or suggestions then I
    would be grateful for some assistance.

    Thanks in advance,

    Rich
  
    
T.R     Title                                   User          Personal Name  Date                   Lines
1123.1  weird - try turning it on (or off) ...  RUMOR::FALEK  ex-TU58 King   Mon Jul 01 1996 05:47  6
    What happens if you do
    
    $ sched set load on 
    
    (from an account with sysprv or oper priv)  ??