
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

3782.0. "TSAM hangs/RWMBX/Terminal Server Global Alarm Rules" by CUJO::HILL (Dan Hill-Net.Mgt.-Customer Resident) Tue Sep 22 1992 13:50

    I have a global alarm rule for verify reachability of about 40 terminal
    servers on our network (DS100s, DS200s, DS250s, DS300s).  The rule
    expression is as follows:
    
    (TERMINAL_SERVER * SELFTEST STATUS <> "NORMAL", AT EVERY 00:05:00)
    
    I have also tried :
    
    (TERMINAL_SERVER * SOFTWARE STATUS <> "NORMAL", AT EVERY 00:05:00)
    
    These alarm rules put the process which enables the rules into a
    ResourceWaitMBX state (RWMBX).
    
    I've noticed the same problems in other notes (3545, 2775, ...).
    
    Is there a fix for this problem yet?
    
    
    Also, is there a better way of determining terminal server
    reachability?
    
    -Dan
T.R  Title  User  Personal Name  Date  Lines
3782.1. "Happening here also..." by SIOG::TINNELLY (Consultancy for fee NOT free..) Tue Dec 01 1992 15:11, 15 lines
    
    Hello,
    
    I am currently on site at Dupont and am experiencing the same problems.
    I am getting processes going into RWMBX when I do something like 
    SHOW STATUS from the iconic map on any of the terminal servers. I
    thought it may have had something to do with the fact that the system
    was upgraded to VMS A5.5 since last week when everything seemed fine.
    The process going into RWMBX is the TS_AM_SRV process and the SYSTEM
    process.
    
    The only way out was to shut down and start up again, and everything is
    fine again for the moment. Any update on this one greatly appreciated.
    
    regards peter. 
3782.2. by TOOK::FONSECA (I heard it through the Grapevine...) Tue Dec 01 1992 17:46, 12 lines
It may be one of several problems:

If you have not installed the TSAM V1.0.2 kit referenced in note 3.???, then
you should.  But the CSC would have told you that.  You've talked to the
CSC right?

Secondly, TSM and TSAM have a known problem working under VMS 5.5-2
systems with FDDI controllers (MFA-0 for ethernet circuit.)  This has
been fixed for TSM, but the fix has not made it into TSAM yet.  So if your 5.5-2
system only has FDDI controllers, you are out of luck for now.

-Dave
3782.3. "v1.0.0" by SIOG::TINNELLY (Consultancy for fee NOT free..) Wed Dec 02 1992 09:10, 17 lines
3782.4. by TOOK::FONSECA (I heard it through the Grapevine...) Thu Dec 03 1992 20:30, 5 lines
Peter-

Go with the V1.0.2 kit, it will undoubtedly fix your problems...

-Dave
3782.5. "TS_AM V1.0.2 fixes RWMBX, but %MCC-E-NOENTITY prob exists." by CUJO::HILL (Dan Hill-Net.Mgt.-Customer Resident) Tue Dec 08 1992 03:24, 20 lines
    Using the same alarm rule mentioned in .0 I successfully monitored 10
    terminal servers for two hours with no RWMBX problems (every 3
    minutes).  I did, however, encounter another problem.
    
    Seems that if a terminal server is having a problem and the code
    bugchecks, status codes will be written in the SOFTWARE STATUS field. 
    That field is not long enough to contain the error information and the
    alarm rule is disabled with the following error:
    
    Software Logic Error, %MCC-E-NOENTITY, no corresponding entity instance
    exists.
    
    The SOFTWARE STATUS field should contain the following:
    PC=184EE0, SP=000400, SR=002700, MEM=000000, CODE=400
    
    DECmcc TSAM V1.0.2 shows only the following:    
    PC=184EE0, SP=000400, SR=002700, MEM=000000,
    
    
    -Dan
3782.6. by TOOK::FONSECA (I heard it through the Grapevine...) Tue Dec 08 1992 16:12, 10 lines
Dan,

I don't have an answer.  A quick search through the code seems to
indicate the only limit is 80 characters, but this is obviously
getting truncated somewhere.  I remember changing the limit for another
attribute (but in a place affecting all) from 40 to 80, but don't
recall testing the change.  Yet another thing for me to check on
when I work on TSAM again....

-Dave
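
As a quick arithmetic check on the truncation discussed in .5 and .6, the sketch below measures the two SOFTWARE STATUS strings exactly as they were posted in .5. It assumes the posted text is a faithful copy of what DECmcc displayed (any trailing whitespace lost in posting is not accounted for), and it says nothing about where in TSAM the cut happens.

```python
# Strings copied from note 3782.5; everything else here is illustrative.
expected = "PC=184EE0, SP=000400, SR=002700, MEM=000000, CODE=400"
shown = "PC=184EE0, SP=000400, SR=002700, MEM=000000,"

# The displayed value should be a strict prefix of the full value.
assert expected.startswith(shown)

missing = expected[len(shown):]

print("full length:   ", len(expected))   # 53
print("shown length:  ", len(shown))      # 44
print("missing suffix:", repr(missing))   # ' CODE=400'
```

The displayed value stops after 44 characters, which matches neither the 40-character nor the 80-character limit mentioned in .6, so the cut may be happening at some other boundary.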
3782.7. "rwmbx gone, but.." by SIOG::TINNELLY (Consultancy for fee NOT free..) Wed Dec 09 1992 09:48, 17 lines
    
    Dave,
    
    V1.0.2 seems to have fixed the RWMBX problems, thankfully. I
    am seeing the same data in the Software Status field discussed
    by Dan. However I am not sure that the particular terminal 
    server was experiencing any problems. I had to remove the 
    offending server from the domain, as the rule was being
    disabled, not a good solution.
    
    I am also experiencing problems(discussed in a later note) where
    the alarms seem to hang up, and when you try and do a SHOW 
    STATUS on any terminal server for example it just sits there
    with the clock being displayed forever. Sometimes the STOP
    directive will work, other times you have to restart DECmcc.
    
    many thanks Peter
3782.8. "Polling same servers in multiple domains.." by SIOG::TINNELLY (Consultancy for fee NOT free..) Wed Dec 09 1992 10:05, 13 lines
    
    Dave,
    
    One extra bit of info is that the way the domains are set up here,
    is that I have one main domain with all of the 126 terminal servers in
    it. We also have lots of other domains based on buildings with the 126
    servers spread out across  these domains. The rule is set up testing 
    for reachability of the servers in the main domain and also in the other
    domains . Does this cause a problem for TSAM?
    
    Grasping at anything...
    
    Peter
3782.9. "RWMBX still occurring" by SIOG::TINNELLY (Consultancy for fee NOT free..) Wed Dec 09 1992 11:52, 14 lines
    
    I take it all back, the RWMBX problems have just started happening 
    again. The MCC_TS_AM_SRV has it and some other processes.
    
    With regard to rules being disabled, I deleted the rules in the main 
    domain, so I would not have multiple rules polling the same T/S. 
    This did not change anything, when I do a  SHOW STATUS on the server
    it just hangs. Also the rules are still being disabled for some strange
    reason.
    
    I have a call logged, but any input greatly appreciated.
    This is getting embarrassing in front of a customer.
    
    regards peter.
3782.10. "No good news" by TOOK::FONSECA (I heard it through the Grapevine...) Wed Dec 09 1992 17:22, 16 lines
Peter,

I've let my management know that this problem is continuing to
go unsolved.  Right now I'm stretched just trying to meet the
schedule for TSM, and I just lost my only co-worker to the lay-off this
week....  I feel like all I can give you is excuses, and not help.

All I can suggest is to back off on the alarm frequency.  I don't think
that having alarms set up in different domains the way you do would
be the cause of the problem.  Even with the V1.0.2 update, I suspect
TSAM continues to have some fundamental problems in the threaded environment. :-(
(Which is after all what DECmcc is! )

Good luck!

-Dave
3782.11. "With the CSC" by BERE::TINNELLY (Consultancy for fee NOT free..) Thu Dec 10 1992 13:17, 11 lines
Dave,

Thanks for the help to date, I am working with the CSC on the problem, and
hopefully we can solve it. It was my understanding that TSM was the biggest
seller in the Network management space in Europe, and hopefully DECmcc/
TSAM would migrate into that position. It would seem like one person supporting
this area is a little bit thin on the ground, to say the least.

regards for now 
peter.
3782.12. "Window Hang Cont." by KERNEL::WARDJO Wed Dec 23 1992 11:17, 50 lines
    I have been working on this problem with Peter. 
    
    So far we have solved the RWMBX problem by increasing PQL DEFAULT BYTE
    LIM to 120000 (FROM 65000). However we are still getting the window
    hang when doing a show server status after enabling the alarm rules.
    
    There are about 57 rules that poll the server every 5 mins. Some of the
    system parameters are detailed below.
    
    As these seem ok and a show on a node4 entity is ok it looks like a 
    problem with the TSAM (FCL also hangs when looking at a terminal server).
    Is there a logical that can be defined to enable a dump of what is
    going on?
    
    Any other suggestions appreciated.
    
    Jon
    
    
    System info:-
    
    
      GBLPAGFIL 12200
      LOCKIDTBL 2999
      GBLSECTIONS 600
    
      Bytlm: 100000
    
    
              Summary of Local Memory Global Sections
    
          301 Global Sections Used, 23066/21934 Global Pages Used/Unused
    
      Nonpaged Dynamic Memory
    
      Current Size (bytes)    1204224
      Initial Size (NPAGEDYN) 1204224
      Maximum Size (NPAGEVIR) 4818432
    
      Paged Dynamic Memory
      Current Size (PAGEDYN)  1541632
      Free Space (bytes)      559296
    
      Paging File Usage (pages):          Free  Reservable   Total
    
      SWAPFILE.SYS                       32400       32400   32400
      PAGEFILE.SYS                      135055       20845  150000
    
    
    
3782.13. by TOOK::FONSECA (I heard it through the Grapevine...) Wed Dec 30 1992 16:18, 11 lines
There are a couple of debug flags you might try.  Try defining
MCC_TS_AM_LOG to "1C0".  These flags will only work with the latest
version V1.0.2, and will not show you much besides certain locks that
TSAM takes out before trying to connect to a server.  I'm not sure this will
reveal much that is useful.

Trying to push 57 rules through TSAM every 5 minutes sounds like you are
running it on the hairy edge.  Have you experimented with opening up the
alarm interval at all?

-Dave
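
To put rough numbers on "opening up the alarm interval", the sketch below compares the polling loads quoted in this thread. It is only back-of-the-envelope arithmetic: it assumes each rule (or server) results in exactly one request to TSAM per interval, which the thread does not confirm. A wildcard rule covering many servers would generate proportionally more. The function name is made up for illustration.

```python
def requests_per_minute(pollers, interval_seconds):
    """Average request rate if each poller fires once per interval (assumption)."""
    return pollers * 60.0 / interval_seconds

# Figures quoted in this thread; the one-request-per-poller model is assumed.
cases = [
    ("note .0:  40 servers every 5 min", 40, 5 * 60),
    ("note .5:  10 servers every 3 min", 10, 3 * 60),
    ("note .12: 57 rules every 5 min", 57, 5 * 60),
    ("note .14: 28 rules every 30 min", 28, 30 * 60),
]

for label, pollers, interval in cases:
    rate = requests_per_minute(pollers, interval)
    print(f"{label}: {rate:.2f} requests/min")
```

Under that assumption, going from 57 rules every 5 minutes to 28 rules every 30 minutes cuts the average rate by more than a factor of ten, so a hang that persists at the lower rate is probably not a matter of load alone.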
3782.14. "TSAM window hang - Update" by KERNEL::WARDJO Wed Jan 06 1993 14:37, 143 lines
    An update -
    
    I double checked the alarm rules and there are only 28 for servers. We
    increased the time interval to 30 mins and this made little improvement. 
    The window still hung but we were able to cancel out of it.
    Defining the logical just produced a series of lock_id's (as was
    suggested it would). The log file is below.
    
    Can anyone suggest anything else?
    
    
    Jon
    
    
       +------------------------------------------------------------+
       |  This is a private computer facility. Access to it for any |
       |  reason must be specifically authorised. If you are not so |
       |  authorised, your continued access and further inquiry may |
       |  expose you to criminal and/or civil proceedings           |
       +------------------------------------------------------------+
    Username: SYSTEM
    
    Password:
            Welcome to VAX/VMS version A5.5 on node MDOPCC
        Last interactive login on Tuesday,  5-JAN-1993 13:54
        Last non-interactive login on Tuesday,  5-JAN-1993 14:01
    
    X Toolkit Warning: not a valid window ID
    
    Acquire MCC_TS_AM_MBX lock_id = 196647
    
    Acquire PAD lock.  lock_id = 131108
    
    Release/Delete lock_id = 131108
    
    Release/Delete lock_id = 196647
    
    Routine <TSAM_DELETE_CONTEXT>
    
    PAD-lock release
    
    Release server lock
    
    Acquire MCC_TS_AM_MBX lock_id = 131105
    
    Acquire MCC_TS_AM_MBX lock_id = 65586
    
    Acquire PAD lock.  lock_id = 65587
    
    Acquire MCC_TS_AM_MBX lock_id = 131115
    
    Acquire PAD lock.  lock_id = 131113
    
    Release/Delete lock_id = 131113
    
    Release/Delete lock_id = 131115
    
    Routine <TSAM_DELETE_CONTEXT>
    
    PAD-lock release
    
    Release server lock
    
    Acquire MCC_TS_AM_MBX lock_id = 196653
    
    Acquire PAD lock.  lock_id = 196651
    
    Release/Delete lock_id = 196651
    
    Release/Delete lock_id = 196653
    
    Routine <TSAM_DELETE_CONTEXT>
    
    PAD-lock release
    
    Release server lock
    
    Acquire MCC_TS_AM_MBX lock_id = 262189
    
    Acquire PAD lock.  lock_id = 262187
    
    Release/Delete lock_id = 262187
    
    Release/Delete lock_id = 262189
    
    Routine <TSAM_DELETE_CONTEXT>
    
    PAD-lock release
    
    Release server lock
    
    Acquire MCC_TS_AM_MBX lock_id = 327723
    
    Acquire PAD lock.  lock_id = 196649
    
    Routine <tsam_mbx_rcvrqst>
    
    >>>>>>>>>>>>>>>>>>>>>mbiosb.io_w_status = 2096
    
    Release/Delete lock_id = 196649
    
    Release/Delete lock_id = 327723
    
    Routine <TSAM_DELETE_CONTEXT>
    
    PAD-lock release
    
    Release server lock
    
    Acquire MCC_TS_AM_MBX lock_id = 393259
    
    Acquire PAD lock.  lock_id = 262185
    
    Release/Delete lock_id = 262185
    
    Release/Delete lock_id = 393259
    
    Routine <TSAM_DELETE_CONTEXT>
    
    PAD-lock release
    
    Release server lock
    
    SYSTEM       logged out at  5-JAN-1993 15:31:11.95
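
    One way to read a log like this is to pair each "Acquire" with its
    "Release/Delete" and list whatever is left over. The sketch below does
    that for the first part of the log above. The line formats are inferred
    from this one capture only, and the function name is made up for
    illustration. Since the capture may be incomplete, an unmatched lock_id
    is a hint about where to look, not proof that TSAM leaked a lock.

```python
import re

# First part of the log posted above (blank lines omitted).
LOG = """\
Acquire MCC_TS_AM_MBX lock_id = 196647
Acquire PAD lock.  lock_id = 131108
Release/Delete lock_id = 131108
Release/Delete lock_id = 196647
Acquire MCC_TS_AM_MBX lock_id = 131105
Acquire MCC_TS_AM_MBX lock_id = 65586
Acquire PAD lock.  lock_id = 65587
Acquire MCC_TS_AM_MBX lock_id = 131115
Acquire PAD lock.  lock_id = 131113
Release/Delete lock_id = 131113
Release/Delete lock_id = 131115
"""

# Patterns inferred from the posted log, not from TSAM documentation.
ACQUIRE = re.compile(r"Acquire (.+?)\s+lock_id = (\d+)")
RELEASE = re.compile(r"Release/Delete lock_id = (\d+)")


def unreleased_locks(text):
    """Return {lock_id: lock_name} for acquires with no later release."""
    held = {}
    for line in text.splitlines():
        line = line.strip()
        acquired = ACQUIRE.match(line)
        if acquired:
            held[int(acquired.group(2))] = acquired.group(1).rstrip(".")
            continue
        released = RELEASE.match(line)
        if released:
            held.pop(int(released.group(1)), None)
    return held


for lock_id, name in unreleased_locks(LOG).items():
    print(f"no release seen for {name} lock_id {lock_id}")
```

    On this excerpt the leftover entries are lock_ids 131105, 65586 and
    65587, which is the one point in the log where the acquire/release
    pattern breaks.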
    
     
                                                                          
3782.15. "$ SET BADGE/EXPIRY_DATE=15-JAN-1993" by BERE::TINNELLY (Consultancy for fee NOT free..) Mon Jan 11 1993 09:25, 21 lines
Hello Folks,

Well I was hoping that this problem would be sorted before I depart
Digital Ireland on Friday, and leave a happy customer behind. I am sure
it will be sorted, I have great belief in Digital. Digital always was
and still is a great company to pull the stops out and fix the problems.

After 19.8 years working for Digital in Galway, Ayr Scotland and Dublin, I 
have great memories of working with excellent colleagues throughout the 
Corporation.

I would particularly like to say thank you to the excellent support I
have received from people in this notesfile.

I am confident that Digital will turn its future around and regain its
strong position in the computing market.

bye

Peter