Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

1430.0. "MCC_DNA4_EVL process crashes with LIB-F-SECINTFAI" by GIDDAY::BROOKS () Tue Sep 03 1991 00:47

I haven't seen any problems similar to this one, so I've entered this new note.

One of my customers is using DECmcc BMS V1.1. They are using alarms fairly 
extensively and have applied the latest V1.1 patches to the system.

A problem has occurred with the MCC_DNA4_EVL process terminating on two 
occasions. Refer to the logfile below:

 
$ manage/enter/presen=mcc_dna4_evl
Network object  is declared, Status = 52854793
Waiting for the event message from EVL.....
The connection with EVL is established.
** Unable to connect to NMCC  **
Ready to read the next event message...
Ready to read the next event message...
Ready to read the next event message...
Ready to read the next event message...
Ready to read the next event message...
Ready to read the next event message...
A fatal error ocuured when sending event = 410 to MCC event manager!
The EVL sink is terminated!
%LIB-F-SECINTFAI, secondary interlock failure in queue
  SYSTEM         job termianted at "date,time"

We can get the MCC_DNA4_EVL process restarted, but only after shutting down all
MCC processes. Otherwise the next event that fires aborts the MCC_DNA4_EVL
process with the same message.

I don't really know why the process is terminating. Any clues or suggestions
for tracking down this problem?

Regards,
Niguel Brooks  STL CSC.
T.R  Title  User  Personal Name  Date  Lines
1430.1  sounds like event manager Actually  TOOK::CALLANDER  Jill Callander DTN 226-5316  Thu Sep 05 1991 13:04  14
It sounds like the event manager, not the EVL process, is the problem.
The reason you have to shut down MCC is that the event manager is
owned by all MCC systems (the event manager only shuts down and cleans up
when ALL MCC sessions on a system have been terminated, including
background ones such as batch jobs). When you closed and restarted
MCC you reinitialized the event manager. There is a known leak in the
event manager that will cause problems if a large number of events are being
sent to the manager without MCC being brought down once in a while.
I believe some work was done to provide a patch to a few customers
so that the event pool could be increased (reducing the frequency of
the problem). Why don't you send mail to took::merrifield (Bob)
and see if he can help you get the patch if this is a big problem?

jill
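
To make the leak argument in .1 concrete, here is a toy Python sketch. It is
not DECmcc code: the EventPool class, the pool sizes, and the leak rate are all
hypothetical. It only shows why a larger pool postpones the failure rather than
removing it, and why a restart (a fresh pool) clears it.

```python
class EventPool:
    """Toy fixed-size pool. Hypothetical; not the DECmcc event manager."""

    def __init__(self, size):
        self.size = size
        self.in_use = 0

    def allocate(self):
        if self.in_use >= self.size:
            raise MemoryError("event pool exhausted")
        self.in_use += 1

    def release(self):
        self.in_use -= 1


def events_until_failure(pool_size, leak_every):
    """Send events until allocation fails.

    Every leak_every-th event keeps its slot forever (the assumed leak);
    all other events release their slot right after being sent.
    """
    pool = EventPool(pool_size)  # a fresh pool models a full MCC restart
    sent = 0
    while True:
        try:
            pool.allocate()
        except MemoryError:
            return sent
        sent += 1
        if sent % leak_every != 0:
            pool.release()
```

Under these assumptions, doubling the pool size doubles the number of events
handled before the failure, which matches the idea that a bigger pool reduces
the frequency of the problem without fixing it.
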
1430.2  interlock failure <> memory leak  TOOK::GUERTIN  Don't fight fire with flames  Thu Sep 05 1991 14:03  10
    Actually, the secondary interlock failure is not caused by a memory
    leak.  It indicates that you probably have a multiple CPU machine (a
    6000 perhaps?), and that the CPU cannot grab the low level interlock
    from the other CPUs.  I have only seen this happen when multiple MCCs
    are starting up at the same time (alarms, DECnet EVL, etc.).  The
    workaround is to stagger the MCCs so they start up at different times. In
    V1.2 we don't use VMS low level interlocks.
    
    -Matt.
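
    As an illustration of the failure mode described in .2, the Python
    sketch below inserts into a shared queue only if it can take an
    interlock within a bounded number of attempts. The retry limit, the
    exception name, and the function name are assumptions for
    illustration, not the VMS or DECmcc implementation.

```python
import threading
from collections import deque


class SecondaryInterlockFailure(Exception):
    """Hypothetical stand-in for a LIB-F-SECINTFAI style error."""


def insert_with_retries(queue, interlock, entry, retries=10):
    """Append entry to queue if the interlock can be taken within `retries` tries.

    The queue is left unmodified when every attempt fails.
    """
    for _ in range(retries):
        if interlock.acquire(blocking=False):  # non-blocking attempt
            try:
                queue.append(entry)
            finally:
                interlock.release()
            return
    raise SecondaryInterlockFailure(f"gave up after {retries} attempts")
```

    If another party holds the interlock for the whole retry window, as
    could happen when several processes hit the queue at once, the insert
    fails instead of waiting, which is why spreading out the contention
    would help.
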
    
1430.3  Still have not put a finger on it, but i'm close.  GIDDAY::BROOKS  Fri Sep 06 1991 03:33  34
Thanks for the quick replies. The information supplied has helped me to
understand the problem in more detail.

I have talked to the customer, who informed me that the problem has not recurred
for a few weeks now and everything seems fine. It's a little hard trying to 
diagnose the problem when it is not easily reproducible.

I should have helped you out a little more and told you that the MCC station is
running on a VS3100 M76, running VMS V5.4-2.

Re: 1430.1 
Jill,

The problem has not caused enough concern at this stage to request the patch
to increase the pool allocation. But it is good to know that there could be one
available if the frequency increases. I tend to believe the problem is directly
related to the EVENT MANAGER.

Re: 1430.2
Matt,

I'm not sure whether the problem has the same effect on a single processor 
machine. I have asked them, if the problem does recur, to shut down all 
processes and stagger the startup of each process to see whether that has any
effect on the running of MCC.

I guess we will wait to see if the problem is resolved in V1.2 for now.

Thanks,

Niguel Brooks STL CSC.
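
The staggered startup suggested in .2 amounts to starting each MCC process in
turn with a pause in between. A minimal Python sketch of that idea follows; the
process names and the 60 second gap in the usage line are hypothetical, and the
start callables stand in for whatever actually launches each process.

```python
import time


def staggered_start(starters, gap_seconds, sleep=time.sleep):
    """Run each (name, start) pair in order, pausing gap_seconds between them.

    `sleep` is injectable so the pause can be replaced in tests.
    Returns the names in the order they were started.
    """
    started = []
    for index, (name, start) in enumerate(starters):
        if index > 0:
            sleep(gap_seconds)  # no pause is needed before the first process
        start()
        started.append(name)
    return started
```

For example, staggered_start([("alarms", start_alarms), ("dna4_evl",
start_evl)], 60) would start the second process a minute after the first,
where start_alarms and start_evl are placeholders.
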