Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

1430.0. "MCC_DNA4_EVL process crashes with LIB-F-SECINTFAI" by GIDDAY::BROOKS () Tue Sep 03 1991 00:47

I haven't seen any problems similar to this one, so I've entered this new note.

One of my customers is using DECmcc BMS V1.1. They are using alarms fairly 
extensively and have applied the latest V1.1 patches to the system.

A problem has occurred with the MCC_DNA4_EVL process terminating on two 
occasions. Refer to the log file below:

 
$ manage/enter/presen=mcc_dna4_evl
Network object  is declared, Status = 52854793
Waiting for the event message from EVL.....
The connection with EVL is established.
** Unable to connect to NMCC  **
Ready to read the next event message...
Ready to read the next event message...
Ready to read the next event message...
Ready to read the next event message...
Ready to read the next event message...
Ready to read the next event message...
A fatal error ocuured when sending event = 410 to MCC event manager!
The EVL sink is terminated!
%LIB-F-SECINTFAI, secondary interlock failure in queue
  SYSTEM         job termianted at "date,time"

We managed to get the MCC_DNA4_EVL process restarted, but only after shutting
down all MCC processes. Otherwise, the next event that fires aborts the
MCC_DNA4_EVL process with the same message.

I don't really know why the process is terminating. Any clues or suggestions
for tracking down this problem?

Regards,
Niguel Brooks  STL CSC.

T.R	Title	User	Personal Name	Date	Lines
1430.1	sounds like event manager	TOOK::CALLANDER	Jill Callander DTN 226-5316	Thu Sep 05 1991 13:04	14
	Actually, it sounds like the event manager, not the evl process, is the problem. The reason you have to shut down MCC is that the event manager is owned by all mcc systems (the event manager only shuts down and cleans up when ALL mcc sessions on a system have been terminated, including the background ones like batch jobs and such). When you closed and restarted mcc you reinitialized the event manager.
	There is a known leak in the event manager that will cause problems if a large number of events are being sent to the manager without mcc being brought down once in a while. I believe that some work was done to provide a patch to a few customers so that the event pool could be increased (reducing the frequency of the problem). Why don't you send mail to took::merrifield (bob) and see if he can help you get the patch if this is a big problem.
	jill
1430.2	interlock failure <> memory leak	TOOK::GUERTIN	Don't fight fire with flames	Thu Sep 05 1991 14:03	10
	Actually, the secondary interlock failure is not caused by a memory leak. It indicates that you probably have a multiple CPU machine (a 6000 perhaps?), and that the CPU cannot grab the low level interlock from the other CPUs. I have only seen this happen when multiple MCCs are starting up at the same time (alarms, DECnet EVL, etc.). The workaround is to stagger the MCCs to start up at different times. In V1.2 we don't use VMS low level interlocks.
	-Matt.
1430.3	Still have not put a finger on it, but i'm close.	GIDDAY::BROOKS		Fri Sep 06 1991 03:33	34
	Thanks for the quick replies. The information supplied has helped me to understand the problem in more detail. I have talked to the customer, who informed me that the problem has not recurred for a few weeks now and everything seems fine. It's a little hard trying to diagnose the problem when it is not easily reproducible.
	I should have helped you out a little more and told you that the MCC station is running on a VS3100 M76, running VMS V5.4-2.
	Re: 1430.1 Jill, the problem has not caused enough concern at this stage to request the patch that increases the pool allocation, but it is good to know that one could be available if the frequency increased. I tend to believe the problem is directly related to the EVENT MANAGER.
	Re: 1430.2 Matt, I'm not sure whether the problem has the same effect on a single processor machine. I have asked them, if the problem does recur, to shut down all processes and stagger the startup of each process to see if that has any effect on the running of MCC.
	I guess we will wait to see if the problem is resolved in V1.2 for now.
	Thanks, Niguel Brooks STL CSC.
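
For readers unfamiliar with the failure mode described in 1430.2, here is a rough Python sketch of a bounded-retry interlock. It is not DECmcc or VMS code. It only illustrates the general idea that a writer tries a fixed number of times to take a low level lock on a shared queue and reports a failure, analogous to LIB-F-SECINTFAI, if some other party holds the lock throughout. The names insert_with_retry, SecondaryInterlockFailure, and retry_count are hypothetical and chosen for this example.

```python
import threading


class SecondaryInterlockFailure(Exception):
    """Hypothetical stand-in for a 'secondary interlock failure' status."""


def insert_with_retry(queue, lock, entry, retry_count=10):
    """Append entry to queue, giving up after retry_count failed lock attempts.

    Each attempt is non-blocking, so a lock that stays held by another
    party exhausts the retry budget instead of blocking forever.
    """
    for _ in range(retry_count):
        if lock.acquire(blocking=False):
            try:
                queue.append(entry)
                return
            finally:
                lock.release()
    raise SecondaryInterlockFailure(
        f"could not acquire interlock after {retry_count} attempts"
    )


if __name__ == "__main__":
    events = []
    interlock = threading.Lock()

    # Uncontended case: the insert succeeds on the first attempt.
    insert_with_retry(events, interlock, "event 410")

    # Contended case: something else holds the interlock the whole time.
    interlock.acquire()
    try:
        insert_with_retry(events, interlock, "event 411", retry_count=3)
    except SecondaryInterlockFailure as exc:
        print("insert failed:", exc)
    finally:
        interlock.release()
```

In this picture, staggering the startup of the MCC processes simply makes it less likely that several of them contend for the same interlock at the same moment, which would be consistent with the workaround suggested in 1430.2.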