Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

3018.0. "ULTRIX issues" by WELLIN::MCCALLUM () Mon May 18 1992 15:22

	
	I have done some demos of DECmcc for ULTRIX (1.2.7) on ULTRIX 4.2. It has
	crashed a couple of times, but these are some reproducible problems.

	1. If I start the iconic map with notification enabled with a 
	config file I get the following errors and notification has to be 
	restarted.
	
	In pop-up windows:

	Unexpected condition returned to notification pm
	Error trying to receive a packet

	and in the owning process:

	disaster IPC server unmarshalled input sanity check.

	I have to re-enable notifications to get it to work.
    
	2. When I start MCC and then try to start the data collector I get
	an error message saying:

	tmp: is a directory

	Data collector is already started

	But it doesn't work.

	If I then disable it and re-enable it, I get the same tmp message,
	but it starts and works. (I am in root (/) and tmp is a directory!)

	3. If I start the iconic map with an entity in the domain but not 
	in the map (because I couldn't save the map - the system crashed)
	then the system shows just the icon for the device not in the map
	rather than the map itself. This may be OK, but it is not how VMS behaves.
	To get the map I have to delete the entity from the domain and 
	re-open the domain.

	4. A performance rule on a DECnet circuit fails

	CREATE MCC 0  ALARMS RULE busy_in      -
  EXPRESSION = (NODE4 .dna_node.wlort1 circuit syn-0 inbound util > 5.0   ,-
                         AT EVERY 00:15, FOR START -0:00 DURATION=00:01 )   ,-
  PROCEDUR =  mcc_alarms_log_alarm.csh              ,-
  EXCEPTION HANDLER  = mcc_alarms_log_exception.csh                 ,-
  PARAMETER          = "NODE_ALARMS.LOG"                                     ,-
  CATEGORY           = "line utilisation too high"                           ,-
  PERCEIVED SEVERITY = CRITICAL                                             ,-
	in domain welwyn
        
	Works on VMS but not ULTRIX

	Gives error messages - software logic error and
	statistic requires two counter samples.

	This may be a syntax problem, but it works on VMS

	5. When you leave MCC it leaves several processes around. 
	Is this right ?  


	Rgds,

	Dave.
                                                                
3018.1. "Processes designed to stay up" by TOOK::MINTZ (Erik Mintz, DECmcc Development, dtn 226-5033) Mon May 18 1992 15:28

>	5. When you leave MCC it leaves several processes around. 
>	Is this right ?  
 
Offhand, I can answer this one; perhaps someone else can help with
the others.

On ULTRIX, each MM runs in its own process.  So once you dispatch to
a module (i.e., execute a command which requires the use of that MM),
it will run until you explicitly stop it with mcc_kill, or until
the system reboots.
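
A minimal sketch of how one might see which of these processes are still around after leaving MCC. It assumes the per-module process names contain the string "mcc"; the actual names are not given in this thread, and filter_mcc is a hypothetical helper, not part of the kit.

```sh
# List processes whose command line mentions "mcc".
# Assumption: the per-module MCC processes have "mcc" in their names.
filter_mcc() {
    # The [m] bracket keeps grep from matching its own command line.
    grep '[m]cc'
}

# BSD-style ps flags, as used on ULTRIX; adjust for other systems.
ps -ax | filter_mcc || echo "no MCC processes found"
```

Anything listed here would be a candidate for the mcc_kill cleanup mentioned above.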

-- Erik


3018.2. "more info" by DADA::DITMARS (Pete) Mon May 18 1992 16:14

	1. If I start the iconic map with notification enabled with a 
	config file I get the following errors and notification has to be 
	restarted.
	
	In pop-up windows:

	Unexpected condition returned to notification pm
	Error trying to receive a packet

	and in the owning process:

	disaster IPC server unmarshalled input sanity check.

What was in your notify request startup file?
How many domains do you have defined?
We have done work since t1.2.7 to reduce resource consumption.  This error
message has been associated with doing notification on many (hundreds of)
domains.

	4. A performance rule on a DECnet circuit fails

Try specifying the timespec using all three fields, i.e. 
               AT EVERY 00:15:00, FOR START -0:00:00 DURATION=00:01:00 )   ,-
Timespec parsing can be, unfortunately, different under the two operating 
systems.  Relying on defaults can get you in trouble.
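
As a rough illustration of "don't rely on defaults", the sketch below pads a two-field time value to three fields by appending ":00", the same change shown in the corrected line above. pad_timespec is a hypothetical helper, and it assumes a two-field value was meant to gain a trailing zero field, as in that example.

```sh
# Pad a two-field time value (a:b) to three fields (a:b:00).
# Values that already have three fields, or only one, pass through unchanged.
pad_timespec() {
    case "$1" in
        *:*:*) printf '%s\n' "$1" ;;
        *:*)   printf '%s:00\n' "$1" ;;
        *)     printf '%s\n' "$1" ;;
    esac
}

pad_timespec 00:15      # interval from the original rule
pad_timespec -0:00      # start offset from the original rule
pad_timespec 00:01      # duration from the original rule
```

The padded values are the ones to paste into the rule so that both operating systems parse the same fields.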

3018.3. "more info" by WELLIN::MCCALLUM () Tue May 19 1992 09:03
    
    I have about 20-30 domains and  my notify startup file
    contains two lines
    
    Notify Domain .WORLD
    Notify Domain welwyn entity list = (collector number1), events=(any event)
    
    This time when I restarted it I got
    AES_COPY
    Bad AES had version: 1
    
    AES_bomb
    Exception in mgmt module id-10 CMA code 177db005
    
    It started the second time
    
    The alarm rule now works with the full time specification - thanks.
    
    The data collector starts normally if my current directory is not the
    root partition (/), which contains a tmp directory.
    
    Dave.
    
    
3018.4. "Operating system patches ?" by WELCLU::MCCALLUM () Tue May 19 1992 09:14
    
    I'd better check I've got all the ULTRIX patches on - could it be
    related ?
3018.5. "I suspect resources..." by DADA::DITMARS (Pete) Tue May 19 1992 21:51

As far as applying the ultrix patches: yes, you definitely should.
But, the ultrix kernel patches typically prevent system crashes and launched
application annoyances... they haven't been shown to correct "minor" 
notification problems (yet).

The aes_bomb thing could just be another symptom of the fact that you're
running out of some kind of resource during communication between the
notification FM (mgmt module 10) and another FM.  My estimate of how
many domains it would take to exhaust resources is not relevant if your
system is mega-under-configured.

You should check that your kernel is configured as specified in the 
release notes and planning/installing ultrix guide.  If your version of the 
guide doesn't contain information on how the kernel should be configured, 
let us know and we'll post it.  What is max_nofile in your
/sys/conf/mips/param.c?  The guide recommended 256, but locally folks have been
recommending much higher numbers, and the next version of the guide will too.
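
A small sketch for checking the current setting, using the path quoted above. show_max_nofile is a hypothetical helper; it only prints the matching lines and makes no assumption about how the value is declared in param.c.

```sh
# Print every line of the given file that mentions max_nofile,
# prefixed with its line number.
show_max_nofile() {
    grep -n 'max_nofile' "$1"
}

# Path as quoted in this reply; it may differ on other architectures.
PARAM_C=/sys/conf/mips/param.c
if [ -r "$PARAM_C" ]; then
    show_max_nofile "$PARAM_C"
else
    echo "cannot read $PARAM_C"
fi
```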

Are you doing development or testing of any management modules other than the 
BMS kit that we produce?  Is it possible that these mgmt modules haven't
been linked against the t1.2.7 kernel?
3018.6. "I will check" by WELCLU::MCCALLUM () Thu May 21 1992 16:17
    
    I have lots of swap space and max_nofile is 256. We made the mods as
    per the 1.2.4 installation notes, but I must admit we have changed them,
    so if there is anything different for 1.2.7 that may be it. I will check.
    
    This problem only happens on first startup, i.e., when there are no MCC
    processes already running. If I exit and re-enter it works OK.
    
    Dave
3018.7. "#2 was a bug, fixed" by TOOK::JEAN_LEE () Wed May 27 1992 13:14
    
    Reply to 3018.0 #2 question:
    
    	Yes, it was a bug. It was addressed and fixed.
    
    	Thank you for your input.
    
    	Jean