
Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

1203.0. "ALARMS LOG FILE - NEED A DOCTOR" by ZPOVC::KHENGLIM () Mon Jul 01 1991 15:12

ALARMS LOG FILE - NEED A DOCTOR
===============================


I have this problem to report.  I have three alarms log files;
BRIDGE_ALARMS.LOG for all the bridges,
NODE_ALARMS.LOG for all the Phase IV nodes,
TS-ALARMS.LOG for all terminal servers.


SAMPLE OF THE ALARM RULE PROCEDURE
----------------------------------
CREATE MCC 0 ALARMS RULE SIN999_NODE_UNREACHABLE               -
 EXPRESSION        = (NODE4 .MCC.SRPC.SIN999 STATE <> ON      ,-
                      AT EVERY 00:30:00)                      ,-
 PROCEDURE         = MCC_COMMON:MCC_ALARMS_LOG_ALARM.COM      ,-
 EXCEPTION HANDLER = MCC_COMMON:MCC_ALARMS_LOG_EXCEPTION.COM  ,-
 PARAMETER         = "NODE_ALARMS.LOG"                        ,-
 CATEGORY          = "Node unreachable"                       ,-
 DESCRIPTION       = "SIN999 NODE UNREACHABLE"                ,-
 QUEUE             = "ALARMS$BATCH"                           ,-
 PERCEIVED SEVERITY= MAJOR                                    ,-
 IN DOMAIN         = .MCC.SRPC_DOMAIN 



The following commands were used to enable the various alarm rules
for bridges, Node4 entities, and terminal servers:

$MANAGE/ENTERPRISE
MCC>DO ENABLE_DECBRIDGE.COM
MCC>DO ENABLE_NODE4.COM
MCC>DO ENABLE_TS.COM


I also started the DECmcc Iconic Map in a separate instance beforehand.
The version of DECmcc in use is V1.1.

Suppose two of the Node4 entities in the Iconic Map then turn red
because of critical conditions.  Only one of the alarms is saved to
the NODE_ALARMS.LOG file.  I ran various tests and confirmed that if
the two alarm rules fire 30 seconds or more apart, both alarms are
saved to NODE_ALARMS.LOG, whereas if the two rules fire immediately
one after the other, only one of the alarms is saved.

The same behavior occurs with bridges and with terminal servers.
I also ran another test: two alarm rules fired simultaneously, but each
logged to a different log file.  That worked fine; each alarm was
captured in its own log file.

I can temporarily work around this problem by using a different log
file for each alarm rule.  The drawback is that far too many alarm log
files are created.


Has anyone experienced similar problems before?  

Are there any better solutions or alternative plans of action?

All feedback is most welcome.


THANKS IN ADVANCE


T.R  Title  User  Personal Name  Date  Lines
1203.1. "Yes, it is a problem" by TOOK::ORENSTEIN () Mon Jul 01 1991 17:41  20 lines
    
    Yes, this is a known problem.
    
    If rules fire at the same time, and use the same logfile, 
    the logfile is opened on a first-come-only-served basis.  
    There is no retry built in if successive rules cannot open
    the file.
    
    I don't think that this will be fixed in V1.2, and with
    the Map being so snazzy, the ALARMS command procedures are
    becoming a smaller focus.
    
    As for things to do, you could edit the sample command
    procedure to make it retry, or write the message to a unique
    temporary file and then append it to the logfile.
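
    A minimal sketch of the retry idea, not the shipped sample
    procedure.  It assumes, hypothetically, that the log file name
    arrives in P1 and the message text in P2; the real parameter layout
    of MCC_ALARMS_LOG_ALARM.COM may differ.  It also assumes the log
    file already exists, since OPEN/APPEND fails otherwise.  The retry
    count and the wait interval are arbitrary.

```dcl
$! Sketch only: P1 (log file name) and P2 (message text) are assumed
$! parameters, not the documented interface of the sample procedure.
$ log_name = P1
$ attempts = 0
$ max_attempts = 10
$try_open:
$! If another job holds the file, control passes to OPEN_FAILED.
$ OPEN/APPEND/ERROR=open_failed log_chan 'log_name'
$ WRITE log_chan F$TIME(), " ", P2
$ CLOSE log_chan
$ EXIT
$open_failed:
$ attempts = attempts + 1
$! Give up after a fixed number of attempts so the job cannot loop forever.
$ IF attempts .GE. max_attempts THEN EXIT
$ WAIT 00:00:02
$ GOTO try_open
```

    The temporary-file alternative would need the same kind of retry
    around the final append, because that step also has to open the
    shared logfile.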
    
    PS.  I'll make sure this reaches the next round of release
         notes.
    
    aud...

1203.2. "Use a single job batch queue" by NSSG::R_SPENCE (Nets don't fail me now...) Mon Jul 01 1991 18:46  15 lines
    I believe that the recommended workaround is to run the alarm
    procedures that write to a log file through a batch queue that has
    a job limit of one.  That way, if two alarms fire at the same
    time, each procedure will run and close the log file before the
    next procedure opens it.
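
    A hedged sketch of that queue setup, using the ALARMS$BATCH queue
    name from the rule in 1203.0.  Which of the two forms applies
    depends on whether the queue already exists at your site, and
    either one normally needs operator privilege.

```dcl
$! Sketch only: create and start a batch queue that runs one job at a time.
$ INITIALIZE/QUEUE/BATCH/JOB_LIMIT=1/START ALARMS$BATCH
$! If the queue already exists, change its job limit instead.
$ SET QUEUE/JOB_LIMIT=1 ALARMS$BATCH
$! Check the resulting settings.
$ SHOW QUEUE/FULL ALARMS$BATCH
```

    With a job limit of one, alarm jobs submitted at the same moment
    wait in the queue and run one after another, so each has the
    log file to itself.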
    
    To the developers:
     I understand why the focus is on the map. However, it might help to
    simply document the above recommendation (I got it from engineering
    myself) in both the procedures and perhaps the manual.
    
    As far as the broadcast and mail procedures go, they seem to be happy
    with multiple copies running at the same time.
    
    s/rob

1203.3. "Thanks for the memory ..." by TOOK::ORENSTEIN () Tue Jul 02 1991 12:42  8 lines
    
    Thanks for the suggestion.  I will see that it is clearly documented
    in both the procedure and the manual.
    
    And yes, the broadcast and mail procedures have no problems because
    the data is written to unique temporary files before being mailed
    or sent.
    
    aud...

1203.4. "It is already in the Alarms manual" by DFLAT::PLOUFFE (Jerry) Tue Jul 02 1991 17:53  12 lines
re: -.1

  > Thanks for the suggestion.  I will see that it is clearly documented
  > in both the procedure and the manual.

  I'm not so sure how clear it is, ;) but...

  This point is already documented in the manual.  See page 4-2 (Log
  Notification for Alarm Conditions) and 4-4 (Log Notification for Error 
  Conditions).

                                                                  - Jerry