
Conference noted::sns

Title:POLYCENTER System Watchdog for VMS OSF/1 ULTRIX HP-UX AIX SunOS
Notice:Wishes:406,FAQ:845,Kits-VMS:1000,UNIX:694 VMS ECO01 FT kit: 521
Moderator:AZUR::HUREZZ
Created:Fri May 15 1992
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1033
Total number of notes:4584

990.0. "$sense wat show event/conti not update the time stamp of event" by 23111::WILLIAMCHAN () Thu Jan 30 1997 02:45

Hello,

	I've seen this both at a customer site and in our own MIS dept.,
	and I can also reproduce it on my machine. I would like to know
	whether it is related to our settings or whether it really needs
	a patch.

	Thanks.

The consolidator settings are shown below; the polling interval is set to 60.

HTSC04::Williamchan > sens wat

Copyright (c) 1994 Digital Equipment Corporation. All Rights Reserved
SNS> show conso/fu
Controller       : V2.2-02

Consolidator     : 6601 V2.2-02
Profile          : DISK$USER_01:[WILLIAMCHAN]HTSC04.DAT;18
Log file         : SNS$LOG.DAT  Disabled
Action routines  : Enabled
DECtalk          : Enabled
Mailbox          : Enabled
Polling interval : 60
Before setting   : Not specified
Since setting    : Not specified
Watchdog information:
   Node    Status    Class            Version  OS Version
  HTSC04  Enabled   HTSC04            V2.2-02  VMS V6.1
  HGOSPS  Enabled   DEFAULT           ??.??


I've created an event so that, if the pwrk$monitor process is gone, I'll
receive a mail.

SNS$EDIT> show node htsc04/all
Node : HTSC04
  Class           : HTSC04
  External class  : DEFAULT
  Time difference : +00:00:00
  Transport       : DECNET
  DFS event codes : DSS
  Exclusions list :
    Event code           Device name
    SWL                  CSA*
  Processes list :
    Process              Uic                                          Interval
    pwrk$monitor         [1,4]                                        00:10:00



The show class output is:
SNS$EDIT> show class htsc04
Class : HTSC04
  Event                        Priority                Options
  CPU CPU errors                High
                                                 Action Routine : E-MAIL
                                                 Mailbox        : HTSC04
                                                 Severity level : FATAL
  MEM Memory errors             High
  DSK Disk errors               High
  ETH Ethernet errors           High
  HSC HSC problems              High
  CIC CI cable problems         High
  PRS Printers stalled          Low
  LOP Processes looping         High
  DNF Disks near full           High
  SHS Shadow set Problems       High
  DSS Disk state Problems       High
  DQP Device queue problems     <- Not checked
  BQP Batch queue Problems      <- Not checked
  QCP Queue manager Problems    High
  PRO Missing Processes         High
                                                 Action Routine : E-MAIL
  BAT Missing Batchjobs         High
  ILL Login limits too low      High
  SMP Processors stopped        High
  UNR Nodes unreachable         High
  ORS Nodes out of resources    High
  UNK Nodes unknown             Low
  TIM Time consistency          Low
  WDM No SNS server             Low
  OTH Connection Problems       Low
  SNS SNS internal messages     Low
  SWL Software write locked     Low
  DMM Disabled memory           Low
  VAL Validation error          Low

With SNS> show event/conti, the time stamp always shows the time of the
first poll that detected the event; it is not updated at the interval
that I have defined.

For example:

The display shows:

	system watchdog: 30-jan-1997 12:37

	High-priority message

30-jan 11:59 hgosps unreachable

30-jan 11:58 htsc04 process pwrk$monitor with uic=[1,4] is missing





Further information below:

SNS$EDIT> show all
Profile name              : DISK$USER_01:[WILLIAMCHAN]HTSC04.DAT;18
Profile version           : 1.0-0

CONSOLIDATOR Parameters
  Polling interval        : 60
  Action routines switch  : ON
  DECtalk switch          : ON
  Mailbox switch          : ON

DISPLAY Parameters
  Active window           : Both
  Highlight time          : 00:05:00
  Scrolling               : ON

Node : HGOSPS
  Class           : DEFAULT
  External class  : DEFAULT
  Time difference : +00:00:00
  Transport       : DECNET
  DFS event codes :
  Exclusions list :
    Event code           Device name
    SWL                  CSA*

Node : HTSC04
  Class           : HTSC04
  External class  : DEFAULT
  Time difference : +00:00:00
  Transport       : DECNET
  DFS event codes : DSS
  Exclusions list :
    Event code           Device name
    SWL                  CSA*
  Processes list :
    Process              Uic                                          Interval
    pwrk$monitor         [1,4]                                        00:10:00

Class : DEFAULT
  Event                        Priority                Options
  CPU CPU errors                High
  MEM Memory errors             High
  DSK Disk errors               High
  ETH Ethernet errors           High
  HSC HSC problems              High
  CIC CI cable problems         High
  PRS Printers stalled          Low
  LOP Processes looping         High
  DNF Disks near full           High
  SHS Shadow set Problems       High
  DSS Disk state Problems       High
  DQP Device queue problems     <- Not checked
  BQP Batch queue Problems      <- Not checked
  QCP Queue manager Problems    High
  PRO Missing Processes         High
  BAT Missing Batchjobs         High
  ILL Login limits too low      High
  SMP Processors stopped        High
  UNR Nodes unreachable         High
  ORS Nodes out of resources    High
  UNK Nodes unknown             Low

  TIM Time consistency          Low
  WDM No SNS server             Low
  OTH Connection Problems       Low
  SNS SNS internal messages     Low
  SWL Software write locked     Low
  DMM Disabled memory           Low
  VAL Validation error          Low

Class : HTSC04
  Event                        Priority                Options
  CPU CPU errors                High
                                                 Action Routine : E-MAIL
                                                 Mailbox        : HTSC04
                                                 Severity level : FATAL
  MEM Memory errors             High
  DSK Disk errors               High
  ETH Ethernet errors           High
  HSC HSC problems              High
  CIC CI cable problems         High
  PRS Printers stalled          Low
  LOP Processes looping         High
  DNF Disks near full           High
  SHS Shadow set Problems       High
  DSS Disk state Problems       High
  DQP Device queue problems     <- Not checked
  BQP Batch queue Problems      <- Not checked
  QCP Queue manager Problems    High
  PRO Missing Processes         High
                                                 Action Routine : E-MAIL
                                                 Mailbox        : HTSC04
                                                 Severity level : WARNING
  BAT Missing Batchjobs         High
  ILL Login limits too low      High
  SMP Processors stopped        High
  UNR Nodes unreachable         High
  ORS Nodes out of resources    High
  UNK Nodes unknown             Low
  TIM Time consistency          Low
  WDM No SNS server             Low
  OTH Connection Problems       Low
  SNS SNS internal messages     Low
  SWL Software write locked     Low
  DMM Disabled memory           Low
  VAL Validation error          Low

External messages class : DEFAULT
     Match String              Priority                Options
  1  *                          Low
Action routine set : E-MAIL
  Action routine mode : SPAWN
  VMS command         : mail miss.dat williamchan
  Logfile name        : SYS$MANAGER:SNS_EDITOR.LOG
  Logfile switch      : ON

Mailbox set : HTSC04
  Mailbox name : williamchan



	I wonder why the display didn't show the current status of the
	detected event. For example, if the polling interval is 10 minutes,
	then the event window should look like:

	30-JAN 11:00 HGOSPS unreachable                                   
	30-JAN 11:00 HTSC04 Process pwrk$monitor with UIC=[1,4] is missing

	and then, as each further 10 minutes passes, it should become:

	30-JAN 11:00 HGOSPS unreachable                                   
	30-JAN 11:00 HTSC04 Process pwrk$monitor with UIC=[1,4] is missing
	30-JAN 11:10 HGOSPS unreachable                                   
	30-JAN 11:10 HTSC04 Process pwrk$monitor with UIC=[1,4] is missing
	30-JAN 11:20 HGOSPS unreachable                                   
	30-JAN 11:20 HTSC04 Process pwrk$monitor with UIC=[1,4] is missing
	30-JAN 11:30 HGOSPS unreachable                                   
	30-JAN 11:30 HTSC04 Process pwrk$monitor with UIC=[1,4] is missing
	30-JAN 11:40 HGOSPS unreachable                                   
	30-JAN 11:40 HTSC04 Process pwrk$monitor with UIC=[1,4] is missing
	30-JAN 11:50 HGOSPS unreachable                                   
	30-JAN 11:50 HTSC04 Process pwrk$monitor with UIC=[1,4] is missing

	and so on?

	Thanks for any hints.

Best regards,
William
990.1. "Timestamp doesn't evolve upon unchanged event messages" by AZUR::HUREZ (Connectivity & Computing Services @VBE. DTN 828-5159) Thu Jan 30 1997 06:56 (12 lines)
    The timestamp denotes the time at which the event was detected for
    the first time, or the time at which it was last updated (because of
    a change in its status details, e.g. a disk had 3 errors and now has
    4, a queue was stopping and now is stopped, etc.).
    
    In the process-missing case, there's no update.  At each poll time,
    the Consolidator gets the same process-missing message and therefore
    marks it as unchanged and does not act on it again.
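
    To make the rule above concrete, here is a small Python sketch of
    the behaviour as described in this reply. It is only a toy model,
    not the Consolidator's actual code: the class and method names
    (DisplayedEvent, EventWindow, report), the key layout and the disk
    name DKA100 are all assumptions made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class DisplayedEvent:
    timestamp: str  # time shown in the event window
    details: str    # status text reported for the event


class EventWindow:
    """Toy model (hypothetical): one entry per (node, event code, subject)."""

    def __init__(self):
        self.entries = {}

    def report(self, key, details, poll_time):
        """Record one poll result; return 'new', 'updated' or 'unchanged'."""
        current = self.entries.get(key)
        if current is None:
            # First detection: the timestamp is the time of this poll.
            self.entries[key] = DisplayedEvent(poll_time, details)
            return "new"
        if current.details != details:
            # Status details changed, so the timestamp moves forward.
            current.timestamp = poll_time
            current.details = details
            return "updated"
        # Same message as at the last poll: keep the original timestamp.
        return "unchanged"


window = EventWindow()

# Missing process: the message is identical at every poll.
missing = ("HTSC04", "PRO", "pwrk$monitor")
results = [
    window.report(missing, "process is missing", t)
    for t in ("11:58", "12:08", "12:18")
]
print(results, window.entries[missing].timestamp)

# Disk errors (hypothetical device name): the details change between polls.
disk = ("HTSC04", "DSK", "DKA100")
window.report(disk, "3 errors", "11:58")
window.report(disk, "4 errors", "12:08")
print(window.entries[disk].timestamp)
```

    In this model the missing-process entry keeps its 11:58 timestamp
    across all three polls, while the disk entry moves to 12:08 once
    its error count changes. That matches the single, unchanging line
    seen in the event window in .0.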
    
    Best Regards,
    
    	-- Olivier.