
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

3768.0. "alarm scheduler" by CTHQ::WOODCOCK () Fri Sep 18 1992 13:31

Hi there,

I'm looking for a little better understanding of how the alarms scheduler 
works. I have noticed a couple of characteristics in the past. The first is
that most times the POLLED time is quite close to what one would think to be 
the SCHEDULED time. That is, if the interval is every half hour it polls very
close to every half hour. The other observation is when the system can't handle
the poll period. Example: poll 100 objects every 5 minutes but it takes 12
minutes to poll all the objects. How does MCC handle this? Does it just keep
scheduling every 5 minutes and skip all those which can't be honored or does
something else happen? The example (from above) would be that it makes the
first poll (which takes 12 min.) then starts the next poll 3 minutes later (the
15 minute mark). Is this true??

thanks for any help,
brad...

3768.1. "alarms scheduling is very simple (?)" by MOLAR::ROBERTS (Keith Roberts - Network Management Applications) Fri Sep 18 1992 14:24

  Brad,

  If you tell alarms to run every 30 minutes, it polls for data when
  you enable the rule .. and asks the Information Manager to keep
  track of the time and let alarms know when the scheduled time is up,
  then alarms polls again.

  The first poll requires the MM that alarms calls to be loaded into
  memory .. this might take a second or two.  Then, accessing the
  entity might take many seconds or minutes.  The time stamp alarms
  returns is when alarms actually evaluates the rule.  There is also
  a timestamp from the polled entity .. which is usually a tiny bit
  different from the timestamp alarms returns.

  In your example (100 entities polled every 5 minutes) ... If the poll
  takes longer than the scheduled interval, the Information Manager returns
  a 'time already passed' error to alarms.  Alarms ignores this error
  (up to 10 times, I think) and just keeps calling the IM until the call
  goes through .. in your example the timing would be:

	schedule time = at every 00:05:00

	time 00:00:00	Alarms processes the 100 rules - takes 12 minutes
	time 00:10:00	Missed this one
	time 00:15:00	The first call took 12 minutes, so the IM waits
			for the *next* scheduled time.  This poll also takes 12 minutes
	time 00:20:00	Missed this one
	time 00:25:00	Missed this one too.
	time 00:30:00	The second poll took 12 minutes so we missed the 20
			and 25 minute schedule .. alarms polls again.

  Yuck .. so, your 5 minute scheduling turned into 15 minutes scheduling.
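
  The timeline above can be sketched in a few lines of Python.  This is
  only an illustration of the behavior described in this note, not MCC
  code: simulate_polls and its parameters are made-up names, and it
  assumes that a poll finishing exactly on a schedule mark does not
  cause that mark to be missed.

```python
def simulate_polls(interval_min, poll_duration_min, horizon_min):
    """Return (started, missed) schedule marks, in minutes, up to horizon_min.

    Assumed model, taken from the timeline in this note: a poll starts on
    a schedule mark; marks that pass while it is still running are
    skipped, and the next poll starts on the first mark at or after the
    time the running poll finishes.
    """
    started, missed = [], []
    busy_until = 0  # minute at which the poll in progress finishes
    for mark in range(0, horizon_min + 1, interval_min):
        if mark >= busy_until:
            started.append(mark)
            busy_until = mark + poll_duration_min
        else:
            missed.append(mark)
    return started, missed


started, missed = simulate_polls(interval_min=5, poll_duration_min=12,
                                 horizon_min=30)
print("polled at:", started)  # [0, 15, 30]
print("missed at:", missed)   # [5, 10, 20, 25]
```

  With a 12 minute poll, only every third 5 minute mark is honored, which
  is the effective 15 minute period.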

  I thought this was going to make things clearer - but rereading it, I
  don't know.  Did this help?

  /keith

3768.2. "understood, but what if" by CTHQ::WOODCOCK () Fri Sep 18 1992 14:51

Hi keith,

>  Yuck .. so, your 5 minute scheduling turned into 15 minutes scheduling.
>
>  I thought this was going to make things clearer - but rereading it, I
>  don't know.  Did this help?

Yes sir it does, but for a sanity check I added a couple of words to the 
example. Does it still look accurate?


	schedule time = at every 00:05:00

	time 00:00:00	Alarms processes the 100 rules - takes 12 minutes
>	time 00:05:00	Missed this one
	time 00:10:00	Missed this one
>	time 00:15:00	The first call took 12 minutes, so the IM waits
			for the *next* scheduled time.  The next scheduled time
			is the 15 minute marker so alarms polls again here.
			This poll also takes 12 minutes.
	time 00:20:00	Missed this one
	time 00:25:00	Missed this one too.
	time 00:30:00	The second poll took 12 minutes so we missed the 20
			and 25 minute schedule .. alarms polls again.

What happens when an interval is missed? Does the user see the error in any
form (alarm is in batch)? Comments: I certainly wouldn't want to get an
exception for every wildcarded entity polled within the rule (which I don't 
think is the case), but I would like to see a single error somewhere, somehow, 
someshape. Does an internal event get generated? If the alarm is DOMAIN xxx 
RULE yyy will the following rule fire if an interval gets missed?: 

expr=(occurs(domain xxx rule * OSI RULE EXCEPTION))

many thanks,
brad...

3768.3. "No notification is made if a time interval is skipped" by MOLAR::ROBERTS (Keith Roberts - Network Management Applications) Fri Sep 18 1992 16:08

>What happens when an interval is missed? Does the user see the error in any
>form (alarm is in batch)? Comments: I certainly wouldn't want to get an
>exception for every wildcarded entity polled within the rule (which I don't 
>think is the case), but I would like to see a single error somewhere, somehow, 
>someshape. Does an internal event get generated? If the alarm is DOMAIN xxx 
>RULE yyy will the following rule fire if an interval gets missed?: 
>
>expr=(occurs(domain xxx rule * OSI RULE EXCEPTION))

  Nope 8{

  The only time you'll get an indication that a time interval has been 
  missed is if there are 10 in a row (I think the number is 10).

  So, if you said 'at every 00:00:01' ... and it took 11 seconds to
  process the rule, you'll get the "time already passed" exception back
  from the evaluation process via, I believe, the "Error Condition"
  status attribute.
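
  A rough Python sketch of that rule, for illustration only.  It is not
  MCC code: next_poll_time, TimeAlreadyPassed and MAX_MISSES are
  hypothetical names, and the limit of 10 is the guess stated above, not
  a confirmed value.

```python
MAX_MISSES = 10  # assumed limit; the note only says "I think the number is 10"


class TimeAlreadyPassed(Exception):
    """Hypothetical stand-in for the 'time already passed' exception."""


def next_poll_time(last_mark, interval, now, max_misses=MAX_MISSES):
    """Return (next schedule mark, number of marks skipped), in seconds.

    Steps forward one interval at a time from last_mark.  Every mark that
    is already in the past counts as a miss; reaching max_misses misses
    in a row raises instead of being silently ignored.
    """
    mark = last_mark + interval
    misses = 0
    while mark < now:
        misses += 1
        if misses >= max_misses:
            raise TimeAlreadyPassed(f"{misses} intervals missed in a row")
        mark += interval
    return mark, misses


# 5 minute interval, 12 minute poll: two marks are skipped without any report.
print(next_poll_time(last_mark=0, interval=300, now=720))  # (900, 2)

# 1 second interval, 11 second evaluation: ten marks in a row are missed.
try:
    next_poll_time(last_mark=0, interval=1, now=11)
except TimeAlreadyPassed as err:
    print("exception:", err)
```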

  Now you probably want to know:

  (Q) "How can I find out when a time interval gets skipped? (other than
      when 10 of them are skipped)"

  (A) You can't that I can think of.

  Might be nice if Alarms maintained a counter attribute: "Time Intervals
  Missed".  If you saw this counter going up regularly you'd know there
  was a problem.  Could even imagine getting PA involved to calculate
  statistics.
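
  To make the idea concrete, here is a small Python sketch of such a
  counter.  Alarms has no such attribute today; RuleStats, record_poll
  and intervals_missed are invented names, and the sketch assumes polls
  always start on a schedule mark.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    """Hypothetical per-rule bookkeeping for a 'Time Intervals Missed' counter."""
    interval: int              # scheduling interval, in minutes
    last_mark: int = 0         # schedule mark at which the last poll started
    intervals_missed: int = 0  # running total of skipped schedule marks

    def record_poll(self, start_mark):
        """Record a poll starting at start_mark and count the marks skipped."""
        skipped = (start_mark - self.last_mark) // self.interval - 1
        self.intervals_missed += max(skipped, 0)
        self.last_mark = start_mark


stats = RuleStats(interval=5)
for mark in (15, 30):          # polls started at the 15 and 30 minute marks
    stats.record_poll(mark)
print(stats.intervals_missed)  # 4 (the 5, 10, 20 and 25 minute marks)
```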

  /keith

3768.4. "Intervals missed=good idea" by CTHQ::WOODCOCK () Fri Sep 18 1992 17:35

>  (A) You can't that I can think of.

Bummer eh... 

>  Might be nice if Alarms maintained a counter attribute: "Time Intervals
>  Missed".  If you saw this counter going up regularly you'd know there
>  was a problem.  Could even imagine getting PA involved to calculate
>  statistics.

Interfacing to PA probably isn't worth it but "Time Intervals Missed" would
be a useful counter for figuring out what's amiss when alarms aren't coming
when you think they should be.

thanks,
brad...