
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

1336.0. "Alarms on differences of counters" by BERN01::GMUER () Wed Aug 14 1991 07:19

Hi, 

A customer wishes to set alarms on differences of counters between two
polls, e.g.

  <value (current poll)> - <value (last poll)>  <rel. op.>  <constant>

This is a mix between the comparison and the change_of rule format. It
would be a simple way to set alarms on performance or error rates for
TCP/IP and Ethernet net devices.
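
A minimal sketch of the requested rule, written in Python rather than in MCC rule syntax. It only illustrates the evaluation order (remember the last poll, subtract, compare against a constant). The names `REL_OPS`, `delta_alarm` and `DeltaRule` are hypothetical and are not part of DECmcc.

```python
import operator

# Hypothetical mapping from a relational operator string to a comparison.
REL_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    "<>": operator.ne,
}


def delta_alarm(last_value, current_value, rel_op, constant):
    """Return True when (current - last) <rel_op> constant holds."""
    delta = current_value - last_value
    return REL_OPS[rel_op](delta, constant)


class DeltaRule:
    """Remembers the previous poll so each new poll can be compared to it."""

    def __init__(self, rel_op, constant):
        self.rel_op = rel_op
        self.constant = constant
        self.last_value = None  # no previous poll yet

    def poll(self, current_value):
        """Feed one polled counter value; return True if the rule fires."""
        fired = False
        if self.last_value is not None:
            fired = delta_alarm(self.last_value, current_value,
                                self.rel_op, self.constant)
        self.last_value = current_value
        return fired
```

The first poll never fires because there is no earlier value to subtract from it.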

Thank you for any help and regards

Edgar 

T.R  Title  User  Personal Name  Date  Lines
1336.1. "example please" by TOOK::STRUTT (Management - the one word oxymoron) Wed Aug 14 1991 22:14 (6 lines)
    Without doubting for a minute the usefulness of what your customer is
    asking, please could you provide an example of where this would be
    useful.  It always helps....
    
    Thanks in advance
    Colin
1336.2. "An example" by BERN01::GMUER () Thu Aug 15 1991 09:19 (16 lines)
Re. 1

The customer wants to manage Wellfleet Routers. There is no statistics
partition in the SNMP AM, but we could test some counters:

Examples:    SNMP xxxxxx INTERFACE 1 IfInOctets
             SNMP xxxxxx INTERFACE 1 IfInErrors

But these are absolute values. The customer is interested in the delta
values of the counters, to answer questions of the following type:

     If we have more than 10000000 IfInOctets/hour then give an alarm.
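
As a sketch of how such an hourly threshold could be checked from two polls (Python, hypothetical function names, not an MCC feature): the delta is scaled from the poll interval to one hour. It assumes the counter is a 32-bit SNMP counter that wraps at 2^32 and that it wraps at most once between polls.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit SNMP counter wraps back to 0 here


def counter_delta(last_value, current_value, modulus=COUNTER32_MODULUS):
    """Difference between two polls, allowing for a single counter wrap."""
    return (current_value - last_value) % modulus


def hourly_rate(last_value, current_value, interval_seconds):
    """Scale the delta seen over one poll interval to a per-hour rate."""
    delta = counter_delta(last_value, current_value)
    return delta * 3600 / interval_seconds


def octets_alarm(last_value, current_value, interval_seconds,
                 limit_per_hour=10_000_000):
    """True when the extrapolated hourly rate exceeds the limit."""
    return hourly_rate(last_value, current_value, interval_seconds) > limit_per_hour
```

With a 5-minute poll, a delta of 1000000 octets extrapolates to 12000000 per hour and would raise the alarm; a delta of 500000 would not.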

I hope this helps,

Edgar
1336.3. "TCP/IP PA in V1.2" by DELNI::R_PAQUET () Tue Aug 20 1991 17:34 (4 lines)
    
    
    	In V1.2 of DECmcc, there will be a TCP/IP Performance Analyzer FM. 
    This FM will provide statistics based on SNMP counters.
1336.4. "" by BSYBEE::EGOLF (John C. Egolf LKG2-2/T02 x226-7874) Tue Aug 20 1991 23:23 (5 lines)
	Point of clarification...

	There won't be a TCP/IP PA FM; the TCP/IP statistics will be
	a part of the current DECnet and bridge PA.  There will be one
	PA FM for all the above objects.
1336.5. "sorry, not in 11" by TOOK::CALLANDER (Jill Callander DTN 226-5316) Mon Aug 26 1991 15:50 (12 lines)
    As to the original question: no, you cannot do what your customer
    is asking in V1.1. I wish I could even give you a hack, but none
    comes to mind. In V1.2, even without statistics, there could be a
    way using the new occurs function coming in alarms (fire if an event
    occurs x number of times in n number of minutes). For V1.1,
    I am sorry, there is nothing unless you want to write a quick plug-in
    FM to test the value and generate your own event on the change
    (the yourmm example code in the toolkit would make this a feasible
    answer if this is really a hot button with a potential customer).
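
    For illustration only, the "x times in n minutes" idea can be sketched as a sliding-window count. This is Python with a hypothetical class name, and it is an assumption about how such a rule might behave, not the V1.2 implementation.

```python
from collections import deque


class OccursRule:
    """Fires when at least `count` events fall inside the last `window_seconds`."""

    def __init__(self, count, window_seconds):
        self.count = count
        self.window_seconds = window_seconds
        self.timestamps = deque()  # event times still inside the window

    def record(self, timestamp):
        """Record one event (time in seconds); return True if the rule fires."""
        self.timestamps.append(timestamp)
        # Drop events that have fallen out of the window.
        while self.timestamps[0] <= timestamp - self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.count
```

    Such a rule cannot fire before the x-th event has arrived.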
    
    jill
    
1336.6. "I need this too!" by HOTWTR::MURRAY_RU () Mon Oct 21 1991 20:53 (24 lines)
    This is an old note, but I have the same problem.  In fact I think this
    is a MAJOR hole in MCC!  I have a segment in a building that sees
    occasional CRC errors, on the order of 1-15 a day.  I would like to be
    able to alarm if the number of CRC (or bad frame etc.) errors is greater
    than n in any poll cycle.  The majority of the capability is already
    there.  The *,* syntax to denote a change implies that MCC knows the
    old number and the current number.  It would sure be nice if I could
    indicate the old value plus some constant as the alarm level.  Having
    to hard-code numbers for errors is awful!  
    
    Even the new function of being able to track an error by the number of
    times that it occurs, while better, implies that I have to wait n
    polling cycles in order to detect the problem.
    
    I am currently working with LANbridge 200s.  The real functionality
    that I need is to detect error conditions within a short period of
    time.  If there is indeed no direct way to do this at this time, does
    anybody have a suggestion of how to do it?  *,* works, but only at the
    expense of seeing many, many false alarms.
    
    
    
    Thanks for any help or direction
    Russell Murray
1336.7. "rule against statistic???" by JETSAM::WOODCOCK () Tue Oct 22 1991 11:16 (19 lines)
Hi Russell,

Have you tried using an alarm against a *statistic*? I don't play
with bridges so I'm not sure what is available for stats, but in the node4
world the PA module gives a 'count' of certain errors as an attribute. The
value or count received is for each given polling period. MCC theory states 
you should be able to write an alarm against any of these attributes.

A node4 example would be:

expression=(node4 FOOBAR circ SYN-0 COUNT OF FORWARDING CONGESTION LOSS >0,-
            at every 00:05:00)

This rule would check to make sure there was no congestion loss for each 5 
minute interval.

best regards,
brad...

1336.8. "Yes, but the statistic always changes." by HOTWTR::MURRAY_RU () Tue Oct 22 1991 15:00 (17 lines)
    HI,
    What you say is true.  However, most of the counters that I have dealt
    with in bridges and VAXes are absolute numbers that increment
    over time.  They are not relative to anything except the last time they
    were zeroed.
    The number I am currently working with in an attempt to solve my
    problem is BAD FRAMES RECEIVED as reported by the LANbridge 200 to MCC. 
    This part works very well.  However, this morning the count was 145. 
    This afternoon it will probably be about 153.  Thus if I say alarm if
    the count is greater than 160, I will have to go reset the alarm
    tomorrow, and again 2 days after that, to some other number.  I guess I
    could reset the counters every day, but that is very traumatic on a bridge
    as the only way to do it is to reset the entire bridge.  This goes
    through power-up diags and generally drops all active connections.
    
    Thanks for your suggestion,
    Russell Murray
1336.9. "another explanation" by JETSAM::WOODCOCK () Tue Oct 22 1991 16:56 (34 lines)
>    What you say is true.  However, most of the counters that I have dealt
>    with in bridges and VAXes are absolute numbers that increment
>    over time.  They are not relative to anything except the last time they
>    were zeroed.

Hi,

I hear you, and I think you misunderstood what I suggested. Counters are
absolute and I agree they are only relative to the total time since
zeroed. But *STATISTICS* are not; they are relative to the poll interval.
For instance, I see an algorithm in the PA manual (pg. C3) for bad frames.
It seems to be a percentage computation. You *might* be able to write an
expression like this:

(bridge <name> line <name> bad frames >0,at every 00:02:00)

This expression would poll every two minutes and calculate the bad frame %
for *each* two-minute interval. If the % is >0 for any 2-minute interval (i.e. you
received bad frames during that interval ONLY), the alarm fires. The PA
module should get the delta of the counters from the beginning of the
interval to the end of the interval, and then calculate the percentage
against the delta counters. Somebody please step in if I'm all wet on my
understanding of this. There are also a few other stats which might be
appropriate for this particular job. They are listed on C3 in the PA
appendix, or simply do a (not positive of syntax):

 SHOW BRIDGE <bridge> LINE <line> ALL STATISTICS, for duration 0:0:30

to get a quick idea of what is available (but double-check the formulas
to make sure you get what you want). Other candidates are Inbound Frames
Lost, Outbound Frames Lost, Transmit Frames Lost, etc.

hoping I'm right, and hoping this helps,
brad...
1336.10. "Why PA?" by BERN01::GMUER () Mon Nov 11 1991 06:58 (13 lines)
re .3, .7, .9

All replies to the problem of setting alarms on delta values of counters
point to performance analyzer modules as the solution. I would like to
point out the simple fact that PA modules are not generic. We can get
statistics for DECnet, bridges and soon for TCP/IP, but what shall we do
with all the other entities in the world?

I do not see why such a simple idea should not be implemented
in DECmcc. It is an extension of the CHANGE_OF rule expression and solves
a lot of simple performance or error-rate problems in a generic way.

Edgar
1336.11. "I'll agree with that" by JETSAM::WOODCOCK () Mon Nov 11 1991 12:39 (18 lines)
You're right. Actually I think what I suggested really doesn't do much
more than today's change_of function anyway, because I used "0" as the
stat. It would be quite helpful if the ALARMS team could implement:

change_of({entity}{attrib}  > 10,at every {interval})

which is what the base note requested to begin with. Using PA gives a little
more control than change_of if the value is raised somewhat above 0, but
a lot of the time the percentages can be misleading because of low
utilization of the circuit. For example, a circuit with 10 errors
will show a higher error percentage at low utilization than
a more heavily utilized link, although there are still the same 10 errors. That's
why I initially recommended looking for a relevant 'count of' stat,
which accomplishes the above expression using PA. I agree, though, that a general
solution that does not use PA would be more appropriate for the future.
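
A small worked comparison of the two kinds of threshold, as a Python sketch. The percentage formula (errors divided by frames seen in the interval) and the traffic figures are assumptions for illustration; the actual PA formulas should be checked in the PA manual.

```python
def error_percent(errors, frames):
    """Errors as a percentage of the frames seen in the interval (assumed formula)."""
    return 100.0 * errors / frames if frames else 0.0


# Two hypothetical circuits with the same 10 errors in one poll interval.
quiet = {"errors": 10, "frames": 200}      # lightly used circuit
busy = {"errors": 10, "frames": 100_000}   # heavily used circuit

quiet_pct = error_percent(**quiet)
busy_pct = error_percent(**busy)

# A percentage rule (> 1%) fires only on the quiet circuit,
# while a count rule (> 5 errors per interval) fires on both.
percent_rule = [pct > 1.0 for pct in (quiet_pct, busy_pct)]
count_rule = [c["errors"] > 5 for c in (quiet, busy)]
```

The count-based rule treats both circuits alike, which is the behaviour the delta-of-counter request is after.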

best regards,
brad...
1336.12. "Ideas?..." by TOOK::ORENSTEIN () Tue Nov 12 1991 17:24 (18 lines)
    
    I will file this in the SUGGESTION box for the next round of
    development.  It seems to be important to people.
    
    Would anyone like to suggest some nice syntax for the kind of
    function you are looking for?  Although many of the features
    suggested are great, it is tricky to figure out how to make
    them usable from a command-line/windows point of view...
    
    One thing we have in mind is this:
    
    CHANGED_OF (entity attribute_group, value1, value2)
    
    This would let you create one rule for a change within an attribute
    group.  This function would probably be limited to some number of
    datatypes -- like the ones we already support :)
    
    aud
1336.13. "Suggestion for the syntax" by BERN01::GMUER () Mon Nov 18 1991 12:04 (7 lines)
Another idea for the syntax would be

  CHANGE_OF(entity attribute,rel_operator,value)

It has the same syntax as the already known CHANGE_OF rule.

Edgar
1336.14. "Binary Expression Evaluation Routines" by MOLAR::ROBERTS (Keith Roberts - DECmcc Toolkit Team) Mon Nov 18 1991 12:43 (20 lines)
The Alarms Rule Evaluator should work just like a programming language; that
is, you should be able to build more complex expressions by combining existing
functions.

  (node4 rome maximum address > 50)

This type of expression generates a boolean result; True or False.
It should be thought of as:

  if ((node4 rome maximum address > 50) = True) then ...

So, with the change_of function, you would like to write:

  change_of( (node4 rome maximum address > 50), *, *)

This would notify you if the boolean result *changes* value.  If the value
went from 40 to 60, you'd get one notification.  While it stayed above 50
you wouldn't get any more notifications ... until it goes below 50 again.
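
The notification behaviour described above can be sketched in Python as an edge-triggered check on a boolean. This illustrates the proposed direction only; `BooleanChangeRule` is a hypothetical name, not an existing Rule Evaluator feature.

```python
class BooleanChangeRule:
    """Notifies only when the boolean result of a predicate changes value."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.last_result = None  # no evaluation yet

    def poll(self, value):
        """Evaluate the predicate; return True if its result changed."""
        result = self.predicate(value)
        changed = self.last_result is not None and result != self.last_result
        self.last_result = result
        return changed


# Example: the equivalent of (maximum address > 50) polled four times.
rule = BooleanChangeRule(lambda maximum_address: maximum_address > 50)
notifications = [rule.poll(v) for v in (40, 60, 70, 45)]
```

Here the move from 40 to 60 notifies once, 70 is silent, and the drop to 45 notifies again.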

/keith
1336.15. "Better make my self clear" by MOLAR::ROBERTS (Keith Roberts - DECmcc Toolkit Team) Mon Nov 18 1991 14:22 (9 lines)
re: .14

I didn't mean to say that today you can write a change_of function which
allows an expression as the first argument, because you can't.

I did want to describe, however, the direction that I'd like to see
the Rule Evaluator take.

/keith
1336.16. "Nice idea, but ..." by BERN01::GMUER () Tue Nov 19 1991 07:17 (12 lines)
re .14 .15

I like Keith's suggestion. It shows the way to go. But the construct

>>>  change_of( (node4 rome maximum address > 50), *, *)

does not solve our problem. We need an alarm on the delta of a counter,
so the difference between the first and second values of the counter should
be evaluated.

Edgar