Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

3385.0. "Rule firing, but not correctly" by HANNAH::B_COBB () Mon Jul 20 1992 11:08

    When I create a rule that states:
    
    
    (node4 noon remote node noon state = unreachable, at every 00:05:00)
    
    And the severity is critical...  When the node goes down, instead of
    the rule firing with a RED critical severity, the rule fires with an
    exception of "Node not currently accessible" which has a severity of
    indeterminate.  Shouldn't the rule fire with a critical severity?  After
    all, the node is not reachable anymore.  Did I construct the rule
    correctly?  The node I am testing is not a router, but an end node.
    
    Thanks for any help
    
    Bill 
3385.1. "No, it's really an exception..." by TOOK::MINTZ (Erik Mintz, dtn 226-5033) Mon Jul 20 1992 11:29
When you try to determine the attribute (node4 A remote node B state ...)
the DNA4 AM contacts node "A", and requests the information about
node "B".  In your case, since "A" and "B" are the same, when the node
goes down, the AM is unable to read the information, and returns an exception.
The alarms FM is then acting on the exception.  The alarms FM has no
protocol specific information that would allow it to realize that
the exception indicates the condition for which you were testing.

The long term solution to this problem is for the DNA4 AM to return
a synthesized "reachability" attribute, like the "IPreachability" provided
by the SNMP AM.  In that case, the AM can try to communicate with a node,
and then translate the resulting exception into an attribute, since
the AM has protocol specific information about what attributes should mean.

In the short term, your best bet is to use different values for "A" and "B"
(that is, essentially, ask some other node whether "noon" is reachable).
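
To illustrate the long-term idea, here is a minimal Python sketch. It is not DECmcc code: `NodeUnreachable`, `read_state`, `reachability` and the toy `network` table are all hypothetical names. It only shows an access module catching its own poll failure and returning a value that a rule can compare, instead of passing the exception up to the alarms layer.

```python
class NodeUnreachable(Exception):
    """Hypothetical stand-in for the "Node not currently accessible" exception."""


def read_state(node, network):
    """Pretend to poll a node directly; raises if the node does not answer."""
    if not network.get(node, False):
        raise NodeUnreachable(node)
    return "on"


def reachability(node, network):
    """Synthesized attribute: a failed poll becomes a value, not an exception."""
    try:
        read_state(node, network)
    except NodeUnreachable:
        return "unreachable"
    return "reachable"


network = {"noon": False, "dusk": True}   # True means the node answers polls
print(reachability("noon", network))
print(reachability("dusk", network))
```

With an attribute like this, a rule such as "reachability = unreachable" could fire with whatever severity the rule author chose.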

-- Erik

3385.2. "" by HANNAH::B_COBB () Mon Jul 20 1992 11:52
    Thanks for the answer.  Should the node to be queried be a routing
    node or can it be any end node?
    
    Thanks
    
    Bill
3385.3. "End nodes only cache?" by TOOK::MINTZ (Erik Mintz, dtn 226-5033) Mon Jul 20 1992 13:12
I believe that end nodes have a cache of reachability information for those
nodes which they have tried to reach recently.

-- Erik
3385.4. "ask a router" by CTHQ1::WOODCOCK () Mon Jul 20 1992 15:18
You would want to ask a ROUTER IN THE SAME AREA about the reachability of
another node. An end node's routing database only contains an entry for its
designated router and therefore won't tell you what you want to find out.

best regards,
brad...
3385.5. "Works, but seems unreliable" by HANNAH::B_COBB () Mon Jul 20 1992 16:00
    Yes, I have tried this with a level 4 routing node.  It does not seem to
    be too reliable.  I set up a reachability rule for node X and brought node
    X down.  The routing node still showed node X as reachable.  I waited 
    and it still was reachable.  I had to physically try to set host to 
    node X before the routing node declared it unreachable.  I saw the
    adjacency for node X drop right away on the routing node's console,
    but node X was still listed as "reachable" until I tried to contact
    it.  This seems a bit hokey.  If a node becomes unreachable, I want 
    my rule to fire as soon as possible.  
    
    Any comments on how to make this a bit quicker?  How does everyone else
    handle NODE4 reachability?
    
    Thanks,
    
    Bill  
3385.6. "exceptions or events" by CTHQ3::WOODCOCK () Mon Jul 20 1992 18:26
Hi Bill,

Yes, you are correct with your testing. I went thru the same scenario when
MCC first came out and found that routers can take up to 5 minutes lag time
before a remote node is changed to unreachable. This is an anomaly of DECnet
which can't be avoided. You can use a couple of methods for determining
reachability more quickly. The first is to poll the node directly and let
exception handling fire if the poll fails.

expression=(node4 xxxxxx buffer size<>576)

The above expression should never fire except when the poll fails. Actually
what I use is a dual purpose alarm with an expression of:

expression=(node4 * circuit * substate <>none)

This polls all node4's in a domain and ensures all the circuits are up. Also,
if any node goes down we'll get an exception and get notified anyway.

You could also use DECnet events to find out what is going down. You have
already seen how the adjacency events are very quick to be generated. This is
due to the fact that these events are from the DATA LINK layer and not the
ROUTING layer of DECnet.

expression=(occurs(node4 xxxx adjacent node * adjacency down))  syntax??

Be careful with the above expression because once you set up the event sink
you will be receiving adjacency events for all nodes adjacent to the router
sending the events to you. These events can cause a heavy load on your system,
especially if you're using alarms against them and there are a lot of adjacencies.
You also need TARGETs set up to highlight the proper node.

best regards,
brad...
3385.7. "" by HANNAH::B_COBB () Mon Jul 20 1992 19:54
    Thanks for the help.  I was polling nodes themselves with:
    
    (node4 mynode remote node mynode if state = unreachable, at every XX:XX)
    
    
    This works nicely, but you get an exception instead of a rule fire and
    you get the indeterminate severity and its color.  I wish we could just
    poll a node and if it does not answer, then fire a rule of your choice.
    
    I also do not like the idea that when the rule/exception fires, the
    routing icon changes color instead of the "problem" icon.  But that
    is another issue discussed elsewhere in the conference.  I do not 
    think I want to sink events because of what you mentioned about the
    machine getting pounded with them.  As it is my machine slows down with 20
    rules running and I have all of these NML links (from the rules I
    believe for some reason) that create logfiles galore.  I am still
    trying to figure out my strategy with MCC, but it looks like it is
    going to be more difficult than I thought.  
    
    
    Thanks for the help.
    
    Bill
3385.8. "Not a pretty solution, but ...." by MLNCSC::BARILARO () Tue Jul 21 1992 10:28
Hi Bill,

	I had the same problem with Node4 reachability with MCC v1.1 and
	also v1.2. As other people wrote before, there isn't an easy way to
	get both the correct notification and the correct graphical information.
	
	When I started using MCC I wanted something where the icon is green
	when a node is up and red when it's down.

	The way I do it is to ask a node4 directly for something that is
	always true when the node is up. For example, I use syntax like:

	expression   		= (Node4 xxxx state = ON, at every 00:10)
	Perceived Severity	= Warning (green)

	So, every xx minutes you receive a WARNING alarm, and when the
	node is down you receive an exception. With MCC v1.1 this was
	"quite" fine because the exception was linked to the critical
	color (RED). With v1.2 (the version you probably have) it's not
	the same: the icon takes the indeterminate color (Light Blue, I
	think), so I had to modify the ALARM colors in the CUSTOMIZE
	window and associate the red color with the INDETERMINATE alarm.

	I agree it isn't a clean or very intelligent solution, but ....
        
	One big problem with this kind of rule (I use the same logic,
	asking for something that is always true, also for Stations and
	Bridges) is that when the node4 is up you receive a WARNING alarm
	every xx minutes, and when it's down you receive an EXCEPTION
	every xx minutes. There is nothing you can do about this if you
	want to use the NOTIFICATION window: you'll receive an alarm
	every xx minutes. But if you want to use the mail or broadcast
	command files, it's possible to modify them so that you receive
	only ONE mail (or broadcast) when the node goes down and another
	when it comes back up.

	You should modify the standard command files to check for the
	existence of a flag: if the flag exists, exit without sending the mail.
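
The flag idea can be sketched in a few lines of Python. This is only an illustration of the logic, not the modified DCL command files: `notify_once`, the flag file naming and the `send` callback are assumptions made for the example.

```python
import os
import tempfile


def notify_once(node, is_up, flag_dir, send):
    """Send one message per up/down transition, using a flag file as memory."""
    flag = os.path.join(flag_dir, f"{node}.down")
    if not is_up and not os.path.exists(flag):
        open(flag, "w").close()          # remember that "down" was reported
        send(f"{node} is DOWN")
    elif is_up and os.path.exists(flag):
        os.remove(flag)                  # forget the outage
        send(f"{node} is UP again")
    # otherwise nothing changed since the last report: stay silent


sent = []
with tempfile.TemporaryDirectory() as flag_dir:
    # Six polling intervals: up, down three times in a row, then up twice.
    for is_up in (True, False, False, False, True, True):
        notify_once("xxxx", is_up, flag_dir, sent.append)
print(sent)
```

Six evaluations produce only two messages, one for the outage and one for the recovery.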
	
	If you want, I could send you these modified command files.

				Hope this helps,
				Ciao Luciano
                                                                      
	P.S. 	As usual, sorry for my English
    
3385.9. "Needs to be figured out." by SKIBUM::GASSMAN () Tue Jul 21 1992 11:37
    The concept of entity availability needs to be addressed.  There should
    be an alarm when the availability changes from reachable to
    unreachable, and other problems such as "network partner exited",
    "invalid password", etc. should continue to be 'indeterminate'.  There
    should not need to be continuous alarms each time the rule is
    re-evaluated, as that degrades the importance of each individual alarm.
    Since most SNMP managers are optimized for this - it's important that
    the MCC support community figure out a way to simulate this feature in
    V1.2, and then support it in V1.3. 
    
    bill
3385.10. "" by HANNAH::B_COBB () Tue Jul 21 1992 12:48
    I agree with .9; however, does MCC engineering feel that the current
    way is acceptable, or are they going to look into a better way?
    
    Any comments?
3385.11. "One of many things we'd like to fix" by TOOK::MINTZ (Erik Mintz, dtn 226-5033) Tue Jul 21 1992 12:58
DECmcc engineering recognizes the limitations of the current situation.
Of course, there are many things that we feel need improvement.
If you feel this should be higher priority than some other improvements,
you could provide that information to product management so that our
requirements are prioritized correctly.

3385.12. "" by HANNAH::B_COBB () Tue Jul 21 1992 13:05
    Fair enough..  Thanks for the help and responses.
    
    Bill
3385.13. "why fire every interval?" by CTHQ3::WOODCOCK () Wed Jul 22 1992 14:22
Hi Ciao/Bill,


>	The way I do it is to ask a node4 directly for something that is
>	always true when the node is up. For example, I use syntax like:
>
>	expression   		= (Node4 xxxx state = ON, at every 00:10)
>	Perceived Severity	= Warning (green)
>
>	One big problem with this kind of rule (I use the same logic,
>	asking for something that is always true, also for Stations and
>	Bridges) is that when the node4 is up you receive a WARNING alarm
>	every xx minutes, and when it's down you receive an EXCEPTION
>	every xx minutes. There is nothing you can do about this if you
>	want to use the NOTIFICATION window: you'll receive an alarm
>	every xx minutes. But if you want to use the mail or broadcast
>	command files, it's possible to modify them so that you receive
>	only ONE mail (or broadcast) when the node goes down and another
>	when it comes back up.

Why have it FIRE every interval?? If you use:

(node4 * state=off, at every yy)

the rule only fires an exception when the node is down. Using this method set 
severity INDETERMINATE to RED like you have now and set your DEFAULT ICON color
to GREEN. This would keep your notification window clean for the REAL problems.

On the subject of reachability, every AM should be STRONGLY RECOMMENDED to
provide a reachability attribute (whether it's real or simulated). This is a
must for managing anything...

best regards,
brad...    

ps. your english is probably better than most :-)
3385.14. "Not quite..." by TOOK::MCPHERSON (Life is hard. Play short.) Wed Jul 22 1992 14:52
>Why have it FIRE every interval?? If you use:
>
>(node4 * state=off, at every yy)
>
>the rule only fires an exception when the node is down. Using this method set 
>severity INDETERMINATE to RED like you have now and set your DEFAULT ICON color
>to GREEN. This would keep your notification window clean for the REAL problems.
 
    Ummm... I don't think so, Brad.

    If the NODE4's state truly is OFF, then the rule won't be able to evaluate
    (it's using DECnet/NML to get the attribute, remember?) and you'll get an
    EXCEPTION of severity indeterminate...

    /doug

3385.15. "right, what he said" by CTHQ3::WOODCOCK () Wed Jul 22 1992 17:23
Hi Doug,


>>>Why have it FIRE every interval?? If you use:
>>>
>>>(node4 * state=off, at every yy)
>>>
>>>the rule only fires an exception when the node is down. Using this method set 
>>>severity INDETERMINATE to RED like you have now and set your DEFAULT ICON color
>>>to GREEN. This would keep your notification window clean for the REAL problems.
 
>    Ummm... I don't think so, Brad.
>
>    If the NODE4's state truly is OFF, then the rule won't be able to evaluate
>    (it's using DECnet/NML to get the attribute, remember?) and you'll get an
>    EXCEPTION of severity indeterminate...
>
>    /doug

Right, that's the idea. This rule will NEVER fire unless the node is 
unreachable via DECnet, and then it's an exception. Default icon=green (it's 
up), indeterminate=red (it's down). The theory is to poll any attribute which 
WON'T fire an alarm unless the node is down. State=off is probably the best 
example of using this method for simple reachability.
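
The logic of this trick can be mimicked with a toy Python model. Nothing here is DECmcc code: `PollFailed`, `evaluate`, `icon_color` and the `network` table are hypothetical, and the color mapping simply mirrors the settings described above (default icon green, indeterminate remapped to red).

```python
class PollFailed(Exception):
    """Hypothetical stand-in for the exception raised when a node does not answer."""


def evaluate(node, network):
    """Evaluate a rule like (node4 <node> state = off) against a toy network.

    `network` maps node name -> state reported by a live node; a node that is
    down is simply absent and cannot be polled.
    """
    if node not in network:
        raise PollFailed(node)
    return network[node] == "off"   # a node that answers never reports "off"


def icon_color(node, network):
    """Map the outcome of one rule evaluation to an icon color."""
    try:
        fired = evaluate(node, network)
    except PollFailed:
        return "red"                # exception (indeterminate) remapped to red
    return "rule-severity" if fired else "green"   # green is the default icon color


network = {"alpha": "on"}           # "beta" is down, so it is not in the table
print(icon_color("alpha", network))
print(icon_color("beta", network))
```

The rule's own severity is never used in practice; all the work is done by the exception path.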

Confused??? Good!!! :-) So are the customers trying to implement MCC and hence 
the need for a reachability attribute for EVERY AM!!!

kind regards,
brad...
3385.16. "I quite agree, but...." by MLNCSC::BARILARO () Thu Jul 23 1992 10:55
RE: .13

Hi Brad,

	I quite agree with you, but there are two things that force me to
	choose this kind of rule.

	First, most of our customers want something graphical that shows
	them when a node goes down or comes up, and second, they want only
	ONE message (mail, broadcast and so on) saying that something happened.

	I don't know if you ever used ENOP (a product that generated alarms
	on reachability, line/circuit use, disk space, etc.); this product
	did exactly this.

	If I use the kind of rule that you describe, I (that is, the customer)
	have the problem that there is no indication when the node comes
	back up. I have to check the state of the node manually, and once it
	is up I have to reset the EXCEPTION alarms manually to get the green
	icon back. Also, for as long as the node remains down I receive a
	mail/broadcast every xx minutes.
	
	So this is the only solution I have found until now. I agree it's a
	dirty one, and sometimes also heavy for the system (every xx minutes
	it starts N batch jobs for alarms or exceptions), and I still have
	the problem with the NOTIFICATION window.

	I also completely agree with your sentence

> On the subject of reachability, every AM should be STRONGLY RECOMMENDED to
> provide a reachability attribute (whether it's real or simulated). This is a
> must for managing anything...

	I'm hungry for a clear and intelligent solution.

				Best regards,
				Ciao Luciano

P.S.: The word "Ciao" in italian means "Hi" or "Cheers"
    
3385.17. "i see the need now" by CTHQ3::WOODCOCK () Thu Jul 23 1992 12:52
Hi Luciano,

I see what you're looking for but I'll have to think on this one for a while.
To MCC's credit, one can usually get around such things. If I think of anything
I'll come back with it. 

>>P.S.: The word "Ciao" in italian means "Hi" or "Cheers"
    
As always, excuse my italian :-)

Ciao brad...
3385.18. "On supporting reachability" by TOOK::GUERTIN (It fall down, go boom) Thu Jul 23 1992 13:59
    RE: .9 and last few
    
    During MCC V1.0 design/development there was a proposal for a
    Reachability FM.  It was a "generic" FM which determined reachability
    (perhaps "availability") for entities.  The decision was that the
    Alarms module was the correct place for such functionality.  This may
    require an arbitrarily complex expression, but in theory can be done. 
    So, if people have specific requirements for Alarms FM (the lights are
    on, but no one is home) to support reachability, other than what has
    already been stated, then we (engineering) would be happy to listen. 
    No it doesn't go into a black hole, it just gets added to a very long
    list.
    
    -Matt.
3385.19. "" by HANNAH::B_COBB () Thu Jul 23 1992 15:15
    One other thing is that if you settle for just getting exceptions and
    not real rule fires, then you miss the rule fire with the "CLEAR"
    severity when the entity is up again.  That fire is useful because, when
    something goes down, you can see whether it has returned with a "quick
    look" at the notification window. 
    
    
3385.20. "" by MARVIN::COBB (Graham R. Cobb (DECNIS development), REO2-G/G9, 830-3917) Fri Jul 24 1992 10:48
.5>       I saw the
.5>     adjacency for node X drop right away on the routing node's console,
.5>     but node X was still listed as "reachable" until I tried to contact
.5>     it.  This seems a bit hokey.  If a node becomes unreachable, I want 
.5>     my rule to fire as soon as possible.  

All routing vector routing protocols (including DECnet Phase IV) have this
problem (the "counting to infinity" problem).  It takes a long time to
decide that something is unreachable if it has gone away altogether.  By the
way, it isn't a feature of "DECnet": RIP (used in TCP/IP) has exactly the
same characteristics (but in the TCP/IP world reachability is tested using
ping, not by asking routers).

That is why "link state" routing protocols were invented.  DECnet Phase V
uses a link state protocol and will notice much faster that the node has
gone down.
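
For readers who have not met "counting to infinity", here is a toy Python simulation. It is a sketch of the general problem, not of DECnet Phase IV or RIP internals: two routers exchange cost vectors synchronously with no split horizon, and the limit of 16 is borrowed from RIP purely as an example value.

```python
INFINITY = 16   # RIP's "unreachable" metric; used here only as an example limit


def rounds_until_unreachable(infinity=INFINITY):
    """Simulate routers A and B after destination X, attached to A, disappears.

    Before the failure A reaches X at cost 1 and B reaches X through A at
    cost 2. Each round the routers exchange vectors without split horizon.
    """
    cost_a, cost_b = infinity, 2      # A has just lost its direct link to X
    rounds = 0
    while cost_a < infinity or cost_b < infinity:
        # Each router believes the vector its neighbour sent in the last round.
        cost_a, cost_b = min(cost_b + 1, infinity), min(cost_a + 1, infinity)
        rounds += 1
    return rounds


print(rounds_until_unreachable())
```

The stale route bounces between the two routers, growing by one per exchange, so the time to declare X unreachable scales with the "infinity" value multiplied by the update interval.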

.5>     Any comments on how to make this a bit quicker?  

Install Phase V routers, running Phase V routing!  If you thought that was
difficult then try rewriting your rule in Phase V terms!!

Graham
3385.21. "The problem still needs to be solved" by SKIBUM::GASSMAN () Mon Jul 27 1992 10:54
    The problem statement should be fairly simple - if it's reachable make
    it green, if it is unreachable, make it red.  When the exception path
    is used, you lose granularity of your alarms.  Many polled nodes will
    give you an exception due to "invalid password", "network partner
    exited", and such - which are not RED critical problems.  A manager
    that will be used is one that alerts you when it should, and doesn't
    when things are "indeterminate", but still ok.  The simulated availability 
    is probably the best way to accomplish the required availability feature, 
    however since this feature is not on the V1.3 list yet, this note will 
    have to determine which is the best hack.  The real requirement 
    (based on competitive products) includes the ability to put certain nodes 
    into "MARKED" mode - ie, remove them from the polling list.  This is 
    hard to do when using wild card alarms, yet is useful when you know a 
    node will be down for weeks, and you don't want it to be red.  We're 
    talking features that buyers of other management systems have been used to 
    for two years, so the details of what is needed for parity in the market 
    are well known.  Since availability status can come from many sources
    (events, alarms, remote polling devices, other management systems),
    perhaps a unique Availability FM should be looked at again to solve
    this.  
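
The requirement can be stated as a small decision function. The Python below is only a sketch of that requirement: `availability`, the `marked` set and the list of exception texts treated as "unreachable" are assumptions for illustration, not anything DECmcc provides.

```python
# Exception texts that really mean "the node is gone" (an assumed list).
UNREACHABLE_TEXTS = {"node not currently accessible"}


def availability(node, poll_result, marked):
    """Map one poll result to a color.

    `poll_result` is None for a successful poll, otherwise the exception text.
    `marked` is the set of nodes the operator has removed from polling.
    """
    if node in marked:
        return "unpolled"           # known outage: do not turn it red
    if poll_result is None:
        return "green"
    if poll_result.lower() in UNREACHABLE_TEXTS:
        return "red"
    return "indeterminate"          # e.g. "invalid password": a problem, not an outage


marked = {"lab01"}
print(availability("noon", None, marked))
print(availability("noon", "Node not currently accessible", marked))
print(availability("noon", "invalid password", marked))
print(availability("lab01", "Node not currently accessible", marked))
```

The point of the sketch is the separation: only a true loss of contact turns red, other exceptions keep their own severity, and marked nodes are left alone.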
    
    bill
3385.22. "hack methodology" by CTHQ::WOODCOCK () Wed Aug 19 1992 18:25
Hi there,

I've had a chance to think this one over a bit (what's it been a month!!).
Contained here are a couple of ideas/thoughts about reachability of a future
mcc version and also a methodology for a v1.2 'hack'.

At best reachability for the next version 'must' be addressed. This should
not be an issue of when but NEXT. Reachability is the BUSINESS of network
management and an easily understood solution is required without exception
handling.

One approach is to make all AMs supply a reachability attribute for alarming.
Another approach is to force all AMs to simply return the same value for 
unreachability. How about "No response from entity". In this situation I would
tend to think ALARMS detection and notification would be possible relatively
easily for unreachable and then re-reachable entities with proper correlation
and colors.

The last option is a reachability AM or FM as Bill has suggested. This is 
probably the 'best' solution but the most work. Having the ability to mark
objects for non-polling is essential. Having a poll exception list would
most likely work in this area.

For those looking for a potential hack for V1.2 read on. I'm not sure if
this meets all the needs but only you can answer. It does require DCL work
which I have not done but shouldn't be too large a project for those with
the time and need.

I have been unable to get an internal event from MCC indicating a transition
of a rule from FIRED -> CLEAR or EXCEPTION -> CLEAR. If anyone has been
successful with this I'd like to see how it was done. This transition is
key to getting colors back to the CLEAR state. Because I can't get this internal
from MCC an external process is required to determine when the object becomes
reachable again.

The hack would involve using the data collector as the central source of
updating the map. Two approaches could be taken: use your current domain
structure as it is today, or use a secondary domain for polling. When using
the current domains, FILTERS would have to be set up for the exceptions
(if possible) and the .com would send a collector event to update the map. I
see a couple of problems with this approach: setting up the filters at each
startup (actually not a biggie) and losing other exceptions in the process.
There are also advantages to using a separate polling domain, which is what
I'd recommend if I were to pursue this.

Details:

Create a domain called REACHABILITY and populate it with every entity you'd
like to get availability on, and maintain your current domains for 'viewing'
purposes. The only downside to this method is ensuring the REACHABILITY domain
accurately reflects what's in the viewed domains. Next write an alarm rule
for each entity class with wildcards to poll all devices and fire with exception
if entity is unavailable. The advantage of this is that you now only require
one alarm rule for each class to poll all entities (if the system can handle
it). You have most likely saved a great deal of resources already with this
reduction of alarms. When the exception procedure fires have it do the 
following:

	- Check for a logical called POLL_BRIDGE_xxxxxx (example)
	- if present exit (it has already been reported)
	- if not send a collector event to a 'viewed' domain updating
	  the entity color RED, update log file, and set the logical
	  POLL_BRIDGE_xxxxxx
	- Also check for the presence of an external reachability job
	  in batch and submit if not present (this job to be described next).

	- A collector is required for each 'viewed' domain.
	- A method of mapping this entity to the 'viewed' domain is required.
	  If you have an alarms process which resubmits itself each night then
	  also have it SHOW DOMAIN * MEMBER *, TO FILE=DOMAIN_MEMBERS.LIS;.
	  DOMAIN_MEMBERS.LIS; can now be used for searches to determine what
	  'viewed' domain the entity resides and hence which collector to send
	  the event to. Also the title of the event should be something like
	  BRIDGE_xxxxxx_REACHABILITY and have the color tell you whether
	  it is up or down.

External Batch Job:

	- This job runs at some interval equal to or greater than the polling
	  interval of the alarms when something is down.
	- Retrieve all POLL* logicals
	- Create MCC procedure to get name attribute of all reported down 
	  entities from the list of logicals, execute procedure and write to a 
	  file.
	- Search file for entities now back up.
	  - Determine domain/collector (actually could also be in logical name)
	  - Send collector event BRIDGE_xxxxxx_REACHABILITY severity clear
	    to all entities back up.
	  - Update log file for entities back up and delete logical.
	- If all entities now reachable exit, if not resubmit this job.
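
The two procedures above can be modelled in Python to check the logic before writing any DCL. This is a sketch under stated assumptions: the `down` dictionary stands in for the POLL_* logicals, the `events` list stands in for collector events, and the "critical" severity for RED is a guess; none of it talks to MCC.

```python
down = {}      # plays the role of the POLL_<class>_<name> logicals
events = []    # collector events that would go to the 'viewed' domains


def on_exception(entity, viewed_domain):
    """Exception procedure: report an outage only the first time it is seen."""
    if entity in down:
        return                      # already reported, nothing to do
    down[entity] = viewed_domain    # "set the logical"
    events.append((viewed_domain, f"{entity}_REACHABILITY", "critical"))


def recheck(is_reachable):
    """Batch job: clear entities that answer again.

    Returns True while something is still down, meaning the job should
    resubmit itself.
    """
    for entity, viewed_domain in list(down.items()):
        if is_reachable(entity):
            events.append((viewed_domain, f"{entity}_REACHABILITY", "clear"))
            del down[entity]        # "delete the logical"
    return bool(down)


on_exception("BRIDGE_one", "A")
on_exception("BRIDGE_one", "A")     # same outage polled again: ignored
on_exception("BRIDGE_two", "B")
resubmit = recheck(lambda entity: entity == "BRIDGE_one")
for event in events:
    print(event)
print("resubmit:", resubmit)
```

Storing the viewed domain alongside the entity corresponds to the remark that the domain/collector "could also be in logical name".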

There you have it, a method which gives proper color (icon color = clear color)
for both up and down and needs far less resources than firing every interval 
for every entity. You also save on alarm rules. A potential masterpiece :-).

best regards,
brad...
3385.23. "internal events WORK!" by CTHQ3::WOODCOCK () Sun Sep 06 1992 15:30
Hi there,

If anybody is still listening hold the phone. I have gotten the function
in the below paragraph working.

>I have been unable to get an internal event from MCC indicating a transition
>of a rule from FIRED -> CLEAR or EXCEPTION -> CLEAR. If anyone has been
>successful with this I'd like to see how it was done. This transition is
>key to getting colors back to the CLEAR state. Because I can't get this internal
>from MCC an external process is required to determine when the object becomes
>reachable again.

This makes the external job described in -.1 unnecessary.
Stay tuned: a new note will most likely follow in a couple of weeks with a
home brewed reachability FM for V1.2. It already works but a couple of
bells/whistles are needed.

best regards,
brad...
3385.24. "What protocols are you using to determine reachability?" by CUJO::HILL (Dan Hill-Net.Mgt.-Customer Resident) Fri Sep 11 1992 15:22
    Hi, Brad,
    
    Which protocols (translation: which AMs) are you using to determine
    reachability?  Can you give me an example of an alarm rule you are
    using to determine DECnet reachability?  What about terminal servers,
    bridges, ip nodes, and the generic Ethernet station?
    
    I have a few global alarm rules of my own which I'll publish in a later
    note. 
    
    Also, if you are looking for reachability polling and a
    reduction in resource consumption, you can modify the generic command
    procedures to fire alarm rules such that no logging is done.  This
    also means no batch processing overhead.
    
    Reachability determination is the PRIMARY reason my customer is using
    DECmcc.  They are tolerating its current deficiencies with the
    expectation that the product will improve after V1.2.
    
    I've heard good news from some in the development groups that there is
    a dedicated effort to address the issue of reachability.  This is
    encouraging, and I hope it continues with TOP PRIORITY.
    
    I'd be interested in testing your procedures.  Let me know if I can
    help.
    
    -Dan
3385.25. "protocol independent!" by CTHQ::WOODCOCK () Mon Sep 14 1992 13:25
Hi Dan,

Unfortunately the procedures were left behind at the customer site. I am 
waiting for a tape to be made and put on EASYnet so I can pull it up and make
a couple of enhancements. I'm hoping I'll get them this week, if not, I may
rewrite them. Protocol didn't matter with the technique once set up!!! When I
left they were monitoring STATIONS, NODE4s, SNMPs and the addition of anything
else is NO PROBLEM (I think)!! As soon as I've got *something* I'll let ya 
know (it could still be a week or two).

best regards,
brad...
3385.26. "Some global alarm rule expressions for reachability" by CUJO::HILL (Dan Hill-Net.Mgt.-Customer Resident) Wed Sep 23 1992 21:29
    I have been testing on a VAXstation 4000 Model 90 with 80MB memory.
    
    What a sweet system.  45 VUPs, 33 SPECmarks.
    
    I have been testing reachability of SNMP and BRIDGE entities mostly, 
    several hundred of them (total).  The performance was great.
    
    Expressions:
    (SNMP * ipReachability = DOWN, AT EVERY = 00:05:00)
    (BRIDGE * operation state = DOWN, AT EVERY = 00:05:00)
    
    I have been trying to determine the strain that polling imposes on the
    node.  Not much for this Model 90 and 300-400 entities.
    
    Once again, please let me state the importance of reachability.
    My customers are beating me up on this issue.  This should be something
    that DECmcc does by default, with NO HACKING or other chicken-rigged
    setups involved.  This should not be "A" top priority, it should be
    "THE" top priority.
    
    Regards,
    Dan
3385.27. "BRIDGE alarm rule expression" by CUJO::HILL (Dan Hill-Net.Mgt.-Customer Resident) Thu Sep 24 1992 15:14
    I don't know why I have trouble remembering the syntax of this BRIDGE
    alarm rule expression.  I've used it so many times, but I've botched it
    twice in this notes file.  Guess my credibility is completely shot.
    
    At any rate, here is the REAL expression for bridge reachability:
    
    (BRIDGE * DEVICE STATE <> OPERATING, AT EVERY 00:05:00)
    
    This works like a champ, except that LTM bridges don't respond
    properly.  What you can do to help eliminate this:
    
    	Filter the notification of the BRIDGE entities (EXCLUDE them) in
    the notification window FILTERS section.  I haven't been able to stop
    the icons on the map from changing colors, though.
    
    -dan
3385.28. "Why not CHANGE_OF instead?" by CHRISB::BRIENEN (DECmcc LAN and SNMP Stuff...) Tue Sep 29 1992 16:42
Wouldn't the expression:

	CHANGE_OF( snmp * ipReachability, *,*), every 00:05:00

...work better?
3385.29. "color doesn't represent status" by CTHQ::WOODCOCK () Tue Sep 29 1992 16:53
>Wouldn't the expression:
>
>	CHANGE_OF( snmp * ipReachability, *,*), every 00:05:00
>
>...work better?

2cents - Change_of does not meet the requirement of color status for IS IT UP
or IS IT DOWN. Change_of will always give the same color regardless of
reachability status (color must be cleared manually).
3385.30. "V1.2 won't let you" by TOOK::R_SPENCE (Nets don't fail me now...) Mon Oct 05 1992 17:08
    And besides, CHANGE_OF is NOT supported for wildcard rules in V1.2
    
    s/rob
    
3385.31. "" by CUJO::HILL (Dan Hill-Net.Mgt.-Customer Resident) Wed Oct 07 1992 17:13
    I sincerely hope that CHANGE_OF wild carding will be supported in the
    next release.  I simply don't have the resources to enable alarm rules
    for 200+ nodes.
    
    The more global alarm rules you can support, the better DECmcc will be
    as a network monitoring and troubleshooting tool.
    
    Thanks,
    Dan
3385.32. "Note 3894 has potential" by CTHQ::WOODCOCK () Tue Oct 13 1992 16:12
Hello,

Note 3894 introduces some procedures which were the best solution I could come
up with for V1.2 for reachability and may help with device monitoring today.
Better late than never I guess, is it V1.3 yet???
Going beyond, this has been an interesting exercise in learning what's NEEDED
for users in determining reachability and also other functions. To make life
easier for the MCC user in the future it may be appropriate to modify our 
outlook of DOMAINS. The end solution was to create a FUNCTIONAL domain which
more closely accommodates a user's tasks, then use a couple features to marry
the functional domain back into how the user VIEWs domains. This may not always
apply of course but here is an exercise. There are 6 MCC domains shown below 
but how many user functional domains are there??


                     A                      B
                    / \                    / \
                   C   D                  E   F

In most cases there are 2 functional domains. Alarms is a clear
example. The user wants to monitor all specific devices in each of the 2
hierarchies as a single function for each, one alarm for each with some
exceptions (enter no-poll mark). The same may likely hold true for other
functions, historical data and metrics, etc. The lesson here is that the
arbitrary collection of devices for VIEWing purposes often doesn't meet
functional needs. Maybe EXPAND=TRUE needs EXPANDing :-).

re: -.1

>    I sincerely hope that CHANGE_OF wild carding will be supported in the
>    next release.  I simply don't have the resources to enable alarm rules
>    for 200+ nodes.
    
>    The more global alarm rules you can support, the better DECmcc will be
>    as a network monitoring and troubleshooting tool.
    
While it is likely that CHANGE_OF might be coming in the future, it still does
not solve this problem of REACHABILITY. Why? Because it must compare attribute
values from two different polls; if it can't get the value there will still be
an EXCEPTION. Once again, if all AMs provide a reachability attribute then this
becomes viable. But...only if you can set a specific severity for a given
value of the attribute, otherwise it's RED when it goes down and RED when it
comes up. What is truly needed is a change_of type function which doesn't
burden the user with things like mail, but still gets the color right for when
the device is UP or DOWN. MCC_REACH might fill the bill in the short term but
this should be brought out functionally within MCC itself, dcl and imagination
only go so far.
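
A small Python model makes the two conditions concrete. It is purely illustrative: `change_of` here is a hypothetical function, not the MCC CHANGE_OF operator, and the per-value `SEVERITY` table is exactly the feature being asked for, not something V1.2 has.

```python
class PollFailed(Exception):
    """Hypothetical stand-in for the exception raised when a poll gets no answer."""


SEVERITY = {"up": "clear", "down": "critical"}   # per-value severity, an assumption


def change_of(previous, poll):
    """Compare the previous value with a fresh poll.

    Returns (new_value, severity) when the value changed and (new_value, None)
    when it did not. A failed poll still propagates as an exception, which is
    why the attribute itself has to be synthesized by the access module.
    """
    current = poll()
    if current == previous:
        return current, None
    return current, SEVERITY[current]


def dead_node():
    raise PollFailed("no answer")


print(change_of("up", lambda: "down"))
print(change_of("down", lambda: "up"))
print(change_of("up", lambda: "up"))
try:
    change_of("up", dead_node)
except PollFailed as failure:
    print("exception:", failure)
```

With a synthesized reachability attribute the poll always returns "up" or "down", so the exception branch disappears and the severity table gets the color right in both directions.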

best regards,
brad...