| The feature you're describing is an interesting one, but it is not yet
implemented in System Watchdog.
If you enter the cluster alias in your profile, the cluster load-balancing
algorithm decides which cluster member the connection actually reaches.
You may therefore get process missing events more or less at random,
depending on whether the process is present on the target node, which is
selected independently of the Consolidator. This obviously doesn't work
as expected.
Besides, the Consolidator currently has no means of knowing a priori the
list of cluster members behind a cluster alias, or even whether a given
name is a cluster alias, a cluster member name or a standalone node name.
Consolidation of cluster-wide events is done a posteriori, once events
have been reported to the Consolidator, as each event packet carries a
cluster field. For a sublist of event codes, it consists of merging
identical event messages coming from distinct cluster members with the
same cluster alias into a single event. PROcess missing is not considered
a cluster-wide event...
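
To make the a posteriori merge concrete, here is a minimal Python sketch of the idea. It is only an illustration, not the Consolidator's actual code: the `Event` fields, the `CLUSTER_WIDE_CODES` set and the event code names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    node: str      # reporting cluster member or standalone node (hypothetical field)
    cluster: str   # cluster alias carried in the packet, "" for a standalone node
    code: str      # event code (names below are invented for the example)
    message: str


# Hypothetical sublist of event codes treated as cluster-wide.
# A "process missing" code is deliberately NOT in it.
CLUSTER_WIDE_CODES = {"DISK_FULL"}


def consolidate(events):
    """Return a list of (event, reporting_nodes) pairs.

    Identical messages with a cluster-wide code and the same cluster alias
    are merged into one entry; everything else passes through unchanged.
    """
    merged = {}   # (cluster, code, message) -> list of reporting nodes
    result = []
    for ev in events:
        if ev.cluster and ev.code in CLUSTER_WIDE_CODES:
            key = (ev.cluster, ev.code, ev.message)
            if key in merged:
                merged[key].append(ev.node)   # fold into the existing entry
                continue
            merged[key] = [ev.node]
            result.append((ev, merged[key]))
        else:
            result.append((ev, [ev.node]))
    return result


events = [
    Event("NODEA", "CLU", "DISK_FULL", "disk nearly full"),
    Event("NODEB", "CLU", "DISK_FULL", "disk nearly full"),
    Event("NODEA", "CLU", "PROC_MISSING", "process X missing"),
    Event("NODEB", "CLU", "PROC_MISSING", "process X missing"),
]
consolidated = consolidate(events)
```

With these sample events, the two disk events collapse into one entry reported by both members, while the two process missing events stay separate because their code is not in the cluster-wide sublist.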
I think the most straightforward way to implement the requested feature
would simply be to add a parameter to the PROcess missing data
specification, say a /CLUSTER_WIDE qualifier, e.g.
SNS$EDIT> ADD NODE trusted_cluster_member PROCESS proc_name proc_uic -
/CLUSTER_WIDE /INTERVAL=...
so that, for processes marked as cluster-wide, the Agent node trusted to
detect the presence of the process would scan the cluster-wide process
table instead of only the local node's process table.
Of course, this implies a profile structure change, conversion utilities,
etc., which cannot be included in an ECO kit and would have to go into a
point release.
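
The difference between today's local scan and the proposed /CLUSTER_WIDE behaviour can be sketched as follows. This is a rough Python illustration under stated assumptions, not Agent code: the `process_table` mapping, the `process_missing` function and the node and process names are all invented for the example.

```python
def process_missing(proc_name, proc_uic, local_node, process_table,
                    cluster_wide=False):
    """Return True if a process missing event should be raised.

    process_table is a hypothetical mapping: node name -> set of
    (process name, UIC) pairs currently running on that node.
    """
    # Local check: only the trusted Agent node's own process table.
    # Cluster-wide check: every member's process table is scanned.
    nodes = list(process_table) if cluster_wide else [local_node]
    wanted = (proc_name, proc_uic)
    return not any(wanted in process_table.get(node, set()) for node in nodes)


# The process runs on NODEA only; the trusted Agent node is NODEB.
table = {
    "NODEA": {("BATCH_SRV", "[SYSTEM]")},
    "NODEB": set(),
}
local_result = process_missing("BATCH_SRV", "[SYSTEM]", "NODEB", table)
cluster_result = process_missing("BATCH_SRV", "[SYSTEM]", "NODEB", table,
                                 cluster_wide=True)
```

In this example the local check reports the process as missing, whereas the cluster-wide check finds it on the other member and raises nothing.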
What do you think?
Regards,
-- Olivier.
|