|
Hi Pete,
> 1) the contents of the MCC initialization file that is being executed
> by the batch job (file pointed to by MCC$INIT logical, if any)
I have no MCC initialization file.
Here is the alarm definition. Please note that the name of the AM is
not for the public. I have changed the name of the AM and all
references to this name to xxx.
DECmcc (T1.0.0)
MCC> sho mcc alarm rule another_xxx_alarm_rule all char
MCC ALARMS RULE ANOTHER_xxx_ALARM_RULE
Characteristics
AT 11-JUN-1990 10:54:00
Examination of attributes shows:
Procedure = ULTMAT$DKA200:[BELANGER.PROJECT.MALTA.
PROTO]MCC$ALARMS_BROADCAST_ALARM.COM;4
Description = "Jon is having fun again. Hi Kate!!!!
"
Category = "Remotely detected alert"
Parameter = "node=(ultmat, nmvt)"
Expression = "(change_of (xxx test alerts recei
ved,*,*), at every 00:00:30)"
> 2) the command procedure that is being submitted
$!
$! M C C $ A L A R M S _ B R O A D C A S T _ A L A R M . C O M
$!
$! This command procedure writes an alarm to a terminal or group of
$! terminals. It extracts information about the alert and generates
$! a message to be broadcast to the specified terminals.
$!
$! Parameters:
$! P1: Rule name
$! P2: Description of Rule
$! P3: Category
$! P4: Rule expression
$! P5: Time of Detection
$! P6: Values that cause the rule to fire
$! P7: TERMINAL=(terminal_id[, ...]) |
$! USERNAME=(username[, ...]) |
$! NODE[=(node[, ...])]
$! These parameters follow the conventions for the VMS command
$! REPLY. One of these three must be specified, or no message
$! will be sent.
$!
$ define sys$error mcc$error.log
$ define sys$output mcc$error.log
$ say := "write sys$output"
$ crlf = "
"
$ tab = " "
$!
$! Make sure the user has specified a terminal or group of terminals.
$!
$ if p7.eqs.""
$ then
$ say "No alarm sent; please supply a terminal, username or node."
$ goto BYE_BYE
$ endif
$!
$! Make sure the parameters are legal.
$!
$ parameters = f$edit(p7, "TRIM, UPCASE")
$ opt = f$extract(0, 1, parameters)
$ if (opt.nes."T").and.(opt.nes."U").and.(opt.nes."N")
$ then
$ say "No alarms sent; illegal option ''parameters'."
$ goto BYE_BYE
$ endif
$!
$! Make sure the OPER privilege is set on.
$!
$ set proc/priv=OPER
$ if .not.f$priv("OPER")
$ then
$ say "No alarms sent; OPER privilege required."
$ goto BYE_BYE
$ endif
$ alarm_line = "** Alarm ''p1'" + crlf + tab + "detected at ''p5' **"
$ alarm_line = alarm_line + crlf + tab + "Alert info to follow."
$!
$! Ask MCC to write some information about the alarm to a temporary file
$!
$ manage/enterprise sho xxx test all status, to file=alert.tmp
$!
$! Parse the temporary file for all the information we are going to need
$!
$ count = 0
$ open/read in alert.tmp
$ READ_LOOP:
$ read/end=DONE in line
$ count = count + 1
$ if count.eq.1
$ then
$ start_loc = f$locate("xxx ", line) + 8
$ ext_len = f$length(line) - start_loc
$ instance = f$extract(start_loc, ext_len, line)
$ endif
$ if count.eq.3
$ then
$ start_loc = f$locate("AT ", line) + 3
$ ext_len = f$length(line) - start_loc
$ stamp = f$extract(start_loc, ext_len, line)
$ endif
$ if count.eq.7
$ then
$ start_loc = f$locate("= ", line) + 2
$ ext_len = f$length(line) - start_loc
$ name = f$extract(start_loc, ext_len, line)
$ endif
$ if count.eq.8
$ then
$ start_loc = f$locate("= ", line) + 2
$ ext_len = f$length(line) - start_loc
$ domain = f$extract(start_loc, ext_len, line)
$ endif
$ if count.eq.9
$ then
$ start_loc = f$locate("= ", line) + 2
$ ext_len = f$length(line) - start_loc
$ resname = f$extract(start_loc, ext_len, line)
$ endif
$ if count.eq.10
$ then
$ start_loc = f$locate("= ", line) + 2
$ ext_len = f$length(line) - start_loc
$ restype = f$extract(start_loc, ext_len, line)
$ endif
$ if count.eq.11
$ then
$ start_loc = f$locate("= ", line) + 2
$ ext_len = f$length(line) - start_loc
$ time = f$extract(start_loc, ext_len, line)
$ endif
$ if count.eq.12
$ then
$ start_loc = f$locate("= ", line) + 2
$ ext_len = f$length(line) - start_loc
$ desc = f$extract(start_loc, ext_len, line)
$ endif
$ if count.eq.13
$ then
$ start_loc = f$locate("= ", line) + 2
$ ext_len = f$length(line) - start_loc
$ cause = f$extract(start_loc, ext_len, line)
$ endif
$ if count.eq.14
$ then
$ start_loc = f$locate("= ", line) + 2
$ ext_len = f$length(line) - start_loc
$ rcvd = f$extract(start_loc, ext_len, line)
$ endif
$ goto READ_LOOP
$!
$! We are all done parsing the file (not pretty but it does work). Close
$! the temporary file and delete it.
$!
$ DONE:
$ close in
$ delete/nolog/noconfirm alert.tmp;*
$!
$! Generate the information message to follow the initial alarm message.
$!
$ line = "xxx " + instance + crlf + "Status" + crlf
$ line = line + "AT " + stamp + crlf + crlf
$ line = line + "Examination of attributes shows:" + crlf
$ line = line + tab + "Name:" + tab + name + crlf
$ line = line + tab + "Domain:" + tab + domain + crlf
$ line = line + tab + "Type:" + tab + restype + crlf
$ line = line + tab + "Time:" + tab + time + crlf
$ line = line + tab + "Desc:" + tab + desc + crlf
$ line = line + tab + "Cause:" + tab + cause + crlf
$ line = line + tab + "Rcvd:" + tab + rcvd
$ line := "''line'"
$!
$! Display them.
$!
$ if opt.eqs."N"
$ then
$ reply/'p7'/user/bell "''alarm_line'"
$ reply/'p7'/user/bell "''line'"
$ else
$ reply/'p7'/bell "''alarm_line'"
$ reply/'p7'/bell "''line'"
$ endif
$ BYE_BYE:
$ show time
$ exit
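
For readers who do not read DCL, the READ_LOOP logic above can be sketched
roughly as follows. This is only an illustration in Python, not part of the
procedure: the function name parse_alert and the FIELDS table are made up
here, and the real layout of alert.tmp is assumed to match the line numbers
the DCL tests for (timestamp on line 3, attributes on lines 7 to 14). Line 1
(the instance name) is left out because its offset depends on the real AM
name, which has been replaced by xxx.

```python
# Hypothetical mapping: line number in alert.tmp -> field name,
# mirroring the "if count.eq.N" tests in the DCL READ_LOOP.
FIELDS = {7: "name", 8: "domain", 9: "resname", 10: "restype",
          11: "time", 12: "desc", 13: "cause", 14: "rcvd"}


def parse_alert(lines):
    """Extract the timestamp and attribute values by line position."""
    info = {}
    for count, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if count == 3:
            # Everything after "AT " is the timestamp.
            pos = line.find("AT ")
            info["stamp"] = line[pos + 3:] if pos >= 0 else ""
        elif count in FIELDS:
            # Everything after "= " is the attribute value.
            pos = line.find("= ")
            info[FIELDS[count]] = line[pos + 2:] if pos >= 0 else ""
    return info
```

When the marker is missing, the sketch returns an empty string, which is
what F$LOCATE followed by F$EXTRACT yields in the DCL version as well.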
|
| Hi again,
There is one known problem in the batch job area, BUT, you're not hitting
the one I know of (USE MODE SCREEN from batch mode in the kit you've got
might not behave correctly ... this has been fixed for EFT update).
So I've got some more questions:
> When the alarm is triggered and the command procedure submitted, the
> batch process gets into a hibernate state within MCC.
1) In the above sentence, "the batch process" refers to the batch job that
the command procedure is running in, right? And you can see that the
process is in MCC. Or am I on the wrong track, and the MCC that your
alarm rule is enabled in is "the batch process" that is hibernating?
$!
$! Ask MCC to write some information about the alarm to a temporary file
$!
$ manage/enterprise sho xxx test all status, to file=alert.tmp
2) does the above command work when you issue it from the TRM?
3) how about when submitted as a batch job all by itself?
4) is the answer to 2 and 3 still yes if your AM is in the state that it is
in when the alarm rule fires and submits the batch job?
regards,
Pete
|
| Since MCC is multi-threaded, there are potentially many threads
executing "in parallel". Threading systems try to be as efficient as
possible about system resources. MCC Alarms uses threads to run its
rules in. When no threads are available to run, our threads system (MTS)
will go into hibernate state, since this uses the minimum amount of
system resources. After looking at the rule you use, it appears that
MCC will go into hibernate state for just under 30 seconds, pop out,
gobble up some CPU time, and then go back into hibernate state for
another 30 seconds ("...at every 00:00:30"). If this is what you are
seeing, then everything is probably working fine. If it is just "hung"
in hibernate, then I would suspect there is a problem with the alarm
itself, with the Alarms FM calling a Framework routine which is
waiting and never returning, thereby placing the main thread in a wait
state and causing the process to go into hibernate state. If this is
the case, you need to talk to someone in the Alarms team.
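
To make the healthy hibernate/wake pattern concrete, here is a rough sketch
in Python. It is not how MTS is implemented; run_scheduler and its arguments
are invented for illustration, and it uses a simulated clock rather than
really sleeping. It only shows the expected rhythm: nothing is runnable, so
the process hibernates until the next rule is due, runs it, and hibernates
again.

```python
import heapq


def run_scheduler(rules, until):
    """Simulate periodic rules; rules is a list of (name, period_seconds).

    Returns a log of (simulated_time, event) tuples up to time `until`.
    """
    clock = 0.0
    # Each queue entry is (next_due_time, rule_name, period).
    queue = [(period, name, period) for name, period in rules]
    heapq.heapify(queue)
    log = []
    while queue and queue[0][0] <= until:
        due, name, period = heapq.heappop(queue)
        if due > clock:
            # Nothing is runnable until `due`: the process hibernates.
            log.append((clock, "hibernate for %g s" % (due - clock)))
            clock = due
        log.append((clock, "run " + name))
        # Re-arm the rule for its next period.
        heapq.heappush(queue, (due + period, name, period))
    return log


for when, event in run_scheduler([("another_xxx_alarm_rule", 30)], 90):
    print(when, event)
```

A process that is really "hung" would correspond to a log that shows one
hibernate entry and never a following "run" entry.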
-Matt.
|