
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

1667.0. "NOTIFY with >203 events ==> ACC VIO" by STKHLM::BERGGREN (Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287) Thu Oct 17 1991 13:10

T.R    Title    User    Personal Name    Date    Lines
1667.1. "PROBLEMS QUOTA EXCEEDED ????" by STKHLM::BERGGREN (Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287) Wed Oct 23 1991 10:16 (12 lines)
    Hi again,
    
    Anybody seen this note?  It's been nearly a week, and no answer...
    
    Is there a 'PROBLEMS QUOTA EXCEEDED'-flag raised for me or what?
    
    The problem is quite annoying since I can't notify the domain on all
    events in one shot.
    
    Is there anyone looking into this problem?
    
       /Nils

1667.2. "Problem is indeed in MCC_EVENT_DUMP" by TOOK::T_HUPPER (The rest, as they say, is history.) Wed Oct 23 1991 18:21 (34 lines)
    RE .0:
    
    Do you see this problem when the event trace is not on?  The ACCVIO
    shown in your log is in the trace code, not the mainline code of the
    events manager.  It also looks like this is V1.1 code.  If the problem
    is in the mainline event manager code, this is a serious problem.  If
    it is only in the trace code (MCC_EVENT_DUMP), there is less reason to
    panic.
    
    On digging through the MCC_EVENT_DUMP code, I see that there indeed is
    a restriction on the size of dump lines - 1024 bytes.  As the entire
    event filter is being dumped as a single line (no end-of-line character
    until the end of the filter list), and it takes 5 bytes in the printout
    for each filter element, plus 10 bytes for the header, the expected
    blowup is after 203 elements.  Seems to fit the problem very closely.
    
    Do you need to have each event code explicitly in the filter list, or
    can you use a wildcarded event filter (event filter pointer is
    MCC_K_NULL_PTR = all event codes)?  
    
    I suppose we could change the dump so that we did intervene a bit more
    in the formatting, and force a printout (rather than continuing to
    accumulate characters) after N characters.  This is not a difficult
    fix, and we will incorporate this, or something like it, in the V1.2
    code.  Thanks for finding this problem.
    
    RE .1:
    
    Sorry for the delay, but we are running flat out here on V1.2 code. 
    Some of us have been in the critical path over the last few weeks, and
    the schedule is seriously tight.  The light at the end of the tunnel is
    getting brighter now, however.
    
    Ted

1667.3. "Still ACC-VIO with no tracing on" by STKHLM::BERGGREN (Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287) Thu Oct 24 1991 09:23 (151 lines)

1667.4. "Let's assume there are (at least) 2 problems here" by TOOK::GUERTIN (Don't fight fire with flames) Fri Oct 25 1991 12:20 (20 lines)
    I think Ted is right.  Let's assume there are two problems here.  First
    of all the Event Trace facility cannot handle that many events, so you
    cannot use the Event Trace to provide any additional information,
    sorry.  The second problem (the *REAL* problem) is that there is an
    accvio when you do a getevent on hundreds of events.  That accvio
    appears to be happening in the threads code.  Are you doing anything
    with threads?  Anything asynchronous (e.g., ASTs?).  If you are going
    thru the Notification FM, then you may have simply exceeded the number
    of threads that can be handled within MCC.  I can only guess.  Could you
    provide any additional information on your operating environment (you
    are not using iconic map, correct?).
    
    PS: I'm happy to see that you're less hostile about this.  Monitoring
    the notes file is NOT a required activity for developers.  It should be
    viewed as volunteer work, not as a duty.  Hence, placing nasty-grams in
    the notes file to get help is like having a flat tire and giving every
    car that goes by the middle finger in hopes that someone will be upset
    enough to stop and help you.
    
    -Matt.

1667.5. "reply to .4" by STKHLM::BERGGREN (Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287) Fri Oct 25 1991 13:14 (34 lines)
repl .4

>>> First of all the Event Trace facility cannot handle that many events, 
>>> so you cannot use the Event Trace to provide any additional information,
>>> sorry.
I can live with that, but shouldn't it be fixed anyway?

>>> Are you doing anything with threads?  
The only thing that's thread-specific is that I create a lock every
time my getevent-directive entry point gets called.  The lock is
deleted in the end_directive-routine.

>>> Anything asynchronous (e.g., ASTs?)
No, all my communications are done with QIOW 

>>> If you are going thru the Notification FM, then you may have 
>>> simply exceeded the number of threads that can be handled within MCC. 
I have the problem even if the NOTIFY-directive is the first and only 
thing I do in an MCC session.  (I guess that the number of threads that
can be handled within MCC is reset when invoking MCC...)
Is it correct that NOTIFICATION creates a thread for each entity 
specified in the entity-list and each thread calls my GETEVENT-entry point?
In that case, 'thread-quota' shouldn't be the problem since I get the 
ACC-VIO even if there's only one entity in the list.  Am I out flying now
or...?

>>> Could you provide any additional information on your operating
>>> environment (you are not using iconic map, correct?).
Not using IMPM, right!  MCC BMS v1.1.
What else do you need?


  Thanks and regards,
    Nils

1667.6. "Sounds stack related to me" by TOOK::GUERTIN (Don't fight fire with flames) Mon Oct 28 1991 18:03 (32 lines)
    RE:.-1
    
    >> shouldn't it be fixed anyway?
    Well actually the event trace logs were intended to help the MCC Kernel
    developers debug the event manager.  There has been talk about removing
    it from the production code (I don't know about having an
    mcc_kernel_shr.exe with event tracing in the toolkit).
    
    Since it appears that the accvio is occurring in the threads code, it
    implies several things:
    
      1) Usually there is a stack problem.
    
      2) Possibly a stack corruption.  For example if you declare
         int x[2];
          then...
         x[1] = 0;
         x[2] = 0;
         you've just clobbered the stack (since the C language indexes from 0).
         Something like this commonly shows up in the MCC Kernel as an
         ACCVIO or reserved operand fault.  I have seen many many
         string-copies copy off the end of the stack and corrupt memory.
    
      3) Slight possibility that there is a stack overflow.  Are you
         allocating any large structures on the stack?  Have any
         extensively recursive routines?
    
    I'm sorry but I'm starting to run out of suggestions.  If you believe
    this to be an MCC bug, please enter a QAR (or I can enter one for you).
    Perhaps you can enter some relevant source code as well.
    
    -Matt.

1667.7. "512 event codes is OK for mcc_event_get" by TOOK::T_HUPPER (The rest, as they say, is history.) Tue Oct 29 1991 13:30 (14 lines)
    I have tried using 512 event codes going into a single mcc_event_get()
    call, with no problem.  There is nothing in the regular (non-trace)
    event manager code that has a limited buffer for event codes.  They are
    encoded in list form, and the list can be as long as you wish.  
    
    Note that my test is using code that does not go through the mcc_call
    interface, so the environment that the event manager is running in is
    different (little in the way of stack-eating higher-level routines). 
    As Matt points out in .6, the problem is perhaps something that ends
    up blowing up in the event manager because it is the lowest level of
    code.  When there is a stack corruption, the problem shows up where
    routine returns are being made (return to outer space).
    
    Ted

1667.8. "thread size from notify call to getevent" by TOOK::CALLANDER (MCC = My Constant Companion) Tue Oct 29 1991 18:06 (26 lines)
    Another possible problem is, simply put, that the notification FM isn't
    giving you a big enough stack to do the job!  We picked a number that
    seemed big enough, and that is what we use.  You may well be exceeding
    that amount of space.  If you do a getevent from the FCL for the same
    ANY EVENTS, does this problem go away?  The main difference between
    the two is that the getevent gets passed the FCL primary thread for
    use in processing (this is the main stack thread, with no real limit
    on its size), while the notify uses a thread that it creates for
    itself.
    
    I assume that you are doing your command on a single entity, because in
    v1.1, if it is a wildcard we will do one getevent per entity requested
    (I know you listed the command, but I didn't look closely enough).
    
    As to the event id list being passed, well, that was done to help save
    overhead.  Since the event manager requires you to pass in the list of
    events (or at least it did last time I checked), it was faster for the
    underlying modules if the list was enumerated for the AMs before it
    got passed down.  This was especially helpful for modules like DNA5,
    where new events could be added to its dictionary at any time, and
    they would have to read the information from the dictionary at run
    time, while the FCL has the info at hand (in the parse tables) without
    having to access the dictionary.
    
    Sorry for the run on sentences.
    
1667.9. "I'll do some homework..." by STKHLM::BERGGREN (Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287) Wed Oct 30 1991 17:23 (36 lines)
RE .6
	Matt,
	
	I'll check the code for stack-problems.  I don't think
	that there is a stack overflow since I'm not having any
	large structures or recursive routines.
	

RE .7
	Ted,
	
	It's nice to hear that it works for you.  Too bad for me, 
	having to go over my code 'in depth' to find the problem.

RE .8 

	Jill,
	
>>>> If you do a getevent from the FCL for
>>>> the same ANY EVENTS, does this problem go away?
	If I do a GETEVENT from the FCL, I get an MCC_K_NULL_PTR in 
	the IN_P-argument, so I don't have that problem with a 
	GETEVENT-directive.  NOTIFY, on the other hand, changes
	'any events' to a list of the defined events.  
	
>>>> I assume that you are doing your command on a single entity, 
	Yes, I'm doing notify on a single entity.


     Thanks all,

      I'll do some homework now and look for coding-errors.  I'll keep
      you informed.
      
      regards,
        Nils

1667.10. "We need to do some research as well" by TOOK::GUERTIN (Don't fight fire with flames) Wed Oct 30 1991 18:58 (15 lines)
    Nils,
    
    Although it is possible that you may have a coding bug, it is just as
    likely (if not more) that it is our (MCC's) bug.  We go through at
    least 3 large management modules (FCL, Notification, Alarms, etc.), and
    any one of these could be mismanaging stack memory for >203 events.  I
    guess I was really trying to say it *probably* isn't an MCC Event
    Manager problem.
    
    Jill, could we have someone test this case for notifying an entity with
    several hundred events?  I know we don't have any such entity around,
    but maybe someone on the Notification Services team can think of
    something creative.
    
    -Matt.

1667.11. "Yes please, do some testing" by STKHLM::BERGGREN (Nils Berggren EIS/Project dpmt, Sweden DTN 876-8287) Thu Oct 31 1991 08:00 (22 lines)
    repl .10
    
    I just did the same thing as Ted in .7, and it works with 310 events.
    As he points out, the difference is that the MCC_CALL-mechanism is not
    used. Using the MCC_CALL-mechanism could maybe twist something up, but
    I'll go thru my code to see if I can find any problems.
    
    However, I would very much appreciate it if you could do some testing
    just to verify the functionality.
     
    I tried to do it myself by creating (in DAP) 310 events for the  NODE4
    class.  After a new PTB I went into MCC (forms mode) and did a GETEVENT
    NODE4 SEB075  and pressed the <HELP>-key at the arguments-line.  It
    gave me all the events, including the 310 I just defined.  However,
    when looking at the event-list in the event-trace (MCC_EVENT_LOG=1 and
    MCC_EVENT_TRACE=180) I just got the events originally defined, not the
    new ones....   I don't understand that.  Who is removing "my events"?
    So, I just had to delete the 310 events from the NODE4 class and try to 
    think of something else....
    
        Thanks for your help,
             Nils

1667.12. "I will check for mem clobber in event id list build" by TOOK::CALLANDER (MCC = My Constant Companion) Sat Nov 23 1991 00:40 (6 lines)
    I will check the code that builds the event id list, to make sure that
    we are not putting more in the list than we have allocated memory
    for.  That is the only thing that quickly comes to mind.
    
    jill