
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

543.0. "How does MCC trace Memory related bugs ?" by WAKEME::ROBERTS (Keith Roberts - DECmcc Alarms Team) Tue Dec 11 1990 13:07

    +---------------------------+ TM
    |   |   |   |   |   |   |   |
    | d | i | g | i | t | a | l |           Interoffice memorandum
    |   |   |   |   |   |   |   |
    +---------------------------+

    To:  DECmcc Management                  From:   Keith Roberts
         DECmcc Development                 Date:   10-Dec-1990               
         DECmcc Community                   Mail:   LKG2-2/N1
                                            Enet:   TOOK::D_ROBERTS
                                            DTN:    226-5394

    Subj:  How does MCC trace memory related bugs ? 

    On November 30, 1990, Joe Giuffrida from DSTEG reported a problem to
    Anil Navkal while trying to enable 400 Alarm Rules on a local DECnet
    node.  At approximately 300 enabled rules, the following error message
    was displayed on the terminal:

    %STR-F-ERRFREDYN, error freeing dynamic string when returned to LIB$FREE_VM

    Not knowing which module generated this error, Anil asked me to
    investigate.  Three days of tracking down the error turned up a bug
    in the DECnet time-spec validation code: the 'mcc_time_delete'
    routine was being passed an uninitialized descriptor.

    'mcc_time_delete' calls the STR$FREE1_DX routine to deallocate the data
    referenced by the MCC Descriptor.  Due to the invalid data, the STR$
    routine signaled the error.
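
    A minimal C sketch of this failure mode, offered as an illustration
    and not as the actual MCC or VMS code.  The descriptor layout, the
    class constant, and the names demo_desc, demo_desc_free and
    demo_desc_set are all hypothetical.  Where the memo describes the
    STR$ routine signaling, the sketch simply returns a status code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a dynamic string descriptor.
   This is NOT the real VMS or MCC descriptor layout. */
typedef struct {
    size_t length;   /* bytes currently held */
    int    dclass;   /* must equal DEMO_CLASS_DYNAMIC to be freed */
    char  *pointer;  /* heap storage, or NULL when empty */
} demo_desc;

#define DEMO_CLASS_DYNAMIC 2   /* illustrative value only */
#define DEMO_DESC_INIT { 0, DEMO_CLASS_DYNAMIC, NULL }

enum { DEMO_OK = 0, DEMO_ERR_BADDESC = 1, DEMO_ERR_NOMEM = 2 };

/* Release the storage owned by a descriptor.  A descriptor that was
   never initialized holds whatever bytes were in memory, so the class
   check is the only thing standing between it and a bad free(). */
int demo_desc_free(demo_desc *d)
{
    assert(d != NULL);
    if (d->dclass != DEMO_CLASS_DYNAMIC)
        return DEMO_ERR_BADDESC;          /* garbage descriptor caught */
    if (d->pointer == NULL && d->length != 0)
        return DEMO_ERR_BADDESC;          /* inconsistent fields */
    free(d->pointer);                     /* free(NULL) is a no-op */
    d->pointer = NULL;
    d->length  = 0;
    return DEMO_OK;
}

/* Replace the descriptor's contents with a private copy of text. */
int demo_desc_set(demo_desc *d, const char *text)
{
    assert(d != NULL && text != NULL);
    int status = demo_desc_free(d);
    if (status != DEMO_OK)
        return status;
    size_t n = strlen(text);
    char *copy = malloc(n + 1);
    if (copy == NULL)
        return DEMO_ERR_NOMEM;
    memcpy(copy, text, n + 1);            /* include the terminator */
    d->pointer = copy;
    d->length  = n;
    return DEMO_OK;
}
```

    A descriptor declared with DEMO_DESC_INIT can be freed safely even
    if it was never filled in.  One declared without an initializer is
    rejected only when its leftover bytes happen to fail the checks, so
    in this sketch the outcome depends on what was in memory before.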


      Q: Why did it take 300+ enabled rules before the signal occurred ?

      Q: What does the Kernel exception handler do with these errors ?

      Q: Why didn't DECnet's Fake-VM testing catch this error months before ?


    o It is quite amazing that this bug did not cause an ACCVIO when Alarms
      runs the DTM tests.  The fact that 300+ rules must be enabled before
      any symptoms are revealed should be added to the list of MCC Great
      Mysteries.

    o In many cases, the VMS STR$ routines use the signal mechanism to
      indicate problems.  When the error occurred, it appeared that the
      thread which received the signal was terminated (the rule was left in
      the "In Progress" state).  I imagine that the Kernel exception handler
      displays the error, and kills the thread.

    o DECnet performed Fake-VM testing on all their Show directives.  This
      problem was never detected, simply because Fake-VM cannot trace the
      memory activity of the STR$ routines, which are used by the MCC time
      services and many other Kernel Common Services.
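
    The blind spot in the last point can be sketched in C.  This assumes
    that a tracing layer only sees allocations made through its own
    entry points; the memo does not describe how Fake-VM works
    internally, so that is an assumption.  All names here (traced_alloc,
    traced_free, untraced_strdup) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Number of blocks the tracing layer believes are still allocated. */
static long traced_live = 0;

/* Allocation entry point that the tracing layer knows about. */
void *traced_alloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        traced_live++;
    return p;
}

/* Matching release entry point. */
void traced_free(void *p)
{
    if (p != NULL) {
        traced_live--;
        free(p);
    }
}

long traced_outstanding(void)
{
    return traced_live;
}

/* Stands in for a string service with its own path to the allocator.
   It bypasses traced_alloc, so the tracer never hears about it. */
char *untraced_strdup(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}
```

    After one traced_alloc and one untraced_strdup, traced_outstanding()
    reports a single live block even though two exist.  Any leak or bad
    free on the untraced path is invisible to the tracing layer.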

    How do those developing Access & Functional Modules trace such
    problems?

    MCC needs (simple and consistent) guidelines for the use of dynamic
    memory -- guidelines followed at all levels of MCC software
    development.  Otherwise the debugging of these kinds of problems will
    continue to be frustrating and time-consuming.

                  "We need to work Smarter, Not Harder"
                                                 - jce