
Conference perfom::internal_perf_tools

Title:Internal Performance Tools
Moderator:DECCXX::WIBECAN
Created:Mon Jun 21 1993
Last Modified:Thu May 15 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:94
Total number of notes:284

89.0. "Tool: event-annotated disassemblies" by PERFOM::HENNING () Fri Feb 07 1997 15:35

    I've been hacking in Perl to make IPROBE data reduction a little
    easier, including annotating disassemblies on both Unix and NT.
    This program has been used by a grand total of only two other
    people, and still has bugs.  But if you'd like to try it out and give
    feedback here (or by email), that would be great.
    
    Prerequisites:
       - perl
       - On NT, the SDK to provide dumpbin.exe, link.exe, and mspdb41.dll
    
    Here's a usage sample, on Unix (it works in similar fashion on NT).
    Note that we start with a reasonably empty directory:
    
% ls -l
total 230832
-rw-r--r--   1 john     users    235380736 Feb  7 11:21 pcsample.dat
-rwxr-xr-x   1 john     users       851968 Feb  7 11:24 tomcatv.e5u4f41k31_970130

    And after entering a grand total of two commands, we have lots more
    files.  The two commands are:

% harness.pl -x tomcatv.e5u4f41k31_970130 -e cycles -d pcsample.dat
% harness.pl  -e bcache_miss -d pcsample.dat

    And here's what they created:

% ls
addresses.resolved                    idle_thread.dis
bcache_miss.hot_routines              idle_thread.source_cycles
bcache_miss.rpt                       pcsample.dat
bcache_miss_tomcatv.cmp$main_.rpd     tomcatv.cmp$main_.dib
cycles.hot_routines                   tomcatv.cmp$main_.dis
cycles.rpt                            tomcatv.cmp$main_.source_bcache_miss
cycles_idle_thread.rpd                tomcatv.cmp$main_.source_cycles
cycles_tomcatv.cmp$main_.rpd          tomcatv.e5u4f41k31_970130
idle_thread.dib

    Usually the most interesting report is the report on hot routines.
    But for tomcatv, this is perhaps less interesting than for some other 
    programs, since all the activity is concentrated in the main routine.

% cat cyc*hot*
             Hot Routines for cycles -pthresh 1
 Events  % Routine           Image            Addr
8777164 91 tomcatv.cmp$main_ tomcatv.e5u4f41k31_970130 120011290:120012E4F
 239871  3 idle_thread       /vmunix         FFFFFC00002A9230:FFFFFC00002A9647

% cat bca*hot*
             Hot Routines for bcache_miss -pthresh 1
 Events  % Routine           Image            Addr
  86904 97 tomcatv.cmp$main_ tomcatv.e5u4f41k31_970130 120011290:120012E4F
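
     The Addr column looks like a start:end pair of hex addresses.  As a
     rough sketch only (assuming that reading, and assuming the end
     address is inclusive), a hot-routines line can be split into fields
     and turned into a routine size.  The name parse_hot_routine is made
     up for illustration and is not part of harness.pl.  Alpha
     instructions are 4 bytes each, so dividing by 4 gives an
     instruction count.

```python
def parse_hot_routine(line):
    """Split one data line of a .hot_routines report into its five columns.

    Assumed column order: Events, %, Routine, Image, Addr (start:end in hex).
    """
    events, pct, routine, image, addr = line.split()
    start, end = (int(a, 16) for a in addr.split(":"))
    return {
        "events": int(events),
        "pct": int(pct),
        "routine": routine,
        "image": image,
        "start": start,
        "end": end,
        # Assumption: the end address is the last byte of the routine (inclusive).
        "size_bytes": end - start + 1,
    }


row = parse_hot_routine(
    "8777164 91 tomcatv.cmp$main_ tomcatv.e5u4f41k31_970130 120011290:120012E4F"
)
# Alpha instructions are a fixed 4 bytes, so size // 4 is the instruction count.
print(row["routine"], row["size_bytes"], row["size_bytes"] // 4)
# prints: tomcatv.cmp$main_ 7104 1776
```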

     On Unix only, disassemblies include a convenient pointer back to
     source lines.  (Anyone know how to easily do this on NT?)   By
     default, source lines with at least 1% of the routine's events
     are flagged:

% cat t*sour*cyc*
cycles for tomcatv.cmp$main_ by source line
printing lines with at least 87771.64 events

             tomcatv   129    186055
             tomcatv   136    559522
             tomcatv   138    105669
             tomcatv   140    487934
             tomcatv   147    203506
             tomcatv   148    103816
             tomcatv   160    275929
             tomcatv   161    557867
             tomcatv   162    796056
             tomcatv   163    545592
             tomcatv   164    509669
             tomcatv   165    141740
             tomcatv   186   1014130
             tomcatv   188   1530200
             tomcatv   190    613424
             tomcatv   192    258149
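
     The cutoff printed in the report header (87771.64) is 1% of the
     8777164 cycle events charged to tomcatv.cmp$main_ in the hot
     routines report.  Here is a minimal sketch of that filter, assuming
     the threshold is simply percent/100 of the routine's events and the
     comparison is "at least".  The function name and the line-999 entry
     are hypothetical, added only to show a line being dropped; this is
     not the code in harness.pl.

```python
def flag_hot_lines(line_counts, routine_events, percent=1):
    """Return (threshold, {source_line: events}) for lines at or above the cutoff."""
    threshold = routine_events * percent / 100
    hot = {line: n for line, n in line_counts.items() if n >= threshold}
    return threshold, hot


# Three counts copied from the cycles report above, plus one invented cold line.
sample = {
    129: 186055,
    136: 559522,
    188: 1530200,
    999: 500,  # hypothetical line, NOT from the report
}

threshold, hot = flag_hot_lines(sample, routine_events=8777164)
print(threshold)    # prints: 87771.64
print(sorted(hot))  # prints: [129, 136, 188]
```

     The same arithmetic on the bcache_miss run (86904 events for the
     routine) gives the 869.04 cutoff shown in that report.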
%
% cat t*sour*bca*
bcache_miss for tomcatv.cmp$main_ by source line
printing lines with at least 869.04 events

             tomcatv   122       994
             tomcatv   123      1498
             tomcatv   129      4012
             tomcatv   136      8176
             tomcatv   140      7352
             tomcatv   147      2061
             tomcatv   148      1919
             tomcatv   160      3789
             tomcatv   161      5194
             tomcatv   162      8715
             tomcatv   163      4221
             tomcatv   164      5542
             tomcatv   165      1234
             tomcatv   186      7492
             tomcatv   188     14205
             tomcatv   190      5790
             tomcatv   192      1909
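
      One quick way to compare the two reports is to treat the flagged
      source lines as sets.  The sketch below just copies the line
      numbers from the two listings above into Python sets; the variable
      names are made up for illustration.

```python
# Source lines flagged in tomcatv.cmp$main_.source_cycles (copied from above).
cycles_hot = {129, 136, 138, 140, 147, 148, 160, 161,
              162, 163, 164, 165, 186, 188, 190, 192}

# Source lines flagged in tomcatv.cmp$main_.source_bcache_miss (copied from above).
bcache_hot = {122, 123, 129, 136, 140, 147, 148, 160, 161,
              162, 163, 164, 165, 186, 188, 190, 192}

both = sorted(cycles_hot & bcache_hot)         # flagged in both reports
only_cycles = sorted(cycles_hot - bcache_hot)  # hot in cycles only
only_bcache = sorted(bcache_hot - cycles_hot)  # hot in bcache misses only

print(len(both), only_cycles, only_bcache)
# prints: 15 [138] [122, 123]
```

      So 15 of the 16 cycle-hot lines are also bcache-hot; only line 138
      is flagged for cycles alone, and only lines 122 and 123 are
      flagged for bcache misses alone.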

      Notice that, as expected for this benchmark, the hot bcache
      activity corresponds to the hot cycle activity.  If you'd like
      to see that at the instruction level, you can.  Let's look at 
      portions of the disassembly:

Cycles=cycles
BMis=bcache_miss
      file  line       addr    Instr                        Cycles BMis
   tomcatv   130  1200116e8    mult $f28,$f13,$f28           11942   63
   tomcatv   136  1200116ec    subt $f25,$f27,$f25           25877   79
   tomcatv   138  1200116f0    addt $f22,$f20,$f20           12440   30
   tomcatv   129  1200116f4     stt $f2,-8(r17)             185898 4011
   tomcatv   130  1200116f8    addt $f23,$f28,$f23           11152   19
   tomcatv   136  1200116fc    addt $f25,$f29,$f25           22479   63
   tomcatv   140  120011700    addt $f26,$f30,$f26           12887  148
   tomcatv   130  120011704     stt $f23,-8(r18)             30637  311
   tomcatv   140  120011708    mult $f24,$f26,$f26           34621   27
   tomcatv   136  12001170c    mult $f24,$f25,$f24           12094   28
   tomcatv   138  120011710    subt $f20,$f26,$f20           35766  221
   tomcatv   134  120011714    subt $f16,$f24,$f16           12160   13
   tomcatv   147  120011718     stt $f20,-8(r21)            203479 2061
   tomcatv   146  12001171c    cpys $f31,$f20,$f20
   tomcatv   148  120011720     stt $f16,-8(r16)            103751 1918
   tomcatv   145  120011724    cpys $f31,$f16,$f16
   tomcatv   146  120011728  cmptlt $f11,$f20,$f17           34541   50
   tomcatv   145  12001172c  cmptlt $f10,$f16,$f21           12988   22
   tomcatv   146  120011730 fcmovne $f17,$f20,$f11           34708  114
   tomcatv   145  120011734 fcmovne $f21,$f16,$f10           13426   10
   tomcatv   121  120011738     bne r23,1200115d8
   tomcatv   151  12001173c     stt $f11,0(r0)                1019   19
   tomcatv   150  120011740     stt $f10,8000(r0)              976   19
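
       If you want to roll the per-instruction counts back up to source
       lines yourself, something like the following would do it.  This
       is a sketch under assumptions: it takes the column layout shown
       above (file, line, addr, mnemonic, operands, then zero or more
       event columns), assumes operands never contain spaces, and
       expects data lines only (no header).  The function name is
       hypothetical.

```python
from collections import defaultdict


def totals_by_source_line(disassembly_lines, n_events=2):
    """Sum the trailing event columns of annotated disassembly lines per source line."""
    totals = defaultdict(lambda: [0] * n_events)
    for text in disassembly_lines:
        fields = text.split()
        if len(fields) < 5:
            continue  # not a data line in the assumed layout
        entry = totals[int(fields[1])]  # creates a zeroed entry if needed
        # Instructions with no samples have no trailing counts at all.
        for i, count in enumerate(fields[5:5 + n_events]):
            entry[i] += int(count)
    return dict(totals)


# Five lines copied from the excerpt above (columns: Cycles, BMis).
excerpt = [
    "tomcatv   130  1200116e8    mult $f28,$f13,$f28           11942   63",
    "tomcatv   129  1200116f4     stt $f2,-8(r17)             185898 4011",
    "tomcatv   130  1200116f8    addt $f23,$f28,$f23           11152   19",
    "tomcatv   130  120011704     stt $f23,-8(r18)             30637  311",
    "tomcatv   146  12001171c    cpys $f31,$f20,$f20",
]

for line, (cycles, bmis) in sorted(totals_by_source_line(excerpt).items()):
    print(line, cycles, bmis)
# prints:
# 129 185898 4011
# 130 53731 393
# 146 0 0
```

       These totals cover only the five lines fed in, so they will be
       lower than the by-source-line reports, which count every
       instruction in the routine.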

       Each time you run the harness it adds another column to the 
       disassemblies.

To get a copy of this NOT fully debugged data reduction harness, copy it
over DECnet from:

	perf::"~henning/bin/harness.pl"

When reporting bugs in this notestream, please be sure to specify the
version you're using (it's the comment on the second line of the
script).  If you report the FIX for the bug along with your bug report,
you will gain extra credit.

    /John Henning
     CSD Performance Group
     Digital Equipment Corporation
     henning@zko.dec.com
     Speaking for myself, not Digital

     Digital Internal Use Only homepage: http://tlg-www.zko.dec.com/~henning

89.1. "alternates" by PERFOM::HENNING () Fri Feb 07 1997 15:40 (5 lines)

    If perf doesn't answer DECnet for you, other routes include:

	perfit::"~henning/bin/harness.pl"

or tlgmax::

89.2. "neat! nice work, John" by MSBCS::SCHNEIDER (individually twisted) Thu Feb 27 1997 19:44 (3 lines)

    Just thought I'd mention that you have at least one satisfied customer.
    
    Chuck

89.3. "New version" by PERFOM::HENNING () Mon Mar 24 1997 16:09 (2 lines)

    Bug-fixed version posted today, with fixes for the source line
    reports (affects Unix only).