Conference clt::cma

Title: DECthreads Conference
Moderator: PTHRED::MARYSTEON
Created: Mon May 14 1990
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1553
Total number of notes: 9541

1533.0. "why does only the first thread start?" by IROCZ::RRICHARD () Tue Apr 29 1997 15:43




  Hi,

  I have a customer who's trying to install a DEC product on an Alpha system
  running OpenVMS version 6.2.  I've installed a debug version of the product
  and found that it's creating several threads but for some reason only the 
  first of them is starting.  Issuing the SHOW TASK command under the VMS
  debugger results in the following display.

     DBG> SHOW TASK/ALL
   
     task id     state hold pri  substate         thread_object
     %TASK     1 SUSP       11   Condition Wait   Initial Thread
     %TASK     2 RUN        11                    7629076
     %TASK     3 READY      11   Not yet started  7635748
     %TASK     4 READY      11   Not yet started  7642580
     %TASK     5 READY      11   Not yet started  7649188
     %TASK     6 READY      11   Not yet started  7668508
     %TASK     7 READY      11   Not yet started  7669540

     DBG> SHOW TASK/FULL %TASK 3

     task id     state hold  pri substate        thread_object
     %TASK     3 SUSP         11 Not yet started 7635748
          General alert delivery is enabled

          Next pc:           (unknown)
          Start routine:     DIGITAL_COPYRIGHT+3184
          Scheduling policy: throughput

          Stack storage:
            Bytes in use:          6800        Base:    008B8000
            Bytes available:      17776        SP:      008B6570
            Reserved Bytes:       10752        Top:     008B1FFF
            Guard Bytes:              1

          Thread control block:
            Size:                   520        Address: 0074D048

          Total storage:          35849

     DBG>


  I've found that if I start the DECthreads debugger:

     DBG> set image CMA$OPEN_RTL
     DBG> call CMA_DEBUG
     DECthreads debug> version
     DECthreads Version V2.12-296e, OpenVMS AXP [OpenVMS V6.2]
     DECthreads debug> exit

  upon exiting the DECthreads debugger, threads 3 through 7 start up and the 
  application proceeds to work.  I'm new to DECthreads and until recently
  had no experience debugging in this environment, so I'm looking for
  anyone who might have some insight they could pass along to expedite the 
  task. Is there some system level parameter or logical I should track
  down?  Is there some additional piece of information I should get?

  Thanks,
  bob richard


T.R    Title    User    Personal Name    Date    Lines
1533.1. by DCETHD::BUTENHOF (Dave Butenhof, DECthreads) Tue Apr 29 1997 16:56 (9 lines)

What is thread 2 (%TASK 2) doing? That's the reason the other threads haven't
run. Maybe it's FIFO and not blocking, maybe it's not getting timesliced,
maybe it's got ASTs disabled... there's no way to tell from the information
you've shown.

For starters, go into CMA_DEBUG again and type "thread -f" to get full
information for all the threads. (That may or may not provide the answer.)

	/dave
1533.2. "more thread information" by IROCZ::RRICHARD () Tue Apr 29 1997 19:35 (124 lines)

  Hi Dave,

  Sorry about the lack of information.  One thing I should have mentioned is
  that the product works ok on another OpenVMS Alpha system.  I've been 
  comparing debug information between the two systems and haven't been able to 
  identify what the significant differences might be. DECthreads on the working
  system is VT2.12-29 but at this time I don't believe that's important.

  Thanks for the quick response.  Thread 2 is the RADIUS server thread.  It
  reads a database file, creates a UDP socket,  binds it to UDP port 1645,
  and starts three worker threads (tasks 5, 6, and 7) to handle incoming UDP 
  messages.  It appears to have completed successfully.   Threads 3 and 4
  were created by the main program function at the same time it created thread
  2.  Thread 3 is the RADIUS ACCOUNTING thread. It's similar to thread 2 except
  it creates a UDP socket using port 1646.  Thread 4 creates a TCP socket using 
  TCP port 1645.  It's used for remote management.
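
  The layout described above (one server thread that binds a UDP socket and
  hands incoming datagrams to three workers) can be sketched roughly as
  follows. This is a minimal illustration in Python, not the product's code:
  the names `start_udp_server` and `worker`, the loopback address, the receive
  timeout, and the result queue are all assumptions made for the example.

```python
import queue
import socket
import threading


def worker(sock, handled, stop):
    """Worker thread: receive datagrams from the shared socket until stopped."""
    while not stop.is_set():
        try:
            data, _addr = sock.recvfrom(4096)
        except socket.timeout:
            continue  # wake up periodically to re-check the stop flag
        except OSError:
            break     # the socket was closed underneath us
        handled.put((threading.current_thread().name, data))


def start_udp_server(port, n_workers=3, host="127.0.0.1"):
    """Bind a UDP socket and start n_workers threads that all read from it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(0.2)  # hypothetical value; lets workers notice `stop`
    sock.bind((host, port))
    handled = queue.Queue()
    stop = threading.Event()
    threads = [
        threading.Thread(target=worker, args=(sock, handled, stop),
                         name=f"worker-{i}", daemon=True)
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    return sock, threads, handled, stop
```

  Passing port 0 asks the operating system for any free port, which is
  convenient for trying the sketch; the server described here would bind 1645
  (or 1646 for the accounting thread).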

  Executing the VMS debugger's SHOW TASK/FULL command for each of the threads
  indicates that the scheduling policy for all the tasks is "throughput".  I'm
  not sure what this means, but page 2-6 of the March 1996 edition of 
  the Guide to DECthreads leads me to believe this allows all threads to get 
  some processing time.  Of course the guide states that such threads could be
  locked out by other threads using FIFO or RR scheduling.  I assume this 
  refers to threads in the same process, and I didn't see any threads using 
  either policy.

  Your response caused me to revisit the threads debugger and I'm currently
  searching through the information it provides for additional clues.  In the
  meantime here's the result of the thread -f command you suggested.  Thread
  1, the "default thread", is in a blocked state, but it's in the same state
  on the working system.  Why would entering then exiting the DECthreads
  debugger clear the problem?

  Thanks Again,
  bob richard

  DBG> set image CMA$OPEN_RTL

  DBG> call CMA_DEBUG
  DECthreads debug> thread -f
  Thread 1 (blocked, cond wait) "default thread" (0x0045E760)
    Waiting on condition variable 7 using mutex 32
    Scheduling: throughput policy at priority 11
    Thread specific data: 1: 0x0052B988, 2: 0x00538718, 3: 0x00539928, 4:
     0x00539F80, 5: 0x0052D0B8, 6: 0x00539460
    (*)Stack: 0x7EE53760 (default stack)
    General cancelability enabled, asynch cancelability disabled
    No current vp
    Join uses mutex 19 and condition variable 1; wait uses mutex 20 and
      condition variable 2
    The thread's start function and argument are unknown
    The thread's latest errno is 0

  Thread 2 (running) "<pthread user@0x00746914>" (0x00747F48)
    Scheduling: throughput policy at priority 11
    No thread specific data
    Stack: 0x00569430; base is 0x0056A000, guard area at 0x00563FFF
    General cancelability enabled, asynch cancelability disabled
    Current vp is 0x00000000
    Join uses mutex 62 and condition variable 9; wait uses mutex 63 and
      condition variable 10
    The thread's start function and argument are 0x00010C70 (0x00746910)
    The thread's latest errno is 0

  Thread 3 (ready, not started) "<pthread user@0x00748324>" (0x007499F8)
    Scheduling: throughput policy at priority 11
    No thread specific data
    Stack: 0x0075FF00; base is 0x00760000, guard area at 0x00759FFF
    General cancelability enabled, asynch cancelability disabled
    No current vp
    Join uses mutex 69 and condition variable 12; wait uses mutex 70 and
      condition variable 13
    The thread's start function and argument are 0x00010C70 (0x00748320)
    The thread's latest errno is 0

  Thread 4 (ready, not started) "<pthread user@0x00749DD4>" (0x0074AFD8)
    Scheduling: throughput policy at priority 11
    No thread specific data
    Stack: 0x00769F00; base is 0x0076A000, guard area at 0x00763FFF
    General cancelability enabled, asynch cancelability disabled
    No current vp
    Join uses mutex 71 and condition variable 14; wait uses mutex 72 and
      condition variable 15
    The thread's start function and argument are 0x00010C70 (0x00749DD0)
    The thread's latest errno is 0

  Thread 5 (ready, not started) "<pthread user@0x0074B7A4>" (0x0074FFE0)
    Scheduling: throughput policy at priority 11
    No thread specific data
    Stack: 0x008E3F00; base is 0x008E4000, guard area at 0x008DDFFF
    General cancelability enabled, asynch cancelability disabled
    No current vp
    Join uses mutex 82 and condition variable 16; wait uses mutex 83 and
      condition variable 17
    The thread's start function and argument are 0x00010C70 (0x0074B7A0)
    The thread's latest errno is 0

  Thread 6 (ready, not started) "<pthread user@0x0075031C>" (0x00750348)
    Scheduling: throughput policy at priority 11
    No thread specific data
    Stack: 0x008EDF00; base is 0x008EE000, guard area at 0x008E7FFF
    General cancelability enabled, asynch cancelability disabled
    No current vp
    Join uses mutex 84 and condition variable 18; wait uses mutex 85 and
      condition variable 19
    The thread's start function and argument are 0x00010C70 (0x00750318)
    The thread's latest errno is 0

  Thread 7 (ready, not started) "<pthread user@0x00750724>" (0x00750750)
    Scheduling: throughput policy at priority 11
    No thread specific data
    Stack: 0x008F7F00; base is 0x008F8000, guard area at 0x008F1FFF
    General cancelability enabled, asynch cancelability disabled
    No current vp
    Join uses mutex 86 and condition variable 20; wait uses mutex 87 and
      condition variable 21
    The thread's start function and argument are 0x00010C70 (0x00750720)
    The thread's latest errno is 0

  DECthreads debug> exit
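
  When comparing a dump like the one above between a working and a failing
  system, it can help to reduce each dump to a per-thread state summary and
  look only at the differences. The following is a small sketch of that idea;
  `summarize` and `diff_states` are hypothetical helper names, and the pattern
  assumes the "Thread N (state) "name"" header layout shown in this output.

```python
import re

# Matches header lines such as:
#   Thread 3 (ready, not started) "<pthread user@0x00748324>" (0x007499F8)
THREAD_RE = re.compile(r'^\s*Thread (\d+) \(([^)]*)\) "([^"]*)"')


def summarize(dump):
    """Map thread number -> state text from 'thread -f' style output."""
    states = {}
    for line in dump.splitlines():
        m = THREAD_RE.match(line)
        if m:
            states[int(m.group(1))] = m.group(2)
    return states


def diff_states(a, b):
    """Return {thread: (state_in_a, state_in_b)} for threads whose state differs."""
    return {
        t: (a.get(t), b.get(t))
        for t in sorted(set(a) | set(b))
        if a.get(t) != b.get(t)
    }
```

  Feeding the saved output from each system through `summarize` and then
  `diff_states` leaves only the threads whose state differs between the two.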


1533.3. "Could this be a DECthreads bug?" by IROCZ::RRICHARD () Tue Apr 29 1997 23:28 (10 lines)

  Hi,

  We found that we can reproduce the customer's problem if we replace the 
  SYS$LIBRARY:CMA*.* files on our working system with those from the customer's
  failing system.  Could this be a bug in DECthreads V2.12-296e?

  Regards,
  bob richard
1533.4. by SPECXN::DERAMO (Dan D'Eramo) Tue Apr 29 1997 23:58 (4 lines)
        It does sound like the problem in topics 1410 and 1411 ... the
        newer versions of the patch kits (reply 8.11) fixed that.
        
        Dan
1533.5. by DCETHD::BUTENHOF (Dave Butenhof, DECthreads) Wed Apr 30 1997 11:42 (14 lines)
Dan, thanks for remembering 1411!

Yes, it does sound like the same problem. If thread 2 is "blocked" on a
socket, it's really "running" (as shown in the output) as far as we're
concerned, and would give way to another thread only if timesliced. If, in
fact, timeslicing hadn't been enabled properly (the 1411 problem), it would
remain "running" until the socket I/O completed.
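
The effect can be illustrated with a toy model of a single-processor,
user-mode scheduler. This is a sketch under simplifying assumptions, not
DECthreads itself: `simulate`, the tick counts, and the one-tick quantum are
hypothetical, and a thread waiting on socket I/O is modelled simply as a
thread that stays "running" for a long time.

```python
from collections import deque


def simulate(threads, timeslicing, ticks, quantum=1):
    """Toy round-robin model; returns the names of threads that ever started.

    threads     -- list of (name, ticks_needed) pairs in creation order
    timeslicing -- if False, the running thread keeps the processor until done
    ticks       -- how long to observe the system
    """
    ready = deque(threads)
    started = []
    clock = 0
    while ready and clock < ticks:
        name, remaining = ready.popleft()
        if name not in started:
            started.append(name)
        # With timeslicing the thread is preempted after one quantum;
        # without it, nothing forces the thread off the processor.
        run_for = min(remaining, quantum) if timeslicing else remaining
        run_for = min(run_for, ticks - clock)
        clock += run_for
        remaining -= run_for
        if remaining > 0:
            ready.append((name, remaining))
    return started


# "thread 2" stands in for a thread stuck in a long socket wait.
threads = [("thread 2", 1000), ("thread 3", 5), ("thread 4", 5)]
print(simulate(threads, timeslicing=False, ticks=100))  # ['thread 2']
print(simulate(threads, timeslicing=True, ticks=100))   # ['thread 2', 'thread 3', 'thread 4']
```

With timeslicing off, the model reproduces the symptom in the SHOW TASK
display: one thread is "running" and the rest stay ready but not yet started.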

Unfortunately, I don't have a database correlating DECthreads version to
patch kit, so I'm not sure whether V2.12-296e is the broken ALPCMAR03_062
patch. At least, the fact that there's a letter at the end implies that it
was a patch of some sort, so it's certainly possible.
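
A version string such as the one reported by the "version" command can be
split into its parts to spot a trailing letter. The sketch below is only an
illustration: the field names and the pattern are guesses based on the strings
quoted in this topic (V2.12-296e, T2.12-296, VT2.12-29), not on any documented
DECthreads version format.

```python
import re

# Assumed layout: <letters><major>.<minor>-<build><optional lowercase letter>
VERSION_RE = re.compile(r"^([A-Z]+)(\d+)\.(\d+)-(\d+)([a-z]?)$")


def parse_version(text):
    """Split a string like 'V2.12-296e' into its (assumed) components."""
    m = VERSION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    prefix, major, minor, build, letter = m.groups()
    return {
        "prefix": prefix,
        "major": int(major),
        "minor": int(minor),
        "build": int(build),
        "letter": letter,  # empty string when there is no trailing letter
    }


def has_trailing_letter(text):
    """True if the version ends in a letter, which may indicate a patch."""
    return parse_version(text)["letter"] != ""
```

A trailing letter is only a hint; it does not say which patch kit the image
came from.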

	/dave
1533.6. "problem may be related to notes 1411 and 1410" by IROCZ::RRICHARD () Wed Apr 30 1997 15:51 (12 lines)

  Hi,

  Thanks for the help.  The problem goes away when I use the T2.12-296 version
  of CMA$RTL.EXE and returns when the V2.12-296e version is used. So, it 
  appears that version V2.12-296e may be broken.  I've asked the customer to 
  install ALPCMAR04_062.

  Regards, 
  bob richard