| What is thread 2 (%TASK 2) doing? That's the reason the other threads haven't
run. Maybe it's FIFO and not blocking, maybe it's not getting timesliced,
maybe it's got ASTs disabled... there's no way to tell from the information
you've shown.
For starters, go into CMA_DEBUG again and type "thread -f" to get full
information for all the threads. (That may or may not provide the answer.)
/dave
|
|
Hi Dave,
Sorry about the lack of information. One thing I should have mentioned is
that the product works ok on another OpenVMS Alpha system. I've been
comparing debug information between the two systems and haven't been able to
identify what the significant differences might be. DECthreads on the working
system is VT2.12-29, but at this time I don't believe that's important.
Thanks for the quick response. Thread 2 is the RADIUS server thread. It
reads a database file, creates a UDP socket, binds it to UDP port 1645,
and starts three worker threads (tasks 5, 6, and 7) to handle incoming UDP
messages. It appears to have completed successfully. Threads 3 and 4
were created by the main program function at the same time it created thread
2. Thread 3 is the RADIUS ACCOUNTING thread. It's similar to thread 2 except
it creates a UDP socket using port 1646. Thread 4 creates a TCP socket using
TCP port 1645. It's used for remote management.
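The layout described above (one server thread that binds a UDP socket and then starts three workers to handle incoming datagrams) can be sketched roughly as follows. This is only an illustration in Python, not the product's code: the names start_server and worker, the echo reply, the loopback host and the 0.2 second receive timeout are all assumptions made for the example.

```python
import socket
import threading


def worker(sock, stop):
    """Receive datagrams on the shared socket until asked to stop."""
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(4096)
        except socket.timeout:
            continue          # wake up periodically to re-check the stop flag
        except OSError:
            break             # socket was closed underneath us
        # Hypothetical placeholder for real request handling: echo it back.
        sock.sendto(data, addr)


def start_server(port, n_workers=3, host="127.0.0.1"):
    """Bind one UDP socket and start n_workers threads that share it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(0.2)      # assumed value, keeps workers responsive to stop
    stop = threading.Event()
    workers = [
        threading.Thread(target=worker, args=(sock, stop), daemon=True)
        for _ in range(n_workers)
    ]
    for t in workers:
        t.start()
    return sock, stop, workers
```

Calling start_server(1645) would mirror the server thread and start_server(1646) the accounting thread; passing port 0 lets the system pick a free port for experiments.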
Executing the VMS debugger's SHOW TASK/FULL command for each of the threads
indicates that the scheduling policy for all the tasks is "throughput". I'm
not positive as to what this means but page 2-6 of the March 1996 edition of
the Guide to DECthreads leads me to believe this allows all threads to get
some processing time. Of course, the guide states that such threads could be
locked out by other threads using FIFO or RR scheduling. I assume this
refers to threads in the same process, and I didn't see any threads using
either policy.
Your response caused me to revisit the threads debugger and I'm currently
searching through the information it provides for additional clues. In the
meantime, here's the result of the thread -f command you suggested. Thread
1, the "default thread", is in a blocked state, but it's in the same state
on the working system. Why would entering and then exiting the DECthreads
debugger clear the problem?
Thanks Again,
bob richard
DBG> set image CMA$OPEN_RTL
DBG> call CMA_DEBUG
DECthreads debug> thread -f
Thread 1 (blocked, cond wait) "default thread" (0x0045E760)
Waiting on condition variable 7 using mutex 32
Scheduling: throughput policy at priority 11
Thread specific data: 1: 0x0052B988, 2: 0x00538718, 3: 0x00539928, 4:
0x00539F80, 5: 0x0052D0B8, 6: 0x00539460
(*)Stack: 0x7EE53760 (default stack)
General cancelability enabled, asynch cancelability disabled
No current vp
Join uses mutex 19 and condition variable 1; wait uses mutex 20 and
condition variable 2
The thread's start function and argument are unknown
The thread's latest errno is 0
Thread 2 (running) "<pthread user@0x00746914>" (0x00747F48)
Scheduling: throughput policy at priority 11
No thread specific data
Stack: 0x00569430; base is 0x0056A000, guard area at 0x00563FFF
General cancelability enabled, asynch cancelability disabled
Current vp is 0x00000000
Join uses mutex 62 and condition variable 9; wait uses mutex 63 and
condition variable 10
The thread's start function and argument are 0x00010C70 (0x00746910)
The thread's latest errno is 0
Thread 3 (ready, not started) "<pthread user@0x00748324>" (0x007499F8)
Scheduling: throughput policy at priority 11
No thread specific data
Stack: 0x0075FF00; base is 0x00760000, guard area at 0x00759FFF
General cancelability enabled, asynch cancelability disabled
No current vp
Join uses mutex 69 and condition variable 12; wait uses mutex 70 and
condition variable 13
The thread's start function and argument are 0x00010C70 (0x00748320)
The thread's latest errno is 0
Thread 4 (ready, not started) "<pthread user@0x00749DD4>" (0x0074AFD8)
Scheduling: throughput policy at priority 11
No thread specific data
Stack: 0x00769F00; base is 0x0076A000, guard area at 0x00763FFF
General cancelability enabled, asynch cancelability disabled
No current vp
Join uses mutex 71 and condition variable 14; wait uses mutex 72 and
condition variable 15
The thread's start function and argument are 0x00010C70 (0x00749DD0)
The thread's latest errno is 0
Thread 5 (ready, not started) "<pthread user@0x0074B7A4>" (0x0074FFE0)
Scheduling: throughput policy at priority 11
No thread specific data
Stack: 0x008E3F00; base is 0x008E4000, guard area at 0x008DDFFF
General cancelability enabled, asynch cancelability disabled
No current vp
Join uses mutex 82 and condition variable 16; wait uses mutex 83 and
condition variable 17
The thread's start function and argument are 0x00010C70 (0x0074B7A0)
The thread's latest errno is 0
Thread 6 (ready, not started) "<pthread user@0x0075031C>" (0x00750348)
Scheduling: throughput policy at priority 11
No thread specific data
Stack: 0x008EDF00; base is 0x008EE000, guard area at 0x008E7FFF
General cancelability enabled, asynch cancelability disabled
No current vp
Join uses mutex 84 and condition variable 18; wait uses mutex 85 and
condition variable 19
The thread's start function and argument are 0x00010C70 (0x00750318)
The thread's latest errno is 0
Thread 7 (ready, not started) "<pthread user@0x00750724>" (0x00750750)
Scheduling: throughput policy at priority 11
No thread specific data
Stack: 0x008F7F00; base is 0x008F8000, guard area at 0x008F1FFF
General cancelability enabled, asynch cancelability disabled
No current vp
Join uses mutex 86 and condition variable 20; wait uses mutex 87 and
condition variable 21
The thread's start function and argument are 0x00010C70 (0x00750720)
The thread's latest errno is 0
DECthreads debug> exit
|
| Dan, thanks for remembering 1411!
Yes, it does sound like the same problem. If thread 2 is "blocked" on a
socket, it's really "running" (as shown in the output) as far as we're
concerned, and would give way to another thread only if timesliced. If, in
fact, timeslicing hadn't been enabled properly (the 1411 problem), it would
remain "running" until the socket I/O completed.
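To make the reasoning above concrete, here is a toy model in Python of a user-mode scheduler with and without timeslicing. It is not how DECthreads is implemented; the function name run, the tick counts and the quantum value are assumptions chosen only to show the starvation effect when a thread that never gives up the processor is first in line.

```python
from collections import deque


def run(tasks, steps, quantum=None):
    """Simulate `steps` ticks and return the ticks each task received.

    tasks:   dict of name -> ticks of work needed, or None for a task
             that never voluntarily gives up the processor.
    quantum: timeslice length in ticks, or None for no timeslicing.
    """
    ready = deque(tasks)              # ready queue in creation order
    remaining = dict(tasks)
    got = {name: 0 for name in tasks}
    used = 0                          # ticks used in the current slice
    for _ in range(steps):
        if not ready:
            break
        current = ready[0]
        got[current] += 1
        used += 1
        if remaining[current] is not None:
            remaining[current] -= 1
            if remaining[current] == 0:
                ready.popleft()       # task finished, next one runs
                used = 0
                continue
        if quantum is not None and used >= quantum:
            ready.rotate(-1)          # preempt: move current to the back
            used = 0
    return got


tasks = {"thread 2": None, "thread 3": 5, "thread 4": 5}
print(run(tasks, 30))                 # no timeslicing
print(run(tasks, 30, quantum=2))      # timeslicing enabled
```

In this model, without a quantum "thread 2" receives all 30 ticks and the other two never start, which matches the "ready, not started" state in the listing. With a quantum of 2, threads 3 and 4 each get the 5 ticks they need.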
Unfortunately, I don't have a database correlating DECthreads version to
patch kit, so I'm not sure whether V2.12-296e is the broken ALPCMAR03_062
patch. At least, the fact that there's a letter at the end implies that it
was a patch of some sort, so it's certainly possible.
/dave
|
|
Hi,
Thanks for the help. The problem goes away when I use the T2.12-296 version
of CMA$RTL.EXE and returns when the V2.12-296e version is used. So, it
appears that version V2.12-296e may be broken. I've asked the customer to
install ALPCMAR04_062.
Regards,
bob richard
|