
Conference clt::cma

Title:DECthreads Conference
Moderator:PTHRED::MARYSTEON
Created:Mon May 14 1990
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1553
Total number of notes:9541

1470.0. "Realtime with Threads ?" by STOSS1::KASEFANG () Thu Jan 23 1997 16:58

T.R  Title  User  Personal Name  Date  Lines
1470.1. "We need some basic information..." by WTFN::SCALES (Despair is appropriate and inevitable.) Thu Jan 23 1997 17:58, 15 lines
1470.2. "Here's a little more info" by STOSS1::KASEFANG () Fri Jan 24 1997 13:33, 33 lines
    
    DUNIX 4.0a is the OS.
    
    The clock thread performs a timed wait for 1/60 of a second on a
    condition variable.  The variable is not signaled.
    
    When the wait times-out the clock thread signals the 3 medium priority
    threads.  They are currently using 3 threads at the same priority.  They
    will want to use 7 priorities for their implementation.  The low
    priority threads use the lowest priority.
    
    Schedule Policy/Priorities
    
    Seven priorities are defined specific to each schedule policy.
    
    	Highest is pri_rr_max - 1, pri_fifo_max - 1 .... .
    	Lowest is pri_rr_max - 7, pri_fifo_max - 7 .... .
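
    A minimal sketch of how the seven levels above might be computed,
    assuming the final POSIX (1003.1c) interface, where the per-policy
    maximum comes from sched_get_priority_max() rather than a pri_rr_max
    style constant.  The names RT_LEVELS and rt_level_priority are
    hypothetical, not taken from the customer's code.

```c
#include <errno.h>
#include <sched.h>

#define RT_LEVELS 7   /* hypothetical: number of application priority levels */

/*
 * Map level 1 (highest) .. RT_LEVELS (lowest) to "policy maximum - level".
 * Returns 0 and stores the priority, or an error number on failure.
 */
int rt_level_priority(int policy, int level, int *priority)
{
    int max = sched_get_priority_max(policy);
    int min = sched_get_priority_min(policy);

    if (max == -1 || min == -1)
        return errno;                     /* policy not recognized */
    if (level < 1 || level > RT_LEVELS || max - level < min)
        return EINVAL;                    /* level outside the usable range */

    *priority = max - level;
    return 0;
}
```

    The range check matters because a policy is not guaranteed to offer
    seven distinct priorities below its maximum.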
    
    The following policies were tested: round robin, fifo, bg_np and other.
    
    The higher priority threads never appeared to execute in fifo and rr.
    
    All threads appeared to function unpredictably in other and bg_np.
    
    The high priority thread didn't appear to execute on a timely basis in
    other.
    
    I've requested code and the output from setld.  I should have both a
    little later.
    
    Thanks for the assistance,
    
    David
1470.3. "Need to use "Real-time" scheduling policies" by WTFN::SCALES (Despair is appropriate and inevitable.) Fri Jan 24 1997 14:18, 20 lines
OK, David, it looks like the customer has avoided most of the worst pitfalls.

As observed, in order for the clock thread to be responsive, it must have a
scheduling policy of either FIFO or RR; OTHER will not necessarily preempt
the currently running thread.

The fg_np (i.e., OTHER) and the bg_np scheduling policies do not run in
strict priority order.  These policies are intended to run high priority
threads more than low priority threads while avoiding starvation of any of
the threads.  Thus, this customer probably wants to stick to the "strict
priority/preemption" (aka "real-time") scheduling policies, at least for all
but the three low priority threads.

.2> The higher priority threads never appeared to execute in fifo and rr.

This sounds like a bug.  I'd be interested to know if you can reproduce it.



					Webb
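
A minimal sketch of requesting one of the strict priority policies at thread
creation time, assuming the final POSIX (1003.1c) interface.  The helper name
init_rt_attr is hypothetical.  It only fills in the attributes object; the
thread is created later by passing the object to pthread_create().

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/*
 * Initialize `attr` so that a thread created with it uses `policy`
 * (SCHED_FIFO or SCHED_RR) at `priority`.  Returns 0 or an error number.
 */
int init_rt_attr(pthread_attr_t *attr, int policy, int priority)
{
    struct sched_param param;
    int status;

    status = pthread_attr_init(attr);
    if (status != 0)
        return status;

    /* Without PTHREAD_EXPLICIT_SCHED the new thread inherits the
     * creator's scheduling and the values set below are ignored. */
    status = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (status == 0)
        status = pthread_attr_setschedpolicy(attr, policy);
    if (status == 0) {
        memset(&param, 0, sizeof param);
        param.sched_priority = priority;
        status = pthread_attr_setschedparam(attr, &param);
    }

    if (status != 0)
        pthread_attr_destroy(attr);
    return status;
}
```

Setting the policy before the priority is deliberate, since some
implementations validate the priority against the policy already stored in the
attributes object.  On many systems the later pthread_create() call can still
fail for lack of privilege.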
1470.4. "" by SMURF::DENHAM (Digital UNIX Kernel) Fri Jan 24 1997 20:11, 8 lines
    If this is a serious realtime application (OK, soft, but serious),
    then these folks are probably going to want system-contention-scope
    threads coming in the next functional release (V4.0D tentatively).
    Until then, the threads may be FIFO with a high priority, but
    they'll be competing against other threads/processes on the system
    on a timeshare basis.
    
    Webb will explain more :^) if he thinks it's relevant.
1470.5. "Is this a dedicated platform?" by WTFN::SCALES (Despair is appropriate and inevitable.) Mon Jan 27 1997 14:28, 12 lines
OK, since Jeff prompted...

David, is the customer's system dedicated to this job?  Or, does the customer
expect to run other processes/applications on this system "in the background"
while the simulator is running?

If the answer is that there will be other stuff running on the system, then,
yes, the customer will need the SCS stuff, which will be available in the next
release.


				Webb
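
For illustration only, a sketch of how code written against the final POSIX
(1003.1c) interface might ask for system contention scope and carry on with
whatever scope it gets.  It assumes the implementation reports an unsupported
scope with ENOTSUP; the helper name prefer_system_scope is hypothetical.

```c
#include <errno.h>
#include <pthread.h>

/*
 * Ask for system contention scope on an initialized attributes object.
 * If the implementation does not support it, leave the scope unchanged.
 * On success, *scope holds the scope actually in effect for `attr`.
 */
int prefer_system_scope(pthread_attr_t *attr, int *scope)
{
    int status = pthread_attr_setscope(attr, PTHREAD_SCOPE_SYSTEM);

    if (status != 0 && status != ENOTSUP)
        return status;                   /* e.g. EINVAL: bad attributes object */

    return pthread_attr_getscope(attr, scope);
}
```

The caller can compare *scope with PTHREAD_SCOPE_SYSTEM to decide whether to
warn that the threads will compete only within the process.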
1470.6. "I'm the lucky winner!!" by RHETT::PARKER () Mon Jan 27 1997 19:52, 40 lines
    
    Thanks for the replies Jeff & Webb. David is in class this week so
    he called us to work with the customer. They have narrowed down the
    application to a mere 5000 lines of code! :-)
    
    I'm working on it now - I thought I would try building it on 3.2C
    statically in order to get the system contention scope behavior as
    well as building it on 4.0/A/B to see what happens there as well. 
    
    Basically, what I am seeing is that lower priority threads do not
    appear to be preempted by higher priority threads. If I change the
    code so that the lower priority threads call sched_yield(), then 
    it's much better. I've tried both FIFO/RR as well as the OTHER -
    bg_np/fg_np scheduling policies without much difference. 
    
    So, yes I think it's a process versus contention scope related issue
    but I thought I should test it to make sure. We are up against SGI
    in this bid so it may be pretty tough! 
    
    I'll update here with findings in a day or so. Of course, if anybody
    has any other suggestions, I'm open to them. There are about a dozen 
    threads in all, some at low priority, some at medium, and some at high
    priority - namely the clock related thread.
    
    On 4.0, they are linking like: 
    
    cc -g -o relsim *.o -lpthread -lpthreads -lmach -lexc -lc
    
    On 3.2C, I know I need -non_shared but will I also use the following
    libraries ? 
    
    -lpthreads -lmach -lc_r -lc in that order? I always get confused by
    this and this time, I'll write it down and plaster it to my cube wall!
    :-)
    
    Thanks for the input!!
    
    Lee Parker
    Realtime Expertise Center 
    
1470.7. "Building on Digital UNIX" by DCETHD::BUTENHOF (Dave Butenhof, DECthreads) Tue Jan 28 1997 11:09, 42 lines
>    On 4.0, they are linking like: 
>    
>    cc -g -o relsim *.o -lpthread -lpthreads -lmach -lexc -lc

They should be linking "cc -g -o relsim *.o -pthread" if they're using the
POSIX api, or "cc -g -o relsim *.o -threads" if they're using the DCE thread
api (also known, though slightly inaccurately, as "draft 4 POSIX").

There is no reason not to use the proper switches when compiling and linking
with cc. For example, not only have you gotten the libraries in the wrong
order, but you've omitted the CRITICAL definition of _REENTRANT.

The only reason to ever use the -l list directly is that, when building a
shared library, one must use ld to link, and ld doesn't accept either
-pthread or -threads. You should still always use -threads or -pthread for
the compilation, and let the compiler driver take care of remembering
-D_REENTRANT for you!

(For the record, "-threads" is "-D_REENTRANT -lpthreads -lpthread -lmach
-lexc" and "-pthread" is "-D_REENTRANT -lpthread -lmach -lexc"... and "cc"
ALWAYS adds "-lc" at the end, so you don't need to.)

>    On 3.2C, I know I need -non_shared but will I also use the following
>    libraries ? 
>    
>    -lpthreads -lmach -lc_r -lc in that order? I always get confused by
>    this and this time, I'll write it down and plaster it to my cube wall!
>    :-)

No, you do NOT need -non_shared on 3.2C, unless you really want to build
non-shared. (I recommend against it, but if they really want to, they can.)

Again, you should always use the proper compiler switch when you're compiling
and linking with cc. 3.2C didn't support POSIX threads, only DCE threads, so
there's no "-pthread", only "-threads". There's no libpthread on 3.2C, and
threaded programs didn't use libexc (there wasn't a .so anyway), and threaded
code additionally required the reentrant version of the C runtime. So the
libraries were "-lpthreads -lmach -lc_r" in that order. Again, cc always
includes -lc at the end, so you don't need to worry about it. And, as in 4.0,
it's CRITICAL that you define _REENTRANT -- either by using "-threads", or by
using the -D_REENTRANT directly. It's really much easier just to remember
"-threads".
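
One way to catch a build that slipped through without the definition is to
have the program report it.  A small sketch, with hypothetical helper names
(built_reentrant, report_build); it only detects whether _REENTRANT was
defined for the file it lives in.

```c
#include <stdio.h>

/* Returns 1 when this file was compiled with _REENTRANT defined, else 0. */
int built_reentrant(void)
{
#ifdef _REENTRANT
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical startup diagnostic: print how this file was built. */
void report_build(void)
{
    printf("_REENTRANT %s\n", built_reentrant() ? "defined" : "NOT defined");
}
```

A stricter variant would replace the #else branch with an #error directive so
that a misconfigured makefile fails at compile time instead.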
1470.8. "Thanks for the clarification!" by RHETT::PARKER () Tue Jan 28 1997 12:25, 71 lines
Hi Dave,

Thank you for the clarification. I know this has been discussed many
times, your note now makes it crystal clear! Just having the one switch
as opposed to specifically linking in several libraries is much better!

>    On 4.0, they are linking like:
>
>    cc -g -o relsim *.o -lpthread -lpthreads -lmach -lexc -lc

That's what was in their Makefile and I did wonder if it was correct. 
Of course, it worked... :-)

>>They should be linking "cc -g -o relsim *.o -pthread" if they're using 
>>the POSIX api, or "cc -g -o relsim *.o -threads" if they're using the 
>>DCE thread api (also known, though slightly inaccurately, as "draft 4 
>>POSIX").

Well, I tried that and now I'm getting an Unresolved on pthread_exit. 
I/they must be doing something wrong!

>>There is no reason not to use the proper switches when compiling and 
>>linking with cc. For example, not only have you gotten the libraries 
>>in the wrong order, but you've omitted the CRITICAL definition of 
>>_REENTRANT.

That was in their makefile but it was not being defined. Thanks for
pointing that out. 
    
>>The only reason to ever use the -l list directly is that, when building 
>>a shared library, one must use ld to link, and ld doesn't accept either
>>-pthread or -threads. You should still always use -threads or -pthread 
>>for the compilation, and let the compiler driver take care of remembering
>>-D_REENTRANT for you!

>> Understood. Thanks.

>>(For the record, "-threads" is "-D_REENTRANT -lpthreads -lpthread -lmach
>>-lexc" and "-pthread" is "-D_REENTRANT -lpthread -lmach -lexc"... and "cc"
>>ALWAYS adds "-lc" at the end, so you don't need to.)

>    On 3.2C, I know I need -non_shared but will I also use the following
>    libraries ?
>
>    -lpthreads -lmach -lc_r -lc in that order? I always get confused by
>    this and this time, I'll write it down and plaster it to my cube wall!
>    :-)

>>No, you do NOT need -non_shared on 3.2C, unless you really want to build
>>non-shared. (I recommend against it, but if they really want to, they can.)

Well, I was going to do this in order to make sure I get system contention
scope behavior. I think it was you who suggested doing this IF one really
needs that. But, since 3.2C didn't support POSIX threads, I guess I can 
scrap that idea. :-) I think it's time for me to take a course on 
DECthreads! 

>>Again, you should always use the proper compiler switch when you're compiling
>>and linking with cc. 3.2C didn't support POSIX threads, only DCE threads, so
>>there's no "-pthread", only "-threads". There's no libpthread on 3.2C, and
>>threaded programs didn't use libexc (there wasn't a .so anyway), and threaded
>>code additionally required the reentrant version of the C runtime. So the
>>libraries were "-lpthreads -lmach -lc_r" in that order. Again, cc always
>>includes -lc at the end, so you don't need to worry about it. And, as in 4.0,
>>it's CRITICAL that you define _REENTRANT -- either by using "-threads", or by
>>using the -D_REENTRANT directly. It's really much easier just to remember
>>"-threads".


    
1470.9. "" by DCETHD::BUTENHOF (Dave Butenhof, DECthreads) Wed Jan 29 1997 10:29, 25 lines
> Well, I tried that and now I'm getting an Unresolved on pthread_exit. 
> I/they must be doing something wrong!

Probably. If you can't figure it out, I'll need a more complete example to
guess what's happening, though.

> Well, I was going to do this in order to make sure I get system contention
> scope behavior. I think it was you who suggested doing this IF one really
> needs that. But, since 3.2C didn't support POSIX threads, I guess I can 
> scrap that idea. :-) I think it's time for me to take a course on 
> DECthreads! 

Ah. You wanted to link static on 3.2C and bring the binary to 4.0. I
understand now. Yes, if you're using POSIX threads rather than DCE threads or
CMA, you don't have that option. (And, by the way, I still wouldn't RECOMMEND
that -- there are a lot of improvements in the 4.0 libraries that you'll miss
by doing that.)

And, uh, by the way... are you SURE they're really using "POSIX threads"?
It's hard not to be a little confused by the fact that POSIX threads and DCE
threads both have "pthread_" names. (It's unfortunate that OSF insisted we
use the pthread_ prefix for DCE threads, and it's unfortunate that I gave in
to them instead of fighting harder, but none of that helps now.)

	/dave
1470.10. "More info" by RHETT::PARKER () Wed Jan 29 1997 13:57, 91 lines

Hi Dave, 

>> Thanks for the information. I found the 4.0 "Guide to DECthreads" 
>> manual yesterday morning - sorry about .6, asking how to build. 
>> You should have just told me to RTFM!  :-)

>> I went to appendix A and right there in black and white is how
>> to compile and link and lots of other good information. 

>> Great job on the manual!! - I'm going to be getting very familiar 
>> with it over the next few months! From what I've read so far, it's
>> very well written - clear and concise!

> Well, I tried that and now I'm getting an Unresolved on pthread_exit.
> I/they must be doing something wrong!

Probably. If you can't figure it out, I'll need a more complete example to
guess what's happening, though.

>> Sorry - they had left out pthread.h in a couple of their .c's. Guess
>> that's just one reason to make sure you are building correctly! ;-)

> Well, I was going to do this in order to make sure I get system contention
> scope behavior. I think it was you who suggested doing this IF one really
> needs that. But, since 3.2C didn't support POSIX threads, I guess I can
> scrap that idea. :-) I think it's time for me to take a course on
> DECthreads!

Ah. You wanted to link static on 3.2C and bring the binary to 4.0. I
understand now. Yes, if you're using POSIX threads rather than DCE threads or
CMA, you don't have that option. (And, by the way, I still wouldn't RECOMMEND
that -- there are a lot of improvements in the 4.0 libraries that you'll miss
by doing that.)

>> Ok - thanks for the info! We are just starting to see issues w/ using
>> realtime and threads. For some apps that "broke" on 4.0 (due to not
>> having system contention scope), that was the only alternative. Now
>> people are starting to port to the final 1003.1c POSIX routines and 
>> they are going to wait for 4.0D and just go with process contention 
>> scope until then (and may wind up sticking with it depending on the 
>> degree of determinism they can achieve). 

And, uh, by the way... are you SURE they're really using "POSIX threads"?
It's hard not to be a little confused by the fact that POSIX threads and DCE
threads both have "pthread_" names. (It's unfortunate that OSF insisted we
use the pthread_ prefix for DCE threads, and it's unfortunate that I gave in
to them instead of fighting harder, but none of that helps now.)

>> Well, I think I am. They are including pthread.h and the routines 
>> being used are those described in PART II of the manual. Now I'm really
>> confused! :-} I'm now building using the -pthreads option to cc and it
>> seems to compile fine. I tried -threads and that seems to work too. 

>> Is there an easy way to tell? 

>> Perhaps I should file a high/med priority QAR and see if someone can
>> look into it. The application is a realtime flight simulator and they
>> are comparing Digital UNIX/Alpha to SGI. They narrowed it down to a 
>> test case of about 2400 lines of code. I've been running it here and
>> I am finding that even though they appear to be filling in the attribute
>> structure correctly, specifying PTHREAD_EXPLICIT_SCHED, the threads 
>> are not shown by ps axm -OSCHED to be using RR scheduling policy. Of 
>> course, since we are now using process contention scope, I'm not sure
>> that ps(1) will be able to correctly report that anymore. 

>> Just called the customer ...
>> They started this application in summer/95 and it runs correctly on
>> Solaris and SGI. On Solaris the routines are different, don't contain
>> the pthread_ prefix... But the fact that it runs correctly on SGI
>> does make me concerned. I could go into a lot more detail on what I
>> have tried so far but I'm not sure it's worth the time. I guess I could
>> also file an IPMT instead but I'm not convinced they are not doing 
>> something incorrectly on Digital UNIX. Looks like they started with the
>> draft 4 routines and are bringing that up to the final POSIX routines.

>> Would someone up there be able to spend a little time on this over the
>> next day or two if I file a QAR? Or, I could put a tar file on our 
>> internal anonymous ftp server if someone wants to pick it up. 

>> BTW: Where does one file QAR's for DECthreads on Digital UNIX ? If it's
>> the same as any other Digital UNIX QAR, I already know how to do that.

>> Thank you all for your input so far!! It's been very helpful!!

Lee Parker
Realtime Expertise Center


    
1470.11. "Show us the reproducer! (;-)" by WTFN::SCALES (Despair is appropriate and inevitable.) Wed Jan 29 1997 14:27, 53 lines
.10> I'm now building using the -pthreads option to cc and it
.10> seems to compile fine. I tried -threads and that seems to work too. 
.10>
.10> Is there an easy way to tell? 

Both interfaces provide routines with the same names for the most part, and
the routine signatures are pretty much the same as well.  (Unfortunately.)

The most obvious difference is that the D4 interface returns -1 on error, and
the standard interface returns the ERRNO code.

Also, PTHREAD_EXPLICIT_SCHED is a constant in the standard interface (its
analog in the D4 interface is PTHREAD_DEFAULT_SCHED), so it sounds like they
are using the standard (-pthread) interface.
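
To make the difference in error conventions concrete, here is a small sketch
of the two styles of status check.  The helper names check_std and check_d4
are hypothetical; the first follows the standard interface (the error number
is the return value), the second follows the D4 style (-1 is returned and the
error number is in errno).

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Standard (1003.1c) style: 0 means success, anything else is the error. */
int check_std(int status, const char *what)
{
    if (status != 0)
        fprintf(stderr, "%s: %s\n", what, strerror(status));
    return status;
}

/* D4 style: -1 means failure and the error number is left in errno. */
int check_d4(int result, const char *what)
{
    if (result == -1) {
        int err = errno;                 /* save before stdio can change it */
        fprintf(stderr, "%s: %s\n", what, strerror(err));
        return err;
    }
    return 0;
}
```

A check written for one convention silently passes failures from the other: a
test for -1 never fires on the standard interface, and a test for nonzero
misreports success values other than 0 under D4.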


.10> the threads are not shown by ps axm -OSCHED to be using RR scheduling
.10> policy

Process contention scope scheduling parameters do not show up in the ps
output (it shows the scheduling parameters of the DECthreads "virtual
processor" instead).


.10> BTW: Where does one file QAR's for DECthreads on Digital UNIX ? If it's
.10> the same as any other Digital UNIX QAR, I already know how to do that.

RTFNF.  ;-)  See note 3.3.  (Yes, it's the same as for any other component of
the Digital Unix base operating system.)


.6> Basically, what I am seeing is that lower priority threads do not
.6> appear to be preempted by higher priority threads. If I change the
.6> code so that the lower priority threads call sched_yield(), then
.6> it's much better. I've tried both FIFO/RR as well as the OTHER -
.6> bg_np/fg_np scheduling policies without much difference.

This would strike me as a bug.  Can you write a small test program which
demonstrates that you don't see preemption with FIFO/RR policies?  If so,
please enter a QAR or open an IPMT case, as appropriate.


					Webb


P.S. Lee, your convention for quoting previous text is very confusing.  The
typical convention is to add characters at the beginning of the lines which
you are quoting, not at the beginning of the lines you are writing.  (This
way, it's the lines which are quoted from quotations that acquire multiple
characters at the front; the new text is obvious by its lack of quotation
characters...)  [Alternatively, in notes conferences it makes sense to put
the note/reply number in the quote prefix, which can be useful since it
serves as "bibliography" as well as quote-indicator.]
1470.12. "Coming up!" by RHETT::PARKER () Wed Jan 29 1997 15:16, 30 lines
    
    Hi Webb, 
    
    Ok, I think I may try to narrow it down to a smaller test case.
    
    I kinda thought that ps(1) would not work correctly when using
    process contention scope. Thanks for the info...How about when
    system contention scope comes back? Or, is this not going to work
    then either since the scheduling for threads is now done in user
    mode?
    
    .11> P.S. Lee, your convention for quoting previous text is very confusing.
    .11> The typical convention is to add characters at the beginning of the
    .11> lines which you are quoting, not at the beginning of the lines you are
    .11> writing. (This way, it's the lines which are quoted from quotations
    .11> that acquire multiple characters at the front; the new text is obvious
    .11> by its lack of quotation characters...)  [Alternatively, in notes
    .11> conferences it makes sense to put the note/reply number in the quote
    .11> prefix, which can be useful since it serves as "bibliography" as well
    .11> as quote-indicator.]
    
    Amen for that suggestion!! I just about drove myself crazy trying to
    follow a couple of my own note strings!! :-)
    Now, if we can just get a vi or emacs editor for notes...
    
    Thanks, 
    
    Lee
    
    
1470.13. "" by DCETHD::BUTENHOF (Dave Butenhof, DECthreads) Wed Jan 29 1997 15:43, 16 lines
>    I kinda thought that ps(1) would not work correctly when using
>    process contention scope. Thanks for the info...How about when
>    system contention scope comes back? Or, is this not going to work
>    then either since the scheduling for threads is now done in user
>    mode?

ps will be able to show system contention scope threads, but it won't help to
associate those kernel threads with the user threads in your program. (But
that's not a new problem.) Pete's toyed with the notion of modifying ps to
use libpthreaddebug.so to show user thread information. That'd be "cute", but
maybe not practical.

You can always run the program with ladebug to see the full scoop on all the
threads.

	/dave
1470.14. "More info" by RHETT::PARKER () Thu Jan 30 1997 18:35, 115 lines
    
    
    

Thanks for the info! 

.13> You can always run the program with ladebug to see the full 
.13> scoop on all the threads.

I'm new to ladebug too - any special tricks to doing this? I have
used ladebug on this app w/ some interesting side-effects. These
led me to try to fix this compiler warning :

cc: Warning: rt_pkg.c, line 183: In this statement, the referenced 
type of the pointer value "&rt_clock" is "function (pointer to unnamed 
struct) returning void ", which is not compatible with "function (pointer 
to void) returning pointer to void".
  rt_clock_address = &rt_clock;
--^
cc: Warning: rt_pkg.c, line 274: In this statement, the referenced type 
of the pointer value "&rt_task" is "function (pointer to unnamed struct) 
returning void" , which is not compatible with "function (pointer to void) 
returning pointer to void".
  rt_address = &rt_task;
--^

The offending line :

  void *(*rt_clock_address)(void *);

  rt_clock_address = &rt_clock;

I changed to :

  void (*rt_clock_address)(clock_specific_struct *);

  rt_clock_address = &rt_clock;

And, the other one:

  void *(*rt_address)(void *);

  rt_address = &rt_task;

changed to:

  void (*rt_address)(thread_specific_struct *);

  rt_address = &rt_task;

Those warnings maybe were ok but I thought I should try to fix it ...

And, once I've fixed that, I start getting other warnings that really 
concern me! Like :


cc: Warning: rt_pkg2.c, line 232: In this statement, the referenced 
type of the pointer value "rt_clock_address" is "function (pointer to 
unnamed struct) returning void", which is not compatible with "function 
(pointer to void) returning pointer to void".
  status = pthread_create(&rt_clock_id,
-----------^
cc: Warning: rt_pkg2.c, line 337: In this statement, the referenced type 
of the pointer value "rt_address" is "function (pointer to unnamed struct) 
returning void", which is not compatible with "function (pointer to void) 
returning pointer to void".
  status = pthread_create(&rt_task1_id,
-----------^
cc: Warning: rt_pkg2.c, line 348: In this statement, the referenced type 
of the pointer value "rt_address" is "function (pointer to unnamed struct) 
returning void", which is not compatible with "function (pointer to void) 
returning pointer to void".
  status = pthread_create(&rt_task2_id,
-----------^
cc: Warning: rt_pkg2.c, line 355: In this statement, the referenced type 
of the pointer value "rt_address" is "function (pointer to unnamed struct) 
returning void", which is not compatible with "function (pointer to void) 
returning pointer to void".
  status = pthread_create(&rt_task3_id,
-----------^

Needless to say, warnings like this on the pthread_create() concern me
a lot! :-)

The definitions that it's complaining about are :


  status = pthread_create(&rt_clock_id,
                          &rt_clock_attr,
                          rt_clock_address,
                          &rt_clock_data);
  check(status, "rt_clock create error");

.... 

BTW: The check routine checks for (!= 0) and not for (!= -1)

Thanks for that tip in an earlier note.

Anyway, I was getting TRAP signals when running in ladebug when the
pthread_create() occurred but I was stepping and maybe that's why?

Remember, I'm just trying to find out why their code is not behaving
as expected. They say this all works on SGI - I'm doubtful!!

Any comments or suggestions? I've narrowed the code down a bit
but it's still ~2000 lines...

Thanks again!! 

Lee


    
1470.15. "Duh!" by RHETT::PARKER () Thu Jan 30 1997 19:32, 10 lines
    
    
    Oh, duh! Nevermind, I looked at the man page for pthread_create(3)
    and now I see what's going on. Well, almost! :-0
    
    I'll keep plugging away at it. Man, this threads stuff is really
    different!
    
    Lee
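
    For anyone following along, a sketch of the shape pthread_create()
    expects under the final POSIX (1003.1c) interface: the start routine
    takes a void * and returns a void *, and converts the argument back to
    its real type inside the routine.  The struct field `ticks` and the
    helper start_rt_clock are hypothetical; rt_clock and
    clock_specific_struct are only borrowed as names from .14.

```c
#include <pthread.h>
#include <stddef.h>

typedef struct {
    int ticks;                           /* hypothetical field */
} clock_specific_struct;

/* Start routine with the signature pthread_create() requires. */
void *rt_clock(void *arg)
{
    clock_specific_struct *data = arg;   /* recover the real type here */

    data->ticks++;
    return NULL;                         /* value handed to pthread_join() */
}

/* No cast or intermediate function pointer is needed in the call. */
int start_rt_clock(pthread_t *id, const pthread_attr_t *attr,
                   clock_specific_struct *data)
{
    return pthread_create(id, attr, rt_clock, data);
}
```

    Changing the routine, rather than the type of the pointer variable
    that holds its address, is what makes both sets of warnings go away.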
    
1470.16. "Incoming..." by RHETT::PARKER () Tue Feb 04 1997 17:49, 26 lines
    
    Hi Folks, 
    
    Well, I've worked with this program enough to convince myself
    that we are dealing with a bug here. I added some calls to
    pthread_getschedparam(3) to verify that the scheduling policy 
    is round robin and the priorities are set correctly. This looks
    good but, unless the lower priority threads do something that
    blocks, like sleep(2), the higher priority thread does not get
    to run. If I call sched_yield(3) instead of the sleep, then it
    helps a little. But the one second sleep allows the highest
    priority thread to run the most. 
    
    This is just a heads up on an incoming IPMT - priority 2. 
    
    My apologies if I have overlooked something. I don't think I 
    have though. In any event, we can win against SGI if we can
    show them that this works. Unfortunately, it already works
    on their UNIX. It works on Solaris too but they got the boot
    anyway! ;-)
    
    Feel free to beat me up if I missed something!! 
    
    Lee
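
    A sketch of the kind of check described above, assuming the final
    POSIX (1003.1c) interface.  The helper name report_sched is
    hypothetical and this is not the code added to the customer's program.

```c
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/*
 * Fetch and print the scheduling policy and priority of `thread`.
 * Returns 0 or an error number; on success the values are also stored
 * through `policy` and `priority`.
 */
int report_sched(const char *name, pthread_t thread, int *policy, int *priority)
{
    struct sched_param param;
    const char *label;
    int status = pthread_getschedparam(thread, policy, &param);

    if (status != 0)
        return status;

    *priority = param.sched_priority;
    if (*policy == SCHED_FIFO)
        label = "FIFO";
    else if (*policy == SCHED_RR)
        label = "RR";
    else
        label = "other";
    printf("%s: policy=%s priority=%d\n", name, label, *priority);
    return 0;
}
```

    Calling it from inside each thread with pthread_self() reports what
    the library believes the thread is running with, which is the
    information ps cannot show for process contention scope threads.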
    
    
1470.17. "It's a (now) known problem" by WTFN::SCALES (Despair is appropriate and inevitable.) Tue Feb 04 1997 20:36, 12 lines
We've had a report of a similar problem (preemption after blocking in a system
call not working) from another customer.  It turns out that the problem reported
in this note boils down to the same problem, because on V4.0a, the manager
thread (the one responsible for, among other things, waking threads at timeout),
is basically like any other thread and it's blocking in a system call.  

I expect that the problem in this note will be resolved in V4.0c, in which the
manager thread is treated specially.  The general fix to the preemption problem
will be available in the following release at the earliest, and possibly not
until the next functional release.

Thanks for the IPMT case (it just arrived)....  :-)