There's really not enough information in your message to guess what might be
going wrong. If your pseudocode is even roughly accurate, then you have a
number of coding problems to resolve. If it's not accurate, then you need to
supply a lot more information.
>The program crashes after the wait on r_var. The next statement apprantly is
>the pthread_mutex_unlock(&r_mutex..). Why would this give us an access
>violation? We are not locking/using this mutex anywhere else except this
>place. We have locked and unlocked the mutex whenever we did a cond_wait and
>we hv initialised the mutex and condition variables.
If you get an accvio unlocking a mutex that you've properly initialized, then
it was probably corrupted by some asynchronous action of the program. There's
no way to even guess how or where this might have happened with only such
sketchy pseudocode.
>rval = pthread_attr_getstacksize(&check_attr,&st_size);
You don't show how "check_attr" is initialized. You cannot set a stacksize of
0 bytes, so the following call will never return a value of 0 in st_size, IF
check_attr is really a properly initialized attributes object. Are you sure
you're not, perhaps, expecting "rval" to contain the stack size on return?
That was how the obsolete DCE thread interfaces worked. But in POSIX threads,
a function's return value (with a few exceptions, such as
pthread_getspecific() and pthread_self()) is only the function's status -- an
error code from <errno.h>. The value 0 merely means that the function
succeeded; the stack size itself comes back through the st_size argument.
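
To make the distinction concrete, here is a minimal sketch of reading the stack size from an attributes object. The helper name default_stack_size() is hypothetical, and I'm assuming a default-initialized attributes object since you didn't show how yours is set up.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: read the stack size recorded in a freshly
 * initialized attributes object. The return value is only a status
 * (0 or an error number); the size comes back through *out_size. */
int default_stack_size(size_t *out_size)
{
    pthread_attr_t attr;
    int status;

    assert(out_size != NULL);

    status = pthread_attr_init(&attr);
    if (status != 0)
        return status;          /* an error number, never a size */

    status = pthread_attr_getstacksize(&attr, out_size);
    (void)pthread_attr_destroy(&attr);
    return status;
}
```

A caller checks the returned status first and only then looks at the size it passed in by address.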
>while (1)
> pthread_cond_timedwait(&cond_var,&cond_mutex,&exptime);
> dequeue from the queue and store it in a datastructure.
This, if it accurately represents the actual code, is an illegal use of
condition variable waits. (Well, not strictly "illegal", but incorrect and
usually meaningless.) You must always wait in a loop that re-tests the
predicate (here, "the queue is not empty"), because a condition wait may
return for a number of reasons even when your program did not signal or
broadcast the condition variable, and a timed wait can also return simply
because the time expired. Your "while (1)" repeats the wait, but nothing in
it checks the queue before dequeuing. Depending on how your "dequeue" code
works, this may or may not be relevant. However, if you are ASSUMING, without
verification, that the queue is not empty merely because the wait has
returned, then this error could cause memory corruption.
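
As a rough sketch of what I mean by waiting in a predicate loop, something like the following would do. The struct queue type, QCAP, queue_init() and queue_get_timed() are all names I made up for illustration, and I'm assuming a simple fixed-size queue of ints; your real data structure will differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

#define QCAP 16                     /* hypothetical fixed capacity */

struct queue {                      /* hypothetical queue of ints */
    pthread_mutex_t mutex;
    pthread_cond_t  nonempty;
    int items[QCAP];
    int head, count;
};

/* Initialize an empty queue. Returns 0 or an error number. */
int queue_init(struct queue *q)
{
    int status;

    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    status = pthread_mutex_init(&q->mutex, NULL);
    if (status != 0)
        return status;
    status = pthread_cond_init(&q->nonempty, NULL);
    if (status != 0)
        (void)pthread_mutex_destroy(&q->mutex);
    return status;
}

/* Wait until an item is available or the absolute deadline passes.
 * Returns 0 and stores the item in *out, or returns ETIMEDOUT (or
 * another error number) and leaves *out untouched. */
int queue_get_timed(struct queue *q, const struct timespec *deadline, int *out)
{
    int status;

    assert(q != NULL && deadline != NULL && out != NULL);

    status = pthread_mutex_lock(&q->mutex);
    if (status != 0)
        return status;

    /* Re-test the predicate after every return from the wait: a
     * wakeup alone does not prove that the queue is nonempty. */
    while (q->count == 0 && status == 0)
        status = pthread_cond_timedwait(&q->nonempty, &q->mutex, deadline);

    if (q->count > 0) {             /* dequeue only when there is an item */
        *out = q->items[q->head];
        q->head = (q->head + 1) % QCAP;
        q->count--;
        status = 0;
    }
    (void)pthread_mutex_unlock(&q->mutex);
    return status;
}
```

The point is that the dequeue happens only after the code has seen, with the mutex held, that count is nonzero. A timeout or a spurious wakeup with an empty queue never reaches the dequeue.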
>m1_func()
> Receive messages
> process messages.
> Queues up x entries in a queue
> pthread_cond_signal(&cond_mutex,&cond_var);
> pthread_cond_wait(&cond_var);
>end of m1_func()
In other places, you have included mutex lock and unlock calls in your
pseudocode. Did you merely omit them here, or are they actually absent in
your real code? You cannot modify shared data (your queue) without holding
the relevant mutex. That's a guaranteed memory corruption. While you can
signal the condition variable without holding the mutex, it is usually not a
good idea. And you cannot wait on a condition variable without holding the
associated mutex. Note also that the arguments in your pseudocode are
reversed: pthread_cond_signal() takes only the condition variable, while
pthread_cond_wait() takes both the condition variable and the mutex.
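
For comparison, here is a minimal sketch of a producer and consumer that follow those rules. Again, struct queue, QCAP, QUEUE_INITIALIZER, queue_put() and queue_get() are hypothetical names, and returning EAGAIN when the queue is full is just a policy I picked to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define QCAP 16                     /* hypothetical fixed capacity */

struct queue {                      /* hypothetical queue of ints */
    pthread_mutex_t mutex;
    pthread_cond_t  nonempty;
    int items[QCAP];
    int head, count;
};

/* Static initializer for a queue with static storage duration. */
#define QUEUE_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, {0}, 0, 0 }

/* Producer: touch the shared queue only while holding its mutex. */
int queue_put(struct queue *q, int item)
{
    int status;

    assert(q != NULL);
    status = pthread_mutex_lock(&q->mutex);
    if (status != 0)
        return status;

    if (q->count == QCAP) {
        status = EAGAIN;            /* full: report it rather than block */
    } else {
        q->items[(q->head + q->count) % QCAP] = item;
        q->count++;
        /* Signal while the mutex is still held. */
        status = pthread_cond_signal(&q->nonempty);
    }
    (void)pthread_mutex_unlock(&q->mutex);
    return status;
}

/* Consumer: the mutex must be locked when pthread_cond_wait is called. */
int queue_get(struct queue *q, int *out)
{
    int status;

    assert(q != NULL && out != NULL);
    status = pthread_mutex_lock(&q->mutex);
    if (status != 0)
        return status;

    while (q->count == 0 && status == 0)
        status = pthread_cond_wait(&q->nonempty, &q->mutex);

    if (status == 0) {              /* loop ended because count > 0 */
        *out = q->items[q->head];
        q->head = (q->head + 1) % QCAP;
        q->count--;
    }
    (void)pthread_mutex_unlock(&q->mutex);
    return status;
}
```

Every access to items, head and count sits between a lock and an unlock of the same mutex, and that same mutex is the one passed to pthread_cond_wait().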
>Is this due to the insufficient Stack Size in any of these threads ?
Probably not, but, as I said, it's impossible to make any meaningful guesses
with so little information.
/dave