
Conference noted::dnu_osi

Title:DECnet/OSI for {ULTRIX,OSF/1}
Notice:Indicate version and platform when writing...see #2 for kits
Moderator:BULEAN::CARR
Created:Wed Sep 25 1991
Last Modified:Thu Jun 05 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2187
Total number of notes:10469

2132.0. "xti osi select and threads on UNIX" by VARESE::BIOTTI () Mon Feb 17 1997 14:30

 {cross posted in CMA, DNU_OSI and DIGITAL_UNIX conf.}

 I'm using select and XTI over OSI with threads. 

 I've found a problem that starts with OSF32 and DNA32.
 Everything was fine with OSF30 and DNA30.

 When I issue more than one t_snd in a very short time from one thread
 and wait for events (answers) coming from the network 
 with a select call in another thread, I observe that 
 select gets only the first 2 or 3 events in an acceptable
 time; after that, select fires with an event only every
 3 or 4 seconds. 
 I've taken a DECnet/OSI trace with ctf and I see the delays there. 

 select is called with a timeout of 1 second. 
 The xti calls are wrapped. 
 I've tried putting pthread_lock_global_np and pthread_unlock_global_np
 around select and I get good response times; that is, select no longer 
 completes with a timeout many times before getting the event. 
 If I do everything synchronously it's fine even without wrapping select.

 I know I have a workaround, which would be to decrease the timeout,
 wrap select and maybe call pthread_delay_np and/or pthread_yield to allow 
 other threads to take control, but I would prefer to understand 
 if there is a problem somewhere.
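
 A minimal sketch of the workaround described above, under stated
 assumptions: a pipe stands in for the XTI endpoint, POSIX sched_yield
 stands in for pthread_yield/pthread_delay_np, and the names
 wait_readable, writer and run_demo are hypothetical. It only shows the
 shape of a select loop with a short timeout that yields after each
 timeout; it does not reproduce the DECthreads global lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait until fd is readable or timeout_ms elapses.
   Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;

    /* Rebuild both arguments on every call: select() may modify them. */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc > 0 && FD_ISSET(fd, &rfds))
        return 1;
    return rc < 0 ? -1 : 0;
}

struct writer_args { int fd; int count; };

/* Sender thread: issues several one-byte "requests" back to back,
   standing in for a burst of t_snd calls. */
static void *writer(void *p)
{
    struct writer_args *a = p;
    for (int i = 0; i < a->count; i++) {
        char b = 'x';
        if (write(a->fd, &b, 1) != 1)
            break;
    }
    return NULL;
}

/* Receive `count` bytes using select with a short timeout, yielding
   after each timeout. Returns the number of bytes received, or -1 if
   the pipe or the thread could not be created. */
static int run_demo(int count, long timeout_ms)
{
    int fds[2];
    pthread_t tid;
    int received = 0, timeouts = 0;

    assert(count > 0);
    if (pipe(fds) != 0)
        return -1;
    struct writer_args args = { fds[1], count };
    if (pthread_create(&tid, NULL, writer, &args) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Give up after 20 consecutive-or-not timeouts so the demo ends. */
    while (received < count && timeouts < 20) {
        int rc = wait_readable(fds[0], timeout_ms);
        if (rc < 0)
            break;
        if (rc == 0) {          /* timeout: let other threads run */
            timeouts++;
            sched_yield();
            continue;
        }
        char buf[64];
        ssize_t n = read(fds[0], buf, sizeof buf);
        if (n <= 0)
            break;
        received += (int)n;     /* one read may return several events */
    }

    pthread_join(tid, NULL);
    close(fds[0]);
    close(fds[1]);
    return received;
}
```

 Calling run_demo(5, 100) sends five bytes and waits for them with a
 100 ms select timeout; the 100 ms value is only illustrative.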
 
 With OSF40 (plus a patched xti shareable library I got for a t_look problem)
 the behaviour is even worse, in the sense that select doesn't fire 
 anymore even though some data should still be coming in.

 I have the problem with OSF32C too.
 I have the problem with DNA32A too. 
 I haven't yet installed the DNA32B MUP; I've only read the release 
 notes and they don't seem to mention a problem like the one
 above.  Anyway, installing DNA32B is my next step. 

 Thanks for any advice 
 ABiotti Basestar Open group
  
T.R  Title  User  Personal Name  Date  Lines
2132.1  UPSAR::WALLACE  Digital: A Dilbertian Company  Mon Feb 17 1997 18:09  13
    I don't think anything in DECnet 3.2B will help with this, but
    it doesn't hurt to run the latest fixes. 
    
    Let me see if I understand you correctly - based on a ctf
    trace you see a message received that should cause select
    to complete, but it doesn't, at least not for 3 or 4 
    seconds.  Is that correct?
    
    My guess is it's not DECnet.  Would it be possible to try
    your program over TCP/IP?
    
    Vince
    
2132.2  more  VARESE::BIOTTI  Tue Feb 18 1997 06:32  22
    
>    Let me see if I understand you correctly - based on a ctf
>    trace you see a message received that should cause select
>    to complete, but it doesn't, at least not for 3 or 4 
>    seconds.  Is that correct?
 
  No. When I see the message arrive in the DECnet trace, select
  fires at the same time.  The problem is that the response message
  itself arrives late, even according to the DECnet/OSI trace.     

>    My guess is it's not DECnet.  Would it be possible to try
>    your program over TCP/IP?
    
  I'm talking to a device (a PLC), so I would have to develop a TCP-based 
  server application to do this.  I'll see what I can do. 

  I've already verified that the problem is not in my physical device 
  since OSF30 works fine. 

  I'm wondering if the interaction of select with threads could be the 
  origin of the problem.    
 
2132.3  UPSAR::WALLACE  Digital: A Dilbertian Company  Tue Feb 18 1997 18:28  11
    I'm still confused.  You're doing a select for read, and as soon
    as a message is received, the select fires?  So what's the problem?
    
    I can see two other possibilities.  Either the local node is delaying
    sending the initial "request" packet (or whatever you call the
    action that generates a response from the remote node); or else
    the remote node is taking a long time to generate a response.
    
    Maybe you could type in a time-line of what the events are.
    
    Vince
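
    One way to collect such a time-line is to timestamp each send and
    each select wakeup in the program itself. The following is only a
    sketch: the timeline_* names and the event labels are hypothetical,
    and it assumes a POSIX system with clock_gettime and CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define TIMELINE_MAX 64

struct timeline_event {
    const char *label;      /* caller-owned string, e.g. "t_snd #1" */
    struct timespec at;     /* monotonic time the event was recorded */
};

struct timeline {
    struct timeline_event ev[TIMELINE_MAX];
    int count;
};

/* Record an event now. Returns its index, or -1 if the log is full. */
static int timeline_mark(struct timeline *t, const char *label)
{
    if (t->count >= TIMELINE_MAX)
        return -1;
    t->ev[t->count].label = label;
    clock_gettime(CLOCK_MONOTONIC, &t->ev[t->count].at);
    return t->count++;
}

/* Milliseconds between the first recorded event and event i. */
static double timeline_ms_since_start(const struct timeline *t, int i)
{
    assert(i >= 0 && i < t->count);
    const struct timespec *a = &t->ev[0].at;
    const struct timespec *b = &t->ev[i].at;
    return (b->tv_sec - a->tv_sec) * 1000.0
         + (b->tv_nsec - a->tv_nsec) / 1e6;
}

/* Print one line per event: offset from the first event, then label. */
static void timeline_print(const struct timeline *t, FILE *out)
{
    for (int i = 0; i < t->count; i++)
        fprintf(out, "%10.3f ms  %s\n",
                timeline_ms_since_start(t, i), t->ev[i].label);
}
```

    Calling timeline_mark right after each t_snd and right after each
    select return, then timeline_print(&t, stderr) at the end, gives a
    list of offsets that can be set beside the ctf trace.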
2132.4  close  VARESE::BIOTTI  Thu Feb 20 1997 14:02  9
 After studying more DECnet traces I've understood
 my problem: it is due to a limitation in my device and
 not to a DECnet problem. The different behaviour between 
 3.0 and 3.2 of OSF/1 made me think there was a bug somewhere
 in 3.2, but that's not the case. 
  
 Thanks