Conference clt::cma

Title: DECthreads Conference
Moderator: PTHRED::MARYSTEON
Created: Mon May 14 1990
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1553
Total number of notes: 9541

1520.0. "Getting Insufficient Virtual Memory Error Under 4.0b" by HYDRA::BRYANT () Wed Apr 09 1997 16:19

I've got a partner who can't run his threaded program under 4.0b, although it
runs fine on 3.2.  He's getting a "forrtl: severe (41): insufficient virtual
memory" error.  He believes it to be threads related.  I'm not sure, so I'm
just looking for any hints on what may be wrong before I receive his libraries
to reproduce this.
Thanks.
Pat Bryant
Software Partners Engineering  

Here are the stats
------------------

This is what he's using to link the program DEMO:

decunix> cat linkdbg
f77 -v -o DEMO demo.o \
fmsaut.dbg \
fmsnoshr.a fmslib.a fmsint.a fmslib.a blas.a \
-lpthread -lmach -lexc -lc

decunix> ./linkdbg
/usr/bin/cc -v -o DEMO /usr/lib/cmplrs/fort/for_main.o -O4 demo.o fmsaut.dbg fmsnoshr.a fmslib.a fmsint.a fmslib.a blas.a -lpthread -lmach -lexc -lc -lUfor -lfor -lFutil -lm_4sqrt -lm -lots
/usr/lib/cmplrs/cc/ld -o DEMO -g0 -O4 -call_shared /usr/lib/cmplrs/cc/crt0.o /usr/lib/cmplrs/fort/for_main.o demo.o fmsaut.dbg fmsnoshr.a fmslib.a fmsint.a fmslib.a blas.a -lpthread -lmach -lexc -lc -lUfor -lfor -lFutil -lm_4sqrt -lm -lots -lc
/usr/lib/cmplrs/cc/ld:
1.80u 1.90s 0:09 38% 0+107k 0+429io 0pf+0w 107stk+17704mem
decunix>

Here's a dump:

decunix> dbx -r ./DEMO

forrtl: severe (41): insufficient virtual memory
thread 0xa signal IOT/Abort trap at >*[nxm_thread_kill, 0x3ff8053eab0]  ret     r31, (r26), 1


(dbx) where
>  0 nxm_thread_kill(0x4, 0x140150860, 0x3ff80193d3c, 0x980, 0x14015c018) [0x3ff8053eab0]
   1 pthread_kill(0x3ffc0082590, 0x20, 0x0, 0x0, 0x11fffffb5) [0x3ff8056ed4c]
   2 (unknown)() [0x3ff805756ec]
   3 __tis_raise(0x11fffffb5, 0x3ffc0080310, 0x3ff8010fb04, 0x3ffc0080c50, 0x3ff80159f44) [0x3ff8010fb00]
   4 raise(0x3ff8010fb04, 0x3ffc0080c50, 0x3ff80159f44, 0x3ff80575618, 0x3ff80170a6c) [0x3ff80159f40]
   5 abort(0x3ffc0560c30, 0x3ffc05655d0, 0x3ff80d13180, 0x0, 0x600000000) [0x3ff80170a68]
   6 for__issue_diagnostic(0x29, 0x2, 0x6, 0x11ffff830, 0x0) [0x3ff80d0b614]
   7 for__io_return(0x0, 0x0, 0x0, 0x0, 0x0) [0x3ff80d0baec]
   8 for_write_seq_lis(0x3ffc00802a0, 0x140142a00, 0x11ffffca0, 0x120009fd0, 0x14002f760) [0x3ff80d4b0bc]
   9 fms$_fmsaut(NOWDAT = [1]   2
[2]     4
[3]     1997
, NOWTIM = [1]  10
[2]     47
[3]     0
, SERIAL = 0.0) ["d5/fmsaut.f":4, 0x1200165bc]
  10 fms$_fmsini(0x0, 0x474e414c, 0x400000002, 0xa000007cd, 0x2f) ["d5/fmsini2.f":1951, 0x12001a9c4]
  11 fmsini(0x120016900, 0x120016940, 0x120016980, 0x1200169c0, 0x120016a10) ["d5/fmsini.f":1716, 0x1200166a4]
  12 demo(0x120016980, 0x1200169c0, 0x120016a10, 0x8008460d, 0x1200164e8) ["d5/demo.f":2, 0x12001653c]
  13 main() ["for_main.c":203, 0x1200164e4]
(dbx) quit


Place in code where it's failing:

decunix> cat fmsaut.f
        SUBROUTINE FMS$_FMSAUT (NOWDAT, NOWTIM, SERIAL)
        INTEGER*4    NOWDAT(3), NOWTIM(3)
        REAL*8       SERIAL
        print *,'Hello'  <--- This is where it fails
        return
        end

I had him bump up his limits to unlimited:

decunix> limit
cputime         unlimited
filesize        unlimited
datasize        1048576 kbytes
stacksize       32768 kbytes
coredumpsize    unlimited
memoryuse       58944 kbytes
descriptors     4096 files
addressspace    1048576 kbytes
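
The values that `limit` prints are the soft limits the shell hands to the
process, so they can also be checked from inside the failing program.  The
following is only a minimal sketch, assuming the POSIX getrlimit() interface
and that RLIMIT_AS is defined on the target system.  The function names
print_limit and report_memory_limits are made up for illustration.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Print one soft resource limit in kbytes, the unit csh's `limit` uses.
 * Returns 0 on success, -1 if getrlimit() fails.
 */
int print_limit(const char *name, int resource)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0) {
        perror(name);
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("%-15s unlimited\n", name);
    else
        printf("%-15s %llu kbytes\n", name,
               (unsigned long long)rl.rlim_cur / 1024);
    return 0;
}

/* Report the limits most relevant to an "insufficient virtual memory" error. */
int report_memory_limits(void)
{
    int rc = 0;

    rc |= print_limit("datasize", RLIMIT_DATA);
    rc |= print_limit("stacksize", RLIMIT_STACK);
    rc |= print_limit("addressspace", RLIMIT_AS);  /* assumed to be available */
    return rc;
}
```

Calling report_memory_limits() early in the program would show whether the
process really runs with the limits the shell reports.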

Unix 4.0B sysconfig -q proc
===========================
proc:
max-proc-per-user = 64
max-threads-per-user = 256
per-proc-stack-size = 2097152
max-per-proc-stack-size = 33554432
per-proc-data-size = 134217728
max-per-proc-data-size = 1073741824
max-per-proc-address-space = 1073741824
per-proc-address-space = 1073741824
autonice = 0
autonice-time = 600
autonice-penalty = 4
open-max-soft = 4096
open-max-hard = 4096
ncallout_alloc_size = 8192
round-robin-switch-rate = 0
round_robin_switch_rate = 0
sched-min-idle = 0
sched_min_idle = 0
give-boost = 1
give_boost = 1
maxusers = 32
task-max = 277
thread-max = 552
num-wait-queues = 64

Unix 4.0B sysconfig -q vm
=========================
vm:
ubc-minpercent = 10
ubc-maxpercent = 100
ubc-borrowpercent = 20
ubc-maxdirtywrites = 5
ubc-nfsloopback = 0
vm-max-wrpgio-kluster = 32768
vm-max-rdpgio-kluster = 16384
vm-cowfaults = 4
vm-mapentries = 200
vm-maxvas = 1073741824
vm-maxwire = 16777216
vm-heappercent = 7
vm-vpagemax = 32768
vm-segmentation = 1
vm-ubcpagesteal = 24
vm-ubcdirtypercent = 10
vm-ubcseqstartpercent = 50
vm-ubcseqpercent = 10
vm-csubmapsize = 1048576
vm-ubcbuffers = 256
vm-syncswapbuffers = 128
vm-asyncswapbuffers = 4
vm-clustermap = 1048576
vm-clustersize = 65536
vm-zone_size = 0
vm-kentry_zone_size = 16777216
vm-syswiredpercent = 80
vm-inswappedmin = 1
vm-page-free-target = 128
vm-page-free-min = 20
vm-page-free-reserved = 10
vm-page-free-optimal = 74
vm-page-prewrite-target = 256
dump-user-pte-pages = 0
kernel-stack-guard-pages = 1
vm-min-kernel-address = 18446744071562067968
contig-malloc-percent = 20
vm-aggressive-swap = 0
new-wire-method = 1
vm-segment-cache-max = 50
vm-page-lock-count = 0
gh-chunks = 0
gh-min-seg-size = 8388608
gh-fail-if-no-mem = 1


1520.1. by DCETHD::BUTENHOF (Dave Butenhof, DECthreads) Wed Apr 09 1997 17:01, 14 lines

Well, it sounds like the program ran out of virtual memory. I don't see any
connection to threads except that, under POSIX, the old ANSI C raise(), which
is used by abort() [which is called by FORTRAN to report the error], is
defined to call pthread_kill() rather than the old kill() -- and DECthreads
implements pthread_kill(). That puts us at the bottom of the call stack.

I have no idea WHY or HOW the program ran out of virtual memory, or any
evidence on which to speculate. The space used by libraries, both at load
time and at runtime, changes all the time -- the fact that it hit a limit on
4.0 and not earlier (even assuming the system was configured identically)
doesn't mean much in itself.

	/dave

1520.2. "Look at 1469.26" by EDSCLU::GARROD (IBM Interconnect Engineering) Wed Apr 09 1997 17:26, 8 lines

    Take a look at note 1469.26 and others in that string. Maybe
    that is related to the problem.
    
    We had terrible problems getting threaded programs that used a lot of
    threads running on Digital UNIX V4. Seems like some system vm
    parameters need tweaking.
    
    Dave

1520.3. "Still can't get this to work" by HYDRA::BRYANT () Fri Apr 18 1997 15:25, 32 lines

I've boosted several system parameters and still can't get a non-shared version
of this app to work. The application is reporting the error:

fms$_fork: 12 = pthread_create(0,4831836840,fms$_io,0)
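
If the 12 in that message is the return value of pthread_create() under the
POSIX 1003.1c interface (an assumption; the message does not say), it is an
error number returned directly rather than through errno.  On UNIX systems
errno value 12 is conventionally ENOMEM, which would match the earlier
"insufficient virtual memory" report.  A minimal sketch of decoding such a
return value follows; the helper names are hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Map the error numbers POSIX lists for thread creation to their names. */
const char *pthread_rc_name(int rc)
{
    switch (rc) {
    case 0:      return "success";
    case EAGAIN: return "EAGAIN";   /* system lacked resources for a thread */
    case EINVAL: return "EINVAL";   /* invalid attributes object            */
    case ENOMEM: return "ENOMEM";   /* not enough memory for the thread     */
    default:     return "other";
    }
}

/* Print the numeric code, its symbolic name, and the system's message. */
void report_pthread_rc(const char *call, int rc)
{
    fprintf(stderr, "%s: %d (%s: %s)\n",
            call, rc, pthread_rc_name(rc), strerror(rc));
}
```

Logging the symbolic name alongside the number makes it easier to tell a
memory shortage apart from a bad argument.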

When the same app is built against a shared library, it works.  Built
non-shared, it fails with the above message.  Here is the build that
produces the shared library:

LIBS="-lUfor -lfor -lm -lpthread -lmach -lexc -lc"
ld             \
-shared        \
-o fmslib.so   \
-all           \
   fmslib.a    \
-none          \
fmsint.a       \
blas.a         \
$LIBS          \
-set_version fmslib.51
#
f77 -call_shared -o DEMO_SHARE demo.o \
fmsnoshr.a fmslib.so

Here's the one that doesn't:

f77  -o DEMO_NOSHARE demo.o \
fmsnoshr.a fmslib.a fmsint.a fmslib.a \
blas.a \
-lpthread -lmach -lexc -lc 

Any thoughts on this?

1520.4. by DCETHD::BUTENHOF (Dave Butenhof, DECthreads) Mon Apr 21 1997 10:22, 10 lines

Well, for one thing, we don't HAVE a non-shared libpthread for 4.0B, so you
cannot link a threaded application non-shared. (Mixing shared libraries with
static archives is not really supported.) I have no idea whether that could
be related to your problem.

> fms$_fork: 12 = pthread_create(0,4831836840,fms$_io,0)

What, exactly, are you passing for the second argument here? That big integer
is, I hope, the address of an attributes object, but your display certainly
doesn't make that obvious.
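
For reference, printed in hexadecimal 4831836840 is 0x11ffffaa8, which lies in
the same range as arguments such as 0x11ffffca0 in the dbx trace in 1520.0, so
it may well be an address.  Whether it points at an initialized attributes
object cannot be told from the message.  Below is a minimal sketch of the
POSIX 1003.1c calling sequence with an explicitly initialized attributes
object.  The names io_worker and start_io_thread are hypothetical stand-ins,
not the partner's code, and the stack size is chosen by the caller.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical worker; stands in for the application's fms$_io routine. */
static void *io_worker(void *arg)
{
    return arg;
}

/*
 * Create and join one thread using an explicitly initialized attributes
 * object.  Returns 0 on success, otherwise the error number of the call
 * that failed.
 */
int start_io_thread(size_t stack_bytes)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0) {
        fprintf(stderr, "pthread_attr_init: %d (%s)\n", rc, strerror(rc));
        return rc;
    }

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc != 0) {
        fprintf(stderr, "pthread_attr_setstacksize: %d (%s)\n",
                rc, strerror(rc));
    } else {
        /* The second argument is the ADDRESS of the attributes object. */
        rc = pthread_create(&tid, &attr, io_worker, NULL);
        if (rc != 0)
            fprintf(stderr, "pthread_create: %d (%s)\n", rc, strerror(rc));
    }

    /* The attributes object is no longer needed once creation has returned. */
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);
}
```

A call such as start_io_thread(1024 * 1024) requests a 1 MB stack.  Passing
an uninitialized object, or the object itself instead of its address, is
not valid under this interface.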