Conference iosg::all-in-1_v30

Title:*OLD* ALL-IN-1 (tm) Support Conference
Notice:Closed - See Note 4331.l to move to IOSG::ALL-IN-1
Moderator:IOSG::PYE
Created:Thu Jan 30 1992
Last Modified:Tue Jan 23 1996
Last Successful Update:Fri Jun 06 1997
Number of topics:4343
Total number of notes:18308

2718.0. "OAFC$SERVER and quota problem" by CROCKE::YUEN (Banquo Yuen, Darwin Australia) Mon May 17 1993 13:53

    After the upgrade from V2.4 to V3.0, OAFC$SERVER ran perfectly for
    three weeks, then suddenly got an "insufficient memory" error, which
    was also logged in SYS$MANAGER:OAFC$SERVER.LOG.  I defined the system
    logical OAFC$SERVER_PGFLQUOTA as 100000 because the error occurred
    when the page file count reached 60000, which is the default found in
    SYS$STARTUP:OAFC$STARTUP.COM.  At the same time I found that the
    customer has around 100 shared drawers but the server's default
    maximum is only 50 drawers, so I changed that to 100.
    
    Since then the server has gone wrong during the rush hour every day.
    Each time it goes wrong, the page count is around 40000.  The error
    message may be "RMS error ...", "%OAFC-I-DSTNF A target was specified
    that does not exist", "Server not available ... " etc.
    
    I have tried increasing BIOLM and DIOLM, but that didn't help.  I have
    looked at the other quotas; they are quite sufficient when the server
    goes into this state of delirium and keeps mumbling crazy error
    messages.  The error messages are not logged in the SYS$MANAGER log
    files.
    
    I don't know how to check whether BIOLM and DIOLM have been exceeded.
    The other quotas I can check by showing the process.  Moreover, I don't
    see how a quota could be exceeded at all, because the process has
    EXQUOTA privilege.  Furthermore, I don't know whether the PQL
    parameters in SYSGEN have any effect.
    
    The ALL-IN-1 system is running on a cluster of 2 nodes.  The other
    node, which is less powerful and so has fewer users, runs perfectly;
    it reports no errors even when this node goes wrong, which is also why
    I think it may be a quota problem.
    
    To restore the problem server to sobriety, I have to stop and start
    the server.
    
    And what is the effect of stopping and starting the server when a lot
    of users have already attached to the server?
    
    Thanks
    Banquo
T.R  Title  User  Personal Name  Date  Lines
2718.1. "Max links/Channelcnt" by JGODCL::SHERLOCK (L.U.F.C. The phoenix has risen) Mon May 17 1993 14:32 (10 lines)
    Banquo,
    
    You should try checking the EXECUTOR in NCP for the Maximum links
    value, also check the SYSGEN parameter CHANNELCNT. We had similar
    problems to yours during peak periods and after increasing these
    values the problem was solved.
    
    As for stopping the File Cabinet Server, this shouldn't cause
    any problems, as the server will allow any clients that are connected
    to it to complete their operations.
2718.2. "" by JGODCL::SHERLOCK (L.U.F.C. The phoenix has risen) Mon May 17 1993 14:40 (13 lines)
    Re - 1
    Banquo,
    
    as a P.S. on a 4 node cluster with 750 ALL-IN-1 users, each node has
    the following parameters:
    
    SYSGEN parameter CHANNELCNT - current value 1024
    
    NCP Executor maximum links 255
    
    HTH
    
    Tim
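
For reference, the two values mentioned in the replies above can be inspected with standard VMS utilities. This is a minimal sketch, assuming DECnet Phase IV (NCP) and an interactive session with suitable privileges; it only displays the current settings and changes nothing. Note that CHANNELCNT is not a dynamic parameter, so a new value takes effect only after a reboot.

```dcl
$ ! Display the active value of the SYSGEN parameter CHANNELCNT
$ MCR SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW CHANNELCNT
SYSGEN> EXIT
$ !
$ ! Display the DECnet executor characteristics; look for "Maximum links"
$ MCR NCP SHOW EXECUTOR CHARACTERISTICS
$ !
$ ! To raise the link limit (value taken from the note above), NCP accepts:
$ !   NCP> SET EXECUTOR MAXIMUM LINKS 255      (running system)
$ !   NCP> DEFINE EXECUTOR MAXIMUM LINKS 255   (permanent database)
```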
2718.3. "Could be quotas - but 40k pages is a lot." by IOSG::CHINNICK (gone walkabout) Mon May 17 1993 18:45 (16 lines)
    
    It may be that the FCS is leaking some memory although it sounds like
    you are hitting the limit pretty fast. I'm not sure if a patched
    version was produced for a problem like this - not really my department
    - but it is worth checking.
    
    If you are pushing PGFLQUOTA - you also may need to up the SYSGEN
    parameter VIRTUALPAGECNT which is the maximum virtual size of a
    process. You might also run very short on pagefile space if you aren't
    careful so I'd suggest increasing the pagefile size too.
    
    EXQUOTA privilege will only help for disk-quota and not VMS process
    quotas.
    
    Paul.
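
The checks suggested in this reply, and the earlier question about BIOLM and DIOLM, can be sketched with standard DCL commands. The process ID below is a hypothetical placeholder; substitute the PID of the actual server process. As far as I know, SHOW PROCESS/QUOTAS reports the quotas currently remaining, including the buffered and direct I/O limits, so values near zero during the rush hour would point at an exhausted quota.

```dcl
$ ! Remaining quotas of the server process (PID 2040011C is hypothetical)
$ SHOW PROCESS/QUOTAS/IDENTIFICATION=2040011C
$ !
$ ! Page file size and free space on this node
$ SHOW MEMORY/FILES
$ !
$ ! Active value of VIRTUALPAGECNT, the maximum virtual size of a process
$ MCR SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW VIRTUALPAGECNT
SYSGEN> EXIT
$ !
$ ! Confirm the override logical from the base note is defined as intended
$ SHOW LOGICAL OAFC$SERVER_PGFLQUOTA
```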
    
2718.4. "secondary server 128" by CROCKE::YUEN (Banquo Yuen, Darwin Australia) Tue May 18 1993 13:26 (19 lines)
    I have checked the NCP max links and max alias links (even though I
    don't think it will use the cluster alias), VIRTUALPAGECNT and the
    page file size, but not CHANNELCNT, so I will check that tomorrow
    when I am on site.
    
    While waiting for your reply, I created another server on the node
    today with object number 128, but it would not work; it just sits
    there saying "running/enable" even when server 73 is under heavy load.
    
    Moreover, server 128 would not work even when I stopped server 73.
    So how can I make a secondary server work?  I have checked the
    partition field in partition.dat; the entries are all blank.  Shall I
    change it to something like 0::"128", or even to 0::, or whatever?
    (But I guess this only specifies the location of partition.dat.)
    
    Thanks
    Banquo
                  
2718.5. "How load balancing is achieved." by IOSG::STANDAGE Tue May 18 1993 14:25 (18 lines)
    
    Banquo,
    
    Sorry, this will have to be brief - and is actually something I'm doing
    at this very minute!
    
    ALL-IN-1 will always use DECnet object 73 unless you have DNS
    naming enabled (i.e. the server is running Distribution level 1).
    
    However, you can force selected clients to use a different server
    object by defining OAFC$SRV_OBJ in their LOGIN.COM. Brokering to remote
    systems will always cause a connection to a remote object 73 server,
    unless you also have OAFC$BROKER_OBJ defined.
    
    
    Kevin.
    
    
2718.6. "Otherwise it'll never work!" by IOSG::STANDAGE Tue May 18 1993 14:34 (11 lines)
    
    Finally, when defining OAFC$SRV_OBJ make sure it is the object number
    ONLY - and nothing else !
    
    e.g.
    
    DEFINE OAFC$SRV_OBJ 177
    
    
    Kevin.
    
2718.7. "don't use OAFC$BROKER_OBJ" by CHRLIE::HUSTON Mon May 24 1993 19:17 (14 lines)
    
    re.5
    
    >However, you can force selected clients to use a different server
    >object by defining OAFC$SRV_OBJ in their LOGIN.COM. Brokering to remote
    >systems will always cause a connection to a remote object 73 server,
    >unless you also have OAFC$BROKER_OBJ defined.
    
    I would not recommend having a customer site use OAFC$BROKER_OBJ;
    it is not supported. OAFC$SRV_OBJ is supported, so feel free to use
    that.
    
    --Bob
    
2718.8. "That's true" by IOSG::STANDAGE Mon May 24 1993 19:25 (4 lines)
    Ooops...forgot that! :-)
    
    k.
    
2718.9. "probably the channelcnt" by CROCKE::YUEN (Banquo Yuen, Darwin Australia) Wed May 26 1993 14:39 (13 lines)
    Hello
    
    The SYSGEN CHANNELCNT is only 300, so this is probably why the server
    is not working properly.  The customer doesn't want to schedule a
    reboot, so I have created another server and used OAFC$SRV_OBJ to
    distribute user groups across the different servers.  The customer is
    happy for the moment.
    
    And I discovered that the channel count and the number of channels
    connected are actually shown on the sai screen rather than the sri
    screen.
    
    Thanks
    Banquo
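
One way the per-group distribution described above might look in a login command procedure is sketched here. The account name SALES is purely hypothetical, and 128 is the second server's object number from note 2718.4; the thread does not show how the user groups were actually selected. As reply .6 stresses, the logical must translate to the object number only.

```dcl
$ ! Sketch for SYS$MANAGER:SYLOGIN.COM or a user's LOGIN.COM.
$ ! "SALES" is a hypothetical account name; 128 is the second server's
$ ! DECnet object number.  Users in other accounts keep the default
$ ! object 73 because the logical is left undefined for them.
$ account = F$EDIT(F$GETJPI("","ACCOUNT"),"TRIM")
$ IF account .EQS. "SALES" THEN DEFINE OAFC$SRV_OBJ 128
```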