
Conference forty2::x500

Title:X.500 Directory Services
Notice:Sprt: FORTY2::X500_SUPPORT, Kits: 216.*, try dir/titl=OFFICIAL
Moderator:FORTY2::PULLEN
Created:Tue Jan 30 1990
Last Modified:Thu Jun 05 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1016
Total number of notes:4299

1011.0. "TCP/IP Support in a VMS-Cluster ?" by OSITEL::rtont2.rto.dec.com::Erich (SI-Office) Mon May 26 1997 12:09

 Hi all,

 I have a question about TCP/IP Support in a VMS-Cluster.
 We have a situation where a user agent (ALL-IN-1 V3.2) runs
 on all CPUs of a homogeneous Alpha cluster (2 CPUs).
 The agent uses XAPI and XDS.
 The DSA and MTA run on only one CPU at a time. If this CPU fails,
 DSA and MTA will be started on the second CPU.
 The transport is RFC1006. We use the DECnet alias and the Internet alias
 (via UCX).
 
 Is it possible that connect requests from the agent, from peer MTAs,
 and from lookup clients to the DSA's and MTA's presentation addresses
 are directed to the "wrong" CPU, i.e. the CPU where the DSA and MTA
 are NOT running?
 Is it possible to force the connect requests to the CPU where DSA and
 MTA are running, and to switch (transparently) to the second CPU if
 the first one fails?

 regards, Erich

1011.1. "Cluster alias Address is a bad idea" by BIKINI::DITE (John Dite@RTO DTN 865-4065) Mon May 26 1997 21:43
> The DSA and MTA run on one CPU only at a time. If this CPU fails,

Why the MTA? The MTA can run on both systems in parallel.
 
> The transport is RFC1006. We use DECnet alias and Internet alias
> (via UCX).
 
If you are running over RFC1006 then you can't be using the DECnet alias!

The Internet alias cluster address is a completely different kettle of
fish! 

If you know UCX then you will realise that the UCX IP cluster alias does
not have the same functionality as the DECnet cluster alias.

If you know that the DSA can only run on one node in the cluster,
then it is not sensible to use the cluster alias address (either the
DECnet or the UCX IP alias address), as you will not be able to
influence which node your Connect Request is passed to.

However, you can add all the cluster members' individual addresses to
your dua_defaults.dat so that, assuming a DSA is running on one of your
cluster members, a connect can/will be established, i.e. X.500 tries the
various addresses in dxd$dua_defaults.dat until it finds one that works.
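
For illustration only: I haven't checked the exact dxd$dua_defaults.dat
syntax here (the admin guide has the real format), so treat these
RFC 1278-style presentation address strings and node names as
placeholders - the point is simply one entry per cluster member, tried
in order:

    "DSA"/RFC-1006+03+16.36.16.1    ! cluster member ALPHA1 (hypothetical)
    "DSA"/RFC-1006+03+16.36.16.2    ! cluster member ALPHA2 (hypothetical)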

Now, what you still have to deal with is how to 'restart' a DSA if it fails,
either on the same node or on the alternative node.

Now if you had something like DECsafe for OpenVMS ;-).

Hope this helps.
John

1011.2. "OpenVMS DSA Failover wish" by BIKINI::DITE (John Dite@RTO DTN 865-4065) Tue May 27 1997 00:11
This is aimed more at Engineering. I know money is in short supply
and X.500 on VMS may not have as rosy a future as it does on other platforms.

However, one 'small thing' that could/should be done to alleviate the
failover problem in a cluster is to use the distributed lock manager.

Why can't the DSA server processes be started on each individual member of the
cluster? The first one reserves an exclusive cluster-wide DSA-specific lock
and carries on working in the usual fashion. The other DSA servers then idle,
waiting for the release of this lock. In the event of DSA failure, the next
waiting DSA server receives the lock and carries on with its startup etc.
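
A minimal sketch in C of that idea (untested, and assuming the
DSA-specific resource name DXD$DSA_LOCK that .7 mentions): the process
simply blocks in $ENQW until the exclusive lock is granted, which is
exactly the hibernate-until-the-running-DSA-dies behaviour described
above.

#include <stdio.h>
#include <stdlib.h>
#include <ssdef.h>
#include <lckdef.h>
#include <descrip.h>
#include <starlet.h>

/* Lock status block: condition value, reserved word, lock id. */
struct lksb
{
    unsigned short status;
    unsigned short reserved;
    unsigned int   lkid;
};

int main(void)
{
    struct lksb lksb;
    unsigned int status;
    $DESCRIPTOR(resnam, "DXD$DSA_LOCK");    /* resource name, per .7 */

    /* Synchronous exclusive-mode request.  Without LCK$M_NOQUEUE the
       request is queued and $ENQW stalls the process until the current
       holder (the running DSA) releases the lock or dies. */
    status = sys$enqw(0, LCK$K_EXMODE, &lksb, 0, &resnam,
                      0, 0, 0, 0, 0, 0, 0);
    if (!(status & 1) || !(lksb.status & 1))
    {
        fprintf(stderr, "lock request failed\n");
        exit(status);
    }

    /* Lock granted: this member now owns the DSA role and can load
       the DIT, establish its channels etc. */
    printf("DXD$DSA_LOCK granted - this node becomes the DSA\n");
    return SS$_NORMAL;
}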

One can but dream....

1011.3. "decnet alias not used" by OSITEL::rtont2.rto.dec.com::Erich (SI-Office) Tue May 27 1997 12:52
 Hi John,

 Thanks for the input; it was very valuable for me.
 DECnet alias: it is present, which is why I mentioned it, but we do not
 use it for MTA/DSA transport or agent connects.

 regards, Erich
1011.4. by FORTY2::PULLEN (Julian Pullen) Tue May 27 1997 20:50
For the record, the DSA on VMS does take an exclusive cluster-wide
DSA-specific lock. If the lock request fails, the DSA currently exits.


	Julian
1011.5. "still hoping..." by BIKINI::DITE (John Dite@RTO DTN 865-4065) Wed May 28 1997 12:45
Julian,


>For the record, the DSA on VMS does take an exclusive cluster-wide
>DSA-specific lock. If the lock request fails, the DSA currently exits.

I realised this, having observed the message below in
dxd$directory:DXD$DSA_STARTUP_OUTPUT.LOG when a second DSA tries to start
while another DSA is already running.

%SYSTEM-W-NOTQUEUED, request not queued
Warning: DSA not started, because a DSA is already running.

It's at this point that I would have thought one could implement the change:
any subsequent DSA server process would hibernate and idle (i.e. not be
accessible), waiting for this lock to be released rather than exiting.
When the lock is released, the first waiting DSA to reserve the exclusive
lock would then start loading the DIT and establish its channels
to DECnet/OSI etc.

As you may have gathered, at present there is no optimal failover solution
for X.500 on OpenVMS.

	John (still hoping)
1011.6. by 16.11.160.133::FORTY2::PALKA (Andrew Palka, AltaVista Directory) Wed May 28 1997 14:12
The DSA does not currently wait until the previous DSA exits.
However, even if it did, that would not completely solve the problem:
you have to issue the NCL 'create dsa' and 'enable dsa' commands
to make the DSA start up.

I believe that it is perfectly possible to write a command procedure
which does what is necessary, using its own locks. This would start
up the DSA process and issue the NCL commands when it obtains the
lock (you probably need to write a program to obtain the lock; I
don't think you can do that directly from DCL, apart from using 'open'
to poll the status of a lock file).
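
For illustration (untested), the glue could look like this in C: once
the exclusive lock from the sketch in .2 has been granted, hand the
startup commands to DCL with system(). The procedure name and the exact
NCL invocation below are placeholders; check your site's startup files.

#include <stdio.h>
#include <stdlib.h>

/* Run one DCL command; system() in DEC C spawns a subprocess and
   hands the string to DCL. */
static int run(const char *cmd)
{
    printf("$ %s\n", cmd);
    return system(cmd);
}

int main(void)
{
    /* To be called only after the exclusive DXD$DSA_LOCK has been
       obtained, as in the sketch in .2. */
    run("@sys$startup:dxd$dsa_startup");    /* hypothetical procedure */
    run("mcr ncl create dsa");              /* NCL commands from above */
    run("mcr ncl enable dsa");
    return 0;
}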

If you want a solution quickly then that is the way to go, otherwise
you should make your requests known to the product manager for
possible inclusion in some future release.

Andrew
1011.7. "Product Manager=Nad Nadesan ?" by BIKINI::DITE (John Dite@RTO DTN 865-4065) Wed May 28 1997 20:59
We will most probably do something along these lines by evaluating the
DSA's DXD$DSA_LOCK.

We will try to make a request to Product Management (Nad Nadesan)
and hope that it all makes it into Version 3.2 ;-).

Now that the UNIX cluster product (Steel?) is coming out with a common file
system and something similar to the DLM, what impact will that have on the
future design of X.500 for Digital UNIX?

John