
Conference helix::realtime

Title:Realtime Conference
Moderator:HELIX::LUNGER
Created:Mon Feb 24 1986
Last Modified:Mon Jun 02 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1241
Total number of notes:4452

1235.0. "Is the realtime features of DU capable of?" by HGOM11::SYLVIAXIONG () Mon Mar 03 1997 06:59

My customer wants to use two AlphaServer 4100 running Digital 
UNIX (realtime kernel) as front-end realtime data acquisition 
systems.  

1) Data acquisition hardware is the PBXDP PCI-based synchronous 
   communication card using the HDLC protocol.  
2) The systems are responsible for receiving data, simple 
   data processing and sending data.
3) The requirements for processing capacity are:
   A. To receive 1000 data frames per second; one frame is about 
      100 bytes
   B. To do simple processing simultaneously, such as comparison, 
      combination, etc.
   C. To send 300 data frames per second simultaneously

The questions are:
1) Are the realtime features of Digital UNIX able to meet 
   the requirements of this data collection application without 
   losing data?  How do the realtime features of Digital UNIX meet
   the requirements? 
2) Are there any special features that support network communication
   in Digital Realtime UNIX?

I am in urgent need of your help.  Thanks in advance. 


Sylvia
                     
T.R  Title  User  Personal Name  Date  Lines
1235.1. "" by HELIX::SONTAKKE () Tue Mar 04 1997 13:43  2 lines
    You should ask this question in the DECnet/OSI conference, as that
    device is supported by the WAN Support for Digital UNIX Systems product.
1235.2. "some help" by HELIX::KAUFFMAN (And I don't know why...) Tue Mar 04 1997 13:46  28 lines
    Hi Sylvia,
    
    The only way to validate the performance for sure is
    to do a benchmark with the hardware and software configuration
    of the customer's system.
    
    I can give you some general information that may help.    If you look
    at the realtime performance report available on the external
    web under: http://www.digital.com/oem/products/rtunix/rtunix.htm
    you'll find basic realtime performance information including ISR
    latency numbers.    We do not have results for a 4100 but they
    are most likely better than the VME 2100 (sable based) system
    that is reported.
    
    I am not familiar with the PBXDP interface but hope that it is
    capable of doing some local buffering (and perhaps DMA).   If
    so, and the application buffers the data, it should be
    able to keep up with a 1 kHz data rate (this assumes one
    interrupt per frame, i.e. one interrupt per millisecond).   Looking at
    the VME 2100 ISR latency numbers in a multi-user test, the mean ISR
    latency is 12.5 microseconds and the worst case is 262 microseconds,
    so the data rate seems very reasonable, assuming the rest of
    the data manipulation is as simple as you have stated.
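
    As a rough check of that arithmetic, here is a minimal sketch in C. It
    assumes one interrupt per frame and uses the VME 2100 latency figures
    quoted above; the function names are illustrative only and nothing here
    has been measured on a 4100 or with the PBXDP.

```c
#include <stdio.h>

/* Time between interrupts, in microseconds, for a given frame rate,
   assuming one interrupt per frame. */
double frame_period_us(double frames_per_second)
{
    return 1e6 / frames_per_second;
}

/* Time left in each frame period once the ISR latency has been paid. */
double isr_headroom_us(double frames_per_second, double isr_latency_us)
{
    return frame_period_us(frames_per_second) - isr_latency_us;
}

/* Print the budget for the mean and worst-case latencies quoted above. */
void print_isr_budget(void)
{
    printf("frame period     : %.1f us\n", frame_period_us(1000.0));
    printf("headroom (mean)  : %.1f us\n", isr_headroom_us(1000.0, 12.5));
    printf("headroom (worst) : %.1f us\n", isr_headroom_us(1000.0, 262.0));
}
```

    Even with the worst-case latency, roughly 738 of every 1000 microseconds
    remain for the driver and the application, which is the sense in which
    the rate looks reasonable. Driver and protocol processing time is not
    included in this estimate.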
    
    Hope this helps a little.   Perhaps others that know the
    PBXDP can add some additional information.
    
    Good Luck...Jeff
1235.3. "more questions about RT-DU???" by BEJVC::SYLVIAXIONG () Thu Mar 06 1997 07:55  29 lines
Jeff,

I am just back from the customer site with more info about
the project and more questions about the realtime features of Digital UNIX.

1) How much memory does the realtime kernel of Digital UNIX take up?
   What is the maximum system overhead of the realtime kernel?

2) Is there a limit on the memory that a realtime process can lock?  The
   customer told me that there would be a 7GB data file, which needs to 
   be locked in memory in his realtime application.

3) Is it possible to get better realtime responsiveness when 
   ubc_maxpercent is reduced below 50% or set to 0?

3) The system clock resolution on Digital Alpha systems is 1/1024 second.
   How does the CLOCK_REALTIME clock measure time in nanoseconds?  Is the
   CLOCK_REALTIME clock a soft or hard clock?  How is the high-resolution
   clock implemented by configuring the kernel option MICRO_TIME?

4) The realtime performance data of Digital UNIX is based on processes.  Is
   there data based on threads?   
 
I will post more info about the customer's project and his concerns.


Thanks & Regards

Sylvia
1235.4. "Some info" by RHETT::PARKER () Thu Mar 06 1997 12:40  61 lines

Sylvia, 

I'll take a shot at these questions. I'm sure Jeff or someone will clarify
if I've missed some things. 

Hth,
Lee
--------------------------------------------------------------------------

> 1) How much memory does the realtime kernel of Digital UNIX take up?
>    What is the maximum system overhead of the realtime kernel?

All kernels are realtime as of Digital UNIX 3.0. The kernel is usually
between 7-9MB in size on 4.0. In order to make the kernel more fully
preemptable, you enable rt-preempt-opt in the generic subsystem in 
/etc/sysconfigtab, like so:
generic:
	rt-preempt-opt = 1 

> 2) Is there a limit on the memory that a realtime process can lock?  The
>    customer told me that there would be a 7GB data file, which needs to
>    be locked in memory in his realtime application.

No, not really. You would need to "tune" a couple of vm parameters in
order to wire down 7GB - vm-maxwire and vm-maxvas + possibly some others.

> 3) Is it possible to get better realtime responsiveness when
>    ubc_maxpercent is reduced below 50% or set to 0?

If they are doing a lot of writes to the file, then yes, lowering the
ubc_maxpercent attribute would help with the issue where the system response
is less than ideal when the update daemon syncs the disks.  I don't think
I would lower it to 0 though. 

> 3) The system clock resolution on Digital Alpha systems is 1/1024 second.
>    How does the CLOCK_REALTIME clock measure time in nanoseconds?  Is the
>    CLOCK_REALTIME clock a soft or hard clock?  How is the high-resolution
>    clock implemented by configuring the kernel option MICRO_TIME?

You can't get nanosecond granularity without an external clock. The MICRO_TIME
option does give finer granularity - the time returned by the clock_gettime 
routine is extrapolated between clock ticks to give apparent microsecond
granularity - so it's sort of a soft clock. The actual resolution of 
CLOCK_REALTIME is 976.5625 microseconds (1/1024 second).
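
To see what a given system reports, the POSIX calls clock_getres and
clock_gettime can be used as in the sketch below. This is only an
illustration: the helper names are made up for the example, and the value
printed depends on the machine. With a 1024 Hz system clock one would expect
a resolution of about 976562 nanoseconds, i.e. 976.5625 microseconds.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Return the resolution of CLOCK_REALTIME in nanoseconds, or -1 on error. */
long realtime_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_REALTIME, &res) != 0)
        return -1;
    return (long)res.tv_sec * 1000000000L + res.tv_nsec;
}

/* Print the reported resolution and one timestamp.
   Returns 0 on success, -1 if either clock call fails. */
int print_realtime_clock(void)
{
    struct timespec now;
    long res_ns = realtime_resolution_ns();

    if (res_ns < 0 || clock_gettime(CLOCK_REALTIME, &now) != 0)
        return -1;
    printf("resolution: %ld ns\n", res_ns);
    printf("now       : %ld s + %ld ns\n", (long)now.tv_sec, (long)now.tv_nsec);
    return 0;
}
```

The timespec structure always carries a nanoseconds field, whatever the
underlying clock can actually resolve, which is why the interface talks about
nanoseconds even on a ~1 ms clock.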

> 4) The realtime performance data of Digital UNIX is based on processes.  Is
>    there data based on threads?
    
    If you are using 4.0, then you are going to have to wait until 4.0D 
    to get the best realtime performance. Currently, only process contention 
    scope has been implemented in the DECthreads two-level scheduler. They 
    should have system contention scope back in by 4.0D, I think. As long as 
    not much else is running on the system, then that may be ok. 
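
    For reference, the standard POSIX way for a program to ask for a
    contention scope is pthread_attr_setscope. The sketch below is an
    assumption-laden illustration, not DECthreads-specific code: the helper
    name is invented, and a system that lacks the requested scope is
    expected to return ENOTSUP rather than create the thread.

```c
#include <errno.h>      /* ENOTSUP, for callers that inspect the result */
#include <pthread.h>

/* Trivial thread body used only to prove that creation succeeded. */
static void *scope_demo_worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Try to run one thread with the requested contention scope
   (PTHREAD_SCOPE_SYSTEM or PTHREAD_SCOPE_PROCESS).
   Returns 0 on success, otherwise the error number from pthreads. */
int run_thread_with_scope(int scope)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setscope(&attr, scope);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, scope_demo_worker, NULL);
    if (rc == 0)
        rc = pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    return rc;
}
```

    Calling run_thread_with_scope(PTHREAD_SCOPE_SYSTEM) and checking for
    ENOTSUP is a quick way to find out which scopes a particular release
    actually provides.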
    
    




    
1235.5. "" by BEJVC::SYLVIAXIONG () Fri Mar 07 1997 07:16  17 lines
Lee,

The physical memory configuration will be 8GB.  In this 
situation, is it possible to lock a 7GB data file in memory?  
Is there a limit on the memory that a realtime process can lock
on Digital UNIX?

Section 6.1, Clock Functions, on page 6-2 of chapter 6 of the 
Guide to Realtime Programming states: "The CLOCK_REALTIME clock 
measures time in nanoseconds".  How should I understand this? 

Thanks & Regards

Sylvia

 
1235.6. "" by RHETT::PARKER () Fri Mar 07 1997 13:10  25 lines
    
    Sylvia,
    
    With 8GB of physical memory, you should be able to lock a 7GB
    file into memory -> you will need to up some of the VM params
    mentioned before, esp. vm-maxwire. 
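
    On the application side, the locking itself would normally go through
    the POSIX memory-locking calls. The following is a minimal sketch,
    assuming the file is mapped with mmap and pinned with mlock; the
    function name map_and_lock_file is hypothetical, and the kernel tuning
    mentioned above (vm-maxwire and friends) is a separate step that is not
    shown here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a file read-only and ask the kernel to keep its pages resident.
   On success returns the mapping and stores its length in *len_out.
   On failure returns NULL with errno set by the call that failed. */
void *map_and_lock_file(const char *path, size_t *len_out)
{
    struct stat st;
    void *addr;
    int saved;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return NULL;
    }

    addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    saved = errno;
    close(fd);                      /* the mapping stays valid after close */
    if (addr == MAP_FAILED) {
        errno = saved;
        return NULL;
    }

    /* mlock fails (typically ENOMEM, EPERM or EAGAIN) if the wiring
       limits or privileges do not allow this much locked memory. */
    if (mlock(addr, (size_t)st.st_size) != 0) {
        saved = errno;
        munmap(addr, (size_t)st.st_size);
        errno = saved;
        return NULL;
    }

    *len_out = (size_t)st.st_size;
    return addr;
}
```

    A failure from mlock here is the application-level symptom of the
    wiring limits being too low for the requested size.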
    
    > CLOCK_REALTIME clock measures time in nanoseconds
    
    We are required by POSIX to return seconds & nanoseconds. If you
    had an external clock that actually had nanosecond resolution,
    then you would get that. But, since the system clock only has
    ~millisecond granularity, that's about as fine as you can get unless
    you use the MICRO_TIME kernel config option. Then, you can 
    get apparent microsecond granularity. Sections 6.1.4 & 6.1.5 in the
    "Guide to Realtime Programming" explain this fairly well. 6.1.4 also
    explains that POSIX 1003.1a mandates that if a program requests a 
    timer value that is not an exact multiple of the system clock
    resolution (976.5625 microseconds), the actual time period will be 
    slightly larger than the requested time period. 
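
    The rounding can be worked through with a few lines of C. This sketch
    assumes a plain round-up to the next whole tick of a 1024 Hz clock; the
    real kernel may behave slightly differently (for example by adding a
    further tick), and the function names are made up for the example.

```c
#define TICKS_PER_SECOND 1024LL         /* system clock rate from the thread */
#define NS_PER_SECOND    1000000000LL

/* Number of whole clock ticks needed to cover a requested interval. */
long long ticks_for_interval(long long request_ns)
{
    return (request_ns * TICKS_PER_SECOND + NS_PER_SECOND - 1) / NS_PER_SECOND;
}

/* Interval actually delivered, in nanoseconds (one tick is 976562.5 ns). */
double delivered_interval_ns(long long request_ns)
{
    return (double)ticks_for_interval(request_ns) * NS_PER_SECOND
           / TICKS_PER_SECOND;
}
```

    Under that assumption, a requested period of 1 ms needs 2 ticks and
    comes out as 1.953125 ms, and a requested 3 ms needs 4 ticks and comes
    out as 3.90625 ms.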
    
    Hope this helps! 
    
    Lee
    
1235.7. "" by HELIX::SONTAKKE () Mon Mar 10 1997 17:56  13 lines
    The specific question about wiring down the 7GB file in memory
    needs to be moved to the Digital_Unix conference.  For all I know, nobody
    else has tried to do that yet.  Not many organizations in the world
    have a budget which allows 8GB of memory on board and can afford
    to allocate 7GB of that to a single file :-)
    
    MICRO_TIME gives the ability to do high-granularity time stamping. 
    You can measure events with an apparent granularity of a microsecond. 
    However, event scheduling is still limited to the granularity of the
    system clock, which runs at around 1ms (0.9765625ms to be exact).
    
    - Vikas
1235.8. "" by HGOM11::SYLVIAXIONG () Tue Mar 11 1997 07:27  117 lines
Here is the info about my customer's project -- a realtime satellite data 
processing system.

1. Proposed configuration
-------------------------

		    Side 1			    Side 2
		---------------			---------------
Satellite	Data Collection			Data Collection
Station		  Instrument		  	  Instrument
		---------------			---------------
		     .				     .		
		     . HDLC		       HDLC  .
		     .				     .
		     	
		     | PBXDP                   PBXDP |		
		-----------			-----------   	Realtime
Data		AlphaServer --------    -------	AlphaServer   	Data  
Processing	   4100 	   |	|	   4100	      	Acquisition
Center		-----------	   |	|	-----------
				   |    |
				   |    |
		-----------	   ------	-----------	Realtime
		AlphaServer ------|MC Hub|----- AlphaServer	Data
		   4100 	   ------	   4100		Processing
		-----------	   |    |	-----------
				   |    |
				   |    |
		-----------	   |    |	-----------	Database
		AlphaServer --------    ------- AlphaServer	Server
		   4100 			   4100
		-----------			-----------


A) The six AlphaServers form a Memory Channel cluster.   Side one
   and side two are backup systems for each other.   

B) For the realtime data collection and processing systems, each AlphaServer 4100 
   is configured with 2 CPUs and 8GB of memory.  

C) Realtime data collection is through the PBXDP PCI-based synchronous 
   communication controller using the HDLC protocol.  The PBXDP is capable of
   providing line speeds up to 5Mbps.   

D) The OS is Digital UNIX configured with the realtime kernel. 


2. Application and performance requirements
-------------------------------------------

A) Realtime data acquisition system
   
   At peak load the system must receive 1000 data frames per second and 
   send 300 data frames per second through the PBXDP, and simultaneously 
   pass 500 data frames per second through Memory Channel to the data 
   processing system.  One data frame is 100 bytes at most.  In addition 
   to data collection, the system will do some simple data processing, 
   such as encoding, decoding, comparison and formatting.  The 
   procedure is to receive 2 data frames, do some processing to form one 
   initial data frame and pass it to the processing system.  The procedure 
   must finish within 3 milliseconds without data loss.  

B) Data processing system

   For each initial data frame there will be a computation of 400,000 
   instructions; the result will be passed back to the data acquisition 
   system for sending to the satellite station. 


3. Questions about Digital realtime features 
--------------------------------------------

A) The realtime performance figures for context switch and preemption are 
   reported per process, which is of little interest to this customer.  Since 
   the overhead of threads is generally smaller than that of processes, 
   the customer asked whether performance would improve if processes 
   Input A and Output A were combined into one process with two threads, 
   Input A and Output A.  

	Data acquisition system		|	Data processing system
					|	
   	--> Input A	Output A -->	|
	    First	Second		|
	    Priority	Priority	|
					|
					|
	<-- Output B    Input B <--     |
	    Fourth	Third		|
	    Priority	Priority	|

  
B) Because the procedure from Input A to Output A must finish within 
   3 milliseconds, the customer thinks that Digital UNIX system overhead 
   will be the major factor.  How large is the system overhead in this case? 


C) At present the data acquisition system and the data processing system are 
   both 2-CPU systems.  Because there will be a large amount of data 
   transmitted between the two systems, which may be the major influence on 
   the realtime response of the application, is it possible to get better 
   performance if the data acquisition and processing systems are combined 
   into one 4-CPU system? 

D) The Digital UNIX realtime interface supports a timesharing scheduling 
   policy and fixed-priority, preemptive scheduling policies.   Can these two 
   kinds of policy coexist, that is, can some processes run under the 
   timesharing policy while others run under a fixed-priority, preemptive 
   policy?

Please kindly give your advice, many thanks!


Regards

Sylvia

   

1235.9. "" by HELIX::SONTAKKE () Mon Mar 17 1997 16:00  1 line
    I thought the memory channel hub only supports 4 systems.
1235.10. "4 was an Encore limit???" by BBPBV1::WALLACE (john wallace @ bbp. +44 860 675093) Mon Mar 17 1997 22:06  8 lines
    I'm pretty sure it *works* with more. As usual, "support" may be a
    different matter.
    
    Adding more nodes and longer cables is one part of Digital's "added
    value" to what came in from Encore.
    
    regards
    john
1235.11. "" by HELIX::SONTAKKE () Tue Mar 18 1997 13:57  1 line
    Is this being discussed anywhere else?
1235.12. "Some late-in-the-day thoughts. No warranty, etc." by BBPBV1::WALLACE (john wallace @ bbp. +44 860 675093) Tue Mar 18 1997 18:32  87 lines
    Hi Sylvia,
    
    Re .8: You have quite a project on here. Who else do you have to help?
    You probably need more than folks can offer here.
    
    For a start, you need someone who has in-depth knowledge of the HDLC
    stuff. Do the drivers and software you plan to use give you frame level
    access with the kind of functionality you need ? Some software doesn't
    give you full access; I don't know about the DU stuff. You may need
    full access if the HDLC link has been "modified" in any way.
    
    You also need a good estimate of the per-frame CPU time requirement
    just to pass the data up or down the driver(s). 
    
    But here are my thoughts so far.
    
    2A:
    The processing you describe in 2A and 2B doesn't seem to need a 7GB
    file in memory. Where does the 7GB come in ? 
    
    main() { /* Data acquisition program */
    	while (1) {
    	    receive (frame1);
    	    receive (frame2);
    	    mangle_1 (frame1, frame2, big_7gb_array?, frame3);
    	    /* ^------ what kind of work goes on in here ?	*/
    	    
    	    send (frame3); /* the "initial" frame		*/
    	}
    		
    }
    
    2B: You mention that the computation per "initial message" will be some
    400,000 instructions. Do you have more info? The time to perform these
    instructions will vary enormously depending on whether the data is in
    on-chip cache, main memory, or somewhere in between. Obviously, the
    100 bytes per HDLC frame fit nicely in on-chip cache but the 7GB database
    is accessed at main-memory speed. What kind of mix can you expect ?
    
    main() { /* Data processing program				*/
    	while (1) {
    	    receive (frame_in);	/* Get "initial message"	*/
    	    mangle_2 (frame_in, frame_out, another_7gb_array_?);
    	    /* ^---- 400K instructions */
    	    send (frame_out);
    	}
    }
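
    To put a number on the 2B workload, the figures from .8 can be combined
    as below. This is only a back-of-envelope sketch: it assumes every
    "initial message" is built from two received frames and needs the full
    400,000 instructions, and the function names are invented for the
    example. It says nothing about cache behaviour, which is the real
    unknown.

```c
/* Initial messages produced per second when each one is built from
   frames_per_message received frames. */
double messages_per_second(double frames_per_second, double frames_per_message)
{
    return frames_per_second / frames_per_message;
}

/* Sustained instruction rate, in millions of instructions per second,
   needed to keep up with the given message rate. */
double required_mips(double messages_per_sec, double instructions_per_message)
{
    return messages_per_sec * instructions_per_message / 1e6;
}
```

    At the peak rate in .8 (1000 frames/s in, two frames per message) that
    is 500 messages/s, and 500 x 400,000 instructions comes to a sustained
    200 million instructions per second on the data processing system.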
    
    3A: As you say, in general you get much better context switch times
    between threads than between processes.
    
    If your data acquisition system really is an input-processing job and
    an output-processing job, with little interaction between them, it
    *might* be convenient to have separate processes for simplicity. But if
    you use two threads you might save context switch time.
    
    3B: What is the impact of missing your 3mS deadline? If it is
    catastrophic, maybe DU as configured in .8 is not the right answer. If
    it is acceptable to be late "occasionally" then we carry on looking at
    DU. (I would be surprised if it was catastrophic because I don't expect
    100% availability and integrity from a satellite downlink).
    
    3C: The 4100 has excellent system bus bandwidth. At first glance it
    might be reasonable to expect a single system with all the data in one
    box to "perform better" than the same number of CPUs in several boxes
    with an MC interconnect, because the latency and thruput of the 4100
    system bus is better than the MC interconnect. You save money, too, so
    long as you can fit enough processing power in one box. And you've
    already got a master/standby setup. Maybe you don't need MC at all?
    
    NB "perform better" here means "give better thruput". 
    
    3D: Yes, realtime and roundrobin can coexist in the same box. Do you
    want them to ? With everything in the same box you may find you have
    worse latencies for some RT things, so if that is important there may
    still be advantages to splitting RT from TS in separate boxes, so the
    TS stuff cannot block the RT stuff so easily...
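
    For what it's worth, mixing the two is just a matter of which processes
    call the POSIX scheduling interface. The sketch below uses the standard
    names SCHED_FIFO, SCHED_RR and SCHED_OTHER; the helper names are made
    up, the call normally needs privilege, and any process that never makes
    the call simply stays under the timesharing policy.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>      /* errno, for callers that inspect failures */
#include <sched.h>

/* Put the calling process under fixed-priority, preemptive scheduling
   (SCHED_FIFO) at the given offset above the lowest FIFO priority.
   Returns 0 on success, -1 on failure with errno set. */
int make_fixed_priority(int offset_from_min)
{
    struct sched_param sp;
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo == -1 || hi == -1)
        return -1;

    sp.sched_priority = lo + offset_from_min;
    if (sp.sched_priority > hi)
        sp.sched_priority = hi;

    /* Only -1 signals failure; the success value differs between systems. */
    return sched_setscheduler(0, SCHED_FIFO, &sp) == -1 ? -1 : 0;
}

/* Name of the policy the calling process currently runs under. */
const char *current_policy_name(void)
{
    switch (sched_getscheduler(0)) {
    case SCHED_FIFO:  return "SCHED_FIFO (fixed priority)";
    case SCHED_RR:    return "SCHED_RR (fixed priority, round robin)";
    case SCHED_OTHER: return "SCHED_OTHER (timesharing)";
    default:          return "unknown";
    }
}
```

    In a setup like .8, the Input A and Output A jobs would make a call of
    this kind with different priorities, and everything else on the box
    would be left alone.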
    
    What is the "output" of this system ? Files on disk (or records in a
    database) ? Users at screens ? Frames back up the uplink within n mS of
    the matching incoming frames ? That affects where you split the RT and
    the "timesharing"... 
    
    hope this helps a little
    
    regards
    john
1235.13. "" by HELIX::SONTAKKE () Wed Mar 19 1997 11:42  8 lines
    By the way, 
    
    VM enforces a system-wide wiring limit of 80% of available memory. 
    That will be ~6G on an 8G system.  The limit can be reconfigured by
    changing vm-syswiredpercent in /etc/sysconfigtab.
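
    A quick way to see what percentage a given target needs is sketched
    below. It assumes, for simplicity, that the whole 8G counts as
    "available memory" and ignores whatever the kernel itself has wired, so
    the real figure would be higher; the function names are illustrative.

```c
/* Memory, in GB, that may be wired for a given limit percentage. */
double wire_limit_gb(double memory_gb, double limit_percent)
{
    return memory_gb * limit_percent / 100.0;
}

/* Smallest whole percentage whose limit covers wanted_gb
   (capped at 100 if even that is not enough). */
int min_wire_percent(double memory_gb, double wanted_gb)
{
    int pct = 0;

    while (pct < 100 && wire_limit_gb(memory_gb, pct) < wanted_gb)
        pct++;
    return pct;
}
```

    With these assumptions the default 80% allows 6.4G on an 8G system, and
    wiring a 7G file would need the limit raised to at least 88%.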
    
    - Vikas
    
1235.14. "" by SMURF::DENHAM (Digital UNIX Kernel) Thu Mar 20 1997 11:40  5 lines
    And further by the way, change that 80% wire limit very carefully!
    The kernel really doesn't like running out of wired memory. You
    can end up wedging things if you suck up too much wired memory.
    Anyone who's dealt with wired memory leaks knows that things can
    fall apart pretty badly...
1235.15. "" by HGOM11::SYLVIAXIONG () Tue Mar 25 1997 07:37  60 lines
John,

Yes, it's about a US$ 1M project.  

The processing described in 2B of .8 will need 8GB of basic data.  
As for the 400,000 instructions, my customer told me it means that at 
least 0.4 MIPS of computing capacity is needed to generate a result 
from one initial data frame.

One of the outputs in 2B goes back up to the satellite station over the 
HDLC connection.

I don't know much about HDLC, so I posted the questions to the 
conference OZROCK::X25_OSF.  Attached is the answer to my ABC 
questions.  Could I post your questions to this conference in 
order to get answers for my project?


Thanks & Regards

Sylvia

  

          <<< OZROCK::DISK$NAC$PUBLIC:[NOTES$LIBRARY]X25_OSF.NOTE;1 >>>
              -< Proudly built by the engineers of NaC Australia >-
================================================================================
Note 872.1                  Questions about PBXDP???                      1 of 1
OZROCK::MUGGERIDGE "X.25 is 1-2-3"                   29 lines   9-MAR-1997 20:10
--------------------------------------------------------------------------------
        
>>        1) Is PBXDP supported on AlphaServer 4100 running Digital UNIX?
  
Yes.
      
>>        2) Data communication is based on the HDLC protocol; which software 
>>           driver is better for this case?   WAN Support for Digital 
>>           UNIX System V2.0A (SPD 42.47) or DECnet/OSI V4.0 for Digital 
>>           UNIX (SPD 41.92)?  Are there HDLC APIs that support       
>>           network programming for the PBXDP?
  
WAN Support for Digital UNIX System V2.0A is the only software you require
for this.

There are HDLC APIs.  The WDD Programmer's reference contains more details.
      
>>        3) The performance requirements for PBXDP are to receive 1000 
>>           data frames/sec and to send 500 data frames/sec simultaneously;
>>           one data frame is 100 bytes at most.  Is the communication 
>>           capacity of PBXDP able to meet these requirements?      
  
This is a definite maybe!  Looking at the numbers, your system will be required
to process more than 1500 interrupts per second and run the lines between 1-2 
Mb/s.  I don't see any problems with that, as long as you're aware of the other 
activity on your system.
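
The line-rate estimate can be reproduced from the frame counts. The sketch
below is illustrative only: the function name is invented, and the
overhead_bytes parameter is an assumption added for the example, since no
per-frame HDLC overhead figure is given in this note.

```c
/* Bit rate, in Mb/s, for a given frame rate and frame size.
   overhead_bytes is a per-frame allowance for framing (flags, address,
   control, checksum); pass 0 to count payload only. */
double line_rate_mbps(double frames_per_second, double payload_bytes,
                      double overhead_bytes)
{
    return frames_per_second * (payload_bytes + overhead_bytes) * 8.0 / 1e6;
}
```

Counting payload only, 1500 frames/s of 100 bytes comes to 1.2 Mb/s in total
(0.8 Mb/s for the 1000 frames/s received plus 0.4 Mb/s for the 500 frames/s
sent), which sits inside the 1-2 Mb/s range mentioned above.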

Naturally, you will need to use V.35 or X.21 type interfaces for these speeds.

Matt.