
Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

1610.0. "Wanted: Sanity check on my rusty statistics" by MINDER::WIGLEYA () Mon May 18 1992 15:21

   After years keeping my head down as a humble programmer, I somehow 
   managed to get the job of doing some statistical modelling of a messaging 
   system that we are designing. I have done some analysis, consulting my 
   old school books in the process, and I would now like to throw open the 
   fruits of my labour to comment from greater intellects than my own (and 
   there will be many of those!).
   
   Please feel free to criticize, comment, suggest alternative approaches!
   
   - Andy Wigley @MCO
   
   
   1.0 Problem:
   
   It is required to deliver a message to 300 recipients with a 99% 
   certainty of completion within 20 seconds.
   This is to take place on a system that has to be designed to handle a 
   peak messaging rate of 70 messages/sec.
   
   1.1 Redefinition of problem:
   
   Allowing for MTA-MTA and UA-MTA transmit times and other software delays 
   reduces the time available for delivery to 15 seconds.
   At a time of peak activity (70 msgs/sec), the probability that the 
   delivery time of any one message exceeds 15 seconds must be less than 1%.
   
   2.0 Message Delivery time
   
   Test programs have shown that a single message can be 'delivered' to an 
   RF72 (i.e. the file cabinet server can make the appropriate file updates) 
   in 0.076 seconds. 
   
   2.1 Determining the number of messages for a disk that can be delivered 
   in 15 seconds with 99% certainty
   
   The ERLANG C statistical model is appropriate. 
   Using just one server (disk), we can adjust the arrival rate to find the 
   number of messages that can be queued up for this disk such that the 
   probability that a single message will have to wait more than 15 seconds 
   is approximately 1%.
   
   No. of Servers               1 
   Service Time (secs)          .076 
   Arrival Rate (/secs)         12.852 
   Util. per server (%)         97.6752 
   All servers idle (%)         2.32606 
   All servers busy (%)         97.728 
   Avge No. in the queue        41.0599 
   Avge No. in  system          42.0366 
   Average wait time            3.19482 
   Average flow time            3.27082 
    PROBABILITY OF WAIT TIME IN QUEUE EXCEEDING t secs IS:
    1             71.9732 
    2             53.0057 
    3             39.0367 
    ..
    13            1.83226 
    14            1.34942 
    15            .993783 
   
   This arrival rate of 12.852 messages/second is equivalent to 192.78 
   messages/15sec.
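
   With a single server, the Erlang C model reduces to the standard M/M/1 
   queue, so the figures in the listing above can be cross-checked by hand. 
   The following is a minimal sketch under that assumption (Poisson arrivals, 
   exponentially distributed service times); the function names are 
   illustrative only and are not taken from the tool that produced the 
   listing. Its results agree with the listing to about three significant 
   figures.

```python
import math


def mm1_summary(arrival_rate, service_time):
    """Steady-state M/M/1 figures (Erlang C with a single server)."""
    rho = arrival_rate * service_time        # utilisation of the one disk
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilisation must be below 1")
    avg_in_queue = rho * rho / (1.0 - rho)   # mean number waiting (Lq)
    avg_wait = avg_in_queue / arrival_rate   # mean wait in queue (Little's law)
    return rho, avg_in_queue, avg_wait


def prob_wait_exceeds(t, arrival_rate, service_time):
    """P(time in queue > t) for M/M/1: rho * exp(-(mu - lambda) * t)."""
    mu = 1.0 / service_time                  # service rate, msgs/sec
    rho = arrival_rate / mu
    return rho * math.exp(-(mu - arrival_rate) * t)


if __name__ == "__main__":
    lam, s = 12.852, 0.076                   # values from the listing above
    rho, lq, wq = mm1_summary(lam, s)
    print(f"utilisation {rho:.4%}, avg in queue {lq:.2f}, avg wait {wq:.3f} s")
    for t in (1, 2, 3, 13, 14, 15):
        print(t, f"{prob_wait_exceeds(t, lam, s):.4%}")
```

   Raising or lowering the arrival rate until the 15-second figure crosses 
   1% is the adjustment described in section 2.1.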
   
   
   3.0 Distribution of addressees across disks
   
   In the 15 second period, the distribution of addressees across disk 
   volumes becomes crucial.
   
   The problem now becomes:
   Given a user population evenly distributed across different disk volumes, 
   how many disks are required before the probability that, in any 15 second 
   period, more than 192.78 messages will go to the same disk becomes 
   insignificant?
   
   If you repeatedly deliver samples of 932 messages to a set of disks and 
   count how many of the messages go to each disk, then over a large number 
   of samples the counts will be distributed symmetrically around the mean, 
   with rapidly diminishing observations the further from the mean you go. 
   This can be approximated by a normal distribution curve.
   
   p is the probability that one message will arrive at a given disk
   q is the probability that one message will go to any other disk (1 - p)
   n is the number of messages in the sample
   m is the expected mean, which is n/number of disks (i.e. np)
   The standard deviation of the distribution is squareroot(npq)
   The probability that a single sample will have more than x messages 
   addressed to one disk is derived by dividing the deviation of x from the 
   mean m by the standard deviation and looking the result up in the 'Table 
   of Partial areas under the Normal Curve'. 
   
   Total population of messages during the 15 seconds is: 
   	10 seconds at 70 msgs/sec (peak rate) 
   	+ 5 seconds at the average rate of 46 msgs/sec
   	TOTAL : 932 msgs/(15 seconds).
   
   No. of disks    Probability of more than 192.78 of the 932 messages
                   going to 1 disk
        2                  ~ 99.99%
        3                  ~ 99.99%
        4                  ~ 99.99%
        5                    30.08%
        6                  ~  0.00%
   
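
   The table lookup described in section 3.0 can be reproduced with the 
   normal approximation to the binomial. This is a sketch only, assuming 
   p = 1/(number of disks), no continuity correction, and that the figure 
   wanted is the probability for one given disk; the function names are 
   illustrative. The post's total of 932 messages is used as given so that 
   the output can be compared with the table (10 x 70 + 5 x 46 actually 
   comes to 930).

```python
import math


def normal_tail(z):
    """Upper-tail area P(Z > z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def prob_disk_overloaded(n_messages, n_disks, threshold):
    """Approximate P(more than `threshold` messages go to one given disk)."""
    p = 1.0 / n_disks                    # chance a message goes to this disk
    q = 1.0 - p                          # chance it goes to any other disk
    mean = n_messages * p                # m = np
    sd = math.sqrt(n_messages * p * q)   # squareroot(npq)
    z = (threshold - mean) / sd          # deviation in standard-deviation units
    return normal_tail(z)


if __name__ == "__main__":
    for disks in range(2, 7):
        prob = prob_disk_overloaded(932, disks, 192.78)
        print(f"{disks} disks: {prob:.2%}")
```

   For 5 disks this gives roughly 30%, in line with the table. The values 
   for 4 and 6 disks are close to, but not exactly, 100% and 0%.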
   
   4.0 Conclusions
   
   As long as the potential addressees are evenly distributed across 6 or 
   more disks, it is virtually impossible for more than 192.78 of the 932 
   messages to go to the same disk, so each delivery should complete within 
   15 seconds with at least 99% certainty.
   
1610.1. "How long are the messages?" by CIV009::LYNN (Lynn Yarbrough @WNP DTN 427-5663) Mon May 18 1992 15:44 (3 lines)

Perhaps I missed something - but is the *length* of the messages relevant?
Perhaps you have assumed that message length is short enough that it is 
overwhelmed by other considerations, but it should be stated somewhere.

1610.2. "Do I get my grade??" by MINDER::WIGLEYA () Tue May 19 1992 07:48 (16 lines)

   No, the length of the message doesn't affect this, since 'a delivery' of a 
   message doesn't result in a copy of the full message being written to 
   every recipient's directory.
   
   It is like ALL-IN-1 - a single shared copy of the message is created for 
   all recipients, and the individual 'deliveries' are updates to each 
   recipient's indexed file (their file cabinet).
   
   The time to create the single shared copy is assumed to always be shorter 
   than the time to make the file updates, and all the operations go on in 
   parallel in any case.
   
   The service time of 0.076 seconds/message quoted was taken from tests 
   making the kind of updates that will be required.
   
   - Andy