[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

950.0. "AUTOGEN is hunting for knee of the curve" by POOL::HALLYB (The smart money was on Goliath) Mon Oct 17 1988 13:31

Consider the problem of tuning the value of    F| *
SYSGEN's SYSMWCNT -- the system working set.   a|  *
If one observes a large number of system       u|   *
page faults, then it would behoove one to      l|    * 
increase the value of SYSMWCNT, subject to     t|     *
certain sanity restrictions.  The converse     s|       *
is also true -- if there are few faults then	|          *
perhaps one can decrease the parameter and	|	       *
return memory to the users.  One can graph	|		     * 
this sort of behavior as shown at right --	|			      *
the more memory you give to the system working	+------------------------------
set, the fewer faults.						Memory

The literature is full of graphs like the one above, although usually there
is a more pronounced "knee" of the curve.  But it must be remembered that
typically such graphs are drawn in ideal conditions, where the only variable
is memory.  Often, even the workload is kept constant.

AUTOGEN has the unenviable job of looking at historic points and deciding if
SYSMWCNT should be raised, lowered, or kept the same.

Can anyone offer some algorithmic hints?  The basic problem is:  how does one
decide when one has passed the "knee of the curve", under program control?
This is the sort of thing where you can eyeball a smooth curve and say "AHA --
right there is the knee!".  Well, this must run without human intervention
and may not have that smooth a curve to work from.  On the other hand, exact
predictions are not necessary, either.

  John
T.RTitleUserPersonal
Name
DateLines
950.1max (B - C) ?VINO::JMUNZERMon Oct 17 1988 14:126
    John:
    
    Is there a way to assign costs/benefits to points on the chart:
    costs of page faults, and benefits of more user memory?
    
    John M.
950.2now, how do you calculate that, again...HERON::BUCHANANand the world it runnes on wheelesMon Oct 17 1988 15:163
	How about minimum radius of curvature?

Andrew
950.3Consider only abstract dataGLOBBO::HALLYBThe smart money was on GoliathMon Oct 17 1988 15:2113
>    Is there a way to assign costs/benefits to points on the chart:
>    costs of page faults, and benefits of more user memory?

    For any particular situation, yes.  But I don't think this is the
    sort of thing one can specify in general.  At least, not without
    writing large quantities of extra code; something we do not wish
    to do at this time.
    
    Locating the "knee on the curve" is something humans do naturally.
    Is there an algorithm for doing this, looking only at a curve,
    not knowing any scale or what the data represents?  There must be!

      John H.
950.4if the shoe fits (your knee)VINO::JMUNZERMon Oct 17 1988 20:325
How about fitting two straight lines to the curve?  Fit one line to the points
[first, P] and the other to the points [P, last].  Vary P until the total of
the two fits is best.  ?

John M.
950.5If the point fits, knee it!GLOBBO::HALLYBThe smart money was on GoliathMon Oct 17 1988 20:527
    You mean construct two lines via least-squares, and look for the
    two lines that provide least total variation?  That sounds worth
    pursuing further.
    
    Re: .2
    
    Could you please clarify that a bit?
950.6CLT::GILBERTMultiple inheritence happensTue Oct 18 1988 12:5110
    There is no 'knee' of the curve.  The curve is basically a hyperbola.
    If you change either scale of the hyperbola -- the x axis or the y axis,
    you'll find that the 'knee' moves.
    
    The best approach seems to be the following.  Fit the points to a curve
    (hypothesized to be a hyperbola), and decide (somehow) on the optimal
    value for d faults / d memory.  For example, suppose that in a balanced
    system you're willing to trade 0.5 Meg of memory for 100 fewer page faults
    per second.  Use this and the equation of the curve to determine the point
    on the curve that has this slope.
950.7VINO::JMUNZERTue Oct 18 1988 13:456
>    You mean construct two lines via least-squares, and look for the
>    two lines that provide least total variation?

    Right.
    
    John
950.8Knee --> local min?LARVAE::TURNERMark Turner * DTN 849-3429 * SPYDER::TURNERWed Oct 19 1988 11:346
If this graph is rotated 45 degrees counterclockwise, does the problem
become equivalent to that of finding a local minimum?



						Mark
950.9re .8GALLO::JMUNZERWed Oct 19 1988 12:584
    No, I'm afraid that the problem is finding how many degrees to rotate
    the curve.
    
    John
950.10I really think this is itHERON::BUCHANANand the world it runnes on wheelesWed Oct 19 1988 18:1549
	Someone asked for more details on what minimum radius of curvature is
all about.   I haven't done any of this stuff since I was sixteen, and I haven't
any analysis books (algebra is my turn on).   However, this is a work-related
problem, and hopefully this reply will trigger interest in the problem by
someone like Xia or Deramo who *are* analytically oriented.

	I should preface this by saying that I am pretty sure that this *is*
the best solution to the problem, and is the formal idea which people have been
feeling their way towards in the last couple of replies.   Curvature is all
about knees.   (Perhaps I should take that for my next personal_name).

	Right, let's forget for a oment that we've got a finite data set.
let's pretend that we're speaking about a curve in the Caretesian plane.   And
let's say that it hasn't any kinks or jerks in it: that it's smooth.  Then at
any point on the curve, there is a unique tangent, which is the instantaneous
slope of the curve.   Similarly, we can speak of the instantaneous curvature
of the slope.   I.e., what is the location and radius of that circle which
has the same tangent as the curve, and the same "curvature" as the slope that
we're looking at.

	Just as we can find the tangent by calculus, I.e. by computing
( f(x+h) - f(x) ) / h, and shrinking h till eensyweensy, so we can do the
same sort of thing to find the "curvature".   The curvature, formally, I seem
to recall is defined as the reciprocal of the raidus of the tangent circle that 
we're trying to define.

	Scribbling on a piece of paper just now, I think the way that this
works is as follows.   Pick two points close together on the curve.   Find
the tangents, and hence the normals, of the curve at those points.   The
intersection of those normals is the centre of the circle.   Now bring one
of those points closer to the other.   The intersection point approaches the
tangent circle with the correct radius (I allege, baselessly).   I should add
that this is just a conceptual way of looking at it: the real details won't 
require an iterative approach.

	Now, let's imagine that our existing data points are cubically splined
together.   All we need to, for each point, is to use the cubic equation which
is local to that point, and work out what the curvature is.   Cubics are
really simple, so the expression that you'll get should be extremely simple,
and readily implementible in AUTOGEN.   All you need to do is to compare the
expression for the curvature from one point to the next, and keep track of the
minima.   Probably, given the problem we're looking at, there will be only one
of these, and that's your knee!

	Gee, this is so simple, perhaps I should do it myself!

I'll fiddle around a bit when I get home.

Andrew
950.11Help Available On a Similar Problem?DRUMS::FEHSKENSWed Oct 19 1988 19:2920
    Hmm, I've got a similar problem; I've been doing some fooling around
    with chaotic systems on my Amiga, including plotting the Lorenz
    attractor.  What I wanted to do was color the curve based on its
    local curvature.  However, there is no analytic expression for the
    curve (it's space filling and defined by three nonlinear differential
    equations; one plots it parametrically against time by iterating the
    difference equations derived from the differential equations) so
    I have to derive the curvature from a sequence of points.  I looked
    up the discussion of curvature in my trusty old copy of Thomas,
    but couldn't easily (hey, I'm a lazy guy) derive an expression for
    curvature based on a sequence of points (this is in 3-space,
    incidentally; I plot a 2 dimensional projection of the curve; the
    color has to be based on the 3-space curvature).  Ideally, the
    expression would be based on the last point's (x, y, z) and the
    newly computed point's (x, y, z), so I don't have to save anything
    that I don't already have handy.  I can dig up the differential
    equations if that helps (sorry, I haven't memorized them).
    
    len.
    
950.12straying off the flow of discussion...CTCADM::ROTHLick Bush in '88Wed Oct 19 1988 22:4251
    Here is how to calculate the curvature of solution curves to the
    Lorenz attractor.

    The Lorenz equations are a 3-D vector field; picking a point (x,y,z)
    specifies an initial condition, leading to a well defined integral
    curve.

    To calculate the curvature, you want to calculate the magnitude of the
    cross product of the velocity and acceleration normalized by the square
    of the magnitude of the velocity.

	Curvature = |(V.cross.A)|/|V|^2

    This can be written down as a simple matrix equation.

    For the Lorenz attractor the vector field is:

	Vx = s*(x-y)
	Vy = r*x-y-x*z		r, s, b = parameters
	Vz = x*y-b*z

    The acceleration is given by a linear transformation of the
    velocity (using the chain rule):

	|Ax|   | dVx/dx dVx/dy dVx/dz | |Vx|
	|Ay| = | dVy/dx dVy/dy dVy/dz |*|Vy|
	|Az|   | dVz/dx dVz/dy dVz/dz | |Vz|

    And the cross product can be written as the matrix product:

	|Nx|   |  0  -Vz  Vy | | s   -s   0 | |Vx|
	|Ny| = |  Vz  0  -Vx |*| r-z -1  -x |*|Vy|
	|Nz|   | -Vy  Vx  0  | | y    x  -b | |Vz|

    where the partial derivative matrix is is explicitly filled in.

    Just take the length of N divided by the square of the length of
    V and that's the curvature:

	sqrt(Nx^2+Ny^2+Nz^2)/(Vx^2+Vy^2+Vz^2)

    Note that the curvature of the field as defined here is a scalar function
    of position in space and does not require solving the equations themselves.
    In fact, you could plot a colored 'map' of the curvature of a plane lying
    within XYZ space in any arbitrary orientation...  it might look very
    interesting.  Plotting the magnitude of the vector field, or its curl
    would also be nice variants.  In fact, solving for the curves of the
    curl of the vector field may be very interesting - especially if superposed
    on a plot of the attractor curves themselves...

    - Jim
950.13Radius of CurvatureCLT::GILBERTMultiple inheritence happensWed Oct 19 1988 23:3547
950.14CLT::GILBERTMultiple inheritence happensThu Oct 20 1988 00:298
950.15that looks correctLISP::DERAMODaniel V. {AITG,LISP,ZFC}:: D'EramoThu Oct 20 1988 01:4221
     I vaguely recall parameterizing a curve in terms of its
     arc length, so that P(s) = <x(s), y(s), z(s)> (a vector
     in Euclidean 3-space).  Then the first derivative (with
     respect to s) yields the unit tangent vector to the curve,
     and the second derivative yields some mess from which
     the curvature and normal vector are determined.  I think
     a cross product of the tangent and normal vectors gives
     some vector called B [I forget its name, and don't recall
     ever having understood its significance].
     
     In 2-space things become a little more simplified, because
     it is like having z(s) be a constant and so its derivatives
     vanish.
     
     If I rederived the results, or mounted an expedition
     into my back room for my old books, why I wouldn't be
     at all surprised to discover that the results for the
     two dimensional case are the same as were posted in the
     previous two replies. :-)
     
     Dan
950.16Next stepHERON::BUCHANANand the world it runnes on wheelesThu Oct 20 1988 08:0246
	OK, that gives us an expression for the curvature.   But now we need
to apply it to the problem in hand, which is to identify the turning point
from a finite set of values.   So simply differentiating the curvature and
spotting its zeros boots us nothing.

	We should also be aware that we are now entering the world of
*modelling*, i.e. that various statements which we make are judgments about the
behaviour of the data.   These statements can really only be judged by the
chap with the AUTOGEN problem, or someone else who understands these things.

	The curvature is a function of the first and second derivatives, so
we should seek a representation which makes their computation simple.   We have
no global function which we can fit the curve to, but we understand that
normally optical inspection permits one to spot the knee readily.   Therefore,
rather than mucking about with least squares, I suggest we should use local
curves, i.e. something like cubic splines.   First and second derivatives are
easy-peasy for cubics.

	So Proposed Solution A involves computing a cubic around each point of
the curve, and calculating the curvature.   The point with the minimum 
curvature is what we're looking for.

	Although A will work, there is still scope for further subtlety
perhaps.   Firstly, if there is not much noise in the data, then instead of
calculating the absolute curvature, we could just compute the relative size
of the curvature either side of a point.   I.e.: does the curvature rise or
fall as we move from one cubic to the next cubic.   The advantage of this
sneaky approach is that the first derivatives of the cubics are *defined*
to be the same at the datapoint.   It's one of the basic constraints of the
cubic spline method.   So instead of asking a question about the *curvature*,
we're posing a question about the *2nd derivative*, solely.

	So we then take as our answer, the first point for which the curvature
falls.   The disadvantage is that if we have noisy data, then there might be
several such points, and the code-writing to handle those cases might outweigh
the algorithmic simplification in just doing some comparisons rather than
calculating the absolute solution.   Also, are we really saving much
computation?

	Secondly, we could jettison the cubic idea and take a much more robust
(ie unsound) approach.   Why don't we just model the derivatives as
differences?   Suppose that we have y_0, y_1 and y_2, which for simplicity are
equally spaced...   High priority interrupt, but I've left you enough to be
going on with I hope.

Andrew
950.17Need more informationCLT::GILBERTMultiple inheritence happensThu Oct 20 1988 16:5031
John -
	Could you throw a bit more physical information into this problem?

	It appears that increasing SYSMWCNT will always reduce the system
	page-fault rate.  Hence, SYSMWCNT should always be increased.

	But wait!  This needs to be balanced against the process page-fault
	rates.  So the problem is an optimization problem, but there's not
	enough information about it to discover anything about the optimum.
	Knowing something about the process page-fault rates would help in
	providing a solution.

	For example, suppose the optimum value of SYSMWCNT is assumed to
	occur when

		 system page-fault rate     process page-fault rate
		------------------------ = -------------------------
		system memory being used   process memory being used

	That is, the system and process page fault rates are balanced --
	they are proportional to the amount of memory *used* by each.
	Of course, the amount of memory 'being used' is not a quantity
	that's readily available -- it depends on (and hopefully can be
	estimated from) the working-set size, the page-fault rate, *and*
	something that indicates the fraction of those page-faults that
	could've been avoided with a larger working-set.

	Given an approach like this, I think SYSGEN has a better chance
	of getting a good solution.

					- Gilbert
950.18Would That All Questions Were So Easily AnsweredDRUMS::FEHSKENSThu Oct 20 1988 17:0310
    re .-n ... .-m, for small values of m and n:
    
    Thank you.  The treatment of curvature in Thomas is based on
    parametrization with respect to arc length, and (as noted earlier),
    I was just too lazy (and algebraically rusty) to put the work in
    to derive the curvature of the Lorenz attractor.  What you've given
    me looks pretty usable.  Again, many thanks.
    
    len.
    
950.19That solution has problemsPOOL::HALLYBThe smart money was on GoliathThu Oct 20 1988 19:4140
    re: .17
    
    Peter, of course you're right about this problem being one of making
    the right tradeoffs.  One of the tradeoffs we have to make is that of
    how much developer time we spend doing this, and how much work we make
    AUTOGEN do at the customer site. 
    
    Morefurtherover, although the problem was stated in terms of SYSMWCNT,
    one can hope to apply the same solution algorithms to other places
    where similar tradeoffs are made.  Some examples include trying
    to properly size the file system caches, the resource hash table,
    logical name hash tables, and maybe some RMS_ and PQL_ parameters.
    CONSEQUENTLY MY "IDEAL SOLUTION" WOULD BE INDEPENDENT OF THE NATURE
    OF THE UNDERLYING DATA.
    
    When the algorithm starts to get as sophisticated as .17 requires,
    then perhaps we have gone beyond AUTOGEN's domain and into something
    more along the lines of VPA.  Perfectly reasonable, but more work
    than we want to give the customer for free :-)
    
    The conflict I have right now is that the kinds of problems as posed
    in .0 all have solutions that jump off the page when you look at
    the pretty graphs that show up in the memory management literature.
    Despite the evident hyperbolic nature of such graphs, there is
    frequently a "knee" that is obvious to the human eye.  Unlike smooth
    hyperbolae, "knee"s are independent of the scale used.  An unfortunate
    problem is that when you curve fit sample data to a hyperbola or splines,
    you end up with nice differentiable curves -- but the "knee" is
    usually that point where the curve is NOT differentiable!  I mean,
    why do you think they call it a "knee"?
    
    The least-squares-fit approach seems to mimic this "knee" behavior
    better than smooth curve fitting.  Though maybe 3 lines would be
    better than 2, he hypothesized.
    
    Is this a self-contradictory problem?  Fit a smooth curve to sample
    data and then bitch that the curve is everywhere differentiable
    (hence has no well-define "knee")?

      John
950.20A simple-minded approachAKQJ10::YARBROUGHI prefer PiFri Oct 21 1988 12:089
How about this approach: find the leftmost and rightmost points of the 
region in question. Now for all data points in between, find the point
'mid' that minimizes the angle between the lines {mid,right} and 
{mid,left}. Assuming the data are reasonably smooth, that is at least a
good approximation to the 'knee', and doesn't involve a whole lot of
calculation. If the data are spiky, a preliminary smoothing operation 
should do the trick.

Lynn 
950.21Dimensional AnalysisCLT::GILBERTMultiple inheritence happensFri Oct 21 1988 16:4039
950.22Elaboration, Please, If Possible?DRUMS::FEHSKENSFri Oct 21 1988 16:5912
    re .12 - I realize this is a diversion from the subject at hand,
    but I'm curious about how the expression for curvature is derived,
    specifically
    
    	Curvature = |(V.cross.A)|/|V|^2
    
    The rest follows naturally from this expression, and if I'd been
    able to find or derive an expression like this from the stuff in
    Thomas, I'd have been on my way long ago.
    
    len.
    
950.23a general approachPULSAR::WALLYWally Neilsen-SteinhardtFri Oct 21 1988 18:4756
    in reply to .0 and .19
    
    I think that .1 .6 and .17 are on the right track, and I believe
    all the talk about curvature is interesting but a side issue.
    
    Since John in .19 says he wants a general solution for similar problems
    I'll try to state the general problem here.
    
    In general, you have a set of inputs, called Xi, which control a
    state described by various utilities Uj.  What you want to do is
    combine these utilities Uj into one big utility U, and then determine
    the Xi which maximizes U.  For convenience, one usually assumes
    (hopes) that U is a weighted sum of Uj.  One also assumes (hopes)
    that Xi are approximately independent in their effect on U.
    
    For the special case in .0, X1 is   | System faults
    SYSMWCNT, U is low fault rate, 	| Process Faults
    U1 is explicitly system fault rate	| Utility
    and U2 is only made explicit in	|    S			P
    .17 as process fault rate.  U is	|     S		       P
    a weighted sum of U1 and U2, with	|      S   UUUUUUUU   P
    U1 system faults give a high weight	|       S U        U P
    by folklore because they affect all |        US        PU
    users.  				|	U  SSSSPPPP  U
    					|      U PPPPPPSSSSSS U
    					+------------------------------
    						SYSMWCNT
    
    What you're really looking for is the maximum of U in the curve
    above.  The knee only gets into it by accident.  If S and P have
    well defined knees, then U will be nearly constant between the knees,
    so you can safely choose the knee of one if you lack data about
    the other.  
    
    If the knees are not well defined, then maximizing U is a better
    strategy.  And in my experience, there are a lot fewer good knees
    in the real world than there are in text books. (pun intended)

    In the special case of .0 I would define U by
    
    	U = 10 * system page fault rate
    		+ process page fault rate

    and X = SYSMWCNT
    
    and then try to get some data for U, choose a general curve for
    U, like
    
    	U(x) = U0 - U1 * (X-Xm)^N
    
    then choose parameters by least squares and take Xm at the best
    guess for the optimal SYSMWCNT.
    
    Is this too much work for AUTOGEN?  I'd guess so, but that's because
    as a stockholder I'd rather have customers pay for VPA or SPM to
    get the real goodies.
950.24splines and least squaresPULSAR::WALLYWally Neilsen-SteinhardtFri Oct 21 1988 18:5517
    in reply to 950.*
    
    There seems to be some confusion about when to use splines, linear
    least squares and non-linear least squares.
    
    When you use splines you are assuming that the error in your data 
    points is low.
    
    When you use linear least squares, you are assuming that the 'true'
    function is well approximated by a piecewise linear function.
    
    When you use non-linear least squares, you are assuming that the 
    'true' function is well approximated by the function you are using.
    
    If you have noisy data, the safest approach is generally the third.
    The most dangerous approach is to average yi at each xi and then
    try to fit splines through the averaged points.
950.25we need examplesCTCADM::ROTHLick Bush in '88Fri Oct 21 1988 21:4519
    The thing I'm uncomfortable about here is a lack of example
    graphs to look at.  Much of this discussion makes various assumptions
    about the 'shape' of the graphs, but how clean are these graphs in
    reality, and how much noise is there in the data.

    It may be difficult to have a program reliably find a knee on a graph of
    noisy data, even if the eye can catch it.  I'd want to be able to
    run the program, give it a graph, and have it highlight what it thinks
    is the knee with a marker, before I'd be happy to accept such judgement.

    I think a better approach would be to define a sensible "objective
    function", and maximize it.  You could have this function reflect
    some feature such as a 'knee' on a graph if desired - but this should
    really have some basis other than that it looks nice.

    Surely there is much published prior art in this area - have you
    gone foraging thru the literature?

    - Jim
950.26CTCADM::ROTHLick Bush in '88Mon Oct 24 1988 10:2333