Sei sulla pagina 1di 8

LEAST-SQUARES FITTING OF A STRAIGHT LINE

DEREKYORK
Geophysics Division, Departnte?zt of Physics, Uninersity of Toronto, Toronto, Ontario
Received January 21, 1966

ABSTRACT
A detailed discussion of the calculation of the "best straight line" by the
method of least squares is given. The most general solution is found and the
Can. J. Phys. Downloaded from www.nrcresearchpress.com by Queens University on 11/26/14

conditions under wllicl~certain previously derived special solutions are valid are
clearly stated. The "best" slope is shown to be given by the solution of the "Least-
Squares Cubic". An example i: given to illustrate the method. It is shown that the
best slope is not necessarily bounded by values found from the regressions of
x on y and y on x.

I t happens occasionally, in allnost all branches of research, that one is


faced with the task of drawing the best straight line through data on a cartesian
plot. If the X coordinates* alone have been measured with perfect accuracy,
the best line is that which minimizes the suin of the squares of the Y residuals.
If all the errors occur in the X coordinates, the best line is that which minimizes
the sum of the squares of the X residuals. I-Iowever, i t usually happens in
practice that both coordinates of any one point are in error. In such situations
investigators have frequently thrown all the error onto the Y , to find one line
For personal use only.

and then attributed all the error to the X i to find another line. These two lines
are only identical if the points to be fitted are exactly collinear. Faced with
two such "best lines", the usual recourse has been to take their arithmetic
mean. Despite the well-known unsatisfactory nature of this approach, it is
still used surprisingly often.
An apparently improved method, first suggested by Adcock (1575), is to
find a line such that the sum of the squares of the perpendicular distances of
the points from this line is a minimum. Karl Pearson (1901) termed this line
the "major axis". However, such an analysis, in its simplest form, yields a line
that is not invariant under a change of scale. If the line y = a +
bx is being
fitted by this method, the coefficients a and b, with their standard deviations,
are given by (Kermack and Haldane 1950)

(1)
b = Ci v," - Ci u," + [(xiv," - Ci u,"]" 4(Ciu ~ v , ] ~ ] +
2C i uivi
where
U i = X i - X, Vi = Yi - P,

*Xi, Yi are observed values; xi, y; are adjusted values.


Canadian Journal of Physics. Volume 44 (1QG6)
1080 CANADIAN JOURNAL O F PHYSICS. VOL. 44. 19GG

Jones (1937) and Teissier (1948) pointed out that it is possible to secure such
invariance by expressing the coordinates in standard measure: Y/u, is then
plotted against X/uz and the perpendicular distances of points from the line
are minimized. a, is the standard deviation of the ordinates Yi, i.e., u,2 =
X i ( Y, - P)2/(n - 1) ; similarly uz2 = C f ( X t - x ) 2 / ( n - I ) , n being the
number of points plotted.
The coefficients a and b , with their standard deviations, are now given by
(Kermack and Haldane 1950)
Can. J. Phys. Downloaded from www.nrcresearchpress.com by Queens University on 11/26/14

and

r being the coefficient of correlation defined above.


This line is usually terined the "reduced major axis". Its chief attraction
is the ease with which the slope may be calculated from equation (3).
Worthing and Geffner (1946) adopted a similar approach but standardized
their coordinates by dividing each Yi by the probable error p, of the Yi
(assumed constant), and each X i by an assumed constant probable error, pz.
For personal use only.

The best values for the slope and intercept are then given by

where c = p,2/pz2 ; and


(6) a= P-Xb.
However, both these modifications of Adcock's method are restrictive in
that they cause all the points to be adjusted to the best straight line along
parallel paths.
What is clearly desired, in general, is a least-squares fitting that allows one
to take advantage of the experimenter's estimates of the uncertainties in the
various X , and Y,. In general, the errors in the coordinates vary from point to
point with no necessarily fixed relation to each other. The most satisfying
study of this problem seems to be that of Derning (1943), who proposed that
the "best" straight line is given by minimizing the sum in the following
equation :

where X f , Yt are the observations, xi, y, are the adjusted values of these, and
w(Xt), w(Yt) are the weights of the various observations. The adjusted points
(x,, y,) are, of course, to lie on a straight line. Deming, however, simplified the
problem by expanding the straight-line function in a Taylor series about
assumed values of slope, intercept, and adjusted points. Squared and higher
terms in the expansion were neglected. While this approach is more satisfactory
YORK: LEAST-SQUARES FITTING O F A STRAIGHT LINE 1081
than the earlier ones, the neglect of terms in the Taylor expansion can cause
significant error in some instances, as Deming recognized.
An exact treatment of the problem is given here. xi, y, are the adjusted
values of X i , Y i which make the sum in equation (7) a minimum. Since \;\re
require xi, yi t o lie on the best straight line, we must have
(8) yi=a+bx,, i = 1, . . . , n.
If these values of xi, yi, a , and b make S a minimum, we have
Can. J. Phys. Downloaded from www.nrcresearchpress.com by Queens University on 11/26/14

However, we also have, from equation (8),


(10) b6xi - 6yi + 6a + xi6b = 0.
If we multiply each of equations (10) by its own undetermined multiplier ( X i )
and take the sum of the ensuing equations and equation ( 9 ) ,we have
Ci {6xi[w(Xi>
(xi - X i ) + bXiI 1 + Ci {Gyi[w(Y i )(yi - Y i ) - Xi] 1
+ 6a X i X i + 6b X i Xixi = 0.
Equating coefficients t o zero gives
(11) xi - X i = -Xib/w(Xi), yi - Y i = Xi/w( Y i ) ,
For personal use only.

(12) Ci X i = 0,
(13) Ci xixi = 0.
Substituting in equations (8) for y, and xi from equations (11),we obtain

where

We now have n + 2 equations in (14), (13),and (12) and n + 2 unknowns in


a , b, and Xi. From (12) and (14) we have

(15) Ci W i ( a + bXi - Y i ) = 0,
and from (13) and (14) we get
1082 C.INADIAN JOURNAL O F PHYSICS. VOL. 44. 1966

Hence a and b are found by solving equations (15) and (16) simultaneously.
( I t might be noted here that Deming's Taylor-series approximation is equivalent
to neglecting the second term in equation ( I G ) . )
Now equation (15) gives

Therefore,
Can. J. Phys. Downloaded from www.nrcresearchpress.com by Queens University on 11/26/14

where

This demonstrates that the best line goes through the center of gravity of
the data ( X , 7 )when this point is defined as above.
Eliminating a between equations (15) and (16) yields

+ C wiuiv, = 0,
For personal use only.

where

The slope of the best straight line is now given by solving equation (20)
for b. We call equation (20) the "Least-Squares Cubic". Before considering
how to solve this equation in the general case, however, it is instructive t o
derive certain solutions under special conditions.
Special Solutions of the "Least-Squares Cubic1'
(i)No Errors in X i ; Y i Subject to Errors
In this instance, w(Xi) = a , W i = w ( Y i ) , and the least-squares cubic
becomes

This solution agrees with the usual one for the weighted regression of y on x
that would be appropriate in this case.
(ii)No Errors in Y , ; X i Subject to Errors
In this case, w ( Y i ) = a , W i = w(X,)/b2,and the least-squares cubic gives

This is the usual formula for the weighted regression of x on y, which is the
correct approach in this instance.
YORK: LEAST-SQUARES F I T T I S G O F A STR.AIGHT LINE 1083
(iii) Both X i and Yi Subject to Error
I t is assumed that w(Xi)/w(Yi) = c (a constant); the least-squares cubic
then becomes
Can. J. Phys. Downloaded from www.nrcresearchpress.com by Queens University on 11/26/14

This formula has been given by Deming (1943). We may consider several
subclasses:
(a) If w(X,) = w(Yi) = a constant, then c = 1 and

This is identical with equation (I), which was derived by minimizing the sum
of the squares of the perpendicular distances of the points from the best line.
For personal use only.

In this specific case, it is the appropriate formula to employ.


(b) If w(Xi) = l/pz2 for a11 X i and w(Yi) = l/pV2for all Y,, but w(Xi) #
w( Yi) :

where c = p,2/pz2. This value of b is identical with that given in equation (5)
and it is the one derived by Worthing and Geffner. Under the special conditions
given here, it is the valid expression to use.

(Xi)
i X i -2 } 1 and w(yi) = {xi(Yi - ~ ) ~ l - l
0
\
= .
n-1 n-1 f
we get
b = uv/uz,
which is identical with the value of the slope given in equation (3), and is the
slope of the "reduced major axis". We see that this line is the best one to use
only under these particular conditions of weighting.
General Solution of the "Least-Squares Cubic"
Equation (20) is, of course, not really a cubic. I-Iowever, it may be reduced
to one by substituting a n approximate value (estimated by eye) for the slope
where required in the Wi. Equation (20) may then be solved exactly for b a s
a cubic. One of the three roots so obtained will obviously be the desired value
of the slope. If we write equation (20) as
1084 CANADIAN JOURNAL O F PHYSICS. VOL. 44. 1966

b3 - 3ab2 + 30b - y = 0,
where
Can. J. Phys. Downloaded from www.nrcresearchpress.com by Queens University on 11/26/14

and

we have for the three real roots


For personal use only.

where

cos 4
a 3 - :a0
-
ir+
=
(a2- p)3'2 -
In all the examples considered by the author so far, the required root has
been b3. If the choice of an approximate slope for inclusion in W i is a t all
reasonable, the ensuing b3 will represent an excellent approximation to an
exact solution of equation (19). However, any desired degree of accuracy may
be obtained by iteration, substituting b3 in W i for Z, and resolving. The best
value for the intercept is then given by equation (15). From equations (11)
and (14),x and y residuals are found to be

xi - X i = - b.- wi (a + bXi - Y,),


w(X,>
and

T o estimate the uncertainties in the slope and intercept thus found, the
author has adopted Deming's simplification of the Taylor-series expansion. I t
is then easy to show that

and
YORK: LEAST-SQUARES FITTING O F A STRAIGI-IT LIiY16 1055

Were it necessary to estimate a and b from the "Least-Squares Cubic" by


hand calculation, then several hours of tedium would be endured by the
investigator. However, the increasingly ready availability of high-speed coin-
puters banishes this worry in inost laboratories. The author has written a
program in Fortran IV language for the IBM 7094 computer which carries
Can. J. Phys. Downloaded from www.nrcresearchpress.com by Queens University on 11/26/14

out the con~putationsoutlined above and prints out the estimates of a, b, u,,
ub, X,8, and the x and y residuals a t each point. I t is necessary merely to feed
in the X, Y coordinates, their appropriate weights, and an approximate
value for b.
T o illustrate the method, we consider the example given by Pearson (1901).
The system of points described in Table I is to be fitted to the "best straight

'TABLE I
Data talcen from Pearson (1901)

X Y
For personal use only.

line". Pearson gave all the values equal weight and estinlated the f o l l o ~ v i n ~ :
(a) minimizing the y residuals gives the slope b = -0.540, (b) minimizing the
x residuals gives b = -0.566, (c) minimizing the perpendicular distances of
the points from the line gives for the "major axis" b = -0.546. The "major
axis" was shown to go through the centroid (3.82, 3.70).
The particular conditions of equal weight for all the measurements assumed
by Pearson are those of class (iii) (a) above and the major axis is, therefore,
the "best-fitting" line to use. The "Least-Squares Cubic", of course, gives this
solution if we substitute w(Xt) = 1 = w(Yt), for all i, in the W,. Making this
substitution and taking b = -0.540 as the approximate slope, we obtain from
the solution of the cubic b3 = -0.546, agreeing with Pearson's major axis
estimate, The centroid is found to be (3.82, 3.70), in agreement with Pearson.
Ilowever, if varying weights are to be assigned to the numbers in Table I ,
the "major axis" is no longer the best fit and we must employ the "Least-
Squares Cubic". Let us suppose t h a t the weights of the observations are now
those given in Table 11. Use of these weights and the approximate value
-0.546 for the slope yields b3 = -0.477 for the solution of the Least-Squares
Cubic. Thus the slope of the best line under these conditions of weighting is
over 12% less than the slope of the major axis. I t may seem t h a t we have
CANADIAN JOURNAL O F PHYSICS. VOL. 44. 1966

TABLE I1
Weights to be applied to data in
Table I
Can. J. Phys. Downloaded from www.nrcresearchpress.com by Queens University on 11/26/14

resorted to somewhat extreme conditions of weighting. In fact, somewhat


similar weightings occur typically in the author's experience when experiments
are designed to yield constant percentage errors in the nleasured parameters,
the weight of an observation being talcen as the reciprocal of the square of its
standard deviation.
An equally interesting result appears if we calculate the weighted regression
of y on x, then of x on y , using the appropriate weights from Table 11. We find
that (a) weighted regression of y on x gives b = - 0.611, (b) weighted regression
For personal use only.

of x on y gives b = -0.630. The "Least-Squares Cubic" estimate found above


is thus about 24% smaller than the weighted regression values. I t may then
be concluded that the properly weighted least-squares estimate is not neces-
sarily bounded by the results of taking the regression of y on x and then x on
y ; nor are the two weighted regressions necessarily bounds.
With data less scattered about a line than those of Table I , the difference
between the "Least-Squares Cubic" solution and other estimates is, of course,
considerably less. I-Iowever, the writer has found that numerous authors in
his own field deduce slopes from excellent data which may differ by as much as
2.5% from the slopes estimated by solving equation (20).

ACKNOWLEDGMENTS
The author is pleased t o acknowledge useful comments on this work from
Mr. J. Purdy, Dr. A. Hayatsu, and Professors R. M. Farquhar and J. C.
Savage. Professor R . D. Russell read the script and gave the author consider-
able assistance. During the course of this work the author has been assisted
by a grant from the National Research Council of Canada.

REFERENCES
ADCOCK,R. J . 1878. Analyst, 5, 53.
DEMING, W. E. 1943. Statistical adjustment of data (John Wiley and Sons, New York).
JONES, H. E. 1937. Metron, 13 (No. l ) , 21.
KERMACK, K. A. and HALDANE, J. B. S. 1950. Biometrika, 37, 30.
PEARSON, I<. 1901. Phil. Mag. (6), 2, 559.
TEISSIER, G. 1948. Biometrics, 4 (No. I ) , 14.
WORTHING, A. G. and GEFFNER, J. 1946. Treatment of experimental data (John Wiley and
Sons, New Yorlr).

Potrebbero piacerti anche