Thompson, Computing for Scientists and Engineers
6.3 STRAIGHT-LINE LEAST SQUARES

FIGURE 6.2 Various weighting schemes for least-squares fitting, shown for fits to a straight line. The conventional fitting method is OLS - y:x, with y weights (dashed verticals) only. For x weights only (dashed horizontals) the fitting method is OLS - x:y. For weights on both x and y data that are in a constant ratio from point to point, the fitting method is IDWMC.
If the weight ratio is not constant, then we have IDWM.

Figure 6.2 shows graphically the ideas behind different weighting schemes. In order to motivate your interest in proceeding with the tedious analysis that follows, we first explain this figure, then we tackle the algebra. The best-fit straight lines shown are obtained using Program 6.2, developed in Section 6.6. We now interpret the various weighting schemes shown.

In Figure 6.2 ordinary least squares (OLS - y:x) is conventional least squares, in which the x values are assumed to be known precisely (x_j = X_j), so that the weighting is applied in the y direction only. For the four data illustrated this produces a fairly flat line if all the y data have equal weights.
Alternatively, if the y values are assumed to be precise, then the weighting is in the x direction only, and a much steeper line results, as shown by OLS - x:y in Figure 6.2.

The most general possibility for a weighting model is that the weights from point to point are independent and not simply related: we call this the independent diagonal weighting model (IDWM), using the nomenclature in Macdonald and Thompson's article.
To illustrate it, we make the length of the weighting rectangle at each datum proportional to the weight w_xj and the height of the rectangle proportional to the weight w_yj. A general method for solving such a least-squares problem approximately was given by Deming, and an exact method for straight-line fits was described by York and by Reed.

Constant ratio of weights

Within IDWM there is an especially simple weighting model for which the ratio of x to y weights is constant from point to point. We call this IDWMC, with the C denoting a constant weight ratio, as illustrated in Figure 6.2.
We express this as follows:

(6.19) λ = w_xj / w_yj, the same constant for every point j

Thus, for the IDWMC data in Table 6.1, λ = 0.5, because the x weights are half the y weights and this ratio of weights is constant for all four data points. We now derive the formulas and some interesting results for straight-line fits in the IDWMC.

Write for the straight-line least-squares fit the model function as

(6.20) y(x) = a1 + a2 x

and consider in (6.6) the contribution to the objective function from uncertainties in both the x_j and the y_j. There are two contributions to the error in the observed y_j. The first is from the uncertainty in measuring the y value, σ(y_j), and the second is from the uncertainty in the x value, σ(x_j). For independent errors the total uncertainty at the jth point has a standard deviation σ_j given by

(6.21) σ_j² = σ²(y_j) + a2² σ²(x_j)

It is therefore appropriate to use the inverse of this quantity for the weight w_yj in (6.6).
With this method of setting up the objective function φ, the (x_j − X_j)² terms in (6.6) contribute only a constant to the objective function, and do not affect its minimization. Thus, the quantity to be minimized is the y-data contribution to the objective function, given by

(6.22) φ_y = Σ_{j=1..n} O_j

where the contribution from the jth point is

(6.23) O_j = (Δy_j)²

and the squared and weighted difference between data and prediction at this point is

(6.24) (Δy_j)² = w_yj [y_j − y(x_j)]²

in terms of the predicted value at the jth point

(6.25) y(x_j) = a1 + a2 x_j

and the weight coming from the variance in the y data at the jth datum

(6.26) w_yj = 1 / [σ²(y_j) + a2² σ²(x_j)]

In this weighting model the contribution to the objective function from the data that are not modeled (here the x_j) is just the total number of data points, n.

Exercise 6.8
Verify each of the steps in the derivation of (6.22) for the objective function in the IDWMC model (Figure 6.3), starting with the objective function (6.6) and going through the steps from (6.19) onwards. ■
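To make the bookkeeping in (6.22) through (6.26) concrete, the sketch below evaluates the y-data objective function for a trial line. The function name and the sample data are ours (hypothetical), but the effective weight follows (6.26) and the weighted residual follows (6.23)-(6.25):

```python
def objective_y(x, y, sig_x, sig_y, a1, a2):
    """y-data objective function, eq. (6.22): sum over points of the
    squared residual weighted by 1/(sigma(y_j)^2 + a2^2 sigma(x_j)^2)."""
    phi = 0.0
    for xj, yj, sxj, syj in zip(x, y, sig_x, sig_y):
        w_yj = 1.0 / (syj**2 + (a2 * sxj)**2)   # effective weight (6.26)
        phi += w_yj * (yj - (a1 + a2 * xj))**2  # weighted residual (6.23)-(6.25)
    return phi

# Hypothetical data scattered about y = 1 + 2x, with errors on both x and y
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.9]
sig_x = [0.1, 0.1, 0.1, 0.1]
sig_y = [0.2, 0.2, 0.2, 0.2]
print(objective_y(x, y, sig_x, sig_y, 1.0, 2.0))
```

A line passing exactly through every datum gives an objective of zero; any misfit makes it positive.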
All this algebra has a purpose, which becomes clearer when you look at Figure 6.3. Namely, the least-squares minimization that we are performing corresponds to minimizing the sum of the squares of the shortest distances (perpendiculars) from the data points to the best-fit straight line, as we now demonstrate.

FIGURE 6.3 Geometry of the IDWMC model. The local coordinates are the x and y data divided by their errors. The perpendicular distance from the data values (x_j, y_j) to the fit that is minimized is O_j, in terms of the scaled coordinates x* = x/σ(x) and y* = y/σ(y).

In order to justify the claim that minimization of φ defined by (6.6) is equivalent to minimizing sums of squares of perpendiculars, we need to consider the dimensions and units of the variables x and y, and of the slope parameter a2. If we just plotted y against x and discussed perpendiculars, the best fit would depend on the units used for x and y.
In particular, the value of the slope (a2) changes if their units change. An appropriate way around this difficulty for our weighting model is to define local dimensionless coordinates, the data divided by their standard deviations, as shown in Figure 6.3. Note that the slope in terms of the scaled variables becomes

(6.27) a2* = a2 σ(x)/σ(y)

Exercise 6.9
Use the geometry of the triangles in Figure 6.3 to show that (6.23) through (6.27) are equivalent to the relation between the vertical and perpendicular distances in the figure. Hint: Use the identity 1/(1 + tan²θ) = cos²θ. ■
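The equivalence that Exercise 6.9 asks you to prove can also be checked numerically: in the scaled coordinates x* = x/σ(x), y* = y/σ(y) of Figure 6.3, the weighted squared residual of (6.23)-(6.26) equals the squared perpendicular distance from the point to the line. A minimal sketch, with function names and sample values of our own choosing:

```python
def weighted_residual_sq(xj, yj, sx, sy, a1, a2):
    # O_j of (6.23): squared y-residual times the effective weight (6.26)
    return (yj - a1 - a2 * xj)**2 / (sy**2 + (a2 * sx)**2)

def perp_distance_sq(xj, yj, sx, sy, a1, a2):
    # Squared perpendicular distance in the scaled coordinates of Fig. 6.3,
    # where the line has slope a2* = a2*sx/sy (6.27) and intercept a1/sy
    xs, ys = xj / sx, yj / sy
    slope, intercept = a2 * sx / sy, a1 / sy
    return (ys - intercept - slope * xs)**2 / (1.0 + slope**2)

# The two quantities agree for any datum and any trial line:
print(weighted_residual_sq(1.3, 2.7, 0.2, 0.5, 0.4, 1.8))
print(perp_distance_sq(1.3, 2.7, 0.2, 0.5, 0.4, 1.8))
```

The agreement holds point by point, which is why the minimization is invariant under interchanging the roles of x and y.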
Note that this result does not depend on the assumption of constant weight ratios (the IDWMC model); it holds for the more general independent diagonal weighting model (IDWM in Figure 6.2).

We have obtained a visually appealing and intuitive criterion for fitting. Also, it is apparent from this geometric construction that the same straight line should be obtained if we plot x on y rather than y on x, because the perpendicular is invariant under this change of axes.
This claim is validated in the following algebraic derivation of the slope and intercept formulas.

In the following let

(6.28) w_j = w_yj

By setting the derivative of φ_y in (6.22) with respect to a1 to zero, you will readily find that the intercept on the y axis is given by

(6.29) a1 = ȳ − a2 x̄

where the weighted averages of the x and y values are

(6.30) x̄ = Σ_j w_j x_j / Σ_j w_j,  ȳ = Σ_j w_j y_j / Σ_j w_j

This result for a1 is not immediately usable, since it contains the unknown slope, a2. We may use it, however, to calculate the differences between data, y_j, and fit, y(x_j), in (6.24) as

(6.31) y_j − y(x_j) = (y_j − ȳ) − a2 (x_j − x̄)

The calculation in terms of differences from the mean also has the advantage of minimizing subtractive-cancellation effects, as discussed in detail in Exercise 4.8 in Section 4.3.
Now compute the sums of products

(6.32) Sxx = Σ_j w_j (x_j − x̄)²

(6.33) Syy = Σ_j w_j (y_j − ȳ)²

(6.34) Sxy = Σ_j w_j (x_j − x̄)(y_j − ȳ)

These are the key quantities used to calculate interesting properties of the straight-line fit.

Exercise 6.10
(a) Derive the least-squares equation for the slope of y on x, a2, by substituting (6.31) into the formula (6.16) for the derivative of the objective function with respect to a2, to show that

(6.35) Sxy a2² + (λ Sxx − Syy) a2 − λ Sxy = 0

where λ, the ratio of x weights to y weights, is given by (6.19).
(b) Show that this equation predicts that the best-fit slope for x on y is just 1/a2. To do this, demonstrate the symmetry of this equation under interchange of x and y, recalling that under such an interchange λ is replaced by 1/λ. ■

The quadratic equation for the slope, (6.35), has a solution that is insensitive to subtractive cancellation (as discussed fully in Section 4.3) when computed in the form

(6.36) a2 = 2 λ Sxy / [λ Sxx − Syy + √((λ Sxx − Syy)² + 4 λ Sxy²)]

for λ Sxx > Syy, for example when x errors are relatively negligible (λ → ∞).
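The x-y interchange symmetry of Exercise 6.10(b) can be verified numerically. Assuming the slope quadratic (6.35) has the form Sxy a2² + (λSxx − Syy) a2 − λSxy = 0 (the form used here, which reproduces the ordinary-least-squares limits), swapping Sxx with Syy and λ with 1/λ turns each root into its reciprocal. The sample S values below are invented for illustration:

```python
import math

def slope_roots(sxx, syy, sxy, lam):
    """Both roots of the slope quadratic (6.35):
    Sxy*a2^2 + (lam*Sxx - Syy)*a2 - lam*Sxy = 0."""
    b = lam * sxx - syy
    disc = math.sqrt(b * b + 4.0 * lam * sxy * sxy)
    return (-b + disc) / (2.0 * sxy), (-b - disc) / (2.0 * sxy)

# Invented sums of products; any choice with Sxy != 0 works
sxx, syy, sxy, lam = 3.0, 5.0, 2.0, 0.5
r1, r2 = slope_roots(sxx, syy, sxy, lam)          # fit of y on x
q1, q2 = slope_roots(syy, sxx, sxy, 1.0 / lam)    # x and y interchanged
print(r1, 1.0 / q1)   # the two values agree, as Exercise 6.10(b) predicts
print(r2, 1.0 / q2)
```

Because the product of the roots is −λ, one root is always positive and one negative; the physically relevant root carries the sign of Sxy.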
The slope should be computed in the form

(6.37) a2 = [Syy − λ Sxx + √((Syy − λ Sxx)² + 4 λ Sxy²)] / (2 Sxy)

for λ Sxx < Syy, as when y errors are relatively negligible (λ → 0).

Exercise 6.11
Use the results in Exercise 4.9 and the discussion below it to show that the appropriate formulas have been selected for solution (6.36) or (6.37) of the slope equation (6.35). ■

To summarize our derivation for a constant ratio of weights, IDWMC: The slope is given by (6.36) or (6.37), then the intercept is simply obtained from (6.29). We next consider some properties of the straight-line least-squares slopes, then we derive a compact expression for the minimized objective function.

Properties of the least-squares slopes

When both variables have errors, even if these are in the constant ratio given by (6.19), there is an additional parameter in the straight-line least-squares analysis, namely the ratio of x to y weights, λ. It is therefore important to understand how λ influences the fit, as we investigate in the following.

An interesting limit of the result (6.36) for the slope is for negligible x errors, λ → ∞ (OLS - y:x in Figure 6.2), for which (6.36) has as its limit

(6.38) a2(∞) = Sxy / Sxx

Similarly, for negligible y errors, λ = 0 (OLS - x:y in Figure 6.2), formula (6.37) has the limit

(6.39) a2(0) = Syy / Sxy

Now to verify the result suggested by Figure 6.3; that is, either

(6.40) a2(∞) ≤ a2(λ) ≤ a2(0) if Sxy > 0

or the slopes are limited by

(6.41) a2(0) ≤ a2(λ) ≤ a2(∞) if Sxy < 0

so that the slope for any value of λ lies between that for y on x, a2(∞), and that for x on y, a2(0).

Exercise 6.12
To derive the inequalities (6.40) or (6.41), consider the quantity under the square-root sign in the denominator of (6.36), then apply the Schwarz inequality to show that a2 ≥ Sxy/Sxx = a2(∞) if Sxy > 0.
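Putting the pieces together, here is a sketch of the whole IDWMC recipe: weighted averages (6.30), sums of products (6.32)-(6.34), the cancellation-safe slope forms (6.36)/(6.37), and the intercept (6.29). The function name and data are ours, and the equation forms are the reconstructions used above:

```python
import math

def idwmc_fit(x, y, wx, wy):
    """Straight-line fit y = a1 + a2*x for the IDWMC model, in which the
    ratio lam = w_xj/w_yj (6.19) is the same for every point."""
    lam = wx[0] / wy[0]                # constant x-to-y weight ratio (6.19)
    sw = sum(wy)
    xbar = sum(w * xi for w, xi in zip(wy, x)) / sw                  # (6.30)
    ybar = sum(w * yi for w, yi in zip(wy, y)) / sw
    sxx = sum(w * (xi - xbar)**2 for w, xi in zip(wy, x))            # (6.32)
    syy = sum(w * (yi - ybar)**2 for w, yi in zip(wy, y))            # (6.33)
    sxy = sum(w * (xi - xbar) * (yi - ybar)
              for w, xi, yi in zip(wy, x, y))                        # (6.34)
    b = lam * sxx - syy
    disc = math.sqrt(b * b + 4.0 * lam * sxy * sxy)
    if b > 0.0:
        a2 = 2.0 * lam * sxy / (b + disc)    # safe form (6.36), lam*Sxx > Syy
    else:
        a2 = (-b + disc) / (2.0 * sxy)       # safe form (6.37), lam*Sxx < Syy
    a1 = ybar - a2 * xbar                    # intercept (6.29)
    return a1, a2

# Data lying exactly on y = 1 + 2x are recovered exactly
a1, a2 = idwmc_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0],
                   [0.5] * 4, [1.0] * 4)
print(a1, a2)
```

Because (6.36) and (6.37) are algebraically the same root of (6.35), the branch choice affects only the rounding error, not which solution is selected.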