Thompson, Computing for Scientists and Engineers
You are probably already familiar with orthogonality in the context of vector geometry. To understand the analogy, work the following exercise.

Exercise 6.4
Suppose that k and l are two three-dimensional vectors with components expressed in Cartesian coordinates.
(a) Write down, in a form analogous to (6.6), the orthogonality condition for these two vectors.
(b) If k is a fixed vector, what is the set of all vectors orthogonal to k? ■

In the following subsection we relate this orthogonality property to the key concepts of least-squares fitting.

For linear least-squares fits with polynomials it is advantageous that the polynomials be orthogonal, but this property does not hold if just simple powers of the independent variable x are used. You can investigate this problem and its cure in the following exercise.

Exercise 6.5
(a) Make a sketch of the powers of x, φL(x) = x^L, for x between -1 and +1, then use this to indicate why these polynomials cannot be orthogonal over this range if positive weights are assumed.
(b) Generate polynomials of orders L = 0, 1, 2, ...
by the following recurrence relations, called the Schmidt orthogonalization procedure:

(6.8)

(6.9)

where the coefficients are

(6.10)

with the denominator sum given by

(6.11)

and the second set of coefficients is given by

(6.12)

The normalization scale of these polynomials is such that the power x^L has unit coefficient.
(c) Prove that the polynomials generated in (b) are orthogonal by applying, for a given L, the method of induction over I to the sequence of sums over the data

(6.13)

Thus we have a general procedure for generating orthogonal polynomials, with their coefficients depending on the weights associated with each datum. ■
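To make the procedure concrete, here is a minimal Python sketch of Schmidt (Gram-Schmidt) orthogonalization of the simple powers x^L over a set of data points with positive weights. It assumes the generic monic form of the construction rather than reproducing equations (6.8)-(6.12) exactly, and the names make_orthogonal_polys, xs, and ws are illustrative, not taken from the text.

```python
import numpy as np

def make_orthogonal_polys(xs, ws, n_max):
    """Build monic polynomials phi_0 ... phi_n_max that are orthogonal over
    the data points xs with positive weights ws, i.e.
    sum_j ws[j] * phi_L(xs[j]) * phi_I(xs[j]) = 0 for L != I
    (compare the orthogonality sum (6.13)).
    Each phi_L is returned as a table of its values at the points xs."""
    xs = np.asarray(xs, dtype=float)
    ws = np.asarray(ws, dtype=float)
    phis = []
    for L in range(n_max + 1):
        # start from the simple power x^L ...
        new = xs ** L
        # ... and subtract its projection onto every lower-order polynomial
        for phi in phis:
            coeff = np.sum(ws * new * phi) / np.sum(ws * phi * phi)
            new = new - coeff * phi
        phis.append(new)
    return phis

if __name__ == "__main__":
    xs = np.linspace(-1.0, 1.0, 9)     # sample points
    ws = np.ones_like(xs)              # unit weights
    phis = make_orthogonal_polys(xs, ws, 3)
    # check the orthogonality sums: off-diagonal entries should be ~0
    gram = np.array([[np.sum(ws * p * q) for q in phis] for p in phis])
    print(np.round(gram, 10))
```

The printed matrix of weighted sums has zero off-diagonal entries to within rounding, which is the orthogonality property that part (c) of the exercise asks you to prove.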
This result may seem rather abstract, but it may surprise you that it provides a straightforward route to the Legendre polynomials, a favorite set of fitting functions, especially in problems which involve three-dimensional polar coordinates, where usually x = cos θ and θ is the polar angle. Another reason for using Legendre polynomials is that, with a suitable choice of weight factors and on replacement of summation by integration in the above equations, they form orthogonal polynomials suitable for linear least squares, as you can readily discover.

Exercise 6.6
(a) Show by generating the lowest-order orthogonal polynomials having L = 0, 1, 2, that if the weight factors are all unity and if the orthogonality summation over data values xj is replaced by integration over x from -1 to +1, then φL(x) can be taken to be the Legendre polynomial of order L, PL(x), namely

(6.14)

The standard normalization of the Legendre polynomials is PL(1) = 1.
(b) Verify the approximate correctness of the orthogonality condition (6.13) by using some Legendre polynomials in the numerical-integration programs in Section 4.6 to integrate products of the functions over the range -1 to +1.
Notice that if you use the trapezoid rule for integration (Section 4.6), then the summation and integration are equivalent, if the xj are equally spaced and if end-point corrections are negligible. ■

Orthogonal functions are much used in analysis and applications. In particular, if the orthogonal functions are polynomials (such as those of Chebyshev, Hermite, Laguerre, Legendre, and other ancient heroes), a general classification of their properties can be made. An extensive and accessible treatment of orthogonal polynomials, including applications to numerical analysis, partial differential equations, and probability theory and random processes, is provided in the monograph by Beckmann.
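In the spirit of Exercise 6.6 (b), the following Python sketch integrates products of low-order Legendre polynomials over -1 to +1 with the composite trapezoid rule; it stands in for the numerical-integration programs of Section 4.6, which are not shown here, and the explicit formulas used are the standard P0(x) = 1, P1(x) = x, P2(x) = (3x^2 - 1)/2, P3(x) = (5x^3 - 3x)/2.

```python
import numpy as np

# standard low-order Legendre polynomials, normalized so that P_L(1) = 1
legendre = [
    lambda x: np.ones_like(x),           # P0
    lambda x: x,                          # P1
    lambda x: 0.5 * (3 * x**2 - 1),       # P2
    lambda x: 0.5 * (5 * x**3 - 3 * x),   # P3
]

def trapezoid_overlap(L, I, n_points=201):
    """Approximate the integral of P_L(x) * P_I(x) over [-1, +1] by the
    composite trapezoid rule on equally spaced points."""
    x = np.linspace(-1.0, 1.0, n_points)
    y = legendre[L](x) * legendre[I](x)
    h = x[1] - x[0]
    # trapezoid rule: end points get half weight, interior points full weight
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

if __name__ == "__main__":
    for L in range(4):
        for I in range(4):
            print(f"L={L} I={I} integral ~ {trapezoid_overlap(L, I):+.6f}")
```

Because every interior point receives the same trapezoid weight h, this is just the orthogonality sum (6.13) with constant weights, apart from the half-weight end points mentioned in the exercise; the off-diagonal (L ≠ I) results come out close to zero.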
Orthogonality and least squares

Suppose that all the parameters to be determined, aL with L = 1, 2, ..., N (if there are N parameters), appear linearly in the definition of the fitting function, Y, as

(6.15)

We call the process by which the aL in (6.15) are adjusted so as to minimize the objective function in (6.6) a linear-least-squares fit. This is to be distinguished from the more restrictive least-squares fit to a straight line (Section 6.3), in which Y(x) = a1 + a2x, so that φ1(x) = 1 and φ2(x) = x.

To find the fitting parameters that minimize the objective function requires that its derivative with respect to each of the aL be zero. In the situation that the xj are precise, this requires from (6.6) that

(6.16)

By inserting the linear expansion (6.15) into this equation we get

(6.17)

This is a set of N linear equations for the coefficients aK that minimize the objective function.
In general, these equations have to be solved by matrix methods. Further, adjustment of one coefficient propagates its influence to the values of all the other coefficients.

The use of orthogonal functions for the φK(xj) greatly simplifies the solution of the equations (6.17), as you may immediately prove.

Exercise 6.7
Show that if the functions φK(xj) satisfy the orthogonality condition (6.13), then the left-hand side of (6.17) collapses to a single nonzero term for each L, resulting in an immediate formula for the linear-least-squares coefficients aL, namely

(6.18)

so that each coefficient is obtained by an independent calculation, but all coefficients are interrelated through the data values yj and their weights wj. ■
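The contrast between the two routes can be seen in a short Python sketch that fits the same data twice: once by setting up and solving the coupled normal equations by a matrix method, and once with Schmidt-orthogonalized functions, where each coefficient is a single independent ratio of weighted sums in the manner of (6.18). The weighted-sum-of-squares objective and the helper names (weighted_normal_equations_fit, orthogonal_fit) are assumptions for illustration.

```python
import numpy as np

def weighted_normal_equations_fit(basis_values, ws, ys):
    """General linear least squares: build the N x N normal equations
    sum_K [ sum_j w_j phi_L(x_j) phi_K(x_j) ] a_K = sum_j w_j y_j phi_L(x_j)
    (compare (6.17)) and solve them by a matrix method."""
    A = np.array([[np.sum(ws * pL * pK) for pK in basis_values] for pL in basis_values])
    b = np.array([np.sum(ws * ys * pL) for pL in basis_values])
    return np.linalg.solve(A, b)

def orthogonal_fit(ortho_values, ws, ys):
    """With an orthogonal basis the normal equations collapse, and each
    coefficient is an independent ratio of weighted sums (compare (6.18))."""
    return np.array([np.sum(ws * ys * phi) / np.sum(ws * phi * phi)
                     for phi in ortho_values])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    xs = np.linspace(-1.0, 1.0, 11)
    ws = np.ones_like(xs)
    ys = 0.5 - 1.2 * xs + 0.8 * xs**2 + 0.05 * rng.standard_normal(xs.size)

    # simple powers 1, x, x^2 ...
    powers = [xs**L for L in range(3)]
    # ... and the same space spanned by Schmidt-orthogonalized polynomials
    ortho = []
    for p in powers:
        q = p.copy()
        for phi in ortho:
            q = q - (np.sum(ws * p * phi) / np.sum(ws * phi * phi)) * phi
        ortho.append(q)

    a_matrix = weighted_normal_equations_fit(powers, ws, ys)  # coupled solve
    a_ortho = orthogonal_fit(ortho, ws, ys)                   # independent sums

    fit1 = sum(a * p for a, p in zip(a_matrix, powers))
    fit2 = sum(a * p for a, p in zip(a_ortho, ortho))
    print("max difference between the two fitted curves:",
          np.max(np.abs(fit1 - fit2)))
```

The two fitted curves agree to machine precision, even though the coefficient values differ because they multiply different basis functions.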
Thus, if appropriate orthogonal functions can be found, their use greatly simplifies the least-squares fitting, because each coefficient is found independently of the others. For example, if a different number of fitting functions is decided upon (say, if N is increased), there is no need to recalculate all the coefficients, because formula (6.18) does not depend upon the value of N. The quality of the fit, however, does depend on N.
Because of this linear independence of the fitting coefficients, each of them can be uniquely associated with its corresponding function. For example, in Fourier expansions, if x is the time variable, then the aL are the amplitudes of the successive harmonics of L times the fundamental frequency.
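For instance, here is a small Python sketch in the Fourier spirit: equally spaced samples with unit weights, with each harmonic amplitude recovered by its own independent ratio of sums in the manner of (6.18). The test signal and the name harmonic_amplitudes are illustrative; the full treatment of Fourier expansions is the subject of Chapters 9 and 10.

```python
import numpy as np

def harmonic_amplitudes(ts, ys, n_harmonics, period=2 * np.pi):
    """Estimate cosine and sine amplitudes of the first few harmonics from
    equally spaced samples with unit weights; each coefficient is one
    independent ratio of sums (the Fourier analogue of (6.18))."""
    omega = 2 * np.pi / period
    amps = []
    for L in range(1, n_harmonics + 1):
        c = np.cos(L * omega * ts)
        s = np.sin(L * omega * ts)
        a_cos = np.sum(ys * c) / np.sum(c * c)
        a_sin = np.sum(ys * s) / np.sum(s * s)
        amps.append((a_cos, a_sin))
    return amps

if __name__ == "__main__":
    # equally spaced samples over one period, end point excluded
    ts = np.linspace(0.0, 2 * np.pi, 64, endpoint=False)
    # a signal with known harmonic content: 2*cos(t) + 0.5*sin(3t)
    ys = 2.0 * np.cos(ts) + 0.5 * np.sin(3 * ts)
    for L, (a_cos, a_sin) in enumerate(harmonic_amplitudes(ts, ys, 4), start=1):
        print(f"harmonic {L}: cos amplitude {a_cos:+.3f}, sin amplitude {a_sin:+.3f}")
    # asking for more harmonics (a larger N) leaves the lower ones unchanged,
    # which is the independence property described above
```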
One must be careful with the application of orthogonal functions when weights are involved, for the following reason: a given type of function is orthogonal for a set of xj only for a particular choice of weights. For example, the cosine and sine functions for Fourier expansions (Chapters 9 and 10) are orthogonal over the range 0 to 2π if the weight factors are unity (or constant). In data analysis such a choice of weights usually conflicts with the probability-derived weights discussed in Section 6.1, in which the weights are inversely proportional to standard deviations of measurements. A common compromise is to use (6.13) in formal work and in situations where the functions are orthogonal with weight factors of unity, but to use the general linear-least-squares formula when other weighting schemes or choices of xj are made.
Unpredictable results will be obtained if formulas derived from different weighting models are combined.

With the above general background, which also serves as the foundation for Fourier expansions in Chapters 9 and 10, we are ready to study the special case of straight-line least squares, which is of very common use in scientific and engineering applications.

6.3 ERRORS IN BOTH VARIABLES: STRAIGHT-LINE LEAST SQUARES

In least-squares fitting models there are two related components that are usually discussed in only a cursory way.
The first component is a model for the errors in the variables, and the second component is a model for the weight that each datum has in the fitting. In Section 6.1 we indicated, in the context of maximum likelihood, the connection between a possible error model (Gaussian distributions of independent errors from point to point) and the weighting values in (6.6). The subject of least-squares fitting when both variables contain errors is perennially interesting, with many pitfalls and possibilities.
They have been summarized in an article by Macdonald and Thompson. In the following we discuss a variety of weighting models for straight-line least-squares fitting; we then particularize to the case in which there is a constant ratio of x-data weights to y-data weights from point to point, and then we derive some interesting yet useful symmetry and scaling properties for this weighting model.

Weighting models

The ideas behind generalized weighting models can be illustrated by a mechanical analogy that will be very familiar to readers with a background in physics or engineering.
Suppose that we are making a straight-line least-squares fit, so that in (6.6) the Xj and Yj define the best-fit straight line. If we literally hung weights wxj and wyj at the data points (xj, yj), then the objective function (6.6) would be proportional to the moment of inertia of the distribution of mass about the line defined by the (Xj, Yj). The best-fit straight line would then be that which minimizes the moment of inertia about this line.

As an example of weighting of both x and y data, we choose the four (x, y) data pairs in Table 6.1, which apparently have large errors, so that the distinction between the various weighting models is accentuated.

TABLE 6.1 Data and weights for straight-line least squares with weighting model IDWMC, which has a constant ratio of x-data weights to y-data weights of 0.5.

Figure 6.2.
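The following Python sketch uses illustrative data, not the entries of Table 6.1: assuming the usual form of the objective function for errors in both variables, a weighted sum of squared x and y residuals from the fitted line with each point attached to its nearest point on the line, it evaluates that objective for data with a constant ratio of x-data to y-data weights of 0.5 and locates the minimizing line by a crude grid search. The data values and the name both_variable_objective are assumptions, not the book's.

```python
import numpy as np

def both_variable_objective(a1, a2, xs, ys, wx, wy):
    """Weighted sum of squared x and y residuals from the line Y = a1 + a2*X,
    with each data point attached to its closest point on the line in the
    weighted, moment-of-inertia sense described above.  For a straight line
    the best attachment point is found analytically, giving the per-point
    contribution below (an assumed, standard form; the book's (6.6) is not
    reproduced in this excerpt)."""
    resid = ys - a1 - a2 * xs
    return np.sum(wx * wy * resid**2 / (wx + wy * a2**2))

if __name__ == "__main__":
    # illustrative data with a constant ratio wx/wy = 0.5 (not Table 6.1)
    xs = np.array([0.0, 1.0, 2.0, 3.0])
    ys = np.array([0.2, 1.3, 1.8, 3.1])
    wy = np.ones_like(xs)
    wx = 0.5 * wy

    # crude grid search for the minimizing intercept a1 and slope a2
    a1_grid = np.linspace(-1.0, 1.0, 201)
    a2_grid = np.linspace(0.0, 2.0, 201)
    best = min(((both_variable_objective(a1, a2, xs, ys, wx, wy), a1, a2)
                for a1 in a1_grid for a2 in a2_grid))
    print(f"objective {best[0]:.4f} at intercept {best[1]:+.3f}, slope {best[2]:+.3f}")
```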