Heath - Scientific Computing, page 58
CHAPTER 6. OPTIMIZATION

6.24 Consider Newton's method for minimizing a function of n variables:
(a) When might the use of a line search parameter be beneficial?
(b) When might the use of a line search parameter not be beneficial?
(c) Asymptotically, as the solution is approached, what should be the value of this line search parameter for Newton's method?

6.25 Many iterative methods for solving multidimensional nonlinear problems replace the given nonlinear problem by a sequence of linear problems, each of which can be solved by some matrix factorization. For each method listed, what is the most appropriate matrix factorization for solving the linear subproblems? (Assume that we start close enough to a solution to avoid any potential difficulties.)
(a) Newton's method for solving a system of nonlinear equations
(b) Newton's method for minimizing a function of several variables

However, it can sometimes fail, and it can also sometimes converge rapidly. Under what conditions would each of these two types of behavior occur?

6.29 Suppose we want to minimize a function f: Rⁿ → R using a secant updating method. Why would one not just apply Broyden's method for finding a zero of the gradient of f?

6.30 To what method does the first iteration of the BFGS method for minimization reduce if the initial approximate Hessian is
(a) the identity matrix I?
(b) the exact Hessian at the starting point?

6.31 In secant updating methods for solving systems of nonlinear equations or minimizing a function of several variables, why is it preferable to update a factorization of the approximate Jacobian or Hessian matrix rather than update the matrix itself?

6.32 For solving a very large unconstrained optimization problem whose objective function has a sparse Hessian matrix, which type of method would be better, a secant updating method such as BFGS or the conjugate gradient method? Why?

6.33 How does the conjugate gradient method for minimizing an unconstrained nonlinear function differ from a truncated Newton method for the same problem, assuming the conjugate gradient method is used in the latter as the iterative solver for the Newton linear system?

6.34 For what type of nonlinear least squares problem, if any, would you expect the Gauss-Newton method to converge quadratically?

6.35 For what type of nonlinear least squares problem may the Gauss-Newton method converge very slowly or not at all? Why?

6.36 For what two general classes of least squares problems is the Gauss-Newton approximation to the Hessian exact at the solution?

6.37 The Levenberg-Marquardt method adds an extra term to the Gauss-Newton approximation to the Hessian.
Give a geometric or algebraic interpretation of this additional term.

6.38 What are Lagrange multipliers, and what is their relevance to constrained optimization problems?

6.39 Consider the optimization problem min f(x) subject to g(x) = 0, where f: Rⁿ → R and g: Rⁿ → Rᵐ.
(a) What is the Lagrangian function for this problem?
(b) What is a necessary condition for optimality for this problem?

6.40 Explain the difference between range space methods and null space methods for solving constrained optimization problems.

6.41 What is meant by an active set strategy for inequality-constrained optimization problems?

6.42 (a) Is it possible, in general, to solve linear programming problems by an algorithm whose computational complexity is polynomial in the size of the problem data?
(b) Does the simplex method have this property?

Exercises

6.1 Consider the function f: R² → R defined by

    f(x) = ½(x₁² − x₂)² + ½(1 − x₁)².

(a) At what point does f attain a minimum?
(b) Perform one iteration of Newton's method for minimizing f using as starting point x₀ = [2 2]ᵀ.
(c) In what sense is this a good step?
(d) In what sense is this a bad step?

6.2 Let f: Rⁿ → R be given by

    f(x) = ½xᵀAx − xᵀb + c,

where A is an n×n symmetric positive definite matrix, b is an n-vector, and c is a scalar.
(a) Show that Newton's method for minimizing this function converges in one iteration from any starting point x₀.
(b) If the steepest descent method is used on this problem, what happens if the starting value x₀ is such that x₀ − x* is an eigenvector of A, where x* is the solution?

6.3 Prove that the block 2×2 Hessian matrix of the Lagrangian function for constrained optimization (see Section 6.5) cannot be positive definite.

6.4 Consider the linear programming problem

    min_x f(x) = −3x₁ − 2x₂

subject to the inequality constraints

    5x₁ + x₂ ≤ 6,
    4x₁ + 3x₂ ≤ 6,
    3x₁ + 4x₂ ≤ 6,
    x₁ ≥ 0,  x₂ ≥ 0.

(a) How many vertices does the feasible region have?
(b) Since the solution must occur at a vertex, solve the problem by evaluating the objective function at each vertex and choosing the one that gives the lowest value.
(c) Obtain a graphical solution to the problem by drawing the feasible region and contours of the objective function, as in Fig. 6.8.

6.5 How can the linear programming problem given in Example 6.11 be stated in the standard form given at the beginning of Section 6.5.1? (Hint: Additional variables may be needed.)

Computer Problems

6.1 (a) The function

    f(x) = x² − 2x + 2

has a minimum at x* = 1. On your computer, for what range of values of x near x* is f(x) = f(x*)? Can you explain this phenomenon? What are the implications regarding the accuracy with which a minimum can be computed?
(b) Repeat the preceding exercise, this time using the function

    f(x) = 0.5 − x e^(−x²),

which has a minimum at x* = √2/2.

6.2 Consider the function f defined by

    f(x) = 0.5                 if x = 0,
    f(x) = (1 − cos(x))/x²     if x ≠ 0.

(a) Use l'Hôpital's rule to show that f is continuous at x = 0.
(b) Use differentiation to show that f has a local maximum at x = 0.
(c) Use a library routine, or one of your own design, to find a maximum of f on the interval [−2π, 2π], on which −f is unimodal. Experiment with the error tolerance to determine how accurately the routine can approximate the known solution at x = 0.
(d) If you have difficulty in obtaining a highly accurate result, try to explain why. (Hint: Make a plot of f in the vicinity of x = 0, say on the interval [−0.001, 0.001] with a spacing of 0.00001 between points.)
(e) Can you devise an alternative formulation of f such that the maximum can be determined more accurately? (Hint: Consider a double angle formula.)

6.3 Use a library routine, or one of your own design, to find a minimum of each of the following functions on the interval [0, 3]. Draw a plot of each function to confirm that it is unimodal.
(a) f(x) = x⁴ − 14x³ + 60x² − 70x.
(b) f(x) = 0.5x² − sin(x).
(c) f(x) = x² + 4 cos(x).
(d) f(x) = Γ(x). (The gamma function, defined by Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt, x > 0, is a built-in function on many computer systems.)

6.4 Try using a library routine for one-dimensional optimization on a function that is not unimodal and see what happens. Does it find the global minimum on the given interval, merely a local minimum, or neither? Experiment with various functions and different intervals to determine the range of behavior that is possible.

6.5 If a water hose with initial water velocity v is aimed at angle α with respect to the ground to hit a target of height h, then the horizontal distance x from nozzle to target satisfies the quadratic equation

    (g/(2v² cos²α)) x² − (tan α) x + h = 0,

where g = 9.8065 m/s² is the acceleration due to gravity.
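Before any optimization is attempted, the two roots of this quadratic can simply be computed for a trial angle. The following is an added sketch, not part of the original problem, assuming plain Python with the standard library and the given data v = 20 m/s, h = 13.5 m at a trial angle of 60 degrees:

```python
import math

g = 9.8065          # m/s^2, acceleration due to gravity
v, h = 20.0, 13.5   # initial velocity (m/s) and target height (m)

def roots_at(alpha):
    """Both solutions x of (g/(2 v^2 cos^2 a)) x^2 - (tan a) x + h = 0,
    or None if the target height is unreachable at this angle."""
    a = g / (2.0 * v**2 * math.cos(alpha)**2)
    b = math.tan(alpha)
    disc = b * b - 4.0 * a * h
    if disc < 0.0:
        return None
    s = math.sqrt(disc)
    return ((b - s) / (2.0 * a), (b + s) / (2.0 * a))

x1, x2 = roots_at(math.radians(60.0))
print(x1, x2)   # the two horizontal distances at which the jet is at height h
```

Maximizing the larger root over α with a one-dimensional routine then answers the second part of the problem.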
How do you interpret the two roots of this quadratic equation? Assuming that v = 20 m/s and h = 13.5 m, use a one-dimensional optimization routine to find the maximum distance x at which the target can still be hit, and the angle α for which the maximum occurs.

6.6 Write a general-purpose line search routine. Your routine should take as input a vector defining the starting point, a second vector defining the search direction, the name of a routine defining the objective function, and a convergence tolerance. For the resulting one-dimensional optimization problem, you may call a library routine or one of your own design. In any case, you will need to determine a bracket for the minimum along the search direction using some heuristic procedure.
Test your routine for a variety of objective functions and search directions. This routine will be useful in some of the other computer exercises in this section.

6.7 Consider the function f: R² → R defined by

    f(x) = 2x₁³ − 3x₁² − 6x₁x₂(x₁ − x₂ − 1).

(a) Determine all of the critical (stationary) points of f analytically (i.e., without using a computer).
(b) Classify each critical point found in part a as a minimum, a maximum, or a saddle point, again working analytically.
(c) Verify your analysis graphically by creating a contour plot or three-dimensional surface plot of f over the region −2 ≤ xᵢ ≤ 2, i = 1, 2.
(d) Use a library routine for minimization to find the minima of both f and −f. Experiment with various starting points to see how well the routine gets around other types of critical points to find minima and maxima. You may find it instructive to plot the sequence of iterates generated by the routine.

6.8 Consider the function f: R² → R defined by

    f(x) = 2x₁² − 1.05x₁⁴ + x₁⁶/6 + x₁x₂ + x₂².

Using any method or routine you like, how many stationary points can you find for this function? Classify each stationary point you find as a local minimum, a local maximum, or a saddle point. What is the global minimum of this function?

6.9 Write a program to find a minimum of Rosenbrock's function,

    f(x₁, x₂) = 100(x₂ − x₁²)² + (1 − x₁)²,

using each of the following methods:
(a) Steepest descent
(b) Newton
(c) Damped Newton (Newton's method with a line search)
You should try each of the methods from each of the three starting points x₀ = [−1 1]ᵀ, [0 1]ᵀ, and [2 1]ᵀ.
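As an added sketch of method (c), a damped Newton iteration on Rosenbrock's function from the first starting point might look as follows. This is not the book's implementation; it assumes NumPy, the gradient and Hessian are differentiated by hand from the formula above, and the Armijo constant 1e-4 and halving factor are arbitrary choices:

```python
import numpy as np

def f(x):
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

def grad(x):
    return np.array([-400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
                     200.0 * (x[1] - x[0]**2)])

def hess(x):
    return np.array([[-400.0 * (x[1] - 3.0 * x[0]**2) + 2.0, -400.0 * x[0]],
                     [-400.0 * x[0], 200.0]])

x = np.array([-1.0, 1.0])            # starting point [-1, 1]^T
for _ in range(100):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:
        break
    s = np.linalg.solve(hess(x), -g)  # Newton direction
    if g @ s >= 0.0:                  # safeguard: fall back to steepest descent
        s = -g
    t = 1.0                           # backtracking (Armijo) line search
    while f(x + t * s) > f(x) + 1e-4 * t * (g @ s):
        t *= 0.5
    x = x + t * s
print(x)   # approaches the minimizer [1, 1]
```

Steepest descent (a) is the special case s = −g with the same line search; pure Newton (b) simply keeps t = 1.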
For any line searches and linear system solutions required, you may use either library routines or routines of your own design. Plot the path taken in the plane by the approximate solutions for each method from each starting point.

6.10 Let A be an n × n real symmetric matrix with eigenvalues λ₁ ≤ ··· ≤ λₙ. It can be shown that the stationary points of the Rayleigh quotient (see Section 4.3.7) are eigenvectors of A, and in particular

    λ₁ = min_x (xᵀAx)/(xᵀx)   and   λₙ = max_x (xᵀAx)/(xᵀx),

with the minimum and maximum occurring at the corresponding eigenvectors. Thus, we can in principle compute the extreme eigenvalues and corresponding eigenvectors of A using any suitable method for optimization.
(a) Use an unconstrained optimization routine to compute the extreme eigenvalues and corresponding eigenvectors of the matrix

        [ 6  2  1 ]
    A = [ 2  3  1 ].
        [ 1  1  1 ]

Is the solution unique in each case? Why?
(b) The foregoing characterization of λ₁ and λₙ remains valid if we restrict the vector x to be normalized by taking xᵀx = 1.
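As a quick numerical sanity check of this characterization (an added sketch, assuming NumPy; the sample size and random seed are arbitrary), the Rayleigh quotient of the part (a) matrix never falls outside the interval [λ₁, λₙ]:

```python
import numpy as np

# The 3x3 symmetric matrix from part (a).
A = np.array([[6.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 1.0]])

def rayleigh(x):
    """Rayleigh quotient x^T A x / x^T x."""
    return (x @ A @ x) / (x @ x)

# Sample the quotient at many random directions and compare its observed
# range with the extreme eigenvalues computed directly.
rng = np.random.default_rng(0)
q = [rayleigh(rng.standard_normal(3)) for _ in range(5000)]
lam = np.linalg.eigvalsh(A)            # eigenvalues in ascending order
print(lam[0], min(q), max(q), lam[-1])  # lam[0] <= quotient <= lam[-1]
```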
Repeat part a, but use a constrained optimization routine to impose this normalization constraint. What is the significance of the Lagrange multiplier in this context?

6.11 Write a routine implementing the BFGS method of Section 6.3.5 for unconstrained minimization. For the purpose of this exercise, you may refactor the resulting matrix B at each iteration, whereas in a real implementation you would update either B⁻¹ or a factorization of B to reduce the amount of work per iteration. You may use an initial value of B₀ = I, but you might also wish to include an option to compute a finite difference approximation to the Hessian of the objective function to use as the initial B₀. You may wish to include a line search to enhance the robustness of your routine.
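One possible shape for such a routine, as an added sketch rather than a reference solution (assuming NumPy; the Armijo constant, curvature safeguard, and the convex quadratic test problem are arbitrary choices), solving with B at each iteration as the problem statement permits:

```python
import numpy as np

def bfgs(f, grad, x0, tol=1e-8, maxiter=200):
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                    # B0 = I
    g = grad(x)
    for _ in range(maxiter):
        if np.linalg.norm(g) < tol:
            break
        s = np.linalg.solve(B, -g)        # quasi-Newton direction
        t = 1.0                           # backtracking (Armijo) line search
        while f(x + t * s) > f(x) + 1e-4 * t * (g @ s):
            t *= 0.5
        s = t * s
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        if y @ s > 1e-12:                 # skip update if curvature test fails
            Bs = B @ s
            B = B + np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)
        x, g = x_new, g_new
    return x

# Hypothetical test problem: minimize a convex quadratic 0.5 x^T A x - b^T x,
# whose minimizer is the solution of A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
xmin = bfgs(lambda x: 0.5 * x @ A @ x - b @ x, lambda x: A @ x - b, [5.0, -3.0])
print(xmin)
```

In a production version one would instead update a factorization of B, or maintain B⁻¹ via the Sherman-Morrison formula, rather than calling solve at every step.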
Test your routine on some of the other computer problems in this chapter, and compare its robustness and convergence rate with those of Newton's method and the method of steepest descent.

6.12 Write a routine implementing the conjugate gradient method of Section 6.3.6 for unconstrained minimization. You will need a line search routine to determine the parameter αₖ at each iteration. Try both the Fletcher-Reeves and Polak-Ribière formulas for computing βₖ₊₁ to see how much difference this makes. Test your routine on both quadratic and nonquadratic objective functions.
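For the quadratic case the line search parameter αₖ has a closed form, so the Fletcher-Reeves iteration can be sketched compactly. This is an added illustration, assuming NumPy, with an arbitrarily chosen 2×2 symmetric positive definite test matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # hypothetical SPD test matrix
b = np.array([1.0, 2.0])
grad = lambda x: A @ x - b               # gradient of 0.5 x^T A x - b^T x

x = np.zeros(2)
g = grad(x)
d = -g                                   # initial search direction
for _ in range(10):
    if np.linalg.norm(g) < 1e-12:
        break
    alpha = -(g @ d) / (d @ A @ d)       # exact line search for a quadratic
    x = x + alpha * d
    g_new = grad(x)
    beta = (g_new @ g_new) / (g @ g)     # Fletcher-Reeves formula
    d = -g_new + beta * d
    g = g_new
print(x)   # solves A x = b; two CG steps suffice for n = 2 in exact arithmetic
```

Swapping in the Polak-Ribière formula, beta = g_new @ (g_new - g) / (g @ g), is a one-line change; for nonquadratic objectives the closed-form αₖ must be replaced by a line search such as the routine of Computer Problem 6.6.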