The most common form solves the following minimization problem in continuous space [7,33]:

$$\min_x L_e\big((x, c)^T\big) = \min_x \left[ f(x) + c \left( \sum_{i=1}^{m} |h_i(x)| + \sum_{j=1}^{r} g_j(x)^+ \right) \right]. \qquad (11)$$

It has been shown that, for continuous and differentiable problems and when certain constraint-qualification conditions are satisfied, there exists $c^* > 0$ such that the $x^*$ that minimizes (11) is also a global optimal solution to the original problem [7,33]. In fact, $c$ needs to be larger than the sum of all the Lagrange multipliers at $x^*$, while the existence of the Lagrange multipliers requires the continuity and differentiability of the functions. Besides (11), there are various other formulations of exact penalty methods [4,11–13].
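As a concrete illustration of (11), the following sketch (our own toy problem, not from the paper) evaluates the $\ell_1$ exact-penalty function for a two-variable example whose Lagrange multiplier at the optimum is 1, so any $c > 1$ makes the feasible optimum the penalty minimizer:

```python
# Hypothetical toy problem illustrating the exact-penalty function (11):
#   minimize   f(x) = x0^2 + x1^2
#   subject to h(x) = x0 + x1 - 1 = 0   (equality)
#              g(x) = -x0 <= 0          (inequality)

def f(x):
    return x[0] ** 2 + x[1] ** 2

def h(x):                       # equality constraint functions h_i(x) = 0
    return [x[0] + x[1] - 1.0]

def g(x):                       # inequality constraint functions g_j(x) <= 0
    return [-x[0]]

def exact_penalty(x, c):
    """L_e((x, c)^T) = f(x) + c * (sum_i |h_i(x)| + sum_j max(0, g_j(x)))."""
    return (f(x)
            + c * (sum(abs(v) for v in h(x))
                   + sum(max(0.0, v) for v in g(x))))

# The constrained minimizer is x* = (0.5, 0.5) with f(x*) = 0.5 and
# Lagrange multiplier lambda* = 1.  For c > 1, infeasible points pay
# more in penalty than they gain in f:
print(exact_penalty([0.5, 0.5], c=2.0))   # 0.5 at the feasible optimum
print(exact_penalty([0.0, 0.0], c=2.0))   # 2.0: the penalty term dominates
```

The design point, matching the text above, is that $c$ only needs to exceed the multiplier sum at $x^*$, not grow without bound.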
However, their results are limited to continuous and differentiable functions and to global optimization. Their theoretical results were developed by relating their penalty terms to their Lagrange multipliers, whose existence requires the continuity and differentiability of the constraint functions.

In our experiments, we only evaluate our proposed methods with respect to dynamic penalty methods P3 and P4 for the following reasons.
It is impractical to implement P1 because it requires choosing some suitable penalty values a priori. The control of progress in solving P2 is difficult because it requires tuning many parameters (on the order of $m + r$) that are hard to generalize. The method based on solving P5 is also hard to generalize because it depends on choosing an appropriate sequence of violation thresholds. Reducing the thresholds quickly leads to large penalties and a search trapped at infeasible points, whereas reducing the thresholds slowly leads to slow convergence.
We do not evaluate exact penalty methods because they were developed for problems with continuous and differentiable functions.

J Glob Optim

2.2 Necessary and sufficient conditions on constrained local minimization

We first describe in this section the theory of ESPs that shows the one-to-one correspondence between a CLM$_m$ of $P_m$ and an ESP of the penalty function. We then present the partitioning of the ESP condition into multiple necessary conditions and the formulation of the corresponding subproblems. Because the results have been published earlier [27,30], we only summarize some high-level concepts without the precise formalism and their proofs.

Definition 5  For penalty vectors $\alpha \in \mathbb{R}^m$ and $\beta \in \mathbb{R}^r$, we define a penalty function of $P_m$ as:

$$L_m\big((z, \alpha, \beta)^T\big) = f(z) + \alpha^T |h(z)| + \beta^T g(z)^+ = f(z) + \sum_{i=1}^{m} \alpha_i |h_i(z)| + \sum_{j=1}^{r} \beta_j g_j(z)^+. \qquad (12)$$

Next, we informally define a constraint-qualification condition needed in the main theorem [27].
Consider a feasible point $z = (x, y)^T$ and a neighboring point $z' = (x + \vec{p},\, y)^T$ under an infinitely small perturbation along direction $\vec{p} \in X$ in the $x$ subspace. When the constraint-qualification condition is satisfied at $z$, it means that there is no $\vec{p}$ such that the rates of change of all equality and active inequality constraints between $z$ and $z'$ are zero. To see why this is necessary, assume that $f(z)$ at $z$ decreases along $\vec{p}$ and that all equality and active inequality constraints at $z$ have zero rates of change between $z$ and $z'$. In this case, it is not possible to find finite penalty values for the constraints at $z'$ that lead to a local minimum of the penalty function at $z$ with respect to $z'$. Hence, if the above scenario were true for some $\vec{p}$ at $z$, then it would not be possible to have a local minimum of the penalty function at $z$. In short, constraint qualification at $z$ requires at least one equality or active inequality constraint to have a nonzero rate of change along each direction $\vec{p}$ at $z$ in the $x$ subspace.

Theorem 1  Necessary and sufficient condition on CLM$_m$ of $P_m$ [27]. Assuming $z^* \in Z$ of $P_m$ satisfies the constraint-qualification condition, then $z^*$ is a CLM$_m$ of $P_m$ iff there exist some finite $\alpha^* \geq 0$ and $\beta^* \geq 0$ that satisfy the following extended saddle-point condition (ESPC):

$$L_m\big((z^*, \alpha, \beta)^T\big) \leq L_m\big((z^*, \alpha^{**}, \beta^{**})^T\big) \leq L_m\big((z, \alpha^{**}, \beta^{**})^T\big) \qquad (13)$$

for any $\alpha^{**} > \alpha^*$ and $\beta^{**} > \beta^*$ and for all $z \in \mathcal{N}_m(z^*)$, $\alpha \in \mathbb{R}^m$, and $\beta \in \mathbb{R}^r$.

Note that (13) can be satisfied under rather loose conditions because it is true for a range of penalty values and not for unique values. For this reason, we call $(z^*, \alpha^{**}, \beta^{**})^T$ an ESP of (12).
The theorem leads to an easy way of finding a CLM$_m$. Since an ESP is a local minimum of (12) (but not the converse), $z^*$ can be found by gradually increasing the penalties of the violated constraints in (12) and by repeatedly finding the local minima of (12) until a feasible solution to $P_m$ is obtained.
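The penalty-increase-and-reminimize loop just described can be sketched as follows. This is our own minimal illustration (function names, the geometric update schedule, and the stopping tolerance are hypothetical, not from the paper):

```python
def esp_search(f, h_list, g_list, local_min, z0,
               alpha0=1.0, growth=2.0, tol=1e-8, max_rounds=50):
    """Sketch of the ESP-based search: alternately minimize the penalty
    function (12) with a local-search routine and raise the penalties of
    the constraints that are still violated, until a feasible point
    (a candidate CLM_m) is found."""
    alpha = [alpha0] * len(h_list)      # penalties on equality constraints
    beta = [alpha0] * len(g_list)       # penalties on inequality constraints
    z = z0
    for _ in range(max_rounds):
        def L(zz):                      # penalty function (12) at current penalties
            return (f(zz)
                    + sum(a * abs(hi(zz)) for a, hi in zip(alpha, h_list))
                    + sum(b * max(0.0, gj(zz)) for b, gj in zip(beta, g_list)))
        z = local_min(L, z)             # any unconstrained local-search algorithm
        viol_h = [abs(hi(z)) for hi in h_list]
        viol_g = [max(0.0, gj(z)) for gj in g_list]
        if max(viol_h + viol_g, default=0.0) <= tol:
            return z                    # feasible: candidate CLM_m found
        # raise only the penalties of constraints that remain violated
        alpha = [a * growth if v > tol else a for a, v in zip(alpha, viol_h)]
        beta = [b * growth if v > tol else b for b, v in zip(beta, viol_g)]
    return z
```

Any unconstrained local minimizer can play the role of `local_min` here; the loop mirrors the strategy above by leaving satisfied constraints' penalties untouched.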
The search for local minima can be accomplished by any existing local-search algorithm for unconstrained optimization.

Example 1 (cont'd)  In solving (5), if we use $L_m\big((y, \alpha)^T\big) = f(y) + \alpha|y|$ and choose $\alpha^* = 1$, we have an ESP at $y^* = 0$ for any $\alpha^{**} > \alpha^*$. This establishes a local minimum of $L_m\big((y, \alpha)^T\big)$ at $y^* = 0$ with respect to $\mathcal{N}_d(y) = \{y - 1, y + 1\}$. Note that the $\alpha^*$ that satisfies Theorem 1 is only required to establish a local minimum of $L_m\big((y, \alpha)^T\big)$ at $y^* = 0$ and is, therefore, smaller than the $\alpha^*$ ($= \frac{4}{3}$) required to establish a global minimum of $L_m\big((y, \alpha)^T\big)$ in the static-penalty method.

An important feature of the ESPC in Theorem 1 is that it can be partitioned in such a way that each subproblem implementing a partitioned condition can be solved by looking for any $\alpha^{**}$ and $\beta^{**}$ that are larger than $\alpha^*$ and $\beta^*$.

Consider $P_t$, a version of $P_m$ whose constraints can be partitioned into $N$ subsets:

$$(P_t): \quad \min_z \; f(z),$$
$$\text{subject to} \quad h^{(t)}(z(t)) = 0, \quad g^{(t)}(z(t)) \leq 0 \quad \text{(local constraints)} \qquad (14)$$
$$\text{and} \quad H(z) = 0, \quad G(z) \leq 0 \quad \text{(global constraints)}.$$

Each subset of constraints can be treated as a subproblem, where Subproblem $t$, $t = 1, \ldots, N$, has local state vector $z(t) = (z_1(t), \ldots, z_{u_t}(t))^T$ of $u_t$ mixed variables, and $\bigcup_{t=1}^{N} z(t) = z$. Here, $z(t)$ includes all the variables that appear in any of the $m_t$ local equality constraint functions $h^{(t)} = (h_1^{(t)}, \ldots, h_{m_t}^{(t)})^T$ and the $r_t$ local inequality constraint functions $g^{(t)} = (g_1^{(t)}, \ldots, g_{r_t}^{(t)})^T$. Since the partitioning is by constraints, $z(1), \ldots, z(N)$ may overlap with each other. Further, $z(g)$ includes all the variables that appear in any of the $p$ global equality constraint functions $H = (H_1, \ldots, H_p)^T$ and the $q$ global inequality constraint functions $G = (G_1, \ldots, G_q)^T$.

We first define $\mathcal{N}_m(z)$, the mixed neighborhood of $z$ for $P_t$, and decompose the ESPC in (13) into a set of necessary conditions that collectively are sufficient. Each partitioned ESPC is then satisfied by finding an ESP of the corresponding subproblem, and any violated global constraints are resolved by finding some appropriate penalties.

Definition 6  $\mathcal{N}_{p_0}(z)$, the mixed neighborhood of $z$ for $P_t$ when partitioned by its constraints, is:

$$\mathcal{N}_{p_0}(z) = \bigcup_{t=1}^{N} \mathcal{N}_p^{(t)}(z) = \bigcup_{t=1}^{N} \left\{ z' \;\middle|\; z'(t) \in \mathcal{N}_{m_1}(z(t)) \text{ and } z_i' = z_i \;\; \forall\, z_i \notin z(t) \right\}, \qquad (15)$$

where $\mathcal{N}_{m_1}(z(t))$ is the mixed neighborhood of $z(t)$ (see Definition 2).

Intuitively, $\mathcal{N}_{p_0}(z)$ is separated into $N$ neighborhoods, where the $t$th neighborhood only perturbs the variables in $z(t)$ while leaving the variables in $z \setminus z(t)$ unchanged. Without showing the details, we can consider $P_t$ as a MINLP and apply Theorem 1 to derive its ESPC.
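As a minimal illustration of (15) (our own sketch; the partition, the local neighborhoods, and the variable encoding are hypothetical), the partitioned neighborhood perturbs only the variables of one subproblem at a time:

```python
def partitioned_neighborhood(z, partitions, local_neighbors):
    """Sketch of N_p0(z) in (15): the union over t of neighborhoods that
    perturb only the variables indexed by partitions[t], keeping every
    other component of z fixed.

    z               : tuple of variable values
    partitions      : list of index tuples, one per subproblem t
    local_neighbors : maps a sub-vector z(t) to its neighbors in N_m1(z(t))
    """
    for idx in partitions:                   # subproblem t
        z_t = tuple(z[i] for i in idx)       # local state vector z(t)
        for nb in local_neighbors(z_t):      # nb is in N_m1(z(t))
            z_new = list(z)
            for i, v in zip(idx, nb):        # overwrite only the z(t) entries
                z_new[i] = v
            yield tuple(z_new)               # z_i unchanged for z_i not in z(t)

# Example: integer variables with +/-1 moves on each local component.
def local_neighbors(z_t):
    for i in range(len(z_t)):
        for d in (-1, 1):
            yield z_t[:i] + (z_t[i] + d,) + z_t[i + 1:]

z = (0, 0, 0)
# Overlapping partition by constraints: z(1) = (z0, z1), z(2) = (z1, z2).
print(sorted(set(partitioned_neighborhood(z, [(0, 1), (1, 2)], local_neighbors))))
# 6 distinct neighbors, each differing from z in one local variable
```

Note that because the partition is by constraints, the two index sets overlap in $z_1$, yet each generated neighbor still changes variables of only one subproblem.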
We then decompose the ESPC into $N$ necessary conditions, one for each subproblem, and an overall necessary condition on the global constraints across the subproblems. We first define the penalty function for Subproblem $t$.

Definition 7  Let $\Phi\big((z, \gamma, \eta)^T\big) = \gamma^T |H(z)| + \eta^T G(z)^+$ be the sum of the transformed global constraint functions weighted by their penalties, where $\gamma = (\gamma_1, \ldots, \gamma_p)^T \in \mathbb{R}^p$ and $\eta = (\eta_1, \ldots, \eta_q)^T \in \mathbb{R}^q$ are the penalty vectors for the global constraints. Then the penalty function for $P_t$ in (14) and the corresponding penalty function in Subproblem $t$ are defined as follows:

$$L_m\big((z, \alpha, \beta, \gamma, \eta)^T\big) = f(z) + \sum_{t=1}^{N} \left[ \alpha(t)^T |h^{(t)}(z(t))| + \beta(t)^T g^{(t)}(z(t))^+ \right] + \Phi\big((z, \gamma, \eta)^T\big), \qquad (16)$$

$$\Gamma_m\big((z, \alpha(t), \beta(t), \gamma, \eta)^T\big) = f(z) + \alpha(t)^T |h^{(t)}(z(t))| + \beta(t)^T g^{(t)}(z(t))^+ + \Phi\big((z, \gamma, \eta)^T\big), \qquad (17)$$

where $\alpha(t) = (\alpha_1(t), \ldots, \alpha_{m_t}(t))^T \in \mathbb{R}^{m_t}$ and $\beta(t) = (\beta_1(t), \ldots, \beta_{r_t}(t))^T \in \mathbb{R}^{r_t}$ are the penalty vectors for the local constraints in Subproblem $t$.

Theorem 2  Partitioned necessary and sufficient ESPC on CLM$_m$ of $P_t$ [27]. Given $\mathcal{N}_{p_0}(z)$, the ESPC in (13) can be rewritten into $N + 1$ necessary conditions that, collectively, are sufficient:

$$\Gamma_m\big((z^*, \alpha(t), \beta(t), \gamma^{**}, \eta^{**})^T\big) \leq \Gamma_m\big((z^*, \alpha(t)^{**}, \beta(t)^{**}, \gamma^{**}, \eta^{**})^T\big) \leq \Gamma_m\big((z, \alpha(t)^{**}, \beta(t)^{**}, \gamma^{**}, \eta^{**})^T\big), \qquad (18)$$

$$L_m\big((z^*, \alpha^{**}, \beta^{**}, \gamma, \eta)^T\big) \leq L_m\big((z^*, \alpha^{**}, \beta^{**}, \gamma^{**}, \eta^{**})^T\big) \qquad (19)$$

for any $\alpha(t)^{**} > \alpha(t)^* \geq 0$, $\beta(t)^{**} > \beta(t)^* \geq 0$, $\gamma^{**} > \gamma^* \geq 0$, and $\eta^{**} > \eta^* \geq 0$, and for all $z \in \mathcal{N}_p^{(t)}(z^*)$, $\alpha(t) \in \mathbb{R}^{m_t}$, $\beta(t) \in \mathbb{R}^{r_t}$, $\gamma \in \mathbb{R}^p$, $\eta \in \mathbb{R}^q$, and $t = 1, \ldots, N$.

Theorem 2 shows that the original ESPC in Theorem 1 can be partitioned into $N$ necessary conditions in (18) and an overall necessary condition in (19) on the global constraints across the subproblems. Because finding an ESP of each partitioned condition is equivalent to solving a MINLP, we can reformulate the ESP search of the $t$th condition as the solution of the following optimization problem:

$$\big(P_t^{(t)}\big): \quad \min_{z(t)} \; f(z) + \gamma^T |H(z)| + \eta^T G(z)^+ \qquad (20)$$
$$\text{subject to} \quad h^{(t)}(z(t)) = 0 \quad \text{and} \quad g^{(t)}(z(t)) \leq 0.$$

The weighted sum of the global constraint functions in the objective of (20) is important because it leads to points that minimize the violations of the global constraints. When $\gamma^T$ and $\eta^T$ are large enough, solving $P_t^{(t)}$ will lead to points, if they exist, that satisfy the global constraints. Note that $P_t^{(t)}$ is very similar to the original problem and can be solved by the same solver as the original problem, with some modifications to the objective function being optimized.

In summary, we have shown in this section that the search for a CLM$_m$ of $P_m$ is equivalent to finding an ESP of the corresponding penalty function, and that this necessary and sufficient condition can be partitioned into multiple necessary conditions.
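Each subproblem in (20) keeps only its local constraints and folds the weighted global-constraint violations into the objective. The following is a minimal sketch of that objective construction (the helper name and problem data are ours, not the paper's):

```python
def subproblem_objective(f, H_list, G_list, gamma, eta):
    """Objective of (20): f(z) + gamma^T |H(z)| + eta^T G(z)^+.
    The local constraints h^(t) and g^(t) remain ordinary constraints of
    the subproblem and are deliberately not included here."""
    def obj(z):
        return (f(z)
                + sum(g * abs(Hk(z)) for g, Hk in zip(gamma, H_list))
                + sum(e * max(0.0, Gk(z)) for e, Gk in zip(eta, G_list)))
    return obj

# Example: one global equality H(z) = z0 - z1 coupling two subproblems.
obj = subproblem_objective(lambda z: z[0] ** 2 + z[1] ** 2,
                           [lambda z: z[0] - z[1]], [],
                           gamma=[10.0], eta=[])
print(obj((1.0, 0.0)))   # 1.0 + 10 * |1 - 0| = 11.0
```

Because only the objective changes, the same solver used for the original problem can be handed `obj` together with the local constraints of Subproblem $t$, exactly as the text notes.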