Chen dissertation (1121212), page 11
dynamic penalty methods will require c∗∗ > c∗ = 1005 in order to have a global minimum of Ls(x, c∗∗) at x∗ = 5 for −1000 ≤ x ≤ 1000. In contrast, it suffices to have α∗∗ > α∗ = 10 in order to have a local minimum of Lc(x, α∗∗) = −x² + α∗∗|x − 5| at x∗ = 5, irrespective of the range of x. Figure 3.1 illustrates that Lc(x, α∗∗) has a local minimum around x∗ = 5 when α∗∗ = 20 but not when α∗∗ = 10. A small α∗∗ leads to a less rugged Lc(x, α∗∗) function, which makes it easier for global search algorithms to locate local minima.

3.1.2  ESPC for discrete optimization

Next, we present the ESPC of discrete nonlinear programming (DNLP) problems:

    (Pd):  min f(y),  y = (y1, . . . , yw)ᵀ ∈ Dʷ,                                   (3.11)
           subject to h(y) = 0 and g(y) ≤ 0.

This part was developed by Wah and Wu in 1999 [91, 98]. We sketch the results here; the complete proofs can be found in [98].

The goal of solving Pd is to find a constrained local minimum y∗ with respect to N(y∗), the discrete neighborhood of y∗. Since the discrete neighborhood of a point is not well defined in the literature, it is up to the user to define the concept. Intuitively, N(y) represents points that are perturbed from y, with no requirement that there be valid state transitions from y.

Definition 3.4  Discrete neighborhood N(y) [1] of y ∈ Dʷ in discrete space is a finite user-defined set of points {y′ ∈ Dʷ} such that y′ is reachable from y in one step, that y′ ∈ N(y) ⇐⇒ y ∈ N(y′), and that it is possible to reach every y′ from any y in one or more steps through neighboring points.

Definition 3.5  Point y∗ is a CLMd, a constrained local minimum of Pd with respect to points in N(y∗), if y∗ is feasible and f(y∗) ≤ f(y) for all feasible y ∈ N(y∗).

There are two distinct features of CLMd. First, the set of CLMd of Pd is neighborhood dependent: a point may be a CLMd under one definition of neighborhood but not under another. However, all CLMd's are guaranteed to be feasible, even in the extreme case in which the neighborhood of each point includes only itself. The fact that CLMd's are neighborhood dependent is not critical in constrained searches, because our goal is to find feasible solutions that are better than their neighboring points. As long as a consistent neighborhood is used throughout a search, a CLMd found will be a local minimum with respect to its neighborhood. Second, a discrete neighborhood has a finite number of points. Hence, verifying that a point is a CLMd can be done by comparing its objective value against those of its finite number of neighbors. This feature allows the search for a descent direction in discrete space to be done by enumeration or by a greedy search.
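Because a discrete neighborhood is finite, the verification just described can be written down directly. The sketch below checks Definition 3.5 by enumeration on a toy integer problem; the instance and the helper names (`is_feasible`, `is_clmd`) are illustrative assumptions, not from the thesis.

```python
# Verify a CLMd by enumerating the finite discrete neighborhood N(y).
# The problem instance and all helper names are illustrative only.

def is_feasible(h, g, y, tol=1e-9):
    """Feasibility for Pd: h(y) = 0 and g(y) <= 0, componentwise."""
    return all(abs(v) <= tol for v in h(y)) and all(v <= tol for v in g(y))

def is_clmd(f, h, g, y, neighborhood):
    """Definition 3.5: y is feasible and no feasible neighbor is better."""
    if not is_feasible(h, g, y):
        return False
    return all(f(y) <= f(yp) for yp in neighborhood(y) if is_feasible(h, g, yp))

# Toy instance: minimize f(y) = (y - 3)^2 over the integers, subject to
# y >= 1 (written as g(y) = 1 - y <= 0), with N(y) = {y - 1, y + 1}.
f = lambda y: (y - 3) ** 2
h = lambda y: []          # no equality constraints
g = lambda y: [1 - y]     # y >= 1
N = lambda y: [y - 1, y + 1]

print(is_clmd(f, h, g, 3, N))   # True: feasible, no better feasible neighbor
print(is_clmd(f, h, g, 2, N))   # False: neighbor y = 3 is feasible and better
```

Since N(y) is finite, the check always terminates; the same enumeration underlies the greedy descent mentioned above.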
Definition 3.6  The ℓ1-penalty function for Pd is defined as follows:

    Ld(y, α, β) = f(y) + αᵀ|h(y)| + βᵀ max(0, g(y)),  where α ∈ Rᵐ and β ∈ Rʳ.        (3.12)

Theorem 3.2  Necessary and sufficient ESPC on CLMd of Pd [98, 91]. Suppose y∗ ∈ Dʷ is a point in the discrete search space of Pd. Then y∗ is a CLMd of Pd if and only if there exist finite α∗ ≥ 0 and β∗ ≥ 0 such that the following is satisfied for all y ∈ N(y∗), α ∈ Rᵐ, and β ∈ Rʳ:

    Ld(y∗, α, β) ≤ Ld(y∗, α∗∗, β∗∗) ≤ Ld(y, α∗∗, β∗∗),  where α∗∗ > α∗ ≥ 0 and β∗∗ > β∗ ≥ 0.        (3.13)

Proof. The original proof of this theorem is in Zhe Wu's doctorate dissertation [98]. We sketch the idea here. The proof consists of two parts.

"⇒" part: Given y∗, we need to prove that there exist finite α∗∗ > α∗ ≥ 0 and β∗∗ > β∗ ≥ 0 that satisfy (3.13). In order for α∗ and β∗ to exist for every CLMd y∗, α∗ and β∗ must be bounded and be found in finite time. Given y∗, consider all y ∈ N(y∗), and let the initial α∗ = β∗ = 0. For every y such that |h(y)| > 0 (resp. max(0, g(y)) > 0), there is at least one constraint that is not satisfied. For each such constraint, we update its penalty to make it large enough to offset the possible improvement in the objective value. This update is repeated for every violated constraint of Pd and every y ∈ N(y∗) until no further update is possible. Since N(y∗) has a finite number of elements in discrete space, the update will terminate in finite time and result in finite α∗ and β∗ values that satisfy (3.13).

"⇐" part: Assuming (3.13) is satisfied, we need to prove that y∗ is a CLMd. The proof is straightforward and is similar to that in the proof of Theorem 3.1.

Note that the constraint-qualification condition in Theorem 3.1 is not needed in Theorem 3.2, because constraint functions do not change continuously in discrete problems.

3.1.3  ESPC for mixed optimization

Last, we present the ESPC for MINLP problems defined in (1.1). The goal of solving Pm is to find a constrained local minimum (x∗, y∗) with respect to Nm(x∗, y∗), the mixed neighborhood of (x∗, y∗). In this thesis, we construct our mixed neighborhood as the union of points perturbed in either the discrete or the continuous subspace, but not both. Such a definition allows the theory for the two subspaces to be developed separately. Because a discrete neighborhood is user-defined and a mixed neighborhood is a union of discrete and continuous neighborhoods, a mixed neighborhood is also a user-defined concept.

Definition 3.7  Mixed neighborhood Nm(x, y) of (x, y) ∈ Rᵛ × Dʷ in mixed space is made up of the union of the continuous neighborhood and the user-defined discrete neighborhood:

    Nm(x, y) = Nc(x)|y ∪ N(y)|x = {(x′, y) | x′ ∈ Nc(x)} ∪ {(x, y′) | y′ ∈ N(y)}.        (3.14)
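As a concrete illustration of Definition 3.7, the sketch below builds a mixed neighborhood as the union of points perturbed in one subspace at a time. Since Nc(x) is a continuous set, a finite sample stands in for it here; the sampling scheme, toy neighborhoods, and all names are illustrative assumptions, not from the thesis.

```python
# Sketch of the mixed neighborhood of Definition 3.7: Nm(x, y) is the union
# of points perturbed in the continuous subspace only and in the discrete
# subspace only, never both.  A finite sample stands in for Nc(x).

def mixed_neighborhood(x, y, continuous_samples, discrete_neighbors):
    """Nm(x, y) = {(x', y) | x' in Nc(x)} U {(x, y') | y' in N(y)}."""
    from_continuous = [(xp, y) for xp in continuous_samples(x)]
    from_discrete = [(x, yp) for yp in discrete_neighbors(y)]
    return from_continuous + from_discrete

# Toy neighborhoods: small continuous steps in x, unit steps in integer y.
Nc = lambda x: [x - 0.1, x + 0.1]
N = lambda y: [y - 1, y + 1]

print(mixed_neighborhood(1.0, 5, Nc, N))
# [(0.9, 5), (1.1, 5), (1.0, 4), (1.0, 6)]
```

No element of the result perturbs x and y simultaneously, which is exactly what lets the theory for the two subspaces be developed separately.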
Definition 3.8  Point (x∗, y∗) is a CLMm, a constrained local minimum of Pm with respect to points in Nm(x∗, y∗), if (x∗, y∗) is feasible and f(x∗, y∗) ≤ f(x, y) for all feasible (x, y) ∈ Nm(x∗, y∗).

Definition 3.9  The ℓ1-penalty function of Pm is defined as follows:

    Lm(x, y, α, β) = f(x, y) + αᵀ|h(x, y)| + βᵀ max(0, g(x, y)),  where α ∈ Rᵐ and β ∈ Rʳ.        (3.15)

Theorem 3.3  Necessary and sufficient ESPC on CLMm of Pm. Suppose (x∗, y∗) ∈ Rᵛ × Dʷ is a point in the mixed search space of Pm, and x∗ satisfies the constraint-qualification condition in (3.3) for the given y∗. Then (x∗, y∗) is a CLMm of Pm if and only if there exist finite α∗ ≥ 0 and β∗ ≥ 0 such that the following condition is satisfied for all (x, y) ∈ Nm(x∗, y∗) and all α ∈ Rᵐ and β ∈ Rʳ:

    Lm(x∗, y∗, α, β) ≤ Lm(x∗, y∗, α∗∗, β∗∗) ≤ Lm(x, y, α∗∗, β∗∗),  where α∗∗ > α∗ ≥ 0 and β∗∗ > β∗ ≥ 0.        (3.16)

Proof. The proof consists of two parts.

"⇒" part: Given (x∗, y∗), we need to prove that there exist finite α∗∗ ≥ 0 and β∗∗ ≥ 0 so that (x∗, y∗, α∗∗, β∗∗) satisfy (3.16). The first inequality in (3.16) is true for all α and β, since (x∗, y∗) is a CLMm and |h(x∗, y∗)| = max(0, g(x∗, y∗)) = 0.

To prove the second inequality in (3.16), we know that fixing y at y∗ converts Pm into Pc. Further, from Theorem 3.1, there exist finite αc∗ and βc∗ such that:

    Lm(x∗, y∗, α∗∗, β∗∗) ≤ Lm(x, y∗, α∗∗, β∗∗),  ∀ x ∈ Nc(x∗)|y∗,  where α∗∗ > αc∗ ≥ 0 and β∗∗ > βc∗ ≥ 0.        (3.17)

Similarly, fixing x at x∗ converts Pm into Pd. Hence, from Theorem 3.2, we know that there exist finite αd∗ and βd∗ such that for the same α∗∗ and β∗∗ in (3.17):

    Lm(x∗, y∗, α∗∗, β∗∗) ≤ Lm(x∗, y, α∗∗, β∗∗),  ∀ y ∈ N(y∗)|x∗,  where α∗∗ > αd∗ ≥ 0 and β∗∗ > βd∗ ≥ 0.        (3.18)

Since all (x, y) ∈ Nm(x∗, y∗) perturb either x∗ or y∗ but not both, by setting:

    α∗ = max(αc∗, αd∗) = [ max(αc1∗, αd1∗), . . . , max(αcm∗, αdm∗) ]ᵀ        (3.19)
    β∗ = max(βc∗, βd∗) = [ max(βc1∗, βd1∗), . . . , max(βcr∗, βdr∗) ]ᵀ,        (3.20)

we conclude, based on (3.17) and (3.18), that the second inequality in (3.16) is satisfied for all (x, y) ∈ Nm(x∗, y∗) and any α∗∗ > α∗ ≥ 0 and β∗∗ > β∗ ≥ 0.

"⇐" part: Assuming (3.16) is satisfied, we need to prove that (x∗, y∗) is a CLMm. The proof is straightforward and is similar to that in the proof of Theorem 3.1.

The following theorem facilitates the search of points that satisfy (3.16) by partitioning the condition into two independent necessary conditions. It follows directly from (3.14), which defines Nm(x, y) to be the union of points perturbed in either the discrete or the continuous subspace. Such partitioning cannot be accomplished if a mixed neighborhood like Nc(x) × N(y) were used.

Theorem 3.4  Given the definition of Nm(x, y) in (3.14), the ESPC in (3.16) can be rewritten into two necessary conditions that, collectively, are sufficient:

    Lm(x∗, y∗, α, β) ≤ Lm(x∗, y∗, α∗∗, β∗∗) ≤ Lm(x∗, y, α∗∗, β∗∗),  where y ∈ N(y∗)|x∗,        (3.21)
    Lm(x∗, y∗, α∗∗, β∗∗) ≤ Lm(x, y∗, α∗∗, β∗∗),  where x ∈ Nc(x∗)|y∗.        (3.22)
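Theorem 3.4 suggests a simple search pattern: with the penalty multipliers held fixed, probe the discrete neighborhood while x is fixed, and take small continuous steps while y is fixed, descending on the ℓ1-penalty function. The toy problem, step size, and all names below are illustrative assumptions, not an algorithm from the thesis.

```python
# Alternate descent on Lm in the two subspaces of Nm, in the spirit of
# Theorem 3.4.  Toy problem: minimize (x - 1.2)^2 + (y - 3)^2 over x in R,
# y in Z, subject to the single equality constraint h(x, y) = x - y = 0.

def Lm(x, y, alpha):
    f = (x - 1.2) ** 2 + (y - 3) ** 2
    h = abs(x - y)                      # |h(x, y)|
    return f + alpha * h                # l1-penalty function

def descend(x, y, alpha=10.0, dx=0.01, iters=2000):
    for _ in range(iters):
        # Discrete probe: best of N(y) = {y - 1, y + 1} (or stay), x fixed.
        y = min((y - 1, y, y + 1), key=lambda yp: Lm(x, yp, alpha))
        # Continuous probe: a small step left or right, y fixed.
        x = min((x - dx, x, x + dx), key=lambda xp: Lm(xp, y, alpha))
    return x, y

x, y = descend(0.5, 2)
print(round(x, 2), y)   # → 1.0 1
```

Starting from the infeasible point (0.5, 2), the probes settle at approximately (1.0, 1): a feasible point with no better feasible mixed neighbor, i.e., a CLMm of the toy problem. Neither probe ever perturbs both subspaces at once, mirroring the structure of (3.21) and (3.22).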
In summary, we have presented in this section a set of necessary and sufficient conditions that govern all constrained local minima in nonlinear continuous, discrete, and mixed optimization problems. In contrast to general penalty approaches, α∗∗ and β∗∗ always exist in ESPC for any constrained local minimum, provided that the constraint-qualification condition is satisfied in the continuous subspace. The similarity of these three conditions allows problems in these three classes to be solved in a unified fashion. In contrast to methods that require expensive global optimization, ESPC provides a condition for locating constrained local optima, which leads to much lower complexity. Unlike the KKT condition, which works only for continuous and differentiable problems, ESPC offers a uniform treatment of problems defined in continuous, discrete, and mixed spaces, and does not require the functions to be differentiable or in closed form. Moreover, the unique Lagrange-multiplier values in KKT are typically found by solving a system of nonlinear equations iteratively. The ℓ1-penalty function is different from the traditional Lagrangian function and the