locatelli (Lectures in various formats)
File description
The file "locatelli" is located in the following folders inside the archive: Lectures in various formats, Lecture 9, convergence. It is a PDF file from the archive "Lectures in various formats". It belongs to the course "Optimization algorithms based on trial and error" (5th semester), which can be found in the file archive of Lomonosov Moscow State University. Although this archive is directly associated with Lomonosov Moscow State University, it can also be found in other sections.
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 104, No. 1, pp. 121–133, JANUARY 2000

Simulated Annealing Algorithms for Continuous Global Optimization: Convergence Conditions¹

M. LOCATELLI²

Communicated by G. Di Pillo

¹ This research was partially supported by Project MOST. The author thanks two anonymous referees for their helpful suggestions.
² Postdoctoral Fellow, Università di Firenze, Dipartimento di Sistemi ed Informatica, Firenze, Italy.

0022-3239/00/0100-0121$18.00/0 © 2000 Plenum Publishing Corporation

Abstract. In this paper, simulated annealing algorithms for continuous global optimization are considered. After a review of recent convergence results from the literature, a class of algorithms is presented for which strong convergence results can be proved without introducing assumptions which are too restrictive. The main idea of the paper is that of relating both the temperature value and the support dimension of the next candidate point, so that they are small at points with function value close to the current record and bounded away from zero otherwise.

Key Words. Global optimization, simulated annealing, convergence conditions.

1. Introduction

Let $f$ be a continuous objective function defined over a compact feasible set $X$. We face the problem of determining

$$f^* = \min_{x \in X} f(x), \qquad X \subseteq \mathbb{R}^d,$$

called a global optimization problem. In particular, we consider the use of simulated annealing algorithms for the solution of this problem. Simulated annealing algorithms have received a great deal of attention in recent years. The name "simulated annealing" comes from a physical process called annealing, the process for growing crystals, which can be simulated by the Metropolis Monte Carlo method (see Ref.
1). It was applied first to combinatorial global optimization independently in Refs. 2 and 3, and later to continuous global optimization. In the field of continuous global optimization, remarkable works from the point of view of applications are Refs. 4–10. The general structure of a simulated annealing algorithm is the following.

Step 1. Let $Y_0 \in X$ be a given starting point, let $z_0 = \{(Y_0, f(Y_0))\}$, $f_0^* = f(Y_0)$, and $k = 0$.

Step 2. Sample a point $X_{k+1}$ from a distribution $D(\cdot\,; z_k)$.

Step 3. Sample a uniform random number $p$ in $[0, 1]$ and set

$$Y_{k+1} = \begin{cases} X_{k+1}, & \text{if } p \le A(Y_k, X_{k+1}, t_k), \\ Y_k, & \text{otherwise}, \end{cases}$$

where $A$ is a function with values in $[0, 1]$ and $t_k$ is a parameter called the temperature at iteration $k$.

Step 4. Set $z_{k+1} = z_k \cup \{(X_{k+1}, f(X_{k+1}))\}$; the set $z_k$ contains the information collected by the algorithm up to iteration $k$, i.e., the set of points at which the function has been evaluated together with the corresponding function values.

Step 5. Set $t_{k+1} = U(z_{k+1})$, where $U$ is a function with nonnegative values.

Step 6. Set $f_{k+1}^* = \min\{f_k^*, f(X_{k+1})\}$, where $f_k^*$ is the record value at iteration $k$, i.e., the best observed value up to iteration $k$.

Step 7. Check a stopping criterion; if it fails, set $k := k + 1$ and go back to Step 2.

A particular simulated annealing algorithm is specified by choosing the stopping criterion and the functions $D$, $A$, $U$, which define respectively the distribution of the next candidate point, the probability of accepting it as the next iterate, and the cooling schedule, i.e., the temperature. The temperature is the parameter that controls the acceptance of the candidate points, through the dependency of $A$ on it.
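Steps 1 to 7 can be sketched in Python as a loop that takes D, A and U as callables. This is only an illustrative sketch, not the author's implementation: the function and parameter names, the fixed iteration budget used as stopping criterion, the choice to pass the current point to the sampler alongside the history, and the test problem (a quadratic on [0, 1] with uniform sampling, the Metropolis rule as one possible choice of A, and a cooling function equal to the inverse of the number of evaluations) are all assumptions made for the example.

```python
import math
import random


def simulated_annealing(f, y0, t0, sample, accept, cool, n_iter, rng):
    """Run Steps 1-7 with a fixed iteration budget as the stopping criterion."""
    y, fy = y0, f(y0)            # Step 1: starting point Y_0
    history = [(y, fy)]          # z_0 = {(Y_0, f(Y_0))}
    record = fy                  # f_0^* = f(Y_0)
    t = t0                       # initial temperature t_0
    for _ in range(n_iter):      # Step 7: stop after n_iter iterations (assumption)
        x = sample(history, y, rng)      # Step 2: X_{k+1} drawn from D(.; z_k)
        fx = f(x)
        p = rng.random()                 # Step 3: uniform random number p
        if p <= accept(fy, fx, t):
            y, fy = x, fx                # candidate accepted: Y_{k+1} = X_{k+1}
        history.append((x, fx))          # Step 4: z_{k+1} = z_k plus the new pair
        t = cool(history)                # Step 5: t_{k+1} = U(z_{k+1})
        record = min(record, fx)         # Step 6: update the record value
    return y, record, history


def metropolis(fy, fx, t):
    """Metropolis acceptance probability; for t <= 0 only descent steps pass."""
    if fx <= fy:
        return 1.0
    return math.exp((fy - fx) / t) if t > 0 else 0.0


# Hypothetical test problem: f(x) = (x - 0.3)^2 on X = [0, 1].
def objective(x):
    return (x - 0.3) ** 2


def uniform_on_unit_interval(history, y, rng):
    return rng.random()          # ignores z_k and Y_k: every step is global


def inverse_count_cooling(history):
    return 1.0 / len(history)    # tends to 0 as evaluations accumulate


y_final, record, history = simulated_annealing(
    objective, 0.9, 1.0, uniform_on_unit_interval, metropolis,
    inverse_count_cooling, 500, random.Random(0))
```

Keeping D, A and U as separate arguments mirrors the way the text specifies a particular algorithm: swapping the sampler or the cooling function changes the algorithm without touching the loop.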
In what follows, the acceptance probability is the so-called Metropolis function,

$$A(Y_k, X_{k+1}, t_k) = \min\{1, \exp((f(Y_k) - f(X_{k+1}))/t_k)\}, \tag{1}$$

which always accepts descent steps and accepts ascent steps with a positive probability (unless $t_k = 0$) in order to avoid getting trapped in a local minimum which is not a global minimum.
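As a small numerical illustration of (1), the following Python sketch evaluates the Metropolis function for an ascent step of size 1 at a few temperatures. The function name, the argument names and the convention used for $t_k = 0$ (ascent steps are rejected) are assumptions made for this example.

```python
import math


def metropolis_acceptance(f_current, f_candidate, t):
    """Acceptance probability (1): min{1, exp((f(Y_k) - f(X_{k+1})) / t_k)}."""
    if f_candidate <= f_current:
        return 1.0               # descent (or equal) step: always accepted
    if t <= 0.0:
        return 0.0               # assumed convention for t_k = 0: reject ascent steps
    return math.exp((f_current - f_candidate) / t)


# An ascent step of size 1 evaluated at decreasing temperatures.
ascent_probabilities = [metropolis_acceptance(0.0, 1.0, t) for t in (10.0, 1.0, 0.1)]
```

For an ascent of size 1 the probability is exp(-1/t), so the three values are roughly 0.905, 0.368 and 0.000045.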
The parameter $t_k$ controls the acceptance of the ascent steps: decreasing $t_k$ also decreases their acceptance probability.

An interesting theoretical issue is to establish conditions under which we can guarantee convergence in probability to the global optimum of a simulated annealing algorithm for continuous global optimization. This problem is addressed in the following sections.
In Section 2, recent results from the literature are presented. In Section 3, the assumptions under which new results can be derived are introduced. Finally, in Section 4, the new convergence results are presented. The proofs of the convergence results are rather technical and have been omitted in the final version of the paper for reasons of space; they can be found in detail in Ref.
11.

2. Convergence Conditions in the Literature

In this section, we briefly introduce some recent convergence results for simulated annealing algorithms for continuous global optimization. First, we recall the convergence result presented in Ref. 12. For any $\varepsilon > 0$, let

$$B_\varepsilon = \{x \in X : f(x) \le f^* + \varepsilon\}.$$

In Ref. 12, the following assumptions are introduced:

(A1) $D(\cdot\,; z_k)$ has the form $D(Y_k, \cdot)$, where $D(\cdot, \cdot)$ is a Markov kernel, absolutely continuous with respect to the Lebesgue measure $m$ and with density $\beta$ bounded away from zero, i.e.,

$$D(x, B) = \int_B \beta(x, y)\, dy \quad \forall B \in \mathcal{B}, \qquad \text{and} \qquad \inf_{x, y \in X} \beta(x, y) = \rho > 0, \tag{2}$$

where $\mathcal{B}$ is a σ-algebra over $X$.

(A2) For any open $G \subseteq X$, $D(x, G)$ is continuous in $x$.

(A3) For any initial state $Y_0$ and any initial temperature $t_0$,

$$t_k = U(z_k) \to 0, \tag{3}$$

with probability 1.

(A4) $X$ is a compact set, $f$ is a continuous function, and $m(B_\varepsilon) > 0$, $\forall \varepsilon > 0$.

Under the above assumptions, it has been proved that

$$\lim_{k \to \infty} P[Y_k \in B_\varepsilon] = 1, \quad \forall \varepsilon > 0.$$

Note that (3) is much less restrictive than the usual conditions for the temperature in the combinatorial case, where it is required that the temperature decreases to zero not faster than the inverse of the logarithm of the iteration counter $k$ times a constant (see Ref.
13). The reason for this is that, while in combinatorial optimization we perform only local steps (i.e., steps in the neighborhood of the current point), here (2) says that every step is a global one. The negative consequence of this fact is that, for any $A \subseteq X \setminus B_\varepsilon$, we have that

$$P[X_k \in A] \ge \rho\, m(A), \quad \forall k.$$

Therefore, at any iteration, there is a probability bounded away from zero of sampling points in regions far from the global optimum region. Instead, we would like to be able to perform steps which are only local in order to explore more deeply the most promising parts of the feasible region.

Another interesting convergence result has been given in Ref.
14. In that paper, a class of simulated annealing algorithms is presented for which convergence to the global optimum not only of the sequence $\{Y_k\}$ but also of the sequence of candidate points $\{X_k\}$ is guaranteed, so that asymptotically we have a local exploration in the region of the global optimum. On the other hand, the conditions under which these results are proved are restrictive. In particular, it is required that:

(B1) $f \in C^2$.

(B2) The distribution $D$ of the next candidate point is Gaussian.

In this paper, we introduce a class of algorithms for which convergence to the global optimum of the sequence of candidate points $\{X_k\}$ is guaranteed (as in Ref.
14), by introducing assumptions which are not too restrictive (as in Ref. 12). A first step in this direction has been made in Ref. 15. In that paper, a class of algorithms has been introduced for which the support of $D(\cdot\,; z_k)$ is not the whole feasible set but its intersection with the sphere $S(Y_k, R)$ around the current point ($R$ is a constant value). The following cooling schedule has been introduced:

$$t_k \ge \delta_1 > 0, \quad \text{if } f(Y_k) - f_k^* > \bar\varepsilon, \tag{4a}$$

$$t_k = c_k, \quad \text{otherwise}, \tag{4b}$$

where $\{c_k\}$ is a deterministic nonincreasing sequence converging to 0 and $\bar\varepsilon > 0$ is a constant such that

$$m(S(x, R) \cap B_\varepsilon) > 0, \quad \forall x \in B_{2\bar\varepsilon},\ \forall \varepsilon > 0. \tag{5}$$

Basically, condition (5) requires that, once we are in the set $B_{2\bar\varepsilon}$, we can reach the set $B_\varepsilon$ in one single step for any $\varepsilon > 0$. The meaning of (4) is that, if we are in a point whose function value is poor with respect to the current record, then we guarantee a probability bounded away from 0 of accepting ascent steps, while if we are close to the record, we decrease the probability of accepting ascent steps. Under suitable mild assumptions, the following theorem has been proved.

Theorem 2.1. Let $N$ be a sufficiently large integer and let

$$\Delta F = \max_{x \in X \setminus B_{\bar\varepsilon}} \; \max_{y \in S(x, R) \cap X} [f(y) - f(x)].$$

Under the cooling schedule (4), if

$$c_k \ge (1 + \mu)\,\frac{N \Delta F}{\log k}, \quad \mu > 0,$$

then

$$\lim_{k \to \infty} P[Y_k \in B_\varepsilon] = 1, \quad \forall \varepsilon > 0.$$

Proof. See Theorem 3 in Ref.
15. □

The integer $N$ depends on $R$ and the shape of $X$. If $X$ is a convex set, $N$ can be chosen approximately equal to the integer part of the ratio between the diameter of $X$ and $R$. If $\bar\varepsilon$ does not satisfy (5), it is still possible to prove that

$$\lim_{k \to \infty} P[Y_k \in B_{\bar\varepsilon}] = 1.$$

From Theorem 2.1, it follows that

$$\lim_{k \to \infty} P[d(X_k, B_\varepsilon) \le R] = 1, \quad \forall \varepsilon > 0.$$

Therefore, asymptotically, we cannot sample too far from the region of the global optima, i.e., at a distance greater than $R$. The natural development at this point is to try to see whether it is possible to obtain that

$$\lim_{k \to \infty} P[X_k \in B_\varepsilon] = 1, \quad \forall \varepsilon > 0.$$

We will consider this in the following sections.

3.