OPTIMAL CONVERGENCE SPEED EXPONENT FOR G.S.A
ALAIN TROUVÉ
Finally, $\tilde f_g - \tilde f_{g'} = f_g - f_{g'}$ for $g \in G(\{i\})$ and $g' \in G(\{i'\})$. We deduce from the stability under addition of the elements of $B$ that there exist $i_0 \in E$, $g_0 \in G(\{i_0\})$ and $\beta_0 \ge 0$ such that for all $i \in E$ and all $g \in G(\{i\})$, the functions $\hat f_g(\beta) = \tilde f_g(\beta) - \tilde f_{g_0}(\beta)\,1_{\beta \ge \beta_0}$ satisfy
$$\begin{cases} \hat f_g(\beta) \le 0 & \text{if } \beta \ge \beta_0,\\ \hat f_{g_0}(\beta) = 0 & \text{if } \beta \ge \beta_0.\end{cases}$$
Hence, $\mu_\beta(i) = \hat a_\beta(i)/\hat Z_\beta$ where
$$\hat a_\beta(i) = \sum_{g \in G(\{i\})} A_g \exp\Big(\int_0^\beta \hat f_g\Big) \quad\text{and}\quad \hat Z_\beta = \sum_{i \in E} \hat a_\beta(i).$$
The essential fact here is that the $\hat a_\beta(i)$ are decreasing in $\beta$ and that
$$\hat a_\beta(i_0) \ge A_{g_0} \exp\Big(\int_0^{\beta_0} \hat f_{g_0}\Big) > 0 \quad\text{for all } \beta.$$
Now, assume that we have proved
$$P(X_n = i) \ge \sigma_n \mu_{\beta_n}(i);$$
then
$$P(X_{n+1} = i) = \sum_{j \in E} P(X_n = j)\, Q_{\beta_{n+1}}(j,i) \ge \sigma_n \sum_{j \in E} \mu_{\beta_n}(j)\, Q_{\beta_{n+1}}(j,i).$$
However, since $\hat a_\beta(j)$ is decreasing and $\beta_{n+1} \ge \beta_n$, we get
$$\mu_{\beta_n}(j) \ge k_n\, \mu_{\beta_{n+1}}(j) \quad\text{with}\quad k_n = \hat Z_{\beta_{n+1}}/\hat Z_{\beta_n}.$$
Hence, one easily computes that
$$P(X_{n+1} = i) \ge \sigma_n k_n\, \mu_{\beta_{n+1}}(i).$$
We can notice then that
$$\prod_{p=0}^{n} k_p \longrightarrow \hat Z_{+\infty}/\hat Z_{\beta_0}.$$
Since
$$\hat Z_{+\infty} \ge A_{g_0} \exp\Big(\int_0^{\beta_0} \hat f_{g_0}\Big) > 0 \quad\text{and}\quad \hat Z_{\beta_0} \le \prod_{u \in E}\Big(\sum_{v \in E} A_{uv}\Big)\Big(\prod_{u \in E}\sum_{v \in E} Q_0(u,v)\Big)^{-1},$$
we get $P(X_{n+1} = i) \ge \inf_{j \in E} \mu_{\beta_0}(j)\, \mu_{\beta_{n+1}}(i)$, so that the theorem is proved.

The theorem above shows what can be expected from a decreasing cooling schedule: whatever the cooling schedule is, the probability for $X_n$ to be in a configuration $i$ is bounded from below by the invariant probability measure at temperature $T_n$.
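The invariant-measure lower bound of Theorem 6.1 can be observed numerically. The sketch below is not from the paper: it uses the classical Metropolis dynamics on a hypothetical three-state energy landscape (a reversible special case, whose invariant measure at fixed temperature is the Gibbs measure), propagates the exact law of the annealing chain under a decreasing schedule, and checks that $P(X_n = i)$ stays above a constant multiple of the invariant measure at temperature $T_n$.

```python
import math

# Hypothetical 3-state landscape (illustrative only, not from the paper):
# energies U and a uniform proposal kernel q(i, j) = 1/2 for j != i.
U = [0.0, 0.5, 1.0]
E = range(3)

def gibbs(T):
    # Invariant (Gibbs) measure of the Metropolis kernel at temperature T.
    w = [math.exp(-u / T) for u in U]
    Z = sum(w)
    return [x / Z for x in w]

def metropolis_kernel(T):
    Q = [[0.0] * 3 for _ in E]
    for i in E:
        for j in E:
            if i != j:
                Q[i][j] = 0.5 * min(1.0, math.exp(-(U[j] - U[i]) / T))
        Q[i][i] = 1.0 - sum(Q[i])
    return Q

def schedule(n):
    # A decreasing cooling schedule T_n.
    return 1.0 / math.log(n + 2)

# Start from the invariant measure at the initial temperature and
# propagate the exact distribution p_n of X_n.
p = gibbs(schedule(0))
c = min(p)          # constant suggested by the proof: inf_j mu_{T_0}(j)
ratios = []
for n in range(1, 300):
    Q = metropolis_kernel(schedule(n))
    p = [sum(p[j] * Q[j][i] for j in E) for i in E]
    mu = gibbs(schedule(n))
    ratios.append(min(p[i] / mu[i] for i in E))

# P(X_n = i) >= c * mu_{T_n}(i) along the whole run, as Theorem 6.1 predicts.
print(min(ratios) >= c)
```

Because the schedule cools slowly, the law of the chain lags behind the current equilibrium and remains at least as spread out, which is the heuristic content of the bound.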
We can now give an upper bound for the convergence speed of G.S.A on level sets.

Theorem 6.2. Let $q$ be an irreducible Markov kernel on $E$, let $\alpha \in [1,+\infty[$ and let $X$ be a G.S.A with parameter $(q,\alpha,P)$ whose underlying cooling schedule $T$ is decreasing. Then, assuming that the underlying family $Q \in \mathcal A(q,\alpha)$ satisfies the condition C1, there exists $b > 0$ (which depends on $Q$ but not on $T$) such that for all levels $\lambda > 0$ and all $n > 0$,
$$\sup_{i \in E} P(X_n \in E_\lambda \mid X_0 = i) \ge \frac{b}{n^{\alpha_\lambda}}$$
with
$$\alpha_\lambda = \inf\Big\{\, \frac{W(\pi) - \min W}{H_e(\pi)} \;\Big|\; \pi \in C(E),\ W(\pi) - \min W \ge \lambda \,\Big\}.$$

Proof. A very similar theorem has been proved in [2] for sequential annealing, and our proof borrows its crucial technical tricks from [2]. As noticed in [2], it is sufficient to prove that
$$P(W(X_n) - \min W \ge \lambda) \ge \frac{b}{n^{\alpha_\lambda}}$$
for the Markov chain with initial distribution $\mu_0(j) = 1/|E|$ for all $j \in E$. Now, consider $\pi \in C(E)$ such that $W(\pi) - \min W \ge \lambda$ and $\frac{W(\pi) - \min W}{H_e(\pi)} = \alpha_\lambda$. Then, considering the last visit to $\pi$, we have the expansion
$$(43)\qquad P(X_n \in \pi) \ge \sum_{i \in \pi} P(X_0 = i)\, P(\tau(\pi,0) > n \mid X_0 = i) + \sum_{j \notin \pi} \sum_{i \in \pi} \sum_{m < n} P(X_m = j)\, q(j,i)\, e^{-V(j,i)/T_{m+1}}\, P(\tau(\pi, m+1) > n \mid X_{m+1} = i).$$
From Theorem 4.7 and Proposition 1.6 we get $K > 0$ independent of $T$ such that for all $j \in E$,
$$P(X_m = j) \ge K e^{-(W(j) - \min W)/T_m}.$$
Hence,
$$(44)\qquad P(X_m = j)\, e^{-V(j,i)/T_{m+1}} \ge K e^{-(W(j) + V(j,i) - \min W)/T_{m+1}}.$$
Now, consider the cycle $\pi_{ij}$, which is the smallest cycle in $C(E)$ containing $\{i,j\}$, and $\pi_i$ (respectively $\pi_j$), which is the largest cycle in $E \setminus \{j\}$ (respectively $E \setminus \{i\}$) containing $i$ (respectively containing $j$).
We have $\pi_i, \pi_j \in M(\pi_{ij})$, so that we deduce from Lemma 3.6 that $W(\pi_i) + H_e(\pi_i) = W(\pi_j) + H_e(\pi_j)$ and from Lemma 3.7 that $W(j) + V(j,i) \ge W(\pi_j) + H_e(\pi_j)$. Hence
$$(45)\qquad \inf_{j \notin \pi,\, i \in \pi} W(j) + V(j,i) - \min W = H_e(\pi) + W(\pi) - \min W.$$
We should handle separately the case $|\pi| = 1$. Assume that $|\pi| = 1$; then one proves easily by a direct computation that there exist $c > 0$ and $0 \le d \le 1$ such that
$$(46)\qquad P(\tau(\pi,m) > n \mid X_m = i) \ge c \prod_{l=m+1}^{n} \big(1 - d\, e^{-H_e(\pi)/T_l}\big),$$
so that we obtain from (43), (44), (45) and (46) that there exists $K_1$ (independent of $\lambda$ and $T$) such that
$$(47)\qquad P(X_n \in \pi) \ge c \prod_{l=1}^{n} \big(1 - d\, e^{-H_e(\pi)/T_l}\big) + \sum_{m=1}^{n} K_1 e^{-(H_e(\pi) + W(\pi) - \min W)/T_m} \prod_{l=m+1}^{n} \big(1 - d\, e^{-H_e(\pi)/T_l}\big).$$
Assume now that $|\pi| \ge 2$; then, denoting $\rho = \inf\{\, q(i,j) \mid q(i,j) > 0,\ i,j \in E \,\}$, we have $P(\tau(\pi,m) > m+1 \mid X_m = i) \ge 1 - \theta$ where $\theta = 1 - \rho/\alpha$.
Hence, we deduce from H1 in Theorem 4.5 that there exist $c$, $d$ and $K_2 > 0$ such that
$$P(X_n \in \pi \mid X_m = i) \ge c \prod_{l=m+1}^{n} \big(1 - (d\, e^{-H_e(\pi)/T_l}) \wedge \theta\big),$$
which leads, with (43), (44) and (45), to
$$(48)\qquad P(X_n \in \pi) \ge c \prod_{l=1}^{n} \big(1 - (d\, e^{-H_e(\pi)/T_l}) \wedge \theta\big) + \sum_{m=1}^{n} K_2 e^{-(H_e(\pi) + W(\pi) - \min W)/T_m} \prod_{l=m+1}^{n} \big(1 - (d\, e^{-H_e(\pi)/T_l}) \wedge \theta\big).$$
The remainder of the proof is now exactly the same as that of Theorem 5.2 in [2], and leads to the explicit computation of the lower bound for $P(X_n \in \pi)$ through inequalities (47) and (48).
However, for completeness, we recall here the arguments. We consider only the case $|\pi| \ge 2$; the case $|\pi| = 1$ can be handled similarly and is left to the reader.

Consider the sequence $(R_n)_{n \in \mathbb N}$ defined by
$$(49)\qquad \begin{cases} R_0 \le c,\\ R_n = \big(1 - (d\, e^{-H_e(\pi)/T_n}) \wedge \theta\big) R_{n-1} + K_2\, e^{-(H_e(\pi) + \widetilde W(\pi))/T_n}, \end{cases}$$
where we have noted $\widetilde W(\pi) = W(\pi) - \min W$. Denoting $\epsilon = 1/\alpha_\lambda$, a straightforward computation gives that
$$(50)\qquad \min_{T_n > 0} R_n = R_{n-1} - \frac{d}{1+\epsilon}\Big(\frac{d}{K_2(1+\epsilon^{-1})}\Big)^{\epsilon} R_{n-1}^{1+\epsilon} \quad\text{for}\quad R_{n-1} \le \Big(\frac{\theta}{d}\Big)^{\epsilon^{-1}} \frac{(1+\epsilon^{-1})K_2}{d}.$$
Hence consider
$$S_0 = c \wedge \Big(\frac{\theta}{d}\Big)^{\epsilon^{-1}} \frac{(1+\epsilon^{-1})K_2}{d}$$
and
$$(51)\qquad S_n = S_{n-1} - \frac{d}{1+\epsilon}\Big(\frac{d}{K_2(1+\epsilon^{-1})}\Big)^{\epsilon} S_{n-1}^{1+\epsilon}.$$
We will prove by induction that for all decreasing $T = (T_n)_{n \in \mathbb N}$, the sequence $(R_n)_{n \in \mathbb N}$ defined by (49) and $R_0 = S_0$ verifies $R_n \ge S_n$ for all $n \in \mathbb N$. Assume that $R_k \ge S_k$; then from (49) we get
$$(52)\qquad R_{k+1} \ge \min_{T_{k+1} > 0}\, \big(1 - (d\, e^{-H_e(\pi)/T_{k+1}}) \wedge \theta\big) S_k + K_2\, e^{-(H_e(\pi) + \widetilde W(\pi))/T_{k+1}}.$$
Since $S_k \le S_0 \le (\theta/d)^{\epsilon^{-1}} (1+\epsilon^{-1}) K_2 / d$, we deduce from (50) and (52) that
$$R_{k+1} \ge S_{k+1}.$$
Hence $P(X_n \in \pi) \ge S_n$ for all $n \in \mathbb N$. It is sufficient now to compute a lower bound for the sequence $(S_n)_{n \in \mathbb N}$.
From (51) we deduce that
$$S_n^{-\epsilon} - S_{n-1}^{-\epsilon} \le -\epsilon\, (S_n - S_{n-1})\, S_n^{-(\epsilon+1)} \le \frac{\epsilon d}{1+\epsilon}\Big(\frac{d}{K_2(1+\epsilon^{-1})}\Big)^{\epsilon} \Big(\frac{S_{n-1}}{S_n}\Big)^{1+\epsilon}.$$
However,
$$\frac{S_{n-1}}{S_n} = \Big(1 - \frac{d}{1+\epsilon}\Big(\frac{d\, S_{n-1}}{K_2(1+\epsilon^{-1})}\Big)^{\epsilon}\Big)^{-1} \le \Big(1 - \frac{d}{1+\epsilon}\Big(\frac{d\, S_0}{K_2(1+\epsilon^{-1})}\Big)^{\epsilon}\Big)^{-1} \le \Big(1 - \frac{\theta}{1+\epsilon}\Big)^{-1} \le (1-\theta)^{-1}.$$
Hence
$$S_n^{-\epsilon} \le S_0^{-\epsilon} + n\, \frac{\epsilon d}{1+\epsilon}\Big(\frac{d}{K_2(1+\epsilon^{-1})}\Big)^{\epsilon} (1-\theta)^{-(1+\epsilon)}.$$
Denoting
$$a = S_0^{-\epsilon} + \frac{\epsilon d}{1+\epsilon}\Big(\frac{d}{K_2(1+\epsilon^{-1})}\Big)^{\epsilon} (1-\theta)^{-(1+\epsilon)},$$
we get $S_n^{-\epsilon} \le n a$, so that $S_n \ge b/n^{\epsilon^{-1}}$ with $b = a^{-\epsilon^{-1}}$.
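The recursions (49) and (51) and the final polynomial bound are easy to check numerically. In the sketch below, all parameter values ($d$, $\theta$, $K_2$, $c$, $H_e(\pi)$, $\alpha_\lambda$) and the cooling schedule are arbitrary illustrative choices, not taken from the paper; the code iterates $R_n$ for one particular decreasing schedule, iterates $S_n$ as in (51), and verifies both the induction $R_n \ge S_n$ and the bound $S_n \ge b\, n^{-\epsilon^{-1}}$.

```python
import math

# Arbitrary illustrative parameters (not from the paper).
d, theta, K2, c = 0.5, 0.5, 1.0, 0.5
He = 1.0                    # exit height H_e(pi)
alpha_lam = 2.0             # exponent alpha_lambda = (W(pi) - min W) / He(pi)
eps = 1.0 / alpha_lam
Wt = alpha_lam * He         # W~(pi) = W(pi) - min W

coef = d / (1 + eps) * (d / (K2 * (1 + 1 / eps))) ** eps
S0 = min(c, (theta / d) ** (1 / eps) * (1 + 1 / eps) * K2 / d)
a = S0 ** (-eps) + eps * coef * (1 - theta) ** (-(1 + eps))
b = a ** (-1 / eps)

def T(n):
    # One particular decreasing cooling schedule.
    return 1.0 / math.log(n + 2)

R = S = S0
ok_induction = ok_bound = True
for n in range(1, 2000):
    x = math.exp(-He / T(n))
    R = (1 - min(d * x, theta)) * R + K2 * math.exp(-(He + Wt) / T(n))  # (49)
    S = S - coef * S ** (1 + eps)                                       # (51)
    ok_induction = ok_induction and (R >= S)
    ok_bound = ok_bound and (S >= b * n ** (-1 / eps))

print(ok_induction, ok_bound)
```

The induction holds for any decreasing schedule because, given $R_k \ge S_k$, the right-hand side of (49) is at least the minimum over $T_{k+1}$ computed in (50), which is precisely $S_{k+1}$.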
The proof of Theorem 6.2 is completed.

6.2. Lower bound. Theorem 6.1 shows that there exists a constant $K$ such that for all $j \in E$ and all decreasing cooling schedules, we have
$$\sup_{i \in E} P(X_n = j \mid X_0 = i) \ge K e^{-(W(j) - W_{\min})/T_n}.$$
To establish a lower bound, it is natural to look for cooling schedules such that
$$(53)\qquad \sup_{i \in E} P(X_n = j \mid X_0 = i) \le K' e^{-(W(j) - W_{\min})/T_n}.$$
Theorem 4.7 is the key to inequality (53). As we have previously noticed, the statement of this theorem is exactly the same in the sequential case as in the generalized case, except that we have to replace the virtual energy $U$ by the virtual energy $W$. Moreover, it is the unique source of all the lower bound estimates for the convergence speed of the sequential simulated annealing algorithms. Hence, the extension to the general case of the theorem on the lower bound of the convergence speed does not demand any specific modification compared to its statement in [2] for the sequential case, nor for its proof.
It is sufficient to use strictly the same arguments and to exchange $U$ and $W$.

Theorem 6.3. Let $q$ be an irreducible Markov kernel on $E$, let $\alpha \in [1,+\infty[$ and let $Q \in \mathcal A(q,\alpha)$. Let $\lambda > 0$. There exists a non-negative constant $K$ such that for all $N \in \mathbb N^*$, there exists a decreasing cooling schedule $T^N = (T_n^N)_{n \in \mathbb N}$ for which, if $X$ is a G.S.A with parameter $(q,\alpha,P_N)$ where $P_N = P(Q, T^N, \mu_0)$, then
$$\sup_{i \in E} P_N(X_N \in E_\lambda \mid X_0 = i) \le \frac{K}{N^{\hat\alpha_\lambda}}$$
where
$$\hat\alpha_\lambda = \inf\Big\{\, \frac{(W(\pi) - \min W) \vee \lambda}{H_e(\pi)} \;\Big|\; \pi \in C(E),\ W(\pi) > \min W \,\Big\}.$$

Remark 14.
One obviously has $\alpha_\lambda = \hat\alpha_\lambda$ for sufficiently small $\lambda$.

Proof. It is sufficient to consider the proof of Theorem 7.1 in [2] and to replace $U$ by $W$.

7. Conclusion

In this paper, we have emphasized the study of the value of the optimal convergence speed exponent for level sets. Our approach gives a semi-explicit expression of the exponents as a function of the communication cost $V$. However, in practical situations, their numerical computation is a hard combinatorial problem, but an implementable recursive algorithm, useful to test conjectures on small state spaces, has been proposed by the author in [16] and [15].
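To make the exponents concrete, the following sketch (with hypothetical cycle data, not from the paper) computes $\alpha_\lambda$ and $\hat\alpha_\lambda$ from a list of cycles described by their energy gap $W(\pi) - \min W$ and exit height $H_e(\pi)$, and illustrates Remark 14: the two exponents agree once $\lambda$ is below every positive gap.

```python
# Hypothetical cycle data (illustrative only): pairs
# (W(pi) - min W, He(pi)) for the cycles pi in C(E).
cycles = [(0.5, 1.0), (1.5, 2.0), (3.0, 2.5)]

def alpha(lam):
    # alpha_lambda = inf{(W(pi) - min W)/He(pi) : W(pi) - min W >= lambda}
    vals = [w / h for (w, h) in cycles if w >= lam]
    return min(vals) if vals else float("inf")

def alpha_hat(lam):
    # hat alpha_lambda = inf{((W(pi) - min W) v lambda)/He(pi) : W(pi) > min W}
    return min(max(w, lam) / h for (w, h) in cycles if w > 0)

print(alpha(0.4), alpha_hat(0.4))   # equal for small lambda (Remark 14)
print(alpha(2.0), alpha_hat(2.0))   # hat alpha_lambda <= alpha_lambda in general
```

For small $\lambda$ every cycle in the first infimum also enters the second with $(W(\pi)-\min W) \vee \lambda = W(\pi)-\min W$, which is why the two values coincide there.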
Moreover, an easier problem is the comparison of the exponents for different annealing algorithms, and this approach succeeds for instance in the study of parallel algorithms based on several interacting annealing processes, as done in [14]. Concerning the synchronous parallel version of the sequential annealing for image processing presented in the introduction, the problem is a bit more delicate, and it appears that in general the configurations minimizing the virtual energy do not minimize the underlying cost $U$ (one can propose more efficient parallel schemes; see [16] for an extensive study). One limitation of this approach is in the evaluation of the multiplicative constants appearing in the upper and lower bounds for the convergence speed.
An alternative approach seems to be the computation of geometric bounds on the spectral gap of the transition kernel based on the Poincaré method, with however a counterpart in the generality of the cooling schedules which can be studied [7] [4] [11].

References

1. R. Azencott. A common large deviation framework for sequential and parallel annealing. In R. Azencott et al., editors, Simulated Annealing: Parallelization Techniques, chapter 2, pages 11-23. Wiley and Sons, 1992.
2. O. Catoni. Rough large deviation estimates for simulated annealing. Application to exponential schedules. Ann. Probab., 20:1109-1146, 1992.
3. T.-S. Chiang and Y. Chow. A limit theorem for a class of inhomogeneous Markov processes. Ann. Probab., 1989.
4. J.-D. Deuschel and C. Mazza. L2 convergence of time non-homogeneous Markov processes: I. Spectral estimates. Université de Fribourg, Institut de Mathématiques, preprint, 1992.
5. M. I. Freidlin and A. D. Wentzell. Random Perturbations of Dynamical Systems, volume 260. Springer-Verlag, 1984.
6. D. Geman. Random fields and inverse problems in imaging. In École d'Été de Probabilités de Saint-Flour XVIII. Springer-Verlag, 1990.
7. F. Götze. Rate of convergence of simulated annealing processes. Preprint, 1992.
8. B. Hajek. Cooling schedules for optimal annealing. Math. Oper. Res., 13:311-329, 1988.
9. R. Holley and D. Stroock. Annealing via Sobolev inequalities. Comm. Math. Phys., 115:553-559, 1988.
10.