Trouvé (1121214), page 4
Hence we get $W(\Pi) + H_e(\Pi) + C_\Pi(i,j) = W(i) + V(i,j)$.

Proof. From the definition of $C_\Pi$ we get
$$W(i) + V(i,j) = W(i) + C_\Pi(i,j) + \sum_{k=0}^{n_{ij}} \big( H_e(\Pi_k) - H_m(\Pi_k) \big).$$
If $j \notin \Pi$, we have $n_{ij} \geq n_{i,\Pi}$, and
$$W(i) + \sum_{k=0}^{n_{i,\Pi}} \big( H_e(\Pi_k) - H_m(\Pi_k) \big) = W(\Pi) + H_e(\Pi).$$
The lemma is proved.

4. The localization theorems

In Section 3, we have defined the main objects needed to perform a large deviation study of a G.S.A. In this section, we turn to a rigorous setting of the heuristic probabilistic interpretations of the cycle decomposition of $E$ and of the renormalized communication costs.
One of our main tasks will be to estimate the probability that, starting from a point $i \in \Pi$, the Markov chain escapes from $\Pi$ at a point $j \in \Pi^c$ at a given time $n$. We will follow the original approach of O. Catoni [2], who has handled large deviation estimates for arbitrary decreasing cooling schedules. Throughout this section, we will consider a fixed irreducible Markov kernel $q$ on $E$ and a fixed $\alpha \in [1,+\infty[$. We denote by $X$ a G.S.A. with parameter $(q, \alpha, P)$, by $T$ the underlying cooling schedule and by $V$ the communication cost.

4.1. Emptying kernels.

Definition 4.1. Let $H \geq 0$, $a > 0$ and $b \geq 0$ be three real numbers. Now consider a kernel on the integers $Q : \mathbb{Z} \times \mathbb{Z} \to [0,1]$ such that
$$Q_m^n = 0 \ \text{ if } m \geq n, \qquad 1 - (1+b) \prod_{l=m+1}^{n} \big(1 - a\, e^{-H/T_l}\big) \;\leq\; \sum_{k=m+1}^{n} Q_m^k \;\leq\; 1 \ \text{ if } m < n.$$
The set of all such kernels is called the set of the right emptying kernels for $(H,a,b)$ and denoted by $D_r(H,a,b)$.
In the same way, we define the set $D_l(H,a,b)$ of the left emptying kernels for $(H,a,b)$ by
$$Q_m^n = 0 \ \text{ if } m \geq n, \qquad 1 - (1+b) \prod_{l=m+1}^{n} \big(1 - a\, e^{-H/T_l}\big) \;\leq\; \sum_{k=m+1}^{n} Q_k^n \;\leq\; 1 \ \text{ if } m < n.$$
We will take the convention that the product $\prod_{l=m+1}^{n}(\cdot)$ is 0 if one of its terms is negative.

Remark 12. The emptying kernels have been introduced in [2] and play a central role in our large deviation estimates. The right kernels will be used to give upper and lower bounds on the probability that the Markov chain, starting in a subset $A$ at time $m$, exits from $A$ at time $n$. They follow approximately an inhomogeneous exponential law, where the control is done on the partial sums.

We now recall the stability lemmas given in [2].

Lemma 4.2.
(Stability under product) Consider two real numbers $a > 0$ and $b \geq 0$. Let $Q \in D_r(H,a,b)$ (respectively $D_l(H,a,b)$) and let $R \in D_r(H,a,b)$ (respectively $D_l(H,a,b)$). Then there exist $a' > 0$ and $b' \geq 0$, which depend on $a$ and $b$, such that the product kernel $QR$ is an element of $D_r(H,a',b')$ (respectively $D_l(H,a',b')$).

Lemma 4.3. (Stability under convex combination) Let $Q \in D_r(H,a,b)$ (respectively $D_l(H,a,b)$), let $R \in D_r(H,a,b)$ (respectively $D_l(H,a,b)$) and $\lambda \in [0,1]$. Then $\lambda Q + (1-\lambda)R \in D_r(H,a,b)$ (respectively $D_l(H,a,b)$).

4.2. Localization kernels. Following Catoni, we define

Definition 4.4. (1) Let $A \subset E$. We define the stopping time $\tau(A,m)$ by
$$\tau(A,m) = \inf\{\, n > m \; ; \; X_n \notin A \,\}.$$
(2) Let $A \subset E$, $B \subset E$. We define
$$M(A,B)_{i,m}^{j,n} = \begin{cases} P(\, \tau(A,m) \geq n,\; X_{n-1} \notin B,\; X_n = j \mid X_m = i \,) & m < n,\; i \in E,\; j \in B, \\ 0 & \text{otherwise,} \end{cases}$$
$$L(A,B)_{i,m}^{j,n} = \begin{cases} P(\, \tau(A,m) > n,\; X_n = j \mid X_m = i \,) & m \leq n,\; i \in E,\; j \in B, \\ 0 & \text{otherwise.} \end{cases}$$
The kernel $M(A,B)$ gives the probability of an entrance at time $n$ into the set $B$ at $j$, starting from $i$ at time $m$ and traveling in $A$ during the intermediate time.
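As a concrete reading of Definition 4.4, the stopping time $\tau(A,m)$ can be evaluated along a recorded trajectory. The sketch below is purely illustrative (the trajectory, the set $A$ and the finite horizon are hypothetical and not part of the paper's construction):

```python
def tau(traj, A, m):
    """First exit time after m: inf{ n > m : traj[n] not in A }.

    Returns None when the chain never leaves A within the recorded
    horizon (the infimum over an empty set)."""
    for n in range(m + 1, len(traj)):
        if traj[n] not in A:
            return n
    return None

# Hypothetical trajectory on E = {0, 1, 2}: starting the clock at m = 1,
# the chain first leaves A = {0, 1} at time 4.
traj = [0, 1, 0, 1, 2, 1]
assert tau(traj, {0, 1}, 1) == 4
assert tau(traj, {0, 1, 2}, 0) is None
```

Note that $\tau(A,m)$ depends on the whole path after $m$, which is why the kernels $M(A,B)$ and $L(A,B)$ below carry both a starting pair $(i,m)$ and an ending pair $(j,n)$.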
The kernel $M(A,B)$ will be used with $B \subset A^c$ and will describe the way the Markov chain escapes from $A$. On the other side, $L(A,B)$ gives the probability of being in $j$ at time $n$, starting from $i$ at time $m$ and traveling in $A$.

4.3. Main large deviation estimates.

Theorem 4.5. There exist $a > 0$, $b \geq 0$, $c > 0$, $d > 0$, $K_1 > 0$ and $K_2 > 0$, which depend only on $E$, $q$ and $\alpha$ (and not on $V$), such that for all $n \in \mathbb{N}$ the induction hypothesis $H_n(a,b,c,d,K_1,K_2)$ is true:

$H_n^1(a,b,c,d,K_2)$: For all $\Pi \in C(E)$, $|\Pi| \leq n$, for all $A \subset \Pi$, $A \neq \emptyset$, and all $i \in \Pi$: there exists $Q \in D_r(H_e(\Pi \setminus A), a, b)$ such that
$$M(\Pi \setminus A, (\Pi \setminus A)^c)_{i,m}^{j,n} \leq K_2\, e^{-C_{\Pi \setminus A}(i,j)/T_{m+1}}\, Q_m^n.$$
Moreover $P(\, \tau(\Pi,m) > n \mid X_m = i \,) \geq c \prod_{l=m+1}^{n} \big(1 - d\, e^{-H_e(\Pi)/T_l}\big)$.

$H_n^2(a,b,K_1)$: For all $A \subset E$, $|A| \leq n$, all $i \in A$ and all $j \in A^c$: there exists $Q \in D_r(H_e(A), a, b)$ such that
$$M(A, A^c)_{i,m}^{j,n} \geq K_1\, e^{-C_A(i,j)/T_n}\, Q_m^n.$$
Moreover we have
$$P(\, \tau(A,m) > n \mid X_m = i \,) \leq (1+b) \prod_{l=m+1}^{n} \big(1 - a\, e^{-H_e(A)/T_l}\big).$$

$H_n^3(a,b,K_2)$: For all $\Pi \in C(E)$, $|\Pi| \leq n$, and all $i \in \Pi$: there exists $Q \in D_r(H_e(\Pi), a, b)$ such that
$$M(\Pi, \Pi^c)_{i,m}^{j,n} \leq K_2\, e^{-C_\Pi(i,j)/T_{m+1}}\, Q_m^n.$$

$H_n^4(a,b,K_2)$: For all $A \subset E$, $|A| \leq n$, all $\Pi \in M(A)$ and all $i \in A$: there exists $Q \in D_r(H_e(A), a, b)$ such that
$$M(A, \Pi \cup A^c)_{i,m}^{j,n} \leq K_2\, e^{-C_A(i,j)/T_{m+1}}\, Q_m^n.$$

Remark 13.
This first theorem collects all the basic large deviation estimates needed for the study of optimal cooling schedules. These estimates will be established by induction on the size of the considered subset, starting from obvious lower and upper bounds on the singletons. This is the approach proposed by O. Catoni, who establishes in [2] a similar theorem for the sequential simulated annealing. We will follow the sketch of his proof. However, by introducing the renormalized communication cost, we will be able to give large deviation estimates for the exit time of arbitrary subsets and for generalized simulated annealing. This theorem shows that the large deviation estimates can be obtained with uniform constants even in the time inhomogeneous setting.
The uniformity of the constants will be verified step by step with the help of Lemma 4.2 and Lemma 4.3.

Proof. Consider $a = \rho/\alpha$, $b = 0$, $K_1 = \rho/\alpha^2$ and $K_2 = \alpha^2/\rho$, where
$$\rho = \inf\{\, q(i,j) \; ; \; q(i,j) > 0,\; (i,j) \in E \times E \,\}.$$
Now consider $(c,d) = (1,\alpha)$; we will prove that $H_1(a,b,c,d,K_1,K_2)$ is true.

$H_1^1$: Consider a cycle $\Pi = \{i\}$ and a subset $A \subset \Pi$. Since we assume that $A$ is not empty, we have $A = \Pi$, so that $\Pi \setminus A = \emptyset$. Then
$$M(\Pi \setminus A, (\Pi \setminus A)^c)_{i,m}^{j,n} = M(\emptyset, E)_{i,m}^{j,n} = 0.$$
We consider now the exit time from $\Pi$. A straightforward computation gives
$$P(\, \tau(\Pi,m) > n \mid X_m = i \,) \geq \prod_{l=m+1}^{n} \big(1 - \alpha\, e^{-H_e(i)/T_l}\big),$$
with the convention that the product is 0 if it contains a negative term. Hence the proof of $H_1^1$ is complete.

$H_1^2$: Consider $A = \{i\}$.
Then we have
$$M(A, A^c)_{i,m}^{j,n} \geq \frac{\rho}{\alpha^2}\, e^{-C_A(i,j)/T_n}\, Q_m^n,$$
where
$$Q_m^n = \begin{cases} 0 & \text{if } n \leq m, \\ \displaystyle\prod_{l=m+1}^{n-1} \big(1 - \tfrac{\rho}{\alpha}\, e^{-H_e(i)/T_l}\big)\, \big(\tfrac{\rho}{\alpha}\, e^{-H_e(i)/T_n} \wedge 1\big) & \text{otherwise.} \end{cases}$$
Since $H_e(i) = H_e(A)$, one easily proves that
$$Q \in D_r(H_e(A), a, b).$$
Now, concerning the exit time from $A$, we have
$$P(\, \tau(A,m) > n \mid X_m = i \,) \leq \prod_{l=m+1}^{n} \big(1 - \tfrac{\rho}{\alpha}\, e^{-H_e(i)/T_l}\big),$$
so that the proof of $H_1^2$ is complete.

$H_1^3$: Consider a cycle $\Pi = \{i\}$.
There exists $i_1 \in E$ such that $H_e(i) = V(i,i_1)$. Hence
$$P(\, X_{m+1} = i \mid X_m = i \,) \leq 1 - P(\, X_{m+1} = i_1 \mid X_m = i \,) \leq 1 - \tfrac{\rho}{\alpha}\, e^{-H_e(i)/T_{m+1}},$$
and
$$M(\Pi, \Pi^c)_{i,m}^{j,n} \leq \tfrac{\alpha^2}{\rho}\, e^{-(V(i,j) - H_e(i))/T_{m+1}}\, Q_m^n,$$
where
$$Q_m^n = \begin{cases} \displaystyle\prod_{l=m+1}^{n-1} \big(1 - \tfrac{\rho}{\alpha}\, e^{-H_e(i)/T_l}\big)\, \tfrac{\rho}{\alpha}\, e^{-H_e(i)/T_n} & \text{if } n > m, \\ 0 & \text{otherwise.} \end{cases}$$
Since $H_e(i) = H_e(\Pi)$, one easily verifies that
$$Q \in D_r(H_e(\Pi), \rho/\alpha, 0) \subset D_r(H_e(\Pi), a, b).$$

$H_1^4$: Consider $A = \{i\}$. Then $M(A) = \{\{i\}\}$ and $M(\{i\}, E) = 0$, so that $H_1^4$ is proved.

The proof of $H_1$ is now complete.

Assume that there exists $\theta = (a,b,c,d,K_1,K_2)$ such that $H_{n-1}(\theta)$ is true. We will show that there exists $\theta' = (a',b',c',d',K_1',K_2')$ such that $H_n(\theta')$ is true.
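Before turning to the induction step, the membership of the telescoping kernels used in the base case can be checked numerically: for $Q_m^n = \prod_{l=m+1}^{n-1}(1 - a e^{-H/T_l})\, a e^{-H/T_n}$, the partial sums satisfy $\sum_{k=m+1}^{n} Q_m^k = 1 - \prod_{l=m+1}^{n}(1 - a e^{-H/T_l})$, which is the defining inequality of $D_r(H,a,0)$ with equality. The sketch below verifies this; the cooling schedule $T_l = c/\log(l+2)$ and the values of $H$ and $a$ are illustrative assumptions, not prescribed by the paper (the $\wedge 1$ truncation is omitted since $a\, e^{-H/T_n} < 1$ for these values):

```python
import math

def T(l, c=2.0):
    # Hypothetical logarithmic cooling schedule; any positive
    # non-increasing schedule works for this check.
    return c / math.log(l + 2)

def Q(m, n, H=1.0, a=0.5):
    """Telescoping kernel from the base case:
    Q_m^n = prod_{l=m+1}^{n-1} (1 - a e^{-H/T_l}) * a e^{-H/T_n},
    and 0 when m >= n."""
    if m >= n:
        return 0.0
    p = 1.0
    for l in range(m + 1, n):
        p *= 1.0 - a * math.exp(-H / T(l))
    return p * a * math.exp(-H / T(n))

def partial_sum(m, n, H=1.0, a=0.5):
    return sum(Q(m, k, H, a) for k in range(m + 1, n + 1))

def product_bound(m, n, H=1.0, a=0.5, b=0.0):
    p = 1.0
    for l in range(m + 1, n + 1):
        p *= 1.0 - a * math.exp(-H / T(l))
    return 1.0 - (1.0 + b) * p

# The partial sums telescope exactly onto the D_r(H, a, 0) lower bound.
for n in range(1, 30):
    s, lower = partial_sum(0, n), product_bound(0, n)
    assert abs(s - lower) < 1e-9 and s <= 1.0
```

This is the mechanism behind the remark that emptying kernels "follow approximately an inhomogeneous exponential law where the control is done on the partial sums".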
Moreover, we should verify that $\theta'$ can be deduced from $\theta$ by a function which depends only on $E$, $q$ and $\alpha$. This verification will be done explicitly in the proof of $H_n^1$ and will be left to the reader in the remaining cases. Furthermore, we should notice that it is sufficient to prove the weaker result: there exist $(a_i', b_i')_{1 \leq i \leq 4}$, $K_1'$, $K_2'$, $K_3'$, $K_4'$, $c'$ and $d'$ such that $H_n^1(a_1',b_1',c',d',K_1')$, $H_n^2(a_2',b_2',K_2')$, $H_n^3(a_3',b_3',K_3')$ and $H_n^4(a_4',b_4',K_4')$ are true. Indeed, if we consider $a = \inf a_i'$, $b = \sup b_i'$, $K_1 = K_2'$ and $K_2 = \sup\{K_1', K_3', K_4'\}$, then $H_n(a,b,c,d,K_1,K_2)$ is true.

We now begin the proof of $H_n$.

$H_n^1$: Let us consider the family $(\Pi_k)_{0 \leq k \leq p}$ of the elements of $M(\Pi)$.
Assume that $i \in \Pi_0$, so that
$$M(\Pi \setminus A, (\Pi \setminus A)^c)_{i,m}^{j,n} \leq M_1 + M_2 + M_3,$$
where
$$M_1 = M(\Pi_0 \setminus A, (\Pi_0 \setminus A)^c)_{i,m}^{j,n},$$
$$M_2 = \sum_{k=1}^{p} \{\, M(\Pi_0 \setminus A, \Pi \setminus (A \cup \Pi_0))\, M(\Pi_k \setminus A, (\Pi_k \setminus A)^c) \,\}_{i,m}^{j,n},$$
$$M_3 = \sum_{k=1}^{p} \{\, M(\Pi_0 \setminus A, \Pi \setminus (A \cup \Pi_0))\, M(\Pi \setminus A, \Pi_k \setminus A)\, M(\Pi_k \setminus A, (\Pi_k \setminus A)^c) \,\}_{i,m}^{j,n}.$$
Since $C_{\Pi_0 \setminus A}(i,j) \geq C_{\Pi \setminus A}(i,j)$, we deduce from $H_{n-1}^1(a,b,c,d,K_2)$ that
$$M_1 \leq K_2\, e^{-C_{\Pi \setminus A}(i,j)/T_{m+1}}\, Q_m^n, \tag{23}$$
with $Q \in D_r(H_e(\Pi_0 \setminus A), a, b)$.

Concerning $M_2$, we have the expansion
$$M_2 = \sum_{k=1}^{p} \sum_{i_1 \in \Pi \setminus (A \cup \Pi_0)} \sum_{m \leq l \leq n} M(\Pi_0 \setminus A, \Pi \setminus (A \cup \Pi_0))_{i,m}^{i_1,l}\, M(\Pi_k \setminus A, (\Pi_k \setminus A)^c)_{i_1,l}^{j,n}.$$
However, we deduce from $H_{n-1}^1(a,b,c,d,K_2)$ and from Lemma 4.2 that there exist $(a_1, b_1, K_2^{(1)})$, which depend only on $(a,b,K_2)$, and $Q \in D_r(H_e(\Pi \setminus A), a_1, b_1)$ such that
$$\sum_{m \leq l \leq n} M(\Pi_0 \setminus A, \Pi \setminus (A \cup \Pi_0))_{i,m}^{i_1,l}\, M(\Pi_k \setminus A, (\Pi_k \setminus A)^c)_{i_1,l}^{j,n} \leq K_2^{(1)}\, e^{-(C_{\Pi_0 \setminus A}(i,i_1) + C_{\Pi_k \setminus A}(i_1,j))/T_{m+1}}\, Q_m^n. \tag{24}$$
Since $C_{\Pi_0 \setminus A}(i,i_1) + C_{\Pi_k \setminus A}(i_1,j) \geq C_{\Pi \setminus A}(i,j)$, and adding the family of inequalities (24) for all the values of $k$ and $i_1$, we deduce from Lemma 4.3 that there exist $(a_2, b_2, K_2^{(2)})$, which depend only on the size of $E$ and on $(a,b,K_2)$, such that
$$M_2 \leq K_2^{(2)}\, e^{-C_{\Pi \setminus A}(i,j)/T_{m+1}}\, Q_m^n, \tag{25}$$
with $Q \in D_r(H_e(\Pi \setminus A), a_2, b_2)$.

Concerning $M_3$, let us consider the family $(\Pi_{k,s})_{0 \leq s \leq r_k}$ of the elements of $M(\Pi \setminus A)$ such that $\Pi_k = \bigcup_{0 \leq s \leq r_k} \Pi_{k,s}$. We have
$$M_3 \leq \sum_{k=0}^{p} \sum_{s=0}^{r_k} \sum_{i_1 \in \Pi \setminus (A \cup \Pi_0)} \sum_{i_2 \in \Pi_{k,s}} M_3(k,s,i_1,i_2), \tag{26}$$
where
$$M_3(k,s,i_1,i_2) = \sum_{m \leq l_1 \leq l_2 \leq n} M(\Pi_0 \setminus A, \Pi \setminus (A \cup \Pi_0))_{i,m}^{i_1,l_1}\, M(\Pi \setminus A, \Pi_{k,s})_{i_1,l_1}^{i_2,l_2}\, M(\Pi_k \setminus A, (\Pi_k \setminus A)^c)_{i_2,l_2}^{j,n}.$$
Since $A \neq \emptyset$, we have $|\Pi \setminus A| \leq n - 1$, so that we deduce from $H_{n-1}^1(a,b,c,d,K_2)$, $H_{n-1}^3(a,b,K_2)$, $H_{n-1}^4(a,b,K_2)$ and from Lemma 4.3 that there exist $(a_3, b_3, K_3)$, which depend only on $(a,b,K_2)$ and on the size of $E$, such that
$$M_3(k,s,i_1,i_2) \leq K_3\, e^{-[C_{\Pi_0 \setminus A}(i,i_1) + C_{\Pi \setminus A}(i_1,i_2) + C_{\Pi_k \setminus A}(i_2,j)]/T_{m+1}}\, Q_m^n, \tag{27}$$
with $Q \in D_r(H_e(\Pi \setminus A), a_3, b_3)$. However, we have for $i_1 \in \Pi \setminus (A \cup \Pi_0)$ and $i_2 \in \Pi_k \setminus A$
$$C_{\Pi_0 \setminus A}(i,i_1) + C_{\Pi \setminus A}(i_1,i_2) + C_{\Pi_k \setminus A}(i_2,j) \geq C_{\Pi \setminus A}(i,j),$$
so that, summing over $k$, $s$, $i_1$ and $i_2$, we deduce
$$M_3 \leq K_3\, e^{-C_{\Pi \setminus A}(i,j)/T_{m+1}}\, Q_m^n. \tag{28}$$
Now, we deduce from (23), (25) and (28) that there exist $(a_4, b_4, K_4)$, which depend only on $(a,b,K_2)$ and on the size of $E$, such that
$$M(\Pi \setminus A, (\Pi \setminus A)^c)_{i,m}^{j,n} \leq K_4\, e^{-C_{\Pi \setminus A}(i,j)/T_{m+1}}\, Q_m^n,$$
with $Q \in D_r(H_e(\Pi \setminus A), a_4, b_4)$.