1610906232-7ba7dbaea13262b50a6029d682cb7f1b (824370), страница 21
Текст из файла (страница 21)
Notice that ~L is an equivalencerelation.That is, x ~L y if either both strings belong to L or neither is in L; andmoreover, appending any fixed string to both x and y results in two strings thatare either both in L or both not in L.Example 2.5.1: If x is a string, and when L is understood by context, wedenote by [x] the equivalence class with respect to L to which x belongs. Forexample, for the language L = (ab U ba)* accepted by the automaton in Figure2-20, it is not hard to see that ~L has four equivalence classes:(1) [c] = L,(2) [a] = La,(3) [b] = Lb,(4) [a a] = L(aa U bb)~*.In (1), for any string x E L, including x = e, the z's that make xz E Lare precisely the members of L. In (2), any x E La needs a z of the form bL inorder for xz to be in L. Similarly, for (3) the z's are of the form aL. Finally, in(4) there is no z that can restore to L a string with a prefix in L(aa U bb).
Inother words, all strings in set (1) have the same fate with respect to inclusion inL; and the same for (2), (3), and (4). Finally, it is easy to see that these fourclasses exhaust all of~·. Hence these are the equivalence classes of ~L.O952.5: State MinimizationNotice that ~L relates strings in terms of a language, not in terms of anautomaton. Automata provide another, somewhat less fundamental, relation,described next.Definition 2.5.2: Let AI = (K, 2:;, d, s, F) be a deterministic finite automaton.We say that two strings x, y E 2:;* are equivalent with respect to M, denotedx '" MY, if, intuitively, they both drive 111 from s to the same state. Formally,x "'M y if there is a state q such that (s, x) f-~[ (q, e) and (s,y) f-M (q, e).Again, '" M is an equivalence relation.
Its equivalence classes can be identified by the states of M -more precisely, with those states that are reachablefrom 8 and therefore have at least one string in the corresponding equivalenceclass. We denote the equivalence class corresponding to state q of M as E q •Example 2.5.1 (continued): For example, for the automaton M in Figure 220, the equivalence classes of "'M are these (where L = (abUba)* is the languageaccepted by M)(1)(2)(3)(4)(5)(6)Eq,Eq2Eq3Eq4Eq5Eq6= (ba)*,= La U a= abL,= b(ab)*,= L(bbUaa)2:;*,= abLb.Again, they form a partition of 2:;*.0These two important equivalence relations, one associated with the language, the other with the automaton, are related as follows:Theorem 2.5.1: For any deterministic finite automaton Mand any strings X,Y E 2:;*, if x "'M y, then x ~L(M) y.(K, 2:;, d, s, F)Proof: For any string x E 2:;*, let q(x) E K be the unique state such that(s,x) f-M (q(x),e).
Notice that, for any X,Z E 2:;*, xz E L(M) if and only if(q(x),z) f-M (I,e) for some f E F. Now, if X "'M Y then, by the definition of"'M, q(x) = q(y), and thus x "'M y implies that the following holds:xz E L(M) if and only if yz E L(M) for allwhich is the same as x~L(M)ZE 2:;*},y .•A very suggestive way of expressing Theorem 2.5.1 is to say that "'M isa refinement of ~ L(M)' In general, we say that an equivalence relation '"is a refinement of another ~ if for all x, y x '" y implies x ~ y. If '" is a96Chapter 2: FINITE AUTOMATArefinement of ~, then each equivalence class with respect to ~ is contained insome equivalence class of ~; that is, each equivalence class of ~ is the unionof one or more equivalence classes of~. For example, the equivalence relationthat relates any two cities of the United States that are in the same county is arefinement of the equivalence relation that relates any two cities that are in thesame state.Example 2.5.1 (continued): For an example that is more to the point, theequivalence classes of ~ M for the automaton M in Figure 2-20 "refine" in thissense the equivalence classes of ~L(M), exactly as predicted by Theorem 2.5.l.For example, classes Eq5 and [a a] coincide, while classes Eql and Eq3 are bothsubsets of [e].<>Theorem 2.5.1 implies something very important about M and any otherautomaton M accepting the same language L(M): Its number of states must beat least as large as the number of equivalence classes of L(M) under ~.
In otherwords, the number of equivalence classes of L(M) is a natural lower bound onthe number of states of any automaton equivalent to M. Can this lower boundbe achieved? We next show that indeed it can.Theorem 2.5.2 (The Myhill-Nerode Theorem): Let L ~ ~* be a regularlanguage. Then there is a deterministic finite automaton with precisely as manystates as there are equivalence classes in ~ L that accepts L.Proof: As before, we denote the equivalence class of string x E ~* in theequivalence relation ~L by [x]. Given L, we shall construct a deterministicfinite automaton (the standard automaton for L) M = (K,~, 6, s, F) suchthat L = L( M). M is defined as follows:K = {[x] : x E ~*}, the set of equivalence classes under ~ L.= [e], the equivalence class of e under ~L.F = {[x] : x E L}.Finally, for any [x] E K and any a E ~, define 6([x], a) = [xa].SHow do we know that the set K is finite, that is, that ~L has finitely manyequivalent classes? L is regular, and so it is surely accepted by some deterministicfinite automaton M'.
By the previous theorem, ~M' is a refinement of ~L, andso there are fewer equivalence classes in L than there are equivalence classes of~ M' -that is to say, states of M'. Hence K is a finite set. We also have toargue that 6 is well defined, that is, 6 ([x], a) = [x a] is independent of the stringx E [x]. But this is easy to see, because x ~ L x' if and only if xa ~ L x' a.We next show that L = L(M). First we show that for all x, y E ~*, wehave([x], y) f-M ([xy], e).(1)972.5: State MinimizationThis is established by induction on IYI.
It is trivial when y = e, and, if itholds for all y's of length up to nand y = yla, then by induction ([x],yla) f-M([xyl], a) f-M ([xy], e).Now (1) completes the proof: For all x E ~., we have that x E L(M) ifand only if ([e],x) f- *(q,e)forsome q E F, which is by (1) the same as saying[x] E F, or, by the definition of F, [x E L] .•Example 2.5.1 (continued): The standard automaton corresponding to thelanguage L = (ab U ba)· accepted by the six-state deterministic finite automatonin Figure 2-20 is shown in Figure 2-21. It has four states. Naturally, it is thesmallest deterministic finite automaton that accepts this language.O{q4, q6} o---~o {q5}[b][aa]Figure 2-21Incidentally, Theorems 2.5.2 immediately imply the following characterization of regular languages, sometimes itself called the Myhill-Nerode Theorem:Corollary: A language L is regular if and only if ~L has finitely many equivalence classes.Proof: If L is regular, then L = L(M) for some deterministic finite automatonM, and M has at least as many states as ~L has equivalence classes.
Hencethere are finitely many equivalence classes in ~L'Conversely, if ~ L has finitely many equivalence classes, then the standarddeterministic finite automaton ML (recall the proof of Theorem 2.5.2) acceptsL .•Example 2.5.2: The corollary just proved is an interesting alternative wayof specifying what it means for a language L to be regular. Furthermore, itprovides another useful way for proving that a language is not regular -besidesthe Pumping Theorem.For example, here is an alternative proof that L = {anb n : n ~ I} is notregular: No two strings a i and a j , with i f. j, are equivalent under ~L, simplyChapter 2: FINITE AUTOMATA98because there is a string (namely, bi ) which, when affixed a i gives a string inL, but when affixed to a j produces a string not in L. Hence ~ L has infinitelymany equivalence classes [e], [aJ, [aaJ, [aaaJ, ...
, and hence by the corollary L isnot regular.OFor any regular language L the automaton constructed in the proof of Theorem 2.5.2 is the deterministic automaton with the fewest states that acceptsL ~an object of obvious practical importance. Unfortunately, this automatonis defined in terms of the equivalence classes of ~ I" and it is not clear howthese equivalence classes can be identified for any given regular language L -especially if L is given in terms of a deterministic automaton M. We shall nextdevelop an algorithm for constructing this minimal automaton, starting fromany deterministic finite automaton M such that L = L(M).Let M = (K,~, 8, s, F) be a deterministic finite automaton.
Define a relation AM ~ K x ~*, as follows: (q, w) E AM if and only if (q, w) f-M (1, e) forsome f E F; that is, (q, w) E AM means that w drives M from q to an acceptingstate. Let us call two states q, p E K equivalent, denoted q = p, if the followingholds for all z E ~': (q,z) E AM if and only if (p,z) E AM. Thus, if two statesare equivalent, then the corresponding equivalence classes of '" M are subsets ofthe same equivalence class of ~L.
In other words, the equivalence classes ofare precisely those sets of states of M that must be clumped together in orderto obtain the standard automaton of L(M)t.We shall develop an algorithm for computing the equivalence classes ofOur algorithm will compute as the limit of a sequence of equivalence relations=0, =1 , =2, ... , defined next. For two states q, p E K, q =n P if the following istrue: (q, z) E AM if and only if (p, z) E AM for all strings z such that Izl ~ n.In other words, =n is a coarser equivalence relation thanonly requiring thatstates q and p behave the same with respect to acceptance when driven by stringsof length up to n.Obviously, each equivalence relation in =0, =1, =2, ...
is a refinement of theprevious one. Also, q =0 p holds if q and p are either both accepting, or bothnon-accepting. That is, there are precisely two equivalence classes of =0: Fand K - F (assuming they are both nonempty). ILremains to show how =n+ldepends on =n. Here is how:==.==,quotient of '" M by ~ L. If '" is a refinement ofthen the quotient of ~ by"', denoted ~ I "', is an equivalence relation onthe equivalence classes of "'. Two classes [x] and [y] of'" are related by ~ I '" ifx ~ y ~it is easy to see that this is indeed an equivalence relation. To return toour geographic example, the quotient of the "same state" relation by the "samecounty" relation relates any two counties (each with at least one city in it) thathappen to be in the same state.t The relation == can be called the~,992.5: State MinimizationLemma 2.5.1: For any two states q,p E K and any integer n ~ 1, q =n P ifand only if (a) q =n-l p, and (b) for all a E ~, 8(q,a) =n-l 8(p, a).Proof: By definition of =n, q =n P if and only if q =n-l p, and furthermoreany string w = av of length precisely n drives either both q and p to acceptance,or both to nonacceptance.
However, the second condition is the same as sayingthat 8(q, a) =n-l 8(p, a) for any a E ~ . •Lemma 2.5.1 suggests that we can computeautomaton for L, by the following algorithm:=, and from this the standardInitially the equivalence classes of =0 are F and K - F;repeat for n := 1,2, ...compute the equivalence classes of =n from those =n-luntil =n is the same as =n-l.Each iteration can be carried out by applying Lemma 2.5.1: For each pair ofstates of M we test whether the conditions of the lemma hold, and if so we putthe two states in the same equivalence class ofBut how do we know thatthis is an algorithm, that the iteration will eventually terminate? The answeris simple: For each iteration at which the termination condition is not satisfied,=n ]i=n-l, =n is a proper refinement of =n-l, and thus has at least one moreequivalence class than =n-l.
Since the number of equivalence classes cannotbecome more than the number of states of M, the algorithm will terminate afterat most IKI - 1 iterations.When the algorithm terminates, say at the nth iteration and having computed =n==n-l, then the lemma implies that =n==n+l ==n+2==n+3= ...Hence the relation computed is precisely =. In the next section we give a morecareful analysis of the complexity of this important algorithm, establishing thatit is polynomial.=n-Example 2.5.3: Let us apply the state minimization algorithm to the deterministic finite automaton M in Figure 2-20 (of course, by the previous example,we know what to expect: the four-state standard automaton for L(M)). At thevarious iterations we shall have these equivalence classes of the correspondingInitially, the equivalence classes of =0 are {ql, q3} and {q2, q4, q5, q6}.After the first iteration, the classes ofare {ql,q3}, {qd, {q4,q6}, and{q5}.