Cooper_Engineering_a_Compiler(Second Edition) (Rice), page 20

PDF file Cooper_Engineering_a_Compiler(Second Edition) (Rice), page 20, filed under the category "miscellaneous" in the subject "compiler construction" from the seventh semester. СтудИзба, 2019-09-18

File description

The file "Cooper_Engineering_a_Compiler(Second Edition)" is located in the following folders inside the archive: Rice, Купер и Торчсон - перевод. The PDF file comes from the archive "Rice", which is filed under the category "miscellaneous". All of this belongs to the subject "compiler construction" from the seventh semester and can be found in the file archive of Lomonosov Moscow State University (MSU). Although this archive is directly associated with MSU, it can also be found in other sections.


Text of page 20 of the PDF

This property holds for every set p_s ∈ P, for every pair of states d_i, d_j ∈ p_s, and for every input character, c. Thus, the states in p_s have the same behavior with respect to input characters and the remaining sets in P.

To minimize a DFA, each set p_s ∈ P should be as large as possible, within the constraint of behavioral equivalence. To construct such a partition, the algorithm begins with an initial rough partition that obeys all the properties except behavioral equivalence.

It then iteratively refines that partition to enforce behavioral equivalence. The initial partition contains two sets, p_0 = D_A and p_1 = {D − D_A}. This separation ensures that no set in the final partition contains both accepting and nonaccepting states, since the algorithm never combines two partitions.

The algorithm refines the initial partition by repeatedly examining each p_s ∈ P to look for states in p_s that have different behavior for some input string. Clearly, it cannot trace the behavior of the DFA on every string. It can, however, simulate the behavior of a given state in response to a single input character. It uses a simple condition for refining the partition: a symbol c ∈ Σ must produce the same behavior for every state d_i ∈ p_s. If it does not, the algorithm splits p_s around c.

This splitting action is the key to understanding the algorithm.
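The initial partition described above can be sketched in a few lines of Python. This is only an illustration: the function name `initial_partition` and the representation of states as strings are assumptions, not part of the text. The state names are those of the "fee | fie" DFA used in the examples of this section.

```python
def initial_partition(states, accepting):
    """Return the initial partition [D_A, D - D_A], omitting an empty set."""
    d_a = frozenset(accepting)
    rest = frozenset(states) - d_a
    # If every state (or no state) accepts, one of the two sets is empty.
    return [block for block in (d_a, rest) if block]


# States of the "fee | fie" DFA; s3 and s5 are the accepting states.
states = {"s0", "s1", "s2", "s3", "s4", "s5"}
partition = initial_partition(states, {"s3", "s5"})
print(partition)
```

Using `frozenset` keeps each set hashable, which is convenient if the partition itself is later stored in a set.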

For d_i and d_j to remain together in p_s, they must take equivalent transitions on each character c ∈ Σ. That is, ∀ c ∈ Σ, d_i -c-> d_x and d_j -c-> d_y, where d_x, d_y ∈ p_t. Any state d_k ∈ p_s where d_k -c-> d_z, d_z ∉ p_t, cannot remain in the same partition as d_i and d_j. Similarly, if d_i and d_j have transitions on c and d_k does not, it cannot remain in the same partition as d_i and d_j.

Figure 2.10 makes this concrete. The states in p_1 = {d_i, d_j, d_k} are equivalent if and only if their transitions, ∀ c ∈ Σ, take them to states that are, themselves, in an equivalence class. As shown, each state has a transition on a: d_i -a-> d_x, d_j -a-> d_y, and d_k -a-> d_z. If d_x, d_y, and d_z are all in the same set in the current partition, as shown on the left, then d_i, d_j, and d_k should remain together and a does not split p_1.

[Running header: 2.4 From Regular Expression to Scanner]

[Figure 2.10: Splitting a Partition around a. Panels: (a) a Does Not Split p_1; (b) a Splits p_1; (c) Partitions After Split On a.]

On the other hand, if d_x, d_y, and d_z are in two or more different sets, then a splits p_1.

As shown in the center drawing of Figure 2.10, d_x ∈ p_2 while d_y and d_z ∈ p_3, so the algorithm must split p_1 and construct two new sets p_4 = {d_i} and p_5 = {d_j, d_k} to reflect the potential for different outcomes with strings that begin with the symbol a. The result is shown on the right side of Figure 2.10. The same split would result if state d_i had no transition on a.

To refine a partition P, the algorithm examines each p ∈ P and each c ∈ Σ. If c splits p, the algorithm constructs two new sets from p and adds them to T.
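A minimal sketch of the split step, using the situation of Figure 2.10 as data. The function name `split`, the dictionary encoding of transitions, and the choice of a representative state are assumptions made for illustration; the text does not prescribe them. A missing dictionary key stands for "no transition on that character".

```python
def split(block, c, partition, delta):
    """Split `block` around character c; return [block] if c does not split it.

    partition: list of frozensets covering all states
    delta:     dict mapping (state, char) -> next state
    """
    def target(state):
        # Index of the set that the transition on c leads to, or None.
        nxt = delta.get((state, c))
        if nxt is None:
            return None
        return next(i for i, p in enumerate(partition) if nxt in p)

    rep = min(block)  # any member works; min() just makes the choice repeatable
    same = frozenset(s for s in block if target(s) == target(rep))
    rest = block - same
    return [same, rest] if rest else [block]


p1 = frozenset({"di", "dj", "dk"})
p2 = frozenset({"dx"})
p3 = frozenset({"dy", "dz"})
delta = {("di", "a"): "dx", ("dj", "a"): "dy", ("dk", "a"): "dz"}
print(split(p1, "a", [p1, p2, p3], delta))
```

The function produces one set that is consistent on c and lumps the remaining states into a second set, which may itself need a further split later.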

(It could split p into more than two sets, all having internally consistent behavior on c. However, creating one consistent state and lumping the rest of p into another state will suffice. If the latter state is inconsistent in its behavior on c, the algorithm will split it in a later iteration.) The algorithm repeats this process until it finds a partition where it can split no sets.

To construct the new DFA from the final partition P, we can create a single state to represent each set p ∈ P and add the appropriate transitions between these new representative states. For the state representing p_l, we add a transition to the state representing p_m on c if some d_j ∈ p_l has a transition on c to some d_k ∈ p_m.
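The refinement loop and the construction of the new DFA can be put together as follows. This is a simplified sketch, not the book's pseudocode: it updates a single partition in place instead of building a separate set T, and the function names, the dictionary encoding of transitions, and the numbering of new states by partition index are all assumptions. The sample transitions for the "fee | fie" DFA are an assumption as well, since the extracted text does not spell them out. Intermediate splits may occur in a different order than in the book, but the loop stops at the same kind of partition, one in which no character splits any set.

```python
def minimize(states, alphabet, delta, start, accepting):
    """Partition-refinement DFA minimization (simplified sketch).

    delta maps (state, char) -> state; a missing key means "no transition".
    Returns (partition, new_delta, new_start, new_accepting); the new states
    are indices into the final partition.
    """
    accepting = frozenset(accepting)
    partition = [b for b in (accepting, frozenset(states) - accepting) if b]

    def block_of(state):
        # Index of the set in the current partition that contains `state`.
        return next(i for i, b in enumerate(partition) if state in b)

    def target(state, c):
        nxt = delta.get((state, c))
        return None if nxt is None else block_of(nxt)

    def find_split():
        # First set that some character splits, with its two halves.
        for block in partition:
            rep = min(block)
            for c in sorted(alphabet):
                same = frozenset(
                    s for s in block if target(s, c) == target(rep, c))
                if same != block:
                    return block, same, block - same
        return None

    # Repeat until no set in the partition can be split (the fixed point).
    while (found := find_split()) is not None:
        block, same, rest = found
        i = partition.index(block)
        partition[i:i + 1] = [same, rest]  # replace the set by its two halves

    # One new state per set; copy each transition onto the representatives.
    new_delta = {(block_of(s), c): block_of(t) for (s, c), t in delta.items()}
    new_accepting = {block_of(s) for s in accepting}
    return partition, new_delta, block_of(start), new_accepting


# Assumed transitions for a DFA recognizing "fee | fie".
delta = {("s0", "f"): "s1", ("s1", "e"): "s2", ("s1", "i"): "s4",
         ("s2", "e"): "s3", ("s4", "e"): "s5"}
states = {"s0", "s1", "s2", "s3", "s4", "s5"}
partition, new_delta, new_start, new_accepting = minimize(
    states, "efi", delta, "s0", {"s3", "s5"})
print(partition)
print(new_delta, new_start, new_accepting)
```

Rescanning the whole partition after every split is simple but not efficient; it is chosen here only to keep the sketch short.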

From the construction, we know that if d_j has such a transition, so does every other state in p_l; if this were not the case, the algorithm would have split p_l around c. The resulting DFA is minimal; the proof is beyond our scope.

Examples

Consider a DFA that recognizes the language fee | fie, shown in Figure 2.11a.

[Running header: CHAPTER 2 Scanners]

[Figure 2.11: Applying the DFA Minimization Algorithm. (a) DFA for "fee | fie", with states s_0 through s_5; (b) Critical Steps in Minimizing the DFA; (c) The Minimal DFA (States Renumbered), with states s_0 through s_3.]

Figure 2.11(b), Critical Steps in Minimizing the DFA:

Step  Current Partition                           Examines Set           Examines Char  Action
0     { {s_3, s_5}, {s_0, s_1, s_2, s_4} }        -                      -              -
1     { {s_3, s_5}, {s_0, s_1, s_2, s_4} }        {s_3, s_5}             all            none
2     { {s_3, s_5}, {s_0, s_1, s_2, s_4} }        {s_0, s_1, s_2, s_4}   e              split {s_2, s_4}
3     { {s_3, s_5}, {s_0, s_1}, {s_2, s_4} }      {s_0, s_1}             f              split {s_1}
4     { {s_3, s_5}, {s_0}, {s_1}, {s_2, s_4} }    all                    all            none

By inspection, we can see that states s_3 and s_5 serve the same purpose. Both are accepting states entered only by a transition on the letter e.

Neither has a transition that leaves the state. We would expect the DFA minimization algorithm to discover this fact and replace them with a single state.

Figure 2.11b shows the significant steps that occur in minimizing this DFA. The initial partition, shown as step 0, separates accepting states from nonaccepting states.

Assuming that the while loop in the algorithm iterates over the sets of P in order, and over the characters in Σ = {e, f, i} in order, then it first examines the set {s_3, s_5}. Since neither state has an exiting transition, the set does not split on any character. In the second step, it examines {s_0, s_1, s_2, s_4}; on the character e, it splits {s_2, s_4} out of the set. In the third step, it examines {s_0, s_1} and splits it around the character f.

At that point, the partition is { {s_3, s_5}, {s_0}, {s_1}, {s_2, s_4} }. The algorithm makes one final pass over the sets in the partition, splits none of them, and terminates.

To construct the new DFA, we must build a state to represent each set in the final partition, add the appropriate transitions from the original DFA, and designate initial and accepting state(s). Figure 2.11c shows the result for this example.

[Figure 2.12: DFA for a (b | c)*. (a) Original DFA, with states d_0 through d_3; (b) Initial Partition, with sets p_1 and p_2.]

As a second example, consider the DFA for a (b | c)* produced by Thompson's construction and the subset construction, shown in Figure 2.12a. The first step of the minimization algorithm constructs an initial partition { {d_0}, {d_1, d_2, d_3} }, as shown on the right.

Since p_1 has only one state, it cannot be split. When the algorithm examines p_2, it finds no transitions on a from any state in p_2. For both b and c, each state has a transition back into p_2. Thus, no symbol in Σ splits p_2, and the final partition is { {d_0}, {d_1, d_2, d_3} }.

The resulting minimal DFA is shown in Figure 2.12b. Recall that this is the DFA that we suggested a human would derive. After minimization, the automatic techniques produce the same result.

This algorithm is another example of a fixed-point computation.
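The claim that no symbol splits either set in the a (b | c)* example can be checked mechanically, which also shows what reaching the fixed point looks like in practice. The sketch below is illustrative only: the helper name `splitting_symbols` is made up, and the transition table is an assumption about Figure 2.12a, since the figure itself did not survive extraction.

```python
def splitting_symbols(block, partition, delta, alphabet):
    """Return the symbols in `alphabet` that split `block`."""
    def target(state, c):
        # Index of the set reached on c, or None if there is no transition.
        nxt = delta.get((state, c))
        if nxt is None:
            return None
        return next(i for i, p in enumerate(partition) if nxt in p)

    # A symbol splits the set if its states disagree on where c leads.
    return [c for c in sorted(alphabet)
            if len({target(s, c) for s in block}) > 1]


p1 = frozenset({"d0"})
p2 = frozenset({"d1", "d2", "d3"})
partition = [p1, p2]
# Assumed transitions of the DFA for a (b | c)*.
delta = {
    ("d0", "a"): "d1",
    ("d1", "b"): "d2", ("d1", "c"): "d3",
    ("d2", "b"): "d2", ("d2", "c"): "d3",
    ("d3", "b"): "d2", ("d3", "c"): "d3",
}
for block in partition:
    print(sorted(block), splitting_symbols(block, partition, delta, "abc"))
```

An empty list for every set means the partition is stable, so a refinement loop would stop here.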

P is finite; at most, it can contain |D| elements. The while loop splits sets in P, but never combines them. Thus, |P| grows monotonically. The loop halts when some iteration splits no sets in P. The worst-case behavior occurs when each state in the DFA has different behavior; in that case, the while loop halts when P has a distinct set for each d_i ∈ D.

This occurs when the algorithm is applied to a minimal DFA.

2.4.5 Using a DFA as a Recognizer

Thus far, we have developed the mechanisms to construct a DFA implementation from a single RE. To be useful, a compiler's scanner must recognize all the syntactic categories that appear in the grammar for the source language. What we need, then, is a recognizer that can handle all the REs for the language's microsyntax. Given the REs for the various syntactic categories, r_1, r_2, r_3, ..., r_k, we can construct a single RE for the entire collection by forming (r_1 | r_2 | r_3 | ... | r_k).

[Margin figure: two-state DFA with states s_0 and s_1 and edge labels a and b,c.]

If we run this RE through the entire process, building an NFA, constructing a DFA to simulate the NFA, minimizing it, and turning that minimal DFA into executable code, the resulting scanner recognizes the next word that matches one of the r_i's. That is, when the compiler invokes it on some input, the scanner will examine characters one at a time and accept the string if it is in an accepting state when it exhausts the input. The scanner should return both the text of the string and its syntactic category, or part of speech.
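A recognizer of this kind can be sketched as a simple table-driven loop. The code below is an illustration under stated assumptions, not the book's implementation: the function name `recognize`, the dictionary that maps accepting states to categories, and the category name `giant_word` are all hypothetical. The transition table follows the minimal DFA for "fee | fie" from Figure 2.11c.

```python
def recognize(text, delta, start, categories):
    """Run the DFA over all of `text`; return (text, category) or None."""
    state = start
    for ch in text:
        state = delta.get((state, ch))
        if state is None:  # no transition on ch: the DFA rejects
            return None
    # Only accepting states have an entry in `categories`.
    category = categories.get(state)
    return None if category is None else (text, category)


# Minimal DFA for "fee | fie" (states renumbered as in Figure 2.11c).
delta = {("s0", "f"): "s1", ("s1", "e"): "s2",
         ("s1", "i"): "s2", ("s2", "e"): "s3"}
categories = {"s3": "giant_word"}  # hypothetical category name

for word in ["fee", "fie", "fe", "foe"]:
    print(word, recognize(word, delta, "s0", categories))
```

Keying the category on the accepting state is one way to let a DFA built from (r_1 | r_2 | ... | r_k) report which r_i matched, provided each accepting state corresponds to a single category.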
