Cooper_Engineering_a_Compiler(Second Edition) (Rice), страница 19

PDF-файл Cooper_Engineering_a_Compiler(Second Edition) (Rice), страница 19, который располагается в категории "разное" в предмете "конструирование компиляторов" изседьмого семестра. Cooper_Engineering_a_Compiler(Second Edition) (Rice), страница 19 - СтудИзба 2019-09-18 СтудИзба
Rice 1872

Описание файла

Файл "Cooper_Engineering_a_Compiler(Second Edition)" внутри архива находится в следующих папках: Rice, Купер и Торчсон - перевод. PDF-файл из архива "Rice", который расположен в категории "разное". Всё это находится в предмете "конструирование компиляторов" из седьмого семестра, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .

Просмотр PDF-файла онлайн

Текст 19 страницы из PDF

Those states are equivalent since they can be reached withoutconsuming input.To construct q0 from n0 , the algorithm computes -closure(n0 ). It takes,as input, a set S of nfa states. It returns a set of nfa states constructedfrom S as follows: -closure examines each state si ∈ S and adds to S anystate reachable by following one or more -transitions from si . If S is theset of states reachable from n0 by following paths labelled with abc, then-closure(S) is the set of states reachable from n0 by following pathslabelled abc ∗ .

Initially, Q has only one member, q0 and the WorkListcontains q0 .The algorithm proceeds by removing a set q from the worklist. Each q represents a valid configuration of the original nfa. The algorithm constructs,for each character c in the alphabet 6, the configuration that the nfa wouldreach if it read c while in configuration q.

This computation uses a functionDelta(q, c) that applies the nfa’s transition function to each element of q.It returns ∪s∈qi δ N (s,c).The while loop repeatedly removes a configuration q from the worklist anduses Delta to compute its potential transitions. It augments this computedconfiguration with any states reachable by following -transitions, and addsany new configurations generated in this way to both Q and the worklist.When it discovers a new configuration t reachable from q on character c, thealgorithm records that transition in the table T. The inner loop, which iteratesover the alphabet for each configuration, performs an exhaustive search.Notice that Q grows monotonically. The while loop adds sets to Q but neverremoves them.

Since the number of configurations of the nfa is bounded and50 CHAPTER 2 Scannerseach configuration only appears once on the worklist, the while loop musthalt. When it halts, Q contains all of the valid configurations of the nfa andT holds all of the transitions between them.Q can become large—as large as |2 N | distinct states. The amount of nonde-terminism found in the nfa determines how much state expansion occurs.Recall, however, that the result is a dfa that makes exactly one transition perinput character, independent of the number of states in the dfa. Thus, anyexpansion introduced by the subset construction does not affect the runningtime of the dfa.From Q to DWhen the subset construction halts, it has constructed a model of the desireddfa, one that simulates the original nfa.

Building the dfa from Q and T isstraightforward. Each qi ∈ Q needs a state di ∈ D to represent it. If qi contains an accepting state of the nfa, then di is an accepting state of the dfa.We can construct the transition function, δ D , directly from T by observingthe mapping from qi to di . Finally, the state constructed from q0 becomesd0 , the initial state of the dfa.ExampleConsider the nfa built for a(b|c)∗ in Section 2.4.2 and shown in Figure 2.7a,with its states renumbered. The table in Figure 2.7b sketches the steps thatthe subset construction follows.

The first column shows the name of theset in Q being processed in a given iteration of the while loop. The secondcolumn shows the name of the corresponding state in the new dfa. The thirdcolumn shows the set of nfa states contained in the current set from Q. Thefinal three columns show results of computing the -closure of Delta onthe state for each character in 6.The algorithm takes the following steps:1. The initialization sets q0 to -closure({n0 }), which is just n0 .

The firstiteration computes -closure(Delta(q0 ,a)), which contains six nfastates, and -closure(Delta(q0 ,b)) and -closure(Delta(q0 ,c)),which are empty.2. The second iteration of the while loop examines q1 . It produces twoconfigurations and names them q2 and q3 .3. The third iteration of the while loop examines q2 .

It constructs twoconfigurations, which are identical to q2 and q3 .4. The fourth iteration of the while loop examines q3 . Like the thirditeration, it reconstructs q2 and q3 .Figure 2.7c shows the resulting dfa; the states correspond to the dfa statesfrom the table and the transitions are given by the Delta operations that2.4 From Regular Expression to Scanner 51bn4n0an2n1n5n3n8cn6n9n7(a) NFA for “a(b | c)* ” (With States Renumbered)-closure(Delta(q,*))SetNameDFAStatesNFAStatesaq0d0q1d1q2d2q3d3n0n1 , n2 , n3 ,n4 , n6 , n9n5 , n8 , n9 ,n3 , n4 , n6n7 , n8 , n9 ,n3 , n4 , n6n1 , n2 , n3 ,n4 , n6 , n9bc– none –– none –n 7 , n8 , n9 ,n3 , n4 , n6– none –n5 , n8 , n9 ,n3 , n4 , n6– none –q2q3– none –q2q3(b) Iterations of the Subset Constructiond0ad2bcd1cbbd3c(a) Resulting DFAn FIGURE 2.7 Applying the Subset Construction to the NFA from Figure 2.5.generate those states.

Since the sets q1 , q2 and q3 all contain n9 (theaccepting state of the nfa), all three become accepting states in the dfa.Fixed-Point ComputationsThe subset construction is an example of a fixed-point computation, a particular style of computation that arises regularly in computer science. These52 CHAPTER 2 ScannersMonotone functiona function f on domain D is monotone if,∀ x, y∈ D, x ≤ y⇒f (x) ≤ f (y)computations are characterized by the iterated application of a monotonefunction to some collection of sets drawn from a domain whose structure isknown. These computations terminate when they reach a state where furtheriteration produces the same answer—a “fixed point” in the space of successive iterates.

Fixed-point computations play an important and recurring rolein compiler construction.Termination arguments for fixed-point algorithms usually depend on knownNproperties of the domain. For the subset construction, the domain D is 22 ,NNsince Q = {q0 , q1 , q2 , . . . , qk } where each qi ∈ 2 . Since N is finite, 2 andN22 are also finite. The while loop adds elements to Q; it cannot removean element from Q. We can view the while loop as a monotone increasingfunction f, which means that for a set x, f (x) ≥ x. (The comparison operator≥ is ⊇.) Since Q can have at most |2 N | distinct elements, the while loop caniterate at most |2 N | times. It may, of course, reach a fixed point and halt morequickly than that.Computing -closure OfflineAn implementation of the subset construction could compute -closure()by following paths in the transition graph of the nfa as needed.

Figure 2.8shows another approach: an offline algorithm that computes -closure( {n})for each state n in the transition graph. The algorithm is another example ofa fixed-point computation.For the purposes of this algorithm, consider the transition diagram of thenfa as a graph, with nodes and edges. The algorithm begins by creating aset E for each node in the graph. For a node n, E(n) will hold the currentfor each state n ∈ N doE(n) ← {n};end;WorkList ← N;while (WorkList 6= ∅) doremove n from WorkList;St ← {n} ∪E( p);n→p ∈ δNif t 6= E(n)then begin;E(n) ← t;WorkList ← WorkList ∪ {m | m →n ∈ δN };end;end;n FIGURE 2.8 An Offline Algorithm for -closure.2.4 From Regular Expression to Scanner 53approximation to -closure(n). Initially, the algorithm sets E(n) to { n },for each node n, and places each node on the worklist.Each iteration of the while loop removes a node n from the worklist, findsall of the -transitions that leave n, and adds their targets to E(n).

If thatcomputation changes E(n), it places n’s predecessors along -transitions onthe worklist. (If n is in the -closure of its predecessor, adding nodes to E(n)must also add them to the predecessor’s set.) This process halts when theworklist becomes empty.Using a bit-vector set for the worklist can ensurethat the algorithm does not have duplicatecopies of a node’s name on the worklist.See Appendix B.2.The termination argument for this algorithm is more complex than thatfor the algorithm in Figure 2.6. The algorithm halts when the worklist isempty.

Initially, the worklist contains every node in the graph. Each iterationremoves a node from the worklist; it may also add one or more nodes to theworklist.The algorithm only adds a node to the worklist if the E set of its successorchanges. The E(n) sets increase monotonically. For a node x, its successor yalong an -transition can place x on the worklist at most |E( y)| ≤ |N | times,in the worst case. If x has multiple successors yi along -transitions, each ofthem can place x on the worklist |E( yi )| ≤ |N | times. Taken over the entiregraph, the worst case behavior would place nodes on the worklist k · |N |times, where k is the number of -transitions in the graph.

Thus, the worklisteventually becomes empty and the computation halts.2.4.4 DFA to Minimal DFA: Hopcroft’s AlgorithmAs a final refinement to the re→dfa conversion, we can add an algorithmto minimize the number of states in the dfa. The dfa that emerges fromthe subset construction can have a large set of states. While this does notincrease the time needed to scan a string, it does increase the size of therecognizer in memory. On modern computers, the speed of memory accessesoften governs the speed of computation.

A smaller recognizer may fit betterinto the processor’s cache memory.To minimize the number of states in a dfa, (D, 6, δ, d0 , D A ), we need atechnique to detect when two states are equivalent—that is, when they produce the same behavior on any input string. The algorithm in Figure 2.9finds equivalence classes of dfa states based on their behavior. From thoseequivalence classes, we can construct a minimal dfa.The algorithm constructs a set partition, P = { p1 , p2 , p3 , . .

. pm }, of the dfastates. The particular partition, P, that it constructs groups together dfastates by their behavior. Two dfa states, di , dj ∈ ps , have the same behavior inccresponse to all input characters. That is, if di → dx , dj → dy , and di , dj ∈ ps ,Set partitionA set partition of S is a collection ofnonempty, disjoint subsets of S whoseunion is exactly S.54 CHAPTER 2 ScannersT ← {DA ,P ← ∅{ D − DA } };Split(S) {for each c ∈ 6 doif c splits S into s1 and s2then return {s1 , s2 };while (P 6= T) doP ← T;end;T ← ∅;for each set p ∈ P doT ← T ∪ Split(p);end;return S;}end;n FIGURE 2.9 DFA Minimization Algorithm.then dx and dy must be in the same set pt .

Свежие статьи
Популярно сейчас