Главная » Все файлы » Просмотр файлов из архивов » PDF-файлы » Cooper_Engineering_a_Compiler(Second Edition)

Cooper_Engineering_a_Compiler(Second Edition) (Rice), страница 37

Rice 1869

Описание файла

Файл "Cooper_Engineering_a_Compiler(Second Edition)" внутри архива находится в следующих папках: Rice, Купер и Торчсон - перевод. PDF-файл из архива "Rice", который расположен в категории "разное". Всё это находится в предмете "конструирование компиляторов" из седьмого семестра, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .

Просмотр PDF-файла онлайн

Текст 37 страницы из PDF

In cc6 , another ( will cause the parser to stack anotherstate 6 to represent the need for a matching ).goto(cc6 ,)) is cc10 .ncc10 = [Pair → ( ) •, )]oThis set contains one item, which specifies a reduction to Pair.The fourth iteration of the while loop tries to derive new sets from cc8 , cc9 ,and cc10 . Only one combination creates a nonempty set.goto(cc9 ,)) is cc11 .ncc11 = [Pair → ( Pair ) •, )]oState 11 calls for a reduction by Pair → ( Pair ).The final iteration of the while loop tries to derive new sets from cc11 .It finds only empty sets, so the construction halts with 12 sets, cc0through cc11 .Filling in the TablesGiven the canonical collection of sets of lr(1) items for a grammar, theparser generator can fill in the Action and Goto tables by iterating throughCC and examining the items in each ccj ∈ CC.

Each ccj becomes a parserstate. Its items generate the nonempty elements of one row of Action; thecorresponding transitions recorded during construction of CC specify thenonempty elements of Goto. Three cases generate entries in the Actiontable:134 CHAPTER 3 Parsers1. An item of the form [A→β•cγ ,a] indicates that encountering theterminal symbol c would be a valid next step toward discovering thenonterminal A. Thus, it generates a shift item on c in the current state.The next state for the recognizer is the state generated by computinggoto on the current state with the terminal c.

Either β or γ can be .2. An item of the form [A→β•, a] indicates that the parser has recognizeda β, and if the lookahead is a, then the item is a handle. Thus, itgenerates a reduce item for the production A→β on a in the currentstate.3. An item of the form [S 0 → S•, eof] where S 0 is the goal symbol indicatesthe accepting state for the parser; the parser has recognized an inputstream that reduces to the goal symbol and the lookahead symbol is eof.This item generates an accept action on eof in the current state.Figure 3.24 makes this concrete.

For an lr(1) grammar, it should uniquelydefine the nonerror entries in the Action and Goto tables.The table-filling actions can be integrated intothe construction of CC.Notice that the table-filling algorithm essentially ignores items where the •precedes a nonterminal symbol. Shift actions are generated when • precedesa terminal. Reduce and accept actions are generated when • is at the right endof the production. What if cci contains an item [A→β • γ δ, a], where γ ∈N T ? While this item does not generate any table entries itself, its presencein the set forces the closure procedure to include items that generate tableentries.

When closure finds a • that immediately precedes a nonterminalsymbol γ , it adds productions that have γ as their left-hand side, with a •preceding their right-hand sides. This process instantiates first(γ ) in cci .The closure procedure will find each x ∈ first(γ ) and add the items intocci to generate shift items for each x.for each cci ∈ CCfor each item I ∈ cciif I is [A→β • cγ ,a] and goto(cci ,c) = ccj thenAction[i ,c] ← ‘‘shift j’’else if I is [A→β•, a] thenAction[i ,a] ← ‘‘reduce A→β’’else if I is [S 0 → S•, eof] thenAction[i , eof ] ← ‘‘accept’’for each n ∈ N Tif goto(cci ,n) = ccj thenGoto[i ,n] ← jn FIGURE 3.24 LR(1) Table-Filling Algorithm.3.4 Bottom-Up Parsing 135For the parentheses grammar, the construction produces the Action andGoto tables shown in Figure 3.16b on page 120.

As we saw, combining thetables with the skeleton parser in Figure 3.15 creates a functional parser forthe language.In practice, an lr(1) parser generator must produce other tables needed bythe skeleton parser. For example, when the skeleton parser in Figure 3.15 onpage 119 reduces by A → β, it pops “2 × | β |” symbols from the stack andpushes A onto the stack. The table generator must produce data structuresthat map a production from the reduce entry in the Action table, say A → β,into both | β | and A.

Other tables, such as a map from the integer representinga grammar symbol into its textual name, are needed for debugging and fordiagnostic messages.Handle Finding, Revisitedlr(1) parsers derive their efficiency from a fast handle-finding mechanismembedded in the Action and Goto tables. The canonical collection, CC, represents a handle-finding dfa for the grammar. Figure 3.25 shows the dfa forour example, the parentheses grammar.How can the lr(1) parser use a dfa to find the handles, when we knowthat the language of parentheses is not a regular language? The lr(1) parserrelies on a simple observation: the set of handles is finite.

The set of handlesis precisely the set of complete lr(1) items—those with the placeholder •at the right end of the item’s production. Any language with a finite set ofsentences can be recognized by a dfa. Since the number of productions andthe number of lookahead symbols are both finite, the number of completeitems is finite, and the language of handles is a regular language.When the lr(1) parser executes, it interleaves two kinds of actions: shiftsand reduces. The shift actions simulate steps in the handle-finding dfa. The) Pair - cc4- cc8cc1cc5 @ (List Pair (@ ?(R@Pair- cc3 (- cc6- cc9 )- cc11cc0@@@)@)@Pair @ R@R@R@cc2cc7cc10 n FIGURE 3.25 Handle-Finding DFA for the Parentheses Grammar.The LR(1) parser makes the handle’s positionimplicit, at stacktop.

This design decisiondrastically reduces the number of possiblehandles.136 CHAPTER 3 Parsersparser performs one shift action per word in the input stream. When thehandle-finding dfa reaches a final state, the lr(1) parser performs a reduceaction. The reduce actions reset the state of the handle-finding dfa to reflectthe fact that the parser has recognized a handle and replaced it with a nonterminal. To accomplish this, the parser pops the handle and its state offthe stack, revealing an older state.

The parser uses that older state, the lookahead symbol, and the Goto table to discover the state in the dfa from whichhandle-finding should continue.The reduce actions tie together successive handle-finding phases. The reduction uses left context—the state revealed by the reduction summarizes theprior history of the parse—to restart the handle-finding dfa in a state thatreflects the nonterminal that the parser just recognized.

For example, in theparse of “( ( ) ) ( )”, the parser stacked an instance of state 3 for every( that it encounters. These stacked states allow the algorithm to match upthe opening and closing parentheses.Notice that the handle-finding dfa has transitions on both terminal and nonterminal symbols. The parser traverses the nonterminal edges only on areduce action. Each of these transitions, shown in gray in Figure 3.25, corresponds to a valid entry in the Goto table.

The combined effect of the terminaland nonterminal actions is to invoke the dfa recursively each time it mustrecognize a nonterminal.3.4.3 Errors in the Table ConstructionAs a second example of the lr(1) table construction, consider the ambiguous grammar for the classic if-then-else construct. Abstracting awaythe details of the controlling expression and all other statements (by treating them as terminal symbols) produces the following four-productiongrammar:1 Goal2 Stmt34→→||Stmtif expr then Stmtif expr then Stmt else StmtassignIt has two nonterminal symbols, Goal and Stmt, and six terminal symbols,if, expr, then, else, assign, and the implicit eof.The construction begins by initializing cc0 to the item [Goal →• Stmt, eof ] and taking its closure to produce the first set.3.4 Bottom-Up Parsing 1370123456789ItemGoalStmtifexprthenelseassigneofcc0cc1cc2cc3cc4cc5cc6cc7cc8cc9cc10cc11cc12cc13cc14cc15∅cc1cc2∅∅∅cc3∅∅∅∅∅∅∅∅∅∅∅cc4∅∅∅∅∅∅∅∅∅∅∅∅∅∅∅∅∅cc5∅∅∅∅cc6cc7∅∅∅cc8∅∅∅∅∅∅∅∅∅∅∅cc9∅∅∅∅∅∅∅∅∅∅∅∅cc11cc2cc12∅∅cc3∅∅∅∅∅∅∅∅∅cc10∅∅∅∅cc13cc7∅∅∅∅∅∅∅cc8∅∅∅∅∅∅∅∅∅cc14∅∅cc15cc7∅∅∅cc8∅∅∅∅∅∅∅∅∅n FIGURE 3.26 Trace of the LR(1) Construction on the If-Then-Else Grammar.(cc0 =[Goal → • Stmt, eof ][Stmt → • assign, eof ][Stmt → • if expr then Stmt, eof ][Stmt → • if expr then Stmt else Stmt, eof ]From this set, the construction begins deriving the remaining members ofthe canonical collection of sets of lr(1) items.Figure 3.26 shows the progress of the construction.

The first iteration examines the transitions out of cc0 for each grammar symbol. It produces threenew sets for the canonical collection from cc0 : cc1 for Stmt, cc2 for if, andcc3 for assign. These sets are:ncc1 = [Goal → Stmt •, eof ]o([Stmt → if • expr then Stmt, eof ],[Stmt → if • expr then Stmt else Stmt, eof ]nocc3 = [Stmt → assign •, eof ])cc2 =The second iteration examines transitions out of these three new sets.Only one combination produces a new set, looking at cc2 with the symbolexpr.(cc4 =[Stmt → if expr • then Stmt, eof],[Stmt → if expr • then Stmt else Stmt, eof]))138 CHAPTER 3 ParsersThe next iteration computes transitions from cc4 ; it creates cc5 asgoto(cc4 ,then).[Stmt → if expr then • Stmt, eof ],[Stmt → if expr then • Stmt else Stmt, eof ],cc5 = [Stmt → • if expr then Stmt, {eof, else}],[Stmt → • assign, {eof, else}],[Stmt → • if expr then Stmt else Stmt, {eof, else}]The fourth iteration examines transitions out of cc5 .

Свежие статьи
Популярно сейчас