Главная » Все файлы » Просмотр файлов из архивов » PDF-файлы » Cooper_Engineering_a_Compiler(Second Edition)

Cooper_Engineering_a_Compiler(Second Edition) (Rice), страница 34

Rice 1869

Описание файла

Файл "Cooper_Engineering_a_Compiler(Second Edition)" внутри архива находится в следующих папках: Rice, Купер и Торчсон - перевод. PDF-файл из архива "Rice", который расположен в категории "разное". Всё это находится в предмете "конструирование компиляторов" из седьмого семестра, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .

Просмотр PDF-файла онлайн

Текст 34 страницы из PDF

Theway that it builds the parse tree forces this order. The parser must add thenode for γi to the frontier before it can match γi .The scanner returns classified words in left-to-right order. To reconcile theleft-to-right order of the scanner with the reverse derivation constructed bythe scanner, a bottom-up parser looks for a rightmost derivation.

In a rightmost derivation, the leftmost leaf is considered last. Reversing that orderleads to the desired behavior: leftmost leaf first and rightmost leaf last.At each point, the parser operates on the frontier of the partially constructedparse tree; the current frontier is a prefix of the corresponding sentential formin the derivation. Because each sentential form occurs in a rightmost derivation, the unexamined suffix consists entirely of terminal symbols. When theparser needs more right context, it calls the scanner.With an unambiguous grammar, the rightmost derivation is unique. For alarge class of unambiguous grammars, γi−1 can be determined directly fromγi (the parse tree’s upper frontier) and a limited amount of lookahead in theinput stream.

In other words, given a frontier γi and a limited number ofadditional classified words, the parser can find the handle that takes γi toγi−1 . For such grammars, we can construct an efficient handle-finder, usinga technique called lr parsing. This section examines one particular flavor oflr parser, called a table-driven lr(1) parser.An lr(1) parser scans the input from left to right to build a rightmost derivation in reverse. At each step, it makes decisions based on the history of theparse and a lookahead of, at most, one symbol.

The name lr(1) derives118 CHAPTER 3 Parsersfrom these properties: left-to-right scan, reverse rightmost derivation, and1 symbol of lookahead.Informally, we will say that a language has the lr(1) property if it can beparsed in a single left-to-right scan, to build a reverse-rightmost derivation,using only one symbol of lookahead to determine parsing actions. In practice, the simplest test to determine if a grammar has the lr(1) property is tolet a parser generator attempt to build the lr(1) parser. If that process fails,the grammar lacks the lr(1) property.

The remainder of this section introduces lr(1) parsers and their operation. Section 3.4.2 presents an algorithmto build the tables that encode an lr(1) parser.3.4.1 The LR(1) Parsing AlgorithmThe critical step in a bottom-up parser, such as a table-driven lr(1) parser, isto find the next handle. Efficient handle finding is the key to efficient bottomup parsing. An lr(1) parser uses a handle-finding automaton, encoded intotwo tables, called Action and Goto. Figure 3.15 shows a simple table-drivenlr(1) parser.The skeleton lr(1) parser interprets the Action and Goto tables to find successive handles in the reverse rightmost derivation of the input string. Whenit finds a handle hA →β,ki, it reduces β at k to A in the current sententialform—the upper frontier of the partially completed parse tree. Rather thanbuild an explicit parse tree, the skeleton parser keeps the current upper frontier of the partially constructed tree on a stack, interleaved with states fromthe handle-finding automaton that let it thread together the reductions intoa parse.

At any point in the parse, the stack contains a prefix of the currentfrontier. Beyond this prefix, the frontier consists of leaf nodes. The variableword holds the first word in the suffix that lies beyond the stack’s contents;it is the lookahead symbol.Using a stack lets the LR(1) parser make theposition, k, in the handle be constant andimplicit.To find the next handle, the lr(1) parser shifts symbols onto the stack untilthe automaton finds the right end of a handle at the stack top. Once it hasa handle, the parser reduces by the production in the handle. To do so, itpops the symbols in β from the stack and pushes the corresponding lefthand side, A, onto the stack. The Action and Goto tables thread togethershift and reduce actions in a grammar-driven sequence that finds a reverserightmost derivation, if one exists.To make this concrete, consider the grammar shown in Figure 3.16a, whichdescribes the language of properly nested parentheses.

Figure 3.16b showsthe Action and Goto tables for this grammar. When used with the skeletonlr(1) parser, they create a parser for the parentheses language.3.4 Bottom-Up Parsing 119push $;push start state, s0 ;word ← NextWord( );while (true) do;state ← top of stack;if Action[state,word] = ‘‘reduce A → β’’ then begin;pop 2 × | β | symbols;state ← top of stack;push A;push Goto[state, A];end;else ifpushpushwordend;Action[state,word] = ‘‘shift si ’’ then begin;word;si ;← NextWord( );else if Action[state,word] = ‘‘accept’’then break;else Fail( );end;report success;/* executed break on ‘‘accept’’ case */n FIGURE 3.15 The Skeleton LR(1) Parser.To understand the behavior of the skeleton lr(1) parser, consider thesequence of actions that it takes on the input string “( )”.IterationStatewordinitial12345—03721(()eofeofeofStack$$$$$$000000( 3( 3 ) 7Pair 2List 1HandleAction— none —— none —— none ——shift 3shift 7reduce 5reduce 3accept()PairListThe first line shows the parser’s initial state. Subsequent lines show its stateat the start of the while loop, along with the action that it takes.

At the startof the first iteration, the stack does not contain a handle, so the parser shiftsthe lookahead symbol, (, onto the stack. From the Action table, it knows toshift and move to state 3. At the start of the second iteration, the stack still120 CHAPTER 3 ParsersState12345Goal → ListList → List Pair| PairPair → ( Pair )| ( )(a) Parentheses Grammar01234567891011Action TableGoto TableeofListPair124(r2s3s3r3s6r2r5r4s6r5r4accr3)s75s8s 109s 11r5r4(b) Action and Goto Tablesn FIGURE 3.16 The Parentheses Grammar.does not contain a handle, so the parser shifts ) onto the stack to build morecontext. It moves to state 7.In an LR parser, the handle is always positioned atstacktop and the chain of handles produces areverse rightmost derivation.In the third iteration, the situation has changed.

The stack contains a handle, hPair → ( ) i,t, where t is the stack top. The Action table directs theparser to reduce ( ) to Pair. Using the state beneath Pair on the stack, 0, andPair, the parser moves to state 2 (specified by Goto[0,Pair]). In state 2,with Pair atop the stack and eof as its lookahead, the parser finds the handle hList → Pair,ti and reduces, which leaves the parser in state 1 (specifiedby Goto[0,List]).

Finally, in state 1, with List atop the stack and eof asits lookahead, the parser discovers the handle hGoal → List,ti. The Actiontable encodes this situation as an accept action, so the parse halts.This parse required two shifts and three reduces. lr(1) parsers take timeproportional to the length of the input (one shift per word returned fromthe scanner) and the length of the derivation (one reduce per step in thederivation). In general, we cannot expect to discover the derivation for asentence in any fewer steps.Figure 3.17 shows the parser’s behavior on the input string, “( ( ) ) ( ).”The parser performs six shifts, five reduces, and one accept on this input.Figure 3.18 shows the state of the partially-built parse tree at the start ofeach iteration of the parser’s while loop.

The top of each drawing shows aniteration number and a gray bar that contains the partial parse tree’s upperfrontier. In the lr(1) parser, this frontier appears on the stack.3.4 Bottom-Up Parsing 121IterationStatewordinitial123456789101112—0361058213741((()))((()eofeofeofStack$0$ 0$ 0$ 0$ 0$ 0$ 0$ 0$ 0$ 0$ 0$ 0$ 0(((((33333PairListListListListList( 6( 6 ) 10Pair 5Pair 5 ) 8211 ( 31 ( 3 )71 Pair 41HandleAction— none —— none —— none —— none ——shift 3shift 6shift 10reduce 5shift 8reduce 4reduce 3shift 3shift 7reduce 5reduce 2accept()— none —( Pair )Pair— none —— none —()List PairListn FIGURE 3.17 States of the LR(1) Parser on ( ( ) ) ( ).Handle FindingThe parser’s actions shed additional light on the process of finding handles.Consider the parser’s actions on the string “( )”, as shown in the table onpage 119.

Свежие статьи
Популярно сейчас