Cooper_Engineering_a_Compiler(Second Edition) (Rice), страница 52
Описание файла
Файл "Cooper_Engineering_a_Compiler(Second Edition)" внутри архива находится в следующих папках: Rice, Купер и Торчсон - перевод. PDF-файл из архива "Rice", который расположен в категории "". Всё это находится в предмете "конструирование компиляторов" из 7 семестр, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .
Просмотр PDF-файла онлайн
Текст 52 страницы из PDF
The compiler writer hascomplete control over when the actions execute. In a bottom-up, shift-reduceparser, the actions are performed each time the parser performs a reduceaction. This is more restrictive, but still workable.To make this concrete, consider reformulating the signed binary numberexample in an ad hoc syntax-directed translation framework. Figure 4.11shows one such framework. Each grammar symbol has a single value associated with it, denoted val in the code snippets. The code snippet for eachrule defines the value associated with the symbol on the rule’s left-hand side.Rule 1 simply multiplies the value for Sign with the value for List.
Rules 2and 3 set the value for Sign appropriately, just as rules 6 and 7 set the valuefor each instance of Bit. Rule 4 simply copies the value from Bit to List. Thereal work occurs in rule 5, which multiplies the accumulated value of theleading bits (in List.val) by two, and then adds in the next bit.So far, this looks quite similar to an attribute grammar. However, it has twokey simplifications. Values flow in only one direction, from leaves to root.It allows only a single value per grammar symbol. Even so, the scheme inFigure 4.11 correctly computes the value of the signed binary number.
Itleaves that value at the root of the tree, just like the attribute grammar forsigned binary numbers.These two simplifications make possible an evaluation method that workswell with a bottom-up parser, such as the lr(1) parsers described in Chapter 3. Since each code snippet is associated with the right-hand side of aspecific production, the parser can invoke the action each time it reduces by4.4 Ad Hoc Syntax-Directed Translation 199Production1234567NumberSignSignListList0BitBit→→→→→→→Sign List+-BitList1 Bit01Code SnippetNumber . val ← Sign . val × List .
valSign . val ← 1Sign . val ← -1List . val ← Bit . valList0 .val ← 2 × List1 .val + Bit . valBit . val ← 0Bit . val ← 1n FIGURE 4.11 Ad Hoc Syntax-Directed Translation for Signed Binary Numbers.that production. This requires minor modifications to the reduce action inthe skeleton lr(1) parser shown in Figure 3.15.else if Action[s,word] = “reduce A→β” theninvoke the appropriate reduce actionpop 2 × |β| symbolss ← top of stackpush Apush Goto[s, A]The parser generator can gather the syntax-directed actions together, embedthem in a case statement that switches on the number of the production beingreduced, and place the case statement just before it pops the right-hand sidefrom the stack.The translation scheme shown in Figure 4.11 is simpler than the scheme usedto explain attribute grammars.
Of course, we can write an attribute grammarthat applies the same strategy. It would use only synthesized attributes. Itwould have fewer attribution rules and fewer attributes than the one shownin Figure 4.5. We chose the more complex attribution scheme to illustratethe use of both synthesized and inherited attributes.4.4.1 Implementing Ad Hoc Syntax-DirectedTranslationTo make ad hoc syntax-directed translation work, the parser must includemechanisms to pass values from their definitions in one action to their usesin another, to provide convenient and consistent naming, and to allow foractions that execute at other points in the parse.
This section describesmechanisms for handling these issues in a bottom-up, shift-reduce parser.200 CHAPTER 4 Context-Sensitive AnalysisAnalogous ideas will work for top-down parsers. We adopt a notation introduced in the Yacc system, an early and popular lalr(1) parser generatordistributed with the Unix operating system. The Yacc notation has beenadopted by many subsequent systems.Communicating between ActionsTo pass values between actions, the parser must have a methodology forallocating space to hold the values produced by the various actions.
Themechanism must make it possible for an action that uses a value to find it.An attribute grammar associates the values (attributes) with nodes in theparse tree; tying the attribute storage to the tree nodes’ storage makes it possible to find attribute values in a systematic way. In ad hoc syntax-directedtranslation, the parser may not construct the parse tree. Instead, the parsercan integrate the storage for values into its own mechanism for tracking thestate of the parse—its internal stack.Recall that the skeleton lr(1) parser stored two values on the stack for eachgrammar symbol: the symbol and a corresponding state. When it recognizesa handle, such as a List Bit sequence to match the right-hand side of rule 5,the first pair on the stack represents the Bit.
Underneath that lies the pairrepresenting the List. We can replace these hsymbol, statei pairs with triples,hvalue, symbol, statei. This provides a single value attribute per grammarsymbol—precisely what the simplified scheme needs. To manage the stack,the parser pushes and pops more values. On a reduction by A→β, it pops3 × |β| items from the stack, rather than 2 × |β| items. It pushes the valuealong with the symbol and state.This approach stores the values at easily computed locations relative tothe top of the stack.
Each reduction pushes its result onto the stack as part ofthe triple that represents the left-hand side. The action reads the values forthe right-hand side from their relative positions in the stack; the i th symbolon the right-hand side has its value in the i th triple from the top of the stack.Values are restricted to a fixed size; in practice, this limitation means thatmore complex values are passed using pointers to structures.To save storage, the parser could omit the actual grammar symbols fromthe stack.
The information necessary for parsing is encoded in the state.This shrinks the stack and speeds up the parse by eliminating the operations that stack and unstack those symbols. On the other hand, the grammarsymbol can help in error reporting and in debugging the parser. This tradeoff is usually decided in favor of not modifying the parser that the toolsproduce—such modifications must be reapplied each time the parser isregenerated.4.4 Ad Hoc Syntax-Directed Translation 201Naming ValuesTo simplify the use of stack-based values, the compiler writer needs a notation for naming them.
Yacc introduced a concise notation to address thisproblem. The symbol $$ refers to the result location for the current production. Thus, the assignment $$ = 0; would push the integer value zeroas the result corresponding to the current reduction. This assignment couldimplement the action for rule 6 in Figure 4.11. For the right-hand side, thesymbols $1, $2, . . . , $n refer to the locations for the first, second, throughnth symbols in the right-hand side, respectively.Rewriting the example from Figure 4.11 in this notation produces thefollowing specification:Production1234567NumberSignSignListList0BitBit→→→→→→→Sign List+-BitList1 Bit01Code Snippet$$$$$$$$$$$$$$←←←←←←←$1 × $21−1$12 × $1 + $201Notice how compact the code snippets are.
This scheme has an efficientimplementation; the symbols translate directly into offsets from the top ofthe stack. The notation $1 indicates a location 3 × |β| slots below the top ofthe stack, while a reference to $i designates the location 3 × (|β| − i + 1)slots from the top of the stack. Thus, the positional notation allows the actionsnippets to read and write the stack locations directly.Actions at Other Points in the ParseCompiler writers might also need to perform an action in the middle of aproduction or on a shift action. To accomplish this, compiler writers cantransform the grammar so that it performs a reduction at each point where anaction is needed. To reduce in the middle of a production, they can break theproduction into two pieces around the point where the action should execute.A higher-level production that sequences the first part, then the second part,is added.
When the first part reduces, the parser invokes the action. To forceactions on shifts, a compiler writer can either move them into the scanneror add a production to hold the action. For example, to perform an action202 CHAPTER 4 Context-Sensitive Analysiswhenever the parser shifts the terminal symbol Bit, a compiler writer canadd a productionShiftedBit → Bitand replace every occurrence of Bit with ShiftedBit. This adds an extrareduction for every terminal symbol. Thus, the additional cost is directlyproportional to the number of terminal symbols in the program.4.4.2 ExamplesTo understand how ad hoc syntax-directed translation works, considerrewriting the execution-time estimator using this approach.