Cooper_Engineering_a_Compiler(Second Edition) (Rice), страница 53
Описание файла
Файл "Cooper_Engineering_a_Compiler(Second Edition)" внутри архива находится в следующих папках: Rice, Купер и Торчсон - перевод. PDF-файл из архива "Rice", который расположен в категории "". Всё это находится в предмете "конструирование компиляторов" из 7 семестр, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .
Просмотр PDF-файла онлайн
Текст 53 страницы из PDF
The primarydrawback of the attribute-grammar solution lies in the proliferation of rulesto copy information around the tree. This creates many additional rules inthe specification and duplicates attribute values at many nodes.To address these problems in an ad hoc syntax-directed translation scheme,the compiler writer typically introduces a central repository for informationabout variables, as suggested earlier. This eliminates the need to copy valuesaround the trees. It also simplifies the handling of inherited values. Since theparser determines evaluation order, we do not need to worry about breakingdependences between attributes.Most compilers build and use such a repository, called a symbol table. Thesymbol table maps a name into a variety of annotations such as a type, thesize of its runtime representation, and the information needed to generatea runtime address.
The table may also store a number of type-dependentfields, such as the type signature of a function or the number of dimensions and their bounds for an array. Section 5.5 and Appendix B.4 delveinto symbol-table design more deeply.Load Tracking, RevisitedConsider, again, the problem of tracking load operations that arose as part ofestimating execution costs. Most of the complexity in the attribute grammarfor this problem arose from the need to pass information around the tree.In an ad hoc syntax-directed translation scheme that uses a symbol table,the problem is easy to handle.
The compiler writer can set aside a field inthe table to hold a boolean that indicates whether or not that identifier hasalready been charged for a load. The field is initially set to false. Thecritical code is associated with the production Factor → name. If the name’ssymbol table entry indicates that it has not been charged for a load, thencost is updated and the field is set to true.4.4 Ad Hoc Syntax-Directed Translation 203ProductionSyntax-Directed ActionsBlock0 → Block1 Assign|AssignAssign → name = Expr ;{ cost = cost + Cost(store) }Expr→ Expr + Term{ cost = cost + Cost(add) }|Expr − Term{ cost = cost + Cost(sub) }|TermTerm→ Term × Factor|Term ÷ Factor|Factor{ cost = cost + Cost(mult) }{ cost = cost + Cost(div) }Factor → ( Expr )|num{ cost = cost + Cost(loadI) }|name{ if name’s symbol table fieldindicates that it has not been loadedthencost = cost + Cost(load)set the field to true }n FIGURE 4.12 Tracking Loads with Ad Hoc Syntax-Directed Translation.Figure 4.12 shows this case, along with all the other actions.
Because theactions can contain arbitrary code, the compiler can accumulate cost in asingle variable, rather than creating a cost attribute at each node in the parsetree. This scheme requires fewer actions than the attribution rules for thesimplest execution model, even though it can provide the accuracy of themore complex model.Notice that several productions have no actions. The remaining actions aresimple, except for the action taken on a reduction by name. All of the complication introduced by tracking loads falls into that single action; contrastthat with the attribute-grammar version, where the task of passing aroundthe Before and After sets came to dominate the specification.
The ad hocversion is cleaner and simpler, in part because the problem fits nicely intothe evaluation order dictated by the reduce actions in a shift-reduce parser.Of course, the compiler writer must implement the symbol table or import itfrom some library of data-structure implementations.Clearly, some of these strategies could also be applied in an attributegrammar framework. However, they violate the functional nature of theattribute grammar. They force critical parts of the work out of the attributegrammar framework and into an ad hoc setting.204 CHAPTER 4 Context-Sensitive AnalysisThe scheme in Figure 4.12 ignores one critical issue: initializing cost. Thegrammar, as written, contains no production that can appropriately initializecost to zero.
The solution, as described earlier, is to modify the grammar ina way that creates a place for the initialization. An initial production, such asStart → CostInit Block, along with CostInit → , does this. The frameworkcan perform the assignment cost ← 0 on the reduction from to CostInit.Type Inference for Expressions, RevisitedThe problem of inferring types for expressions fit well into the attributegrammar framework, as long as we assumed that leaf nodes already hadtype information. The simplicity of the solution shown in Figure 4.7 derivesfrom two principal facts.
First, because expression types are defined recursively on the expression tree, the natural flow of information runs bottom upfrom the leaves to the root. This biases the solution toward an S-attributedgrammar. Second, expression types are defined in terms of the syntax of thesource language. This fits well with the attribute-grammar framework, whichimplicitly requires the presence of a parse tree.
All the type informationcan be tied to instances of grammar symbols, which correspond preciselyto nodes in the parse tree.We can reformulate this problem in an ad hoc framework, as shown inFigure 4.13. It uses the type inference functions introduced with Figure 4.7.The resulting framework looks similar to the attribute grammar for the samepurpose from Figure 4.7. The ad hoc framework provides no real advantagefor this problem.ProductionExprTermSyntax-Directed Actions→ Expr − Term{ $$ ← F+ ($1,$3) }|Expr − Term{ $$ ← F− ($1,$3) }|Term{ $$ ← $1 }→ Term × Factor{ $$ ← F× ($1,$3) }|Term ÷ Factor{ $$ ← F÷ ($1,$3) }|Factor{ $$ ← $1 }Factor → ( Expr ){ $$ ← $2 }|num{ $$ ← type of the num }|name{ $$ ← type of the name }n FIGURE 4.13 Ad Hoc Framework for Inferring Expression Types.4.4 Ad Hoc Syntax-Directed Translation 205ProductionExprTermSyntax-Directed Actions→ Expr + Term{ $$ ← MakeNode2 (plus, $1, $3);$$.type ← F+ ($1.type, $3.type) }|Expr − Term{ $$ ← MakeNode2 (minus, $1, $3);$$.type ← F− ($1.type,$3.type) }|Term{ $$ ← $1 }→ Term × Factor{ $$ ← MakeNode2 (times, $1, $3);$$.type ← F× ($1.type, $3.type) }|Term ÷ Factor{ $$ ← MakeNode2 (divide, $1, $3);$$.type ← F÷ ($1.type, $3.type) }|Factor{ $$ ← $1 }Factor → ( Expr ){ $$ ← $2 }|num{ $$ ← MakeNode0 (number);$$.text ← scanned text;$$.type ← type of the number }|name{ $$ ← MakeNode0 (identifier);$$.text ← scanned text;$$.type ← type of the identifier }n FIGURE 4.14 Building an Abstract Syntax Tree and Inferring Expression Types.Building an Abstract Syntax TreeCompiler front ends must build an intermediate representation of the program for use in the compiler’s middle part and its back end.
Abstract syntaxtrees are a common form of tree-structured ir. The task of building an astfits neatly into an ad hoc syntax-directed translation scheme.Assume that the compiler has a series of routines named MakeNodei , for0 ≤ i ≤ 3. The routine takes, as its first argument, a constant that uniquelyidentifies the grammar symbol that the new node will represent. The remaining i arguments are the nodes that head each of the i subtrees.
Thus,MakeNode0 (number) constructs a leaf node and marks it as representinga num. Similarly,MakeNode2 (Plus,MakeNode0 (number,) MakeNode0 (number))builds an ast rooted in a node for plus with two children, each of which isa leaf node for num.The MakeNode routines can implement thetree in any appropriate way. For example, theymight map the structure onto a binary tree, asdiscussed in Section B.3.1.206 CHAPTER 4 Context-Sensitive AnalysisTo build an abstract syntax tree, the ad hoc syntax-directed translationscheme follows two general principles:1. For an operator, it creates a node with a child for each operand.
Thus,2 + 3 creates a binary node for + with the nodes for 2 and 3 as children.2. For a useless production, such as Term → Factor, it reuses the resultfrom the Factor action as its own result.In this manner, it avoids building tree nodes that represent syntactic variables, such as Factor, Term, and Expr. Figure 4.14 shows a syntax-directedtranslation scheme that incorporates these ideas.Generating ILOC for ExpressionsAs a final example of manipulating expressions, consider an ad hocframework that generates iloc rather than an ast. We will make severalsimplifying assumptions.
The example limits its attention to integers;handling other types adds complexity, but little insight. The example alsoassumes that all values can be held in registers—both that the values fit inregisters and that the iloc implementation provides more registers than thecomputation will use.Code generation requires the compiler to track many small details. Toabstract away most of these bookkeeping details (and to defer some deeperissues to following chapters), the example framework uses four supportingroutines.1.