Cooper, Engineering a Compiler (Second Edition) (1157546), page 56
Cardelli has written an excellent overview of type systems [69]. The APL community produced a series of classic papers that dealt with techniques to eliminate runtime checks [1, 35, 264, 349].

Attribute grammars, like many ideas in computer science, were first proposed by Knuth [229, 230]. The literature on attribute grammars has focused on evaluators [203, 342], on circularity testing [342], and on applications of attribute grammars [157, 298]. Attribute grammars have served as the basis for several successful systems, including Intel’s Pascal compiler for the 80286 [142, 143], the Cornell Program Synthesizer [297] and the Synthesizer Generator [198, 299].

Ad hoc syntax-directed translation has always been a part of the development of real parsers. Irons described the basic ideas behind syntax-directed translation to separate a parser’s actions from the description of its syntax [202]. Undoubtedly, the same basic ideas were used in hand-coded precedence parsers.
The style of writing syntax-directed actions that we describe was introduced by Johnson in Yacc [205]. The same notation has been carried forward into more recent systems, including bison from the GNU project.

EXERCISES

Section 4.2

1. In Scheme, the + operator is overloaded. Given that Scheme is dynamically typed, describe a method to type check an operation of the form (+ a b) where a and b may be of any type that is valid for the + operator.

2. Some languages, such as APL or PHP, neither require variable declarations nor enforce consistency between assignments to the same variable. (A program can assign the integer 10 to x and later assign the string value “book” to x in the same scope.) This style of programming is sometimes called type juggling.
Suppose that you have an existing implementation of a language that has no declarations but requires type-consistent uses. How could you modify it to allow type juggling?

Section 4.3

3. Based on the following evaluation rules, draw an annotated parse tree that shows how the syntax tree for a - (b + c) is constructed.

Production        Evaluation Rules
E0 → E1 + T       { E0.nptr ← mknode(+, E1.nptr, T.nptr) }
E0 → E1 - T       { E0.nptr ← mknode(-, E1.nptr, T.nptr) }
E0 → T            { E0.nptr ← T.nptr }
T  → (E)          { T.nptr ← E.nptr }
T  → id           { T.nptr ← mkleaf(id, id.entry) }

4. Use the attribute-grammar paradigm to write an interpreter for the classic expression grammar. Assume that each name has a value attribute and a lexeme attribute. Assume that all attributes are already defined and that all values will always have the same type.

5. Write a grammar to describe all binary numbers that are multiples of four. Add attribution rules to the grammar that will annotate the start symbol of a syntax tree with an attribute value that contains the decimal value of the binary number.

6. Using the grammar defined in the previous exercise, build the syntax tree for the binary number 11100.
a. Show all the attributes in the tree with their corresponding values.
b. Draw the attribute dependence graph for the syntax tree and classify all attributes as being either synthesized or inherited.

CHAPTER 4 Context-Sensitive Analysis

Section 4.4

7. A Pascal program can declare two integer variables a and b with the syntax

var a, b: int

This declaration might be described with the following grammar:

VarDecl → var IDList : TypeID
IDList  → IDList, ID
        | ID

where IDList derives a comma-separated list of variable names and TypeID derives a valid Pascal type.
You may find it necessary to rewrite the grammar.
a. Write an attribute grammar that assigns the correct data type to each declared variable.
b. Write an ad hoc syntax-directed translation scheme that assigns the correct data type to each declared variable.
c. Can either scheme operate in a single pass over the syntax tree?

8. Sometimes, the compiler writer can move an issue across the boundary between context-free and context-sensitive analysis. Consider, for example, the classic ambiguity that arises between function invocation and array references in FORTRAN 77 (and other languages). These constructs might be added to the classic expression grammar using the productions:

Factor   → name ( ExprList )
ExprList → ExprList , Expr
         | Expr

Here, the only difference between a function invocation and an array reference lies in how the name is declared.
In previous chapters, we have discussed using cooperation between the scanner and the parser to disambiguate these constructs. Can the problem be solved during context-sensitive analysis? Which solution is preferable?

9. Sometimes, a language specification uses context-sensitive mechanisms to check properties that can be tested in a context-free way. Consider the grammar fragment in Figure 4.16 on page 208. It allows an arbitrary number of StorageClass specifiers when, in fact, the standard restricts a declaration to a single StorageClass specifier.
a. Rewrite the grammar to enforce the restriction grammatically.
b. Similarly, the language allows only a limited set of combinations of TypeSpecifier.
long is allowed with either int or float; short is allowed only with int. Either signed or unsigned can appear with any form of int. signed may also appear on char. Can these restrictions be written into the grammar?
c. Propose an explanation for why the authors structured the grammar as they did.
Hint: The scanner returned a single token type for any of the StorageClass values and another token type for any of the TypeSpecifiers.
d. Do your revisions to the grammar change the overall speed of the parser? In building a parser for C, would you use a grammar like the one in Figure 4.16, or would you prefer your revised grammar? Justify your answer.

Section 4.5

10. Object-oriented languages allow operator and function overloading. In these languages, the function name is not always a unique identifier, since you can have multiple related definitions, as in

void Show(int);
void Show(char *);
void Show(float);

For lookup purposes, the compiler must construct a distinct identifier for each function. Sometimes, such overloaded functions will have different return types, as well. How would you create distinct identifiers for such functions?

11. Inheritance can create problems for the implementation of object-oriented languages. When object type A is a parent of object type B, a program can assign a “pointer to B” to a “pointer to A,” with syntax such as a ← b. This should not cause problems since everything that A can do, B can also do. However, one cannot assign a “pointer to A” to a “pointer to B,” since object class B can implement methods that object class A does not.
Design a mechanism that can use ad hoc syntax-directed translation to determine whether or not a pointer assignment of this kind is allowed.

Chapter 5
Intermediate Representations

CHAPTER OVERVIEW
The central data structure in a compiler is the intermediate form of the program being compiled.
Most passes in the compiler read and manipulate the IR form of the code. Thus, decisions about what to represent and how to represent it play a crucial role in both the cost of compilation and its effectiveness. This chapter presents a survey of IR forms that compilers use, including graphical IR, linear IRs, and symbol tables.

Keywords: Intermediate Representation, Graphical IR, Linear IR, SSA Form, Symbol Table

5.1 INTRODUCTION
Compilers are typically organized as a series of passes. As the compiler derives knowledge about the code it compiles, it must convey that information from one pass to another.
Thus, the compiler needs a representation for all of the facts that it derives about the program. We call this representation an intermediate representation, or IR. A compiler may have a single IR, or it may have a series of IRs that it uses as it transforms the code from source language into its target language. During translation, the IR form of the input program is the definitive form of the program. The compiler does not refer back to the source text; instead, it looks to the IR form of the code. The properties of a compiler’s IR or IRs have a direct effect on what the compiler can do to the code.

Almost every phase of the compiler manipulates the program in its IR form. Thus, the properties of the IR, such as the mechanisms for reading and writing specific fields, for finding specific facts or annotations, and for navigating around a program in IR form, have a direct impact on the ease of writing the individual passes and on the cost of executing those passes.
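
To make the idea of passes communicating through the IR concrete, here is a minimal sketch in Python. It assumes a hypothetical linear IR in which each operation has an opcode, a tuple of arguments, a result name, and a dictionary of annotations. The names Op, mark_constants, and fold_constants, and the opcodes loadI, load, add, and mult, are illustrative assumptions rather than anything defined in the text. The first pass records a fact it derives (a value known at compile time) as an annotation; the second pass reads only the IR and its annotations, never the source text.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """One operation in a hypothetical linear IR: result <- opcode(args)."""
    opcode: str
    args: tuple
    result: str
    notes: dict = field(default_factory=dict)  # facts derived by earlier passes


def mark_constants(ir):
    """Pass 1: annotate each operation whose value is known at compile time."""
    known = {}  # result name -> constant value derived so far
    for op in ir:
        value = None
        if op.opcode == "loadI":
            value = op.args[0]
        elif op.opcode in ("add", "mult") and all(a in known for a in op.args):
            x, y = (known[a] for a in op.args)
            value = x + y if op.opcode == "add" else x * y
        if value is None:
            known.pop(op.result, None)  # result is not a known constant
        else:
            known[op.result] = value
            op.notes["const"] = value
    return ir


def fold_constants(ir):
    """Pass 2: replace every annotated operation with an immediate load."""
    return [
        Op("loadI", (op.notes["const"],), op.result, dict(op.notes))
        if "const" in op.notes else op
        for op in ir
    ]


# Hypothetical input: t3 <- (2 + 3) * b, where b is unknown until runtime.
program = [
    Op("loadI", (2,), "t1"),
    Op("loadI", (3,), "t2"),
    Op("add", ("t1", "t2"), "t4"),
    Op("load", ("b",), "t5"),
    Op("mult", ("t4", "t5"), "t3"),
]

folded = fold_constants(mark_constants(program))
for op in folded:
    print(op.result, "<-", op.opcode, *op.args)
```

In this sketch, the cost of each pass depends on how cheaply it can walk the operation list and look up an annotation, which is the kind of IR property the paragraph above refers to. A different design choice, such as a tree-shaped IR or annotations kept in a side table, would change how each pass is written.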