Cooper_Engineering_a_Compiler(Second Edition) (1157546), страница 87
Текст из файла (страница 87)
If so, it just assigns the register number to result.Otherwise, it uses the standard mechanisms to load the value frommemory.Call-by-reference parameter If the address resides in a register, thecompiler simply loads the value into a register. If the address resides inthe ar, it must load the address before it loads the value.7.3 Arithmetic Operators 347COMMUTATIVITY, ASSOCIATIVITY, AND NUMBER SYSTEMSThe compiler can often take advantage of algebraic properties of the operators.
Addition and multiplication are commutative and associative, as arethe boolean operators. Thus, if the compiler sees a code fragment thatcomputes a + b and then computes b + a, with no intervening assignmentsto either a or b, it should recognize that they compute the same value. Similarly, if it sees the expressions a + b + c and d + a + b, it should recognize thata + b is a common subexpression. If it evaluates both expressions in strictleft-to-right order, it will never recognize the common subexpression, sinceit will compute the second expression as d + a and then (d + a) + b.The compiler should use commutativity and associativity to improve thequality of code that it generates. Reordering expressions can expose additional opportunities for many transformations.Due to limitations in precision, floating-point numbers on a computer represent only a subset of the real numbers, one that does not preserve associativity.For this reason, compilers should not reorder floating-point expressions unlessthe language definition specifically allows it.Consider the following example: computing a - b - c.
We can assignfloating-point values to a, b, and c such thatb, c < aa-b=aa-c=abut a - (b + c) 6= a. In that case, the numerical result depends on the orderof evaluation. Evaluating (a - b) - c produces a result identical to a, whileevaluating b + c first and subtracting that quantity from a produces a resultthat is distinct from a.This problem arises from the approximate nature of floating-point numbers; the mantissa is small relative to the range of the exponent.
To addtwo numbers, the hardware must normalize them; if the difference in exponents is larger than the precision of the mantissa, the smaller number willbe truncated to zero. The compiler cannot easily work its way around thisissue, so it should, in general, avoid reordering float-point computations.In either case, the code fits nicely into the treewalk framework. Note that thecompiler cannot keep the value of a call-by-reference parameter in a registeracross an assignment, unless the compiler can prove that the reference isunambiguous, across all calls to the procedure.7.3.3 Function Calls in an ExpressionSo far, we have assumed that all the operands in an expression are variables,constants, and temporary values produced by other subexpressions. FunctionIf the actual parameter is a local variable of thecaller and its address is never taken, thecorresponding formal is unambiguous.348 CHAPTER 7 Code Shapecalls also occur as operands in expressions.
To evaluate a function call, thecompiler simply generates the calling sequence needed to invoke the function and emits the code necessary to move the returned value to a register (seeSection 7.9). The linkage convention limits the callee’s impact on the caller.The presence of a function call may restrict the compiler’s ability to changean expression’s evaluation order. The function may have side effects thatmodify the values of variables used in the expression. The compiler mustrespect the implied evaluation order of the source expression, at least withrespect to the call. Without knowledge about the possible side effects ofa call, the compiler cannot move references across the call. The compilermust assume the worst case—that the function both modifies and uses everyvariable that it can access.
The desire to improve on worst-case assumptions,such as this one, has motivated much of the work in interprocedural analysis(see Section 9.4).7.3.4 Other Arithmetic OperatorsTo handle other arithmetic operations, we can extend the treewalk model.The basic scheme remains the same: get the operands into registers, performthe operation, and store the result. Operator precedence, from the expressiongrammar, ensures the correct evaluation order. Some operators require complex multioperation sequences for their implementation (e.g. exponentiationand trigonometric functions).
These may be expanded inline or implementedwith a call to a library routine supplied by the compiler or the operatingsystem.7.3.5 Mixed-Type ExpressionsOne complication allowed by many programming languages is an operationwith operands of different types. (Here, we are concerned primarily withbase types in the source language, rather than programmer-defined types.)As described in Section 4.2, the compiler must recognize this situation andinsert the conversion code required by each operator’s conversion table. Typically, this involves converting one or both operands to a more general typeand performing the operation in that more general type.
The operation thatconsumes the result value may need to convert it to yet another type.Some processors provide explicit conversion operators; others expect thecompiler to generate complex, machine-dependent code. In either case, thecompiler writer may want to provide conversion operators in the ir. Such anoperator encapsulates all the details of the conversion, including any controlflow, and lets the compiler subject it to uniform optimization.
Thus, code7.3 Arithmetic Operators 349motion can pull an invariant conversion out of a loop without concern forthe loop’s internal control flow.Typically, the programming-language definition specifies a formula for eachconversion. For example, to convert integer to complex in fortran 77,the compiler first converts the integer to a real.
It uses the resulting number as the real part of the complex number and sets the imaginary part to areal zero.For user-defined types, the compiler will not have conversion tables thatdefine each specific case. However, the source language still definesthe meaning of the expression. The compiler’s task is to implement thatmeaning; if a conversion is illegal, then it should be prevented.
As seenin Chapter 4, many illegal conversions can be detected and prevented atcompile time. When a compile-time check is either impossible or inconclusive, the compiler should generate a runtime check that tests for illegalcases. When the code attempts an illegal conversion, the check should raisea runtime error.7.3.6 Assignment as an OperatorMost Algol-like languages implement assignment with the following simplerules:1. Evaluate the right-hand side of the assignment to a value.2.
Evaluate the left-hand side of the assignment to a location.3. Store the right-hand side value into the left-hand side location.Thus, in a statement such as a ← b, the two expressions a and b are evaluated differently. Since b appears to the right of the assignment operator, itis evaluated to produce a value; if b is an integer variable, that value is aninteger. Since a is to the left of the assignment operator, it is evaluated toproduce a location; if a is an integer variable, that value is the location ofan integer. That location might be an address in memory, or it might be aregister.
To distinguish between these modes of evaluation, we sometimesrefer to the result of evaluation on the right-hand side of an assignment as anrvalue and the result of evaluation on the left-hand side of an assignment asan lvalue.In an assignment, the type of the lvalue can differ from the type of thervalue. Depending on the language and the specific types, this situation mayrequire either a compiler-inserted conversion or an error message. The typical source-language rule for conversion has the compiler evaluate the rvalueto its natural type and then convert the result to the type of the lvalue.RvalueAn expression evaluated to a value is an rvalue.LvalueAn expression evaluated to a location is an lvalue.350 CHAPTER 7 Code ShapeSECTION REVIEWA postorder treewalk provides a natural way to structure a code generator for expression trees.
The basic framework is easily adapted to handlea variety of complications, including multiple kinds and locations ofvalues, function calls, type conversions, and new operators. To improvethe code further may require multiple passes over the code.Some optimizations are hard to fit into a treewalk framework. Inparticular, making good use of processor address modes (see Chapter11), ordering operations to hide processor-specific delays (seeChapter 12), and register allocation (see Chapter 13) do not fit wellinto the treewalk framework.
If the compiler uses a treewalk to generateIR, it may be best to keep the IR simple and allow the back end to addressthese issues with specialized algorithms.Review Questions1. Sketch the code for the two support routines, base and offset, usedby the treewalk code generator in Figure 7.5.2. How might you adapt the treewalk code generator to handle anunconditional jump operation, such as C’s goto statement?7.4 BOOLEAN AND RELATIONAL OPERATORSMost programming languages operate on a richer set of values than numbers. Usually, this includes the results of boolean and relational operators,both of which produce boolean values.
Because most programming languages have relational operators that produce boolean results, we treat theboolean and relational operators together. A common use for boolean and relational expressions is to alter the program’s control flow. Much of the powerof modern programming languages derives from the ability to compute andtest such values.The grammar uses the symbols ¬ for not, ∧for and, and ∨ for or to avoid confusion withILOC operators.The type checker must ensure that eachexpression applies operators to names, numbers,and expressions of appropriate types.Figure 7.7 shows the standard expression grammar augmented with booleanand relational operators. The compiler writer must, in turn, decide how torepresent these values and how to compute them.
For arithmetic expressions, such design decisions are largely dictated by the target architecture,which provides number formats and instructions to perform basic arithmetic.Fortunately, processor architects appear to have reached a widespread agreement about how to support arithmetic. Similarly, most architectures providea rich set of boolean operations. However, support for relational operatorsvaries widely from one architecture to another. The compiler writer must usean evaluation strategy that matches the needs of the language to the availableinstruction set.7.4 Boolean and Relational Operators 351Expr→ Expr ∨ AndTerm|AndTermAndTerm → AndTerm ∧ RelExpr| RelExprRelExprNumExpr → NumExpr + Term| NumExpr − Term| TermTerm→ Term × Value||→ RelExpr < NumExpr||||||RelExpr ≤RelExpr =RelExpr 6=RelExpr ≥RelExpr >NumExprNumExprNumExprNumExprNumExprNumExprTerm ÷ ValueFactorValue→ ¬ FactorFactor→ (Expr )|||Factornumnamen FIGURE 7.7 Adding Booleans and Relationals to the Expression Grammar.7.4.1 RepresentationsTraditionally, two representations have been proposed for boolean values:a numerical encoding and a positional encoding.