K. Cooper, L. Torczon - Engineering a Compiler (2011 - 2nd edition) (798440), страница 65
Текст из файла (страница 65)
Any value that can legally bekept in a register for most of its lifetime is kept in a register. Values arestored to memory only when the semantics of the program requireit—for example, at a procedure call, any local variable whose address ispassed as a parameter to the called procedure must be stored back tomemory. A value that cannot be kept in a register for most of its lifetimeis stored in memory.
The compiler generates code to store its value eachtime it is computed and to load its value at each use.2. Memory-to-Memory Model Under this model, the compiler assumesthat all values are kept in memory locations. Values move from memoryto a register just before they are used. Values move from a register tomemory just after they are defined. The number of registers named inthe ir version of the code can be small compared to the registerto-register model.
In this model, the designer may find it worthwhile toinclude memory-to-memory operations, such as a memory-to-memoryadd, in the ir.The choice of memory model is mostly orthogonal to the choice of ir. Thecompiler writer can build a memory-to-memory ast or a memory-to-memoryversion of iloc just as easily as register-to-register versions of either of these5.4 Mapping Values to Names 251THE HIERARCHY OF MEMORY OPERATIONS IN ILOC 9XThe ILOC used in this book is abstracted from an IR named ILOC 9X that wasused in a research compiler project at Rice University. ILOC 9X includes ahierarchy of memory operations that the compiler uses to encode knowledge about values.
At the bottom of the hierarchy, the compiler has littleor no knowledge about the value; at the top of the hierarchy, it knows theactual value. These operations are as follows:OperationMeaningImmediate loadNonvarying loadLoads a known constant value into a register.Loads a value that does not change duringexecution. The compiler does not know the value,but can prove that it is not defined by a programoperation.Operate on a scalar value, not an array element,a structure element, or a pointer-based value.Operate on a value that may be an array element,a structure element, or a pointer-based value. Thisis the general-case operation.Scalar load & storeGeneral load & storeBy using this hierarchy, the front end can encode knowledge about the target value directly into the ILOC 9X code.
As other passes discover additionalinformation, they can rewrite operations to change a value from using ageneral-purpose load to a more restricted form. If the compiler discoversthat some value is a known constant, it can replace a general load or a scalarload of that value with an immediate load. If an analysis of definitions anduses discovers that some location cannot be defined by any executablestore operation, loads of that value can be rewritten to use a non-varyingload.Optimizations can capitalize on the knowledge encoded in this fashion.
Forexample, a comparison between the result of a non-varying load and a constant must itself be invariant—a fact that might be difficult or impossibleto prove with a scalar load or a general load.irs. (Stack-machine code and code for an accumulator machine might beexceptions; they contain their own unique memory models.)The choice of memory model has an impact on the rest of the compiler.With a register-to-register model, the compiler typically uses more registersthan the target machine provides. Thus, the register allocator must map theset of virtual registers used in the ir program onto the physical registers provided by the target machine. This often requires insertion of extra load, store,252 CHAPTER 5 Intermediate Representationsand copy operations, making the code slower and larger.
With a memoryto-memory model, however, the ir version of the code typically uses fewerregisters than a modern processor provides. Here, the register allocator looksfor memory-based values that it can hold in registers for longer periodsof time. In this model, the allocator makes the code faster and smaller byremoving loads and stores.Compilers for risc machines tend to use the register-to-register model for tworeasons. First, the register-to-register model more closely reflects the instruction sets of risc architectures.
risc machines do not have a full complementof memory-to-memory operations; instead, they implicitly assume that values can be kept in registers. Second, the register-to-register model allows thecompiler to encode directly in the ir some of the subtle facts that it derives.The fact that a value is kept in a register means that the compiler, at someearlier point, had proof that keeping it in a register is safe.
Unless it encodesthat fact in the ir, the compiler will need to prove it, again and again.To elaborate, if the compiler can prove that only one name provides accessto a value, it can keep that value in a register. If multiple names might exist,the compiler must behave conservatively and keep the value in memory.For example, a local variable x can be kept in a register, unless it can bereferenced in another scope. In a language that supports nested scopes, likePascal or Ada, this reference can occur in a nested procedure. In c, this canoccur if the program takes x’s address, &x, and accesses the value throughthat address.
In Algol or pl/i, the program can pass x as a call-by-referenceparameter to another procedure.SECTION REVIEWThe schemes used to name values in a compiler’s IR have a direct effecton the compiler’s ability to optimize the IR and to generate qualityassembly code from the IR. The compiler must generate internal namesfor all values, from variables in the source language program to theintermediate values computed as part of an address expression for asubscripted array reference. Careful use of names can encode and exposefacts for late use in optimization; at the same time, proliferation of namescan slow the compiler by forcing it to use larger data structures.The name space generated in SSA form has gained popularity becauseit encodes useful properties; for example, each name corresponds to aunique definition in the code.
This precision can aid in optimization, aswe will see in Chapter 8.The name space can also encode a memory model. A mismatch betweenthe memory model and the target machine’s instruction set can complicate subsequent optimization and code generation, while a close matchallows the compiler to tailor carefully to the target machine.5.5 Symbol Tables 253Review Questions1. Consider the function fib shown in the margin.
Write down theILOC that a compiler’s front end might generate for this code undera register-to-register model and under a memory-to-memory model.How do the two compare? Under what circumstances might eachmemory be desirable?2. Convert the register-to-register code that you generated in the previous question into SSA form. Are there φ-functions whose output valuecan never be used?int fib(int n) {int x = 1;int y = 1;int z = 1;while(n > 1)z = x + y;x = y;y = z;n = n - 1;return z;}5.5 SYMBOL TABLESAs part of translation, a compiler derives information about the various entities manipulated by the program being translated. It must discover and storemany distinct kinds of information. It encounters a wide variety of names—variables, defined constants, procedures, functions, labels, structures, andfiles.
As discussed in the previous section, the compiler also generates manynames. For a variable, it needs a data type, its storage class, the name andlexical level of its declaring procedure, and a base address and offset inmemory. For an array, the compiler also needs the number of dimensionsand the upper and lower bounds for each dimension. For records or structures, it needs a list of the fields, along with the relevant information foreach field.
For functions and procedures, it needs the number of parametersand their types, as well as the types of any returned values; a more sophisticated translation might record information about what variables a procedurecan reference or modify.The compiler must either record this information in the ir or re-derive it ondemand. For the sake of efficiency, most compilers record facts rather thanrecompute them.