Cooper_Engineering_a_Compiler(Second Edition) (1157546), страница 64
Текст из файла (страница 64)
The compiler mustdetermine, for each value computed in the code, where that value will reside.For the code to execute, the compiler must assign a specific location, such asregister r13 or 16 bytes from the label L0089. Before the final stages of codegeneration, however, the compiler may use symbolic addresses that encodea level in the memory hierarchy, for example, registers or memory, but nota specific location within that level.Consider the iloc examples used throughout this book. A symbolic memoryaddress is denoted by prefixing it with the character @. Thus, @x is the offsetof × from the start of the storage area containing it. Since rarp holds theactivation record pointer, an operation that uses @x and rarp to compute anaddress depends, implicitly, on the decision to store the variable x in thememory reserved for the current procedure’s activation record.In general, compilers work from one of two memory models.1.
Register-to-Register Model Under this model, the compiler keepsvalues in registers aggressively, ignoring any limitations imposed by thesize of the machine’s physical register set. Any value that can legally bekept in a register for most of its lifetime is kept in a register. Values arestored to memory only when the semantics of the program requireit—for example, at a procedure call, any local variable whose address ispassed as a parameter to the called procedure must be stored back tomemory. A value that cannot be kept in a register for most of its lifetimeis stored in memory. The compiler generates code to store its value eachtime it is computed and to load its value at each use.2. Memory-to-Memory Model Under this model, the compiler assumesthat all values are kept in memory locations.
Values move from memoryto a register just before they are used. Values move from a register tomemory just after they are defined. The number of registers named inthe ir version of the code can be small compared to the registerto-register model.
In this model, the designer may find it worthwhile toinclude memory-to-memory operations, such as a memory-to-memoryadd, in the ir.The choice of memory model is mostly orthogonal to the choice of ir. Thecompiler writer can build a memory-to-memory ast or a memory-to-memoryversion of iloc just as easily as register-to-register versions of either of these5.4 Mapping Values to Names 251THE HIERARCHY OF MEMORY OPERATIONS IN ILOC 9XThe ILOC used in this book is abstracted from an IR named ILOC 9X that wasused in a research compiler project at Rice University. ILOC 9X includes ahierarchy of memory operations that the compiler uses to encode knowledge about values.
At the bottom of the hierarchy, the compiler has littleor no knowledge about the value; at the top of the hierarchy, it knows theactual value. These operations are as follows:OperationMeaningImmediate loadNonvarying loadLoads a known constant value into a register.Loads a value that does not change duringexecution. The compiler does not know the value,but can prove that it is not defined by a programoperation.Operate on a scalar value, not an array element,a structure element, or a pointer-based value.Operate on a value that may be an array element,a structure element, or a pointer-based value.
Thisis the general-case operation.Scalar load & storeGeneral load & storeBy using this hierarchy, the front end can encode knowledge about the target value directly into the ILOC 9X code. As other passes discover additionalinformation, they can rewrite operations to change a value from using ageneral-purpose load to a more restricted form. If the compiler discoversthat some value is a known constant, it can replace a general load or a scalarload of that value with an immediate load. If an analysis of definitions anduses discovers that some location cannot be defined by any executablestore operation, loads of that value can be rewritten to use a non-varyingload.Optimizations can capitalize on the knowledge encoded in this fashion. Forexample, a comparison between the result of a non-varying load and a constant must itself be invariant—a fact that might be difficult or impossibleto prove with a scalar load or a general load.irs.
(Stack-machine code and code for an accumulator machine might beexceptions; they contain their own unique memory models.)The choice of memory model has an impact on the rest of the compiler.With a register-to-register model, the compiler typically uses more registersthan the target machine provides. Thus, the register allocator must map theset of virtual registers used in the ir program onto the physical registers provided by the target machine. This often requires insertion of extra load, store,252 CHAPTER 5 Intermediate Representationsand copy operations, making the code slower and larger.
With a memoryto-memory model, however, the ir version of the code typically uses fewerregisters than a modern processor provides. Here, the register allocator looksfor memory-based values that it can hold in registers for longer periodsof time. In this model, the allocator makes the code faster and smaller byremoving loads and stores.Compilers for risc machines tend to use the register-to-register model for tworeasons.
First, the register-to-register model more closely reflects the instruction sets of risc architectures. risc machines do not have a full complementof memory-to-memory operations; instead, they implicitly assume that values can be kept in registers. Second, the register-to-register model allows thecompiler to encode directly in the ir some of the subtle facts that it derives.The fact that a value is kept in a register means that the compiler, at someearlier point, had proof that keeping it in a register is safe.
Unless it encodesthat fact in the ir, the compiler will need to prove it, again and again.To elaborate, if the compiler can prove that only one name provides accessto a value, it can keep that value in a register. If multiple names might exist,the compiler must behave conservatively and keep the value in memory.For example, a local variable x can be kept in a register, unless it can bereferenced in another scope. In a language that supports nested scopes, likePascal or Ada, this reference can occur in a nested procedure. In c, this canoccur if the program takes x’s address, &x, and accesses the value throughthat address. In Algol or pl/i, the program can pass x as a call-by-referenceparameter to another procedure.SECTION REVIEWThe schemes used to name values in a compiler’s IR have a direct effecton the compiler’s ability to optimize the IR and to generate qualityassembly code from the IR.
The compiler must generate internal namesfor all values, from variables in the source language program to theintermediate values computed as part of an address expression for asubscripted array reference. Careful use of names can encode and exposefacts for late use in optimization; at the same time, proliferation of namescan slow the compiler by forcing it to use larger data structures.The name space generated in SSA form has gained popularity becauseit encodes useful properties; for example, each name corresponds to aunique definition in the code.
This precision can aid in optimization, aswe will see in Chapter 8.The name space can also encode a memory model. A mismatch betweenthe memory model and the target machine’s instruction set can complicate subsequent optimization and code generation, while a close matchallows the compiler to tailor carefully to the target machine.5.5 Symbol Tables 253Review Questions1. Consider the function fib shown in the margin. Write down theILOC that a compiler’s front end might generate for this code undera register-to-register model and under a memory-to-memory model.How do the two compare? Under what circumstances might eachmemory be desirable?2. Convert the register-to-register code that you generated in the previous question into SSA form. Are there φ-functions whose output valuecan never be used?int fib(int n) {int x = 1;int y = 1;int z = 1;while(n > 1)z = x + y;x = y;y = z;n = n - 1;return z;}5.5 SYMBOL TABLESAs part of translation, a compiler derives information about the various entities manipulated by the program being translated.