Главная » Все файлы » Просмотр файлов из архивов » PDF-файлы » A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition)

A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition), страница 36

Описание файла

PDF-файл из архива "A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition)", который расположен в категории "книги и методические указания". Всё это находится в предмете "конструирование компиляторов" из седьмого семестра, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .

Просмотр PDF-файла онлайн

Текст 36 страницы из PDF

}Temp.Temp munchExp(Tree.Exp s) { ... }Assem.InstrList codegen(Tree.Stm s) {Assem.InstrList l;munchStm(s);l=ilist;ilist=last=null;return l;}}package Frame;public class Frame {...public Assem.InstrList codegen(Tree.Stm stm); {return (new Codegen(this)).codegen(stm);}}PROCEDURE CALLSProcedure calls are represented by EXP(CALL(f, args)), and function calls by MOVE(TEMPt, CALL(f, args)). These trees can be matched by tiles such asmunchStm(EXP(CALL(e,args))){Temp r = munchExp(e); TempList l = munchArgs(0,args);emit(new OPER("CALL 's0\n",calldefs,L(r,l)));}In this example, munchArgs generates code to move all the arguments to their correctpositions, in outgoing parameter registers and/or in memory.

The integer parameter tomunchArgs is i for the ith argument; munchArgs will recur with i + 1 for the next argument,and so on.What munchArgs returns is a list of all the temporaries that are to be passed to the machine'sCALL instruction. Even though these temps are never written explicitly in assembly language,they should be listed as "sources" of the instruction, so that liveness analysis (Chapter 10) cansee that their values need to be kept up to the point of call.169A CALL is expected to "trash" certain registers - the caller-save registers, the return-addressregister, and the return-value register. This list of calldefs should be listed as "destinations"of the CALL, so that the later phases of the compiler know that something happens to themhere.In general, any instruction that has the side effect of writing to another register requires thistreatment. For example, the Pentium's multiply instruction writes to register edx with uselesshigh-order result bits, so edx and eax are both listed as destinations of this instruction.

(Thehigh-order bits can be very useful for programs written in assembly language to domultiprecision arithmetic, but most programming languages do not support any way to accessthem.)IF THERE'S NO FRAME POINTERIn a stack frame layout such as the one shown in Figure 6.2, the frame pointer points at oneend of the frame and the stack pointer points at the other. At each procedure call, the stackpointer register is copied to the frame pointer register, and then the stack pointer isincremented by the size of the new frame.Many machines' calling conventions do not use a frame pointer. Instead, the "virtual framepointer" is always equal to stack pointer plus frame size.

This saves time (no copy instruction)and space (one more register usable for other purposes). But our Translate phase hasgenerated trees that refer to this fictitious frame pointer. The codegen function must replaceany reference to FP+k with SP + k + fs, where fs is the frame size. It can recognize thesepatterns as it munches the trees.However, to replace them it must know the value of fs, which cannot yet be known becauseregister allocation is not known.

Assuming the function f is to be emitted at label L14 (forexample), codegen can just put sp+L14_framesize in its assembly instructions and hope thatthe prologue for f (generated by Frame.procEntryExit3) will include a definition of theassembly language constant L14_framesize. Codegen is passed the frame argument(Program 9.7) so that it can learn the name L14.Implementations that have a "real" frame pointer won't need this hack and can ignore theframe argument to codegen.

But why would an implementation use a real frame pointer whenit wastes time and space to do so? The answer is that this permits the frame size to grow andshrink even after it is first created; some languages have permitted dynamic allocation ofarrays within the stack frame (e.g., using alloca in C).

Calling-convention designers nowtend to avoid dynamically adjustable frame sizes, however.PROGRAM INSTRUCTION SELECTIONImplement the translation to Assem-instructions for your favorite instruction set (let μ standfor Sparc, Mips, Alpha, Pentium, etc.) using maximal munch. If you would like to generatecode for a RISC machine, but you have no RISC computer on which to test it, you may wishto use SPIM (a MIPS simulator implemented by James Larus), described on the Web page forthis book.First write the class μ.Codegen implementing the "maximal munch" translation algorithmfrom IR trees to the Assem data structure.170Use the Canon module (described in Chapter 8) to simplify the trees before applying yourCodegen module to them.

Use the format function to translate the resulting Assem trees to μassembly language. Since you won't have done register assignment, just pass newTemp.DefaultMap() to format as the translation function from temporaries to strings.package Temp;public class DefaultMap implements TempMap {public String tempMap(Temp.Temp t) {return t.toString();}}This will produce "assembly" language that does not use register names at all: Theinstructions will use names such as t3, t283, and so on. But some of these temps are the"built-in" ones created by the Frame module to stand for particular machine registers (seepage 143), such as Frame.FP.

The assembly language will be easier to read if these registersappear with their natural names (e.g., fp instead of t1).The Frame module must provide a mapping from the special temps to their names, andnonspecial temps to null:package Frame;public class Frame implements Temp.TempMap {⋮abstract public String tempMap(Temp temp);}Then, for the purposes of displaying your assembly language prior to register allocation, makea new TempMap function that first tries frame.tempMap, and if that returns null, resorts toTemp.toString().REGISTER LISTSMake the following lists of registers; for each register, you will need a string for its assemblylanguage representation and a Temp.Temp for referring to it in Tree and Assem data structures.•specialregs a list of μ registers used to implement "special" registers such as RVand FP and also the stack pointer SP, the return-address register RA, and (on somemachines) the zero register ZERO.

Some machines may have other special registers;•argregs a list of μ registers in which to pass outgoing arguments (including the staticlink);•calleesaves a list of μ registers that the called procedure (callee) must preserveunchanged (or save and restore);•callersaves a list of μ registers that the callee may trash.The four lists of registers must not overlap, and must include any register that might show upin Assem instructions. These lists are not public, but they are useful internally for both Frameand Codegen - for example, to implement munchArgs and to construct the calldefs list.Implement the procEntryExit2 function of the μ.Frame class.package Frame;171class Frame implements Temp.TempMap {⋮abstract public Assem.InstrList procEntryExit2(Assem.InstrList body);}This function appends a "sink" instruction to the function body to tell the register allocatorthat certain registers are live at procedure exit.

In the case of the Jouette machine, this issimply:package Jouette;class Frame extends Frame.Frame {⋮static TempList returnSink =L(ZERO, L(RA, L(SP, calleeSaves)));static Assem.InstrList append(Assem.InstrList a,Assem.InstrList b) {if (a==null) return b;else {Assem.InstrList p;for(p=a; p.tail!=null; p=p.tail) {}p.tail=b;return a;}}public Assem.InstrList procEntryExit2(Assem.InstrList body) {return append(body,new Assem.InstrList(new Assem.OPER("", null, returnSink),null));}}meaning that the temporaries zero, return-address, stack pointer, and all the callee-savesregisters are still live at the end of the function.

Having zero live at the end means that it islive throughout, which will prevent the register allocator from trying to use it for some otherpurpose. The same trick works for any other special registers the machine might have.Files available in $MINIJAVA/chap9 include:Canon/* Canonicalization and trace generation.Assem/* The Assem module.Main/Main.java A Main module that you may wish to adapt.Your code generator will handle only the body of each procedure or function, but not theprocedure entry/exit sequences. Use a "scaffold" version of Frame.procEntryExit3function:package μ;class Frame extends Frame.Frame {⋮public Frame.Proc procEntryExit3(Assem.InstrList body) {return new Frame.Proc("PROCEDURE " + name.toString() + "\n",body,"END " + name.toString() + "\n");172}}FURTHER READINGCattell [1980] expressed machine instructions as tree patterns, invented the maximal munchalgorithm for instruction selection, and built a code-generator generator to produce aninstruction selection function from a tree-pattern description of an instruction set.

Glanvilleand Graham [1978] expressed the tree patterns as productions in LR(1) grammars, whichallows the maximal munch algorithm to use multiple nonterminal symbols to representdifferent classes of registers and addressing modes. But grammars describing instruction setsare inherently ambiguous, leading to problems with the LR(1) approach; Aho et al. [1989] usedynamic programming to parse the tree grammars, which solves the ambiguity problem, anddescribe the Twig automatic code-generator generator. The dynamic programming can bedone at compiler-construction time instead of code-generation time [Pelegri-Llopart andGraham 1988]; using this technique, the BURG tool [Fraser et al.

1992] has an interfacesimilar to Twig's but generates code much faster.EXERCISES••••9.1 For each of the following expressions, draw the tree and generate Jouette-machineinstructions using maximal munch. Circle the tiles (as in Figure 9.2), but number themin the order that they are munched, and show the sequence of Jouette instructions thatresults.a. MOVE(MEM(+(+(CONST1000, MEM(TEMPx)), TEMPfp)), CONST0)b. BINOP(MUL, CONST5, MEM(CONST100))*9.2 Consider a machine with the following instruction:mult const1(src1), const2(src2), dst3r3 ← M[r1 + const1]* M[r2 + const2]On this machine, r0 is always 0, and M[1] always contains 1.•a.

Свежие статьи
Популярно сейчас