A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition), страница 29
Описание файла
PDF-файл из архива "A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition)", который расположен в категории "". Всё это находится в предмете "конструирование компиляторов" из 7 семестр, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .
Просмотр PDF-файла онлайн
Текст 29 страницы из PDF
It should also put the assembly-language fragmentframe.string(lab,lit) onto a global list of such fragments to be handed to the codeemitter. "Fragments" are discussed further on page 157.All string operations are performed in functions provided by the runtime system; thesefunctions heap-allocate space for their results, and return pointers. Thus, the compiler (almost)doesn't need to know what the representation is, as long as it knows that each string pointer isexactly one word long.
We say "almost" because string literals must be represented.In Pascal, strings are fixed-length arrays of characters; literals are padded with blanks to makethem fit. This is not very useful. In C, strings are pointers to variable-length, zero-terminatedsequences. This is much more useful, though a string containing a zero byte cannot berepresented.132RECORD AND ARRAY CREATIONImagine a language construct {e1, e2, …, en} which creates an n-element record initialized tothe values of expressions ei.
This is like an object constructor that initializes all the instancevariables of the object. Such a record may outlive the procedure activation that creates it, so itcannot be allocated on the stack. Instead, it must be allocated on the heap. If there is noprovision for freeing records (or strings), industrial-strength systems should have a garbagecollector to reclaim unreachable records (see Chapter 13).The simplest way to create a record is to call an external memory-allocation function thatreturns a pointer to an n-word area into a new temporary r.
Then a series of MOVE trees caninitialize offsets 0, 1W; 2W, …, (n − 1)W from r with the translations of expressions ei.Finally, the result of the whole expression is TEMP(r), as shown in Figure 7.4.Figure 7.4: Object initialization.In an industrial compiler, calling malloc (or its equivalent) on every record creation might betoo slow; see Section 13.7.Array creation is very much like record creation, except that all the fields are initialized to thesame value.
The external initArray function can take the array length and the initializingvalue as arguments, see later.In MiniJava we can create an array of integers by the constructnew int [exp]where exp is an expression that evaluates to an integer. This will create an integer array of alength determined by the value of exp and with each value initialized to zero.To translate array creation, the compiler needs to perform the following steps:1. Determine how much space is needed for the array.
This can be calculated by:133The reason we add one to the length of the array is that we want to store the length ofthe array along with the array. This is needed for bounds checking and to determinethe length at run time.2. Call an external function to get space on the heap. The call will return a pointer to thebeginning of the array.3. Generate code for saving the length of the array at offset 0.4. Generate code for initializing each of the values in the array to zero starting at offset 4.Calling runtime-system functions. To call an external function named init-Array witharguments a, b, simply generate a CALL such asstatic Label initArray = new Label("initArray");new CALL(new NAME(initArray),new Tree.ExpList(a, new Tree.ExpList(b, null)));This refers to an external function initArray which is written in a language such as C orassembly language - it cannot be written in MiniJava because MiniJava has no mechanism formanipulating raw memory.But on some operating systems, the C compiler puts an underscore at the beginning of eachlabel; and the calling conventions for C functions may differ from those of MiniJavafunctions; and C functions don't expect to receive a static link, and so on.
All these targetmachine-specific details should be encapsulated into a function provided by the Framestructure:public abstract class Frame {⋮abstract public Tree.Exp externalCall(String func,Tree.ExpList args);}where externalCall takes the name of the external procedure and the arguments to bepassed.The implementation of externalCall depends on the relationship between MiniJava'sprocedure-call convention and that of the external function. The simplest possibleimplementation looks likeTree.Exp externalCall(String s, Tree.ExpList args) {return new Tree.CALL(new Tree.NAME(new Temp.Label(s)),args);}but may have to be adjusted for static links, or underscores in labels, and so on. Also, callingnew Label(s) repeatedly with the same s makes several label objects that all mean the samething; this may confuse other parts of the compiler, so it might be useful to maintain a stringto-label table to avoid duplication.WHILE LOOPSThe general layout of a while loop is134test:if not(condition) goto done bodygoto testdone:If a break statement occurs within the body (and not nested within any interior whilestatements), the translation is simply a JUMP to done.Translation of break statements needs to have a new formal parameter break that is the donelabel of the nearest enclosing loop.
In translating a while loop, the translator will be calledrecursively upon body with the done label passed as the break parameter. When the translatoris recursively calling itself in nonloop contexts, it can simply pass down the same breakparameter that was passed to it.FOR LOOPSA for statement can be expressed using other kinds of statements:A straightforward approach to the translation of for statements is to rewrite the abstractsyntax into the abstract syntax of the while statement shown, and then translate the result.This is almost right, but consider the case where limit=maxint. Then i + 1 will overflow;either a hardware exception will be raised, or i ≤ limit will always be true! The solution is toput the test at the bottom of the loop, where i < limit can be tested before the increment.
Thenan extra test will be needed before entering the loop to check lo ≤ hi.FUNCTION CALLTranslating a function call f(a1, …an) is simple:where lf is the label for f. In an object-oriented language, the implicit variable this must bemade an explicit argument of the call. That is, p.m(a1, …an) is translated aswhere p belongs to class c, and c$m is the m method of class c. For a static method, thecomputation of address lc$m can be done at compile time - it's a simple label, as it is inMiniJava. For dynamic methods, the computation is more complicated, as explained inChapter 14.135STATIC LINKSSome programming languages (such as Pascal, Scheme, and ML) support nesting of functionsso that the inner functions can refer to variables declared in the outer functions. Whenbuilding a compiler for such a language, frame representations and variable access are a bitmore complicated.When a variable x is declared at an outer level of static scope, static links must be used.
Thegeneral form iswhere the k1, …, kn−1 are the various static-link offsets in nested functions, and kn is the offsetof x in its own frame.In creating TEMP variables, those temporaries that escape (i.e., are called from within an innerfunction) must be allocated in the stack frame, not in a register. When accessing such atemporary from either the same function or an inner function, we must pass the appropriatestatic link.
The exp method of Frame.Access would need to calculate the appropriate chain ofdereferences.Translating a function call f(a1, …an) using static links requires that the static link must beadded as an implicit extra argument:Here lf is the label for f, and sl is the static link, computed as described in Chapter 6. To dothis computation, both the level of f and the level of the function calling f are required.
Achain of (zero or more) offsets found in successive level descriptors is fetched, starting withthe frame pointer TEMP(FP) defined by the Frame module.[1]A different way of checking for nil is to unmap page 0 in the virtual-memory page tables,so that attempting to fetch/store fields of a nil record results in a page fault.7.3 DECLARATIONSFor each variable declaration within a function body, additional space will be reserved in theframe. Also, for each function declaration, a new "fragment" of Tree code will be kept for thefunction's body.VARIABLE DEFINITIONThe translation of a variable declaration should return an augmented type environment that isused in processing the remainder of the function body.However, the initialization of a variable translates into a Tree expression that must be put justbefore the body of the function.
Therefore, the translator must return a Translate.Expcontaining assignment expressions that accomplish these initializations.136If the translator is applied to function and type declarations, the result will be a "no-op"expression such as Ex(CONST(0)).FUNCTION DEFINITIONA function is translated into a segment of assembly language with a prologue, a body, andanepilogue. The body of a function is an expression, and the body of the translation is simply thetranslation of that expression.The prologue, which precedes the body in the assembly-language version of the function,contains1.
pseudo-instructions, as needed in the particular assembly language, to announce thebeginning of a function;2. a label definition for the function name;3. an instruction to adjust the stack pointer (to allocate a new frame);4. instructions to save "escaping" arguments into the frame, and to move nonescapingarguments into fresh temporary registers;5. store instructions to save any callee-save registers - including the return addressregister - used within the function.Then comes6.
the function body.The epilogue comes after the body and contains7. an instruction to move the return value (result of the function) to the register reservedfor that purpose;8. load instructions to restore the callee-save registers;9. an instruction to reset the stack pointer (to deallocate the frame);10. a return instruction (JUMP to the return address);11.
pseudo-instructions, as needed, to announce the end of a function.Some of these items (1, 3, 9, and 11) depend on exact knowledge of the frame size, which willnot be known until after the register allocator determines how many local variables need to bekept in the frame because they don't fit in registers.
So these instructions should be generatedvery late, in a FRAME function called procEntryExit3 (see also page 252). Item 2 (and 10),nestled between 1 and 3 (and 9 and 11, respectively) are also handled at that time.To implement 7, the Translate phase should generate a move instructionMOVE(RV, body)that puts the result of evaluating the body in the return value (RV) location specified by themachine-specific frame structure:package Frame;public abstract class Frame {⋮abstract public Temp RV();}Item 4 (moving incoming formal parameters), and 5 and 8 (the saving and restoring of calleesave registers), are part of the view shift described on page 128.