A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition), страница 22
Описание файла
PDF-файл из архива "A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition)", который расположен в категории "". Всё это находится в предмете "конструирование компиляторов" из 7 семестр, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .
Просмотр PDF-файла онлайн
Текст 22 страницы из PDF
For a call e.m(…), where e has type C, we look up thedefinition of m in class C. The parameter types must then be matched against the arguments inthe function-call expression. The result type of the method becomes the type of the methodcall as a whole.Every kind of statement and expression has its own type-checking rules, but in all the caseswe have not already described, the rules can be derived by reference to the Java LanguageSpecification.ERROR HANDLINGWhen the type-checker detects a type error or an undeclared identifier, it should print an errormessage and continue - because the programmer would like to be told of all the errors in the102program.
To recover after an error, it's often necessary to build data structures as if a validexpression had been encountered. For example, type-checking{int i = new C();int j = i + i;...}even though the expression new C() doesn't match the type required to initialize an integer, itis still useful to enter i in the symbol table as an integer so that the rest of the program can betype-checked.If the type-checking phase detects errors, then the compiler should not produce a compiledprogram as output.
This means that the later phases of the compiler - translation, registerallocation, etc. - will not be executed. It will be easier to implement the later phases of thecompiler if they are not required to handle invalid inputs. Thus, if at all possible, all errors inthe input program should be detected in the front end of the compiler (parsing and typechecking).PROGRAM TYPE-CHECKINGDesign a set of visitors which type-checks a MiniJava program and produces any appropriateerror messages about mismatching types or undeclared identifiers.EXERCISES••5.1 Improve the hash table implementation of Program 5.2: Double the size of thearray when the average bucket length grows larger than 2 (so table is now a pointerto a dynamically allocated array).
To double an array, allocate a bigger one and rehashthe contents of the old array; then discard the old array.***5.2 In many applications, we want a + operator for environments that does morethan add one new binding; instead of σ′ = σ + {a ↦ τ}, we want σ′ = σ1 + σ2, whereσ1 and σ2 are arbitrary environments (perhaps overlapping, in which case bindings inσ2 take precedence).We want an efficient algorithm and data structure for environment "adding." Balancedtrees can implement σ + {a ↦ τ} efficiently (in log(N) time, where N is the size of σ),but take O(N) to compute σ1 + σ2 if σ1 and σ2 are both about size N.To abstract the problem, solve the general nondisjoint integer-set union problem.
Theinput is a set of commands of the form,103An efficient algorithm is one that can process an input of N commands, answering allmembership queries, in less than o(N2) time.••*a. Implement an algorithm that is efficient when a typical set union a ← b∪ c has bmuch smaller than c [Brown and Tarjan 1979].***b. Design an algorithm that is efficient even in the worst case, or prove that thiscan't be done (see Lipton et al. [1997] for a lower bound in a restricted model).104Chapter 6: Activation Recordsstack: an orderly pile or heapWebster's DictionaryOVERVIEWIn almost any modern programming language, a function may have local variables that arecreated upon entry to the function.
Several invocations of the function may exist at the sametime, and each invocation has its own instantiations of local variables.In the Java methodint f(int x) {int y = x+x;if (y<10)return f(y);elsereturn y-1;}a new instantiation of x is created (and initialized by f's caller) each time that f is called.Because there are recursive calls, many of these x's exist simultaneously. Similarly, a newinstantiation of y is created each time the body of f is entered.In many languages (including C, Pascal, and Java), local variables are destroyed when afunction returns. Since a function returns only after all the functions it has called havereturned, we say that function calls behave in last-in-first-out (LIFO) fashion. If localvariables are created on function entry and destroyed on function exit, then we can use a LIFOdata structure - a stack - to hold them.HIGHER-ORDER FUNCTIONSBut in languages supporting both nested functions and function-valued variables, it may benecessary to keep local variables after a function has returned! Consider Program 6.1: This islegal in ML, but of course in C one cannot really nest the function g inside the function f.PROGRAM 6.1: An example of higher-order functions.int (*)() f(int x) {fun f(x) =int g(int y) {returnlet fun g(y) = x+yx+y;}in greturn g;end}val h = f(3)int (*h)() = f(3);val j = f(4)int (*j)() = f(4);val z = h(5)int z = h(5);val w = j(7)int w = j(7);(a) Written in ML(b) Written in pseudo-C105When f(3) is executed, a new local variable x is created for the activation of function f.
Then gis returned as the result of f(x); but g has not yet been called, so y is not yet created.At this point f has returned, but it is too early to destroy x, because when h(5) is eventuallyexecuted it will need the value x = 3. Meanwhile, f(4) is entered, creating a different instanceof x, and it returns a different instance of g in which x = 4.It is the combination of nested functions (where inner functions may use variables defined inthe outer functions) and functions returned as results (or stored into variables) that causeslocal variables to need lifetimes longer than their enclosing function invocations.Pascal has nested functions, but it does not have functions as returnable values.
C hasfunctions as returnable values, but not nested functions. So these languages can use stacks tohold local variables.ML, Scheme, and several other languages have both nested functions and functions asreturnable values (this combination is called higher-order functions). So they cannot usestacks to hold all local variables. This complicates the implementation of ML and Scheme but the added expressive power of higher-order functions justifies the extra implementationeffort.For the remainder of this chapter we will consider languages with stackable local variablesand postpone discussion of higher-order functions to Chapter 15.
Notice that while Javaallows nesting of functions (via inner classes), MiniJava does not.6.1 STACK FRAMESThe simplest notion of a stack is a data structure that supports two operations, push and pop.However, it turns out that local variables are pushed in large batches (on entry to functions)and popped in large batches (on exit). Furthermore, when local variables are created they arenot always initialized right away. Finally, after many variables have been pushed, we want tocontinue accessing variables deep within the stack.
So the abstract push and pop model is justnot suitable.Instead, we treat the stack as a big array, with a special register - the stack pointer - that pointsat some location. All locations beyond the stack pointer are considered to be garbage, and alllocations before the stack pointer are considered to be allocated. The stack usually grows onlyat the entry to a function, by an increment large enough to hold all the local variables for thatfunction, and, just before the exit from the function, shrinks by the same amount. The area onthe stack devoted to the local variables, parameters, return address, and other temporaries fora function is called the function's activation record or stack frame.
For historical reasons, runtime stacks usually start at a high memory address and grow toward smaller addresses. Thiscan be rather confusing: Stacks grow downward and shrink upward, like icicles.The design of a frame layout takes into account the particular features of an instruction setarchitecture and the programming language being compiled. However, the manufacturer of acomputer often prescribes a "standard" frame layout to be used on that architecture, wherepossible, by all compilers for all programming languages.
Sometimes this layout is not themost convenient one for a particular programming language or compiler. But by using the"standard" layout, we gain the considerable benefit that functions written in one language cancall functions written in another language.106Figure 6.2 shows a typical stack frame layout. The frame has a set of incoming arguments(technically these are part of the previous frame but they are at a known offset from the framepointer) passed by the caller. The return address is created by the CALL instruction and tellswhere (within the calling function) control should return upon completion of the currentfunction. Some local variables are in this frame; other local variables are kept in machineregisters.
Sometimes a local variable kept in a register needs to be saved into the frame tomake room for other uses of the register; there is an area in the frame for this purpose. Finally,when the current function calls other functions, it can use the outgoing argument space to passparameters.Figure 6.2: A stack frame.107THE FRAME POINTERSuppose a function g(…) calls the function f(a1,…, an). We say g is the caller and f is thecallee. On entry to f, the stack pointer points to the first argument that g passes to f . On entry,f allocates a frame by simply subtracting the frame size from the stack pointer SP.The old SP becomes the current frame pointer FP.