A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition) (798439), страница 51
Текст из файла (страница 51)
Link-timegraph coloring poses many problems for a system that allows dynamic incremental linking.Hashing Instead of global graph coloring, we can put a hash table in each class descriptor,mapping field names to offsets and method names to method instances. This works well withseparate compilation and dynamic linking.The characters of the field names are not hashed at run time. Instead, each field name a ishashed at compile time to an integer hasha in the range [0, N − 1].
Also, for each field name aunique run-time record (pointer) ptra is made.Each class descriptor has a field-offset table Ftab of size N containing field-offsets andmethod instances, and (for purposes of collision detection) a parallel key table Ktabcontaining field-name pointers.
If the class has a field x, then field-offset-table slot numberhashx contains the offset for x, and key-table slot number hashx contains the pointer ptrx.To fetch a field x of object c, the compiler generates code to1.2.3.4.5.Fetch the class descriptor d at offset 0 from object c.Fetch the field name f from the address offset d + Ktab + hashx.Test whether f = ptrx; if soFetch the field offset k from d + Ftab + hashx.Fetch the contents of the field from c + k.This algorithm has four instructions of overhead, which may still be tolerable.
A similaralgorithm works for dynamic method-instance lookup.The algorithm as described does not say what to do if the test at line 3 fails. Any hash-tablecollision-resolution technique can be used.[1]Distinct field name does not mean simple equivalence of strings. Each fresh declaration offield or method x (where it is not overriding the x of a parent class) is really a distinct name.14.4 TESTING CLASS MEMBERSHIPSome object-oriented languages allow the program to test membership of an object in a classat run time, as summarized in Table 14.6.Table 14.6.
Facilities for type testing and safe casting.Modula-3Test whether object x belongs class C, or to any subclass of C.Given a variable x of class C, where x actually points to anobject of class D that extends C, yield an expression whosecompile-time type is class D.JavaISTYPE(x,C)xinstanceofCNARROW(x,D)(D)x240Since each object points to its class descriptor, the address of the class descriptor can serve asa "type-tag." However, if x is an instance of D, and D extends C, then x is also an instance of C.Assuming there is no multiple inheritance, a simple way to implement x instanceof C is togenerate code that performs the following loop at run time:goto L1where t1.super is the superclass (parent class) of class t1.However, there is a faster approach using a display of parent classes. Assume that the classnesting depth is limited to some constant, such as 20. Reserve a 20-word block in each classdescriptor.
In the descriptor for a class D whose nesting depth is j, put a pointer to descriptor Din the jth slot, a pointer to D.super in the (j − 1)th slot, a pointer to D.super.super in slot j− 2, andsoonupto Object in slot 0. In all slots numbered greater than j, put nil.Now, if x is an instance of D, or of any subclass of D, then the jth slot of x's class descriptorwill point to the class descriptor D. Otherwise it will not.
So x instanceof D requires1. Fetch the class descriptor d at offset 0 from object c.2. Fetch the jth class-pointer slot from d.3. Compare with the class descriptor D.This works because the class-nesting depth of D is known at compile time.Type coercions Given a variable c of type C, it is always legal to treat c as any supertype of C- if C extends B, and variable b has type B, then the assignment b ← c is legal and safe.But the reverse is not true. The assignment c ← b is safe only if b is really (at run time) aninstance of C, which is not always the case. If we have b ← new B, c ← b, followed byfetching some field of c that is part of class C but not class B, then this fetch will lead tounpredictable behavior.Thus, safe object-oriented languages (such as Modula-3 and Java) accompany any coercionfrom a superclass to a subclass with a run-time type-check that raises an exception unless therun-time value is really an instance of the subclass (e.g., unless b instanceof C).It is a common idiom to writeModula-3:IF ISTYPE(b,C)THEN f(NARROW(b,C))ELSE ...Java:if (b instanceof C)f((C)b)else ...241Now there are two consecutive, identical type tests: one explicit (ISTYPE or instanceof)and one implicit (in NARROW or the cast).
A good compiler will do enough flow analysis tonotice that the then-clause is reached only if b is in fact an instance of C, so that the typecheck in the narrowing operation can be eliminated.C++ is an unsafe object-oriented language. It has a static cast mechanism without run-timechecking; careless use of this mechanism can make the program "go wrong" in unpredictableways. C++ also has dynamic_cast with run-time checking, which is like the mechanisms inModula-3 and Java.Typecase Explicit instanceof testing, followed by a narrowing cast to a subclass, is not awholesome "object-oriented" style. Instead of using this idiom, programmers are expected touse dynamic methods that accomplish the right thing in each subclass.
Nevertheless, the testthen-narrow idiom is fairly common.Modula-3 has a typecase facility that makes the idiom more beautiful and efficient (but notany more "object-oriented"):TYPECASE exprOF C1 (v1) => S1| C2 (v2) => S2⋮| Cn (vn) => SnELSE S0ENDIf the expr evaluates to an instance of class Ci, then a new variable vi of type Ci points to theresult of the expr, and statement Si is executed. The declaration of vi is implicit in theTYPECASE, and its scope covers only Si.If more than one of the Ci match (which can happen if, for example, one is a superclass ofanother), then only the first matching clause is taken.
If none of the Ci match, then the ELSEclause is taken (statement S0 is executed).Typecase can be converted straightforwardly to a chain of else-ifs, with each if doing aninstance test, a narrowing, and a local variable declaration. However, if there are very manyclauses, then it can take a long time to go through all the else-ifs. Therefore it is attractive totreat it like a case (switch) statement on integers, using an indexed jump (computed goto).That is, an ordinary case statement on integers:ML:case iof 0 => s0| 1=> s1| 2=> s2| 3=> s3| 4=> s4|_=> sdC, Java:switch (i) {case 0: s0; break;case 1: s1; break;case 2: s2; break;case 3: s3; break;case 4: s4; break;default: sd;}is compiled as follows: First a range-check comparison is made to ensure that i is within therange of case labels (0-4, in this case); then the address of the ith statement is fetched from theith slot of a table, and control jumps to si.242This approach will not work for typecase, because of subclassing.
That is, even if we couldmake class descriptors be small integers instead of pointers, we cannot do an indexed jumpbased on the class of the object, because we will miss clauses that match superclasses of thatclass. Thus, Modula-3 typecase is implemented as a chain of else-ifs.Assigning integers to classes is not trivial, because separately compiled modules can eachdefine their own classes, and we do not want the integers to clash. But a sophisticated linkermight be able to assign the integers at link time.If all the classes in the typecase were final classes (in the sense used by Java, that theycannot be extended), then this problem would not apply.
Modula-3 does not have finalclasses; and Java does not have typecase. But a clever Java system might be able to recognizea chain of else-ifs that do instanceof tests for a set of final classes, and generate a indexedjump.14.5 PRIVATE FIELDS AND METHODSTrue object-oriented languages can protect fields of objects from direct manipulation by otherobjects' methods. A private field is one that cannot be fetched or updated from any function ormethod declared outside the object; a private method is one that cannot be called from outsidethe object.Privacy is enforced by the type-checking phase of the compiler. In the symbol table of C,along with each field offset and method offset, is a boolean flag indicating whether the field isprivate.
When compiling the expression c.f() or c.x, it is a simple matter to check that fieldand reject accesses to private fields from any method outside the object declaration.There are many varieties of privacy and protection. Different languages allow••••Fields and methods which are accessible only to the class that declares them.Fields and methods accessible to the declaring class, and to any subclasses of thatclass.Fields and methods accessible only within the same module (package, namespace) asthe declaring class.Fields that are read-only from outside the declaring class, but writable by methods ofthe class.In general, these varieties of protection can be statically enforced by compiletime typechecking, for class-based languages.14.6 CLASSLESS LANGUAGESSome object-oriented languages do not use the notion of class at all.
In such a language, eachobject implements whatever methods and has whatever data fields it wants. Type-checking forsuch languages is usually dynamic (done at run time) instead of static (done at compile time).Many objects are created by cloning: copying an existing object (or template object) and thenmodifying some of the fields. Thus, even in a classless language there will be groups("pseudo-classes") of similar objects that can share descriptors. When b is created by cloninga, it can share a descriptor with a.
Only if a new field is added or a method field is updated(overridden) does b require a new descriptor.243The techniques used in compiling classless languages are similar to those for class-basedlanguages with multiple inheritance and dynamic linking: Pseudo-class descriptors containhash tables that yield field offsets and method instances.The same kinds of global program analysis and optimization that are used for class-basedlanguages - finding which method instance will be called from a (dynamic) method call site are just as useful for classless languages.14.7 OPTIMIZING OBJECT-ORIENTED PROGRAMSAn optimization of particular importance to object-oriented languages (which also benefitfrom most optimizations that apply to programming languages in general) is the conversion ofdynamic method calls to static method-instance calls.Compared with an ordinary function call, at each method call site there is a dynamic methodlookup to determine the method instance.