K. Cooper, L. Torczon - Engineering a Compiler (2011 - 2nd edition) (798440), страница 46
Текст из файла (страница 46)
These languages support element-by-element application of arithmetic operations to arrays. For4.2 An Introduction to Type Systems 17310 × 10 arrays x, y, and z, indexed from 1 to 10, the statement x = y + z wouldoverwrite each x[i,j] with y[i,j] + z[i,j] for all 1 ≤ i, j ≤ 10. apl takesthe notion of array operations further than most languages; it includes operators for inner product, outer product, and several kinds of reductions. Forexample, the sum reduction of y, written x ← +/y, assigns x the scalar sumof the elements of y.An array can be viewed as a constructed type because we construct an arrayby specifying the type of its elements.
Thus, a 10 × 10 array of integers hastype two-dimensional array of integers. Some languages include the array’sdimensions in its type; thus a 10 × 10 array of integers has a different typethan a 12 × 12 array of integers. This lets the compiler catch array operationsin which dimensions are incompatible as a type error. Most languages allowarrays of any base type; some languages allow arrays of constructed typesas well.StringsSome programming languages treat strings as a constructed type. pl/i, forexample, has both bit strings and character strings.
The properties, attributes,and operations defined on both of these types are similar; they are propertiesof a string. The range of values allowed in any position differs between a bitstring and a character string. Thus, viewing them as string of bit and string ofcharacter is appropriate. (Most languages that support strings limit the builtin support to a single string type—the character string.) Other languages,such as c, support character strings by handling them as arrays of characters.A true string type differs from an array type in several important ways. Operations that make sense on strings, such as concatenation, translation, andcomputing the length, may not have analogs for arrays. Conceptually, stringcomparison should work from lexicographic order, so that "a" < "boo" and"fee" < "fie". The standard comparison operators can be overloaded andused in the natural way.
Implementing comparison for an array of characterssuggests an equivalent comparison for an array of numbers or an array ofstructures, where the analogy to strings may not hold. Similarly, the actuallength of a string may differ from its allocated size, while most uses of anarray use all the allocated elements.Enumerated TypesMany languages allow the programmer to create a type that contains a specific set of constant values.
An enumerated type, introduced in Pascal, letsthe programmer use self-documenting names for small sets of constants.Classic examples include days of the week and months. In c syntax, thesemight be174 CHAPTER 4 Context-Sensitive Analysisenum WeekDay {Monday, Tuesday, Wednesday,Thursday, Friday, Saturday, Sunday};enum Month {January, February, March, April,May, June, July, August, September,October, November, December};The compiler maps each element of an enumerated type to a distinct value.The elements of an enumerated type are ordered, so comparisons betweenelements of the same type make sense. In the examples, Monday < Tuesdayand June < July.
Operations that compare different enumerated typesmake no sense—for example, Tuesday > September should produce a typeerror, Pascal ensures that each enumerated type behaves as if it were a subrange of the integers. For example, the programmer can declare an arrayindexed by the elements of an enumerated type.Structures and VariantsStructures, or records, group together multiple objects of arbitrary type. Theelements, or members, of the structure are typically given explicit names.For example, a programmer implementing a parse tree in c might need nodeswith both one and two children.struct Node1 {struct Node2 {structNode2 *left;Node2 *right;unsigned Operator;structNode1 *left;unsigned Operator;intstructValueint}Value}The type of a structure is the ordered product of the types of the individual elements that it contains. Thus, we might describe the type ofa Node1 as (Node1 *) × unsigned × int, while a Node2 would be(Node2 *) × (Node2 *) × unsigned × int.
These new types should havethe same essential properties that a base type has. In c, autoincrementinga pointer to a Node1 or casting a pointer into a Node1 * has the desiredeffect—the behavior is analogous to what happens for a base type.Many programming languages allow the creation of a type that is the unionof other types. For example, some variable x can have the type integer orboolean or WeekDay. In Pascal, this is accomplished with variant records—a record is the Pascal term for a structure.
In c, this is accomplished with aunion. The type of a union is the alternation of its component types; thusour variable x has type integer ∪ boolean ∪ WeekDay. Unions can also4.2 An Introduction to Type Systems 175AN ALTERNATIVE VIEW OF STRUCTURESThe classical view of structures treats each kind of structure as a distincttype.
This approach to structure types follows the treatment of other aggregates, such as arrays and strings. It seems natural. It makes distinctions thatare useful to the programmer. For example, a tree node with two childrenprobably should have a different type than a tree node with three children;presumably, they are used in different situations. A program that assignsa three-child node to a two-child node should generate a type error and awarning message to the programmer.From the perspective of the runtime system, however, treating each structure as a distinct type complicates the picture.
With distinct structure types,the heap contains an arbitrary set of objects drawn from an arbitrary setof types. This makes it difficult to reason about programs that deal directlywith the objects on the heap, such as a garbage collector. To simplify suchprograms, their authors sometimes take a different approach to structuretypes.This alternate model considers all structures in the program as a singletype. Individual structure declarations each create a variant form of thetype structure. The type structure, itself, is the union of all these variants.This approach lets the program view the heap as a collection of objects ofa single type, rather than a collection of many types. This view makes codethat manipulates the heap much simpler to analyze and optimize.include structures of distinct types, even when the individual structure typeshave different lengths.
The language must provide a mechanism to referenceeach field unambiguously.PointersPointers are abstract memory addresses that let the programmer manipulatearbitrary data structures. Many languages include a pointer type. Pointerslet a program save an address and later examine the object that it addresses.Pointers are created when objects are created (new in Java or malloc in c).Some languages provide an operator that returns the address of an object,such as c’s & operator.To protect programmers from using a pointer to type t to reference a structureof type s, some languages restrict pointer assignment to “equivalent” types.In these languages, the pointer on the left-hand side of an assignment musthave the same type as the expression on the right-hand side. A program canlegally assign a pointer to integer to a variable declared as pointer to integerbut not to one declared as pointer to pointer to integer or pointer to boolean.The address operator, when applied to an objectof type t, returns a value of type pointer to t.176 CHAPTER 4 Context-Sensitive AnalysisThese latter assignments are either illegal or require an explicit conversionby the programmer.PolymorphismA function that can operate on arguments ofdifferent types is a polymorphic function.If the set of types must be specified explicitly, thefunction uses ad hoc polymorphism; if thefunction body does not specify types, it usesparametric polymorphism.Of course, the mechanism for creating new objects should return an objectof the appropriate type.
Thus, Java’s new explicitly creates a typed object;other languages use a polymorphic routine that takes the return type as aparameter. ansi c handles this in an unusual way: The standard allocationroutine malloc returns a pointer to void. This forces the programmer tocast the value returned by each call to malloc.Some languages allow direct manipulation of pointers. Arithmetic on pointers, including autoincrement and autodecrement, allow the program toconstruct new pointers.
c uses the type of a pointer to determine autoincrement and decrement magnitudes. The programmer can set a pointer to thestart of an array; autoincrementing advances the pointer from one elementin the array to the next element.Type safety with pointers relies on an implicit assumption that addressescorrespond to typed objects. The ability to construct new pointers seriously reduces the ability of both the compiler and its runtime system toreason about pointer-based computations and to optimize such code. (See,for example, Section 8.4.1.)Type Equivalencestruct Tree {struct Tree *left;struct Tree *right;int value}struct STree {struct STree *left;struct STree *right;int value}A critical component of any type system is the mechanism that it uses todecide whether or not two different type declarations are equivalent.