K. Cooper, L. Torczon - Engineering a Compiler (2011 - 2nd edition) (798440), страница 82
Текст из файла (страница 82)
Thus, the second phase of the collector can treat thatobject as dead. Some objects marked as live may also be dead. However, thecollector lets them survive because it cannot prove them dead. As the secondphase traverses the heap to collect the garbage, it can reset the mark fields to“unmarked.” This lets the collector avoid the initial traversal of the heap inthe marking phase.Mark-Sweep CollectorsMark-sweep collectors reclaim and recycle objects by making a linear passover the heap. The collector adds each unmarked object to the free list (orone of the free lists), where the allocator will find it and reuse it.
With asingle free list, the same collection of techniques used to coalesce blocksin the first-fit allocator applies. If compaction is desirable, it can be implemented by incrementally shuffling live objects downward during the sweep,or with a postsweep compaction pass.Copying CollectorsCopying collectors divide memory into two pools, an old pool and a newpool.
The allocator always operates from the old pool. The simplest type ofcopying collector is called stop and copy. When an allocation fails, a stopand copy collector copies all the live data from the old pool into the new pooland swaps the identities of the old and new pools. The act of copying livedata compacts it; after collection, all the free space is in a single contiguousblock. Collection can be done in two passes, like mark sweep, or it can bedone incrementally, as live data is discovered.
An incremental scheme canmark objects in the old pool as it copies them to avoid copying the sameobject multiple times.An important family of copying collectors are the generational collectors.These collectors capitalize on the observation that an object that survives6.6 Advanced Topics 321one collection is more likely to survive subsequent collections.
To capitalizeon this observation, generational collectors periodically repartition their“new” pool into a “new” and an “old” pool. In this way, successive collections examine only newly allocated objects. Generational schemes varyin how often they declare a new generation, freezing the surviving objectsand exempting them from the next collection, and whether or not theyperiodically re-examine the older generations.Comparing the TechniquesGarbage collection frees the programmer from needing to worry about whento release memory and from tracking down the inevitable storage leaksthat result from attempting to manage allocation and deallocation explicitly.The individual schemes have their strengths and weaknesses. In practice,the benefits of implicit deallocation outweigh the disadvantages of eitherscheme for most applications.Reference counting distributes the cost of deallocation more evenly across program execution than does batch collection.
However, it increases the cost ofevery assignment that involves a heap-allocated value—even if the programnever runs out of free space. In contrast, batch collectors incur no cost until theallocator fails to find needed space. At that point, however, the program incursthe full cost of collection. Thus, any allocation can provoke a collection.Mark-sweep collectors examine the entire heap, while copying collectorsonly examine the live data.
Copying collectors actually move every liveobject, while mark-sweep collectors leave them in place. The tradeoffbetween these costs will vary with the application’s behavior and with theactual costs of various memory references.Reference-counting implementations and conservative batch collectors haveproblems recognizing cyclic structures, because they cannot distinguishbetween references from within the cycle and those from without. The marksweep collectors start from an external set of pointers, so they discover thata dead cyclic structure is unreachable. The copying collectors, starting fromthe same set of pointers, simply fail to copy the objects involved in the cycle.Copying collectors compact memory as a natural part of the process.
Thecollector can either update all the stored pointers, or it can require use ofan indirection table for each object access. A precise mark-sweep collectorcan compact memory, too. The collector would move objects from one endof memory into free space at the other end. Again, the collector can eitherrewrite the existing pointers or mandate use of an indirection table.In general, a good implementor can make both mark sweep and copying work well enough that they are acceptable for most applications.
In322 CHAPTER 6 The Procedure Abstractionapplications that cannot tolerate unpredictable overhead, such as real-timecontrollers, the runtime system must incrementalize the process, as the amortized reference-counting scheme does. Such collectors are called real-timecollectors.6.7 SUMMARY AND PERSPECTIVEThe primary rationale for moving beyond assembly language is to provide a more abstract programming model and, thus, raise both programmerproductivity and the understandability of programs.
Each abstraction that aprogramming language supports needs a translation to the target machine’sinstruction set. This chapter has explored the techniques commonly used totranslate some of these abstractions.Procedural programming was invented early in the history of programming.Some of the first procedures were debugging routines written for early computers; the availability of these prewritten routines allowed programmers tounderstand the runtime state of an errant program. Without such routines,tasks that we now take for granted, such as examining the contents of a variable or asking for a trace of the call stack, required the programmer to enterlong machine-language sequences without error.The introduction of lexical scoping in languages like Algol 60 influencedlanguage design for decades.
Most modern programming languages carryforward some of Algol’s philosophy toward naming and addressability.Techniques developed to support lexical scoping, such as access links anddisplays, reduced the runtime cost of this abstraction. These techniques arestill used today.Object-oriented languages take the scoping concepts of alls and reorientthem in data-directed ways. The compiler for an object-oriented languageuses both compile-time and runtime structures invented for lexical scopingto implement the naming discipline imposed by the inheritance hierarchy ofa specific program.Modern languages have added some new twists.
By making proceduresfirst-class objects, languages like Scheme have created new controlflow paradigms. These require variations on traditional implementationtechniques—for example, heap allocation of activation records. Similarly,the growing acceptance of implicit deallocation requires occasional conservative treatment of a pointer. If the compiler can exercise a little more careand free the programmer from ever deallocating storage again, that appearsto be a good tradeoff.
(Generations of experience suggest that programmersChapter Notes 323are not effective at freeing all the storage that they allocate. They also freeobjects to which they retain pointers.)As new programming paradigms emerge, they will introduce new abstractions that require careful thought and implementation. By studying thesuccessful techniques of the past and understanding the constraints and costsinvolved in real implementations, compiler writers will develop strategiesthat decrease the runtime penalty for using higher levels of abstraction.nCHAPTER NOTESMuch of the material in this chapter comes from the accumulated experienceof the compiler-construction community. The best way to learn more aboutthe name-space structures of various languages is to consult the languagedefinitions themselves.
These documents are a necessary part of a compilerwriter’s library.Procedures appeared in the earliest high-level languages—that is, languages that were more abstract than assembly language. fortran [27] andAlgol 60 [273] both had procedures with most of the features found in modern languages. Object-oriented languages appeared in the late 1960s withsimula 67 [278] followed by Smalltalk 72 [233].Lexical scoping was introduced in Algol 60 and has persisted to the presentday. The early Algol compilers introduced most of the support mechanisms described in this chapter, including activation records, access links,and parameter-passing techniques.
Much of the material from Sections 6.3through 6.5 was present in these early systems [293]. Optimizations quicklyappeared, like folding storage for a block-level scope into the containingprocedure’s activation record. The ibm 370 linkage conventions recognizedthe difference between leaf procedures and others; they avoided allocatinga register save area for leaf routines. Murtagh took a more complete andsystematic approach to coalescing activation records [272].The classic reference on memory allocation schemes is Knuth’s Art of Computer Programming [231, § 2.5]. Modern multipool allocators appeared inthe early 1980s. Reference counting dates to the early 1960s and has beenused in many systems [95, 125].
Cohen and later Wilson, provide broad surveys of the literature on garbage collection [92, 350]. Conservative collectors were introduced by Boehm and Weiser [44, 46, 120]. Copying collectorsappeared in response to virtual memory systems [79, 144]; they led, somewhat naturally, to the generational collectors in widespread use today [247,337]. Hanson introduced the notion of arena-based allocation [179].324 CHAPTER 6 The Procedure AbstractionnSection 6.2EXERCISES1.