Cooper_Engineering_a_Compiler(Second Edition) (1157546), страница 81
Текст из файла (страница 81)
Whenthe last external pointer is discarded, the cycle becomes bothunreachable and nonrecyclable. To ensure that all such objects are freed,the programmer must break the cycle before discarding the last pointerto the cycle. (The alternative, to perform reachability analysis on thepointers at runtime, would make reference counting prohibitivelyexpensive.) Many categories of heap-allocated objects, such asvariable-length strings and activation records, cannot be involved insuch cycles.Reference counting incurs additional cost on every pointer assignment. Theamount of work done for a specific pointer assignment can be bounded; inany well-designed scheme, the total cost can be limited to some constantfactor times the number of pointer assignments executed plus the numberof objects allocated.
Proponents of reference counting argue that these overheads are small enough and that the pattern of reuse in reference-countingsystems produces good program locality. Opponents of reference counting argue that real programs do more pointer assignments than allocations,so that garbage collection achieves equivalent functionality with less totalwork.Batch CollectorsIf the collector cannot free any space, then itmust request additional space from the system.If none is available, allocation fails.Batch collectors consider deallocation only when the free-space pool hasbeen exhausted. When the allocator fails to find needed space, it invokesthe batch collector.
The collector pauses the program’s execution, examines the pool of allocated memory to discover unused objects, and reclaimstheir space. When the collector terminates, the free-space pool is usuallynonempty. The allocator can finish its original task and return a newly allocated object to the caller. (As with reference counting, schemes exist thatperform collection incrementally to amortize the cost over longer periods ofexecution.)6.6 Advanced Topics 319Logically, batch collectors proceed in two phases.
The first phase discovers the set of objects that can be reached from pointers stored in programvariables and compiler-generated temporaries. The collector conservativelyassumes that any object reachable in this manner is live and that the remainder are dead. The second phase deallocates and recycles dead objects.Two commonly used techniques are mark-sweep collectors and copyingcollectors. They differ in their implementation of the second phase ofcollection—recycling.Identifying Live DataCollecting allocators discover live objects by using a marking algorithm.The collector needs a bit for each object in the heap, called a mark bit. Thisbit can be stored in the object’s header, alongside tag information used torecord pointer locations or object size.
Alternatively, the collector can createa dense bit map for the heap when needed. The initial step clears all the markbits and builds a worklist that contains all the pointers stored in registers andin variables accessible to current or pending procedures. The second phaseof the algorithm walks forward from these pointers and marks every objectthat is reachable from this set of visible pointers.Figure 6.12 presents a high-level sketch of a marking algorithm.
It is a simplefixed-point computation that halts because the heap is finite and the marksprevent a pointer contained in the heap from entering the Worklist morethan once. The cost of marking is, in the worst case, proportional to thenumber of pointers contained in program variables and temporaries plus thesize of the heap.The marking algorithm can be either precise or conservative. The differencelies in how the algorithm determines that a specific data value is a pointer inthe final line of the while loop.nnIn a precise collector, the compiler and runtime system know the typeand layout of each object.
This information can be recorded in objectheaders, or it can be known implicitly from the type system. Either way,the marking phase only follows real pointers.In a conservative marking phase, the compiler and runtime system maybe unsure about the type and layout of some objects.
Thus, when anobject is marked, the system considers each field that may be a possiblepointer. If its value might be a pointer, it is treated as a pointer. Anyvalue that does not represent a word-aligned address might be excluded,as might values that fall outside the known boundaries of the heap.Conservative collectors have limitations. They fail to reclaim some objectsthat a precise collector would find.
Nonetheless, conservative collectors have320 CHAPTER 6 The Procedure AbstractionClear all marksWorklist ← { pointer values from activation records & registers }while (Worklist 6 = ∅)remove p from the Worklistif (p→object is unmarked)mark p→objectadd pointers from p→object to Worklistn FIGURE 6.12 A Simple Marking Algorithm.been successfully retrofitted into implementations for languages such as cthat do not normally support garbage collection.When the marking algorithm halts, any unmarked object must be unreachable from the program. Thus, the second phase of the collector can treat thatobject as dead.
Some objects marked as live may also be dead. However, thecollector lets them survive because it cannot prove them dead. As the secondphase traverses the heap to collect the garbage, it can reset the mark fields to“unmarked.” This lets the collector avoid the initial traversal of the heap inthe marking phase.Mark-Sweep CollectorsMark-sweep collectors reclaim and recycle objects by making a linear passover the heap. The collector adds each unmarked object to the free list (orone of the free lists), where the allocator will find it and reuse it. With asingle free list, the same collection of techniques used to coalesce blocksin the first-fit allocator applies.
If compaction is desirable, it can be implemented by incrementally shuffling live objects downward during the sweep,or with a postsweep compaction pass.Copying CollectorsCopying collectors divide memory into two pools, an old pool and a newpool. The allocator always operates from the old pool. The simplest type ofcopying collector is called stop and copy. When an allocation fails, a stopand copy collector copies all the live data from the old pool into the new pooland swaps the identities of the old and new pools.
The act of copying livedata compacts it; after collection, all the free space is in a single contiguousblock. Collection can be done in two passes, like mark sweep, or it can bedone incrementally, as live data is discovered. An incremental scheme canmark objects in the old pool as it copies them to avoid copying the sameobject multiple times.An important family of copying collectors are the generational collectors.These collectors capitalize on the observation that an object that survives6.6 Advanced Topics 321one collection is more likely to survive subsequent collections. To capitalizeon this observation, generational collectors periodically repartition their“new” pool into a “new” and an “old” pool.
In this way, successive collections examine only newly allocated objects. Generational schemes varyin how often they declare a new generation, freezing the surviving objectsand exempting them from the next collection, and whether or not theyperiodically re-examine the older generations.Comparing the TechniquesGarbage collection frees the programmer from needing to worry about whento release memory and from tracking down the inevitable storage leaksthat result from attempting to manage allocation and deallocation explicitly.The individual schemes have their strengths and weaknesses.
In practice,the benefits of implicit deallocation outweigh the disadvantages of eitherscheme for most applications.Reference counting distributes the cost of deallocation more evenly across program execution than does batch collection. However, it increases the cost ofevery assignment that involves a heap-allocated value—even if the programnever runs out of free space. In contrast, batch collectors incur no cost until theallocator fails to find needed space.
At that point, however, the program incursthe full cost of collection. Thus, any allocation can provoke a collection.Mark-sweep collectors examine the entire heap, while copying collectorsonly examine the live data. Copying collectors actually move every liveobject, while mark-sweep collectors leave them in place. The tradeoffbetween these costs will vary with the application’s behavior and with theactual costs of various memory references.Reference-counting implementations and conservative batch collectors haveproblems recognizing cyclic structures, because they cannot distinguishbetween references from within the cycle and those from without. The marksweep collectors start from an external set of pointers, so they discover thata dead cyclic structure is unreachable.
The copying collectors, starting fromthe same set of pointers, simply fail to copy the objects involved in the cycle.Copying collectors compact memory as a natural part of the process. Thecollector can either update all the stored pointers, or it can require use ofan indirection table for each object access.
A precise mark-sweep collectorcan compact memory, too. The collector would move objects from one endof memory into free space at the other end. Again, the collector can eitherrewrite the existing pointers or mandate use of an indirection table.In general, a good implementor can make both mark sweep and copying work well enough that they are acceptable for most applications. In322 CHAPTER 6 The Procedure Abstractionapplications that cannot tolerate unpredictable overhead, such as real-timecontrollers, the runtime system must incrementalize the process, as the amortized reference-counting scheme does. Such collectors are called real-timecollectors.6.7 SUMMARY AND PERSPECTIVEThe primary rationale for moving beyond assembly language is to provide a more abstract programming model and, thus, raise both programmerproductivity and the understandability of programs.