A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition) (798439), страница 48
Текст из файла (страница 48)
In mark-sweep collection, these objects are on the stack; in Cheney'scopying collection, they are between scan and next.Black objects have been marked, and their children also marked. In mark-sweepcollection, they have already been popped off the stack; in Cheney's algorithm, theyhave already been scanned.The collection starts with all objects white; the collector executes Algorithm 13.13,blackening grey objects and greying their white children. Implicit in changing an object fromgrey to black is removing it from the stack or queue; implicit in greying an object is putting itinto the stack or queue. When there are no grey objects, then all white objects must begarbage.227ALGORITHM 13.13: Basic tricolor markingwhile there are any grey objectsselect a grey record pfor each field fi of pif record p. fi is whitecolor record p.
fi greycolor record p blackAlgorithm 13.13 generalizes all of the mark-sweep and copying algorithms shown so far:Algorithms 13.2, 13.3, 13.5, 13.6, and 13.9. All these algorithms preserve two naturalinvariants:1. No black object points to a white object.2. Every grey object is on the collector's (stack or queue) data structure (which we willcall the grey-set).While the collector operates, the mutator creates new objects (of what color?) and updatespointer fields of existing objects. If the mutator breaks one of the invariants, then thecollection algorithm will not work.Most incremental and concurrent collection algorithms are based on techniques to allow themutator to get work done while preserving the invariants.
For example:•••••Dijkstra, Lamport, et al. Whenever the mutator stores a white pointer a into a blackobject b, it colors a grey. (The compiler generates extra instructions at each store tocheck for this.)Steele. Whenever the mutator stores a white pointer a into a black object b, it colors bgrey (using extra instructions generated by the compiler).Boehm, Demers, Shenker. All-black pages are marked read-only in the virtualmemory system.
Whenever the mutator stores any value into an all-black page, a pagefault marks all objects on that page grey (and makes the page writable).Baker. Whenever the mutator fetches a pointer b to a white object, it colors b grey.The mutator never possesses a pointer to a white object, so it cannot violate invariant1. The instructions to check the color of b are generated by the compiler after everyfetch.Appel, Ellis, Li. Whenever the mutator fetches a pointer b from any virtual-memorypage containing any nonblack object, a page-fault handler colors every object on thepage black (making children of these objects grey).
Thus the mutator never possessesa pointer to a white object.The first three of these are write-barrier algorithms, meaning that each write (store) by themutator must be checked to make sure an invariant is preserved. The last two are read-barrieralgorithms, meaning that read (fetch) instructions are the ones that must be checked. We haveseen write barriers before, for generational collection: Remembered lists, remembered sets,card marking, and page marking are all different implementations of the write barrier.Similarly, the read barrier can be implemented in software (as in Baker's algorithm) or usingthe virtual-memory hardware.Any implementation of a write or read barrier must synchronize with the collector.
Forexample, a Dijkstra-style collector might try to change a white node to grey (and put it intothe grey-set) at the same time the mutator is also greying the node (and putting it into the228grey-set). Thus, software implementations of the read or write barrier will need to use explicitsynchronization instructions, which can be expensive.But implementations using virtual-memory hardware can take advantage of thesynchronization implicit in a page fault: If the mutator faults on a page, the operating systemwill ensure that no other process has access to that page before processing the fault.13.6 BAKER'S ALGORITHMBaker's algorithm illustrates the details of incremental collection.
It is based on Cheney'scopying collection algorithm, so it forwards reachable objects from from-space to to-space.Baker's algorithm is compatible with generational collection, so that the from-space and tospace might be for generation G0, or might be G0 +…+ Gk.To initiate a garbage collection (which happens when an allocate request fails for lack ofunused memory), the roles of the (previous) from-space and to-space are swapped, and all theroots are forwarded; this is called the flip.
Then the mutator is resumed; but each time themutator calls the allocator to get a new record, a few pointers at scan are scanned, so thatscan advances toward next. Then a new record is allocated at the end of the to-space bydecrementing limit by the appropriate amount.The invariant is that the mutator has pointers only to to-space (never to from-space). Thus,when the mutator allocates and initializes a new record, that record need not be scanned;when the mutator stores a pointer into an old record, it is only storing a to-space pointer.If the mutator fetches a field of a record, it might break the invariant.
So each fetch isfollowed by two or three instructions that check whether the fetched pointer points to fromspace. If so, that pointer must be forwarded immediately, using the standard forwardalgorithm.For every word allocated, the allocator must advance scan by at least one word. Whenscan=next, the collection terminates until the next time the allocator runs out of space. If theheap is divided into two semi-spaces of size H / 2, and R < H / 4, then scan will catch up withnext before next reaches halfway through the to-space; also by this time, no more than halfthe to-space will be occupied by newly allocated records.Baker's algorithm copies no more data than is live at the flip. Records allocated duringcollection are not scanned, so they do not add to the cost of collection.
The collection cost isthus c3 R. But there is also a cost to check (at every allocation) whether incremental scanningis necessary; this is proportional to H / 2 − R.But the largest cost of Baker's algorithm is the extra instructions after every fetch, required tomaintain the invariant.
If one in every 10 instructions fetches from a heap record, and each ofthese fetches requires two extra instructions to test whether it is a from-space pointer, thenthere is at least a 20% overhead cost just to maintain the invariant. All of the incremental orconcurrent algorithms that use a software write or read barrier will have a significant cost inoverhead of ordinary mutator operations.22913.7 INTERFACE TO THE COMPILERThe compiler for a garbage-collected language interacts with the garbage collector bygenerating code that allocates records, by describing locations of roots for each garbagecollection cycle, and by describing the layout of data records on the heap.
For some versionsof incremental collection, the compiler must also generate instructions to implement a read orwrite barrier.FAST ALLOCATIONSome programming languages, and some programs, allocate heap data (and generate garbage)very rapidly. This is especially true of programs in functional languages, where updating olddata is discouraged.The most allocation (and garbage) one could imagine a reasonable program generating is oneword of allocation per store instruction; this is because each word of a heap-allocated recordis usually initialized. Empirical measurements show that about one in every seven instructionsexecuted is a store, almost regardless of programming language or program. Thus, we have (atmost) 1/7 word of allocation per instruction executed.Supposing that the cost of garbage collection can be made small by proper tuning of agenerational collector, there may still be a considerable cost to create the heap records.
Tominimize this cost, copying collection should be used so that the allocation space is acontiguous free region; the next free location is next and the end of the region is limit. Toallocate one record of size N, the steps are1. Call the allocate function.2. Test next + N < limit ? (If the test fails, call the garbage collector.)3. Move next into result4. Clear M[next], M[next + 1],…, M[next + N − 1]5. next ← next + N6. Return from the allocate function.A. Move result into some computationally useful place.B.
Store useful values into the record.Steps 1 and 6 should be eliminated by inline expanding the allocate function at each placewhere a record is allocated. Step 3 can often be eliminated by combining it with step A, andstep 4 can be eliminated in favor of step B (steps A and B are not numbered because they arepart of the useful computation; they are not allocation overhead).Steps 2 and 5 cannot be eliminated, but if there is more than one allocation in the same basicblock (or in the same trace; see Section 8.2), the comparison and increment can be sharedamong multiple allocations.