A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition), page 47
First, the roots are forwarded. This copies a few records (those reachable directly from root pointers) to to-space, thereby incrementing next.

Figure 13.10: Breadth-first copying collection.

ALGORITHM 13.9: Breadth-first copying garbage collection.

    scan ← next ← beginning of to-space
    for each root r
        r ← Forward(r)
    while scan < next
        for each field fi of record at scan
            scan.fi ← Forward(scan.fi)
        scan ← scan + size of record at scan

The area between scan and next contains records that have been copied to to-space, but whose fields have not yet been forwarded: In general, these fields point to from-space.
The area between the beginning of to-space and scan contains records that have been copied and forwarded, so that all the pointers in this area point to to-space. The while loop (of Algorithm 13.9) moves scan toward next, but copying records will cause next to move also. Eventually, scan catches up with next after all the reachable data are copied to to-space.

Cheney's algorithm requires no external stack, and no pointer reversal: It uses the to-space area between scan and next as the queue of its breadth-first search. This makes it considerably simpler to implement than depth-first search with pointer reversal.

Locality of reference However, pointer data structures copied by breadth-first search have poor locality of reference: If a record at address a points to another record at address b, it is likely that a and b will be far apart.
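Algorithm 13.9 can be sketched in Java over a simulated flat heap. The encoding here is hypothetical, not the book's: each semispace is an int array, cell 0 of a record holds its size in cells, the remaining cells are pointer fields (indices into the space, −1 for nil), and records are assumed to have at least one field so a forwarding address fits in cell 1.

```java
/** Minimal sketch of Cheney's breadth-first copying collection (Algorithm 13.9).
 *  The flat-array heap encoding is an assumption for illustration only. */
class Cheney {
    static final int NIL = -1;
    static final int FWD = Integer.MIN_VALUE; // header value marking a forwarded record
    final int[] from;
    final int[] to;
    int next;                                 // allocation pointer into to-space

    Cheney(int[] fromSpace) { from = fromSpace; to = new int[fromSpace.length]; }

    /** Forward(p): copy the record at p to to-space if needed; return its new address. */
    int forward(int p) {
        if (p == NIL) return NIL;
        if (from[p] == FWD) return from[p + 1]; // already copied: cell 1 holds new address
        int size = from[p];
        System.arraycopy(from, p, to, next, size); // copy the whole record first
        from[p] = FWD;                             // then leave a forwarding address behind
        from[p + 1] = next;
        int addr = next;
        next += size;
        return addr;
    }

    /** Forward the roots, then advance scan through the to-space queue. */
    int[] collect(int[] roots) {
        next = 0;
        int scan = 0;
        for (int i = 0; i < roots.length; i++) roots[i] = forward(roots[i]);
        while (scan < next) {                      // to[scan..next) is the BFS queue
            int size = to[scan];
            for (int f = 1; f < size; f++) to[scan + f] = forward(to[scan + f]);
            scan += size;
        }
        return roots;
    }
}
```

The region to[scan..next) is exactly the breadth-first queue the text describes: records already copied but whose fields still point into from-space.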
Conversely, the record at a + 8 is likely to be unrelated to the one at a. Records that are copied near each other are those whose distance from the roots is equal.

In a computer system with virtual memory, or with a memory cache, good locality of reference is important. After the program fetches address a, then the memory subsystem expects addresses near a to be fetched soon. So it ensures that the entire page or cache line containing a and nearby addresses can be quickly accessed.

Suppose the program is fetching down a chain of n pointers in a linked list. If the records in the list are scattered around memory, each on a page (or cache line) containing completely unrelated data, then we expect n different pages or cache lines to be active.
But if successive records in the chain are at adjacent addresses, then only n/k pages (cache lines) need to be active, where k records fit on each page (cache line).

Depth-first copying gives better locality, since each object a will tend to be adjacent to its first child b, unless b is adjacent to another "parent" a′. Other children of a may not be adjacent to a, but if the subtree b is small, then they should be nearby.

But depth-first copy requires pointer-reversal, which is inconvenient and slow.
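The page-count arithmetic above can be written down as a tiny model (the function names are illustrative, not from any library): scattered records touch one page each, while adjacent records share pages k at a time.

```java
/** Sketch of the locality argument: traversing a chain of n records touches
 *  n pages when records are scattered, but about n/k pages (rounded up)
 *  when k records fit per page and the chain is laid out adjacently. */
class Locality {
    static int pagesScattered(int n) { return n; }
    static int pagesAdjacent(int n, int k) { return (n + k - 1) / k; } // ceiling of n/k
}
```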
A hybrid, partly depth-first and partly breadth-first algorithm can provide acceptable locality. The basic idea is to use breadth-first copying, but whenever an object is copied, see if some child can be copied near it (Algorithm 13.11).

ALGORITHM 13.11: Semi-depth-first forwarding.

    function Forward(p)
        if p points to from-space
            then if p.f1 points to to-space
                then return p.f1
                else Chase(p); return p.f1
            else return p

    function Chase(p)
        repeat
            q ← next
            next ← next + size of record p
            r ← nil
            for each field fi of record p
                q.fi ← p.fi
                if q.fi points to from-space and q.fi.f1 does not point to to-space
                    then r ← q.fi
            p.f1 ← q
            p ← r
        until p = nil

Cost of garbage collection Breadth-first search (or the semi-depth-first variant) takes time proportional to the number of nodes it marks, that is, c3 R for some constant c3 (perhaps equal to 10 instructions). There is no sweep phase, so c3 R is the total cost of collection. The heap is divided into two semi-spaces, so each collection reclaims H/2 − R words that can be allocated before the next collection. The amortized cost of collection is thus

    c3 R / (H/2 − R)

instructions per word allocated. As H grows much larger than R, this cost approaches zero.
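Semi-depth-first forwarding (Algorithm 13.11) can be sketched under the same hypothetical flat-heap encoding as before (cell 0 holds the record size, remaining cells are pointer fields, −1 is nil, a forwarded record gets a sentinel header with its new address in cell 1):

```java
/** Sketch of semi-depth-first forwarding (Algorithm 13.11) over a flat-array
 *  heap. The encoding is an assumption for illustration, not the book's. */
class SemiDepthFirst {
    static final int NIL = -1;
    static final int FWD = Integer.MIN_VALUE; // plays the role of "f1 points to to-space"
    final int[] from;
    final int[] to;
    int next;

    SemiDepthFirst(int[] fromSpace) { from = fromSpace; to = new int[fromSpace.length]; }

    /** Forward(p): if p is unforwarded, chase it; return its to-space address. */
    int forward(int p) {
        if (p == NIL) return NIL;
        if (from[p] != FWD) chase(p);
        return from[p + 1];
    }

    /** Chase(p): copy p, then keep copying one not-yet-forwarded child, so that
     *  parent and child land at adjacent to-space addresses. */
    void chase(int p) {
        while (p != NIL) {
            int size = from[p];
            int q = next;
            next += size;
            int r = NIL;
            to[q] = size;
            for (int f = 1; f < size; f++) {
                int c = from[p + f];
                to[q + f] = c;
                if (c != NIL && from[c] != FWD) r = c; // candidate child to chase next
            }
            from[p] = FWD;     // leave forwarding address behind
            from[p + 1] = q;
            p = r;
        }
    }

    /** Breadth-first scan as in Cheney's algorithm, using the chasing forward. */
    int[] collect(int[] roots) {
        next = 0;
        int scan = 0;
        for (int i = 0; i < roots.length; i++) roots[i] = forward(roots[i]);
        while (scan < next) {
            int size = to[scan];
            for (int f = 1; f < size; f++) to[scan + f] = forward(to[scan + f]);
            scan += size;
        }
        return roots;
    }
}
```

In the test below, B ends up adjacent to its child D in to-space, which plain breadth-first copying would not guarantee.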
That is, there is no inherent lower bound to the cost of garbage collection. In a more realistic setting, where H = 4R, the cost would be about 10 instructions per word allocated. This is rather costly in space and time: It requires four times as much memory as reachable data, and requires 40 instructions of overhead for every 4-word object allocated. To reduce both space and time costs significantly, we use generational collection.

13.4 GENERATIONAL COLLECTION

In many programs, newly created objects are likely to die soon; but an object that is still reachable after many collections will probably survive for many collections more.
Therefore the collector should concentrate its effort on the "young" data, where there is a higher proportion of garbage.

We divide the heap into generations, with the youngest objects in generation G0; every object in generation G1 is older than any object in G0; everything in G2 is older than anything in G1, and so on.

To collect (by mark-and-sweep or by copying) just G0, just start from the roots and do either depth-first marking or breadth-first copying (or semi-depth-first copying). But now the roots are not just program variables: They include any pointer within G1, G2, … that points into G0. If there are too many of these, then processing the roots will take longer than the traversal of reachable objects within G0!

Fortunately, it is rare for an older object to point to a much younger object.
In many common programming styles, when an object a is created its fields are immediately initialized; for example, they might be made to point to b and c. But b and c already exist; they are older than a. So we have a newer object pointing to an older object. The only way that an older object b could point to a newer object a is if some field of b is updated long after b is created; this turns out to be rare.

To avoid searching all of G1, G2, … for roots of G0, we make the compiled program remember where there are pointers from old objects to new ones. There are several ways of remembering:

Figure 13.12: Generational collection.
The bold arrow is one of the rare pointers from an older generation to a newer one.

• Remembered list: The compiler generates code, after each update store of the form b.fi ← a, to put b into a vector of updated objects. Then, at each garbage collection, the collector scans the remembered list looking for old objects b that point into G0.

• Remembered set: Like the remembered list, but uses a bit within object b to record that b is already in the vector. Then the code generated by the compiler can check this bit to avoid duplicate references to b in the vector.

• Card marking: Divide memory into logical "cards" of size 2^k bytes. An object can occupy part of a card or can start in the middle of one card and continue onto the next. Whenever address b is updated, the card containing that address is marked. There is an array of bytes that serve as marks; the byte index can be found by shifting address b right by k bits.

• Page marking: This is like card marking, but if 2^k is the page size, then the computer's virtual memory system can be used instead of extra instructions generated by the compiler. Updating an old generation sets a dirty bit for that page. If the operating system does not make dirty bits available to user programs, then the user program can implement this by write-protecting the page and asking the operating system to refer protection violations to a user-mode fault handler that records the dirtiness and unprotects the page.

When a garbage collection begins, the remembered set tells which objects (or cards, or pages) of the old generation can possibly contain pointers into G0; these are scanned for roots. Algorithm 13.3 or 13.9 can be used to collect G0: "heap" or "from-space" means G0, "to-space" means a new area big enough to hold the reachable objects in G0, and "roots" include program variables and the remembered set. Pointers to older generations are left unchanged: The marking algorithm does not mark old-generation records, and the copying algorithm copies them verbatim without forwarding them.

After several collections of G0, generation G1 may have accumulated a significant amount of garbage that should be collected.
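The remembered-set and card-marking schemes above can be sketched as a write barrier in Java. All names here are illustrative assumptions, not any real runtime's API: the compiler would emit a call like writeBarrier for every update store b.fi ← a.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the remembered-set write barrier: on each update
 *  b.fi <- a, record b (at most once) so the collector can later scan it
 *  for old-to-young pointers. */
class OldObject {
    OldObject[] fields = new OldObject[4];
    boolean remembered;              // the per-object bit of the remembered-set scheme
}

class RememberedSet {
    final List<OldObject> updated = new ArrayList<>();

    /** Compiled code for "b.fields[i] = a" would call this instead. */
    void writeBarrier(OldObject b, int i, OldObject a) {
        b.fields[i] = a;
        if (!b.remembered) {         // the bit avoids duplicate entries in the vector
            b.remembered = true;
            updated.add(b);
        }
    }

    /** Card-marking alternative: the mark-byte index for an updated address
     *  is the address shifted right by k bits (card size 2^k bytes). */
    static int cardIndex(long address, int k) { return (int) (address >>> k); }
}
```

The remembered bit trades one extra test per update store for a duplicate-free vector, which is the distinction the text draws between the remembered list and the remembered set.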
Since G0 may contain many pointers into G1, it is best to collect G0 and G1 together. As before, the remembered set must be scanned for roots contained in G2, G3, …. Even more rarely, G2 will be collected, and so on.

Each older generation should be exponentially bigger than the previous one. If G0 is half a megabyte, then G1 should be two megabytes, G2 should be eight megabytes, and so on. An object should be promoted from Gi to Gi+1 when it survives two or three collections of Gi.

Cost of generational collection Without detailed empirical information about the distribution of object lifetimes, we cannot analyze the behavior of generational collection. In practice, however, it is common for the youngest generation to be less than 10% live data.
With a copying collector, this means that H/R is 10 in this generation, so that the amortized cost per word reclaimed is c3 R / (10R − R), or about 1 instruction. If the amount of reachable data in G0 is about 50 to 100 kilobytes, then the amount of space "wasted" by having H = 10R in the youngest generation is about a megabyte. In a 50-megabyte multigeneration system, this is a small space cost.

Collecting the older generations can be more expensive. To avoid using too much space, a smaller H/R ratio can be used for older generations.
This increases the time cost of an older-generation collection, but these are sufficiently rare that the overall amortized time cost is still good.

Maintaining the remembered set also takes time, approximately 10 instructions per pointer update to enter an object into the remembered set and then process that entry in the remembered set. If the program does many more updates than fresh allocations, then generational collection may be more expensive than nongenerational collection.

13.5 INCREMENTAL COLLECTION

Even if the overall garbage collection time is only a few percent of the computation time, the collector will occasionally interrupt the program for long periods. For interactive or real-time programs this is undesirable. Incremental or concurrent algorithms interleave garbage collection work with program execution to avoid long interruptions.

Terminology The collector tries to collect the garbage; meanwhile, the compiled program keeps changing (mutating) the graph of reachable data, so it is called the mutator.
An incremental algorithm is one in which the collector operates only when the mutator requests it; in a concurrent algorithm the collector can operate between or during any instructions executed by the mutator.

Tricolor marking In a mark-sweep or copying garbage collection, there are three classes of records:

• White objects are not yet visited by the depth-first or breadth-first search.

• Grey objects have been visited (marked or copied), but their children have not yet been examined.