Survey of garbage collection techniques

This page contains a survey of garbage collection techniques. The references for this survey are attached at the end. The first part introduces the background and terminologies in GC and the following sections introduce various techniques.

Garbage Collection Overview

The definition for garbage collection is "the automation reclamation of computer storage": The garbage collector identifies unused data objects and make their space available for reuse (usually on heap). The ideal case is that garbage collector is part of programming language and possibly assisted by compiler/runtime/OS/hardware.

An object is considered Garbage (subject to reclamation) if it is not reachable by the running program via any path of pointer traversals. An Live object is preserved by the collector, ensuring that the program can never traverse a "dangling pointer" into a deallocated object. The liveness criterion used by GC is conservative, defined in terms of a root set and reachability from these roots. The root set consists of the statically allocated global or local variables, local variables in the activation stack, and any registers used by active procedures. Heap objects, which are directly reachable from any of these variables, are preserved. Besides, any objects reachable from a live object is also live. Thus the set of live object is the set of objects on any directed path of pointers from the roots. Any object that is not reachable from the root set is garbage.

The motivation for garbage collector

Ensuring fully modular programming to avoid inter-module dependencies. Otherwise, if objects must be deallocated explicitly, some modules need to track the reference of these object and check no other modules will use them (i.e. determine its global liveness). The nonlocal bookkeeping of the modules reduces extensibility (e.g., updating new functionality), and can actually be orthognal, composable, and reusable.
Avoiding programming mistakes, including memory leak and allocate/free incorrect memory location (i.e. not memory safe).

GC cost issue A well-implemented GC should slow running programss down by ~10% relative to explicit heap deallocation for HPC systems.

A garbage collector is cooperative in the sense that it requires explicit support from other components in the software stack. For example, a compiler-cooperative GC requires object formats must be recognizable by the GC and certain invariant must be preserved by the running code. conservative GC are usable with little or no cooperation from the compiler-not even the types of named variables.

The Two-Phase Abstraction of a GC

Garbage Detection: distinguish live object from garbage
Garbage Reclamation free unused object for reuse

How to identify objects by GC?

Statically-typed: Hidden "header" fields on head objects (i.e. an extra field containing type information) can be used to decode the format of the object itself. These can also be used to find pointers to other objects.
Dynamically-typed: Tagged pointer, a slightly shortened representation of the hardware address is used, with a small type-identifying field in place of the missing address bits. This tagged representation supports polymorphic fields which may contain either one of these "immediate" objects or a pointer to an object on the heap.

Finalization perform "clean-up" actions when object die.

Fragmentation challenge in heap management scheme: The system need to decide to allocate more memory as needed to create small data objects, or to divide up large contiguous hunks of free memory risking permanent fragmentation.

Basic Garbage Collection Algorithms

Garbage Detection in the two-phase abstraction can be done in several ways: reference counting and tracing (including marking and copying). Garbage Reclamation is strongly dependent on the first phase.

Reference Couting

Each object has an associated count of the references (pointers) to it. The reference count could be a subfield in the header file of the object, which is generally not visible at language level.

Each time a reference to the object is created, (e.g., when a pointer is copied from one place to another by an assignment), the count is incremented.
When an existing reference to an object is eliminated, the count is decremented.
The memory occupied by an object may be reclaimed when the object's count equals zero.
When the object is reclaimed, its pointer fields are examined, and any objects it holds pointers to also have their reference counts decremented, since references from a garbage object don't count in determine liveness. Reclaiming one object may lead to transitive decrementing of reference counts and reclaiming many other objects.

Advantages of reference counting

incremental update: GC work (updating ref counts) is interleaved closely with the running program's own execution. This benefits real-time applications.
The immediacy of reclamation is beneficial for overall memory usage and locality of reference.
Distributed garbage collection can benefit from the local nature of reference counting compared to global tracing.

Disadvantages of reference counting

Reference count fields take extra space.
they are difficult to be efficient: Its cost is proportional to the amount of work done by the running program. Short-lived stack variables can incur significant overhead. Some bookkeeping need to be done to make reclaimed objects reusable again.
- Solution: deferred transitive reclamation A list of freed objects whose reference counts are zero can be processed at later time. This avoids adjusting reference counts for most short-lived pointers from the stack.
References from the local variables are not included in bookkeeping reference counts most of the time. Usually, reference accounts are only adjusted to reflect pointers from one heap object to another, which mean it may be inaccurate (pointers from the stack is not considered). The reference counts are uptodate by scanning the stack for pointers to heap objects. Then objects with reference counts zero can be safely reclaimed.
they are not always effective: Reference counting fails to reclaim circular structure. If the pointers in a group of objects create a (directed) cycle, the objects' reference counts are never reduced to zero, even if there is no path to the objects from the root set.
- Solution: combine with other GCs to collect cyclic object structures (fall-back reclamation). Purely functional language do not allow cyclic data structures.

Variations on Reference Counting

On optimization uses a small count field for most objects in reference counting, and other objects are reclaimed by fall-back tracing collector.

Tracing

Tracing collectors determine liveness by traversing the pointers that the program could traverse to find all the objects that the program could reach. There are several varieties of tracing collection: mark-sweep, mark-compact, copying, and non-copying implicit reclamation.

Mark-sweep Collection

Garbage Detection: Tracing - starting at root set, traversing the graph of pointer relationships (e.g., BFS or DFS), and marking reachable objects (e.g., using bitmap).
Garbage Reclamation Memory is swept to find all unmarked (garbage) objects and reclaim their space. These objects are linked into free lists to be accessible for allocation routines.

Three major problems of traditional mark-sweep GC:

It is difficult to handle objects of varying sizes without fragmentation of the available memory. The garbage object space is interspersed with live objects.
- Solution: Keeping separate free lists for objects of varying sizes and merging adjacent free space together together.
The cost of a collection is proportional to the size of the heap, including both live and garbage objects, since all live objects must be marked and all garbage objects must be collected.
Locality of reference is harmed due to unmoving nature of GC The live object remain in place after collection, interspersed with free space where new objects are allocated. Thus, non-generational mark-sweep GC is unsuitable for VM applications.

Mark-compact Collection

This technique remedies the fragmentation and allocation problems of mark-sweep collectors.

Garbage Detection: First reachable objects are traversed and marked. Then objects are compacted, moving most of the live objects until all of the live objects are contiguous. The compact process is often done by a linear scan through memory, finding live objects and "sliding" them down to be adjacent to the previous object, so that in the end all live objects are adjacent to a live neighbor.

Advantages:

For fragmentation problem: The contiguous free area allows allocating objects of various sizes.
For locality problem: The garbage spaces are simply "squeezed out", without disturbing the original ordering of objects in memory. (The allocation ordering is similar to subsequent access ordering.)

Disadvantages:

Several passes over the data are required, including extra passes in compact process.
- Solution: two-pointer algorithm, where one pointer scans down from the top of the heap, looking for live objects, and the other scans upward from the bottom, looking for holes to put them in.

Copying GC

Similar to mark-compact, it moves all live objects into one area, and the rest of the heap is known to be available since it only contains garbage. "Garbage collection" process is implicit. Different from mark-compact where a separate marking phase is used to traverse the live data, copying GC integrates the traversal of the data and the copying process (together termed "scavenging") so that most objects are traversed once.

Disadvantages:

The work needed is proportional to amount of live data (all of which must be copied).

"Stop-and-Copy" GC using semispaces

The heap space is subdivided into two contiguous semispaces, only one of which is used in normal program execution.

Memory is allocated linearly upward through this "current" semispace as demanded by the execution program.
"stop-and-copy" When allocation will not fit in the unused area of the current semispace, the program is stopped and the copying GC is called to reclaim space. All of the live data is copied from the current semispace to the other semispace, and then the other semispace is made current. The roles of the two space is reversed when GC is called.
The simplest form of scavenging is Cheney algorithm.

Efficiency of Copying GC

It can be made arbitrarily efficient if sufficient memory is available. Decreasing the frequency of garbage collection will decrease the total amount of garbage collection efforts (proportional to the amount of live data at garbage collection time), since some objects will become garbage before garbage collection and less objects need to be copied. However, this require the amount of memory in the heap to be increased.

Non-Copying Implicit Collection (Treadmill Collection)

In copying collector, the "spaces" of the collector are a particular implementation of sets. In a copying collector, the set is an area of memory, but in a non-copying collector it can be any kind of set of chunks of memory that formerly held live objects.

The non-copying system adds two pointer fields and a "color" field to each object.

These fields are invisible to programmers and serve to link each hunk of storage into a doubly-linked list that serves as a set.
The color field indicates which set an object belongs to.

The working mechanism:

Chunks of free space are initially linked to form a doubly-linked list, while chunks holding allocated objects are linked together into another list.
When the free list is exhausted, the collector traverses the live objects and move them from "allocated" set (unlinking the object from the doubly linked list, toggling its color field and link to another set) to another set.

Disadvantages compared to copying GC:

The per-object constants are higher, and fragmentation is possible

Advantages compared to copying GC:

The tracing cost for large objects is not as high. Like mark-sweep GC, the whole object needn't be copied.
This doesn't require the actual language level pointers between objects to changed, so it imposes few constraints on compilers.

Comparison among Basic Tracing Techniques

A common criterion: the cost of GC is comparable to the cost of allocating objects. The basic three components of tracing algorithm

The initial working required at each collection, such as root set scanning. It is usually fixed.
The work done at allocation (proportional to the amount of allocation) plus an initialization cost.
The work done during garbage detection (e.g., tracing, proportional to the amount of live data).

Some truths:

Mark-sweep: cost is proportional to the amount of allocation
Copying GC: cost is proportional to the amount of live data traced
When memory is not much larger than the expected amount of live data, non-moving collectors have an advantage over copying collectors
When space is tight, reference counting GC is good
HPC usually deploys hybird GCs.
The comparison between in-place (non-moving) GCs and copying GCs:
- In-place GCs are conservative with respect to data values that may or may not be pointers, which is good for languages that can not unambiguously identify all pointers at runtime. A non-moving collectors can be conservative since anything that looks like a pointer object can be left where it is.
- A copying GC must know whether a value is a pointer, and whether to move the object and update the pointer.

Problems with Simple Tracing Collectors

If virtual memory is used, the poor locality of the allocation and and reclamation cycle will generally cause excessive paging. (Every location in the heap is used before any location's space is reclaimed and reused). Thus, it is not wise to make the heap area larger than the available main memory.
The major locality problem is an indirect result of the use of GC: by the time the space is reclaimed and reused, it is likely to have been paged out, simply because too many other pages have been allocated in between.
- The only way to have good locality is to ensure that memory is large enough to hold the regularly-used area.
- Another way is to rely on optimization such as prefetching.
- Generational GC addresses this problem by reusing a smaller amount of memory more often.

Conservatism in GC

An ideal garbage collector would be able to reclaim every object's space after the last use, which is impossible in real case, since the last use can not be determined. The art of efficient GC design is largely one of introducing small degrees of conservatism which significantly reduce the work done in detecting garbage.

First conservative assumption most GCs make: any variable in the stack, global, or registers is live, even though the variable may actually never be referenced again.
Tracing collectors introduce a major temporal form of conservatism: allow garbage to go uncollected between collection cycles.
Reference counting GCs are conservative topologically: fail to distinguish path that shares an edge in the graph of pointer relationship.

Garbage Collection Variations

Incremental Techniques

The main idea is to reduce program halt time due to GC by interleaving small units of GC with small units of program execution. It is desirable to make tracing GC (marking or copying) incremental. The challenges with incremental tracing is that while the collector is tracing out the graph of reachable data structures, the graph may change due to program execution, so running program is referred to as mutator. Therefore, incremental tracing GC must be able to track the changes.

Coherence and Conservatism

The problem formulation: both the collector and the mutation try to modify the shared reachability graph while making some kind of consistent view, which can be formulated as a coherence problem. In general, less coherency means more conservatism.

Incremental mark-sweep traversal (multiple readers, single writer coherence problem): only the mutator writes to pointer fields, and only the collector writes to mark fields.
Incremental copying GC (multiple readers, multiple writers coherence problem): both the mutator and the collector may modify pointer fields, and each must be protected from inconsistencies introduced by the other.

In particular, GC's view of the reachability graph is a safe, conservative approximation of the true reachability graph:

unreachable objects may seem reachable (called floating garbage)
reachable objects always exhibit reachability

Tricolor Marking

The idea is to traverse the graph of reachable objects and color them, with garbage as white and live objects as black.

For mark-sweep: "mark bit = 1" --> mark black
For copy collector: moving objects from fromspace (white) to tospace (black).

To ensure safe interaction of mutator and collector, the intermediate state of the coloring traversal is gray. Gray signifies that an object has been reached by the traversal, but that its descendants may not have been. The traversal proceeds in a wavefront of gray objects, which separates the white (unreached) objects from the black objects that have been passed by the wave - there are no pointers directly from black to white (the mutator preserve this well defined gray fringe).

As the traversal proceeds outward from the roots, objects are initially colored gray.
When they are scanned and pointers to their offspring are traversed, they are blackened and the offspring are colored gray.
WARNING If the mutator creates a pointer from a black object to a white one, it must somehow notify the collector that its assumption has been violated.

In both following cases, objects that have not been reached yet are white.

In copying GC: the gray objects are the objects in the unscanned area of tospace - if a Chaney BFS is used, these are the objects between the scan and free pointers.
In mark-sweep GC: the gray objects correspond to the stack or queue of objects used to control the marking traversal, and the black objects are the ones that have been removed from the queue.

Incremental Approaches: Two basic approaches to coordinate (synchronize) the collector with the mutator - before the mutator can perform certain operations, it must activate the GC to perform some action.

Read Barrier: It detects when the mutator attempts to access a pointer to a white object, and immediately color the object gray. Since the mutator can't read pointers to white objects, it can't install them in black object.
Write Barrier: When the program attempts to write a pointer into an object, the write is trapped or recorded. To foil the GC's marking traversal, it is necessary for the mutator to (1) write a pointer to a white object into a black object, and (2) destroy the original pointer before the collector sees it.
- If (1) doesn't hold, no special action is needed - If there are other pointers to the white object from gray objects it will be retained, and if not it is garbage.
- If (2) doesn't holf, the object will be reached via the original pointer and retained.

Write Barrier Algorithm

For non-copying GC, write barrier techniques are cheaper since heap writes are less common than heap read.

Snapshot-at-beginning collectors ensure (2) cannot happen: rather than allowing pointers to be simply overwritten, they are first saved in a data structure "off to the side" so that the collector can find them. Thus no path to a white object can be broken without providing another path to the object for the GC.

Incremental update collectors rather than saving copies of all pointers that are overwritten, they actually record pointers stored into black objects, and catch the copies where they are stored. That is, if a pointer to a white object is copied into a black object, that new copy of the pointer will be found.

Snapshot-at-beginning collectors

This uses a write barrier to ensure that no objects ever become inaccessible to the GC while collection is in progress. At the beginning of GC, a copy-on-write virtual copy of the graph of reachable data structures is made, to ensure the graph of reachable objects is fixed at the moment GC starts.

Rather than preventing the creation of pointers from black objects to white ones, the original path to the object can't be lost, because all overwritten pointer values are saved and traversed. Newly-allocated objects are considered to be black as if they have already been traversed. The collector's view of the reachability graph is thus the set union of the graph at the beginning of the GC, puls all of those that are allocated during tracing.

This allows the tricolor "invariant" to be broken temporarily. Rather than guaranteeing that each path from a black object to a white object must go through a gray object, it is only guaranteed that for each reachable white object there will be at least one path to the object from a gray object.

Incremental update write barrier collectors

It heuristically attempts to retain the objects that are live at the end of GC. An object will not be reached by the GC if all paths to it are broken at a point that the GC has not yet reached. To avoid the problem of pointers being hidden in reachable objects that have already been scanned, such copied pointers are caught when they are stored into the scanned objects. (It notices when the pointer hides in an object that has already been traversed).

Objects are assumed to be unreachable when they are allocated (in terms of tricolor marking, objects are allocated white), on the assumption that new objects are likely to be short-lived and quickly reclaimed. At some point, the stack must be traversed and the objects that are reachable are marked and preserved (this is the extra computation cost introduced).

It preserves the tricolor "invariant" by blackening the pointed-to white object, rather than reverting the stored-into black object to gray (This pushes the gray wavefront outward to preserve the tricolor "invariant").

Baker's Read Barrier Algorithm

Baker's incremental copying scheme is an adaptation of the simple copying GC (Cheney's algorithm), and uses a read barrier for coordination with the mutator.

Incremental Copying

The copying of data proceeds in the Cheney (BFS) fashion, by advancing the scan pointer through the unscanned area of tospace and moving any referred-to objects from fromspace. This background scavenging is interleaved with mutator operation, and mutator activity can also trigger copying, as needed, to ensure that the mutator's view of data structures is always consistent.

A GC cycle begins with an atomic flip, which conceptually invalidates all objects in fromspace, and copies to tospace all objects directly reachable from the root set.
Then the mutator is allowed to resume.
Any fromspace object that is accessed by the mutator must first be copied to tospace, and this copying-on-demand is enforced by the read barrier. (The mutator is never allowed to see pointers into fromspace, i.e., pointers to white objects. Whenever the mutator reads a potential pointer from the heap, it immediately checks to see if it is a pointer into fromspace; if so, the referent is copied to tospace, i.e., its color is changed from white to gray.)
The background scavenging process is also interleaved with normal program execution, to ensure that all reachable data are copied to tospace and the collection cycle completes before memory is exhausted.

The objects allocated by the mutator during incremental collection are in tospace, and are assumed to be black in terms of tricolor marking. They are never reclaimed until the next GC cycle. The scanned area of tospace contains black objects, and the copied but unscanned objects (between the scan and free pointer) are gray. As-yet unreached objects in fromspace are white. The scanning of objects (and copying of their offspring) moves the wavefront forward. To ensure the invariant that pointers into fromspace must not be stored into objects that have already been scanned, undoing the collector's work, other objects may be copied to tospace, in addition to the background tracing.

The read barrier:

may be implemented in S/W, by preceding each read (of a potential pointer from the heap) with a check and a conditional call to the copying-and-updating routine.
may be implemented in specialized H/W checks / microcoded routines.

Baker's Incremental Non-copying algorithm - The Treadmill

The sets of objects of each color are implemented using doubly-linked lists (and per-object color fields). By avoiding the actual moving of objects and updating of pointers, the scheme puts fewer restrictions on other aspects of language implementations.

The real-time version of this scheme links the various lists into a cyclic structure, divided into 4 sections:

New: It is contiguous with the free list. Allocation occurs by advancing the pointer that separates them. At beginning new list is empty. The new objects are allocated black.
Free:
From: It holds objects that were allocated before GC began, and which are subject to GC. As the collector and mutator traverse data structures, objects are moved from the from-list to the to-list. Eventually, all of the reachable objects in the from-list have been moved to the to-list, and scanned for offspring. When no more offspring are reachable, all of the objects in the from-list are known to be garbage, and all of the objects in the to-list are black. After GC is complete, the from-list is merged with free-list.
To: It is initially empty, but grows as objects are unlinked from the from-list during collection. It contains both black objects (which have been completely scanned) and gray ones (which have been reached but not scanned). It is only necessary to have a scan pointer into the to-list and advance it though gray objects.

Conservatism of Baker's Read Barrier

Objects allocated during collection are assumed to be live, even if they air before the collection is finished.
Pre-existing objects may become garbage after having been reached by the collector's traversal, and they will not be reclaimed - once an object has been grayed, it will be considered live until the next GC cycle. On the other hand, if objects are destroyed before being traversed, then they will be reclaimed.

Variations on the Read Barrier

Replication Copying Collection

While copying is going on, the mutator continues to see the fromspace versions of objects, rather than the "replicas" in tospace. When the copying process is complete, a flip is performed, and the mutator then sees the replicas.

Consistency issue: The mutator continues to access the same versions of objects during the copying traversal, so it needn't check for forwarding pointers. This eliminates the need for a read barrier - conceptually all objects are "forwarded" to their new versions at once, when the flip occurs. But, it requires a write barrier. The mutator sees the old version in fromspace; if an object has already been copied to tospace, and the fromspace version is then modified by the mutator, the new replica can get the wrong value. To avoid this, the write barrier must catch all updates, and the collector must ensure that all updates have been propagated when the flip occurs (all of the modifications to old versions of objects must be made to the corresponding new versions).

Coherence and Conservatism Revisited

The degree of coordination between mutator and collector's tracing traversal results in trade-off between performance (low coordinating cost) and conservatism (use outdated info, retain garbage during collection).

Coherence and Conservatism in Non-copying Collection (Write-Barrier)

Degree of conservatism: snapshot-at-beginning > Djikstra's incremental update (black points to white, then gray the white) > Steele's algorithm (black points to white, then gray the black, undoing the blackening done by the collector)

Coherence and Conservatism in Copying Collection (Read-Barrier)

Black-only read barrier > Baker's read barrier > Replication copying (Weaker notion of consistency: because the mutator doesn't operate in tospace until after the copying phase is complete, the copies of data in tospace needn't be entirely consistent during incremental copying. The changes made to fromspace by the mutator must be propagated to tospace eventually, but the entire state only needs to be consistent at the end of collection, when the atomic "flip" is performed).

"Radical" Collection and Opportunistic Tracing

Radical GC: (1) redo some of the traversal that is already done; (2) un-mark objects that is already reached to avoid conservatism. A fully radical GC will respond to any change to in reachability graph. Two way to implement:

A full trace of the reachability graph at every pointer of write
Record every dependencies within the reachability graph, and update the dependency graph at every pointer update. Whenever all paths keeping an object alive is broken, the object is known to be garbage. (Refcount is an approximate of this strategy.)

Comparing Incremental Techniques

The below dimensions are orthognal:

the abstraction of tricolor marking
concrete tracing mechanism: mark-sweep, or copy collection
coordination: read / write barrier

Real-time Tracing Collection

The real time criterion for GC: imposing only small and bounded delays on any particular program operation.

A copying algorithm generally frees a large, contiguous area of memory, and requests for objects of any size can be satisfied by a constant-time stack-like allocation operation.
Non-copying algorithms are subject to fragmentation - memory that is freed may not be contiguous, so it may not be possible to allocate on object of a given size even if there is that much memory free.
- Their time overheads are more predictable.

Generational Schemes

These improve efficiency and locality by garbage collecting a smaller area more often, while exploiting typical lifetime characteristics to avoid undue overhead from long-lived objects. Typical pause times are short since most collections are in small area.

Problem description: some objects survive through many garbage collections and the GC spend a lot of time copying these objects at every iteration.

Solution: Generational collection avoids this repeated copying by segregating objects into multiple areas by age, and collecting areas containing older objects less often than the younger ones. Once objects have survived a small number of collections, they are moved to a less frequently collected area.

For stop-and-collect (non-incremental) GC, generational collection is beneficial since most collections which reclaim the youngest generation are taking only a short time.
The longer pauses for multi-generation collections can be postponed until the system is not in use or hidden within non-interactive compute-bound phase.

Multiple Subheaps with Varying Collection Frequencies (for Copying GC)

Memory will be divided into areas that will hold objects of different approximate ages or generations; each generation's memory is further divide into semispaces. Garbage collection only happen in the new generation and copy between semispaces of the same generation. Long living objects are copied to older generations, and these older generations are collected less frequently.

Requirements for this to work:

GC can find pointers into younger generations' objects. All references from old to new objects must be located at collection time, and used as roots for the copying traversal.
- A write-barrier is needed for each potential pointer store, in case an intergenerational pointer is being created.
- This can be implemented as checking every store, or maintaining dirty bits and scanning dirty areas at collection time.

**Conservatism: ** Old objects may be floating / undetected garbage, so its pointers to new objects do not represent liveness.

Challenges:

Advancement Policy: How long must an object survive in one generation before it is advanced to the next generation?
Heap Organization: How should the storage space divided between generations and within a generation?
Collection Scheduling: For non-incremental GC, how to mitigate the effect of disruptive pauses by "opportunistic scheduling"?
Intergenerational References: How to find old to young generation pointers?

Advancement Policy

**Simpler Version: **advance all live data into the next generation whenever they are traversed. This requires no distinguishing different age objects in the same generation.

Heap Organization

In copying collectors:

keeping objects of different ages in different areas of memory
- contiguous area: generations can be determined by address comparison
- non-contiguous sets of pages: using page number part of its address to index into a table with generation information In non-copying collectors:
each object has a header field indicating which generation it belongs to

Subareas in copying schemes

Generations in non-copying schemes

Tracking Intergenerational References

The Generational Principle Revisited

Real-time Generational Colletion

Locality Considerations

Low-level Implementation Issues

Garbage Collection for C/C++

Some misunderstanding about GC for C/C++

Although memory can be manually managed by malloc & free in C (or new & delete in C++), it is possible to link in a garbage collector to reclaim the unused memory automatically. The technology of garbage collector will free the programmer from memory management and give them more time to focus on other stuffs.

One choice is that: malloc (or new) is replaced with the garbage collector's allocator and free (or delete) is replaced with a do-nothing subroutine. The second choice is that: the garbage collector can be used together with C/C++ self-built malloc & free/ new & delete. In this way, the garbage collector acts as a backstop, preventing the leaks that might accidentally occur.

But we should know that: Using garbage collector doesn't necessarily run any faster than free-does-nothing, but it may help keep the heap smaller.

Open sourced GC for C/C++

Boehm's Collector (http://www.hboehm.info/gc/)

It is a conservative garbage collector, which implements a modified mark-sweep algorithm. Conceptually, it is operated in four steps:

Preparation: Each object has an associated mark bit. Clear all mark bits, indicating that all objects are potentially unreachable.
Mark: Marks all objects that can be reachable via chains of pointers from variables. Often the collector has no real information about the location of pointer variables in the heap, so it views all static data areas, stacks and registers as potentially containing pointers. Any bit patterns that represent addresses inside heap objects managed by the collector are viewed as pointers. Unless the client program has made heap object layout information available to the collector, any heap objects found to be reachable from variables are again scanned similarly.
Sweep: Scans the heap for inaccessible, and hence unmarked, objects, and returns them to an appropriate free list for reuse. This is not really a separate phase; even in non incremental mode this is operation is usually performed on demand during an allocation that discovers an empty free list. Thus the sweep phase is very unlikely to touch a page that would not have been touched shortly thereafter anyway.
Finalization: Unreachable objects which had been registered for finalization are enqueued for finalization outside the collector.

It also has a scalable multiprocessor version. (More detail will be added after going through the source code.)

Evaluation

Benchmark

This is a benchmark that is compatible with Boehm's GC. http://www.hboehm.info/gc/gc_bench.html

(It seems that most benchmarks targets on the GC for JVM, but our project targets on GC for C/C++. More detail will be added whether these benchmarks are naturally compatible to both languages, or some conversion is required, or it is nonsense to evaluate a GC for C/C++ with a GC benchmark for Java.)

References

Wilson, Paul R. "Uniprocessor garbage collection techniques." Memory Management. Springer, Berlin, Heidelberg, 1992. 1-42.

GC FAQ http://www.iecc.com/gclist/GC-faq.html