The GC: Reference Counter

The simplest form of Garbage Collector is the Reference Counter Garbage Collector.

Each time an object is referenced (a pointer to it is made) the reference count on the object is incremented. If the pointer isn’t used anymore the reference count is decreased. If the reference count hits 0 the object will be freed.

Python uses a Reference Counter.

Advantages

  • A reference counter does its tracking at runtime. There is no big garbage collection, and the execution of the program won’t be interupted for a while when the garbage is collected.
  • It’s very easy to implement.

Disadvantages

  • Instead of the required malloc free pair required for each allocation without Gc, it now requires add_ref free_ref for each pointer/reference to an allocation. When someone forgets to release its reference the object persists.
  • Circular references will never be collected. An example of a circular reference is a list that contains as an item a reference to itself. When the list is released its count will remain 1 because inside the list it refers to itself and will therefore never be collected.
  • A RefCounter GC doesn’t keep track of pointers into the managed heap, and therefore can’t move any object because it`ll break pointers. This makes the fragmentation of the RefCounter GC as bad as malloc‘s fragmentation.

Circular references

Circular references can be fixed by adding a trace algorithm. This algorithm will trace through all objects from the root objects (pointers on the stack, etc.) and mark them. The objects that aren’t marked aren’t accessible and therefore are subject to Garbage Collection.

The problem is that a RefCounter Gc usually doesn’t keep track of what pointers are in objects, and that there could be pointers outside the managed heap which only use the add_ref and free_ref. Python struggled with this problem and it implemented the a trace algorithm that depended on the programmer’s of not-python-modules (of which pointers aren’t known) to implement the list of references to objects that also contain objects. By traversing links that way circular dependencies could be found without hurting unmanaged pointers, but this has a high performance hit on Python, and it is still adviced to avoid circular references.

Conclusion

RefCounter Gc’s are easy to implement, and a good idea for scripting languages which know all references so that circular references can be dealt with accoringly. But they still have a lot of disadvantages which other kinds of Gc’s solve a lot better, and more elegantly.