Pointers and reference counts

The three Lava pointer types, storage management based on reference counts

Why reference counts rather than garbage collection?

We felt that simple "mark and sweep" garbage collection is a rather inelegant sledgehammer method. It causes noticeable bursts of garbage collection (GC) activity that interrupt normal computation from time to time, possibly even in the midst of time-critical transactions. More sophisticated GC procedures, however, require much more implementation effort. Reference counts are easier to implement and promise smoother storage management by releasing the storage occupied by objects at the earliest possible moment.

A second, less emotional argument in favor of reference counts: if you pass an object by reference across a component border, you would like to know anyway when the receiving component no longer needs or references the object. We expect that it is much easier to manage reference counts across component borders, and to standardize an appropriate interface to this end, than to standardize garbage collection procedures across the borders of components written in different languages and following different programming paradigms.

A third argument in favor of reference counts is that Lava already distinguishes two different kinds of pointers/links between objects, constituents and acquaintances (see below), so it is natural to introduce a third kind: reverse links. Constituent and acquaintance links are considered to point in the forward/downward direction in Lava, while a reverse link from object B to object A should normally imply that B can be reached from A by a forward/downward path (i.e., via a chain of constituent/acquaintance links). Closed cycles of objects, which may otherwise restrict the successful use of reference counts for storage management, are avoided under these conditions.

The three application-level pointer types in Lava

We have already stated in an earlier section that we need a proper application-level distinction between constituents and acquaintances of complex objects. Storage management by reference counts makes it desirable to introduce a third type of pointer/reference/link between objects:

If our object system (which is linked together by constituent and acquaintance pointers) contains closed cycles rather than being an acyclic graph structure, then reference counts without any further precaution could cause isolated cycles of objects to arise that are no longer referenced from anywhere else. The objects belonging to such a cycle would prevent each other's reference counts from reaching 0, since each of them would still be referenced by an adjacent object in the cycle, and therefore none of them would ever be destroyed.
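The effect can be illustrated with a minimal Python sketch. It is a toy model of plain reference counting, not Lava's run time; the names `Node`, `link_to` and `release` are assumptions made up for this illustration.

```python
class Node:
    """Toy object with a manually maintained reference count (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0       # number of incoming references
        self.links = []         # outgoing references to other nodes
        self.destroyed = False

    def link_to(self, other):
        self.links.append(other)
        other.refcount += 1


def release(node):
    """Drop one reference to node; destroy it once the count reaches 0."""
    node.refcount -= 1
    if node.refcount == 0:
        node.destroyed = True
        for target in node.links:
            release(target)     # a destroyed object releases what it referenced
        node.links.clear()


def demo_chain():
    a, b = Node("a"), Node("b")
    a.refcount = 1              # one external reference, e.g. a local variable
    a.link_to(b)
    release(a)                  # the external reference goes away
    return a.destroyed, b.destroyed


def demo_cycle():
    a, b = Node("a"), Node("b")
    a.refcount = 1              # one external reference
    a.link_to(b)
    b.link_to(a)                # closes the cycle: a.refcount is now 2
    release(a)                  # a.refcount drops to 1 and stays there
    return a.destroyed, b.destroyed


print(demo_chain())  # (True, True)
print(demo_cycle())  # (False, False)
```

In the acyclic case both objects are destroyed as soon as the external reference is dropped. In the cyclic case the two objects keep each other's counts at 1, although nothing else refers to them.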

In order to avoid this, Lava provides a third kind of reference between objects: reverse links. They should be used particularly for backward references to "parent" or "ancestor" objects, or more generally in a way that removes the need to establish closed cycles of constituent/acquaintance references.

Moreover, proper functioning of storage management, as outlined below, requires that reverse links originate from objects that can be reached by a forward (i.e., constituent/acquaintance) link from the target object of the reverse link.

Lava manages a separate reference count for reverse links. If the normal "forward" reference count of an object becomes 0 so that the object can at most, if at all, be reached via reverse links from now on, then it is automatically and irreversibly transformed into a "zombie" object. Any attempt to access this object will result in a specific exception.

If the "zombified" object has forward (constituent or acquaintance) links to other objects then the corresponding forward/reverse counts of these objects are decremented by 1.
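The bookkeeping described above might be modelled roughly as follows. This is a Python sketch under stated assumptions, not Lava's implementation: the names `Obj`, `ZombieAccess`, `drop_forward` and `drop_reverse` are hypothetical, and the sketch assumes that a zombie releases all of its outgoing links (decrementing forward counts for forward links and reverse counts for reverse links) and that an object is destroyed once both of its counts are 0.

```python
class ZombieAccess(Exception):
    """Hypothetical name for the exception raised on access to a zombie."""


class Obj:
    def __init__(self, name, value=None):
        self.name = name
        self._value = value
        self.forward_count = 0    # incoming constituent/acquaintance links
        self.reverse_count = 0    # incoming reverse links
        self.forward_links = []   # outgoing forward links
        self.reverse_links = []   # outgoing reverse links
        self.zombie = False
        self.destroyed = False

    @property
    def value(self):
        if self.zombie:
            raise ZombieAccess(self.name)
        return self._value


def add_forward(src, dst):
    src.forward_links.append(dst)
    dst.forward_count += 1


def add_reverse(src, dst):
    src.reverse_links.append(dst)
    dst.reverse_count += 1


def _maybe_destroy(obj):
    if obj.zombie and obj.forward_count == 0 and obj.reverse_count == 0:
        obj.destroyed = True


def zombify(obj):
    if obj.zombie:
        return
    obj.zombie = True             # irreversible in this model
    forward, reverse = obj.forward_links, obj.reverse_links
    obj.forward_links, obj.reverse_links = [], []
    for target in forward:
        drop_forward(target)
    for target in reverse:
        drop_reverse(target)
    _maybe_destroy(obj)


def drop_forward(obj):
    obj.forward_count -= 1
    if obj.forward_count == 0:
        zombify(obj)


def drop_reverse(obj):
    obj.reverse_count -= 1
    _maybe_destroy(obj)


def demo_parent_child():
    parent, child = Obj("parent", 1), Obj("child", 2)
    parent.forward_count = 1      # external reference, e.g. a variable
    add_forward(parent, child)
    add_reverse(child, parent)    # back reference to the parent
    drop_forward(parent)          # the external reference goes away
    return parent.destroyed, child.destroyed


def demo_zombie():
    target, holder = Obj("target", 42), Obj("holder")
    target.forward_count = 1      # external reference to target
    holder.forward_count = 1      # holder stays alive
    add_reverse(holder, target)
    drop_forward(target)          # forward count 0, reverse count still 1
    try:
        target.value
        raised = False
    except ZombieAccess:
        raised = True
    return target.zombie, target.destroyed, raised


print(demo_parent_child())  # (True, True)
print(demo_zombie())        # (True, False, True)
```

In the first demo the back reference from child to parent is a reverse link, so it does not keep the parent alive and both objects are released. In the second demo the target becomes a zombie but is not destroyed while a reverse link still points at it.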

In addition, Lava provides a basic method finalize of class Object, which immediately marks the object as a zombie and likewise releases all directly linked objects.

Object::finalize can be used:

for (secure) manual storage management in exceptional cases of reference structures containing closed cycles of forward links ("dangling pointers" cannot occur even in this case!),
to forcibly and immediately terminate the validity/usability of an object, for instance if a bank account is closed. This enables a safe and clean semantics of "closing" an object.

As a further, still stronger aid for releasing whole collections of objects that are linked together by potentially circular forward (= constituent or acquaintance) links, class Object provides a method finalizeRec with a boolean parameter aquaintancesToo, which recursively finalizes/zombifies an object and all its constituents (and acquaintances, if aquaintancesToo is true), irrespective of their reference counts.
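A recursive finalization of this kind could look roughly like the following Python sketch. It is an illustration only: the class `Obj` and the function `finalize_rec` are hypothetical stand-ins, reference-count bookkeeping is omitted, and links that are not followed are simply dropped.

```python
class Obj:
    """Toy object with separate constituent and acquaintance links (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.constituents = []
        self.acquaintances = []
        self.zombie = False


def finalize_rec(obj, acquaintances_too):
    """Zombify obj and everything reachable via the chosen forward links."""
    if obj.zombie:
        return                    # already finalized; also ends recursion on cycles
    obj.zombie = True
    targets = list(obj.constituents)
    if acquaintances_too:
        targets += obj.acquaintances
    obj.constituents, obj.acquaintances = [], []   # release all outgoing links
    for target in targets:
        finalize_rec(target, acquaintances_too)


def build():
    a, b, c = Obj("a"), Obj("b"), Obj("c")
    a.constituents.append(b)
    b.constituents.append(a)      # closed cycle of forward links
    a.acquaintances.append(c)
    return a, b, c


def demo(acquaintances_too):
    a, b, c = build()
    finalize_rec(a, acquaintances_too)
    return a.zombie, b.zombie, c.zombie


print(demo(False))  # (True, True, False)
print(demo(True))   # (True, True, True)
```

Marking the object as a zombie before following its links is what lets the recursion terminate on circular structures in this sketch.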

Note, however, that Lava objects are not destroyed before both the normal and the reverse reference count are 0. This can be achieved only by proper usage of the three reference types and, in exceptional cases, the Object::finalize and Object::finalizeRec methods.

So Lava guarantees the absence of "dangling pointers" under all circumstances, and under normal circumstances (without circular or simply unreasonable linkage structures) objects that are not needed any longer will be destroyed automatically at the earliest possible time.

Finalizers of classes other than Object

C++ provides the notion of a destructor, Java the (overridable) finalize method of class Object. These are invoked implicitly/automatically when objects are about to be destroyed (by the Java garbage collector, by the C++ delete operation, or when local variables disappear from the run time stack). They give the object an opportunity to perform special finalization operations before it is destroyed and its storage is freed. In C++, destructors are required particularly because members that are attached through pointers to other objects are not released automatically.

This doesn't apply to Java and Lava. In these languages destructors/finalizers are only needed if an object should perform certain clean-up operations before it is released and destroyed. E.g., a file descriptor should close the respective file if the user of the file descriptor has forgotten to do this (or has been prevented from doing so by an exception that occurred before the close).

Like Java, Lava provides a finalize method of class Object to this end (see above), which may be overridden in derived classes. It is invoked automatically by the Lava run time system when an object is about to be destroyed since its reference counts are 0.

Note, however, that finalize may attach the respective object as a member to some other object and thus cause its reference count(s) to become non-zero again. In this case the object isn't destroyed, of course, after finalize returns.
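The order of events can be sketched in Python as follows. This is a toy model, not the Lava run time: the classes `Counted`, `FileDescriptor`, `Registry` and `Survivor` are hypothetical, and a single count stands in for Lava's forward and reverse counts.

```python
class Counted:
    """Toy reference-counted object (stand-in for the run time's bookkeeping)."""

    def __init__(self):
        self.refcount = 0
        self.destroyed = False

    def acquire(self):
        self.refcount += 1

    def release(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.finalize()           # invoked as soon as the count reaches 0
            if self.refcount == 0:    # finalize may have re-attached the object
                self.destroyed = True

    def finalize(self):
        """Default finalizer: nothing to clean up."""


class FileDescriptor(Counted):
    def __init__(self):
        super().__init__()
        self.is_open = True

    def finalize(self):
        self.is_open = False          # close the file if the user forgot to


class Registry:
    def __init__(self):
        self.members = []

    def attach(self, obj):
        obj.acquire()
        self.members.append(obj)


class Survivor(Counted):
    def __init__(self, registry):
        super().__init__()
        self.registry = registry

    def finalize(self):
        self.registry.attach(self)    # the count becomes non-zero again


def demo():
    fd = FileDescriptor()
    fd.acquire()
    fd.release()                      # last reference dropped without a close

    survivor = Survivor(Registry())
    survivor.acquire()
    survivor.release()
    return fd.is_open, fd.destroyed, survivor.refcount, survivor.destroyed


print(demo())  # (False, True, 1, False)
```

The file descriptor is closed and destroyed at the moment its last reference disappears, while the object that re-attaches itself in finalize survives because its count is checked again after finalize returns.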

The finalize method of a derived class is a quite normal method and isn't restricted in any way. It may in turn call Object::finalize (by a "static call") if the respective object is to be zombified for good even if its reference counts are not yet 0.

If finalize is invoked by a tracing garbage collector, as in Java, then you can get into trouble since the time of invocation is more or less unpredictable in this case. Cf., e.g., the discussion of finalization in Eclipse/SWT and elsewhere. In Lava it is perfectly predictable: it is just the earliest possible time, viz. the moment when the reference counts of the object go down to zero.
