Multi-threading, synchronization, and transactions in Lava

(Not yet implemented)

Lava is a sequential programming language with built-in multi-threading support. It treats multi-threading, synchronization, and transactions in a purely declarative way without requiring delicate and error-prone executable primitives. Let us start with

Concurrent execution using implicit thread creation and termination

Functions occurring in Lava classes may be declared to run in synchronous, concurrent, or autonomous mode. Lava initiators, which play the role of autonomous main programs, may be declared to run in synchronous or concurrent mode.

Concurrent execution means that the function or initiator is executed within a thread of its own. Autonomous execution of a function means that, in addition to being executed concurrently, the function does not return output parameters or throw exceptions (using the throw statement) to its caller. Member functions are executed in synchronous mode by default. Initiators have no output parameters and cannot throw exceptions.

When a concurrently executing function or initiator returns, this does not necessarily mean that its associated thread is destroyed. For performance reasons the Lava run time system may prefer to return such a thread to a pool of "unused threads" and to reuse these whenever a new thread is required. New threads would then be created only if this pool is empty (not yet implemented!).
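Purely as an analogy (this is Java, not Lava, and it only illustrates the idea of thread reuse, not the actual Lava run time system): a cached thread pool returns finished threads to an internal pool and reuses them for subsequent tasks.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class ThreadReuseSketch {
        public static void main(String[] args) throws InterruptedException {
            // A cached pool keeps idle threads around and reuses them;
            // new threads are created only if no idle thread is available.
            ExecutorService pool = Executors.newCachedThreadPool();
            for (int i = 0; i < 5; i++) {
                final int request = i;
                pool.execute(() -> System.out.println("request " + request
                        + " served by " + Thread.currentThread().getName()));
                Thread.sleep(100); // let the previous task finish so its thread can be reused
            }
            pool.shutdown();
        }
    }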

Mutual-exclusion synchronization using transactions

Irrespective of their execution mode, functions and initiators may be declared to be transactions. The purpose of transactions is to synchronize read and write access operations that are performed on shared data by concurrent threads or processes.

At this point you should recall that Lava distinguishes two categories of objects: variable state objects ("services" or "servers"), which can be modified again and again, and value objects ("structures"), which can be modified only during construction and become immutable after "completion". For this reason a value object does not need access synchronization after completion.

Before completion, Lava prevents an object from being passed as a function or initiator parameter. Incomplete objects can therefore be processed only by member functions of their own class, and these cannot be executed in concurrent mode at this stage. As a consequence, access synchronization is not needed for incomplete Lava objects, and for value objects access synchronization is never required.

The traditional transaction notion declares a set of concurrent transactions to be synchronized correctly if they have "the same effect" as if they had been executed in some proper, strictly sequential order, one after the other. The problem with this definition is that the term "same effect" requires you to compare the objects that are inserted, updated, deleted, or merely read by those transactions in the properly synchronized concurrent case and in the strictly sequential case. Two problems arise in this context:

  1. When comparing objects you must know precisely what belongs to an object and what is only a reference to some other, independent object. In traditional databases you may sometimes doubt whether, from the application's point of view, e.g. a "foreign key" or a link in a network database points to an acquaintance or to a constituent of the containing primary object. In the latter case the target of such a pointer should be included in the comparison (and, as a consequence, should be locked together with its containing parent object).

    In Lava this problem doesn't occur owing to the Lava constituent notion.
  2. On the other hand, you may ask whether it is indeed necessary and justified to always lock entire objects that are touched by a transaction, including all their constituents. This might restrict potential parallelism unnecessarily.

    Lava gives a negative answer to this question. Owing to the single-assignment nature of Lava, the body of a transaction function or initiator can be viewed as a complex logical statement that is to be rendered true at some proper instant, and this gives rise to our specifically redefined

Lava transaction notion:

If a function or initiator is declared to be a transaction then this means that its body, viewed as a complex logical statement, is to be rendered true as a whole at some proper instant.

Actually this concerns only the Atomicity part of the usual ACID definition of the transaction notion. The C (Consistency), I (Isolation) and D (Durability) parts remain unchanged in Lava.

A particular consequence of this revised transaction notion is that not entire objects but only the individual object references (= member variables) that occur expressly in this complex statement need to be protected against concurrent access. Clearly the entire access path of an object, for instance a.b.c.d, has to be protected by proper locks, including the intermediate references a.b and a.b.c; the local variable "a" itself need not be protected since it cannot be referenced or changed from elsewhere.

Member variables of state objects and of incomplete value objects are protected in the usual way against colliding access operations: Readers put read locks on the respective variables; a read lock is granted if the variable is not currently blocked by a write lock. Writers put write locks on the respective variables; a write lock is granted if the variable is not currently blocked by a read lock or by another write lock. Locks are kept until the end of the transaction and are released then.
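The following Java sketch is only meant to illustrate this locking discipline; it is not Lava code and not the actual Lava implementation, and all names are invented for the example. It protects each member variable with its own reader/writer lock, takes read locks on the intermediate references of an access path and a write lock on its last reference, and releases all locks at the end of the transaction.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // One reader/writer lock per member variable (object reference).
    class Ref<T> {
        private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        private T value;
        Ref(T value) { this.value = value; }

        // A reader puts a read lock on the variable (granted unless write-locked).
        T read(Txn txn) { txn.hold(rw.readLock()); return value; }

        // A writer puts a write lock on the variable (granted unless read- or write-locked).
        void write(Txn txn, T v) { txn.hold(rw.writeLock()); value = v; }
    }

    // Locks are kept until end of transaction and released then.
    class Txn {
        private final Deque<Lock> held = new ArrayDeque<>();
        void hold(Lock lock) { lock.lock(); held.push(lock); }
        void end() { while (!held.isEmpty()) held.pop().unlock(); }
    }

    class C { final Ref<Integer> d = new Ref<>(0); }
    class B { final Ref<C> c = new Ref<>(new C()); }
    class A { final Ref<B> b = new Ref<>(new B()); }

    public class AccessPathLocking {
        public static void main(String[] args) {
            A a = new A();      // local variable "a": needs no lock of its own
            Txn txn = new Txn();
            // Access path a.b.c.d: read locks on the intermediate references
            // a.b and a.b.c, write lock on a.b.c.d.
            B b = a.b.read(txn);
            C c = b.c.read(txn);
            c.d.write(txn, 42);
            txn.end();          // all locks are released at end of transaction
        }
    }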

Clearly Lava has to support nested transactions since function and initiator calls may be nested. Distributed transactions are closely related to the Lava component notion since, in Lava, distribution is supported only at the component level (not yet implemented).

Summary: Lava replaces the usual indirect definition of atomicity ("same effect as with properly sequentialized execution") by a quite direct definition as "truth of a complex logical statement at some proper instant". This is enabled by the ambivalent (logical and imperative) nature of the Lava semantics, which in turn is a consequence of its single-assignment nature. The Lava implementation of transactions does not lock entire objects touched by a transaction but only the individual member variables contained in an access path that is actually used in the transaction.

Aborting transactions by throwing exceptions

A special "abort" primitive for transactions is not required in Lava since you can simply use the Lava exception signalling statement

throw <expr> 

(analogous to throw in C++/Java) to trigger the general Lava exception handling mechanism immediately.

If an exception is not handled within the current transaction then the transaction is terminated and the exception is propagated upward through the current execution stack, provided the transaction function has not been declared "autonomous" (recall that initiators are always autonomous). For non-autonomous concurrent transactions this means that the exception has to be propagated from the transaction thread to the caller thread.
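In conventional terms (a Java analogy with invented names, not Lava code): a transaction body aborts itself simply by throwing, and for a concurrent, non-autonomous transaction the exception crosses the thread boundary when the caller accesses the result.

    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class AbortByThrow {
        // Hypothetical transaction body that aborts itself by throwing
        // (corresponds to "throw <expr>" in Lava).
        static int transferFunds() {
            boolean insufficientFunds = true;
            if (insufficientFunds) {
                throw new IllegalStateException("insufficient funds");
            }
            return 0;
        }

        public static void main(String[] args) throws InterruptedException {
            ExecutorService pool = Executors.newSingleThreadExecutor();
            Callable<Integer> transaction = AbortByThrow::transferFunds;
            Future<Integer> result = pool.submit(transaction);
            try {
                result.get(); // the exception crosses the thread boundary here
            } catch (ExecutionException e) {
                // The caller thread receives the exception thrown by the
                // transaction thread and sees the transaction as aborted.
                System.out.println("transaction aborted: " + e.getCause().getMessage());
            } finally {
                pool.shutdown();
            }
        }
    }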

Waiting for output parameters returned by a concurrently executing method

This is the typical asymmetric client/server relationship with asynchronous service requests. In Lava, a subsequent read access to such output parameters blocks the caller until the asynchronous method has returned. For simplicity, concurrently executing methods are always "autonomous" in Lava, i.e., they don't throw exceptions to the caller. (Otherwise the caller might still have to wait, since the method could still throw an exception.)
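In conventional terms this behaves like a future: the call itself returns immediately, and only a later read access to the output parameter blocks. A Java sketch of this behaviour (not Lava code; the computation and its result are invented for the example):

    import java.util.concurrent.CompletableFuture;

    public class AsyncOutputParameter {
        public static void main(String[] args) {
            // The "concurrently executing method": started in its own thread,
            // its output parameter is represented here by a CompletableFuture.
            CompletableFuture<String> answer = CompletableFuture.supplyAsync(() -> {
                sleep(500);              // simulate a longer computation
                return "42";
            });

            System.out.println("caller keeps working ...");

            // The first read access to the output parameter blocks the caller
            // until the concurrent method has returned.
            System.out.println("answer = " + answer.join());
        }

        private static void sleep(long millis) {
            try { Thread.sleep(millis); } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }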

Waiting for new property values to be assigned by concurrently executing methods

This is the typical symmetric producer/consumer relationship between concurrent threads. In Lava, it is handled by special "consumable" member variables of services. (Remember that member variables of services/state objects are called "properties".) The value of such a property is "consumed" by every read access. Readers are blocked until a writer has assigned a new value to this property. Writers are blocked until the value has been consumed by all readers. Only one writer is admitted at a time. Several readers may access the value concurrently, but it is consumed only once by this entire group of readers. If a reader wants to access the same copy of the value several times, it must assign it to a different (e.g., local) variable first.

(Note that Lava variables contain only references to objects. So when a new value is assigned to a Lava variable, the old value (= object) is not necessarily destroyed; the lifetime of Lava objects is controlled by their reference count, and an object is destroyed only when its reference count reaches 0.)
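For the single-reader case, such a consumable property behaves like a rendezvous between producer and consumer. The following Java sketch uses a SynchronousQueue as a rough analogy (it is not Lava code and does not model the sharing of one value by a whole group of readers described above):

    import java.util.concurrent.SynchronousQueue;

    public class ConsumableProperty {
        public static void main(String[] args) throws InterruptedException {
            // Stands in for one "consumable" property of a service object.
            SynchronousQueue<Integer> slot = new SynchronousQueue<>();

            Thread producer = new Thread(() -> {
                try {
                    for (int i = 1; i <= 3; i++) {
                        slot.put(i);            // writer blocks until the value is consumed
                        System.out.println("produced " + i);
                    }
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });

            Thread consumer = new Thread(() -> {
                try {
                    for (int i = 1; i <= 3; i++) {
                        int v = slot.take();    // reader blocks until a new value is written
                        System.out.println("consumed " + v);
                    }
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });

            producer.start();
            consumer.start();
            producer.join();
            consumer.join();
        }
    }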

Implementing servers that wait indefinitely for service requests

A server process will typically wait indefinitely for service requests and process these either sequentially, one after the other, or as concurrent transactions. In Lava you would create a service object for this purpose whose methods are transactions. Whether they should be declared "synchronous" (= default) or "concurrent" depends on whether the clients are willing to wait for the response or would prefer asynchronous processing of the request. Strictly sequential processing of requests as a brute-force method to guarantee correct synchronization is not needed, since transaction synchronization is automatic and easy in Lava and does not burden the programmer with additional effort.
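Purely as a conventional analogy (Java, with invented names, not Lava code): a server loop that waits indefinitely on a request queue and hands each request to a worker thread; in Lava the corresponding synchronization would come for free from declaring the handler methods as transactions.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    public class RequestServer {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> requests = new LinkedBlockingQueue<>();
            ExecutorService workers = Executors.newFixedThreadPool(4);

            Thread server = new Thread(() -> {
                try {
                    while (true) {
                        String request = requests.take(); // wait indefinitely for a request
                        // "Concurrent" handling: each request becomes its own task.
                        workers.execute(() -> System.out.println("handled " + request
                                + " on " + Thread.currentThread().getName()));
                    }
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
            server.start();

            // A few clients issuing asynchronous service requests:
            for (int i = 1; i <= 3; i++) requests.put("request-" + i);

            Thread.sleep(500);      // let the requests be processed
            server.interrupt();     // shut the demo down
            workers.shutdown();
        }
    }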

A special case of waiting for service requests is waiting for callbacks.

Summary

We are convinced that the expressive means presented above, namely

  1. concurrent and autonomous execution with implicit thread creation and termination,
  2. mutual-exclusion synchronization using declarative transactions, aborted if necessary by throwing exceptions,
  3. waiting for output parameters returned by concurrently executing methods,
  4. waiting for new values of consumable properties in producer/consumer relationships, and
  5. service objects (and callbacks) that wait indefinitely for service requests,

are sufficient to appropriately handle all cases of multi-threading and synchronization that may reasonably occur in application-level programming. None of these cases requires special executable primitives; they can all be handled by purely declarative expressive means in Lava.

However, to make proper use of this you should be ready to think about a proper structuring of your application. For instance, Lava does not support alternative waits on several objects in the producer/consumer case; in our opinion this case is better handled by callbacks (= client-defined notification functions, client = consumer) or by server-defined notifications (server = producer).

We expect and hope that this reduction of the expressive means to the truly required minimum will lead to more standardized application structures and to an improved common understanding of the traditionally very delicate problems of multi-threading, synchronization, and transactions.