next up previous contents index
Next: 7.3 Thread Statuses: Joinable Up: 7. Multi-Threaded Programming in Previous: 7.1.0.0.1 Hello World for   Contents   Index

7.2 Communication among Threads

Example 7.2.1   Consider the program fragment
:- dynamic p/1.

test:- thread_create(assert(p(1)),_X).
If you type the goal ?- test and then the goal ?- p(X), the call p(X) will fail.

This illustrates an important point about dynamic and tabled predicates in the multi-threaded engine: by default clauses for a dynamic predicate p/n are private to the thread that asserts them; and by default tables created in an evaluation of a goal for p/n are private to the thread that evaluates the goal. This behavior contrasts to that of static code which is always shared between threads. In the example above, to allow p(1) to be visible to various threads, p/1 must be declared to be shared with the following declaration.

:- thread_shared(p(_)).

The ability to share dynamic code between predicates provides an extremely powerful mechanism for threads to communicate. So why does XSB make dynamic predicates thread-private by default? The main reason for this is that if dozens or hundreds of threads are running concurrently, shared dynamic code becomes an expensive synchronization point. Code for shared predicates must be more heavily mutexed than code for private predicates. In the case of dynamic code, XSB does not always immediately reclaim the space of retracted clause, to avoid the possibility of some computation backtracking into a clause that has been reclaimed. Rather, (like most Prologs), XSB may garbage collect the space of the retracted clauses at a later time. While clause garbage collection is simple enough to implement for a single thread, garbage collecting clauses for shared dynamic predicates is difficult to do when multiple threads are active. Accordingly, in Version 3.0, space for shared dynamic clauses is not reclaimed until there is a single active thread. However for thread-private dynamic predicates, there is no problem in reclaiming space when multiple threads are active: from the engine's perspective garbage collection is no different than in the sequential case. Thus one set of reasons for making dynamic predicates private by default are based on efficiency 7.1.

The second reason for making dynamic predicates thread-private by default is semantic. Suppose thread $T_1$ starts a tabled computation that depends on the dynamic shared predicate p/1. While $T_1$ is computing the table, thread $T_2$ asserts a clause to p/1. $T_1$'s table is likely to be inconsistent, leading to the problem of read consistency of any table that depends on thread-shared dynamic predicates. In Version 3.0, users are responsible for ensuring read consistency of any tables that depend on shared dynamic data. Future versions of XSB are intended to allow more sophisticated mechanisms for read consistency.

Not only can tables depend on thread-shared or thread-private dynamic data, but the tables themselves may be thread-shared or thread-private. Like dynamic code, the declaration thread_shared/1 allows sharing of tables for a predicate evaluated with call-variance to be shared among threads 7.2. To some extent, tabling considerations for making a predicate thread-shared or thread-private are like those of dynamic code. Thread-private tables require fewer synchronization points overall. The situation for reclaiming space for abolished tables is analogous to reclaiming space for retracted dynamic clauses: the garbage collector treats abolished tables for thread-private predicates as in the sequential case, while space for shared tables is not reclaimed until there is a single active thread. However the precise semantics of how tabling information is shared depends on whether the multi-threaded engine is configured with the default local evaluation or with batched evaluation. As discussed in Chapter 5, local evaluation is so-named because computation always takes place in the SCC most recently created, and no answer is returned outside of an SCC until the SCC has been completely evaluated. Within this scheduling strategy it is not often useful to share answers between tables that have not been completed - as local evaluation would allow these answers to be returned only if the tables were in the same SCC. This leads to a concurrency semantics called Shared Completed Tables. Shared Completed Tables can in fact be supported by a relatively simple algorithm for optimistic concurrency control. If goals to two mutually dependent tables $Table_a$ and $Table_b$ are called concurrently by two different threads, $Thread_a$ and $Thread_b$, nothing is done until it is detected that $Table_a$ and $Table_b$ are both incomplete and are contained in the same SCC of the table dependency graph. At that time, one of the threads (e.g. $Thread_a$) takes over recomputation of all tables in the SCC, and when the SCC is completed, any remaining answers are returned to other threads that had invoked goals in the SCC. While $Thread_a$ is completing this computation, $Thread_b$ suspends until the SCC is complete. Thus the semantics of Shared Completed Tables supports concurrency for the well-founded semantics, but only supports the most coarse-grained parallelism.

Batched evaluation, on the other hand, allows answers to be returned outside of an SCC before that SCC has been completed. Concurrency control for batched evaluation is similar to that for local evaluation, except in the following case. Assume as before that $Table_a$, first called by $Thread_a$, and $Table_b$ first called by $Thread_b$ are determined to be in the same SCC, and that $Thread_a$ takes over computation of subgoals in the SCC. Now, $Thread_b$, rather than suspending, may continue work. In particular, $Thread_b$ can return any answers in $Table_b$ that it finds whenever it finds them, regardless of whether they have been produced by $Thread_b$ (before $Thread_a$ took over the SCC) or by $Thread_a$ (afterwards). We call this type of concurrency semantics, Table Parallelism. Table Parallelism can be used to program producer-consumer examples, as well as to implement Or- and And- parallelism. Table Parallelism was first introduced in [25], but the mechanism now used for implementing Table Parallelism differs significantly from what was described there. In Version 3.0 of XSB, the implementation of Table Parallelism is experimental: in particular, it does not yet support tabled negation.

As mentioned, for either semantics of shared tables, in Version 3.0, users of thread-shared tables are responsible for ensuring read consistency. Note that, in principle, thread-shared tables may depend on thread-private tables and vice-versa. Either type of table may depend on thread-private or thread-shared dynamic code. In addition, a predicate may be both dynamic and tabled, and its clauses and tables may be either thread-private or thread-shared.


next up previous contents index
Next: 7.3 Thread Statuses: Joinable Up: 7. Multi-Threaded Programming in Previous: 7.1.0.0.1 Hello World for   Contents   Index
Terrance Swift 2007-10-05