The problem with threads has been explained very concisely in the famous paper[pdf link] of Edward A. Lee. Very simply put, the author argues that threads are a problem because they add nondeterminism to an application, and the programmer then has to put in extra effort to control that nondeterminism, when it should be the other way around. Quoting the author: *Threads, as a model of computation, are wildly nondeterministic, and the job of the programmer becomes one of pruning that nondeterminism. (..) I argue that this is approaching the problem backwards. Rather than pruning nondeterminism, we should build from essentially deterministic, composable components. Nondeterminism should be explicitly and judiciously introduced where needed, rather than removed where not needed.*

The author goes on to suggest a few ways to alleviate the problem:

- A better software engineering process
- Using vetted design patterns for concurrent computation
- Using patterns that can be encapsulated into language libraries (for example, MapReduce)
- Extending the existing programming language with annotations to indicate concurrent computations
- Using promises/futures as implemented in the E language (working on proxies of shared data instead of the actual data)
- Using tools to determine whether the existing code is indeed thread safe
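To make the promises/futures idea from the list above concrete, here is a minimal sketch using Python's standard `concurrent.futures` module (the function name `expensive_square` is my own illustration, not anything from the paper or the E language): the caller holds a proxy for a result that a worker computes elsewhere, instead of sharing the data structure itself.

```python
# The caller works with a Future (a proxy for a pending result)
# rather than touching shared state directly.
from concurrent.futures import ThreadPoolExecutor

def expensive_square(x):
    # Stands in for any computation done off the main thread.
    return x * x

with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(expensive_square, 7)  # returns immediately
    # ... the caller can do other work here ...
    result = future.result()                   # blocks until ready

print(result)  # 49
```
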

The author then goes on to suggest an alternative to threads: a system modeled closely on CSP. In this model, the various components of the system are deterministic; the rendezvous director is the one which *controls* the nondeterministic part of the system. The components are deterministic, but the merge is nondeterministic, so the merge block is where one can concentrate effort, for instance to ensure that any *rollbacks* are transmitted back to the components. The discussion at LtU has some interesting points (some of which go over my head right now).
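One way to picture the component/merge split described above is the following sketch (the worker names and the use of a `queue.Queue` as the merge point are my own illustration, not the paper's machinery): each component is a pure, deterministic function of its input, and the only nondeterminism is the order in which results arrive at the merge.

```python
# Deterministic components feeding a single nondeterministic merge point.
import queue
import threading

out = queue.Queue()

def component(name, data):
    # Deterministic: the same data always yields the same result.
    out.put((name, sum(data)))

t1 = threading.Thread(target=component, args=("a", [1, 2, 3]))
t2 = threading.Thread(target=component, args=("b", [10, 20]))
t1.start(); t2.start()
t1.join(); t2.join()

# The merge: results arrive in a nondeterministic order, but the *set*
# of results is deterministic -- nondeterminism is confined to this point.
results = dict(out.get() for _ in range(2))
print(results)  # {'a': 6, 'b': 30}, though arrival order may vary
```
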

This post, of course, does not deal with the problem of threads in detail. Rather, I want to concentrate on how the author presents threads as a model of computation, and how one can clearly see the nondeterminism creep in. This is what makes it very difficult to reason about heavily threaded systems. As I am not aware of how to integrate the LaTeX plugin, I shall try to write the equations as closely to the document as I can.

Let N={0,1,2…} be the set of natural numbers

B={0,1} the set of binary bits

B* is the infinite set of all finite sequences of bits

Bw = (N -> B), i.e. the set of all infinite sequences of bits. Each element of Bw is a function from N to B: for every position n it gives a bit, zero or one.

B** = B* U Bw, i.e. B** is the set of all finite and infinite sequences of bits. This is the state space of the computing machine: a state can be any finite sequence or any infinite one (remember that even B* by itself is an infinite set).

Q is the set of partial functions with domain and codomain B**, i.e. functions that may be undefined for some values in B** (that is what a partial function is). Simply put, there are sequences (finite or infinite) for which such a function produces no result.

An imperative machine (A, c) consists of a finite set A of atomic actions (each a partial function in Q; these are, in general, the instructions executed during a computation) and a control function c : B** -> N, which, given the current state, picks the index of the next action to execute.

A program of length m is represented as p : N -> A, a function from instruction indices to actions: p(n) is the nth instruction of the program. In simple words, a program is a numbered list of m actions.

Now that a program of length m has been defined, it is time to consider the state of the program as it executes (via a thread): bn+1 = p(c(bn))(bn), for bn being the state in B**. Here,

- c(bn) gives the index of the next instruction to execute
- p(c(bn)) is that instruction (an action in A)
- p(c(bn))(bn) is the result of the execution: apply the instruction selected by c(bn) to the current state bn to get the next state
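The three steps above can be sketched as a toy machine (a sketch under my own assumptions: the state is a plain integer standing in for an element of B**, `program` plays the role of p, and `control` the role of c; none of these names come from the paper).

```python
# A toy rendering of the sequential machine: b_{n+1} = p(c(b_n))(b_n).

def control(state):
    # c: picks the index of the next action from the current state.
    return state % 2

program = [
    lambda b: b + 1,   # action 0: increment
    lambda b: b * 2,   # action 1: double
]  # p : N -> A, here a program of length 2

def step(b):
    # Apply the instruction selected by c(b) to the state b.
    return program[control(b)](b)

b = 0
for _ in range(4):
    b = step(b)
print(b)  # 0 -> 1 -> 2 -> 3 -> 6, so prints 6
```

Running this twice always produces the same trace: with a single thread, the next state is a deterministic function of the current one.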

Now, the author explains the actual problem with threads. For a sequential process, bn+1 is always defined for a given n; the output is always deterministic. But if (say) two threads run concurrently over this data, the equation becomes: bn+1 = pi(c(bn))(bn) for i in {1, 2} (please read pi as p subscript i). In this case, the next state is either p1(c(bn))(bn) or p2(c(bn))(bn), depending on the value of i. So, even though the next instruction returned by the control function is known for each thread, the next state is nondeterministic because of the value of i that is picked at that instant of time. And of course, if there are two CPUs (or cores) whose time slices are given to both of these threads, the data access problems become compounded.
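A direct way to see the equation above is to make the scheduler's choices explicit (all names here, `control`, `p1`, `p2`, `run`, and the one-instruction programs, are my own illustration): the schedule is the sequence of i values picked at run time, and different schedules reach different final states from the same start.

```python
# Simulating b_{n+1} = p_i(c(b_n))(b_n) with an explicit schedule.

def control(state):
    return 0  # a trivial c: each thread's program has one instruction

p1 = [lambda b: b + 1]   # thread 1's program: increment the state
p2 = [lambda b: b * 2]   # thread 2's program: double the state

def run(schedule, b=1):
    for i in schedule:
        b = [p1, p2][i][control(b)](b)   # b_{n+1} = p_i(c(b_n))(b_n)
    return b

# Same programs, same initial state, different interleavings:
print(run([0, 0, 1]), run([1, 0, 0]))  # 6 4
```

A real scheduler picks the schedule for you, nondeterministically, which is exactly why the final state cannot be predicted from the programs alone.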

I hope I was able to explain the theory (I do know that it is not possible to explain theory without the right mathematical symbols; I shall fix that soon, once I get my head around either MathML or using LaTeX in HTML) as to why there is a problem with threads. The last equation is the crux of the complete problem: bn+1, i.e. the next state, is very clearly defined in the case of a sequential program, whereas in the concurrent program with threads it is not.

At the end of the paper, the author models concurrent computation thus:

f : (T -> B**) -> (T -> B**). Here T is an *ordered* set of *tags*; the order of these tags can be determined by various factors like time, causality, etc. In simple terms, a process maps signals to signals, where a signal assigns a value in B** to each tag: based on some order within the set of tags, the next value is picked, and the process computes an output from a similar tagged set of outputs. Where nondeterminism is wanted, it is introduced explicitly in the tag ordering rather than arising from thread interleaving.
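A small sketch of the f : (T -> B**) -> (T -> B**) idea, under my own simplifying assumptions: a signal is a mapping from tags (here, integers read as timestamps) to values, and a process is an ordinary function from signals to signals. The name `double` and the doubling behaviour are illustrative only.

```python
# A process in the tagged-signal style: signals in, signals out.

def double(signal):
    # A deterministic process: the value at each tag is doubled,
    # and the tag structure is preserved unchanged.
    return {tag: 2 * value for tag, value in signal.items()}

s = {0: 1, 1: 3, 2: 5}   # a signal: tag -> value
print(double(s))          # {0: 2, 1: 6, 2: 10}
```

Because the process is a function on whole signals, composing such processes stays deterministic; any nondeterminism would have to be added deliberately, in the tags, which is the paper's point.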

Please let me know if I have made any factual errors in trying to explain the theory of the problem with threads. The idea behind the post was to make it easy for people to understand what those (sometimes scary) formal symbols actually meant.