StackThreads/MP: version 0.77 User's Guide
Suppose procedure f calls ST_THREAD_CREATE(g())
, f
performs a normal C procedure call g()
. What is important here
is that it performs the procedure call just as if it were a sequential
call. It also records the fact that the procedure call is a fork
point. This way, the runtime system knows that the rest of f can
progress even when g is blocked, or when an idle worker shows
up. The exact method by which we record a fork point is unimportant; we
note that it only takes a couple of instructions.
If g()
terminates without blocking, it returns to f just
as a normal procedure returns. What happens when g blocks? A thread
blocks , for example, g calls st_mutex_lock(j)
, but m
is currently locked by another thread. In this case, we must be able to
suspend execution of g for now and later resume it when m is
unlocked. We achieve this by splitting the stack at the fork point where
g was spawned; the activation frame of g is logically decoupled
from the stack and the rest of f continues (when f is still not
migrated to another worker). g's activation frame physically remains
in the original stack, occupying the top portion of the stack. Stack
pointer is kept above g, while frame pointer now points to f (we
use a terminology as if a stack grows upward. A frame x is above
y if x is nearer to the growing end of the stack than
y). Subsequent procedure calls by f allocate a frame above
g. When g later restarts its computation, g frame is
logically connected to the current frame and becomes the child of
it. This frame may of course be different from g's original
caller. When g terminates, the control returns to the new parent, not
the original caller.
In general, each worker maintains its stack pointer below any unfinished
frame. A procedure call or ST_THREAD_CREATE
allocates a new frame
above stack pointer. Frames of blocked (yet unfinished) threads are
logically decoupled from the stack, but physically stay within the
stack. Frames of restarted threads are reconnected to the stack.
Leaving blocked frames in their original positions and making further
allocations on top of them, we may leave free spaces in a middle of a
stack. We do not try to reuse such holes within a stack. We always
allocate a new frame on top of the stack. Once a region becomes free in
a middle of a stack, it is reused only when no spaces are occupied above
it. When such a situation occurs, stack pointer is reset to top of the
unfinished frames. Exact details as to how to reclaim such spaces is not
described here; we only mention that such reclamation of space occurs
when the programmer calls ST_POLLING
. This is the first reason
why you have to call ST_POLLING
periodically.
What about thread migration? Interestingly, a thread migration is just a combination of thread blocks and a resume. Suppose that both f and g are in worker W's stack and W wants to migrate f to another (idle) worker U. At this point, W temporarily blocks g and resumes f. Next f also temporarily blocks and resumes the parent of f. The parent of f passes the context of f to U through shared-memory and then restarts g. U picks up the context of f and restarts it.
In general, to migrate a thread from W to U, W first blocks all frame above the migrating thread. Next W blocks the migrating thread too. W then restarts threads above the migrating thread. The migrating thread is picked up by U and restarted. This way, the migrating thread is pulled out from one stack and reconnected to another.
A thread migration also occurs when ST_POLLING
is called. When a
worker becomes idle, it randomly selects another worker and sends a
request. ST_POLLING
checks whether a request has arrived to the
calling worker and if it has, performs the above operation to migrate a
thread to the idle worker. This is the second (and the last) reason why
you have to call ST_POLLING
periodically.
The remaining question is, when a worker gets a request, how does it
select a thread to migrate. We simply migrate the thread at the bottom
of the stack. For example, suppose f performs
ST_THREAD_CREATE(g())
, g
performs
ST_THREAD_CREATE(h())
, and ST_POLLING()
finds a request
from an idle worker. In this case, f migrates. When the second
request soon arrives to the same worker, then g migrates. When call
tree is fairly balanced, as in the Fibonacci example (See An Example), this strategy works very well; all workers will soon get a
sub-tree which is fairly near to the root and concentrate on their local
work. With this task distribution strategy in mind, let's go on to the
next section.