StackThreads/MP: version 0.77 User's Guide

StackThreads/MP is a library that supports fine-grain multithreading in GCC/G++. It is supplied as a set of C library routines that can be called from any GCC/G++ code that satisfies a few conditions described later. It supports dynamic thread migration on multiprocessor systems with cache-coherent shared memory.

Unlike traditional user-level thread libraries (e.g., Pthreads), it tolerates a very large number of threads (say, 10,000 or many more) and imposes very little overhead for creating and terminating a thread. The programmer can create a new thread of control at any procedure call, and thread creation usually adds only a few instructions to that call. StackThreads/MP thus allows a programming style in which the programmer assigns a thread to each unit of work that he or she considers natural, and spawns these threads dynamically as needed.

The net result is that fine-grain threads reduce both the cost of building a parallel program, whether from scratch or from an existing sequential source, and the cost of maintaining it. StackThreads/MP is also useful as a compilation target for higher-level parallel programming languages.
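The style described above, one thread per natural unit of work, can be sketched roughly as follows. This is only an illustration under stated assumptions: `SPAWN`, `join_counter`, `join_init`, `join_finish`, `join_done`, `fib_task`, and `spawn_fib` are hypothetical stand-ins invented for this sketch, not StackThreads/MP primitives (the actual primitives are described in the APIs chapter). Here `SPAWN` simply performs the call inline, so the sketch builds with plain GCC and shows only the shape of the code, in which every recursive call is a candidate thread.

```c
#include <assert.h>

/* HYPOTHETICAL stand-ins, not StackThreads/MP primitives.
   SPAWN runs the call inline, i.e. with its sequential meaning;
   a fine-grain thread library would let the call run as a thread. */
#define SPAWN(call) (call)

typedef struct { int pending; } join_counter;

static void join_init(join_counter *jc, int n) { jc->pending = n; }
static void join_finish(join_counter *jc)      { jc->pending--; }
static int  join_done(const join_counter *jc)  { return jc->pending == 0; }

/* One unit of work per call: compute fib(n), store it in *result,
   then report completion to the parent's counter. */
static void fib_task(int n, long *result, join_counter *parent)
{
    if (n < 2) {
        *result = n;
    } else {
        long a = 0, b = 0;
        join_counter children;
        join_init(&children, 2);
        SPAWN(fib_task(n - 1, &a, &children));
        SPAWN(fib_task(n - 2, &b, &children));
        /* With real threads the parent would wait here; with the
           inline stand-in both children have already finished. */
        assert(join_done(&children));
        *result = a + b;
    }
    join_finish(parent);
}

/* Entry point callable from ordinary sequential code. */
long spawn_fib(int n)
{
    long result = 0;
    join_counter root;
    join_init(&root, 1);
    SPAWN(fib_task(n, &result, &root));
    assert(join_done(&root));
    return result;
}
```

Calling `spawn_fib(20)` returns 6765. The point of the sketch is that the parallel version keeps the structure of the sequential recursion: the only additions are the spawn at each call site and a counter that lets the parent learn when its children are done.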

  • Introduction
  • The Basics
  • APIs
  • Memory Management
  • Performance Profiler
  • Sequential Modules: Cooperating with Sequential Modules
  • Command Summary
  • Patches to GCC: Applying Safety Patches to GCC
  • Advanced Topics
  • Limitations: Known Limitations and Subtleties
  • Indices

    --- The Detailed Node Listing ---

    Introduction

  • Reporting Bugs
  • Platforms
  • Installation

    The Basics

  • An Example
  • Compiling
  • Running
  • Basic Primitives

    APIs

  • Creating A Thread
  • Polling
  • Synchronization
  • Spin-Locks
  • Yield
  • Stacks and Contexts
  • Profiler
  • Timing
  • Workers and Worker Groups
  • Callback
  • Graceful Exits
  • Thread Manipulation
  • Thread and Worker ID

    Synchronization

  • Join Counter
  • Semaphore
  • Mutex
  • Condition Variable

    Spin-Locks

  • Initialization Lock and Unlock
  • Reading and Checking Locations
  • Try Lock and Lock Any
  • Fetch and Add

    Stacks and Contexts

  • Examine Current Stack Size
  • Show Stack Trace
  • Invoke Stack Management
  • Stack Pointers and Frame Pointers

    Workers and Worker Groups

  • Create A Group
  • Add a Slave Worker
  • Check Messages between Workers
  • Deschedule Worker

    Thread Manipulation

  • Suspending A Thread
  • Resuming A Thread
  • Procedure Information

    Memory Management

  • Parallel Conservative Garbage Collector (SGC)
  • Region-Based Memory Management
  • How to Switch between Allocators
  • Alloca is not supported (yet)

    Parallel Conservative Garbage Collector (SGC)

  • Controlling GC behavior

    Region-Based Memory Management

  • Region Basics
  • Creating A Region

    How to Switch between Allocators

  • Changing the Underlying Allocator
  • Changing Region-based Allocator

    Performance Profiler

  • Profiling your program
  • Viewing profiled results

    Cooperating with Sequential Modules

  • Calling Sequential Procedure from StackThreads/MP Procedure
  • Calling StackThreads/MP Procedure from Sequential Procedure

    Calling StackThreads/MP Procedure from Sequential Procedure

  • Create a Worker Group
  • Setup TLS Pointers

    Command Summary

  • STGCC and STGPP: A Wrapper for GCC
  • STLINK: A Wrapper for the Linker
  • Command Line Options Common for All StackThreads/MP Programs

    Applying Safety Patches to GCC

  • How to Apply Patches
  • Using Patched GCC
  • Detailed Description of Patches

    Advanced Topics

  • How does it work
  • A recommended programming style for performance
  • Where should you insert ST_POLLING
  • Implementing Synchronization Primitives
  • What Does stgcc Command Do
  • What Does the Postprocessor Do

    Known Limitations and Subtleties

  • Floating SP Problem
  • Possible Slowdown by Disabling Inline
  • Structure Passing on SPARC
  • CPP Global Constructors
  • Alloca
  • Big arguments
  • Interaction with pragma interface


  • Why am I interested in it
  • Does it subsume Pthreads
  • Compilation fails
  • Link fails
  • How to obtain tools
  • Why is compilation slow
  • Segmentation fault
  • Why is my program slow on a single processor
  • Why is my program slow on multiprocessors
  • What is ST_POLLING for
  • Why both spin-locks and mutex

    Indices

  • Function Index
  • Variable Index
  • Data Type Index
  • Program Index
  • Concept Index