StackThreads/MP: version 0.77 User's Guide
Q: When I run a program compiled by stgcc on a single processor, it is slower than the serial version of the same program compiled by gcc. Why?
A: Although StackThreads/MP tries to minimize the overhead of running parallel programs, several sources of overhead remain.
ST_THREAD_CREATE adds something like ten instructions in addition to a normal procedure call.
ST_POLLING costs something like ten instructions.
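As a rough way to see how these per-event costs scale with granularity, the following sketch computes the fraction of executed instructions that goes to overhead. It is only a back-of-the-envelope model: it assumes a fixed cost of ten instructions per event (the approximate figure quoted above), counts instructions rather than time, and the names overhead_fraction and print_overhead_table are hypothetical helpers, not part of StackThreads/MP.

```c
#include <stdio.h>

/* Fraction of all executed instructions that is overhead, given the
   overhead instructions paid per event and the useful instructions
   executed between two events. Purely an instruction-count model. */
double overhead_fraction(double overhead_instr, double work_instr)
{
    return overhead_instr / (overhead_instr + work_instr);
}

/* Print the estimated overhead for a few granularities. */
void print_overhead_table(void)
{
    /* Assumption: about ten instructions per event, as quoted in the guide. */
    const double overhead = 10.0;
    const double work[] = { 10.0, 100.0, 300.0, 1000.0 };
    const int n = (int)(sizeof work / sizeof work[0]);

    for (int i = 0; i < n; i++) {
        printf("%6.0f instructions between events: %5.1f%% overhead\n",
               work[i], 100.0 * overhead_fraction(overhead, work[i]));
    }
}
```

Under this model, ten instructions of work between events means half of the executed instructions are overhead, while a few hundred instructions between events brings the share down to a few percent.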
These overheads are insignificant when the granularity of threads and synchronization is large enough (a few hundred instructions between two events is enough). Moreover, these overheads are fairly easy to reason about, because they all appear in the source program. There are other sources of overhead that are not directly visible in the C/C++ source.
stgcc by default prohibits inlining. This may slow down C++ programs considerably. See Possible Slowdown by Disabling Inline for details and workarounds.
stgcc prohibits omitting frame pointers, an optimization that is commonly applied on MIPS and Alpha. This may add a few instructions to the overhead of each procedure call.
The last two items are not noticeable unless the program calls a very small function (one that performs, say, only 10 instructions in its body) too often. Such procedures are most likely to be inline-expanded if you attach the inline keyword. This eliminates the procedure call overhead altogether.
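A minimal sketch of such a small, frequently called function is shown below. The function names are hypothetical and chosen only for illustration. Note that inline is a request to the compiler rather than a guarantee, and since stgcc prohibits inlining by default, whether the call is actually expanded presumably depends on the workarounds described in Possible Slowdown by Disabling Inline.

```c
/* Hypothetical tiny function: its body is only a handful of
   instructions, so the call overhead would be a large share of its
   cost if the call were not expanded inline. */
static inline int clamp_to_byte(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return v;
}

/* Calls the tiny function once per element, which is the situation
   where procedure call overhead becomes noticeable. */
long sum_clamped(const int *values, int n)
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += clamp_to_byte(values[i]);
    return total;
}
```

Declaring the helper static inline keeps its definition visible in the same translation unit as the loop, which is what the compiler needs in order to expand the call in place.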