StackThreads/MP: version 0.77 User's Guide

5.1: Profiling your program

The most convenient way to profile your program is to run it with the -tp option. For example,

        harp:367% ./fib -nw 10 -tp
@        pfib: 1091 ms on 10 processors, sfib: 2062 ms

This produces a number of files named 00stprof.xx.yy (where xx and yy are numbers) in the current directory. With 10 processors, you typically get 10 files.

You may want to profile a particular section of your program. In such cases, add a call to st_begin_profile() where you want to begin profiling and st_end_profile() where you want to finish profiling. Currently, you can call them only once in a program run.
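As an illustration of the placement described above, here is a minimal sketch that profiles only one section of a run. The guide does not give the signatures of st_begin_profile() and st_end_profile(); the sketch assumes they take no arguments and return nothing, and it defines stand-in versions so that it compiles by itself. In a real program, remove the stand-ins and use the declarations that come with StackThreads/MP. The names fib and run_with_profiled_section are hypothetical.

```c
#include <assert.h>

/* Stand-ins so this sketch compiles alone. ASSUMPTION: the real
   functions take no arguments and return nothing. The counters
   model the "only once in a program run" restriction. */
static int profile_begun = 0;
static int profile_ended = 0;

void st_begin_profile(void)
{
    assert(profile_begun == 0);   /* may be called only once */
    profile_begun++;
}

void st_end_profile(void)
{
    assert(profile_begun == 1 && profile_ended == 0);
    profile_ended++;
}

/* Hypothetical workload. */
static long fib(int n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

/* Only the call to fib(n) falls inside the profiled section. */
long run_with_profiled_section(int n)
{
    long warmup = fib(10);        /* set-up work, not profiled */
    (void)warmup;

    st_begin_profile();           /* profiling starts here */
    long result = fib(n);
    st_end_profile();             /* profiling ends here */

    return result;
}
```

Because each function can be called only once per run, the pair should bracket the single region of interest rather than sit inside a loop.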

The resolution of profiling is, by default, 100 microseconds. Each processor measures how much time it spends in each state (busy, idle, etc.) and, every 100 microseconds, calculates the dominating state of that period. The log file records the state of each period. You can change the resolution with the command-line option --time_profile_resolution S, where S specifies the length of a period in microseconds. A larger value saves space but may make the result less accurate; a smaller value makes the result more reliable at the expense of space.
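The idea of a "dominating state" can be sketched as follows. This is not the StackThreads/MP implementation; it only assumes that the dominating state is the one in which the processor spent the most time during the period. The state names, the struct, and the function dominating_state are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state set; the guide only names "busy, idle, etc." */
enum state { STATE_BUSY, STATE_IDLE, STATE_OTHER, N_STATES };

/* Time in microseconds spent in each state during one period. */
struct period_times {
    unsigned long usec[N_STATES];
};

/* Return the state with the largest share of the period.
   Ties go to the lowest-numbered state. */
enum state dominating_state(const struct period_times *p)
{
    assert(p != NULL);

    enum state best = STATE_BUSY;
    for (int s = 1; s < N_STATES; s++) {
        if (p->usec[s] > p->usec[best])
            best = (enum state)s;
    }
    return best;
}
```

With a 100-microsecond period split as 60 busy, 30 idle and 10 other, the period is logged as busy and the 40 microseconds spent elsewhere are lost. This is why a longer period gives a coarser, less accurate picture.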

Each processor keeps a fixed-size buffer for accumulating profile entries and saves it to a file when the buffer overflows (and when profiling finishes). These saves may introduce a large Heisenberg effect into your profile; that is, the act of profiling may perturb the program being measured. You can increase the size of the in-memory buffer with --time_profile_buffer_size N, where N specifies the number of entries in the in-memory buffer. The default is 8100. N does not necessarily equal the number of periods you can record before the in-memory profile is saved to secondary storage, because a single entry describes a run of consecutive periods in the same state. For example, if a processor is busy most of the time, the necessary storage will be quite small.
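The relationship between entries and periods can be sketched with a small run-length buffer. This is an illustration under stated assumptions, not the library's actual data structure: the struct layout, the names record_period and flush, and the tiny capacity of 4 entries are all hypothetical (the guide's default is 8100 entries), and the flush here only counts saves instead of writing a file.

```c
#include <assert.h>
#include <stddef.h>

enum { BUF_ENTRIES = 4 };   /* tiny, for illustration only */

/* One entry covers a run of consecutive periods in the same state. */
struct entry {
    int state;
    unsigned long n_periods;
};

struct profile_buf {
    struct entry e[BUF_ENTRIES];
    size_t len;              /* entries currently in use */
    unsigned long flushes;   /* how many times the buffer was saved */
};

/* Stand-in for saving the buffer to a file: count it and empty it. */
static void flush(struct profile_buf *b)
{
    b->flushes++;
    b->len = 0;
}

/* Record the dominating state of one more period. */
void record_period(struct profile_buf *b, int state)
{
    assert(b != NULL);

    /* Same state as the previous period: extend the current run. */
    if (b->len > 0 && b->e[b->len - 1].state == state) {
        b->e[b->len - 1].n_periods++;
        return;
    }

    /* State changed: a new entry is needed, so save first if full. */
    if (b->len == BUF_ENTRIES)
        flush(b);

    b->e[b->len].state = state;
    b->e[b->len].n_periods = 1;
    b->len++;
}
```

In this sketch, 1000 consecutive periods in one state occupy a single entry and never trigger a save, whereas 10 periods that alternate between two states fill the 4-entry buffer twice. How often the buffer is saved therefore depends on how often the state changes, not on the number of periods.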