

    9/2/2008

    Debugging parallel programs with Visual Studio: OpenMP

    I love Visual Studio (2008) for C++ development - and this is not the first time that I have stated that, nor the only place. One of its specific strengths is the tight integration of the compiler and the debugger. In my opinion, VS is the best debugger for C++, as “it just works” in most cases, even with our most demanding applications. Sure, there are some issues (as with every debugger out there), but especially when it comes to working with multi-threaded code, VS has proven to be a pretty solid solution. This blog post will discuss the debugging capabilities for OpenMP programs written in C/C++ (well, and the obstacles you will encounter); a following post will discuss debugging MPI programs using the DDTlite Visual Studio plugin from Allinea (as soon as the beta program goes public).

    For the sake of simplicity and just because I like it, I will take the Jacobi solver as an example program. The interesting part looks as follows:

    01         while (data.iIterCount < data.iIterMax && residual > data.fTolerance) 
    02         {
    03             residual = 0.0;
    04             /* copy new solution into old */
    05 #pragma omp parallel
    06 {
    07 #pragma omp for 
    08             for (int j = 1; j < data.iRows - 1; j++)
    09             {
    10                 for (int i = 1; i < data.iCols - 1; i++)
    11                     UOLD(j,i) = U(j,i);
    12             }
    13 
    14             /* compute stencil, residual and update */
    15 #pragma omp for reduction(+:residual)
    16             for (int j = data.iRowFirst + 1; j <= data.iRowLast - 1; j++)
    17             {
    18                 for (int i = 1; i < data.iCols - 1; i++)
    19                 {
    20                     /* … more code here …*/
    21 
    22                     residual += fLRes * fLRes;
    23                 }
    24              }
    25 } /* end omp parallel */
    26 
    27             /* error check */
    28             data.iIterCount++;
    29             residual = sqrt(residual) / (data.iCols * data.iRows);
    30         } /* while */

    Here we have one OpenMP Parallel Region with two Worksharing constructs in it; the second Worksharing construct has a reduction operation on it. There are two important features a parallel debugger has to provide for OpenMP:
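
    If you want to play with this structure without the full solver, here is a minimal self-contained sketch of the same pattern. It is not the code from my Jacobi program: the function name relax(), the 1D stencil and the std::vector storage are assumptions made purely for illustration.

```cpp
#include <cmath>
#include <vector>

// Hypothetical 1D stand-in for the solver loop shown above: one Parallel
// Region, two Worksharing loops, the second one with a reduction on
// `residual`. The boundary values u[0] and u[n-1] stay fixed.
// Returns the number of sweeps performed.
int relax(std::vector<double>& u, int maxIter, double tol)
{
    const int n = static_cast<int>(u.size());
    std::vector<double> uold(u.size());
    double residual = tol + 1.0;   // make sure the loop is entered
    int iter = 0;

    while (iter < maxIter && residual > tol)
    {
        residual = 0.0;
#pragma omp parallel
        {
            // Worksharing construct 1: copy new solution into old.
#pragma omp for
            for (int i = 0; i < n; i++)
                uold[i] = u[i];

            // Worksharing construct 2: update and accumulate the residual.
            // Inside this loop every thread works on its own private copy
            // of `residual`; the copies are summed up at the end of the loop.
#pragma omp for reduction(+:residual)
            for (int i = 1; i < n - 1; i++)
            {
                const double res = 0.5 * (uold[i - 1] + uold[i + 1]) - uold[i];
                u[i] = uold[i] + res;
                residual += res * res;
            }
        } // end omp parallel

        iter++;
        residual = std::sqrt(residual) / n;
    }
    return iter;
}
```

    Called with a small vector such as {0, 0, 0, 0, 1}, the interior values relax towards a straight line between the two boundary values. Compile with /openmp in Visual Studio (or -fopenmp with GCC); without that switch the pragmas are ignored and the code runs serially.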

    1. Control of individual threads. In OpenMP, the program execution starts with a single thread (named the Master or Initial thread). At each Parallel Region a Team of threads is created. As the thread creation is done by the OpenMP runtime, you (= the user) are not interested in this, but once all threads are created you are interested in controlling them individually!
    2. Examination of private variables. Each thread has its own copy of a private variable, thus the values can differ between threads. The debugger has to be able to display the different values for different threads!

    In order to set the number of threads an OpenMP program should run with, you have several options. The most prominent one is to set the OMP_NUM_THREADS environment variable (right-click on the project, choose Properties, select Debugging from the Configuration Properties in the left pane and add OMP_NUM_THREADS=value to the Environment field in the right pane). Other options are to add the num_threads() clause to the Parallel Region or to use OpenMP’s API, but typically I prefer to use the environment variable.
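
    For completeness, the two in-code alternatives can be sketched as follows. The function names are made up for this example; num_threads(), omp_set_num_threads() and omp_get_num_threads() are standard OpenMP. Note that the runtime is allowed to give you fewer threads than requested, so treat the request as an upper bound.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Size of the Team executing the current Parallel Region
// (1 if the code was compiled without OpenMP support).
static int current_team_size()
{
#ifdef _OPENMP
    return omp_get_num_threads();
#else
    return 1;
#endif
}

// Option 1: num_threads() clause on the Parallel Region itself.
int team_size_with_clause(int requested)
{
    int size = 1;
#pragma omp parallel num_threads(requested)
    {
        // Only the Master thread writes the shared variable.
#pragma omp master
        size = current_team_size();
    }
    return size;
}

// Option 2: OpenMP API call, affects the Parallel Regions that follow.
int team_size_with_api(int requested)
{
#ifdef _OPENMP
    omp_set_num_threads(requested);
#endif
    int size = 1;
#pragma omp parallel
    {
#pragma omp master
        size = current_team_size();
    }
    return size;
}
```

    The clause takes precedence over the API call, and the API call takes precedence over OMP_NUM_THREADS. That is one reason why I prefer the environment variable for debugging: you can change the thread count without touching the code.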

    Obstacle: When you “approach” a Parallel Region in the debugger and select “Step Over (F10)” just at the line at which the Parallel Region begins (line 05 in my example above), the debugger will continue at the next executable line right after the Parallel Region (line 28 in my example above). Of course it will execute the Parallel Region, but it will not go through it line by line. If you select “Step Into (F11)”, it will ask for the file fork.cpp, as it tries to debug the OpenMP runtime creating the Team of threads. The Workaround is pretty simple: Just set a breakpoint on the first executable line inside the Parallel Region.

    Following the example above, you would set a breakpoint in line 08 and select “Step Over (F10)” (as you are probably not interested in debugging Microsoft’s OpenMP implementation and presumably do not have the source for it). Once you have arrived there, the Call Stack window will show you something pretty similar to this:

    shot_debugging_openmp_callstack

    Let's now examine what we have here, starting at the bottom of the list:

    • Three lines beginning with “jacobi-omp.exe”: This is our program code. jacobi-omp.exe!main() is the program entry point of my example. All lines below it come from the initialization of the runtime library, the initialization of static variables and kernel functions that set up the program context. In 99.9% of the cases in which you are using a debugger you can happily ignore those. Inside main() the function Jacobi() is called, which contains the code snippet shown above with the Parallel Region.
    • Four lines beginning with “vcomp90d.dll”: This is Microsoft’s OpenMP runtime library. VCOMP presumably stands for Visual C/C++ OpenMP and 90 is the version number (Visual Studio 2008 is Visual Studio 9.0). Please note that many things I am telling you about this library are based on my experiments with it, as I do not have any sources other than MSDN (which is pretty sparse regarding OpenMP details).
      • _vcomp_fork() + InvokeThreadTeam(): These functions create or awake the Team of threads for the current Parallel Region. You might have noticed the if_test variable that equals 1: OpenMP allows for an if() clause that can be added to a Parallel Region and if that expression evaluates to false, the Parallel Region will be executed with a Team of only one thread. You will find that this variable correlates to the if() clause.
      • _vcomp::ParallelRegion::HandlerThreadFunc() + _vcomp::fork_help(): These functions start the execution of the Parallel Region and take care of setting up any Worksharing constructs. You might have noticed the index variable that equals 0: It correlates to the OpenMP thread id. The Master thread has the id zero, the other Team members have positive ids starting with one.
    • jacobi-omp.exe!Jacobi$omp$1(): This name results from the way in which OpenMP is implemented. The code inside a Parallel Region is outlined into a new routine which is named accordingly, so here you have the first OpenMP Parallel Region of the Jacobi() function. This routine just contains your code and it is called by all threads in the Team. When you switch to another thread, you will see that the user part of its call stack begins with such a routine, as its life begins somewhere in the OpenMP runtime. This is shown in the picture below for the second thread of a Team (see variable index).
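
    If you want to see the if_test and index values change in the debugger, a tiny test program helps. The following is only a sketch with a made-up function name: it runs a Parallel Region guarded by an if() clause and records the OpenMP thread ids of the Team members. How exactly these map to the variables in vcomp90d.dll is my observation, not documented behaviour.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// OpenMP thread id of the calling thread (0 without OpenMP support).
static int my_thread_id()
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Executes a Parallel Region with an if() clause and returns the ids of
// all threads that took part. With run_parallel == false the region is
// executed by a Team of exactly one thread, which then has id 0.
std::vector<int> thread_ids(bool run_parallel)
{
    std::vector<int> ids;
#pragma omp parallel if(run_parallel)
    {
        const int id = my_thread_id();   // good place for a breakpoint
        // push_back on a shared vector must not run concurrently.
#pragma omp critical
        ids.push_back(id);
    }
    return ids;
}
```

    Set a breakpoint on the line that reads the thread id and call the function once with true and once with false; in the second case only the Master thread should ever arrive there.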

    shot_debugging_openmp_callstack_slave

    I wrote “When you switch to another thread”: There is a Threads window with which you can switch between the different threads. If you don’t have it active, you can get it via the right-most button of the Debug toolbar (or via Ctrl+D,T):

    shot_debugging_toolbar 

    The Threads window is pretty straightforward: It shows you which threads are currently running and gives some additional information on those. In the screenshot below you have two threads, namely the “Main Thread” (which is the Master in OpenMP) and one additional “Worker Thread”. It also indicates the current state and location of all threads. You can select a thread by just double-clicking it.

    shot_debugging_window_threads

    Let's look at the first important feature: Control of individual threads. Once you run into a breakpoint, the debugger will suspend all threads. It suspends them as soon as the first thread has hit the breakpoint - the other threads might still be at a point before that (e.g. somewhere in the OpenMP runtime library, so that you cannot select them). So far, so good, but when you select “Step Over (F10)” for instance, this can take a while and during that time span all threads are running. If you do not want that, you can right-click on a thread (you can select multiple threads at once via the Ctrl key) and select “Freeze”. All frozen threads are marked with a pause symbol in the second column of the Threads window and the value in the Suspend column will be 1. Right-clicking on a thread and selecting “Thaw” will unfreeze it. Using this approach you can control each thread individually, which was the required capability.

    Please note: If a thawed thread encounters an OpenMP barrier, it will only continue once all threads have reached that barrier. If you manually drag a thread past a barrier and any other thread encounters that barrier, it will wait there forever (or until you have dragged the other threads back in front of the barrier). And do not forget the implicit barriers at the end of any Worksharing construct (e.g. in line 12 in the example above)!
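
    To make the implicit barrier more tangible, here is a small sketch (function name and data are invented for illustration). The second loop reads array elements that other threads have written in the first loop, so it relies on the implicit barrier at the end of the first Worksharing construct. This is the point at which a frozen thread will hold back all the others.

```cpp
#include <vector>

// Fills a[] in a first Worksharing loop, then combines neighbouring
// elements in a second one. b[0] and b[n-1] are left at 0.
std::vector<int> neighbour_sums(int n)
{
    std::vector<int> a(n), b(n, 0);
#pragma omp parallel
    {
#pragma omp for
        for (int i = 0; i < n; i++)
            a[i] = i;
        // Implicit barrier: no thread gets past this point before a[] is
        // complete. A nowait clause on the loop above would remove the
        // barrier and turn the reads below into a data race.

#pragma omp for
        for (int i = 1; i < n - 1; i++)
            b[i] = a[i - 1] + a[i + 1];
        // Another implicit barrier here, followed by the one at the end
        // of the Parallel Region.
    }
    return b;
}
```

    If you freeze one thread inside the first loop and let the others run, you will find them waiting in the OpenMP runtime at the end of that loop until you thaw the frozen one.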

    The second required feature was the examination of private variables. In order to examine this, create a breakpoint in line 22. The residual variable will have significantly different values right after the first few iterations of the j-loop. You can add this variable to a Watch window and switch between threads to examine the values, and Visual Studio correctly displays different values for different threads, which was the required capability. Besides that, you can of course use all the debugging capabilities you know from Visual Studio, such as editing the value of a variable.

    But to my feeling there is one thing missing: You cannot get a view of a private variable showing its values from all the different threads at once. This would clearly be a nice feature; it can be found in Unix debuggers with dedicated support for OpenMP (e.g. DDT or TotalView).
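
    A possible workaround, purely as a debugging aid and not something Visual Studio offers out of the box: let each thread copy its private value into a shared array indexed by the thread id, and put that array into the Watch window. The sketch below uses a made-up function and a trivial sum of squares instead of the Jacobi stencil.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Sums the squares 1..n with a reduction. In addition, every thread
// mirrors its private copy of `residual` into snapshot[thread id], so
// that all per-thread values are visible in a single shared array.
double sum_squares(int n, std::vector<double>& snapshot)
{
    int max_threads = 1;
#ifdef _OPENMP
    max_threads = omp_get_max_threads();
#endif
    snapshot.assign(max_threads, 0.0);

    double residual = 0.0;
#pragma omp parallel
    {
        int id = 0;
#ifdef _OPENMP
        id = omp_get_thread_num();
#endif
#pragma omp for reduction(+:residual)
        for (int i = 1; i <= n; i++)
        {
            residual += static_cast<double>(i) * i;  // private copy in here
            if (static_cast<std::size_t>(id) < snapshot.size())
                snapshot[id] = residual;             // debugging aid only
        }
    }
    return residual;  // sum over all private copies
}
```

    With a breakpoint inside the loop, watching snapshot shows the partial sums of all threads side by side. The extra writes to a shared array cost performance (neighbouring elements share cache lines), so I would remove them again once the bug is found.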

    Hint: The “Show Threads in Source” option is a nice feature and once it is enabled, you get a better overview of the source code location at which the threads currently are (look out for two small wavelike lines in the breakpoint-column). You can enable this view by clicking on the second right-most button of the Debug toolbar.

    shot_debugging_window_threads_showsource

     

    Hint: You might try to add a thread-dependent condition to a breakpoint. You cannot do that in Visual Studio, but you can use the Filter option to achieve that functionality. Right-click on a breakpoint and select “Filter…”. The help text provided in the dialog window that appears will guide you through setting thread-specific breakpoints. If you select “When Hit…”, you can also access a thread’s id and name in the action to be taken.

    I guess that this is enough for a blog post and I hope it gave a compact overview of the OpenMP-related debugging capabilities provided by Visual Studio. In case you are interested in giving it a try but are looking for a simple OpenMP example program, you can grab the Jacobi solver mentioned here from my SkyDrive (C++-omp-jacobi.zip) with a pre-configured Visual Studio 2008 solution.

    Comments

    Christian Terboven has turned off comments on this page.
