| Christian's profileChristian's SpaceBlogLists | Help |
|
6/3/2008 First experiments with Tasking in OpenMP 3.0OpenMP 3.0 is out, maybe a bit later than we hoped for, but I think that we got a solid standard document. At IWOMP 2008 a couple of weeks ago, there was an OpenMP tutorial which included a talk by Alex Duran (from UPC in Barcelona, Spain) on what is new in OpenMP 3.0 - which is really worth a look! My talk was on some OpenMP application experiences, including a case study on Windows, and I really think that many of our codes can profit from Tasks. Motivated by Alex' talk I tried the updated Nanos compiler and prepared a couple of examples for my lectures on Parallel Programming in Maastricht and Aachen. In this post I am walking through the simplest one: Computing the Fibonacci number in parallel. Before I dive into the code: If you really want to compute Fibonacci, use a different algorithmic approach. This version is taken just for the reason of simplicity... The recursive approach to compute the Fibonacci number n is: int main(int argc, char* argv[]) int fib(int n) Several approaches to parallelize such a recursive algorithm with OpenMP have been discussed in the past, including the use of Nested OpenMP. If you do it right, you will get the performance you are looking for, but for example the solution with Nested OpenMP typically requires vendor-specific control of the total number of threads created (OpenMP 3.0 has a solution for that as well). With Tasks in OpenMP 3.0, the code will look like this: int main(int argc, char* argv[]) int fib(int n) Let me explain that a bit. The main() function now includes a Parallel Region with a perfectly nested Single construct. The Parallel Region will create the team of threads that will eventually execute all the Tasks. The Single construct has the effect, that only one thread will enter the fib() function, while the other threads will wait at the end of the Single construct. That single thread will then encounter the Task constructs, thus only one thread is creating the two tasks per recursion level. If we would not have had the Single construct, all the Team's threads would have entered fib() and each thread would have created two tasks - which is not the desired behavior. So the first lession is: Task is not a worksharing construct in the sense you know it. Inside fib(), I am writing shared(x) and shared(y). Isn't shared the default in OpenMP? Yes it is, but for the worksharing constructs of 2.5. With the Task construct, the following rules apply:
Without shared(x, y), x and y would be firstprivate (because x and y are local in the Parallel Region and thus do not inherit the shared attribute) for the Task, thus a new instance would be created for each task and that would prohibit building the correct sum at each recursion level. And why do we need this taskwait here? This construct causes the encountering task to suspend until all child tasks are completed, but only direct childs and not descendants. We need this synchronization here, because otherwise x and y (local variables of automatic storage location!) might have gone out of scope before the created tasks have been executed. Ok, let's finally take a brief look at the performance. Again the disclaimer: This algorithm is bad. Besides that: If you compile this code squentially, it will run faster than with all the threads you can take - because in each recursion step only two substractions and one addition are computed, which is way cheaper than creating a task or a thread! But this code is so well-suited for explaining Tasking... I am comparing three versions here:
If the actual compute function is more expensive (from a compute point of view), the difference between version omp-v2 and omp-v3 should vanish. I think there is a lot to say and explain about Tasks and many people started looking into this and bringing Tasking to the public, which is great! There is even more to say for this simple example, but for today I think this is enough, let me just give you the source of version omp-v3: int fib(int n)
Trackbacks (4)The trackback URL for this entry is: http://terboven.spaces.live.com/blog/cns!EA3D3C756483FECB!316.trak Weblogs that reference this entry
|
|
|