Thursday, 20 March 2014

Parallel Computing in .NET Framework 4.0 - Part 2


Computers in the near future are expected to have significantly more cores. To take advantage of this hardware, you can parallelize your code to distribute work across multiple processors. In the past, parallelization required low-level manipulation of threads and locks.
Visual Studio 2010 and the .NET Framework 4 enhance support for parallel programming by providing a new runtime, new class library types, and new diagnostic tools. These features simplify parallel development so that you can write efficient, fine-grained, and scalable parallel code in a natural idiom without having to work directly with threads or the thread pool.
At a higher level, the .NET Framework 4.0 provides two major libraries for parallel programming. These are the Task Parallel Library (TPL) and the parallel version of Language-Integrated Query (PLINQ).
Note: Starting with the .NET Framework 4, the TPL is the preferred way to write multithreaded and parallel code. However, not all code is suitable for parallelization; for example, if a loop performs only a small amount of work on each iteration, or it doesn't run for many iterations, then the overhead of parallelization can cause the code to run more slowly.

To know more about Parallel Computing, please visit:
http://msdn.microsoft.com/en-us/library/dd460693(v=vs.100).aspx

Task Parallel Library [TPL]

The Task Parallel Library (TPL) is a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces in the .NET Framework version 4. The purpose of the TPL is to make developers more productive by simplifying the process of adding parallelism and concurrency to applications. The TPL scales the degree of concurrency dynamically to most efficiently use all the processors that are available.
In addition, the TPL handles the partitioning of the work, the scheduling of threads on the ThreadPool, cancellation support, state management, and other low-level details. By using TPL, you can maximize the performance of your code while focusing on the work that your program is designed to accomplish.
Starting with the .NET Framework 4, the TPL is the preferred way to write multithreaded and parallel code. The Task Parallel Library provides parallelism based upon both data and task decomposition. Data parallelism is simplified with new versions of for loop and foreach loop that automatically decompose the data and separate the iterations onto all available processor cores.
Task parallelism is provided by new classes that allow tasks to be defined using lambda expressions. You can create tasks and let the .NET framework determine when they will execute and which of the available processors will perform the work.

Data Parallelism (Task Parallel Library)

Data parallelism is usually applied to large data-processing tasks. It applies to operations that are performed concurrently (that is, in parallel) on elements in a source collection. Data parallelism is supported by several overloads of the For and ForEach methods in the System.Threading.Tasks.Parallel class.
In data parallel operations, the source collection is partitioned so that multiple threads can operate on different segments concurrently. The System.Threading.Tasks.Parallel class provides method-based parallel implementations of for and foreach loops (For and For Each in Visual Basic).
You write the loop logic for a Parallel.For or Parallel.ForEach loop much as you would write a sequential loop. You do not have to create threads or queue work items. In basic loops, you do not have to take locks. The TPL handles all the low-level work for you.
Note: Many examples use lambda expressions to define delegates in TPL. If you are not familiar with lambda expressions in C# or Visual Basic, visit http://msdn.microsoft.com/en-us/library/dd460699(v=vs.100).aspx for Lambda Expressions in PLINQ / TPL.
Lambda Expressions (C#): A lambda expression is an anonymous function that can contain expressions and statements, and can be used to create delegates or expression tree types. All lambda expressions use the lambda operator =>, which is read as "goes to". The left side of the lambda operator specifies the input parameters (if any) and the right side holds the expression or statement block. The lambda expression x => x * x is read "x goes to x times x."
The => operator has the same precedence as assignment (=) and is right-associative. Lambdas are used in method-based LINQ queries as arguments to standard query operator methods such as Where.
1. Parallel.For: Here we consider the parallel for loop. It provides some of the functionality of the basic for loop, allowing you to create a loop with a fixed number of iterations. If multiple cores are available, the iterations can be decomposed into groups that are executed in parallel.
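As a minimal sketch (names and the square-computation workload are my own, chosen only for illustration), the following shows a Parallel.For loop in which each iteration writes to a distinct array slot, so no synchronization is needed:

```csharp
using System;
using System.Threading.Tasks;

class ParallelForDemo
{
    static void Main()
    {
        const int n = 10;
        long[] squares = new long[n];

        // Each iteration writes only its own slot, so no locking is needed.
        Parallel.For(0, n, i =>
        {
            squares[i] = (long)i * i;
        });

        // The results are deterministic even though the iteration order is not.
        Console.WriteLine(string.Join(", ", squares));
    }
}
```

The call blocks until all iterations have completed, just as a sequential for loop would.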
Potential Pitfalls in the Parallel.For loop: It is very easy to implement and work with Parallel.For, but as developers we need to watch out for some potential, and usually unanticipated, difficulties. There are various issues that can be encountered. Some cause immediately noticeable bugs in your code. Some cause subtle bugs that occur only rarely and are difficult to find. Others simply lower the performance of parallel loops.
·  Shared State: Parallel loops are ideal when the individual iterations are independent. When the iterations share mutable state, synchronization is necessary to ensure that errors are not introduced by parallel processes using inconsistent values. This usually requires the introduction of locking mechanisms that slow the performance of the software or changes to algorithms to remove shared state.
·  Dependent Iterations: With sequential loops you can assume that all earlier iterations will be completed before the current execution. With parallel loops, as seen in the first example, the order is usually changed. This means that you should not have code within a parallel loop that depends upon another iteration's result.
·  Excessive Parallelism: In the general case, parallelism increases the performance of loops. However, in many cases it is possible to overuse parallelism, which may decrease performance.
·  Calls to Thread-Safe Methods: If the methods that you call from within a loop are thread-safe, you should not generate synchronization problems through their use. However, if the methods use locking to achieve thread-safety, you may spoil the performance of your software as multiple cores become blocked when your parallel loop executes.
·  Myth of Parallelism: A common myth is that a loop that executes in parallel will always increase performance. On single-core processors, parallel loops will generally execute sequentially. Even on multiprocessor systems it is possible for a loop to run in series. If your code requires that a later iteration completes before an earlier one can continue, the loop will deadlock.
2. Parallel.ForEach: The parallel ForEach loop provides a parallel version of the standard, sequential foreach loop. Each iteration processes a single item from a collection. However, the parallel nature of the loop means that multiple iterations may be executing at the same time on different processors or processor cores. This opens up the possibility of synchronization problems so the loop is ideally suited to processes where each iteration is independent of the others.
A ForEach loop works like a For loop. The source collection is partitioned and the work is scheduled on multiple threads based on the system environment. The more processors on the system, the faster the parallel method runs. For some source collections, a sequential loop may be faster, depending on the size of the source, and the kind of work being performed.
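The following sketch (the word list and ConcurrentBag aggregation are my own illustrative choices) shows a Parallel.ForEach loop where each iteration is independent, with a thread-safe collection absorbing the results:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

class ParallelForEachDemo
{
    static void Main()
    {
        var words = new[] { "parallel", "foreach", "loop" };

        // Iterations are independent: each one measures a single word.
        // ConcurrentBag<T> is thread-safe, so concurrent Add calls are fine.
        var lengths = new ConcurrentBag<int>();
        Parallel.ForEach(words, word => lengths.Add(word.Length));

        Console.WriteLine(lengths.Sum()); // 8 + 7 + 4 = 19
    }
}
```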

We have many more points to study; I have given you a basic overview of Data Parallelism using the for and foreach loops. Below is a list of some useful points that you can study yourself and create some examples for:
Termination of Parallel Loops
  • Parallel Loop State
  • ParallelLoopState.Break
  • LowestBreakIteration
  • ParallelLoopState.Stop
Synchronization in Parallel Loops
  • Aggregation in Sequential Loops
  • Aggregation in Parallel Loops
  • Synchronization Using Locking
  • Local Loop State in For Loops
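To give a taste of the last point in the list, here is a sketch of the Parallel.For overload with thread-local state (the summation workload is my own example). Each worker accumulates a private subtotal, so synchronization happens once per worker rather than once per iteration:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class LocalStateDemo
{
    static void Main()
    {
        long total = 0;

        // localInit creates a per-thread subtotal; the body updates it without
        // locking; localFinally merges each subtotal into the shared total.
        Parallel.For(0, 1000,
            () => 0L,                              // per-thread initial value
            (i, state, subtotal) => subtotal + i,  // loop body
            subtotal => Interlocked.Add(ref total, subtotal)); // merge step

        Console.WriteLine(total); // sum of 0..999 = 499500
    }
}
```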

Task Parallelism (Task Parallel Library)

The Task Parallel Library (TPL), as its name implies, is based on the concept of the task. The term task parallelism refers to one or more independent tasks running concurrently. A task represents an asynchronous operation, and in some ways it resembles the creation of a new thread or ThreadPool work item, but at a higher level of abstraction. Tasks provide two primary benefits:
·    More efficient and more scalable use of system resources: Behind the scenes, tasks are queued to the ThreadPool, which has been enhanced with algorithms (like hill-climbing) that determine and adjust to the number of threads that maximizes throughput. This makes tasks relatively lightweight, and you can create many of them to enable fine-grained parallelism. To complement this, widely-known work-stealing algorithms are employed to provide load-balancing.
·   More programmatic control than is possible with a thread or work item: Tasks and the framework built around them provide a rich set of APIs that support waiting, cancellation, continuations, robust exception handling, detailed status, custom scheduling, and more.
Note: For both of these reasons, in the .NET Framework 4, tasks are the preferred API for writing multi-threaded, asynchronous, and parallel code.
Why do we need Task Parallelism? - Some algorithms do not lend themselves to data decomposition because they do not repeat the same action. However, they may be candidates for task decomposition. This is where an algorithm is broken into sections that can be executed independently. Each section is considered to be a separate task that may be executed on its own processor core, with several tasks running concurrently. This type of decomposition is usually more difficult to implement and sometimes requires that an algorithm be changed substantially or replaced entirely to minimize elements that must be executed sequentially and to limit shared mutable values.
1. Creating and Running Tasks Implicitly [Parallel.Invoke]: The Parallel.Invoke method provides a convenient way to run any number of arbitrary statements concurrently. Just pass in an Action delegate for each item of work. The easiest way to create these delegates is to use lambda expressions.
The Parallel.Invoke method provides a simple way in which a number of tasks may be created and executed in parallel. As with other methods in the Task Parallel Library, Parallel.Invoke provides potential parallelism. If no benefit can be gained by creating multiple threads of execution, the tasks will run sequentially.
To use Parallel.Invoke, the tasks to be executed are provided as delegates. The method uses a parameter array for the delegates to allow any number of tasks to be created. The tasks are usually defined using lambda expressions but anonymous methods and simple delegates may be used instead. Once invoked, all of the tasks are executed before processing continues with the command following the Parallel.Invoke statement. The order of execution of the individual delegates is not guaranteed so you should not rely on the results of one operation being available for one that appears later in the parameter array.
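A minimal sketch of Parallel.Invoke with three independent lambda-expression delegates (the messages are my own placeholders):

```csharp
using System;
using System.Threading.Tasks;

class InvokeDemo
{
    static void Main()
    {
        // Three independent actions; Parallel.Invoke returns only after all
        // of them finish. Their relative order is not guaranteed.
        Parallel.Invoke(
            () => Console.WriteLine("task A"),
            () => Console.WriteLine("task B"),
            () => Console.WriteLine("task C"));

        Console.WriteLine("all done");
    }
}
```

Note that "all done" is always printed last, because the call blocks until every delegate has completed.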
Exception Handling with Parallel.Invoke: In the case of Parallel.Invoke, it is guaranteed that every task will be executed. Each task will either exit normally or throw an exception. All of the thrown exceptions are gathered together and held until all of the tasks have stopped, at which point an AggregateException containing all of the exceptions is thrown. The individual errors can be found within the InnerExceptions property.
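A sketch of that behavior (the exception message and delegates are my own illustration): one delegate throws, the other still runs, and the failure surfaces as an AggregateException after both have stopped:

```csharp
using System;
using System.Threading.Tasks;

class InvokeExceptionDemo
{
    static void Main()
    {
        try
        {
            Parallel.Invoke(
                () => { throw new InvalidOperationException("first failure"); },
                () => { /* this delegate still runs and completes normally */ });
        }
        catch (AggregateException ae)
        {
            // Every exception thrown by the delegates is collected here.
            foreach (var inner in ae.InnerExceptions)
                Console.WriteLine(inner.Message);
        }
    }
}
```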
2.  Creating and Running Tasks Explicitly: If we need more control over parallel tasks, we can use the Task class. This allows us to explicitly generate parallel tasks. The code needed for explicit task creation is slightly more complex than that for Parallel.Invoke but the benefits outweigh this disadvantage.
A task is represented by the System.Threading.Tasks.Task class. A task that returns a value is represented by the System.Threading.Tasks.Task<TResult> class, which inherits from Task. The task object handles the infrastructure details, and provides methods and properties that are accessible from the calling thread throughout the lifetime of the task. For example, you can access the Status property of a task at any time to determine whether it has started running, ran to completion, was canceled, or has thrown an exception. The status is represented by a TaskStatus enumeration.
When you create a task, you give it a user delegate that encapsulates the code the task will execute. The delegate can be expressed as a named delegate, an anonymous method, or a lambda expression. Lambda expressions can contain a call to a named method.
The Task class provides a wrapper for an Action delegate. The delegate describes the code that you wish to execute and the wrapper provides parallelism. A simple way to create a task is to use the constructor that has a single parameter, which accepts the delegate that you wish to execute. Tasks do not execute immediately after being created. To start a task, call its Start method.
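The constructor-then-Start pattern, together with the Status property mentioned above, can be sketched like this (the printed message is my own placeholder):

```csharp
using System;
using System.Threading.Tasks;

class TaskDemo
{
    static void Main()
    {
        // The constructor wraps an Action delegate; nothing runs yet.
        var task = new Task(() => Console.WriteLine("running on a worker thread"));

        Console.WriteLine(task.Status); // Created
        task.Start();                   // now it is queued to the ThreadPool
        task.Wait();                    // block until it completes

        Console.WriteLine(task.Status); // RanToCompletion
    }
}
```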
3.  Many More….: Working with the Task class is a complex and big topic. Above I have given you some examples and explanation, but I cannot cover all the related topics in this session; we would need a separate session to cover all the features and functionality. Here is a list of topics that you can take as your task: study them and create some examples so that you get a better understanding: [MSDN Link: http://msdn.microsoft.com/en-us/library/dd537609(v=vs.100).aspx]
  • Waiting on Tasks: The System.Threading.Tasks.Task type and System.Threading.Tasks.Task<TResult> type provide several overloads of a Task.Wait and Task<TResult>.Wait method that enable you to wait for a task to complete. In addition, overloads of the static Task.WaitAll and Task.WaitAny method let you wait for any or all of an array of tasks to complete.
You can also have a look on Exception Handling if any exception is thrown during the execution of a task and Adding Timeouts for long running task.
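A sketch of the waiting APIs (the Sleep durations are my own illustrative stand-ins for real work):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class WaitDemo
{
    static void Main()
    {
        Task slow = Task.Factory.StartNew(() => Thread.Sleep(200));
        Task fast = Task.Factory.StartNew(() => Thread.Sleep(10));

        // WaitAny returns the index of the first task in the array to finish.
        int first = Task.WaitAny(slow, fast);
        Console.WriteLine("index of first finished task: " + first);

        // WaitAll blocks until every task in the array has completed.
        Task.WaitAll(slow, fast);

        // Wait with a timeout returns false if the task is still running
        // when the timeout expires; here it completes well within a second.
        Task quick = Task.Factory.StartNew(() => { });
        Console.WriteLine(quick.Wait(1000)); // True
    }
}
```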
  • Task Results: The Task<TResult> generic class inherits much of its functionality from its non-generic counterpart. Tasks are created using a delegate, often a lambda expression, started using the Start method and executed in parallel where it is efficient to do so. When the delegate is a Func<TResult>, the task produces a return value, which can be accessed by reading the task's Result property.
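A sketch of a Task<TResult> whose Result blocks until the value is ready (the summation workload is my own example):

```csharp
using System;
using System.Threading.Tasks;

class ResultDemo
{
    static void Main()
    {
        // Task<int> runs a Func<int>; reading Result blocks the calling
        // thread until the task has produced its value.
        Task<int> sum = Task<int>.Factory.StartNew(() =>
        {
            int total = 0;
            for (int i = 1; i <= 100; i++) total += i;
            return total;
        });

        Console.WriteLine(sum.Result); // 5050
    }
}
```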
  • Continuation Tasks: When you are writing software that has tasks that execute in parallel, it is common to have some parallel tasks that depend upon the results of others. These tasks should not be started until the earlier tasks, known as antecedents, have completed. Before the introduction of the Task Parallel Library (TPL), this type of interdependent thread execution would be controlled using callbacks.
The Task.ContinueWith method and Task<TResult>.ContinueWith method let you specify a task to be started when the antecedent task completes. The continuation task's delegate is passed a reference to the antecedent, so that it can examine its status. In addition, a user-defined value can be passed from the antecedent to its continuation in the Result property, so that the output of the antecedent can serve as input for the continuation.
You can also check some topic related to Using Task Results in Continuations, Exceptions Handling with Continuations Task, Creating Continuations with Multiple Antecedents, and Multiple Continuations of a Single Antecedent.
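A minimal continuation sketch (the arithmetic is my own placeholder): the antecedent produces a value, and the continuation receives the completed antecedent and reads its Result:

```csharp
using System;
using System.Threading.Tasks;

class ContinuationDemo
{
    static void Main()
    {
        Task<int> antecedent = Task<int>.Factory.StartNew(() => 6 * 7);

        // The continuation does not start until the antecedent completes;
        // it is handed the antecedent itself as the parameter t.
        Task<string> continuation = antecedent.ContinueWith(
            t => "answer: " + t.Result);

        Console.WriteLine(continuation.Result); // answer: 42
    }
}
```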
  • Nested Task: When user code that is running in a task creates a new task and does not specify the AttachedToParent option, the new task is not synchronized with the outer task in any special way. Such a task is called a detached nested task.
Tasks may be nested in this manner many levels deep. The inner tasks are known as child tasks, of which there are two types. The first type is the detached child task, also known as a nested task. The other type is the attached child task, generally known simply as a child task.
When you create nested tasks there is no link between a nested task and its parent. Nested tasks are completely independent, reporting a separate status and throwing their own exceptions.
  • Child Tasks: When user code that is running in a task creates a task with the AttachedToParent option, the new task is known as a child task of the originating task, which is known as the parent task. You can use the AttachedToParent option to express structured task parallelism, because the parent task implicitly waits for all child tasks to complete.
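The difference between the two can be sketched as follows (the Sleep durations are my own stand-ins for work): the parent's Wait returns only after the attached child completes, while the detached nested task is ignored:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ChildTaskDemo
{
    static void Main()
    {
        Task parent = Task.Factory.StartNew(() =>
        {
            // Attached child: the parent will not complete until it does.
            Task.Factory.StartNew(() =>
            {
                Thread.Sleep(100);
                Console.WriteLine("attached child finished");
            }, TaskCreationOptions.AttachedToParent);

            // Detached nested task: the parent does not wait for it.
            Task.Factory.StartNew(() => Thread.Sleep(500));
        });

        parent.Wait(); // returns after the attached child, not the detached one
        Console.WriteLine("parent finished");
    }
}
```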
  • Canceling Tasks: The Task class supports cooperative cancellation and is fully integrated with the System.Threading.CancellationTokenSource class and the System.Threading.CancellationToken class, which are new in the .NET Framework version 4. Many of the constructors in the System.Threading.Tasks.Task class take a CancellationToken as an input parameter. Many of the StartNew overloads also take a CancellationToken.
You can create the token, and issue the cancellation request at some later time, by using the CancellationTokenSource class. Pass the token to the Task as an argument, and also reference the same token in your user delegate, which does the work of responding to a cancellation request.
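A cooperative-cancellation sketch (the polling loop is my own illustration): the token is passed both to the Task and into the delegate, which polls it and throws when cancellation is requested:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class CancelDemo
{
    static void Main()
    {
        var cts = new CancellationTokenSource();
        CancellationToken token = cts.Token;

        Task worker = Task.Factory.StartNew(() =>
        {
            while (true)
            {
                // Cooperative cancellation: the delegate polls the token and
                // throws OperationCanceledException when asked to stop.
                token.ThrowIfCancellationRequested();
                Thread.Sleep(10);
            }
        }, token);

        cts.Cancel();
        try { worker.Wait(); }
        catch (AggregateException)
        {
            // Because the task observed the same token it was created with,
            // it transitions to the Canceled state rather than Faulted.
            Console.WriteLine(worker.Status); // Canceled
        }
    }
}
```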
4. What is Task ID? - Every task receives an integer ID that uniquely identifies it in an application domain and that is accessible by using the Id property. The ID is useful for viewing task information in the Visual Studio debugger Parallel Stacks and Parallel Tasks windows. The ID is lazily created, which means that it isn't created until it is requested; therefore a task may have a different ID each time the program is run.
