Thursday, 20 March 2014

Parallel Computing in .NET Framework 4.0 - Part 4

Pros and Cons of Parallel Programming

The common problems faced when developing parallel code are the same as those seen when using multiple threads. Some of these problems are lessened by the parallelism classes of the .NET framework, but they are not removed completely, so they are worth mentioning.
·    Synchronisation: A class of problem that you will encounter with parallel programming. When you start a number of tasks simultaneously, at some point those tasks must "join up", perhaps to combine their results. Synchronisation controls this process, ensuring that the result of a task is not used until that task has completed.
Sometimes we need synchronisation to prevent parallel tasks from interfering with each other. If you have code that is not thread-safe, it may be necessary to prevent two tasks from executing that code simultaneously.
Ideally, shared state would be accessed using lock-free algorithms, but in reality many problems have no such solution, so locking mechanisms are used that stop code or data from being accessed until the thread holding the lock releases it. This can severely affect performance, especially when using a large number of processors that all need to access the locked code.
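As a minimal sketch of the locking idea, the following example (the names `_sync` and `_total` are my own, not from the article) uses C#'s `lock` statement to protect a shared variable that would otherwise not be safe to update from several tasks at once:

```csharp
using System;
using System.Threading.Tasks;

class LockExample
{
    // A dedicated object used only for locking.
    private static readonly object _sync = new object();
    private static int _total;

    static void Main()
    {
        Parallel.For(0, 1000, i =>
        {
            // Only one task at a time may enter this section; any
            // others block until the lock is released.
            lock (_sync)
            {
                _total += i;
            }
        });
        Console.WriteLine(_total);  // 499500, the sum of 0..999
    }
}
```

Note that every piece of code touching `_total` must take the same lock; the lock itself does not know which data it protects.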
·   Race Conditions: A race condition occurs when parallel tasks depend on shared data, generally when the synchronisation around that data is not implemented correctly. One task may perform operations on shared data that temporarily leave a value in an inconsistent state. If another task uses the inconsistent data, unpredictable behavior can occur. Worse, such errors may occur only rarely and be difficult to predict or reproduce.
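A sketch of the classic lost-update race, assuming the hypothetical counters `unsafeCount` and `safeCount`: a plain `++` is a read-modify-write that two tasks can interleave, while `Interlocked.Increment` performs the same update atomically:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class RaceExample
{
    static void Main()
    {
        int unsafeCount = 0, safeCount = 0;

        Parallel.For(0, 100000, i =>
        {
            // Not atomic: two tasks can read the same value, and one
            // of the increments is then lost.
            unsafeCount++;

            // Atomic increment; no update is ever lost.
            Interlocked.Increment(ref safeCount);
        });

        // unsafeCount is typically below 100000 on a multi-core
        // machine; safeCount is always exactly 100000.
        Console.WriteLine("{0} {1}", unsafeCount, safeCount);
    }
}
```

Because the faulty result depends on timing, running this repeatedly may give different values for `unsafeCount` each time, which is exactly why such bugs are hard to reproduce.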

·   Blocking: Locking is used to avoid synchronisation problems. A task can request a lock to prevent a section of code from being entered, or a shared state variable from being accessed, by another task. This technique can be used to synchronise threads and prevent race conditions. When a task requests a lock that has already been granted to another thread, the requesting task stops executing and waits for the lock to be released; the stopped thread is said to be blocked. Usually a blocked thread will eventually obtain the lock and continue working as normal. However, with excessive blocking some processors may become idle as they are starved of work, which impacts performance.
·   Deadlocking: Deadlocking is an extreme state of blocking involving two or more processes. In the simplest situation you may have two tasks that are each blocked by the other. As each task is blocked and will not continue until the other has released its lock, the deadlock cannot be broken and the two tasks will potentially be blocked forever.
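The two-task deadlock described above can be sketched as follows (the lock names and timings are illustrative only): each task acquires one lock and then waits for the lock held by the other, so neither can proceed. The usual remedy is to always acquire locks in the same order.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class DeadlockExample
{
    static readonly object LockA = new object();
    static readonly object LockB = new object();

    static void Main()
    {
        // Each task takes one lock, then tries to take the other in
        // the opposite order: a classic deadlock.
        var t1 = Task.Factory.StartNew(() =>
        {
            lock (LockA) { Thread.Sleep(100); lock (LockB) { } }
        });
        var t2 = Task.Factory.StartNew(() =>
        {
            lock (LockB) { Thread.Sleep(100); lock (LockA) { } }
        });

        // Wait with a timeout so the demonstration terminates even
        // though the tasks themselves never will.
        bool finished = Task.WaitAll(new[] { t1, t2 }, 2000);
        Console.WriteLine(finished ? "Completed" : "Deadlocked");
    }
}
```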
With extra care and a good understanding of these pitfalls, you can write parallel code that avoids them. Now I will discuss some of the benefits we can gain by using parallel programming:
·   The new parallel programming functionality in the .NET framework provides several benefits that make it the preferred choice over standard multi-threading. When manually creating threads, you may create too many, leading to excessive task-switching operations that affect performance. You may also create too few, leaving processors idle. These are some of the key problems that the new classes aim to address.
·   Both the TPL and PLINQ provide automatic data decomposition. Although you can control decomposition, usually the standard behavior is sufficient. This behavior is intelligent. For example, after decomposition and allocation of work, the activity of each processor is continually monitored. If it turns out that the work assigned to one processor is more time-consuming than that of another, a work-stealing algorithm is used to transfer work from the busy processor to the under-utilised one.
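A minimal PLINQ sketch of automatic decomposition: the range is partitioned across the available cores without any explicit thread management, and the runtime balances the partitions for you.

```csharp
using System;
using System.Linq;

class DecompositionExample
{
    static void Main()
    {
        // PLINQ splits the range into partitions automatically and
        // spreads them over the available cores; no thread count or
        // chunk size is specified in the code.
        long sumOfSquares = ParallelEnumerable.Range(1, 1000)
            .Select(n => (long)n * n)
            .Sum();

        Console.WriteLine(sumOfSquares);  // 333833500
    }
}
```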
·    It is important to understand that the new libraries provide potential parallelism. With standard multi-threading, when you launch a new thread it immediately starts its work. This might not be the most efficient way of utilizing the available processors. The parallelism libraries may launch new threads if processor cores are available. If they are not, tasks may be postponed until a core becomes free or until the result of the operation is actually needed.
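Potential parallelism can be sketched with a single task (the `Compute` method here is a hypothetical stand-in for CPU-bound work): `StartNew` only queues the work, and the scheduler decides when, and on which thread, it actually runs.

```csharp
using System;
using System.Threading.Tasks;

class PotentialParallelism
{
    static int Compute(int n)
    {
        // A stand-in for some CPU-bound work.
        int result = 0;
        for (int i = 0; i < n; i++) result += i % 7;
        return result;
    }

    static void Main()
    {
        // Queues the work; it may start immediately on a free core,
        // or later if all cores are busy.
        Task<int> task = Task.Factory.StartNew(() => Compute(1000000));

        // Asking for the Result forces completion; if the task has
        // not started yet, it may even run inline on this thread.
        Console.WriteLine(task.Result);
    }
}
```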

·   Finally, the new libraries mean you need not worry about the number of cores available now or on future computers. All of the available cores will be utilised as required. If the code is executed on a single-processor machine, it will be mostly executed sequentially. A little overhead is introduced by the parallelism libraries, so parallel code running on a single-core machine will run more slowly than purely sequential code. However, this impact is minor when compared with the benefits gained.
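To illustrate this scaling, the same code below runs unchanged on any machine; `Environment.ProcessorCount` is shown only to make the point, and is never needed by the parallel query itself:

```csharp
using System;
using System.Linq;

class ScalingExample
{
    static void Main()
    {
        // Informational only: the query below adapts to however many
        // cores exist, including just one.
        Console.WriteLine("Cores: {0}", Environment.ProcessorCount);

        double total = Enumerable.Range(1, 10000)
            .AsParallel()
            .Sum(n => Math.Sqrt(n));

        Console.WriteLine(total);
    }
}
```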
