Thursday, 27 November 2008

Parallel Processing Design Patterns

This is the notes I have written for a number of parallel processing design patterns. Information is largely from http://wing.cs.uiuc.edu/group/patterns/patterns.html, but my general understanding of the designs comes from various other sources.

Repositories
- Completely independent operations are applied to a centralized repository of data, with no restriction on order.
- The only restriction is that data can only be read/written by one UE at any one time.
- The simplest design. Components are completely seperated other than communication with the main repository.
- Noise is introduced to the results thanks to the non-deterministic nature of having no synchronisation between the operations. Results can be hard to reproduce.


Pipe and Filter
- Used for when multiple data sets must be processed using a sequence of ordered operations.
- The operations are split onto separate UEs. Parallelization comes into play when the first set of data is finished, and moves to the second operation. The first operation can then be ran on the second set of data.
- Eventually all operations will be running on a data set.
- The main problem to watch out for is that a delay in one operation will propagate throughout the entire system. The following operation will have nothing to do, and the previous operation cannot pass on the data. This means that the load of each UE must be balanced carefully so all operations take roughly the same amount of time.


Master and Worker
- The same computation must be performed on each data set, but completely independently of each other (an example of an Embarassingly Parallel problem)
- Parallelizing is easy, simply process multiple data sets at the same time.
- The master gives tasks to the workers once they are done with the previous task. All workers are almost always busy, which implies optimal load balancing.
- If the number of workers is large, the communication overhead for the master can become large. This can be combated by increasing the size of tasks, which will lead to a more coarse granulation (more processing per communication).


Replicable
- The same computation must be performed on each data set, completely seperately of each other. The computation requires a lot of reading from a set of global data.
- Similar to Master and Worker except for the reading of global data. The solution is similar, but in addition each UE stores a copy of this global data, in order that it can access it quickly without having to wait for other UEs to stop reading the data.
- Replicating the data may require a very large amount of memory, if the data set is large and the number of workers also large.


Divide & Conquer
- The operation required can be split into smaller suboperations and the results merged into the final solution later on. Suboperations are independent of each other, only the final solution is dependent on all of the suboperations.
- Ideally each suboperation should be of equal size to maintain load balance.
- Fits naturally to many problems such as sorting or searching, especially merge sort.


Geometric Decomposition
- Computation needs to be performed on a set of data, which are semi-independent from each other. For example, the data is in an array where the calculation is dependent on the neighbouring cells.
- The data is divided between processess, each only accessing their local data. On the data which is required by other processes is sent (in the array example, the cells on the border between processor’s areas). The network diagram would be a geometric shape related to how the array was split up.
- The communication overhead can be quite large, so care must be taken that this overhead doesn’t outweigh the processing advantage gained.

Tomorrow I will look into how these patterns could relate to some of the volume rendering techniques I have learnt about.

No comments: