Friday, 28 November 2008

Friday 28th November Meeting

The meeting today was very useful and went quickly.

The main thing I need to look at in preparation for writing parallelised code is how the hardware will work. For example, how does the memory cache work: is it a global one, or is there one for each thread? Also, how will memory sharing work on an architecture with multiple CPUs (rather than simply multiple cores on the same CPU)?

The cache findings will govern how the algorithms will work best. For example, with the ray tracer, if the memory cache is global it would be best for each thread to work on different pixels within the same area of the image, rather than each being assigned a completely separate section of the image to work on.
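To make the two options concrete for myself, here is a rough sketch of how pixels could be mapped to threads in each case. The function names and the fixed 2x2 tile are my own placeholders and not anything from the actual ray tracer; it only shows the mapping, not the rendering.

```python
def banded_owner(x, y, height, n_threads):
    """Separate sections: each thread owns one horizontal band of rows."""
    rows_per_band = -(-height // n_threads)  # ceiling division
    return y // rows_per_band


def interleaved_owner(x, y, tile_w=2, tile_h=2):
    """Same area: threads own neighbouring pixels inside every small tile.

    Assumes the number of threads equals tile_w * tile_h.
    """
    return (x % tile_w) + tile_w * (y % tile_h)


if __name__ == "__main__":
    # Who renders the four pixels in the top-left 2x2 corner of an 8x8 image?
    corner = [(0, 0), (1, 0), (0, 1), (1, 1)]
    print([banded_owner(x, y, 8, 4) for x, y in corner])
    print([interleaved_owner(x, y) for x, y in corner])
```

With the banded mapping all four neighbouring pixels belong to one thread, while the interleaved mapping spreads them across all four threads, so the threads would touch nearby data at roughly the same time if they stay in step.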

This then adds the question of whether it would be faster to synchronise the threads in order to maintain the cache benefits, or let them work on their own without the synchronisation overhead but potentially getting out of sequence.

Something else to consider is how the volume data is stored. Storing small cubes of the volume sequentially, rather than using a straight linear storage of the data, could reduce cache misses, as the cache will then encompass the area of the volume surrounding the last sample point rather than just a single row of data. I stored data like this for another project recently and saw a speed increase, so it will be worth trying out here.
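As a reminder of what the cube storage means for indexing, here is a minimal sketch comparing the two layouts. It assumes a cubic volume whose side is a multiple of the cube size, and the names (`linear_index`, `bricked_index`, `brick`) are just ones I picked for illustration.

```python
def linear_index(x, y, z, dim):
    """Offset of voxel (x, y, z) in a dim^3 volume stored row by row."""
    return x + dim * (y + dim * z)


def bricked_index(x, y, z, dim, brick=4):
    """Offset when the volume is stored as brick^3 cubes, one cube after another."""
    assert dim % brick == 0, "sketch assumes dim is a multiple of brick"
    bricks_per_axis = dim // brick
    # Which cube the voxel falls in, and where it sits inside that cube.
    bx, by, bz = x // brick, y // brick, z // brick
    lx, ly, lz = x % brick, y % brick, z % brick
    brick_id = bx + bricks_per_axis * (by + bricks_per_axis * bz)
    local = lx + brick * (ly + brick * lz)
    return brick_id * brick ** 3 + local


if __name__ == "__main__":
    dim = 8
    # Distance in memory between a voxel and its neighbour one step along z.
    print(linear_index(0, 0, 1, dim) - linear_index(0, 0, 0, dim))
    print(bricked_index(0, 0, 1, dim) - bricked_index(0, 0, 0, dim))
```

The gap between z-neighbours inside a cube depends only on the cube size, whereas in the linear layout it grows with the square of the volume side, which is why nearby samples are more likely to already be in the cache.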

I will need to be careful with the terminology I use when it comes to writing the final report. So far I have not been very consistent (or correct) with it. For example, a 'process' really describes a completely separate application running, rather than a thread of an existing application.
