Friday, 12 December 2008

"Optimisation"

I implemented the first optimisation of the Ray Casting method.

When storing the volume data in a standard 3D array manner (all voxels in a row, then all rows in a layer, then all layers in a volume), a cache hit is likely to occur when marching along a ray through the volume.

If the ray is travelling directly along the X axis, the data it is reading will be concurrent in memory. However when it moves in any other direction each sample of the data will be spread out in memory, and will not be in the cache from the previous read.

The idea for the optimisation was to store the data in a similar fashion (voxels, rows, layers), but to do this seperately for small blocks of the volume. Then when moving through the volume a cache hit is far less likely, potentially occurring only on the boundary between the blocks.

Unfortunately implementing this actually caused an increase in rendering times. Here are some render times for identical images, using various sizes of block:

1 (original storage method):
12.863 seconds
12.293
12.293

2:
16.816
16.781
16.742

4:
16.641
16.727
16.664

8:
17.180
17.156
17.137

16:
17.238
17.266
17.977

32:
17.227
17.227
17.207

I would guess that the slow-down is due to the extra computation involved in computing the memory location of a voxel. Some optimisations could be made in this area, for example if the block size was limited to powers of two I could replace divides with bit-shifts. However this would also require the size of the volume data to be a power of two (or at least a multiple of the selected block size). I may do this at some point, however for now I think I need to focus more on the parallel processing since that is the main purpose of the project.

No comments: