OPTIMIZATION

The optimizations are evaluated by starting from the fully optimized code and removing one optimization at a time, to measure how much each one affects the processing time.

The optimization scheme used for evaluation comes from [1].

Each optimization is denoted by a level, as shown below:

Level  5 – Most Optimized Code
Level  4 – No Loop Unrolling
Level  3 – No Coalesced Memory Access
Level  2 – No Avoidance of Idle threads
Level  1 – No Shared Memory

In other words, the levels are cumulative: going from level 1 to level 2 introduces the shared memory optimization, going from level 2 to level 3 introduces idle thread avoidance, and so on up to level 5, where loop unrolling is added.
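The level scheme can be written down as a small lookup, which also makes it easy to turn per-level timings into a per-optimization effect. This is only a sketch in Python: the names INTRODUCED_AT, enabled_optimizations and step_speedups are made up for illustration and are not part of the evaluated code, and no timing values from this report are assumed.

```python
# Optimization introduced when moving UP to each level.
# Level 1 is the baseline with none of the four optimizations.
INTRODUCED_AT = {
    2: "shared memory",
    3: "idle thread avoidance",
    4: "coalesced memory access",
    5: "loop unrolling",
}


def enabled_optimizations(level):
    """Return the optimizations that are active at a given level (1..5)."""
    if not 1 <= level <= 5:
        raise ValueError("level must be between 1 and 5")
    return [INTRODUCED_AT[k] for k in range(2, level + 1)]


def step_speedups(time_ms_by_level):
    """Map each optimization to time(level - 1) / time(level).

    time_ms_by_level is a dict {level: average processing time in ms}
    for one image size. A value above 1.0 means the optimization helped.
    """
    return {
        INTRODUCED_AT[k]: time_ms_by_level[k - 1] / time_ms_by_level[k]
        for k in range(2, 6)
    }
```

Calling step_speedups once per image size, with the measured averages keyed by level, gives the ratio by which each optimization shortens the processing time at that size.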

The table and the graph below show how much these optimizations affect the processing time. Processing times are shown for three different image sizes.

Processing time is in milliseconds. The executable was run 10 times per image size per optimization level, and the average processing time is reported.

NOTE: The processing time refers to the time taken to do the image conversion only; it does not include the time taken to read in and write out the image.
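The measurement procedure described above (10 runs, averaged, with file I/O kept outside the timed region) can be sketched as follows. This is a host-side illustration in Python, not the harness actually used for the measurements; average_time_ms and run_once are hypothetical names, and run_once stands for whatever performs a single image conversion.

```python
import time


def average_time_ms(run_once, repeats=10):
    """Call run_once() `repeats` times and return the mean time in milliseconds.

    Only the call to run_once() is timed, so any image reading or writing
    should be done by the caller before and after this function.
    """
    total_seconds = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()  # the conversion itself is the only work in the timed region
        total_seconds += time.perf_counter() - start
    return total_seconds * 1000.0 / repeats
```

For GPU code, run_once has to wait for the kernel to finish before returning; otherwise only the launch overhead is measured.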


As can be seen from above, memory coalescing and shared memory are the optimizations that affect the processing time the most, with shared memory having a much larger effect than memory coalescing.

The effect of the optimizations becomes more evident as the image size increases.

The other optimizations, idle thread avoidance and loop unrolling, have little effect on the processing time.

Another optimization not evaluated here is the use of separable filters, as mentioned in [1]. For an M x M filter, this reduces the number of multiplications per output pixel from M^2 to 2M, so it would also be expected to have a large effect on the processing time.
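The saving from separable filtering can be checked with a small CPU-side sketch. It assumes the M x M filter is the outer product of a column vector and a row vector, and it only computes the output pixels where the filter fits entirely inside the image. The function names are illustrative, and this is plain Python rather than the GPU implementation discussed in [1].

```python
def filter_direct(image, col, row):
    """Apply the M x M filter K[i][j] = col[i] * row[j] in a single 2D pass."""
    m = len(col)
    h, w = len(image), len(image[0])
    kernel = [[c * r for r in row] for c in col]
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(m) for j in range(m))
            for x in range(w - m + 1)
        ]
        for y in range(h - m + 1)
    ]


def filter_separable(image, col, row):
    """Apply the same filter as a horizontal 1D pass followed by a vertical one."""
    m = len(col)
    h, w = len(image), len(image[0])
    # Horizontal pass: filter every image row with `row`.
    tmp = [
        [sum(image[y][x + j] * row[j] for j in range(m))
         for x in range(w - m + 1)]
        for y in range(h)
    ]
    # Vertical pass: filter every column of the intermediate result with `col`.
    return [
        [sum(tmp[y + i][x] * col[i] for i in range(m))
         for x in range(w - m + 1)]
        for y in range(h - m + 1)
    ]


def multiplies_per_output_pixel(m):
    """Return (direct, separable) multiplication counts for an M x M filter."""
    return m * m, 2 * m
```

Both functions produce the same output for a separable filter, but for a 9 x 9 filter the direct form needs 81 multiplications per output pixel while the separable form needs 18. The separable count ignores the few extra border rows processed in the horizontal pass.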