Fun little optimization
by
, November 1st, 2011 at 09:07 PM (2674 Views)
Disclaimer: While optimizations can be fun and addictive, determine if your particular application needs any before attempting to optimize it.
I've always wondered if the Java compiler was capable of performing certain loop optimizations. So for fun, I picked a two of them (a simple permutation and loop unrolling) and tested them out.
public class Test { public static void main(String[] args) { final int DIM = 0x2000; for (int i = 0; i < 5; ++i) { int[][] data = new int[DIM / 4][DIM / 4]; for (int col = 0; col < data.length; ++col) { for (int row = 0; row < data.length; ++row) { data[row][col] = row * col; } } } System.out.println("Warmed up"); long starttime = System.currentTimeMillis(); for (int i = 0; i < 10; ++i) { int[][] data = new int[DIM][DIM]; for (int row = 0; row < data.length; ++row) { for (int col = 0; col < data.length; ++col) { data[row][col] = 1; } } } long endtime = System.currentTimeMillis(); System.out.println("row-major order: " + (endtime - starttime)); starttime = System.currentTimeMillis(); for (int i = 0; i < 10; ++i) { int[][] data = new int[DIM][DIM]; for (int col = 0; col < data.length; ++col) { for (int row = 0; row < data.length; ++row) { data[row][col] = 1; } } } endtime = System.currentTimeMillis(); System.out.println("col-major order: " + (endtime - starttime)); starttime = System.currentTimeMillis(); for (int i = 0; i < 10; ++i) { int[][] data = new int[DIM][DIM]; for (int row = 0; row < data.length; ++row) { for (int col = 0; col < data.length; col += 8) { data[row][col] = 1; data[row][col + 1] = 1; data[row][col + 2] = 1; data[row][col + 3] = 1; data[row][col + 4] = 1; data[row][col + 5] = 1; data[row][col + 6] = 1; data[row][col + 7] = 1; } } } endtime = System.currentTimeMillis(); System.out.println("loop unrolling: " + (endtime - starttime)); } }
Results:
Warmed up
row-major order: 2500
col-major order: 21305
loop unrolling: 2945
Turns out that manual loop unrolling isn't beneficial (this is something very simple for the compiler to do), but loop permutation is something it can't do. I suspect tiling would fall under the same category since it's a more complicated re-ordering.
So if you're looking for a way to improve your code's performance look for locality optimizations.