Posts

Showing posts from January, 2012

C code optimization benchmark

Steve Oualline talks about C code optimization in his book Practical C Programming. I was curious about the real performance gains. The benchmark test results are at the end of the post.

How can this C code be optimized?

matrix1.c

    #define X_SIZE 60
    #define Y_SIZE 30

    int matrix[X_SIZE][Y_SIZE];

    void initmatrix(void)
    {
        int x, y;

        for (x = 0; x < X_SIZE; ++x) {
            for (y = 0; y < Y_SIZE; ++y) {
                matrix[x][y] = -1;
            }
        }
    }

    int main(void)
    {
        initmatrix();
        return 0;
    }

The first suggested optimization is to declare the index variables x and y with the "register" keyword:

matrix2.c

    #define X_SIZE 60
    #define Y_SIZE 30

    int matrix[X_SIZE][Y_SIZE];

    void initmatrix(void)
    {
        register int x, y;

        for (x = 0; x < X_SIZE; ++x) {
            for (y = 0; y < Y_SIZE; ++y) {
                matrix[x][y] = -1;
            }
        }
    }

    int main(void)
    {
        initmatrix();
        return 0;
    }

The next suggestion is to order the for loops so that the innermost for is the most complex:

matrix3.c

    #define X
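To compare variants like these, each one needs to be timed over many repetitions, since a single call on a 60x30 matrix finishes far too quickly to measure. A minimal sketch of one way to do that with the standard clock() function follows. The helper name time_initmatrix and the repeat count are assumptions made for illustration, not the method used for the results in this post, and an optimizing compiler may simplify the repeated calls, so it is best tried at -O0 first.

```c
#include <assert.h>
#include <time.h>

#define X_SIZE 60
#define Y_SIZE 30

int matrix[X_SIZE][Y_SIZE];

/* Baseline version, same loops as matrix1.c. */
void initmatrix(void)
{
    int x, y;

    for (x = 0; x < X_SIZE; ++x) {
        for (y = 0; y < Y_SIZE; ++y) {
            matrix[x][y] = -1;
        }
    }
}

/*
 * Hypothetical helper (not from the book or the post):
 * calls initmatrix() `repeats` times and returns the CPU time
 * consumed, in seconds, as measured by clock().
 */
double time_initmatrix(long repeats)
{
    clock_t start;
    long i;

    assert(repeats > 0);

    start = clock();
    for (i = 0; i < repeats; ++i) {
        initmatrix();
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_initmatrix(1000000) from main and printing the result with printf("%f\n", ...) gives one number per variant. Swapping in the body of another initmatrix version and rebuilding with the same compiler flags keeps the comparison fair.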

How to recompile software with hardware optimization?

This may be useful for compiling local applications that you want to run faster. Try this on your computer:

    $ echo "" | gcc -march=native -v -E - 2>&1 | grep cc1

On my computer it returned:

    /usr/libexec/gcc/x86_64-redhat-linux/4.6.1/cc1 -E -quiet -v - -march=corei7-avx -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mavx -msse4.2 -msse4.1 --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=4096 -mtune=corei7-avx

This command probes the local computer for optimization flags. To use them, pass the flags from that output (the part from -march= through -mtune=) to configure:

    $ CFLAGS="[flags from above]" ./configure

You may also consider adding the "-O3" flag. The -O3 flag enables levels 1, 2 and 3 of compile-time optimization. There is more information about -O3 in the gcc man page. To use it, run this instead of the previous line:

    $ CFLAGS="-O3 [flags from above]" ./configure

From: http://blog.mybox.ro/2011/11/02
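After rebuilding, it can be hard to tell whether the flags actually reached the compiler. A minimal sketch of one way to check, assuming GCC: it predefines macros such as __AVX__, __SSE4_2__ and __OPTIMIZE__ when the corresponding options are active, so a small C file built with the same CFLAGS can report them. The function names below are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if this file was compiled with AVX enabled (e.g. -mavx). */
int has_avx(void)
{
#ifdef __AVX__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if this file was compiled with SSE4.2 enabled (e.g. -msse4.2). */
int has_sse42(void)
{
#ifdef __SSE4_2__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if any optimization level above -O0 was requested. */
int is_optimized(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/*
 * Hypothetical helper: writes one line per feature to `out`
 * and returns how many of the three features are enabled.
 */
int print_build_features(FILE *out)
{
    assert(out != NULL);

    fprintf(out, "AVX:       %s\n", has_avx() ? "yes" : "no");
    fprintf(out, "SSE4.2:    %s\n", has_sse42() ? "yes" : "no");
    fprintf(out, "optimized: %s\n", is_optimized() ? "yes" : "no");

    return has_avx() + has_sse42() + is_optimized();
}
```

Calling print_build_features(stdout) from main in a file compiled once with plain gcc and once with the probed flags plus -O3 should show the difference, provided the CPU and compiler support those features.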