Sei sulla pagina 1di 5

C/C++ Code Optimization Techniques

Rules to Remember
1. Find the bottleneck of your Application / module. 80 - 20 rule also applies over here which means most of the execution goes in a smaller part of the code. You should use some profiling tool to fing the bottle nect of your application before going into the optimization part. 2. C/C++ compilers are queit efficient and they procuce optimized binary for the written code. So micro-optimization like using register variables will not be usesful. Let the compiler find the best way to utilize the variables. 3. Some guidelines for optimizing C/C++ code: a. Get same calculations/comparisons out of the loop.

for (int i=0; i<iMax; i++) { if (Val > CONST_VAL) { ... } else { .... } }

This can be optimized as:

if (Val > CONST_VAL) { for (int i=0; i<iMax; i++) { ..... } } else { for (int i=0; i<iMax; i++)

{ ..... } } b. Avoid type conversions, use same type of variables including signed and unsigned. unsigned long x, sum; for (int i = 0; i<Max_I; i++) { sum = x*i; } Optimized: unsigned long x, sum; for (unsigned long i = 0; i<Max_I; i++) { sum = x*i; }

c. Interger calculations are faster than floating point. d. Keep in mind the processors word length. On 32 bit processor 32-bit calculations and copying will be faster or equal to 16-bit or 8-bit.

e. Interger Multiply/Divide by two is fastest using bit shift operations.

f. Multiplication and Division operations are generally costlier than Addition Substraction. Try to convert multiplication/division to addition/substraction.

An Example:
Before These are constants to the loop extentY, extentX, TopLeftX, TopLeftY and are of type unsigned long #define Target(cx,cy) pOffBits[serverResX * (cy) + (cx)] #define Source(cx,cy) pBitsI[extentX * (cy) + (cx)] for (int y = 0; y<extentY; y++) for (int x = 0; x<extentX; x++)

{ Target(x + TopLeftX, y + TopLeftY) = Source(x,y); }

Optimized unsigned long intermC = serverResX * TopLeftY + TopLeftX; unsigned long intermZ = 0; unsigned long intermY = 0; for (unsigned long y = 0; y<extentY; y++) { unsigned long intermA = intermY + intermC; for (unsigned long x = 0; x<extentX; x++) { pOffBits[intermA + x] = pBitsI[intermZ + x]; } intermZ += extentX; intermY += serverResX; } Step By Step 1. Replacing the #define and making the loop variable data type same as that of the others for (unsigned long y = 0; y<extentY; y++) for (unsigned long x = 0; x<extentX; x++) { pOffBits[serverResX * (y + TopLeftY) + (x + TopLeftX)] = pBitsI[extentX * y + x]; } 2. Rearranging the calculation pOffBits[serverResX * y + x + serverResX * TopLeftY + TopLeftX] = pBitsI[extentX * y + x];

Here (serverResX * TopLeftY + TopLeftX) is constant so we can take it out of the loop unsigned long intermC = serverResX * TopLeftY + TopLeftX; for (unsigned long y = 0; y<extentY; y++) for (unsigned long x = 0; x<extentX; x++) { pOffBits[serverResX * y + x + intermC] = pBitsI[extentX * y + x];

3. Now here both serverResX * y and extentX * y are constant in the inner loop so we can take it out of the innder loop. unsigned long intermC = serverResX * TopLeftY + TopLeftX; for (unsigned long y = 0; y<extentY; y++) { unsigned long intermA = serverResX * y; unsigned long intermB = extentX * y; for (unsigned long x = 0; x<extentX; x++) { pOffBits[intermA + x + intermC] = pBitsI[intermB + x]; } }

4. Now you can see the multiplication serverResX * y going in the loop. Where y ranges from 0 to extentY. We can convert it to addition like this: unsigned long intermC = serverResX * TopLeftY + TopLeftX; unsigned long intermY = 0; for (unsigned long y = 0; y<extentY; y++) { unsigned long intermA = intermY; unsigned long intermB = extentX * y; for (unsigned long x = 0; x<extentX; x++) { pOffBits[intermA + x + intermC] = pBitsI[intermB + x]; } intermY += serverResX; }

5. Similerly for extentX, and we gan get rid of intermA and intermB unsigned long intermC = serverResX * TopLeftY + TopLeftX; unsigned long intermZ = 0; unsigned long intermY = 0; for (unsigned long y = 0; y<extentY; y++)

{ for (unsigned long x = 0; x<extentX; x++) { pOffBits[intermY + x + intermC] = pBitsI[intermZ + x]; } intermZ += extentX; intermY += serverResX; }

6. Also since intermY + intermC is constant in the inner loop we can take it out unsigned long intermC = serverResX * TopLeftY + TopLeftX; unsigned long intermZ = 0; unsigned long intermY = 0; for (unsigned long y = 0; y<extentY; y++) { unsigned long intermA = intermY + intermC; for (unsigned long x = 0; x<extentX; x++) { pOffBits[intermA + x] = pBitsI[intermZ + x]; } intermZ += extentX; intermY += serverResX; }

This is the optimized algorithm. Note: All datatypes are same over here

Comments, questions, please mail me. Akhilesh Singh Last Updated 8th January 2005 Crossroad | About Me | Science | Life | Code | Contact | Site Map | Search Copyright 2006 - 2009

Potrebbero piacerti anche