26
Explicitly Extract Common Subexpressions
AMD Athlon™ Processor x86 Code Optimization
22007E/0—November 1999
lead to unexpected results. Fortunately, in the vast majority of
cases, the final result will differ only in the least significant
bits.
Example 1 (Avoid):
double a[100],sum;
int i;
sum = 0.0f;
for (i=0; i<100; i++) {
sum += a[i];
}
Example 2 (Preferred):
double a[100],sum1,sum2,sum3,sum4,sum;
int i;
sum1 = 0.0;
sum2 = 0.0;
sum3 = 0.0;
sum4 = 0.0;
for (i=0; i<100; i+4) {
sum1 += a[i];
sum2 += a[i+1];
sum3 += a[i+2];
sum4 += a[i+3];
}
sum = (sum4+sum3)+(sum1+sum2);
Notice that the 4-way unrolling was chosen to exploit the 4-stage
fully pipelined floating-point adder. Each stage of the floating-
point adder is occupied on every clock cycle, ensuring maximal
sustained utilization.
Explicitly Extract Common Subexpressions
In certain situations, C compilers are unable to extract common
subexpressions from floating-point expressions due to the
guarantee against reordering of such expressions in the ANSI
standard. Specifically, the compiler can not re-arrange the
computation according to algebraic equivalencies before
ex tracting com mo n subexpre ss ions . I n such ca se s, the
p r o g r a m m e r s h o u l d m a n u a l l y e x t r a c t t h e c o m m o n
subexpression. It should be noted that re-arranging the
expression may result in different computational results due to
the lack of associativity of floating-point operations, but the
results usually differ in only the least significant bits.
Summary of Contents for Athlon Processor x86
Page 1: ...AMD Athlon Processor x86 Code Optimization Guide TM...
Page 12: ...xii List of Figures AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Page 16: ...xvi Revision History AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Page 202: ...186 Page Attribute Table PAT AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Page 252: ...236 VectorPath Instructions AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...
Page 256: ...240 Index AMD Athlon Processor x86 Code Optimization 22007E 0 November 1999...