Optimizing Assembly Code via Linear Assembly
6.3 Writing Parallel Code
One way to optimize linear assembly code is to reduce the number of
execution cycles in a loop. You can do this by rewriting linear assembly
instructions so that the final assembly instructions execute in parallel.
6.3.1 Dot Product C Code
The dot product is a sum in which each element in array a is multiplied by
the corresponding element in array b. Each of these products is then
accumulated into sum. The C code in Example 6–5 is a fixed-point dot product
algorithm. The C code in Example 6–6 is a floating-point dot product
algorithm.
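Written as a formula, the same computation looks as follows. The symbol N is introduced here only for illustration; its value of 100 is taken from the loop bound used in both examples.

```latex
\mathrm{sum} = \sum_{i=0}^{N-1} a[i] \cdot b[i], \qquad N = 100
```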
Example 6–5. Fixed-Point Dot Product C Code
int dotp(short a[], short b[])
{
    int sum, i;
    sum = 0;

    for(i=0; i<100; i++)
        sum += a[i] * b[i];

    return(sum);
}
Example 6–6. Floating-Point Dot Product C Code
float dotp(float a[], float b[])
{
    int i;
    float sum;
    sum = 0;

    for(i=0; i<100; i++)
        sum += a[i] * b[i];

    return(sum);
}
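As a rough C-level illustration of why this loop leaves room for parallel execution, the fixed-point version can be split across two accumulators. This is a sketch under stated assumptions, not the linear assembly rewrite itself: the names dotp_ref, dotp_two_acc, and DOTP_LEN are hypothetical, and the element count is assumed to be even. Because the two running sums do not depend on each other, their multiplies and adds are candidates for executing in parallel.

```c
#define DOTP_LEN 100  /* assumed to match the loop bound in Example 6-5; must be even */

/* Reference version: a single accumulator, so every add depends on
   the result of the previous add. */
int dotp_ref(const short a[], const short b[])
{
    int sum = 0;
    int i;

    for (i = 0; i < DOTP_LEN; i++)
        sum += a[i] * b[i];

    return sum;
}

/* Hypothetical variant: even-indexed products go to sum0 and
   odd-indexed products go to sum1. The two chains are independent
   until the final add. */
int dotp_two_acc(const short a[], const short b[])
{
    int sum0 = 0;
    int sum1 = 0;
    int i;

    for (i = 0; i < DOTP_LEN; i += 2) {
        sum0 += a[i]     * b[i];
        sum1 += a[i + 1] * b[i + 1];
    }

    return sum0 + sum1;
}
```

For integer data that does not overflow, both functions return the same value. Applying the same split to the floating-point version in Example 6–6 can change the rounding of the result, because floating-point addition is not associative.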