Refining C/C++ Code
3.4.2.3 Using Double Word Access for Word Data (’C64x and ’C67x Specific)
The ’C64x and ’C67x families have a load double word (LDDW) instruction,
which reads 64 bits of data into a register pair. Just as word accesses can
be used to read 2 short data items, double word accesses can be used to read
2 word data items (or 4 short data items). When operating on a stream of float
data, you can use double word accesses to read 2 float values at a time, and
then use intrinsics to operate on the data.
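To make the idea concrete, the following is a minimal host-portable sketch of what a double word read followed by extraction accomplishes. The helper names (load_double_word, lo32, hi32, bits_to_float, unpack_two_floats) are hypothetical stand-ins written in standard C; they are not the compiler intrinsics, and the sketch assumes a 4-byte float.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 64-bit (double word) load: copy 8 bytes,
   that is, two adjacent 4-byte floats, into one 64-bit value. */
static uint64_t load_double_word(const float *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical stand-ins for extracting the low and high 32-bit halves. */
static uint32_t lo32(uint64_t v) { return (uint32_t)v; }
static uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/* Hypothetical stand-in for reinterpreting 32 integer bits as a float.
   The bits are copied unchanged; no numeric conversion takes place. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Read two adjacent floats with one 64-bit access and split them.
   Which array element lands in the low half depends on the byte order
   of the machine running this sketch. */
void unpack_two_floats(const float *p, float *lo, float *hi)
{
    uint64_t v = load_double_word(p);
    *lo = bits_to_float(lo32(v));
    *hi = bits_to_float(hi32(v));
}
```

On a little-endian host, calling unpack_two_floats() on an array holding {1.5f, 2.5f} places 1.5f in *lo and 2.5f in *hi; on a big-endian host the two are swapped.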
The basic float dot product is shown in Example 3–14. Each addition to sum
depends on the result of the previous one, and the float addition (ADDSP)
instruction takes 4 cycles to complete, so the minimum kernel size for this
loop is 4 cycles. For this version of the loop, a result is completed every
4 cycles.
Example 3–14. Basic Float Dot Product
float dotp1(const float a[restrict], const float b[restrict])
{
    int i;
    float sum = 0;
    for (i=0; i<512; i++)
        sum += a[i] * b[i];
    return sum;
}
In Example 3–15, the dot product example is rewritten to use double word
loads, and intrinsics are used to extract the high and low 32-bit values
contained in the 64-bit double. The _hi() and _lo() intrinsics return integer
values; the _itof() intrinsic subverts the C typing system by interpreting an
integer value as a float value. In this version of the loop, 2 float results
are computed every 4 cycles. Arrays can be aligned on double word boundaries
by using either the DATA_ALIGN (for globally defined arrays) or DATA_MEM_BANK
(for locally defined arrays) pragmas. Example 3–15 and Example 3–16 show these
pragmas.
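The structure of such a loop can be tried on a host machine with the sketch below. It is an illustration under stated assumptions and not the manual's own listing: the 64-bit reads and the reinterpretation as float are emulated with memcpy in standard C instead of the _hi(), _lo(), and _itof() intrinsics, and the names dotp_pairs, bits_to_float, and DOTP_LEN are hypothetical. Keeping two independent partial sums is also an assumption of this sketch; it is one way for two additions to be in flight at once.

```c
#include <stdint.h>
#include <string.h>

#define DOTP_LEN 512  /* element count, as in the basic loop; must be even */

/* Hypothetical stand-in for reinterpreting integer bits as a float. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Dot product that reads two floats from each array per iteration. */
float dotp_pairs(const float *a, const float *b)
{
    float sum0 = 0.0f;  /* partial sum over the low 32-bit halves  */
    float sum1 = 0.0f;  /* partial sum over the high 32-bit halves */
    int i;

    for (i = 0; i < DOTP_LEN; i += 2) {
        uint64_t av, bv;

        /* Emulated double word reads: 8 bytes = 2 floats each. */
        memcpy(&av, &a[i], sizeof av);
        memcpy(&bv, &b[i], sizeof bv);

        /* Low halves feed one sum, high halves the other. */
        sum0 += bits_to_float((uint32_t)av) *
                bits_to_float((uint32_t)bv);
        sum1 += bits_to_float((uint32_t)(av >> 32)) *
                bits_to_float((uint32_t)(bv >> 32));
    }
    return sum0 + sum1;
}
```

Because the products are accumulated in a different order than in a single-sum loop, the result can differ from that loop in the last bits for general float data.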
Note:
For the pragmas that apply to functions or symbols, the syntax for the
pragma differs between C and C++. In C, you must supply the name of the
object or function to which you are applying the pragma as the first
argument. In C++, the name is omitted; the pragma applies to the declaration
of the object or function that follows it.
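The difference described in the note can be illustrated with the DATA_ALIGN pragma. The sketch below is a hedged illustration: the array name coeffs and the helper is_double_word_aligned are hypothetical, the guard on __TI_COMPILER_VERSION__ is an assumption about how to detect the target compiler, and the C11 _Alignas branch is only a host stand-in so that the file also builds off-target.

```c
#include <stdint.h>

#ifdef __TI_COMPILER_VERSION__
/* C form: the symbol name is the first argument of the pragma. */
#pragma DATA_ALIGN(coeffs, 8);
float coeffs[512];
/* C++ form, shown as a comment because this file is C. The name is
 * omitted and the pragma applies to the declaration that follows:
 *
 *   #pragma DATA_ALIGN(8);
 *   float coeffs[512];
 */
#else
/* Host stand-in (C11) requesting the same 8-byte alignment. */
_Alignas(8) float coeffs[512];
#endif

/* Returns 1 when the address lies on a double word (8-byte) boundary. */
int is_double_word_aligned(const void *p)
{
    return ((uintptr_t)p % 8u) == 0;
}
```

A check such as is_double_word_aligned(coeffs) can be used during bring-up to confirm that the requested alignment took effect before relying on double word accesses.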