Revision 1.0
Performance Tips
vadd $v1, $v2, $v3
vadd $v4, $v4, $v1
In this example, the second vadd instruction cannot execute until the first
vadd has completed and written back its result. There is a data dependency
on register $v1. The result is a pipeline stall that effectively serializes
the vector code, seriously degrading its performance.
Note: Fortunately, the hardware does perform register usage locking in this
case; the above code may be slow, but at least it is guaranteed to generate
the correct results.
If a data dependency cannot be avoided, try rearranging code so that at least
some useful work is done during the delay.
Hint: “Keeping the pipeline full” is going to be one of your keys to
maximum performance.
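The effect of rearranging code around a data dependency can be sketched with a toy model in C. This is only an illustration: the Instr type, the count_cycles function, the two example sequences, and above all the RESULT_LATENCY value are hypothetical and are not taken from the hardware documentation. The model issues one instruction per cycle and stalls whenever a source register has not yet been written back.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction: one destination and two source vector registers,
   identified by register number (1 stands for $v1, and so on). */
typedef struct {
    int dst;
    int src1;
    int src2;
} Instr;

#define NUM_REGS 32

/* HYPOTHETICAL latency: cycles from issue until the result can be read.
   This is an assumption for illustration, not the real hardware figure. */
#define RESULT_LATENCY 3

/* Issue the instructions in order, at most one per cycle, stalling until
   both source registers are ready. Returns the total cycles used to issue. */
int count_cycles(const Instr *code, size_t n)
{
    int ready_at[NUM_REGS] = {0};  /* cycle at which each register is readable */
    int cycle = 0;

    for (size_t i = 0; i < n; i++) {
        assert(code[i].dst >= 0 && code[i].dst < NUM_REGS);
        assert(code[i].src1 >= 0 && code[i].src1 < NUM_REGS);
        assert(code[i].src2 >= 0 && code[i].src2 < NUM_REGS);

        int issue = cycle;
        if (ready_at[code[i].src1] > issue) issue = ready_at[code[i].src1];
        if (ready_at[code[i].src2] > issue) issue = ready_at[code[i].src2];

        ready_at[code[i].dst] = issue + RESULT_LATENCY;
        cycle = issue + 1;
    }
    return cycle;
}

/* The dependent pair from the text, followed by two unrelated adds. */
static const Instr stalled_order[] = {
    {1, 2, 3},   /* vadd $v1, $v2, $v3            */
    {4, 4, 1},   /* vadd $v4, $v4, $v1  (waits)   */
    {5, 6, 7},   /* vadd $v5, $v6, $v7            */
    {8, 9, 10},  /* vadd $v8, $v9, $v10           */
};

/* Same work, with the unrelated adds moved into the delay. */
static const Instr filled_order[] = {
    {1, 2, 3},   /* vadd $v1, $v2, $v3            */
    {5, 6, 7},   /* vadd $v5, $v6, $v7            */
    {8, 9, 10},  /* vadd $v8, $v9, $v10           */
    {4, 4, 1},   /* vadd $v4, $v4, $v1  (ready)   */
};
```

Under the assumed latency of 3, count_cycles reports 6 cycles for stalled_order and 4 for filled_order: the same four instructions, but the independent adds occupy the slots that would otherwise be spent waiting on $v1.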
Loop Inversion
A common trick used in vector programming is loop inversion. This means
swapping inner and outer loops, in order to create the simplest loop with the
largest number of iterations so we can maximize vectorization.
Consider the following code fragment which could be used for vertex
translation:
for (i = 0; i < num_pts; i++) {      /* for each point */
    for (j = 0; j < 4; j++) {        /* for each dimension */
        point[i][j] += offset[j];
    }
}
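Since we can only vectorize the innermost operation (the addition), and the
inner loop runs just four times, only four of the vector unit's eight elements
would do useful work: we would be using only 50% of our vector unit.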
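The 50% figure is simple arithmetic and can be checked with a short C sketch. The function names and the VECTOR_LANES constant are illustrative; the value 8 is an assumption chosen to match the 50% stated above for a four-iteration loop.

```c
#include <assert.h>

/* ASSUMPTION: eight elements per vector operation, consistent with a
   four-iteration loop filling half of the vector unit. */
#define VECTOR_LANES 8

/* Number of vector operations needed to cover trip_count iterations. */
int vector_ops_needed(int trip_count)
{
    assert(trip_count > 0);
    return (trip_count + VECTOR_LANES - 1) / VECTOR_LANES;
}

/* Share of element slots doing useful work, as an integer percentage
   (rounded down). */
int lane_utilization_percent(int trip_count)
{
    int ops = vector_ops_needed(trip_count);
    return (100 * trip_count) / (ops * VECTOR_LANES);
}
```

With these definitions, a trip count of 4 gives 50 percent, a trip count of 8 gives 100 percent, and a trip count of 100 needs 13 vector operations at 96 percent. The longer the vectorized loop, the less the partially filled final operation matters.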
Now suppose we had an unlimited number of vector elements. We could then swap
the loops and do the outer loop four times, vectorizing the inner loop across
num_pts elements:
for (i = 0; i < 4; i++) {            /* for each dimension */
    for (j = 0; j < num_pts; j++) {  /* for each point */
        point[j][i] += offset[i];
    }
}
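As a check that swapping the loops does not change the result, both orderings can be written as plain C functions and compared. This is a scalar sketch only, not vector code; the function names, the DIMS constant, and the choice of int as the element type are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DIMS 4  /* dimensions per point, as in the fragments above */

/* Original order: one point at a time, inner loop of only DIMS iterations. */
void translate_point_major(int point[][DIMS], size_t num_pts,
                           const int offset[DIMS])
{
    assert(point != NULL && offset != NULL);
    for (size_t i = 0; i < num_pts; i++) {      /* for each point */
        for (size_t j = 0; j < DIMS; j++) {     /* for each dimension */
            point[i][j] += offset[j];
        }
    }
}

/* Inverted order: one dimension at a time, inner loop of num_pts
   iterations, which is the loop a vector unit would work across. */
void translate_dim_major(int point[][DIMS], size_t num_pts,
                         const int offset[DIMS])
{
    assert(point != NULL && offset != NULL);
    for (size_t i = 0; i < DIMS; i++) {         /* for each dimension */
        for (size_t j = 0; j < num_pts; j++) {  /* for each point */
            point[j][i] += offset[i];
        }
    }
}
```

Each element point[p][d] receives exactly one addition of offset[d] in either version, so the two functions leave identical arrays; only the order in which the additions happen differs.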