Coding for SIMD Architectures
The main loop consists of two functions: transformation and lighting.
For each object, the main loop calls a transformation routine to update
some data, then calls the lighting routine to further work on the data.
If the size of array v[Num] is larger than the cache, then the coordinates
for v[i] that were cached during Transform(v[i]) will be evicted from the
cache by the time we do Lighting(v[i]). This means that v[i] will have to
be fetched from main memory a second time, reducing performance.
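
Whether the second pass misses depends on how the array's footprint compares with the cache capacity. The following is a minimal sketch of that comparison, not code from the manual. The Vertex_rec layout (eight floats) and the helper name working_set_exceeds_cache are assumptions made for illustration, and the check ignores associativity and any other data competing for the cache.

```c
#include <stddef.h>

/* Assumed vertex layout for illustration: position, normal, texture coords. */
typedef struct {
    float x, y, z;
    float nx, ny, nz;
    float u, v;
} Vertex_rec;

/* Hypothetical helper: returns 1 if num vertices occupy more bytes than
   cache_bytes, in which case early elements are likely to be evicted
   before a second pass over the array reaches them. */
int working_set_exceeds_cache(size_t num, size_t cache_bytes)
{
    /* Divide instead of multiplying so the comparison cannot overflow. */
    return num > cache_bytes / sizeof(Vertex_rec);
}
```

Calling working_set_exceeds_cache(Num, cache_bytes) with the capacity of the target cache level gives a rough indication of whether the two-pass loop is exposed to the refetch described above.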
Example 3-18 Pseudo-code Before Strip Mining (continued)

    for (i=0; i<Num; i++) {
        Lighting(v[i]);
    }
    ....
}

Example 3-19 Strip Mined Code

main()
{
    Vertex_rec v[Num];
    ....
    for (i=0; i < Num; i+=strip_size) {
        for (j=i; j < min(Num, i+strip_size); j++) {
            Transform(v[j]);
        }
        for (j=i; j < min(Num, i+strip_size); j++) {
            Lighting(v[j]);
        }
    }
}
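
The strip-mined loop can also be written as compilable C. The sketch below is one possible rendering under stated assumptions, not the manual's implementation. The Vertex_rec fields, the bodies of Transform and Lighting, and the helpers pick_strip_size and process_strip_mined are hypothetical, and the routines take pointers where the pseudo-code passes v[j] directly.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed vertex layout; the fields are illustrative only. */
typedef struct {
    float x, y, z;     /* position */
    float intensity;   /* written by the lighting pass */
} Vertex_rec;

/* Hypothetical stand-ins for the Transform and Lighting routines. */
void Transform(Vertex_rec *p)
{
    p->x *= 2.0f;
    p->y *= 2.0f;
    p->z *= 2.0f;
}

void Lighting(Vertex_rec *p)
{
    p->intensity = p->x + p->y + p->z;
}

/* Largest element count that fits in cache_bytes, never less than one. */
size_t pick_strip_size(size_t cache_bytes)
{
    size_t n = cache_bytes / sizeof(Vertex_rec);
    return n > 0 ? n : 1;
}

/* Strip-mined processing: both passes run over one strip before moving on,
   so the strip can still be resident in cache when Lighting reads it. */
void process_strip_mined(Vertex_rec *v, size_t num, size_t strip_size)
{
    assert(strip_size > 0);
    size_t i = 0;
    while (i < num) {
        /* Clamp the final strip; i + strip_size is only formed when it
           is known not to exceed num, so it cannot overflow. */
        size_t end = (num - i < strip_size) ? num : i + strip_size;
        for (size_t j = i; j < end; j++)
            Transform(&v[j]);
        for (size_t j = i; j < end; j++)
            Lighting(&v[j]);
        i = end;
    }
}
```

A caller might pass pick_strip_size a fraction of the target cache capacity rather than the full size, since other data also competes for the cache. A suitable value is best confirmed by measurement.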
IA-32 Intel Architecture Optimization Reference Manual, Order Number 248966-013US, April 2006