IA-32 Intel® Architecture Optimization
Now consider the case $T_l = 18$, $T_b = 8$ (2 cache lines are needed per iteration). The graph of accesses per iteration for this case is shown in example 1, Figure E-6.
The prefetch scheduling distance is a step function of $T_c$, the
computation latency. The steady-state iteration latency ($il$) is either
memory-bound or compute-bound, depending on $T_c$, if prefetches are
scheduled effectively.
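The step-function behavior can be illustrated numerically. The sketch below is a minimal model, not taken from this section: it assumes that the steady-state iteration latency is il = max(T_c, T_b) and that the prefetch scheduling distance is psd = ceil((T_l + T_b) / il). The function names are hypothetical, and the sampled T_c values are chosen only for illustration.

```python
def iteration_latency(t_c: int, t_b: int) -> int:
    """Steady-state iteration latency, ASSUMING il = max(T_c, T_b)."""
    return max(t_c, t_b)


def prefetch_scheduling_distance(t_c: int, t_l: int, t_b: int) -> int:
    """Iterations to prefetch ahead, ASSUMING psd = ceil((T_l + T_b) / il)."""
    il = iteration_latency(t_c, t_b)
    # Integer ceiling division avoids floating-point rounding.
    return (t_l + t_b + il - 1) // il


T_L, T_B = 18, 8  # values given in the text for example 1

for t_c in (2, 8, 9, 13, 26, 40):  # illustrative computation latencies
    il = iteration_latency(t_c, T_B)
    psd = prefetch_scheduling_distance(t_c, T_L, T_B)
    # Assumption: the loop counts as memory-bound when T_c <= T_b.
    bound = "memory-bound" if t_c <= T_B else "compute-bound"
    print(f"T_c={t_c:2d}  il={il:2d}  psd={psd}  ({bound})")
```

Under these assumptions, psd stays constant while the loop is memory-bound and then drops in discrete steps as T_c grows, reaching 1 once T_c covers the full T_l + T_b latency.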
The graph of accesses per iteration in example 2, Figure E-7, shows
the results for prefetching multiple cache lines per iteration. The cases
shown are for 2, 4, and 6 cache lines per iteration, resulting in differing
burst latencies ($T_l = 18$; $T_b = 8, 16, 24$).
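The burst latencies quoted for example 2 scale linearly with the number of cache lines: 8, 16, and 24 for 2, 4, and 6 lines implies 4 latency units per line. The sketch below works from that inference. The constant and function names are hypothetical, and the rule that a loop is memory-bound when T_c <= T_b is an assumption rather than something stated in this section.

```python
LATENCY_PER_LINE = 4  # inferred from the text: T_b = 8 for 2 cache lines


def burst_latency(lines_per_iteration: int,
                  latency_per_line: int = LATENCY_PER_LINE) -> int:
    """Data transfer latency T_b for one iteration's cache lines."""
    return lines_per_iteration * latency_per_line


def regime(t_c: int, t_b: int) -> str:
    """Classify a loop, ASSUMING it is memory-bound when T_c <= T_b."""
    return "memory-bound" if t_c <= t_b else "compute-bound"


T_C = 10  # illustrative computation latency, not taken from the text

for lines in (2, 4, 6):
    t_b = burst_latency(lines)
    print(f"{lines} lines/iteration: T_b={t_b:2d}, "
          f"T_c={T_C} is {regime(T_C, t_b)}")
```

The same computation latency can therefore be compute-bound with 2 lines per iteration and memory-bound with 4 or 6, which is why the three cases produce different curves.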
Figure E-6 Accesses per Iteration, Example 1