
Sun Microelectronics
16. Code Generation Guidelines
If such a load (D-Cache miss, E-Cache hit) is immediately followed by a use, the
group is broken and an (N+1)-cycle stall occurs; Figure 16-12 illustrates this
situation. (The figure shows a 7-cycle stall, which is consistent with 1–1–1 mode;
2–2 mode incurs an 8-cycle stall.)
Figure 16-12  D-Cache Miss, E-Cache Hit (1–1–1 mode shown)
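The stall arithmetic quoted above can be written out as a short sketch. The stall lengths are the two figures given in the text; the dictionary and function names are hypothetical, chosen only for illustration.

```python
# Stall lengths quoted in the text for a load that misses the D-Cache,
# hits the E-Cache, and is immediately followed by its use.
STALL_CYCLES = {"1-1-1": 7, "2-2": 8}


def extra_cycles(mode):
    """Return N for an SRAM mode, given that the stall is (N+1) cycles."""
    return STALL_CYCLES[mode] - 1


for mode, stall in STALL_CYCLES.items():
    print(f"{mode} mode: N = {extra_cycles(mode)}, stall = {stall} cycles")
```

Under this reading, N is 6 in 1–1–1 mode and 7 in 2–2 mode.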
Because a load miss carries a high penalty when code is scheduled on the
assumption that loads hit the D-Cache, UltraSPARC provides hardware support for
non-blocking loads through a load buffer, which allows code to be scheduled based
on External Cache (E-Cache) hits.
16.3.6 Scheduling for the E-Cache
Some applications have a working set that is too large to fit within the D-Cache
(they cause many capacity misses); others use data in patterns that generate
many conflict misses. Compilers can schedule these applications to “bypass” the
D-Cache and access the data out of the E-Cache.
Loads that miss the D-Cache do not necessarily stall the pipeline (non-blocking
loads). Instead, they are sent to the load buffer, where they wait for the data to be
returned from the E-Cache. The pipeline stalls only when an instruction that is
dependent on the non-blocking load enters the pipeline before the load data is
returned.
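The behavior described above can be sketched with a small in-order model. This is a simplification under stated assumptions, not a description of the hardware: it issues one instruction per cycle and ignores instruction grouping, and the names total_stalls, program, and miss_latency are hypothetical. The latency is a parameter; the value 8 used in the example is the load-to-use separation this section cites for 1–1–1 mode.

```python
def total_stalls(program, miss_latency):
    """Count stall cycles for an in-order stream of loads and dependent ops.

    program: list of (kind, regs) pairs. kind is "load" (regs[0] is the
    destination) or "op" (regs are the source registers).
    miss_latency: cycles from load issue until the load data is returned.
    Assumption: one instruction issues per cycle; grouping is ignored.
    """
    ready_at = {}  # register -> cycle at which its load data is available
    cycle = 0
    stalls = 0
    for kind, regs in program:
        if kind == "load":
            # A non-blocking load does not stall; it only records when
            # its data will arrive.
            ready_at[regs[0]] = cycle + miss_latency
        else:
            # A dependent instruction waits until all of its sources are ready.
            wait = max([ready_at.get(r, 0) for r in regs] + [cycle]) - cycle
            stalls += wait
            cycle += wait
        cycle += 1
    return stalls


# Use issued in the cycle right after the load.
back_to_back = [("load", ["r1"]), ("op", ["r1"])]
# Seven independent instructions placed between the load and its use.
scheduled = [("load", ["r1"])] + [("op", ["r2"])] * 7 + [("op", ["r1"])]

print(total_stalls(back_to_back, 8))  # 7
print(total_stalls(scheduled, 8))     # 0
```

In this model the load itself never stalls the stream; only the dependent instruction does, and only if it arrives before the data.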
16.3.6.1 Load Buffer Timing
The load buffer’s depth and its interaction with the rest of the pipeline are
designed to support full throughput (one load per cycle) for a D-Cache with a
three-cycle pin-to-pin latency and one-cycle throughput, which is consistent with
1–1–1 mode. As shown in Figure 16-13, if a use is separated from a load by 8 cycles,
no stall occurs and full throughput is achieved. In comparison, if code is scheduled
for the D-Cache only, N extra cycles are required between the load and the use,
where N is determined by the SRAM mode, as shown in Table 16-1 on page 274.
The shaded rows in Figure 16-13 represent these N extra cycles.
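A compiler or a test harness could check a candidate schedule against the 8-cycle separation with something like the following sketch. The function early_uses, the schedule representation, and the register names are hypothetical; the constant is the 1–1–1 mode figure from the text, and the check deliberately ignores the fact that a real stall would delay every later instruction.

```python
E_CACHE_SEPARATION = 8  # cycles between load and use, 1-1-1 mode (from the text)


def early_uses(schedule, separation=E_CACHE_SEPARATION):
    """Return (register, shortfall) for each use issued too soon after its load.

    schedule: list of (cycle, op, reg) tuples, where op is "load" or "use".
    shortfall: how many cycles short of `separation` the use was placed.
    """
    load_cycle = {}
    problems = []
    for cycle, op, reg in sorted(schedule):
        if op == "load":
            load_cycle[reg] = cycle
        elif op == "use" and reg in load_cycle:
            gap = cycle - load_cycle[reg]
            if gap < separation:
                problems.append((reg, separation - gap))
    return problems


# One load per cycle, each use issued 8 cycles after its load.
pipelined = ([(c, "load", f"r{c}") for c in range(4)]
             + [(c + 8, "use", f"r{c}") for c in range(4)])
# The same four loads, each use issued in the cycle right after its load.
naive = ([(2 * c, "load", f"r{c}") for c in range(4)]
         + [(2 * c + 1, "use", f"r{c}") for c in range(4)])

print(early_uses(pipelined))  # []
print(early_uses(naive))      # [('r0', 7), ('r1', 7), ('r2', 7), ('r3', 7)]
```

The pipelined schedule sustains one load per cycle with no flagged uses, while each use in the naive schedule falls 7 cycles short, which matches the 7-cycle stall quoted for 1–1–1 mode.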
[Figure 16-12 pipeline diagram; stage sequences as extracted]
load r1:  F D G E C N1 Q Q Q Q Q
use r1:   F D G G E E E E E E E E C N1 N2 N3 W
Annotations: Group Break; (N+1)-Cycle Stall; Execution Resumes