
Sun Microelectronics
UltraSPARC User’s Manual
see later, this is desirable not only for improving the D-Cache hit rate (by increas-
ing its utilization density), but also for D-Cache misses where, for sequential ac-
cesses, one out of two requests to the E-Cache can be eliminated. Grouping load
data beyond a D-Cache sub-block is also desirable, since an E-Cache line contains
four D-Cache sub-blocks (for a total of 64 bytes). Thus, sequential accesses can
guarantee that only one E-Cache miss will occur for loads that access up to four
consecutive D-Cache sub-blocks (two D-Cache lines). Section 16.3.6 discusses how
code scheduled for accessing data directly out of the E-Cache can hide the extra
latency introduced by D-Cache misses.
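
The grouping argument above can be checked with a little address arithmetic. The following is a minimal sketch, not part of the processor documentation: the sizes come from the text (16-byte D-Cache sub-blocks, 32-byte D-Cache lines, 64-byte E-Cache lines), while the helper name blocks_touched is hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes taken from the text: one 64-byte E-Cache line holds four
 * 16-byte D-Cache sub-blocks, that is, two 32-byte D-Cache lines. */
enum {
    DCACHE_SUBBLOCK_BYTES = 16,
    DCACHE_LINE_BYTES     = 32,
    ECACHE_LINE_BYTES     = 64
};

/* Hypothetical helper: number of distinct aligned blocks of size
 * block_bytes touched by the sequential byte range [addr, addr + len). */
static uint64_t blocks_touched(uint64_t addr, uint64_t len, uint64_t block_bytes)
{
    assert(len > 0 && block_bytes > 0);
    uint64_t first = addr / block_bytes;             /* block holding the first byte */
    uint64_t last  = (addr + len - 1) / block_bytes; /* block holding the last byte  */
    return last - first + 1;
}
```

For a 64-byte range that starts on a 64-byte boundary, blocks_touched reports four D-Cache sub-blocks but only one E-Cache line, which is the case in which a single E-Cache miss covers all four sub-blocks. If the same range starts 48 bytes into an E-Cache line, it spans two E-Cache lines, so aligning grouped data to 64 bytes is what makes the guarantee hold.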
Data alignment (right justification) for byte, halfword, and word accesses does
not add latency to the loads (unless superseded by the sign rule described in Sec-
tion 16.3.2.1, “Signed Loads”). This is true whether the load goes to the register
file or to internal pipeline bypasses.
16.3.4 Direct-Mapped Cache Considerations
A direct-mapped cache is more susceptible to collisions than a set-associative
cache. However, data can be organized at compile time so that collisions are
minimized. For frequently executed loops, the compiler should organize the
data so that all accesses within the loop map to different cache lines, unless
two accesses fall within the same physical line, in which case they share a
cache line without conflict. For UltraSPARC, this means that accesses should differ in the virtual
address bits VA<13:5>. Hot spots can be detected by configuring the on-chip
counters to accumulate D-Cache accesses and D-Cache misses. The counters can
be turned on/off before/after the load of interest, or around a series of loads
where hot spots are suspected to occur.
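
The VA<13:5> rule can be expressed as a small check that a compiler or analysis tool might apply to the addresses referenced in a loop. This is a sketch under stated assumptions: the function names are hypothetical, and "same physical line" is approximated by two virtual addresses lying in the same 32-byte line, so aliases that map different virtual addresses to one physical line are not detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* VA<13:5> per the text: 9 index bits above a 5-bit (32-byte) line offset. */
enum { DCACHE_INDEX_SHIFT = 5, DCACHE_INDEX_BITS = 9 };

/* Extract the D-Cache line index, VA<13:5>, from a virtual address. */
static unsigned dcache_index(uint64_t va)
{
    return (unsigned)((va >> DCACHE_INDEX_SHIFT) & ((1u << DCACHE_INDEX_BITS) - 1u));
}

/* Two accesses collide when they select the same D-Cache line but do not
 * lie in the same 32-byte line (assumption: compared on virtual addresses). */
static bool dcache_collides(uint64_t va_a, uint64_t va_b)
{
    bool same_line = (va_a >> DCACHE_INDEX_SHIFT) == (va_b >> DCACHE_INDEX_SHIFT);
    return !same_line && dcache_index(va_a) == dcache_index(va_b);
}

/* Count colliding pairs among the addresses touched by one loop body. */
static size_t count_collisions(const uint64_t *vas, size_t n)
{
    assert(vas != NULL || n == 0);
    size_t collisions = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (dcache_collides(vas[i], vas[j]))
                collisions++;
    return collisions;
}
```

For example, addresses 0x10000 and 0x14000 differ only above bit 13, so both select index 0 and collide, whereas 0x10000 and 0x10020 select adjacent lines and do not. A nonzero result from count_collisions suggests that one of the data objects should be padded or relocated.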
16.3.5 D-Cache Miss, E-Cache Hit Timing
Under normal circumstances (for example, no snoops and no arbitration conflicts for
the E-Cache bus), loads that hit the E-Cache are returned N cycles later than
loads that hit the D-Cache, where N is determined by the E-Cache SRAM mode.
Table 16-1 shows the latency for all supported SRAM Modes. (See Section 1.3.9.1,
“E-Cache SRAM Modes,” on page 9 for more information, including which
modes are supported by each UltraSPARC model.)
Table 16-1  D-Cache Miss, E-Cache Hit Latency Depends on SRAM Mode

SRAM Modes     1–1–1    2–2
# of Cycles    6        7
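
The values of N in Table 16-1 can be used for a rough estimate of the cycles added by D-Cache misses. The sketch below is illustrative only: the enum and function names are hypothetical, and the estimate assumes that every D-Cache miss hits the E-Cache under the normal circumstances described above and that none of the added latency is hidden by instruction scheduling, so it is best read as an upper bound.

```c
#include <assert.h>

/* The two SRAM modes listed in Table 16-1 (names are hypothetical). */
enum sram_mode { SRAM_MODE_1_1_1, SRAM_MODE_2_2 };

/* N from Table 16-1: cycles by which a load that misses the D-Cache and
 * hits the E-Cache is returned later than a D-Cache hit. */
static unsigned ecache_hit_extra_cycles(enum sram_mode mode)
{
    switch (mode) {
    case SRAM_MODE_1_1_1: return 6;
    case SRAM_MODE_2_2:   return 7;
    }
    assert(!"unsupported SRAM mode");
    return 0;
}

/* Upper-bound estimate of added cycles, assuming every D-Cache miss hits
 * the E-Cache and no miss latency overlaps with other work. */
static unsigned long added_load_cycles(unsigned long dcache_misses,
                                       enum sram_mode mode)
{
    return dcache_misses * ecache_hit_extra_cycles(mode);
}
```

With a D-Cache miss count taken from the on-chip counters, added_load_cycles gives a quick indication of how much a loop could gain from better data placement or from scheduling loads directly out of the E-Cache.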