Sun Microelectronics
16. Code Generation Guidelines
To increase throughput to the E-Cache, and thereby reduce how often the store
buffer fills, UltraSPARC collapses two stores to the same 16 bytes of memory into
one store. Because collapsing occurs only between two adjacent entries in the
store buffer, code should be organized so that multiple stores to the same
“region” of memory are issued sequentially (in increasing or decreasing order).
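The effect of store ordering can be sketched with a small C helper. This is only an illustrative model, not a description of the hardware: it assumes that “the same 16 bytes” means the same 16-byte-aligned block, and the function names same_16_byte_block and adjacent_collapsible_pairs are hypothetical. The helper counts how many back-to-back stores in a sequence fall in the same block, that is, how many collapsing opportunities a given ordering offers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption (not stated in the text): "the same 16 bytes" is taken to mean
 * the same 16-byte-aligned block, i.e. equal address bits above bit 3. */
int same_16_byte_block(uint64_t a, uint64_t b)
{
    return (a >> 4) == (b >> 4);
}

/* Hypothetical helper: counts pairs of consecutive stores that target the
 * same 16-byte block. Only adjacent stores are considered, mirroring the
 * rule that collapsing happens only between adjacent store buffer entries. */
size_t adjacent_collapsible_pairs(const uint64_t *store_addrs, size_t n)
{
    size_t pairs = 0;
    for (size_t i = 1; i < n; i++) {
        if (same_16_byte_block(store_addrs[i - 1], store_addrs[i])) {
            pairs++;
        }
    }
    return pairs;
}
```

Under this model, stores issued in the interleaved order 0x100, 0x200, 0x104, 0x204 offer no adjacent pair in the same block, whereas the grouped order 0x100, 0x104, 0x200, 0x204 offers two.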
16.3.8 Read-After-Write and Write-After-Read Hazards
A Read-After-Write (RAW) hazard occurs when a load is issued to the same address
as an older outstanding store. UltraSPARC does not provide direct bypassing
from intermediate stages of the store buffer to the various pipes, so a RAW
hazard may result in pipeline stalls.
Most RAW hazards can be eliminated by proper register allocation and by
eliminating spurious loads. Disassembled traces of various programs showed that
most RAWs were “false” RAWs, which can be eliminated. However, some RAWs
were “true” RAWs; they occur because two data structures point to the same
memory location (through array indexes or pointers) and the code has no
knowledge that there could be a match between them. To simplify the hardware,
the full 40 physical address bits are not used when comparing the address of the
memory location requested by the load with the addresses associated with the
stores in the store buffer. The rules are:
• The physical tag of the address is ignored
• If the load hits the D-Cache, bits <13:0> of the address are used for
  comparison (byte granularity)
• If the load misses the D-Cache, bits <13:4> of the address are used for
  comparison (sub-block granularity)
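The comparison rules above can be expressed as a short C sketch. It is a simplified model under stated assumptions: only the starting addresses of the load and the store are compared, access sizes are ignored, and the names RAW_MASK_HIT, RAW_MASK_MISS, raw_detected, and may_raw are hypothetical and not part of any UltraSPARC interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Masks for the address bits used in the comparison. The physical tag
 * (the bits above bit 13) is ignored in both cases. */
#define RAW_MASK_HIT  UINT64_C(0x3FFF)  /* bits <13:0>, D-Cache hit  */
#define RAW_MASK_MISS UINT64_C(0x3FF0)  /* bits <13:4>, D-Cache miss */

/* Simplified model: returns true when the load address matches the store
 * address in the compared bits. Access sizes are not modeled (assumption). */
bool raw_detected(uint64_t load_addr, uint64_t store_addr, bool dcache_hit)
{
    uint64_t mask = dcache_hit ? RAW_MASK_HIT : RAW_MASK_MISS;
    return ((load_addr ^ store_addr) & mask) == 0;
}

/* Conservative check covering both hits and misses: the <13:4> comparison
 * matches whenever the <13:0> comparison would. */
bool may_raw(uint64_t load_addr, uint64_t store_addr)
{
    return raw_detected(load_addr, store_addr, false);
}
```

In this model, a load from 0x10002008 and a store to 0x50006008 are reported as a RAW even though the full addresses differ, because bits <13:0> are identical. A load from 0x2008 and a store to 0x2000 match only when the load misses the D-Cache.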
To cover both cache hits and cache misses, one should try to avoid RAWs at
16-byte granularity (using bits <13:4>). Even if a RAW occurs, the pipeline
is not stalled until a use of the load data enters the pipeline (similar to the
way loads are handled during D-Cache misses). Code Example 16-4 shows an
example of back-to-back instructions causing a RAW hazard and a load-use. In the
best scenario (that is, when the store buffer and load buffer are empty) the RAW
hazard stalls the pipe for 8 cycles (versus one cycle for the normal load-use stall).
This is mainly because the store data enters the store buffer late in the
pipe and because the load buffer must wait until the data is in the D-Cache
before it can access it.