
Sun Microelectronics
233
13. UltraSPARC Extended Instructions
taken, so the trap handler need not consider pending block loads. If the BLD
overlaps a previous or later store and there is no intervening MEMBAR, trap, or
data reference, the BLD may return data from before or after the store.
BST does not follow memory model ordering with respect to loads, stores or
flushes. In particular, read-after-write, write-after-write, flush-after-write, and
write-after-read hazards to overlapping addresses are not detected. The side effects
bit associated with the access is ignored. If ordering with respect to earlier
or later loads or stores is important, then there must be an intervening reference
to the load data (for earlier loads), or an appropriate MEMBAR instruction. This
restriction does not apply when a trap is taken, so the trap handler does not have to
worry about pending block stores. If the BST overlaps a previous load and there
is no intervening load data reference or MEMBAR #LoadStore instruction, the
load may return data from before or after the store and the contents of the block
are undefined. If the BST overlaps a later load and there is no intervening trap or
MEMBAR #StoreLoad instruction, the contents of the block are undefined. If
the BST overlaps a later store or flush and there is no intervening trap or
MEMBAR #StoreStore instruction, the contents of the block are undefined.
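The three BST cases above each pair an overlapping access with one MEMBAR ordering constraint. As a rough illustration only, the following C sketch tabulates that pairing. The enum names and the function `bst_membar_for` are hypothetical and do not come from any Sun header; the bit values are assumed to follow the SPARC-V9 MEMBAR mmask encoding.

```c
#include <assert.h>

/* MEMBAR mmask bits, assumed to match the SPARC-V9 encoding. */
enum membar_mask {
    MEMBAR_NONE       = 0x0,
    MEMBAR_LOADLOAD   = 0x1,  /* #LoadLoad   */
    MEMBAR_STORELOAD  = 0x2,  /* #StoreLoad  */
    MEMBAR_LOADSTORE  = 0x4,  /* #LoadStore  */
    MEMBAR_STORESTORE = 0x8   /* #StoreStore */
};

/* Hypothetical names: which access overlaps the block store (BST). */
enum bst_overlap {
    EARLIER_LOAD,  /* load issued before the BST  */
    LATER_LOAD,    /* load issued after the BST   */
    LATER_STORE,   /* store issued after the BST  */
    LATER_FLUSH    /* flush issued after the BST  */
};

/*
 * Return the MEMBAR mask bit that must separate the BST from the
 * overlapping access. Per the text, a reference to the load data
 * (earlier load) or a trap (later accesses) also removes the hazard;
 * this sketch covers only the MEMBAR alternative.
 */
enum membar_mask bst_membar_for(enum bst_overlap kind)
{
    switch (kind) {
    case EARLIER_LOAD: return MEMBAR_LOADSTORE;
    case LATER_LOAD:   return MEMBAR_STORELOAD;
    case LATER_STORE:
    case LATER_FLUSH:  return MEMBAR_STORESTORE;
    }
    assert(!"unknown overlap kind");
    return MEMBAR_NONE;
}
```

For example, `bst_membar_for(LATER_LOAD)` yields `MEMBAR_STORELOAD`, matching the MEMBAR #StoreLoad case in the text.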
Block load and store operations do not obey the ordering restrictions of the
currently selected processor memory model (TSO, PSO, or RMO); block operations
always execute under an RMO memory ordering model. Explicit MEMBAR
instructions are required to order block operations among themselves or with
respect to normal loads and stores. In addition, block operations do not conform to
dependence order on the issuing processor; that is, no read-after-write or
write-after-read checking occurs between block loads and stores. Explicit MEMBARs
are required to enforce dependence ordering between block operations that
reference the same address.
Typically, BLD and BST will be used in loops where software can ensure that
there is no overlap between the data being loaded and the data being stored. The
loop will be preceded and followed by the appropriate MEMBARs to ensure that
there are no hazards with loads and stores outside the loops. Code Example 13-5
on page 234 illustrates the inner loop of a byte-aligned block copy operation.
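The loop structure described in this paragraph can be modeled in portable C. This is a sketch under stated assumptions, not UltraSPARC code: `membar_model`, `bld_model`, `bst_model`, and `block_copy_model` are hypothetical stand-ins for the MEMBAR, BLD, and BST instructions, the 64-byte block size is assumed, and the real instructions' alignment requirements are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 64u  /* assumed size of one block load/store */

/* Hypothetical stand-ins for the hardware instructions. */
static unsigned membar_count;  /* counts modeled MEMBARs */

static void membar_model(void) { membar_count++; }

static void bld_model(uint8_t regs[BLOCK_BYTES], const uint8_t *src)
{
    memcpy(regs, src, BLOCK_BYTES);  /* models BLD into FP registers */
}

static void bst_model(uint8_t *dst, const uint8_t regs[BLOCK_BYTES])
{
    memcpy(dst, regs, BLOCK_BYTES);  /* models BST from FP registers */
}

/*
 * Copy nblocks blocks from src to dst.
 * Returns 0 on success, or -1 if the regions overlap, since software
 * must ensure there is no overlap between loaded and stored data.
 */
int block_copy_model(uint8_t *dst, const uint8_t *src, size_t nblocks)
{
    size_t len = nblocks * BLOCK_BYTES;
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    uint8_t regs[BLOCK_BYTES];
    size_t i;

    assert(dst != NULL && src != NULL);

    if (d < s + len && s < d + len)
        return -1;  /* overlapping regions: refuse to copy */

    membar_model();  /* order against loads and stores before the loop */
    for (i = 0; i < nblocks; i++) {
        bld_model(regs, src + i * BLOCK_BYTES);
        bst_model(dst + i * BLOCK_BYTES, regs);
    }
    membar_model();  /* order against loads and stores after the loop */
    return 0;
}
```

The model places one barrier before the loop and one after it. Which MEMBAR mask each barrier needs on real hardware depends on the surrounding loads and stores.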