User’s Manual
IBM PowerPC 750GX and 750GL RISC Microprocessor
Instruction Timing
March 27, 2006
Table 6-9 shows load-and-store instruction latencies. For pipelined load/store instructions, the Cycles column
gives the total latency and the throughput, in cycles, separated by a colon (latency:throughput).
Table 6-9. Load-and-Store Instructions (Page 1 of 4)

| Instruction | Mnemonic | Primary Opcode | Extended Opcode | Unit | Cycles | Serialization |
|---|---|---|---|---|---|---|
| Data Cache Block Flush | dcbf | 31 | 86 | LSU | 3:5 | Execution |
| Data Cache Block Invalidate | dcbi | 31 | 470 | LSU | 3:3 | Execution |
| Data Cache Block Store | dcbst | 31 | 54 | LSU | 3:5 | Execution |
| Data Cache Block Touch | dcbt | 31 | 278 | LSU | 2:1 | — |
| Data Cache Block Touch for Store | dcbtst | 31 | 246 | LSU | 2:1 | — |
| Data Cache Block Set to Zero | dcbz | 31 | 1014 | LSU | 3:6 | Execution |
| External Control In Word Indexed | eciwx | 31 | 310 | LSU | 2:1 | — |
| External Control Out Word Indexed | ecowx | 31 | 438 | LSU | 2:1 | — |
| Instruction Cache Block Invalidate | icbi | 31 | 982 | LSU | 3:4 | Execution |
| Load Byte and Zero | lbz | 34 | — | LSU | 2:1 | — |
| Load Byte and Zero with Update | lbzu | 35 | — | LSU | 2:1 | — |
| Load Byte and Zero with Update Indexed | lbzux | 31 | 119 | LSU | 2:1 | — |
| Load Byte and Zero Indexed | lbzx | 31 | 87 | LSU | 2:1 | — |
| Load Floating-Point Double | lfd | 50 | — | LSU | 2:1 | — |
| Load Floating-Point Double with Update | lfdu | 51 | — | LSU | 2:1 | — |
| Load Floating-Point Double with Update Indexed | lfdux | 31 | 631 | LSU | 2:1 | — |
| Load Floating-Point Double Indexed | lfdx | 31 | 599 | LSU | 2:1 | — |
| Load Floating-Point Single | lfs | 48 | — | LSU | 2:1 | — |
| Load Floating-Point Single with Update | lfsu | 49 | — | LSU | 2:1 | — |
| Load Floating-Point Single with Update Indexed | lfsux | 31 | 567 | LSU | 2:1 | — |
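The Primary Opcode and Extended Opcode columns can be matched against a raw 32-bit instruction word. The following C sketch illustrates one way to do that for the rows on this page. It relies on the PowerPC instruction layout, in which the primary opcode occupies bits 0-5 (the six most-significant bits) and, for X-form instructions, the extended opcode occupies bits 21-30. The names `lsu_timing`, `lsu_table`, and `lsu_lookup` are hypothetical and are not part of any IBM tool or library; the table data is copied from the rows above.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of Table 6-9 (hypothetical structure, for illustration only). */
typedef struct {
    const char *mnemonic;
    int primary;     /* primary opcode, instruction bits 0-5          */
    int extended;    /* extended opcode, bits 21-30; -1 if none       */
    int latency;     /* Cycles column, number before the colon        */
    int throughput;  /* Cycles column, number after the colon         */
} lsu_timing;

/* Rows transcribed from this page of Table 6-9. */
static const lsu_timing lsu_table[] = {
    {"dcbf",   31,   86, 3, 5}, {"dcbi",   31,  470, 3, 3},
    {"dcbst",  31,   54, 3, 5}, {"dcbt",   31,  278, 2, 1},
    {"dcbtst", 31,  246, 2, 1}, {"dcbz",   31, 1014, 3, 6},
    {"eciwx",  31,  310, 2, 1}, {"ecowx",  31,  438, 2, 1},
    {"icbi",   31,  982, 3, 4}, {"lbz",    34,   -1, 2, 1},
    {"lbzu",   35,   -1, 2, 1}, {"lbzux",  31,  119, 2, 1},
    {"lbzx",   31,   87, 2, 1}, {"lfd",    50,   -1, 2, 1},
    {"lfdu",   51,   -1, 2, 1}, {"lfdux",  31,  631, 2, 1},
    {"lfdx",   31,  599, 2, 1}, {"lfs",    48,   -1, 2, 1},
    {"lfsu",   49,   -1, 2, 1}, {"lfsux",  31,  567, 2, 1},
};

/* Bits 0-5 in PowerPC numbering are the top six bits of the word. */
static int primary_opcode(uint32_t insn)
{
    return (int)(insn >> 26);
}

/* Bits 21-30: drop bit 31 (the low bit), then keep ten bits. */
static int extended_opcode(uint32_t insn)
{
    return (int)((insn >> 1) & 0x3FFu);
}

/* Return the matching table row, or NULL if the word is not listed here. */
static const lsu_timing *lsu_lookup(uint32_t insn)
{
    int po = primary_opcode(insn);
    int xo = extended_opcode(insn);

    for (size_t i = 0; i < sizeof lsu_table / sizeof lsu_table[0]; i++) {
        const lsu_timing *row = &lsu_table[i];
        if (row->primary != po)
            continue;
        /* Rows without an extended opcode match on the primary alone. */
        if (row->extended < 0 || row->extended == xo)
            return row;
    }
    return NULL;
}
```

For example, `lsu_lookup(0x7C6428AE)` (the encoding of `lbzx r3,r4,r5`) would return the `lbzx` row, while an instruction outside this page of the table, such as an integer add, returns `NULL`.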
1. For cache operations, the first number is the latency to finish a single instruction; the second is the throughput for
back-to-back cache operations. Throughput can be larger than the initial latency because additional cycles might be needed to
complete the instruction in the cache, which stays busy and keeps subsequent cache operations from executing.
2. The throughput number of six cycles for dcbz assumes that the access is to nonglobal (M = 0) address space. For global
address space, throughput is at least 11 cycles.
3. Load/store multiple/string instruction cycles are represented as a fixed number of cycles plus a variable number of cycles, where n
is the number of words accessed by the instruction.
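As a rough illustration of how the latency:throughput pairs might be used, the C sketch below parses a Cycles entry and estimates the time for a run of back-to-back identical operations. The estimate (latency for the first operation plus one throughput interval for each additional operation) is a simplifying assumption, not a formula given in this manual; it ignores cache misses, serialization stalls, and contention with other units. The function names `parse_cycles` and `back_to_back_cycles` are hypothetical.

```c
#include <stdio.h>

/* Parse a Cycles cell such as "3:5" into latency and throughput.
 * Returns 1 on success, 0 if the text is not in "L:T" form. */
static int parse_cycles(const char *cell, int *latency, int *throughput)
{
    return sscanf(cell, "%d:%d", latency, throughput) == 2;
}

/* Assumed model: the first operation finishes after `latency` cycles and
 * each later one finishes `throughput` cycles after its predecessor. */
static long back_to_back_cycles(int latency, int throughput, long n)
{
    if (n <= 0)
        return 0;
    return (long)latency + (n - 1) * (long)throughput;
}
```

Under this model, eight consecutive dcbz operations to nonglobal space (3:6) take about 3 + 7 × 6 = 45 cycles. Using the 11-cycle minimum throughput from note 2 for global space gives a lower bound of 3 + 7 × 11 = 80 cycles. Four consecutive lbz loads (2:1) take about 2 + 3 × 1 = 5 cycles.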