Introduction
19
Programming the MIPS32® 74K™ Core Family, Revision 02.14
later than an ALU instruction with the same dependency. That’s usually a three-cycle delay, because most ALU
operations already take an extra clock to produce their result.
It’s like the skewed pipeline which experts in MIPS Technologies’ 24K® family might remember, and has the same
motivation: ALU operations dependent on recent loads are more common than loads dependent on recent ALU
operations.
1.4.4 Queues, Resource limits and Consequences
Queues which can fill up include:
• Cache refills in flight: the limit depends on the size of the “FSB” queue (this and other queues are described
in more detail under Section 3.3, "Reads, writes and synchronization"). The CPU does not wait for a cache refill
process, at least not until it needs data from the cache miss. But in practice most load data is used almost at
once, so the CPU will stop very soon after a miss. As a result, you’re unlikely to ever have four refills in
flight unless you are using prefetch or otherwise deliberately optimizing loops. If a series of aggressive
prefetches miss often enough, the fourth outstanding load-miss will use the last FSB entry, preventing further
loads from graduating and eventually blocking up the whole CPU until the load data returns. It’s likely to be
good practice for code making conscious use of prefetches to ration itself to a number of operations slightly
less than the size of the FSB.
• Non-blocking loads to registers (nine): there are nine entries in the “LDQ”, each of which remembers one
outstanding load, and which register the data is destined to return to. Compiled code is unlikely to reach this
limit. If you write carefully optimized code where you try to fill load-use delays (perhaps for data you think
will not hit in the D-cache) you may hit this problem.
• Lines evicted from the cache awaiting writeback (4+): writes are collected in the “WBB” queue. The 74K core’s
ability to write data will in almost all circumstances exceed the bandwidth available to memory; so a long enough
burst of uncached or write-through writes will eventually slow to memory speed. Otherwise, you’re unlikely to
suffer from this.
• Queues in the coprocessor interface: the 74K core hides its out-of-order character from any coprocessors, so
coprocessor hardware need be no more complicated than it is for MIPS Technologies’ 24K core. The coprocessor
hardware sees its instructions strictly in order. Each coprocessor instruction also makes its own way through the
integer execution unit. Between the execution unit and coprocessor there are some queues which can fill up:
    • IOIQ (8 entries): instructions being issued, strictly in program order, to a coprocessor.
    • CBIDQ (8 entries): data being returned from a coprocessor by an instruction which writes a GP register.
      Prior to graduation the data goes back to a completion buffer (hence the queue acronym).
    • CLDQ (8 entries): tracks data being loaded to coprocessor registers (the job done for the GPRs by the LDQ
      above). CLDQ data isn’t necessarily provided in instruction sequence: in particular MIPS Technologies’
      floating-point unit accepts FP load data as and when it arrives, making FP loads non-blocking.
The dispatch process stalls (flooding the ALU and AGEN pipes with bubbles) when there is no space in any of
these queues.
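
The advice above about rationing prefetches can be illustrated with a small C sketch. It is not taken from the
manual: the function name, the 32-byte line size and the budget of three prefetched lines (one fewer than the
four FSB entries the text implies) are assumptions made for illustration, and `__builtin_prefetch` is the
GCC/Clang builtin, which typically compiles to a prefetch instruction where the target has one. The loop issues
at most one prefetch per cache line and never reaches more than the budgeted number of lines ahead, which should
roughly bound the prefetch misses outstanding at any moment.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration only (not stated in the manual text above):
 * 32-byte cache lines, and a budget of three prefetched lines in flight,
 * i.e. one fewer than a four-entry FSB. */
#define LINE_BYTES      32u
#define PREFETCH_BUDGET 3u

/* Hypothetical helper: sum an array while prefetching a bounded distance ahead. */
uint32_t sum_with_rationed_prefetch(const uint32_t *data, size_t n)
{
    const size_t words_per_line = LINE_BYTES / sizeof(uint32_t);
    const size_t ahead = PREFETCH_BUDGET * words_per_line;
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        /* One prefetch per line, and only while the target is inside the array. */
        if (i % words_per_line == 0 && i + ahead < n)
            __builtin_prefetch(&data[i + ahead], 0 /* read */, 3 /* high locality */);
        sum += data[i];
    }
    return sum;
}
```

Calling `sum_with_rationed_prefetch(buf, count)` returns the same total as a plain loop; only the memory access
timing differs. The right budget for real code depends on the actual FSB size and memory latency, so treat the
constants as starting points to be tuned by measurement.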
Document Number MD00541, Revision 02.14, March 30, 2011