MIPS MIPS32 74K Series Скачать руководство пользователя страница 15

Страница: 15 / 156

Introduction

Programming the MIPS32® 74K™ Core Family, Revision 02.14

units and each has a phrase (in italics) summarizing what it does. The three-letter acronyms match those found in the
detailed descriptions, and the pipeline stage names used in the detailed descriptions are across the top. To simplify the
picture the integer multiply unit and the (optional) floating point unit have been omitted — once you figure out what’s
going on, they shouldn’t be too hard to put back. So:

•

The 74K core’s instruction fetch unit (“IFU”) is semi-autonomous. It’s 128 bits wide, and handles four instruc-
tions at a bite.

The IFU works a bit like a dog being taken for a walk. It rushes on ahead as long as the lead will stretch (the IFU,
processing instructions four at a time, can rapidly get ahead). Even though you’re in charge, your dog likes to go
first - and so it is with the IFU. Like a dog, the IFU guesses where you want to go, strongly influenced by the way
you usually go. If you make an unexpected turn there is a brief hiatus while the dog comes back and gets up front
again.

The IFU has a queue to keep instructions in when it’s running ahead of the rest of the CPU. This kind of design is
called a “decoupled” IFU.

•

Issue: the IDU (“instruction decode/dispatch unit”) keeps its own queue of instructions and tries to find two of
them which can be issued in parallel. The instruction set is strictly divided into AGEN instructions (loads, stores,
prefetch, cacheops; conditional moves, branches and jumps) and ALU (everything else). If all else is good, the
IDU can issue one instruction of each type in every cycle. Instructions are marked with their place in the program
sequence, but are not necessarily issued in order. An instruction may leapfrog ahead of program order in the
IDU’s queue, if all the data it needs is ready (or at least will be ready by the time it’s needed).

Instructions which execute ahead of time can’t write data to real registers — that would disrupt the operation of
their program predecessors, which might execute later. It may turn out that such an instruction shouldn’t have run
at all if there was a mispredicted branch, or an earlier-in-program-order instruction took an exception. Instead,
each instruction is assigned a completion buffer (CB) entry to receive its result. The CB entry also keeps informa-
tion about the instruction and where it came from. An instruction which is dependent on this one for a source reg-
ister value but runs soon afterward can get its data from the CB. CB-resident values can be found through the
rename map; that map is indexed by register number and points to the CB reserved by the instruction which will
write or has written a register value.

•

out-of-order execution: the effect of the above is that instructions are issued in “dataflow” order, as determined by
their dependencies on register values produced by other instructions. Up to 32 instructions can be somewhere
between available for issue and completed in the 74K core — those instructions are often said to be in flight. The
32 possible instructions correspond to 32 CB entries — 14 for AGEN instructions, 18 for ALU instructions.

Inside the “execution” box the AGEN and ALU instructions proceed strictly through two internally-pipelined
units of the same names. The two pipelines are in lockstep, and are kept that way. This sounds rigid, but is help-
ful. When the IDU issues an instruction, it does not have to know that an instruction’s data is ready “right now”:
it’s enough that the instruction producing that data is far enough along either execution pipeline. When no other
progress can be made its probably best to think of the IDU issuing a “no-op” or “bubble” into either or both pipe-
lines.

Most of the time the execution pipelines just keep running — the IDU tries to detect any reason why an instruc-
tion cannot run through either the AGEN or ALU pipe.When dependent instructions run close together, the data
doesn’t have time to go into a register or CB entry and be read out again. Instead it can flow down a dedicated
bypass connection between two particular pipestages — a routine trick used in pipelined logic. In the 74K core
there are bypasses interconnecting the AGEN and ALU pipelines, as well as within each pipeline. But whereas
pipeline multiplexing in a conventional design is controlled by comparing register numbers, in 74K cores we
compare completion buffer entry IDs.