March 2002 Release
Virtex-II Pro™ Platform FPGA Documentation
Instruction Performance
The PPC405 can execute instructions following a load miss or non-cacheable load if those
subsequent instructions do not have a load-use dependency on the load data. When
possible, the instruction using the load data should be separated from the load instruction
by as many non-use instructions as possible. This enables the processor to continue
executing instructions with minimal delay while the load data is accessed.
Scalar Store Instructions
Cacheable store instructions that miss in the data cache are queued by the data-cache unit
so that they appear to execute in a single cycle (if the store is aligned properly).
Non-cacheable store instructions are handled in the same way. Under certain conditions, the
data-cache unit can queue up to three store instructions (see … for more information).
All aligned stwcx. instructions execute in two cycles.
String and Multiple Instructions
The access time for load/store string and load/store multiple instructions depends on the
alignment of the data being accessed.
String instructions are decomposed by the processor into multiple word-aligned accesses.
The execution time for string instructions is calculated as follows (assuming data-cache
hits):
• Access to the leading bytes consumes one cycle. Unused bytes are discarded if the
  leading bytes are not aligned on a word boundary.
• Access to the intermediate bytes consumes one cycle for each word accessed.
• Access to the trailing bytes consumes one cycle. Unused bytes are discarded if the
  trailing bytes are not aligned on a word boundary.
Figure D-1 shows an example of a 21-byte string with unaligned leading and trailing bytes.
Shaded boxes represent bytes outside the string that are discarded by the processor.
In this example, access to the string requires six cycles, assuming data-cache hits.
This is calculated as follows:
• One cycle is required to access the bytes at addresses 1, 2, and 3. The byte at address 0
  is also accessed but discarded.
• Four cycles are required to access the four words at addresses 4, 8, 12, and 16 (one
  cycle for each word).
• One cycle is required to access the bytes at addresses 20 and 21. The bytes at addresses
  22 and 23 are also accessed but discarded.
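The cycle count above can be reproduced with a short calculation. The following Python sketch is illustrative only and is not part of the PPC405 documentation: the function name `string_access_cycles` and its parameters are hypothetical, and it assumes data-cache hits and exactly one cycle per word-aligned access touched by the string, as described in the rules above.

```python
WORD_BYTES = 4  # a PPC405 word is four bytes


def string_access_cycles(start_addr: int, length: int) -> int:
    """Estimate cycles for a string access (hypothetical helper).

    Assumes data-cache hits and one cycle for every word-aligned
    access that contains at least one byte of the string.
    """
    if length <= 0:
        raise ValueError("length must be a positive byte count")
    first_word = start_addr // WORD_BYTES                # word holding the first byte
    last_word = (start_addr + length - 1) // WORD_BYTES  # word holding the last byte
    return last_word - first_word + 1


# The 21-byte string starting at address 1 touches the words at 0, 4, 8, 12, 16, and 20.
print(string_access_cycles(1, 21))  # prints 6
```

A string that starts and ends on word boundaries has no partially used leading or trailing word, so under the same assumptions the estimate reduces to the number of words in the string.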
Load/store multiple instructions are also decomposed by the processor into multiple
word-aligned accesses. Unaligned words are assembled (loads) or disassembled (stores)
by the processor during the access. The execution time for these instructions is calculated
as follows (assuming data-cache hits):
• Access to the leading word consumes one cycle. Unused bytes are discarded if the
  leading word is not aligned on a word boundary.
• Access to the intermediate words consumes one cycle for each word accessed.
• Access to the trailing word consumes one cycle. Unused bytes are discarded if the
  trailing word is not aligned on a word boundary.
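One way to read these rules is that a transfer of N words takes N word-aligned accesses when the starting address is aligned, and N + 1 when it is not, because each unaligned word straddles two aligned words. The Python sketch below encodes that reading. It is an interpretation, not a statement from the documentation: the function name `multiple_access_cycles` and its parameters are hypothetical, and it assumes data-cache hits and one cycle per word-aligned access.

```python
WORD_BYTES = 4  # a PPC405 word is four bytes


def multiple_access_cycles(start_addr: int, word_count: int) -> int:
    """Estimate cycles for a load/store multiple (hypothetical helper).

    Assumes data-cache hits and one cycle per word-aligned access.
    An unaligned start makes the transferred words straddle one
    extra aligned word, which costs one extra access.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    extra_access = 1 if start_addr % WORD_BYTES != 0 else 0
    return word_count + extra_access


print(multiple_access_cycles(0x100, 4))  # aligned start: prints 4
print(multiple_access_cycles(0x102, 4))  # unaligned start: prints 5
```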
Figure D-1: String Access Example (word addresses 0, 4, 8, 12, 16, and 20; string bytes 0 through 20 occupy addresses 1 through 21; the bytes at addresses 0, 22, and 23 are shaded)