Xilinx Virtex-II Pro PPC405 User Manual


March 2002 Release

www.xilinx.com

Virtex-II Pro™ Platform FPGA Documentation

1-800-255-7778

Instruction Performance

The PPC405 can execute instructions following a load miss or non-cacheable load if those
subsequent instructions do not have a load-use dependency on the load data. When
possible, the instruction using the load data should be separated from the load instruction
by as many non-use instructions as possible. This enables the processor to continue
executing instructions with minimal delay while the load data is accessed.
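As a rough source-level illustration of this scheduling advice, the sketch below issues the load for the next array element before the add that consumes the current one, so each loaded value is used one iteration after it is fetched. The function name `sum_words_early_load` is hypothetical, and a compiler is free to reorder these statements; the actual separation between a load and its use is determined by the generated PowerPC instruction sequence, so treat this as an outline of the idea rather than a guaranteed schedule.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the manual): sums n words while keeping
 * the load of element i and the add that uses it in different iterations. */
uint32_t sum_words_early_load(const uint32_t *data, size_t n)
{
    uint32_t sum = 0;
    uint32_t current;
    size_t i;

    if (n == 0)
        return 0;

    current = data[0];               /* first load, no use yet */
    for (i = 1; i < n; i++) {
        uint32_t next = data[i];     /* load for the following iteration */
        sum += current;              /* uses the value loaded earlier */
        current = next;
    }
    return sum + current;            /* consume the last loaded value */
}
```

Whether this ordering survives into the object code should be checked in the compiler's assembly output.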

Scalar Store Instructions

Cacheable store instructions that miss in the data cache are queued by the data-cache unit
so that they appear to execute in a single cycle (if the store is aligned properly).
Non-cacheable store instructions are handled in the same way. Under certain conditions, the
data-cache unit can queue up to three store instructions (see Pipeline Stalls, page 446, for
more information).

All aligned stwcx. instructions execute in two cycles.

String and Multiple Instructions

The access time for load/store string and load/store multiple instructions depends on the
alignment of the data being accessed.

String instructions are decomposed by the processor into multiple word-aligned accesses.
The execution time for string instructions is calculated as follows (assuming data-cache
hits):

• Access to leading bytes consumes one cycle. Unused bytes are discarded if the leading
bytes are not aligned on a word boundary.

• Access to intermediate bytes consumes one cycle for each word accessed.

• Access to trailing bytes consumes one cycle. Unused bytes are discarded if the trailing
bytes are not aligned on a word boundary.

Figure D-1 shows an example of a 21-byte string with unaligned leading and trailing bytes.
Shaded boxes represent bytes outside the string that are discarded by the processor.

In the above example, access to the string requires six cycles, assuming data-cache hits.
This is calculated as follows:

• One cycle is required to access the bytes at addresses 1, 2, and 3. The byte at address 0
is also accessed but discarded.

• Four cycles are required to access the four words at addresses 4, 8, 12, and 16 (one
cycle for each word).

• One cycle is required to access the bytes at addresses 20 and 21. The bytes at addresses
22 and 23 are also accessed but discarded.
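The string timing rule and the worked example above can be sketched as a small cycle-count model. This is only an illustration under stated assumptions: the function name `string_access_cycles` is hypothetical, all accesses are assumed to hit in the data cache, a string that starts on a word boundary is assumed to have no leading-byte cycle, and an empty string is modeled as zero cycles.

```c
#include <stdint.h>

/* Hypothetical model (not a Xilinx or IBM API): estimates the cycles for a
 * load/store string access, assuming data-cache hits and 4-byte words. */
unsigned string_access_cycles(uint32_t start, uint32_t length)
{
    uint32_t offset = start % 4;     /* position within the first word */
    uint32_t remaining = length;
    unsigned cycles = 0;

    if (length == 0)
        return 0;                    /* assumption: nothing to access */

    if (offset != 0) {
        /* Leading bytes: one cycle for the partial first word. */
        uint32_t lead = 4 - offset;
        if (lead > remaining)
            lead = remaining;
        remaining -= lead;
        cycles += 1;
    }

    /* Intermediate bytes: one cycle for each full word. */
    cycles += remaining / 4;

    /* Trailing bytes: one cycle for a partial last word. */
    if (remaining % 4 != 0)
        cycles += 1;

    return cycles;
}
```

Calling `string_access_cycles(1, 21)` models the 21-byte string that starts at address 1 and yields the six cycles derived above: one for the leading bytes, four for the intermediate words, and one for the trailing bytes.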

Load/store multiple instructions are also decomposed by the processor into multiple
word-aligned accesses. Unaligned words are assembled (loads) or disassembled (stores)
by the processor during the access. The execution time for these instructions is calculated
as follows (assuming data-cache hits):
