Xilinx Virtex-II Pro PPC405 User Manual

www.xilinx.com

March 2002 Release

1-800-255-7778

Virtex-II Pro™ Platform FPGA Documentation

Appendix D: Programming Considerations

Referring to Table D-1, issue-rate cycle numbers are used in the following cases:

• No operand dependency exists on a previous multiply or MAC instruction in the
multiply hardware.

• The result of a MAC instruction is used as the accumulate operand of a subsequent
MAC instruction in the multiply hardware. In this case, the processor is capable of
forwarding the required result within the time imposed by the issue-rate.

Latency cycle numbers are used in the following cases:

• No multiply or MAC instruction is present in the multiply hardware when the current
instruction is executed.

• An operand of a multiply or MAC instruction depends on the result of a previous
multiply or MAC instruction in the multiply hardware. The exception is the MAC
accumulate-operand case covered by the issue-rate rules above.
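
The issue-rate and latency rules above can be sketched as a small decision function. This is a minimal illustration and not part of the processor documentation: the names `Dependency` and `multiply_cycles` are hypothetical, and the two cycle counts are assumed to be supplied by the caller from the Table D-1 row for the operation in question.

```python
from enum import Enum


class Dependency(Enum):
    """How the current instruction depends on a multiply/MAC still in the multiply hardware."""
    NONE = "none"              # no operand dependency
    ACCUMULATE = "accumulate"  # both are MACs; earlier result is the accumulate operand
    OTHER = "other"            # any other operand dependency


def multiply_cycles(issue_rate, latency, hardware_busy, dependency=Dependency.NONE):
    """Return the cycle count that applies to a multiply or MAC instruction.

    issue_rate, latency: the two Table D-1 entries for this operation (caller-supplied).
    hardware_busy: True if an earlier multiply/MAC is still in the multiply hardware.
    """
    if not hardware_busy:
        # Nothing in the multiply hardware: latency cycles apply.
        return latency
    if dependency in (Dependency.NONE, Dependency.ACCUMULATE):
        # Independent, or MAC-to-MAC accumulate forwarding: issue-rate cycles apply.
        return issue_rate
    # Any other dependency on the earlier result: latency cycles apply.
    return latency
```

Keeping the table values as parameters means the sketch encodes only the selection rules and makes no claim about the actual cycle counts.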

Scalar Load Instructions

Cacheable load instructions that hit in the data cache usually execute in one cycle.
Cacheable and non-cacheable load instructions that hit in the data fill buffer also
usually execute in one cycle.

Because the processor pipelines load instructions, loads that hit in the cache or fill
buffer can take extra cycles. If a load instruction is followed by an instruction that uses
the loaded data, a load-use dependency exists. When the loaded data becomes available, it
is forwarded to the operand register of the dependent instruction, which prevents a
processor stall due to missing operand data. This forwarding adds one extra latency cycle
when the appropriate GPR is updated, so the load appears to execute in two cycles.
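
As a rough illustration of the load-use rule, the apparent cost of a load that hits can be written as a one-line model. The function name `load_hit_cycles` is hypothetical, and the sketch covers only the usual case described above (a hit in the data cache or fill buffer); it does not model misses or any other source of delay.

```python
def load_hit_cycles(load_use_dependency):
    """Apparent cycles for a load that hits in the data cache or data fill buffer.

    load_use_dependency: True if the next instruction uses the loaded data.
    """
    base_cycles = 1                                      # usual cost of a hit
    forwarding_cycle = 1 if load_use_dependency else 0   # extra cycle for the GPR update
    return base_cycles + forwarding_cycle
```

Under this model, placing an independent instruction between a load and its first use hides the extra cycle.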

Load Misses and Uncacheable Loads

Cacheable load misses and non-cacheable loads incur penalty cycles for accessing memory
over the PLB. These penalty cycles depend on the speed of the PLB and on when the address
acknowledge is returned over the PLB. Assuming the PLB operates at the same frequency
as the processor and the address acknowledge is returned in the same cycle in which the
data-cache unit asserts the PLB request, the number of penalty cycles is as follows:

• Six cycles if operand forwarding is enabled.

• Seven cycles if operand forwarding is not enabled.

Additional cycles are required if the system performance does not match the above
assumptions.
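
The penalty figures above can be captured in a short helper. This is a sketch under the stated assumptions (PLB at processor frequency, address acknowledge returned in the request cycle). The name `load_miss_penalty` and the `extra_cycles` parameter are hypothetical: `extra_cycles` stands in for whatever additional delay a slower system adds, which the text does not quantify.

```python
def load_miss_penalty(operand_forwarding_enabled, extra_cycles=0):
    """Penalty cycles for a cacheable load miss or a non-cacheable load.

    operand_forwarding_enabled: True if operand forwarding is enabled.
    extra_cycles: caller-supplied additional cycles when the PLB is slower than
                  the processor or the address acknowledge is delayed (assumption).
    """
    if extra_cycles < 0:
        raise ValueError("extra_cycles must be non-negative")
    base_penalty = 6 if operand_forwarding_enabled else 7
    return base_penalty + extra_cycles
```

With the default `extra_cycles=0`, the function returns exactly the best-case figures listed above.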

Table D-1: Multiply and MAC Instruction Timing

Operations                             Issue-Rate Cycles    Latency Cycles
MAC and Negative MAC
Halfword × Halfword (32-bit result)
Halfword × Word (48-bit result)
Word × Word (64-bit result)

Notes:

For the purposes of this table, words are treated as halfwords if the upper 16 bits of the operand
contain a sign extension of the lower 16 bits. For example, if the upper 16 bits of a word operand
are zero, the operand is considered a halfword when calculating execution time.
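
The halfword test in the note can be sketched as a bit check on a 32-bit operand. The function name `counts_as_halfword` is hypothetical. The sketch applies the sign-extension wording literally, so a zero upper half qualifies only when bit 15 of the lower half is also clear; the note does not spell out that corner case, so treat it as an interpretation.

```python
def counts_as_halfword(operand):
    """Return True if a 32-bit word operand would be timed as a halfword.

    The operand qualifies when its upper 16 bits are a sign extension of its
    lower 16 bits (all zeros if bit 15 is clear, all ones if bit 15 is set).
    """
    operand &= 0xFFFFFFFF                 # restrict to 32 bits
    low_half = operand & 0xFFFF
    high_half = operand >> 16
    expected_high = 0xFFFF if (low_half & 0x8000) else 0x0000
    return high_half == expected_high
```

For example, `counts_as_halfword(0x00001234)` and `counts_as_halfword(0xFFFF8000)` both return `True`, while `counts_as_halfword(0x12345678)` returns `False`.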
