March 2002 Release
Virtex-II Pro™ Platform FPGA Documentation
Appendix D: Programming Considerations
Referring to Table D-1, issue-rate cycle numbers are used in the following cases:
• No operand dependency exists on a previous multiply or MAC instruction in the multiply hardware.
• The result of a MAC instruction is used as the accumulate operand of a subsequent MAC instruction in the multiply hardware. In this case, the processor can forward the required result within the time imposed by the issue rate.
Latency cycle numbers are used in the following cases:
• No multiply or MAC instruction is present in the multiply hardware when the current instruction is executed.
• An operand of a multiply or MAC instruction depends on the result of a previous multiply or MAC instruction in the multiply hardware. The exception is the MAC accumulate-operand case covered by the issue-rate rules above.
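The two rule sets above can be summarized as a small decision function. This is a minimal sketch, not processor documentation: the function name and its three boolean parameters are hypothetical, and the sketch assumes that the accumulate-forwarding exception applies only when the sole dependency is a MAC result feeding the accumulate operand of a following MAC.

```python
def cycle_column(prev_in_multiply_hw: bool,
                 depends_on_prev_result: bool,
                 dependency_is_mac_accumulate: bool = False) -> str:
    """Return which timing column applies: 'issue-rate' or 'latency'.

    All parameter names are illustrative assumptions, not manual terms.
    """
    # Nothing in the multiply hardware: latency numbers apply.
    if not prev_in_multiply_hw:
        return "latency"
    # Operand dependency on a previous multiply/MAC result: latency numbers
    # apply, unless it is the MAC-to-MAC accumulate case, which is forwarded.
    if depends_on_prev_result and not dependency_is_mac_accumulate:
        return "latency"
    # No dependency, or the forwarded accumulate case: issue-rate numbers.
    return "issue-rate"
```

For example, a chain of MAC instructions that only accumulate into the same register would fall into the issue-rate case under this reading.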
Scalar Load Instructions
Cacheable load instructions that hit in the data cache usually execute in one cycle.
Cacheable and non-cacheable load instructions that hit in the data fill buffer also
usually execute in one cycle.
The pipelining of load instructions by the processor can cause loads that hit in the cache or
fill buffer to take extra cycles. If a load instruction is followed by an instruction that uses
the loaded data, a load-use dependency exists. When the loaded data is available, it is
forwarded to the operand register of the dependent instruction. This prevents a processor
stall from occurring due to missing operand data. This data forwarding adds an extra
latency cycle when updating the appropriate GPR. In this case, the load appears to execute
in two cycles.
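The load-use rule can be illustrated with a toy scan over an instruction list. This is a simplified sketch under stated assumptions: the `Instr` record and the `load_hit_cycles` function are hypothetical, every load is assumed to hit in the cache or fill buffer, and "followed by" is read as "immediately followed by". It models only the one-cycle and two-cycle cases described above.

```python
from typing import NamedTuple, Sequence


class Instr(NamedTuple):
    op: str            # operation label, e.g. "load" or "add" (illustrative)
    dest: str          # destination register name
    srcs: tuple = ()   # source register names


def load_hit_cycles(program: Sequence[Instr]) -> list:
    """Return the apparent cycle count (1 or 2) of each load, in order."""
    cycles = []
    for i, ins in enumerate(program):
        if ins.op != "load":
            continue
        nxt = program[i + 1] if i + 1 < len(program) else None
        # Load-use dependency: the next instruction reads the loaded register.
        load_use = nxt is not None and ins.dest in nxt.srcs
        cycles.append(2 if load_use else 1)
    return cycles


prog = [
    Instr("load", "r3", ("r1",)),
    Instr("add", "r4", ("r3", "r5")),   # uses r3 right after the load
    Instr("load", "r6", ("r1",)),
    Instr("add", "r7", ("r4", "r5")),   # does not use r6
]
print(load_hit_cycles(prog))
```

Under this model, placing an independent instruction between a load and its first use lets the load appear to execute in one cycle.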
Load Misses and Uncacheable Loads
Cacheable load misses and non-cacheable loads incur penalty cycles for accessing memory
over the PLB. These penalty cycles depend on the speed of the PLB and on when the address
acknowledge is returned over the PLB. Assuming the PLB operates at the same frequency
as the processor and that the address acknowledge is returned in the same cycle the
data-cache unit asserts the PLB request, the number of penalty cycles is as follows:
• Six cycles if operand forwarding is enabled.
• Seven cycles if operand forwarding is not enabled.
Additional cycles are required if the system performance does not match the above
assumptions.
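As a rough aid for back-of-envelope estimates, the penalty figures can be wrapped in a helper. This is a sketch only: `load_miss_penalty` and its `extra_cycles` parameter are hypothetical, and the source does not say how many additional cycles a slower PLB or a delayed address acknowledge costs, so that value must come from the actual system.

```python
def load_miss_penalty(operand_forwarding: bool, extra_cycles: int = 0) -> int:
    """Penalty cycles for a cacheable load miss or a non-cacheable load.

    Assumes the PLB runs at the processor frequency and the address
    acknowledge returns in the same cycle as the PLB request. Any deviation
    is passed in as extra_cycles (an assumption; the manual gives no figure).
    """
    if extra_cycles < 0:
        raise ValueError("extra_cycles must be non-negative")
    base = 6 if operand_forwarding else 7
    return base + extra_cycles


# Total penalty for 100 misses with operand forwarding enabled,
# under the ideal-PLB assumptions stated above.
total = 100 * load_miss_penalty(operand_forwarding=True)
```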
Table D-1: Multiply and MAC Instruction Timing

| Operations | Issue-Rate Cycles | Latency Cycles |
|---|---|---|
| MAC and Negative MAC | 1 | 2 |
| Halfword × Halfword (32-bit result) | 1 | 2 |
| Halfword × Word (48-bit result) | 2 | 3 |
| Word × Word (64-bit result) | 4 | 5 |

Notes:
For the purposes of this table, words are treated as halfwords if the upper 16 bits of the operand
contain a sign extension of the lower 16 bits. For example, if the upper 16 bits of a word operand
are zero, the operand is considered a halfword when calculating execution time.
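The halfword rule in the note can be expressed as a bit test combined with a lookup of the multiply rows of Table D-1. This is a hedged sketch: the names `TIMING`, `is_halfword`, and `multiply_timing` are hypothetical, operand order is assumed not to matter, and the test follows the sign-extension sentence literally, so a value such as 0x00008000 (upper 16 bits zero, bit 15 set) is classified as a word here even though the note's example could be read differently.

```python
# (issue-rate cycles, latency cycles) for the multiply rows of Table D-1.
TIMING = {
    ("halfword", "halfword"): (1, 2),
    ("halfword", "word"): (2, 3),
    ("word", "word"): (4, 5),
}


def is_halfword(value: int) -> bool:
    """True if bits 31..16 of the 32-bit operand all equal bit 15."""
    value &= 0xFFFFFFFF          # view the operand as a 32-bit pattern
    top17 = value >> 15          # bits 31..15
    return top17 == 0 or top17 == 0x1FFFF


def multiply_timing(a: int, b: int) -> tuple:
    """Return (issue-rate, latency) cycles for multiplying a by b."""
    sizes = sorted("halfword" if is_halfword(v) else "word" for v in (a, b))
    return TIMING[tuple(sizes)]


print(multiply_timing(3, -2))              # both operands fit in a halfword
print(multiply_timing(0x00012345, 7))      # one word, one halfword
print(multiply_timing(0x12345678, 0x00010000))
```

Keeping multiplier operands within the signed 16-bit range is what lets a word multiply be timed as a halfword multiply under this rule.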