PERFORMANCE CONSIDERATIONS
The Intel486 processor’s on-chip cache dramatically speeds floating-point loads and stores. On
an Intel386 processor with a math coprocessor, an instruction such as FLD (floating-point load)
takes 14 to 20 clock cycles if any external memory addressing is required. Once the operands are
on the internal stack, the floating-point add instruction takes 23 to 31 cycles to execute,
depending on the value of the operands. Finally, an external memory store can take 11 to 44 cycles.
Because the floating-point unit of the Intel486 processor is integrated, the entire operation
executes in fewer cycles. Data from external memory can be cached, after which the floating-point
unit can access it and load it onto the stack in three cycles on a cache hit. The floating-point
add instruction takes between 8 and 20 cycles, depending on the value of the operands. Finally,
the store instruction takes 7 clocks.
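The comparison above can be tallied with a short sketch. It simply adds the per-step cycle ranges quoted in the text for one load, one add, and one store. The dictionary and function names are illustrative assumptions, the "11-44" store figure is read as a range, and the Intel486 load figure assumes a cache hit. A plain sum ignores any overlap between steps, so treat the result as a rough estimate rather than a measured figure.

```python
# Cycle ranges (min, max) quoted in the text for a load / add / store sequence.
# Names are illustrative; they do not come from the manual.
I386_WITH_COPROCESSOR = {"load": (14, 20), "add": (23, 31), "store": (11, 44)}
I486_INTEGRATED_FPU = {"load": (3, 3), "add": (8, 20), "store": (7, 7)}  # load assumes a cache hit


def total_cycles(costs):
    """Sum the best-case and worst-case cycle counts across all steps."""
    best = sum(lo for lo, _ in costs.values())
    worst = sum(hi for _, hi in costs.values())
    return best, worst


i386_total = total_cycles(I386_WITH_COPROCESSOR)
i486_total = total_cycles(I486_INTEGRATED_FPU)

print("Intel386 + coprocessor:", i386_total)
print("Intel486 integrated FPU:", i486_total)
print("best-case ratio: %.2f" % (i386_total[0] / i486_total[0]))
print("worst-case ratio: %.2f" % (i386_total[1] / i486_total[1]))
```

The ratio for this single sequence need not match the figure quoted for whole numerics-intensive routines, since real code has a different mix of loads, stores, and compute instructions.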
Because the Intel486 processor provides higher performance not only for floating-point loads
and stores but also for floating-point compute operations, numerics-intensive routines see a
3x to 4x performance boost. A large portion of the improvement is attributed to the fact that
synchronous floating-point transfers occur on-chip.
9.9.2 Performance of the Floating-Point Unit
To achieve three to four times the floating-point performance of a non-integrated math
coprocessor, the Intel486 processor’s floating-point circuitry has been enhanced to reduce the
number of clocks needed to execute frequently used instructions. Also, the interface to the
processor’s registers and buses is much more efficient because all of the interacting units are
on the same chip. Table 9-3 shows the number of clock counts per instruction on the Intel486
processor.
Table 9-3. Floating-Point Instruction Execution

Instruction                 Clock Counts (Intel486™ Processor)
FLD (load)                  3
FST (store)                 3
FADD/FSUB                   8-20
FMUL (floating multiply)    16
FDIV (floating divide)      73
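
As a rough illustration of how the clock counts in Table 9-3 might be used, the sketch below estimates a cycle budget for an instruction mix. The values are copied from the table as printed. The function name and the example mix (a four-element dot product) are hypothetical and do not come from the manual, and the estimate ignores cache misses and any overlap between instructions.

```python
# Clock counts (min, max) per instruction, as printed in Table 9-3.
TABLE_9_3_CLOCKS = {
    "FLD": (3, 3),
    "FST": (3, 3),
    "FADD": (8, 20),
    "FSUB": (8, 20),
    "FMUL": (16, 16),
    "FDIV": (73, 73),
}


def estimate_clocks(instruction_counts):
    """Return (best, worst) total clocks for a mapping of mnemonic -> count."""
    best = worst = 0
    for mnemonic, count in instruction_counts.items():
        lo, hi = TABLE_9_3_CLOCKS[mnemonic]
        best += lo * count
        worst += hi * count
    return best, worst


# Hypothetical mix for a four-element dot product:
# 8 loads, 4 multiplies, 3 adds, 1 store.
dot_product_mix = {"FLD": 8, "FMUL": 4, "FADD": 3, "FST": 1}
dot_product_clocks = estimate_clocks(dot_product_mix)
print("estimated clocks (best, worst):", dot_product_clocks)
```

Keeping the counts as (min, max) pairs lets the operand-dependent FADD/FSUB range carry through to the total instead of being collapsed to a single guess.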