4-2
Performance
AMD-K5 Processor Technical Reference Manual
18524C/0—Nov1996
Moreover, future implementations may increase the penalties
associated with microcoded instructions.
■
Dependencies—Spread out true dependencies to increase
the opportunities for parallel execution. Antidependencies
(write-after-read) and output dependencies (write-after-write)
do not impact performance.
■
Memory Operands—Instructions that operate on data in
memory (load/op/store) can inhibit parallelism. Using
separate move and ALU instructions allows independent
operations to be performed in parallel. On the other hand, if
there are no opportunities for parallel execution, use the
load/op/store forms to reduce the number of register spills
(storing register values in memory to free registers for
other uses) and increase code density.
■
Register Operands—Maintain frequently used values in
registers or on the stack rather than in static storage.
■
Branch Prediction—Use control-flow constructs that allow
effective branch prediction. Although correctly predicted
branches have no cost, mispredicted branches incur a
three-clock penalty.
■
Stack References—Use ESP for references to the stack so
that EBP remains available for general use.
■
Stack Allocation—When placing outgoing parameters on the
stack, allocate space by adjusting the stack pointer
(preferably at the same time local storage is allocated on
procedure entry) and use moves rather than pushes. This
method of allocation allows random access to the outgoing
parameters, so each one can be stored as soon as it is
calculated instead of being held somewhere else until the
procedure call. This method also uses fewer execution
resources (specifically, fewer register-file write ports
when updating ESP).
■
Shifts—Although there is only one shifter, certain shifts can
be done using other execution units: for example, shift left
by 1 by adding a value to itself. Use LEA index scaling to
shift left by 1, 2, or 3.
■
Data Embedded in Code—When data is embedded in the
code segment, align it in separate cache blocks from nearby
code to avoid some overhead in maintaining coherency
between the instruction and data caches.
■
Undefined Flags—Do not rely on the behavior of undefined
flag results.
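
The Dependencies guideline above can be illustrated with a
short C sketch. It is not taken from the manual: the function
names are hypothetical, and whether the two accumulators
actually execute in parallel depends on the compiler and on
the instructions it emits. The first function forms one chain
of true dependencies; the second splits the work into two
independent chains.

```c
#include <assert.h>
#include <stddef.h>

/* One accumulator: each add needs the result of the previous
   add, so the adds form a single chain of true dependencies. */
long sum_chained(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Two accumulators: the adds into s0 do not depend on the adds
   into s1, so the two chains are independent of each other. */
long sum_spread(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < n)          /* odd element count: one value is left */
        s0 += v[i];
    return s0 + s1;
}
```

Both functions return the same sum; only the shape of the
dependency chain differs.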
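
The Register Operands guideline above might look like this at
the source level. This is a minimal sketch under assumptions:
`scale` and `scale_all` are hypothetical names, and the
compiler, not the programmer, decides whether the local copy
ends up in a register or on the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frequently used value held in static storage. */
static int scale = 3;

/* Multiply every element by the static value. The value is
   read from static storage once and kept in a local, which the
   compiler can place in a register or on the stack. */
void scale_all(int *v, size_t n)
{
    assert(v != NULL || n == 0);
    const int k = scale;    /* single read of static storage */
    for (size_t i = 0; i < n; i++)
        v[i] *= k;
}
```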
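
The arithmetic behind the Shifts guideline above can be checked
with a small C sketch. The function names are hypothetical, and
the code shows only the identities involved; a compiler is free
to emit a shift, an add, or an LEA for any of these
expressions. In assembly the corresponding forms would be along
the lines of ADD EAX, EAX and LEA EAX, [EAX*8].

```c
#include <assert.h>
#include <stdint.h>

/* Shift left by 1 expressed as an add: x + x equals x << 1,
   including wraparound for unsigned 32-bit values. */
uint32_t shl1_by_add(uint32_t x)
{
    return x + x;
}

/* Shift left by 1, 2, or 3 expressed as multiplication by 2, 4,
   or 8, the scale factors available to LEA index scaling. */
uint32_t shl_by_scale(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 3);
    switch (n) {
    case 1:  return x * 2u;
    case 2:  return x * 4u;
    default: return x * 8u;
    }
}
```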