
AMD Athlon™ Processor x86 Code Optimization
22007E/0—November 1999
In addition, using MMX instructions increases the available
parallelism. The AMD Athlon processor can issue three integer
OPs and two MMX OPs per cycle.
Repeated String Instruction Usage
Latency of Repeated String Instructions
Table 1 shows the latency for repeated string instructions on the
AMD Athlon processor.
Table 1 lists the latencies with the direction flag (DF) = 0
(increment) and DF = 1 (decrement). These latencies assume
aligned memory operands. Note that for MOVS/STOS, when
DF = 1 (DOWN), the overhead portion of the latency increases
significantly. However, these cases are less commonly found.
To determine the latency, evaluate the formula and round up
to the nearest integer value.
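The formula-and-round-up rule can be sketched in C. This is only an illustrative calculator built from the Table 1 entries, not code from the guide: the names rep_op, rep_cost, and rep_string_latency are hypothetical, and the result is an estimate that assumes aligned memory operands, as the table does.

```c
#include <assert.h>

/* Instructions covered by Table 1 (enum names are illustrative). */
enum rep_op { REP_MOVS, REP_STOS, REP_LODS, REP_SCAS, REP_CMPS, REP_OP_COUNT };

/* Table 1 terms: latency = overhead + (num/den) * c, where c = ECX. */
struct rep_cost {
    unsigned overhead_df0;  /* overhead when DF = 0 */
    unsigned overhead_df1;  /* overhead when DF = 1 */
    unsigned num;           /* per-iteration cost numerator */
    unsigned den;           /* per-iteration cost denominator */
};

static const struct rep_cost rep_costs[REP_OP_COUNT] = {
    [REP_MOVS] = { 15, 25,  4, 3 },
    [REP_STOS] = { 14, 24,  1, 1 },
    [REP_LODS] = { 15, 15,  2, 1 },
    [REP_SCAS] = { 15, 15,  5, 2 },
    [REP_CMPS] = { 16, 16, 10, 3 },
};

/* Estimated latency in cycles, rounded up to the next integer.
 * c is the value of ECX; df is the direction flag (0 or nonzero). */
unsigned long rep_string_latency(enum rep_op op, unsigned long c, int df)
{
    assert((int)op >= 0 && op < REP_OP_COUNT);

    if (c == 0)
        return 11;  /* ECX = 0 column of Table 1 */

    const struct rep_cost *k = &rep_costs[op];
    unsigned long overhead = df ? k->overhead_df1 : k->overhead_df0;

    /* Integer ceiling of (num * c) / den. */
    return overhead + (k->num * c + k->den - 1) / k->den;
}
```

For example, REP MOVS with ECX = 100 and DF = 0 gives 15 + 133.33, which rounds up to 149 cycles; the same copy with DF = 1 gives 159 cycles.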
Guidelines for Repeated String Instructions
To help achieve good performance, this section contains
guidelines for the careful scheduling of VectorPath repeated
string instructions.
Use the Largest Possible Operand Size
Always move data using the largest operand size possible. For
example, use REP MOVSD rather than REP MOVSW and REP
MOVSW rather than REP MOVSB. Use REP STOSD rather than
REP STOSW and REP STOSW rather than REP STOSB.
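To see why the operand size matters, the REP MOVS formula from Table 1 can be applied to the same byte count at each operand size. The C sketch below is an illustration only: rep_movs_cycles is a hypothetical helper, and it assumes DF = 0, aligned operands, and a byte count that is a multiple of the operand size.

```c
#include <assert.h>

/* Estimated cycles for REP MOVSB/MOVSW/MOVSD copying `bytes` bytes,
 * using the Table 1 formula for REP MOVS with DF = 0:
 *     15 + (4/3 * c), rounded up, where c = ECX = bytes / operand_size.
 * operand_size is 1 (MOVSB), 2 (MOVSW), or 4 (MOVSD). */
unsigned long rep_movs_cycles(unsigned long bytes, unsigned operand_size)
{
    assert(operand_size == 1 || operand_size == 2 || operand_size == 4);
    assert(bytes % operand_size == 0);  /* assumption: no leftover bytes */

    unsigned long c = bytes / operand_size;  /* iteration count in ECX */
    if (c == 0)
        return 11;  /* ECX = 0 column of Table 1 */

    /* Integer ceiling of (4 * c) / 3. */
    return 15 + (4 * c + 2) / 3;
}
```

For a 1024-byte copy this estimates 1381 cycles with REP MOVSB, 698 with REP MOVSW, and 357 with REP MOVSD: the per-iteration cost is the same, so halving the iteration count roughly halves the latency.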
Table 1. Latency of Repeated String Instructions

Instruction    ECX = 0 (cycles)    DF = 0 (cycles)    DF = 1 (cycles)
REP MOVS       11                  15 + (4/3*c)       25 + (4/3*c)
REP STOS       11                  14 + (1*c)         24 + (1*c)
REP LODS       11                  15 + (2*c)         15 + (2*c)
REP SCAS       11                  15 + (5/2*c)       15 + (5/2*c)
REP CMPS       11                  16 + (10/3*c)      16 + (10/3*c)

Note: c = value of ECX (ECX > 0)