January, 2004
Developer’s Manual
Intel XScale® Core
Optimization Guide
A.5.4
Scheduling SWP and SWPB Instructions
The SWP and SWPB instructions have a 5-cycle issue latency. As a result, the
instruction following a SWP or SWPB instruction stalls for 4 cycles. SWP and SWPB
instructions should therefore be used only where absolutely needed.
For example, the following code may be used to swap the contents of two memory locations:
; Swap the contents of memory locations pointed to by r0 and r1
ldr r2, [r0]        ; r2 = [r0]
swp r2, r2, [r1]    ; [r1] = old [r0]; r2 = old [r1]
str r2, [r0]        ; [r0] = old [r1]
The code above takes 9 cycles to complete. The rewritten code below takes 6 cycles to execute:
; Swap the contents of memory locations pointed to by r0 and r1
ldr r2, [r0]
ldr r3, [r1]
str r2, [r1]
str r3, [r0]
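The load/store sequence above is not atomic, so it cannot replace SWP where the read and the write of a single location must be indivisible. One case where SWP is still warranted is acquiring a lock. The following is a minimal sketch only, assuming a lock word at the address in r0 in which 0 means free and 1 means held. The label name, register assignments, and lock encoding are illustrative and are not taken from this manual:

```asm
; Acquire a simple spinlock whose address is in r0 (assumed: 0 = free, 1 = held)
        mov r1, #1          ; value that marks the lock as held
spin    swp r2, r1, [r0]    ; atomically: r2 = old lock value, [r0] = 1
        cmp r2, #0          ; was the lock free before the swap?
        bne spin            ; no: another owner holds it, so retry
```

Each pass through the loop pays the SWP issue latency described above, so this pattern is best kept to locks that are rarely contended.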