INTRODUCTION TO THE ARM® PROCESSOR USING INTEL FPGA TOOLCHAIN
For Quartus Prime 16.1
8 Example Program
As an illustration of ARM instructions and assembler directives, Figure 3 gives an assembly-language program that
computes a dot product of two vectors, A and B. The vectors have n elements. The required computation is
$$
\text{Dot product} = \sum_{i=0}^{n-1} A(i) \times B(i)
$$
The vectors are stored in memory locations at addresses AVECTOR and BVECTOR, respectively. The number of
elements, n, is stored in memory location N. The computed result is written into memory location DOTP. Each
vector element is assumed to be a signed 32-bit number.
The program includes some sample data. It illustrates how the .word assembler directive can be used to load data
items into memory. The memory locations involved are those that follow the location occupied by the Branch
instruction, B, which is the last instruction in the program. The execution of the program ends by continuously
looping on this instruction.
          .text
          .global _start
_start:
          LDR     R0, =AVECTOR       /* Register R0 is a pointer to vector A. */
          LDR     R1, =BVECTOR       /* Register R1 is a pointer to vector B. */
          LDR     R2, N              /* Register R2 is used as the counter for loop iterations. */
          MOV     R3, #0             /* Register R3 is used to accumulate the product. */
LOOP:     LDR     R4, [R0], #4       /* Load the next element of vector A. */
          LDR     R5, [R1], #4       /* Load the next element of vector B. */
          MLA     R3, R4, R5, R3     /* Compute the product of next pair of elements, */
                                     /* and add to the sum. */
          SUBS    R2, R2, #1         /* Decrement the counter. */
          BGT     LOOP               /* Loop again if not finished. */
          STR     R3, DOTP           /* Store the result in memory. */
STOP:     B       STOP

N:        .word   6                       /* Specify the number of elements. */
AVECTOR:  .word   5, 3, -6, 19, 8, 12     /* Specify the elements of vector A. */
BVECTOR:  .word   2, 14, -3, 2, -5, 36    /* Specify the elements of vector B. */
DOTP:     .space  4                       /* Space for the final dot product. */
          .end
Figure 3. A program that computes the dot product of two vectors.
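As a check on the sample data in Figure 3, the expected result can be worked out by hand. The computation below is a sketch that assumes the program runs exactly as listed, with n = 6 and the twelve values given by the .word directives:

$$
\begin{aligned}
\text{Dot product} &= (5)(2) + (3)(14) + (-6)(-3) + (19)(2) + (8)(-5) + (12)(36) \\
                   &= 10 + 42 + 18 + 38 - 40 + 432 \\
                   &= 500
\end{aligned}
$$

Under the same assumption, register R3 should hold 10, 52, 70, 108, 68, and 500 after successive executions of the MLA instruction, so the word stored at DOTP should be 500 (0x000001F4).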
Observe the treatment of labels. In the instruction
          LDR     R0, =AVECTOR
Intel Corporation - FPGA University Program
November 2016