Writing ARM and Thumb Assembly Language
ARM DUI 0068B
Copyright © 2000, 2001 ARM Limited. All rights reserved.
2-23
Because of the number of branches, the code is seven instructions long. Every time a
branch is taken, the processor must refill the pipeline and continue from the new
location. The other instructions and non-executed branches use a single cycle each.
By using the conditional execution feature of the ARM instruction set, you can
implement the gcd function in only four instructions:
gcd
        CMP      r0, r1
        SUBGT    r0, r0, r1
        SUBLT    r1, r1, r0
        BNE      gcd
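To see why four instructions are enough, it can help to read the loop as C. The following is a minimal sketch, not part of the original listing: the function name gcd_conditional and the flag variables gt, lt, and ne are hypothetical, and the sketch assumes both inputs are positive. It illustrates that CMP sets the flags once per pass and that SUBGT, SUBLT, and BNE all test those same flags.

```c
#include <assert.h>

/* Hypothetical helper: a C rendering of the four-instruction loop.
 * Assumption: both inputs are positive, otherwise the loop never ends. */
static int gcd_conditional(int r0, int r1)
{
    assert(r0 > 0 && r1 > 0);

    int ne;
    do {
        /* CMP r0, r1: the comparison is evaluated once per pass */
        int gt = r0 > r1;
        int lt = r0 < r1;
        ne = r0 != r1;

        if (gt) r0 -= r1;   /* SUBGT r0, r0, r1 */
        if (lt) r1 -= r0;   /* SUBLT r1, r1, r0 */
    } while (ne);           /* BNE gcd: uses the flags from CMP, not the SUB results */

    return r0;
}
```

Because the SUB instructions do not update the flags, BNE still branches on the pass where the subtraction makes the two registers equal. The next CMP then detects equality and the loop falls through.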
In addition to improving code size, this code executes faster in most cases. Table 2-2
and Table 2-3 on page 2-24 show the number of cycles used by each implementation for
the case where r0 equals 1 and r1 equals 2. In this case, replacing branches with
conditional execution of all instructions saves three cycles.
When r0 equals r1, both versions of the code execute in the same number of
cycles. In all other cases, the conditional version executes in fewer cycles.
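The cycle counts can be checked with a small cost model. The following C sketch assumes the costs stated in the text: three cycles for a taken branch and one cycle for every other instruction, including instructions that fail their condition. The function names are hypothetical. The branch-only control flow is reconstructed from the trace in Table 2-2, and the path for r0 greater than r1 (BLT not executed, then SUB r0, r0, r1 and B gcd) is an assumption, because that listing is not reproduced in this section.

```c
#include <assert.h>

/* Cost model from the text: one cycle per instruction, three for a taken branch. */
enum { PLAIN = 1, TAKEN_BRANCH = 3 };

/* Cycles used by the branch-only version (assumed control flow, see lead-in). */
static int cycles_branch_only(int r0, int r1)
{
    assert(r0 > 0 && r1 > 0);            /* assumption: positive inputs */
    int cycles = 0;
    for (;;) {
        cycles += PLAIN;                 /* CMP r0, r1 */
        if (r0 == r1) {
            cycles += TAKEN_BRANCH;      /* BEQ end, taken */
            return cycles;
        }
        cycles += PLAIN;                 /* BEQ end, not executed */
        if (r0 < r1) {
            cycles += TAKEN_BRANCH;      /* BLT less, taken */
            r1 -= r0;                    /* SUB r1, r1, r0 */
        } else {
            cycles += PLAIN;             /* BLT less, not executed */
            r0 -= r1;                    /* SUB r0, r0, r1 */
        }
        cycles += PLAIN;                 /* cost of the SUB */
        cycles += TAKEN_BRANCH;          /* B gcd */
    }
}

/* Cycles used by the four-instruction conditional version. */
static int cycles_conditional(int r0, int r1)
{
    assert(r0 > 0 && r1 > 0);            /* assumption: positive inputs */
    int cycles = 0;
    for (;;) {
        int ne = (r0 != r1);             /* flags as set by CMP */
        cycles += PLAIN;                 /* CMP r0, r1 */
        if (r0 > r1) {
            r0 -= r1;                    /* SUBGT executes */
        } else if (r0 < r1) {
            r1 -= r0;                    /* SUBLT executes */
        }
        cycles += 2 * PLAIN;             /* SUBGT and SUBLT, one cycle each either way */
        if (!ne) {
            cycles += PLAIN;             /* BNE gcd, not executed */
            return cycles;
        }
        cycles += TAKEN_BRANCH;          /* BNE gcd, taken */
    }
}
```

Under this model, calling the two functions with r0 = 1 and r1 = 2 gives 13 and 10 cycles, a difference of three as stated above. With equal inputs, both give 4 cycles.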
Table 2-2 Conditional branches only

r0: a   r1: b   Instruction       Cycles (ARM7)
1       2       CMP r0, r1        1
1       2       BEQ end           1 (not executed)
1       2       BLT less          3
1       2       SUB r1, r1, r0    1
1       2       B gcd             3
1       1       CMP r0, r1        1
1       1       BEQ end           3
                                  Total = 13