A6.3 L1 instruction memory system
The L1 instruction side memory system provides an instruction stream to the decoder.
To increase overall performance and to reduce power consumption, it uses:
• Dynamic branch prediction.
• Instruction caching.
A6.3.1 Program flow prediction
The Cortex-A76 core contains program flow prediction hardware, also known as branch prediction.
Branch prediction increases overall performance and reduces power consumption. With program flow
prediction disabled, all taken branches incur a penalty that is associated with flushing the pipeline.
To avoid this penalty, the branch prediction hardware predicts the outcome of both conditional and
unconditional branches. For conditional branches, the hardware predicts whether the branch is taken and
also predicts the address that the branch goes to, known as the branch target address. For unconditional
branches, only the target is predicted.
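The taken or not-taken decision for a conditional branch can be illustrated with a small model. The sketch below uses a 2-bit saturating counter, a common textbook scheme. This manual does not state which algorithm the Cortex-A76 core uses, so the class name, the initial state, and the scheme itself are assumptions made only for illustration.

```python
class TwoBitCounter:
    """Textbook 2-bit saturating counter (assumed scheme, not the documented A76 design)."""

    def __init__(self, state: int = 1):
        # States 0 and 1 predict not taken; states 2 and 3 predict taken.
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes) -> int:
    """Count how often the prediction disagrees with the actual branch outcome."""
    counter = TwoBitCounter()
    misses = 0
    for taken in outcomes:
        if counter.predict() != taken:
            misses += 1  # in hardware, each miss would cost a pipeline flush
        counter.update(taken)
    return misses
```

For a loop branch that is taken nine times and then falls through, `count_mispredictions([True] * 9 + [False])` reports two mispredictions in this model: one while the counter warms up and one at loop exit.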
The hardware contains the following functionality:
• A Branch Target Buffer (BTB) holding the branch target address of previously taken branches.
• Dynamic branch predictor history.
• The return stack, a stack of nested subroutine return addresses.
• A static branch predictor.
• An indirect branch predictor.
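As a rough illustration of how a structure that holds the target addresses of previously taken branches can behave, the following sketch models a small direct-mapped table. The entry count, the indexing function, and the names `BranchTargetBuffer`, `lookup`, and `record_taken` are hypothetical. The manual does not describe the organization of the Cortex-A76 BTB.

```python
class BranchTargetBuffer:
    """Toy direct-mapped BTB: branch address -> last taken target (sizes are hypothetical)."""

    def __init__(self, entries: int = 16):
        self.entries = entries
        self.table = [None] * entries  # each slot holds (branch_pc, target) or None

    def _index(self, pc: int) -> int:
        # Assumes 4-byte instructions: drop the low 2 bits, then wrap to the table size.
        return (pc >> 2) % self.entries

    def lookup(self, pc: int):
        """Return the predicted target for pc, or None if the branch is unknown."""
        slot = self.table[self._index(pc)]
        if slot is not None and slot[0] == pc:
            return slot[1]
        return None

    def record_taken(self, pc: int, target: int) -> None:
        """Remember the target of a taken branch, evicting any entry that shares the slot."""
        self.table[self._index(pc)] = (pc, target)
```

A lookup misses until the branch has been taken once, which matches the description of the BTB as holding targets of previously taken branches.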
Predicted and non-predicted instructions
Unless otherwise specified, the following list applies to A64, A32, and T32 instructions. As a rule, the
flow prediction hardware predicts all branch instructions regardless of the addressing mode,
including:
• Conditional branches.
• Unconditional branches.
• Indirect branches that are associated with procedure call and return instructions.
• Branches that switch between A32 and T32 states.
The following branch instructions are not predicted:
• Exception return instructions.
T32 state conditional branches
A T32 unconditional branch instruction can be made conditional by inclusion in an If-Then (IT) block. It
is then treated as a conditional branch.
Return stack
The return stack stores the address and instruction set state.
This address is equal to the link register value stored in R14 in AArch32 state or X30 in AArch64 state.
The following instructions cause a return stack push if predicted:
• BL r14
• BLX (immediate) in AArch32 state
• BLX (register) in AArch32 state
• BLR in AArch64 state
• MOV pc,r14
In AArch32 state, the following instructions cause a return stack pop if predicted:
• BX
• LDR pc, [r13], #imm
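The push and pop behavior described above can be sketched as a small bounded stack whose entries pair a return address with an instruction set state. The depth of 8, the overflow policy (discard the oldest entry), and the names `ReturnStack`, `push`, and `pop` are assumptions for illustration. The manual does not give the return stack depth or its overflow behavior here.

```python
from collections import deque


class ReturnStack:
    """Toy return stack holding (return_address, isa_state) pairs; depth is hypothetical."""

    def __init__(self, depth: int = 8):
        # A bounded deque silently drops the oldest entry when a push overflows it.
        self.entries = deque(maxlen=depth)

    def push(self, return_address: int, isa_state: str) -> None:
        """Model a predicted call: save the link register value and the ISA state."""
        self.entries.append((return_address, isa_state))

    def pop(self):
        """Model a predicted return: yield the most recent entry, or None if empty."""
        return self.entries.pop() if self.entries else None


# Example: a nested call sequence followed by two returns.
stack = ReturnStack()
stack.push(0x8004, "A32")   # outer call
stack.push(0x9011, "T32")   # nested call
inner = stack.pop()         # the nested call returns first
outer = stack.pop()
```

Because entries are returned in last-in, first-out order, `inner` holds the nested call's return address and state, and `outer` holds those of the first call.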
100798_0300_00_en
Copyright © 2016–2018 Arm Limited or its affiliates. All rights reserved.
Non-Confidential