Functional Description
ARM DDI 0500D (ID021414), Non-Confidential
Copyright © 2013-2014 ARM. All rights reserved.
• Sequential instruction fetches.
• Instruction prefetches.
• Critical word first linefill on a cache miss.
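As an illustration of critical word first linefill, the following minimal sketch computes the order in which the beats of one cache line could be returned when the missing fetch address falls in the middle of the line. The line size, the beat size, and the function name are assumptions chosen for the example; they are not values stated in this section.

```python
def critical_word_first_order(miss_addr, line_bytes=64, beat_bytes=16):
    """Return the addresses of the beats of one cache line in fill order.

    The beat that contains miss_addr comes first, and the remaining beats
    follow in wrapping order. line_bytes and beat_bytes are illustrative
    assumptions, not documented values.
    """
    beats = line_bytes // beat_bytes
    offset_in_line = miss_addr % line_bytes
    line_base = miss_addr - offset_in_line
    first_beat = offset_in_line // beat_bytes
    return [line_base + ((first_beat + i) % beats) * beat_bytes
            for i in range(beats)]


# A miss at 0x1028 falls in the third 16-byte beat of the line at 0x1000,
# so that beat is requested first and the fill wraps around the line.
order = critical_word_first_order(0x1028)
```

The instruction that caused the miss can be forwarded as soon as the first beat arrives, without waiting for the rest of the line.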
The IFU obtains instructions from the instruction cache or from external memory and predicts
the outcome of branches in the instruction stream, then passes the instructions to the
Data Processing Unit (DPU) for processing.
If the cache protection configuration is chosen, the L1 Instruction cache data and tag RAMs are
protected by parity bits. The parity bits enable any single-bit error to be detected. If an error is
detected, the line is invalidated and fetched again.
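The detect, invalidate, and refetch behavior described above can be sketched as follows. This is a simplified model, not the hardware implementation: it assumes one even-parity bit per stored word, and the names parity_bit, store, check, and fetch are hypothetical. The actual parity granularity of the RAMs is not stated in this section.

```python
def parity_bit(word):
    """Even-parity bit: 1 when the word has an odd number of set bits."""
    return bin(word).count("1") & 1


def store(word):
    """Pair a word with its parity bit, as it would be written to the RAM."""
    return (word, parity_bit(word))


def check(entry):
    """Return True when the stored parity bit still matches the word."""
    word, parity = entry
    return parity_bit(word) == parity


def fetch(cache, index, refill):
    """Read a word; on a miss or a parity error, invalidate and refetch.

    cache is a dict of index -> (word, parity); refill is a callable that
    returns the correct word from the next level of memory (assumption).
    """
    entry = cache.get(index)
    if entry is None or not check(entry):
        cache.pop(index, None)          # invalidate the faulty or absent entry
        entry = store(refill(index))    # fetch again from external memory
        cache[index] = entry
    return entry[0]
```

Parity can only detect an odd number of flipped bits and cannot say which bit is wrong, which is why the line is refetched and not corrected in place. An instruction cache never holds dirty data, so refetching is always safe.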
Branch Target Instruction Cache
The IFU contains a single-entry Branch Target Instruction Cache (BTIC). This
stores up to two instruction cache fetches and enables the branch shadow of
predicted taken branch instructions to be eliminated. The BTIC implementation
is architecturally transparent, so it does not have to be flushed on a context switch.
Branch Target Address Cache
The IFU contains a 256-entry Branch Target Address Cache (BTAC) to predict
the target address of indirect branches. The BTAC implementation is
architecturally transparent, so it does not have to be flushed on a context switch.
Branch predictor
The branch predictor is a global type that uses branch history registers and a
3072-entry pattern history prediction table.
Return stack
The IFU includes an 8-entry return stack to accelerate returns from procedure
calls. For each procedure call, the return address is pushed onto a hardware stack.
When a procedure return is recognized, the address held in the return stack is
popped, and the IFU uses it as the predicted return address. The return stack is
architecturally transparent, so it does not have to be flushed on a context switch.
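The push and pop behavior of the return stack can be modeled with a short sketch. The class and method names are hypothetical, and the overflow policy (discarding the oldest entry when a ninth call is pushed) is an assumption; this section does not state what the hardware does when the stack is full or empty.

```python
from collections import deque


class ReturnStack:
    """Software model of a fixed-depth hardware return stack."""

    def __init__(self, depth=8):
        # Assumption: when the stack is full, the oldest entry is dropped.
        self._entries = deque(maxlen=depth)

    def on_call(self, return_addr):
        """A procedure call pushes its return address."""
        self._entries.append(return_addr)

    def on_return(self):
        """A procedure return pops the predicted return address.

        Returns None when the stack is empty, meaning no prediction.
        """
        return self._entries.pop() if self._entries else None


rs = ReturnStack()
rs.on_call(0x1004)              # outer call
rs.on_call(0x2008)              # nested call
inner = rs.on_return()          # predicts the return of the nested call
outer = rs.on_return()          # predicts the return of the outer call
```

Because the stack only supplies predictions, a stale or missing entry costs a misprediction but never affects program results.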
See Chapter 6 Level 1 Memory System for more information.
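To make the global branch predictor described above more concrete, the sketch below models a predictor that combines a branch history register with a 3072-entry pattern history table. Only the global scheme and the table size come from this section. The 2-bit saturating counters, the 8-bit history length, and the XOR index hash are assumptions borrowed from the common gshare design, and the class name is hypothetical.

```python
class GlobalPredictor:
    """Illustrative global-history predictor, not the actual hardware design."""

    def __init__(self, entries=3072, history_bits=8):
        self.entries = entries
        self.history_mask = (1 << history_bits) - 1   # assumed history length
        self.history = 0                              # global branch history
        self.table = [1] * entries                    # 2-bit counters, weakly not-taken

    def _index(self, pc):
        # Assumed hash: XOR the word-aligned PC with the history, then
        # reduce modulo the table size (3072 is not a power of two).
        return ((pc >> 2) ^ self.history) % self.entries

    def predict(self, pc):
        """Predict taken when the selected counter is in a taken state."""
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Train the selected counter, then shift the outcome into the history."""
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


predictor = GlobalPredictor()
loop_branch = 0x8000
first_guess = predictor.predict(loop_branch)
for _ in range(20):                       # a loop branch that is always taken
    predictor.update(loop_branch, True)
trained_guess = predictor.predict(loop_branch)
```

Once the history register fills with taken outcomes, the same table entry is selected on every iteration and its counter saturates, so the branch is predicted taken.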
2.1.2 Data Processing Unit
The Data Processing Unit (DPU) holds most of the program-visible state of the processor, such
as general-purpose registers and system registers. It provides configuration and control of the
memory system and its associated functionality. It decodes and executes instructions, operating
on data held in the registers in accordance with the ARMv8-A architecture. Instructions are fed
to the DPU from the IFU. The DPU executes instructions that require data to be transferred to
or from the memory system by interfacing to the Data Cache Unit (DCU), which manages all load
and store operations.
See Chapter 3 Programmers Model and Chapter 4 System Control for more information.
2.1.3 Advanced SIMD and Floating-point Extension
The optional Advanced SIMD and Floating-point Extension implements:
• ARM NEON technology, a media and signal processing architecture that adds
  instructions targeted at audio, video, 3-D graphics, image, and speech processing.
  Advanced SIMD instructions are available in AArch64 and AArch32 states.