Classification of ARM implementations

ARM is an old semi-RISC processor design; the first design was released in 1985. Implementations are classified broadly by architecture and by the design of the processor core implementing the architecture:

  • Architecture. This is the view of the processor seen by programmers (privileged and not). The architecture revision is referred to as ARMv5, ARMv7, etc. There used to be only one current version of the architecture but lately this has been split into three "profiles":
    • A or Application profile. Intended for use with multi-user operating systems such as Linux. A-profile architectures include a Virtual Memory System Architecture (VMSA) wherein an MMU (or several) provides full-blown address remapping and memory attributes such as cached, non-executable, etc.
    • R or Real-Time profile. Meant for single-user RTOSes such as VxWorks or RTEMS. Incorporates a Protected Memory System Architecture (PMSA) wherein an MPU, like an MMU, provides memory attributes but does no address remapping.
    • M or Microcontroller profile. Intended for the simplest embedded systems which don't run a true operating system. M-profile processors have neither a VMSA nor a PMSA.
  • Processor core implementation. There have been many implementations, until recently designated by "ARM" followed by a processor family number and a bunch of letters telling what extra features are present, e.g., ARM7TDMI. Note that the family number doesn't indicate which ARM architecture revision is implemented, e.g., the ARM7TDMI implements architecture v4T, not v7. Lately this scheme has been abandoned in favor of a family name, architecture profile letter and family revision number such as "Cortex-A9".
  • Number of cores. In later systems a number of ARM cores may share main memory, some or all peripherals and some caches. Such systems have the word "MPCore" appended to the classification.

Classification and feature set of the Zynq-7000 SoC

I'll list the system features here along with key terms you should look for when navigating the ARM documentation forest.

Feature             Look for
------------------  -----------------------------
Architecture        ARMv7-A
Processor           Cortex-A9, Cortex-A9 MPCore
Instruction sets    ARM, Thumb, Jazelle, ThumbEE
Floating point      VFP3-32
Vector operations   NEON, Advanced SIMD
DSP-like ops        EDSP
Timers              Generic Timer
Extra security      TrustZone, Security Extension
Debugging           JTAG, CoreSight
Multiprocessing     SMP, MPCore

The Cortex family of ARM processors incorporates as standard some features that used to be optional in earlier families and were designated by letters following the family names: the (T)humb instruction set, (D)ebugging using JTAG, faster (M)ultiplication instructions, embedded (I)CE trace/debug and (E)nhanced DSP instructions. Oddly, the Cortex-A9 doesn't have any integer division instructions. MPCore variants have new synchronization instructions favored over the older Swap (SWP): Load Register Exclusive (LDREX) and Store Register Exclusive (STREX).
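LDREX and STREX are rarely written by hand. When GCC targets ARMv7-A, C11 atomic operations are typically compiled into LDREX/STREX retry loops. The following is only a minimal sketch of a spinlock and an atomic counter built that way; the names `demo_spinlock`, `demo_lock`, `demo_trylock`, `demo_unlock` and `counter_increment` are made up for illustration and are not part of our code base.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical spinlock type built on the C11 atomic_flag. */
typedef struct { atomic_flag held; } demo_spinlock;
#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once to take the lock; returns 1 on success, 0 if already held. */
static inline int demo_trylock(demo_spinlock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin until the lock is free, then take it. */
static inline void demo_lock(demo_spinlock *l)
{
    while (!demo_trylock(l)) {
        /* busy-wait; a real system might sleep or yield here */
    }
}

/* Release the lock so that another core can take it. */
static inline void demo_unlock(demo_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Atomically add one to a shared counter and return the new value. */
static inline unsigned counter_increment(atomic_uint *c)
{
    assert(c != NULL);
    return atomic_fetch_add_explicit(c, 1u, memory_order_relaxed) + 1u;
}
```

A lock would be declared as `demo_spinlock l = DEMO_SPINLOCK_INIT;` and wrapped around the critical section with `demo_lock(&l)` and `demo_unlock(&l)`.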

The same block of silicon, NEON, implements scalar single and double-float operations as well as SIMD for integer and single-float operands.

The following extensions are not implemented in the processor: the obsolete floating-point extension (FP) independent of NEON, the alternate NEON floating-point variants (VFP3-16 or VFP4-anything), 40-bit physical addresses (Large Physical Address Extension) and virtualization (hypervisor support).

GNU toolkit options

Use -mcpu=cortex-a9 when compiling in order to get the full instruction set including LDREX and STREX. This is already done in our make system. If you don't specify this you'll get the default -mcpu=arm7tdmi which is for a much older ARM implementation.
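One way to confirm that the intended -mcpu setting actually reached the compiler is to look at the macros GCC predefines for the selected architecture. The sketch below relies on `__ARM_ARCH_7A__`, `__ARM_ARCH_4T__` and `__arm__`, which GCC defines for ARMv7-A, ARMv4T and any 32-bit ARM target respectively; the function names `build_target` and `require_target` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Report which architecture the compiler was asked to target,
   judging only by GCC's predefined macros. */
static const char *build_target(void)
{
#if defined(__ARM_ARCH_7A__)
    return "ARMv7-A";            /* what -mcpu=cortex-a9 selects */
#elif defined(__ARM_ARCH_4T__)
    return "ARMv4T";             /* what -mcpu=arm7tdmi selects */
#elif defined(__arm__)
    return "other 32-bit ARM";
#else
    return "not ARM";            /* e.g. a native build on the host */
#endif
}

/* Hypothetical start-up check: stop early if the flags were wrong. */
static void require_target(const char *expected)
{
    assert(expected != NULL);
    assert(strcmp(build_target(), expected) == 0);
}
```

Calling `require_target("ARMv7-A")` early in start-up code would then trap a build that silently fell back to the default CPU. The same macros can be inspected without compiling anything by running the cross compiler with `-mcpu=cortex-a9 -dM -E` on an empty input.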

Processor "state" vs. "mode" and "privilege level"

Both mode and state are reflected in bits in the Current Program Status Register, or CPSR. "State" refers to the instruction set being executed. "Mode" and "privilege" determine the view of the processor the programmer sees; some instructions may be forbidden and the visible bank of registers may differ.
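To make the distinction concrete, here is a small sketch that decodes state, mode and privilege from a CPSR value. It assumes the ARMv7-A bit layout (mode in bits 4:0, T in bit 5, J in bit 24) and lists only the modes available without the virtualization extension; the type and function names are invented for this example.

```c
#include <stdint.h>

#define CPSR_MODE_MASK 0x1Fu        /* M[4:0] */
#define CPSR_T_BIT     (1u << 5)    /* Thumb state bit */
#define CPSR_J_BIT     (1u << 24)   /* Jazelle state bit */

/* Instruction set state, selected by the J and T bits together. */
enum cpu_state { STATE_ARM, STATE_THUMB, STATE_JAZELLE, STATE_THUMBEE };

/* Processor mode encodings found in M[4:0]. */
enum cpu_mode {
    MODE_USR = 0x10, MODE_FIQ = 0x11, MODE_IRQ = 0x12, MODE_SVC = 0x13,
    MODE_MON = 0x16, MODE_ABT = 0x17, MODE_UND = 0x1B, MODE_SYS = 0x1F
};

static enum cpu_state cpsr_state(uint32_t cpsr)
{
    int t = (cpsr & CPSR_T_BIT) != 0;
    int j = (cpsr & CPSR_J_BIT) != 0;
    if (!j && !t) return STATE_ARM;
    if (!j &&  t) return STATE_THUMB;
    if ( j && !t) return STATE_JAZELLE;
    return STATE_THUMBEE;           /* J = 1 and T = 1 */
}

static unsigned cpsr_mode(uint32_t cpsr)
{
    return cpsr & CPSR_MODE_MASK;
}

/* Every mode except User runs privileged. */
static int cpsr_is_privileged(uint32_t cpsr)
{
    return cpsr_mode(cpsr) != MODE_USR;
}
```

For example, the value 0x000001D3 decodes as ARM state, Supervisor mode, privileged.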

ARMv7 actually comprises several different instruction sets:

  • The standard ARM instruction set. Each instruction is 32 bits long and aligned on a 32-bit boundary. The full set of general registers is available. Shift operations may be combined with arithmetic and logical operations. This is the instruction set we'll be using for our project. Oddly, an integer divide instruction is optional and the Zynq CPUs don't have it.
  • Thumb-2. Designed for greater code density. Contains a mix of 16-bit and 32-bit instructions. Many of the 16-bit instructions can access only general registers 0-7.
  • Jazelle. Similar to Java byte code.
  • ThumbEE. A sort of hybrid of Thumb and Jazelle, actually a CPU operation mode. Intended for environments where code modification is frequent, such as ones having a JIT compiler.
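The missing integer divide mentioned in the list above means that a division written in C is normally carried out by a runtime library routine (with the ARM EABI, helpers such as `__aeabi_uidiv`) rather than by a single instruction. The sketch below shows the classic shift-and-subtract method to give an idea of the work involved. It is not the library's actual implementation, and `soft_udiv` and `udiv_result` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Quotient and remainder of an unsigned 32-bit division. */
typedef struct { uint32_t quot; uint32_t rem; } udiv_result;

/* Restoring (shift-and-subtract) division, one quotient bit per pass. */
static udiv_result soft_udiv(uint32_t num, uint32_t den)
{
    assert(den != 0);               /* division by zero is the caller's bug */

    uint64_t rem = 0;               /* 64 bits so the shift cannot overflow */
    uint32_t quot = 0;

    for (int bit = 31; bit >= 0; --bit) {
        /* Bring down the next bit of the numerator. */
        rem = (rem << 1) | ((num >> bit) & 1u);
        if (rem >= den) {
            rem -= den;
            quot |= (uint32_t)1 << bit;
        }
    }

    udiv_result r = { quot, (uint32_t)rem };
    return r;
}
```

The loop always runs 32 times, which is one reason to avoid division in hot paths on this CPU or to replace division by a constant with a multiply and shift.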

Coprocessors

The ARM instruction set has a standard coprocessor interface which allows up to 16 distinct coprocessors.

...

CPs 10 and 11 are reserved for floating point and vector hardware, which in this system are both part of the NEON extension.
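The limit of 16 coprocessors comes straight from the instruction format: coprocessor instructions carry a 4-bit coprocessor number. As an illustration, the sketch below assembles the ARM-state encoding of an MRC (move to ARM register from coprocessor) instruction with the "always" condition. The function name `encode_mrc` is invented here, and the example operands read the Main ID Register through CP15, the system control coprocessor.

```c
#include <assert.h>
#include <stdint.h>

/* Build the 32-bit ARM encoding of
       MRC p<coproc>, <opc1>, r<rt>, c<crn>, c<crm>, <opc2>
   with condition code AL (always). */
static uint32_t encode_mrc(unsigned coproc, unsigned opc1, unsigned rt,
                           unsigned crn, unsigned crm, unsigned opc2)
{
    /* Every field is 3 or 4 bits wide; coproc is the 4-bit field
       that limits the interface to 16 coprocessors. */
    assert(coproc < 16 && rt < 16 && crn < 16 && crm < 16);
    assert(opc1 < 8 && opc2 < 8);

    return (0xEu   << 28)   /* cond = AL */
         | (0xEu   << 24)   /* coprocessor register transfer */
         | (opc1   << 21)
         | (1u     << 20)   /* L = 1: coprocessor to ARM register (MRC) */
         | (crn    << 16)
         | (rt     << 12)
         | (coproc << 8)
         | (opc2   << 5)
         | (1u     << 4)
         | crm;
}
```

With these fields, `encode_mrc(15, 0, 0, 0, 0, 0)` gives 0xEE100F10, the encoding of `MRC p15, 0, r0, c0, c0, 0`.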

Options and extensions

There are a number of options and extensions available for a Cortex-A CPU. Some features can change the way you have to program the processor even if you don't want to use them. The following table lists them and indicates whether they are available on the Zynq.

Name                           On Zynq?   Description
-----------------------------  ---------  -----------
ARM instruction set            yes
Thumb-2                        yes
Jazelle                        yes
ThumbEE                        yes
Integer divide instructions    no
VMSA                           yes        Has at least one MMU
PMSA                           no         Has at least one MPU
Fast multiply                  yes        Improved integer multiplication
VFP3-32                        yes        Scalar floating point rev. 3 with 32 double-sized registers
VFP3-16                        no         Like VFP3-32 but with half the number of registers
VFP4-x                         no         Scalar floating point rev. 4
NEON (Advanced SIMD)           yes        Vector operations with integers and single floats
Large Physical Address (LPA)   no         40-bit physical addresses
Generic Timer                  yes        System counter (clock) plus count-down and count-up timers
Multiprocessing (MPCore)       yes        Multiple CPU cores sharing memory
Security (TrustZone)           yes        Adds distinction between secure and non-secure code
Virtualization (Hypervisor)    no         Allows creation of virtual machines to run non-secure code

Two extensions that are obsolete as of ARMv7 are Fast Context Switch (FCSE) and old-style Floating Point (FP). You can ignore what the ARM architecture manual has to say about these.

MMU

There can be up to four independent MMUs per CPU (though they may be implemented with a single block of silicon). Without the security or virtualization extensions there is just one MMU which is used for both privileged and non-privileged accesses. Adding the security extension adds another for secure code, again for both privilege levels. Adding the virtualization extension adds two more MMUs, one for the hypervisor and one for a second stage of translation for code running in a virtual machine. The first stage of translation in a virtual machine maps VM-virtual to VM-real addresses while the second stage maps VM-real to actual hardware addresses. The hypervisor's MMU maps only once, from hyper-virtual to actual hardware addresses.
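The counting rule and the two-stage scheme described above can be sketched as follows. This is a toy model only: it uses a flat, single-level table of 4 KiB pages instead of the real multi-level VMSA tables, and every name in it (`toy_table`, `toy_translate`, `vm_translate`, `mmu_count`) is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u              /* 4 KiB pages */
#define TOY_PAGES  4u               /* size of the toy address space */

/* One flat translation table: input page number -> output frame number. */
typedef struct { uint32_t frame[TOY_PAGES]; } toy_table;

/* One stage of translation: swap the page number, keep the offset. */
static uint32_t toy_translate(const toy_table *t, uint32_t addr)
{
    uint32_t page = addr >> PAGE_SHIFT;
    assert(t != NULL && page < TOY_PAGES);
    return (t->frame[page] << PAGE_SHIFT) | (addr & ((1u << PAGE_SHIFT) - 1u));
}

/* Code in a virtual machine: stage 1 maps VM-virtual to VM-real,
   stage 2 maps VM-real to the actual hardware address. */
static uint32_t vm_translate(const toy_table *stage1, const toy_table *stage2,
                             uint32_t vm_virtual)
{
    uint32_t vm_real = toy_translate(stage1, vm_virtual);
    return toy_translate(stage2, vm_real);
}

/* Number of independent MMUs per CPU for a given set of extensions. */
static unsigned mmu_count(bool security_ext, bool virtualization_ext)
{
    unsigned n = 1;                     /* the basic MMU */
    if (security_ext)       n += 1;     /* one more for secure code */
    if (virtualization_ext) n += 2;     /* hypervisor plus second stage */
    return n;
}
```

For the Zynq, which has the security extension but not virtualization, `mmu_count(true, false)` gives 2 and only a single stage of translation is ever in play.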

...