ARM PROCESSOR
ARCHITECTURE
ARM: Advanced RISC Machine
(originally Acorn RISC Machine)
ARM is an architecture
designed by ARM Holdings, a
British multinational
semiconductor and software
design company based in
Cambridge, England.
ARM Holdings is a leading provider of
processor technology
(processor architecture).
Introducing ARM
ARM designs processors (CPUs)
with a reduced instruction set
architecture.
It licenses these designs to
other semiconductor companies.
Those companies integrate the
core designs into their own products.
Is ARM a microprocessor or Microcontroller?
ARM is an architecture that has to be
implemented as a chip.
Unlike other microprocessor companies (Intel,
Hitachi, etc.), ARM does not manufacture
physical processor ICs.
Instead, ARM licenses its architectures to others
(Atmel, Philips - now NXP, Samsung, etc.).
Generally, the ARM architecture is licensed to
companies that want to manufacture
ARM-based CPUs or SoCs.
These companies develop their own processors
around the ARM core and instruction set
architecture (ISA).
The device we are using, the LPC2148, is an
ARM architecture based SoC product
developed by NXP Semiconductors.
Licensees of
ARM
Similarly, Atmel, Samsung, Analog Devices,
Epson, Ericsson, Fujitsu, IBM, Intel,
Motorola, Panasonic, Sharp, Sony,
Microelectronics, Texas Instruments, etc.
all make ARM-based SoCs.
CISC - Complex Instruction Set
Computer
Generally, an ISA defines the set of basic operations
that the system can perform. There are two major
types of ISA: CISC and RISC.
Complex instruction set computer:
1. Typically uses complex instructions, so a
program needs fewer of them.
2. An instruction may take more than one
clock cycle to execute.
3. Used in CISC machines (Intel 486,
Pentium, etc.).
4. A programmer can do a lot with a
small number of instructions.
RISC - Reduced Instruction Set
Computer
A RISC processor uses a simple, reduced
set of instructions. RISC processors are widely
used in portable devices such as cell phones and iPods.
Although each instruction is simpler than a CISC
instruction, the processor as a whole can be just as powerful.
Most RISC instructions execute
within a single clock cycle, which allows
high speed.
ARM7 Family
ARM7 is a family of 32-bit RISC
processor cores.
The family includes: ARM700, ARM710, ARM7DI,
ARM710a, ARM720T, ARM740T, ARM710T,
ARM7TDMI, ARM7TDMI-S and ARM7EJ-S
The letters after ARM7 specify the features
supported by a particular version.
T – Thumb 16-bit decoder
D – JTAG debug
M – Fast multiplier
I – EmbeddedICE macrocell
E – Enhanced instructions
J – Jazelle (support for Java)
F – Vector floating-point unit
S – Synthesizable version
Among all of these, the ARM7TDMI and ARM7TDMI-S
were the most popular cores of this family.
Von Neumann architecture
The ARM7TDMI-S is a 32-bit core with a Von
Neumann architecture.
This architecture uses a common
physical memory for storing both
instructions and data.
The memory is accessed via a single set
of 32-bit address and data buses in
half-duplex mode.
Only one item (an instruction or a data word)
can be accessed at a time; instructions are
executed one by one.
Significance of
ARM7
Due to their
1. low cost,
2. low power consumption, and
3. low heat generation,
ARM processors are desirable
for smartphones, laptops and
tablet computers.
Microprocessor
A microprocessor is a general-purpose,
programmable digital device that performs
arithmetic and logic operations on binary
data and produces a result.
It incorporates a control unit, an ALU and a
register array for temporary storage on a
single chip. Together, the entire unit is
called the CPU;
a processor is nothing but a CPU.
Microprocessor Vs
Microcontroller
Microcontroller
Basically a mini- or microcomputer on a
chip (SoC - System on a Chip); it includes a
processor, memory, I/O ports and different
peripherals such as ADC, timers, UART,
etc.
Desktop machines can perform many
tasks, but a microcontroller (MC) is an
application-specific system (dedicated to a single purpose).
Hence, it is heavily used in embedded systems.
Microcontroller
What is MC?
A microcontroller is nothing
but a microprocessor together with memory
(ROM, RAM), input and output (I/O) ports
and other peripherals,
embedded on a single chip.
Interfacing I/O
Devices
I/O devices (peripherals) are
connected to the processor through
ports.
Generally, a microprocessor has no
dedicated I/O ports of its own.
Instead we use the 8255A, a special interfacing device
that offers dedicated parallel I/O
pins (ports) and control signals to
interface with.
Interfacing I/O
Devices
Programmable Peripheral Interface:
8255A
This 40-pin package supports three
independent 8-bit ports, namely
PORTA, PORTB and PORTC (upper and
lower).
Embedded systems
range from a basic
traffic light controller
to the very large
systems used at
NASA.
ARM7TDMI Block Diagram
[Figure: ARM7TDMI block diagram. Labelled blocks: address register, address incrementer, register bank, multiplier, shifter, ALU, instruction decoder with control lines, Thumb-to-ARM translator, instruction register, read data register and write data register. Labelled buses: A[31:0] (address), D[31:0] (data), A bus, B bus, ALU bus and PC bus.]
Specifications of LPC2148-ARM7
• Based on the ARM7TDMI-S processor (32-bit RISC processor)
• Manufactured by NXP Semiconductors (founded by Philips)
• Architecture v4T (introduced the 16-bit Thumb compressed
instruction form)
• 512 KB on-chip Flash memory
• 32 KB on-chip SRAM
• Supports several peripherals (AD0, AD1, Timer0, Timer1,
UART0, UART1, DAC, real-time clock, I2C, SPI, etc.)
Specifications of LPC2148-ARM7
• CPU clock: 14.7 MHz; can be run
at a maximum of 60 MHz
• Operating voltage range: 3.0 V to
3.6 V
• 64-pin package (IC)
• 4 GB address space (with 32-bit
PC)
• Two 32-bit GPIO ports: PORT0
and PORT1
Block diagram LPC2148
Pin diagram - LPC2148
Various Units of ARM - ALU
The Arithmetic and Logic Unit has two
32-bit inputs.
The primary input comes from the
register file (a collection of 32-bit
core registers that is directly connected
to the ARM core),
whereas the other input comes from a
pre-processing unit called the barrel
shifter.
Various Units of ARM - ALU
The ALU performs arithmetic and
logic operations: addition, subtraction,
AND, OR, NOT, etc. Multiplication is handled by the
separate multiplier unit, and the ARM7TDMI has no divide
instruction, so division is done in software.
As in the 8085, the output of the ALU affects
the flag bits - carry, overflow, zero
and negative - in the current program
status register (CPSR).
Example: if the result of the ALU
is zero, the zero flag (Z) will be set; if it is a
negative value, the negative flag (N) will be set.
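The flag behaviour described above can be sketched in software. This is only an illustrative model, assuming a 32-bit addition; the function name add_with_flags and the dictionary layout are hypothetical, not part of any ARM tool.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits


def add_with_flags(a, b):
    """Add two 32-bit values and return (result, flags) like a flag-setting ADD."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # unbounded Python integer sum
    result = full & MASK32       # what a 32-bit register would hold
    flags = {
        "N": (result >> 31) & 1,             # bit 31 of the result
        "Z": int(result == 0),               # result is zero
        "C": int(full > MASK32),             # carry out of bit 31
        # signed overflow: both operands differ in sign from the result
        "V": (((a ^ result) & (b ^ result)) >> 31) & 1,
    }
    return result, flags
```

For example, adding 1 to 0x7FFFFFFF sets N and V (signed overflow), while adding 1 to 0xFFFFFFFF sets Z and C.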
Various Units of ARM - Barrel Shifter
The barrel shifter is basically a
combinational circuit that can shift or
rotate a data word left or right by an
arbitrary number of bit positions.
As it is combinational logic, it does
not need a clock input, which reduces the
shifting time.
Shift or rotate operations are
performed in a single cycle.
Various Units of ARM - Barrel Shifter
It shifts or rotates data by a specified
number of bit positions at once, in a
single clock cycle.
In a sequential-logic shift register, by contrast, the
number of clock cycles equals the
number of shifts (n shifts need
n clock cycles).
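The four usual shift and rotate operations can be modelled as pure functions, which mirrors the combinational nature of the barrel shifter: the output depends only on the inputs, with no step-by-step loop. This is a sketch under the assumption of 32-bit operands and shift amounts of 0 to 31; the function names follow the ARM mnemonics LSL, LSR, ASR and ROR, but the code itself is only illustrative.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, n):
    """Logical shift left: zeros enter from the right."""
    return (value << n) & MASK32


def lsr(value, n):
    """Logical shift right: zeros enter from the left."""
    return (value & MASK32) >> n


def asr(value, n):
    """Arithmetic shift right: the sign bit (bit 31) is replicated."""
    value &= MASK32
    if value & 0x80000000:
        return ((value >> n) | (MASK32 << (32 - n))) & MASK32
    return value >> n


def ror(value, n):
    """Rotate right: bits leaving on the right re-enter on the left."""
    value &= MASK32
    n %= 32
    return ((value >> n) | (value << (32 - n))) & MASK32
```

Each function computes its result in one expression, whatever the shift amount, in the same way that the hardware produces any shift in one cycle.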
Various Units of ARM - Barrel Shifter
Architecturally, the barrel shifter
is associated with the ALU.
The ALU may receive the output of the shifter as
its second operand for further processing,
i.e. the barrel shifter preprocesses the
data before it enters the
ALU.
Overview :Core Data path
*Data items are placed in the register file.
--No data processing instruction directly manipulates data in
memory.
*Instructions typically use two source registers and a single result
(destination) register.
*A barrel shifter on the data path can preprocess data before it
enters the ALU.
*Increment/decrement logic can update register contents for
sequential access, independently of the ALU.
02/24/2022
Barrel Shifter
Various Units of ARM - MAC Unit
There is a dedicated Multiply
and Accumulate (MAC) unit.
It performs regular multiplication of two
registers and can accumulate the product
with a third register.
MLA r3, r2, r1, r0 : multiply r2 by r1
and add r0 to the product;
store the final result in r3, i.e. r3 =
(r2 x r1) + r0
Various Units of ARM - MAC
Unit
Long multiplications may produce a
64-bit product.
In such cases, two 32-bit registers
hold the result: one for the upper
32 bits
and the other for the lower 32 bits.
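The arithmetic of the two multiply forms above can be checked with a short model. This is a sketch only: mla follows the r3 = (r2 x r1) + r0 example, and umull models an unsigned long multiply that splits the 64-bit product into a low and a high word. The Python function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def mla(rm, rs, rn):
    """Multiply-accumulate: (rm * rs) + rn, truncated to 32 bits."""
    return (rm * rs + rn) & MASK32


def umull(rm, rs):
    """Unsigned long multiply: return (low word, high word) of the 64-bit product."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, (product >> 32) & MASK32


r3 = mla(3, 4, 5)                       # (3 x 4) + 5 = 17
lo, hi = umull(0xFFFFFFFF, 0xFFFFFFFF)  # product 0xFFFFFFFE00000001
```

Note that mla keeps only the lower 32 bits of the result, whereas umull keeps all 64 bits across two registers.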
Various Units of ARM - Control Unit
• The control unit controls all
other units: memory unit, input and
output unit, ALU, MAC, barrel shifter,
decoding unit, etc.
• Instructions are stored in memory in
the same way as data.
• Execution includes three stages: Fetch
(fetch an instruction from memory),
Decode (find the operation to be
performed), and Execute (execute
the instruction).
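The fetch, decode and execute stages can be illustrated with a toy interpreter. This is a deliberately simplified sketch: the tuple-based instruction format and the opcode names are hypothetical stand-ins, not real ARM encodings, and the stages run one after another rather than overlapping.

```python
def run(program, registers):
    """Run a toy program of (opcode, rd, rn, rm) tuples on a list of registers."""
    pc = 0
    while pc < len(program):
        instruction = program[pc]            # Fetch: read the next instruction
        pc += 1
        opcode, rd, rn, rm = instruction     # Decode: split into operation and operands
        if opcode == "ADD":                  # Execute: perform the operation
            registers[rd] = (registers[rn] + registers[rm]) & 0xFFFFFFFF
        elif opcode == "SUB":
            registers[rd] = (registers[rn] - registers[rm]) & 0xFFFFFFFF
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return registers


program = [("ADD", 0, 1, 2), ("SUB", 3, 0, 1), ("HALT", 0, 0, 0)]
final = run(program, [0, 5, 7, 0])   # r1 = 5, r2 = 7
```

Here the program and its data live in ordinary Python lists, which loosely mirrors the point that instructions are stored in memory just as data is.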
Various Units of ARM - Register Array
The register array is a collection of
internal storage locations used for
temporary data storage.
Registers are directly coupled to the
processor bus and reside
entirely inside the processor.
Because they are part of the processor,
registers can be
accessed in a single clock cycle during
execution.
Various Units of ARM - Register
Array
• The register array has a total of 37
physical core registers, each 32 bits
wide:
• 31 general-purpose registers, one
dedicated CPSR and 5 dedicated
SPSRs (saved program status registers).
• While writing user programs, only
15 general-purpose registers (r0 to
r14), r15 (PC) and the CPSR need to be considered.
Various Units of ARM - Register Array
• Register R13 is used as the stack pointer (SP).
• Register R14 is called the Link Register (LR);
it receives a copy of r15 (PC) when a
branch with link (BL) instruction is executed,
i.e. the LR stores the return address during a
function call.
• Register R15 is the program counter (PC).
Special Registers
Special function registers:
PC (R15): Program Counter. Any instruction with the PC as its destination register is a program branch.
LR (R14): Link Register. Saves a copy of the PC when executing the BL instruction (subroutine call) or
when jumping to an exception or interrupt routine.
- It is copied back to the PC on return from those routines.
SP (R13): Stack Pointer. The ARM architecture has no dedicated hardware stack. Even so, R13 is usually
reserved as a pointer for the program-managed stack.
CPSR: Current Program Status Register. Holds the currently visible status flags and control bits.
SPSR: Saved Program Status Register. Holds a copy of the previous status register while executing
exception or interrupt routines.
- It is copied back to the CPSR on return from the exception or interrupt.
- No SPSR is available in User or System modes.
Processor states : ARM and THUMB
• The ARM7TDMI-S processor has two
operating states: ARM state and
THUMB state.
• 32-bit ARM instructions are
executed in ARM state,
• whereas 16-bit Thumb
instructions are executed in THUMB
state.
• In effect, there is only one instruction set;
the Thumb instruction set is a subset of the
ARM instruction set.
Processor states : ARM and THUMB
• During execution, the processor decodes each
16-bit Thumb instruction fetched from
memory into its 32-bit ARM equivalent.
• Consider the 16-bit Thumb instruction
ADD r0, r1; special decoder logic may decode
it into the 32-bit ARM instruction
ADD r0, r0, r1.
Processor states : ARM and THUMB
THUMB state registers
• Like the Thumb instruction set, the Thumb state
register set is a subset
of the ARM state register set.
• The programmer has direct access to
eight general-purpose registers (r0 to
r7), the PC, SP, LR and CPSR.
• In the privileged exception modes, the SPSR is also accessible.
THUMB state registers
• In Thumb state all instructions can
access registers r0 to r7, but only a few
instructions (MOV, ADD and CMP) can
access r8 to r12.
• The PUSH and POP instructions
use r13 as the stack pointer.
• Each exception mode has three banked registers (SP, LR
and SPSR); System mode shares the User mode registers.
ARM Instructions Vs Thumb
The Thumb subset does not support all
ARM instructions.
Sometimes a sequence of
Thumb instructions is needed to implement a
single ARM instruction.
Note that in the example code below,
ADDLE is implemented with
two instructions: BGT and ADD.
ARM Instructions Vs Thumb
BGT: if R0 > R1, then branch
ARM Instructions Vs Thumb
Even though the Thumb implementation
uses more instructions than ARM,
the overall memory space used is reduced.
The ARM code takes 16 bytes (4
instructions x 4 bytes), whereas the
Thumb code occupies only 10 bytes (5
instructions x 2 bytes).
Code density is the main reason for
having the Thumb instruction set.
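The code-size comparison is simple arithmetic, worked through here for the instruction counts quoted above (4 ARM instructions versus 5 Thumb instructions). The helper name code_size is just an illustrative choice.

```python
ARM_INSTRUCTION_BYTES = 4    # every ARM instruction is 32 bits
THUMB_INSTRUCTION_BYTES = 2  # every Thumb instruction is 16 bits


def code_size(instruction_count, bytes_per_instruction):
    """Return the memory footprint of a straight sequence of instructions."""
    return instruction_count * bytes_per_instruction


arm_bytes = code_size(4, ARM_INSTRUCTION_BYTES)       # 4 x 4 bytes
thumb_bytes = code_size(5, THUMB_INSTRUCTION_BYTES)   # 5 x 2 bytes
saving = 1 - thumb_bytes / arm_bytes                  # fraction of space saved
```

With these counts the Thumb version needs one more instruction but 37.5% less memory.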
Memory
The smallest unit of information is a bit (0
or 1); the smallest unit of memory is a
flip-flop.
A flip-flop is a one-bit memory; it
stores a single bit.
Multiple flip-flops may be brought together
to store multiple bits simultaneously.
A register is a collection of flip-flops
coupled together to store multiple bits.
Memory
An n-bit register is a collection of n
flip-flops.
Memory is a collection (a linear array) of
registers; the registers are nothing but
memory locations.
Each location in a memory unit is
identified by a unique address.
Using a 3-bit address, a processor can
address 2^3 = 8 memory locations.
MEMORY
• Memory is viewed as a large, one-dimensional array in which
every element has an address.
• A memory address is an index into the array.
• "Byte addressing" means that the index points to a
byte of memory.
0 8 bits of data
1 8 bits of data
2 8 bits of data
3 8 bits of data
4 8 bits of data
5 8 bits of data
6 8 bits of data
. ...
. ...
Memory
• A 1 KB memory consists of 1024 registers, each one byte wide (1024
bytes = 1 KB).
• The 8085 processor has a 16-bit address bus, so it can
address 2^16 distinct memory locations, each one
byte:
2^16 = 65536 bytes = 64 KB
Memory
• Similarly, using a 32-bit address bus, the
ARM7 core can address 2^32
memory locations,
• i.e. 4294967296 memory
locations, a memory space of
4 GB.
• The entire 4 GB memory space
is partitioned into
different segments.
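The relationship between address width and memory size used in these slides is 2 raised to the number of address bits. The short sketch below reproduces the three figures quoted (3-bit, 16-bit and 32-bit addresses); the function name addressable_locations is illustrative.

```python
KB = 1024
GB = 1024 ** 3


def addressable_locations(address_bits):
    """Number of distinct locations reachable with the given address width."""
    return 2 ** address_bits


three_bit = addressable_locations(3)         # small example
i8085_bytes = addressable_locations(16)      # 8085 with a 16-bit address bus
arm7_bytes = addressable_locations(32)       # ARM7 with a 32-bit address bus

i8085_kb = i8085_bytes // KB                 # size in KB, one byte per location
arm7_gb = arm7_bytes // GB                   # size in GB, one byte per location
```

Each extra address bit doubles the number of locations, which is why going from 16 to 32 bits takes the address space from 64 KB to 4 GB.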
Memory
• The memory map of the LPC2148 shows how
the address space is shared among
on-chip Flash memory, SRAM (static
RAM, built from flip-flops) and
peripherals.
• 512 KB of on-chip Flash
memory (a type of EEPROM)
• 32 KB of on-chip SRAM
• These memories are used to
store program and data.
General purpose RAM
• RAM locations are not associated with
any peripheral; the whole memory is
available to the programmer.
• RAM addresses start at 0x40000000
on the LPC214x.
• It provides space for user functions,
the stack and global variables.
Flash Memory - EEPROM
• Non-volatile flash memory, used to
store the program (code).
• It retains the code even when there is no
power. Generally, a programmed flash
memory may preserve data for at least 40
years.
• Code can be electrically written and
erased. Flash memory can be read quickly,
so a program can run quickly.
• Addresses begin at 0x00000000 and extend
to 0x0007FFFF (512 KB = 524,288 bytes).
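The address ranges follow directly from the start addresses and sizes given in these slides (Flash: 512 KB from 0x00000000; SRAM: 32 KB from 0x40000000). The sketch below computes the end addresses and classifies an address; the REGIONS table and region_of helper are hypothetical names, and peripheral ranges are left out because the slides do not state them.

```python
FLASH_START, FLASH_SIZE = 0x00000000, 512 * 1024   # 512 KB on-chip Flash
SRAM_START, SRAM_SIZE = 0x40000000, 32 * 1024      # 32 KB on-chip SRAM

REGIONS = [
    ("flash", FLASH_START, FLASH_SIZE),
    ("sram", SRAM_START, SRAM_SIZE),
]


def region_of(address):
    """Return the name of the region containing the address, or None."""
    for name, start, size in REGIONS:
        if start <= address < start + size:
            return name
    return None


flash_end = FLASH_START + FLASH_SIZE - 1   # last Flash address
sram_end = SRAM_START + SRAM_SIZE - 1      # last SRAM address
```

The computed flash_end matches the 0x0007FFFF upper limit quoted for the Flash region.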
Peripheral Registers
Peripheral registers are special-purpose memory locations
outside the core, with specific addresses,
that are associated with peripherals.
Unlike core registers, peripheral registers are not directly
connected to the core bus.
Data must first be loaded from memory (or from
peripheral registers) into the processor's core
registers; after processing, the result
is stored back into memory.
Memory and I/O Mapping
• Memory mapping is a process of
interfacing memory resources to the
processor and associating (or mapping)
addresses to each memory location.
• I/O mapping is a process of sharing
address space with on-chip peripherals.
Memory
Organization
• Memory may be viewed as a linear array of
bytes numbered from 0 up to 2^32 - 1 (0x00000000
to 0xFFFFFFFF).
• Data items may be 8-bit bytes, 16-bit
half-words or 32-bit words. Words are always
aligned on 4-byte boundaries and half-words
on 2-byte boundaries.
• There are two standard formats, little-endian
and big-endian, for storing data in
physical memory.
Little Endian and Big Endian
• An ARM processor must be configured to use
either little-endian or big-endian
format.
• In little-endian format, the most significant byte is stored at
the highest address and the least significant byte at
the lowest address, i.e. data is stored from the
least significant byte to the most significant
byte as addresses increase.
• Big-endian is the reverse: the most significant byte
is stored at the lowest address and the
least significant byte at the highest address.
Little Endian Vs Big Endian
[Figure: data written in little-endian format and read by a big-endian system is read as 0x02040608 (bytes in reverse order).]
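The byte-order effect can be reproduced with Python's built-in integer conversions. This is a sketch that assumes the original word was 0x08060402, chosen so that the misread value matches the 0x02040608 shown above; the helper names store_word and load_word are illustrative.

```python
def store_word(value, byteorder):
    """Return the 4 bytes of a 32-bit word in increasing address order."""
    return value.to_bytes(4, byteorder)


def load_word(raw, byteorder):
    """Rebuild a 32-bit word from 4 bytes given in increasing address order."""
    return int.from_bytes(raw, byteorder)


word = 0x08060402                       # assumed original value
little = store_word(word, "little")     # least significant byte at lowest address
big = store_word(word, "big")           # most significant byte at lowest address
misread = load_word(little, "big")      # little-endian data read as big-endian
```

Reading the data back with the same byte order it was stored in returns the original word; mixing the two orders reverses the bytes.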
Data Types
ARM supports six data types. The 32-bit word is
the basic data type; 8-bit and 16-bit
data are supported as well.
• 8-bit signed and unsigned bytes.
• 16-bit signed and unsigned half-words,
aligned on 2-byte
boundaries.
• 32-bit signed and unsigned words,
aligned on 4-byte boundaries.
Data Types
• ARM instructions are 32-bit words
and must be word-aligned.
• Thumb instructions are half-
words and must be aligned on 2-
byte boundaries.
• Internally all operations are on 32-
bit operands only.
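The alignment rules above reduce to a divisibility check on the address: a word address must be a multiple of 4 and a half-word address a multiple of 2. The sketch below shows that check; the SIZES table and function names are illustrative.

```python
SIZES = {"byte": 1, "halfword": 2, "word": 4}   # size of each data type in bytes


def is_aligned(address, kind):
    """True if the address is a multiple of the data type's size."""
    return address % SIZES[kind] == 0


def align_down(address, kind):
    """Round the address down to the nearest aligned boundary."""
    return address & ~(SIZES[kind] - 1)
```

For instance, 0x40000002 is a valid half-word address but not a valid word address, and align_down(0x40000007, "word") gives 0x40000004.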
Load-Store Architecture
• The data-processing instructions operate only on registers.
• The only memory accesses are:
• Load: copy a memory value into a register (read)
• Store: copy a register value into memory (write)
• Unlike in CISC processors, memory-to-memory
operations are not supported.
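A consequence of the load-store rule is that even incrementing a variable in memory takes three steps: load, operate, store. The toy model below illustrates this; the dictionary-based memory, the register list and the helper names (ldr, add_imm, str_reg) are hypothetical stand-ins named after the corresponding kinds of ARM instruction.

```python
memory = {0x40000000: 41}     # one word of RAM holding a counter
registers = [0] * 16          # r0 to r15


def ldr(rd, address):
    """Load: copy a memory word into a register."""
    registers[rd] = memory[address]


def add_imm(rd, rn, immediate):
    """Data processing: works on registers only, never on memory."""
    registers[rd] = (registers[rn] + immediate) & 0xFFFFFFFF


def str_reg(rd, address):
    """Store: copy a register back into memory."""
    memory[address] = registers[rd]


ldr(0, 0x40000000)        # r0 = memory[counter]
add_imm(0, 0, 1)          # r0 = r0 + 1
str_reg(0, 0x40000000)    # memory[counter] = r0
```

A CISC machine might do the same work with a single memory-operand instruction; on a load-store machine the memory traffic is always explicit.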
Instruction Execution
Instruction format
• An assembly language programmer has to
know how instructions are represented
in terms of 1s and 0s or hexadecimal
digits.
• As discussed earlier, the opcode and
operands are encoded in a 32-bit
ARM instruction.
• Operands may be read from core
registers or specified as literals in the
instruction itself, like immediate
addressing in the 8085.
Instruction format
• The ARM7 core uses a three-register
address format.
• Typically, instructions have two
source registers, Rn and Rm, for the
operands and one destination register, Rd,
for the result.
• Operands are 32-bit
unsigned or signed
integers.
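To make the three-register format concrete, the sketch below packs a register-to-register data-processing instruction into its 32-bit word. In the ARM encoding, Rd is the destination, Rn the first source operand and Rm the second. The sketch assumes the "always" condition (1110) and an unshifted Rm; the function name and the small opcode table are illustrative and cover only a few operations.

```python
COND_ALWAYS = 0xE   # condition field 1110: execute unconditionally

# 4-bit opcode field values for a few data-processing operations
OPCODES = {"AND": 0x0, "EOR": 0x1, "SUB": 0x2, "ADD": 0x4, "ORR": 0xC}


def encode_data_processing(mnemonic, rd, rn, rm, set_flags=False):
    """Encode 'op Rd, Rn, Rm' (register operand, no shift) as a 32-bit word."""
    return (
        (COND_ALWAYS << 28)          # bits 31-28: condition
        | (OPCODES[mnemonic] << 21)  # bits 24-21: opcode
        | (int(set_flags) << 20)     # bit 20: S, update the CPSR flags
        | (rn << 16)                 # bits 19-16: first source register
        | (rd << 12)                 # bits 15-12: destination register
        | rm                         # bits 3-0: second source register
    )


add_word = encode_data_processing("ADD", 0, 1, 2)   # ADD r0, r1, r2
```

The resulting word for ADD r0, r1, r2 is 0xE0810002, where the hexadecimal digits 1, 0 and 2 are the Rn, Rd and Rm fields.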
General Purpose I/O Ports
Ports are used to connect I/O
devices (peripherals) to the
processor.
The processor interacts with the outside
world through groups of GPIO pins
called ports.
GPIO pins interface the processor with
external peripherals such as buttons,
LEDs, switches, LCDs, etc.
General Purpose I/O Ports
• Recall that we used the 8255A PPI to link
peripherals with a microprocessor; the PPI
supports 3 ports: PORTA, PORTB
and PORTC.
• The LPC2148 is designed with on-chip
memory units, other peripherals and
interfacing circuits, so its GPIO ports are built in;
the two 32-bit ports give up to 64 port bits.
• In fact, a PORT is a collection of GPIO
pins through which the ARM7 core
communicates with the external world.
General Purpose I/O Ports
The GPIO pins of the LPC2148
microcontroller are grouped into two
ports, PORT0 and PORT1, each
32 bits wide (64 port bits in total, although not
every bit is brought out to a package pin).
The ports are bidirectional but half-duplex.
The figure shows how the ARM7 core is
connected to peripherals through
the ports.
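Because each port is 32 bits wide, an individual pin corresponds to one bit position in a 32-bit value. The sketch below shows the usual mask arithmetic for setting, clearing and reading one pin. It works on a plain integer rather than real hardware registers, and all function names are hypothetical.

```python
PORT_WIDTH = 32
MASK32 = 0xFFFFFFFF


def pin_mask(pin):
    """Return a mask with only the given pin's bit set."""
    if not 0 <= pin < PORT_WIDTH:
        raise ValueError("pin must be between 0 and 31")
    return 1 << pin


def set_pin(port_value, pin):
    """Drive one pin high, leaving the other 31 bits unchanged."""
    return port_value | pin_mask(pin)


def clear_pin(port_value, pin):
    """Drive one pin low, leaving the other 31 bits unchanged."""
    return port_value & ~pin_mask(pin) & MASK32


def read_pin(port_value, pin):
    """Return the state (0 or 1) of one pin."""
    return (port_value >> pin) & 1
```

For example, set_pin(0, 10) gives 0x00000400, the value with only bit 10 of the port set.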
General Purpose I/O Ports
Processor Modes
*Processor modes determine
• which registers are active
• the access rights to the CPSR
*Each processor mode is either
• Privileged: full read-write access to the CPSR
• Non-privileged: read access to the control field of the CPSR
and read-write access to the condition flags
Processor Modes (2)
ARM has Seven Modes
Privileged :
• Abort, Fast Interrupt Request (FIQ), Interrupt
Request (IRQ), Supervisor, System & Undefined
Non-Privileged :
• User
User Mode is used for Programs and Applications
Privileged Modes
Abort:
• Entered when there is a failed attempt to access memory
Fast Interrupt Request (FIQ) & Interrupt
Request (IRQ):
• Correspond to the two interrupt levels available on ARM
Supervisor Mode:
• The state after reset, and generally the mode in
which the OS kernel executes
Privileged Modes (2)
System Mode:
• Special version of User mode that allows full
read-write access to the CPSR
Undefined:
• Entered when the processor encounters an undefined
instruction
Program Status Registers
Bit:    31 30 29 28 | 27 ... 8  | 7 6 5 | 4 ... 0
Field:  N  Z  C  V  | undefined | I F T | mode
Field groups: f = bits 31-24, s = bits 23-16, x = bits 15-8, c = bits 7-0

Condition code flags
N = Negative result from ALU
Z = Zero result from ALU
C = ALU operation Carried out
V = ALU operation oVerflowed

Interrupt disable bits
• I = 1: Disables the IRQ.
• F = 1: Disables the FIQ.

T bit (architectures with Thumb state only)
• T = 0: Processor in ARM state
• T = 1: Processor in Thumb state
• Never change T directly (use BX instead)
• Changing T in the CPSR will lead to unexpected
behavior due to pipelining

Mode bits
10000 User
10001 FIQ
10010 IRQ
10011 Supervisor
10111 Abort
11011 Undefined
11111 System

• Tip: Don't change undefined bits.
• This allows for code compatibility with
newer ARM processors
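The status register layout can be exercised with a small decoder that extracts each field by shifting and masking. This is a sketch based only on the bit positions and mode codes listed in this deck; the function name decode_cpsr and the returned dictionary are illustrative.

```python
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its flag, control and mode fields."""
    return {
        "N": (cpsr >> 31) & 1,   # negative
        "Z": (cpsr >> 30) & 1,   # zero
        "C": (cpsr >> 29) & 1,   # carry
        "V": (cpsr >> 28) & 1,   # overflow
        "I": (cpsr >> 7) & 1,    # 1 = IRQ disabled
        "F": (cpsr >> 6) & 1,    # 1 = FIQ disabled
        "T": (cpsr >> 5) & 1,    # 0 = ARM state, 1 = Thumb state
        "mode": MODES.get(cpsr & 0x1F, "reserved"),
    }


fields = decode_cpsr(0x600000D3)   # example value: Z and C set, low byte 0xD3
```

In this example the low byte 0xD3 decodes to Supervisor mode in ARM state with both IRQ and FIQ disabled.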
Thank You
…