ARM Assembly Language and AMBA

Contents
• Introduction to the instruction set
• ARM instruction formats
• ARM instruction execution
• AMBA: the Advanced Microcontroller Bus Architecture

Prepared by: shruthi.k, Dept of E&C, MIT, Manipal.

Introduction
• The ARM processor is very easy to program at the assembly
level, though for most applications it is more appropriate to
program in a high-level language such as C or C++.
• An ARM instruction is 32 bits long, so there are many different
binary machine instructions.
• ARM processors support a form of the instruction set that has
been compressed into 16-bit ‘Thumb’ instructions.
• The ARM instruction set comprises:
- Data processing instructions.
- Data transfer instructions.
- Control flow instructions.
Introduction
• The most notable features of the ARM instruction set are:
- The load-store architecture.
- 3-address data processing instructions (i.e. the two source operand
registers and the result register are all independently specified).
- Conditional execution of every instruction.
- The inclusion of very powerful load and store multiple register
instructions.
- The ability to perform a general shift operation and a general ALU
operation in a single instruction that executes in a single clock cycle.
- Open instruction set extension through the coprocessor instruction
set.
- A very dense 16-bit compressed representation of the instruction set
in the Thumb architecture.
Introduction
• ARM instructions are aligned on 4 byte boundaries in memory.
• Internally all ARM operations are on 32 bit operands. The
shorter data types are only supported by data transfer
instructions.
• When a byte is loaded from memory it is zero- or sign-extended
to 32 bits and then treated as a 32-bit value for internal
processing.

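The extension of a loaded byte to 32 bits can be sketched in Python. This is only an illustration of the arithmetic, not a model of the hardware; the function names are hypothetical.

```python
def zero_extend_byte(value):
    """Zero-extend an 8-bit value to 32 bits: the upper 24 bits become 0."""
    return value & 0xFF


def sign_extend_byte(value):
    """Sign-extend an 8-bit value to 32 bits: bit 7 is copied into bits 31..8."""
    value &= 0xFF
    if value & 0x80:
        return value | 0xFFFFFF00
    return value


print(hex(zero_extend_byte(0x80)))
print(hex(sign_extend_byte(0x80)))
```

The same byte, 0x80, is treated as 128 when zero-extended and as -128 (in two's complement) when sign-extended.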
ARM Instruction Format & Execution
Data Processing Instructions
• ARM data processing instructions enable the programmer to
perform arithmetic and logical operations on data values in
registers.
• All other instructions just move data around and control the
sequence of program execution, so the data processing instructions
are the only instructions which modify data values.
• These instructions typically require two operands and produce a
single result.
• Some rules which apply to ARM data processing instructions are:
- All operands are 32 bits wide and come from registers or are specified
in the instruction itself.
- The result, if there is one, is 32 bits wide and is placed in a register.
- Each operand register and the result register are independently
specified in the instruction.

ARM Instruction Format & Execution
Data Processing Instructions
Instruction Format
[Figure: data processing instruction format, with the condition field labelled]

ARM Instruction Format & Execution
Data Processing Instructions

Opcode    Mnemonic  Function                       Description
(binary)
0000      AND       Logical bitwise AND            Rd = op1 AND op2
0001      EOR       Logical bitwise exclusive OR   Rd = op1 EOR op2
0010      SUB       Subtract                       Rd = op1 - op2
0011      RSB       Reverse subtract               Rd = op2 - op1
0100      ADD       Add                            Rd = op1 + op2
0101      ADC       Add with carry                 Rd = op1 + op2 + C
0110      SBC       Subtract with carry            Rd = op1 - op2 + C - 1
0111      RSC       Reverse subtract with carry    Rd = op2 - op1 + C - 1
1000      TST       Test                           Set condition codes on op1 AND op2
1001      TEQ       Test equivalence               Set condition codes on op1 EOR op2
1010      CMP       Compare                        Set condition codes on op1 - op2
1011      CMN       Compare negated                Set condition codes on op1 + op2
1100      ORR       Logical bitwise OR             Rd = op1 OR op2
1101      MOV       Move                           Rd = op2
1110      BIC       Bit clear                      Rd = op1 AND NOT op2
1111      MVN       Move negated                   Rd = NOT op2
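The opcode table can be read as a lookup from a 4-bit opcode to an operation on op1, op2 and the carry flag C. The Python sketch below assumes that reading; the names `OPS` and `execute` are hypothetical. It covers only the opcodes that write Rd and leaves out the comparison opcodes (TST, TEQ, CMP, CMN), which only set the condition codes.

```python
MASK = 0xFFFFFFFF  # results are truncated to 32 bits

# opcode -> (mnemonic, function of op1, op2 and the carry flag C)
OPS = {
    0b0000: ("AND", lambda a, b, c: a & b),
    0b0001: ("EOR", lambda a, b, c: a ^ b),
    0b0010: ("SUB", lambda a, b, c: a - b),
    0b0011: ("RSB", lambda a, b, c: b - a),
    0b0100: ("ADD", lambda a, b, c: a + b),
    0b0101: ("ADC", lambda a, b, c: a + b + c),
    0b0110: ("SBC", lambda a, b, c: a - b + c - 1),
    0b0111: ("RSC", lambda a, b, c: b - a + c - 1),
    0b1100: ("ORR", lambda a, b, c: a | b),
    0b1101: ("MOV", lambda a, b, c: b),
    0b1110: ("BIC", lambda a, b, c: a & ~b),
    0b1111: ("MVN", lambda a, b, c: ~b),
}


def execute(opcode, op1, op2, carry=0):
    """Return (mnemonic, 32-bit result) for one data processing opcode."""
    mnemonic, operation = OPS[opcode]
    return mnemonic, operation(op1, op2, carry) & MASK


print(execute(0b0100, 2, 3))
print(execute(0b0011, 2, 10))
```

Masking with `MASK` stands in for the fixed 32-bit register width, so a negative Python result such as 2 - 3 wraps to 0xFFFFFFFF.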
ARM Instruction Format & Execution
Data Processing Instructions
• The ARM data processing instructions employ a 3-address
format, which means that the two source operands and the
destination register are specified independently.
• One source operand is always a register; the second may be a
register, a shifted register or an immediate value.
• The shift applied to the second operand, if it is a register, may
be a logical or arithmetic shift or a rotate, and it may be by an
amount specified either as an immediate quantity or by a
fourth register.
• When the instruction does not require all the available
operands, the unused register field should be set to zero. The
assembler will do this automatically.
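The shifts that can be applied to a register second operand can be sketched in Python. This is a simplified illustration: it only handles shift amounts from 1 to 31, and the function name and the string labels for the shift kinds are assumptions made for the example.

```python
MASK = 0xFFFFFFFF


def shift_operand(value, kind, amount):
    """Shift or rotate a 32-bit value; amount must be between 1 and 31."""
    if not 0 < amount < 32:
        raise ValueError("this sketch only handles amounts 1..31")
    value &= MASK
    if kind == "LSL":  # logical shift left: zeros enter from the right
        return (value << amount) & MASK
    if kind == "LSR":  # logical shift right: zeros enter from the left
        return value >> amount
    if kind == "ASR":  # arithmetic shift right: copies of bit 31 enter from the left
        fill = (MASK << (32 - amount)) & MASK if value & 0x80000000 else 0
        return (value >> amount) | fill
    if kind == "ROR":  # rotate right: bits leaving the right re-enter on the left
        return ((value >> amount) | (value << (32 - amount))) & MASK
    raise ValueError("unknown shift kind: " + kind)


# An instruction such as ADD r0, r1, r2, LSL #2 computes r1 + (r2 shifted left by 2).
r1, r2 = 100, 3
print((r1 + shift_operand(r2, "LSL", 2)) & MASK)
```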
ARM Instruction Format & Execution
Data Processing Instructions
• These instructions allow direct control of whether or not the
processor’s condition codes are affected by their execution
through the S bit (bit 20). When the S bit is clear, the condition
codes are unchanged; when it is set:
- The N flag is set if the result is negative, otherwise it is cleared
(i.e. N = bit 31 of the result).
- The Z flag is set if the result is zero, otherwise it is cleared.
- The C flag is set to the carry-out from the ALU when the
operation is arithmetic, or to the carry-out from the shifter
otherwise. If no shift is required, C is preserved.
- The V flag is preserved in non-arithmetic operations. It is set in
an arithmetic operation if there is an overflow from bit 30 to
bit 31 and cleared if no overflow occurs.
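For an addition with the S bit set, the four flags described above can be worked out as in the following Python sketch. It covers only the ADD case, and the function name is hypothetical.

```python
MASK = 0xFFFFFFFF


def add_with_flags(op1, op2):
    """Return (32-bit result, flags) for an addition that updates the flags."""
    op1 &= MASK
    op2 &= MASK
    full = op1 + op2                 # untruncated sum, may need 33 bits
    result = full & MASK
    flags = {
        "N": result >> 31,           # bit 31 of the result
        "Z": int(result == 0),       # result is zero
        "C": int(full > MASK),       # carry-out from bit 31
        # signed overflow: both operands differ in sign from the result
        "V": ((op1 ^ result) & (op2 ^ result)) >> 31,
    }
    return result, flags


print(add_with_flags(0x7FFFFFFF, 1))
print(add_with_flags(0xFFFFFFFF, 1))
```

The first call overflows from bit 30 into bit 31 (V and N set), while the second produces a carry-out and a zero result (C and Z set).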
ARM Instruction Format & Execution
Data Processing Instructions
Execution:
• A data processing instruction requires two operands, one of which is
always a register and the other is either a second register or an
immediate value.
• The second operand is passed through the barrel shifter, where it is
subject to a general shift operation; then it is combined with the first
operand in the ALU using a general ALU operation.
• Finally the result from the ALU is written back into the destination
register.
• All these operations take place in a single clock cycle.
• The PC value in the address register is incremented and copied back into
both the address register and r15 in the register bank, and the next
instruction but one is loaded into the bottom of the instruction pipeline
(I pipe).
• The immediate value, when required, is extracted from the current
instruction at the top of the instruction pipeline.
• For data processing instructions only the bottom eight bits ([7:0]) of the
instruction are used in the immediate value.
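The extraction of the immediate from bits [7:0] can be sketched in Python. The slides do not describe what happens to those eight bits next; the sketch assumes the standard ARM encoding, in which a 4-bit field in bits [11:8] rotates the 8-bit value right by twice its value. The function name and the two example instruction words are illustrative.

```python
MASK = 0xFFFFFFFF


def decode_immediate(instruction):
    """Return the 32-bit immediate operand encoded in a data processing instruction."""
    imm8 = instruction & 0xFF                   # bottom eight bits, [7:0]
    rotate = ((instruction >> 8) & 0xF) * 2     # assumed rotate field, bits [11:8]
    if rotate == 0:
        return imm8
    # rotate the 8-bit value right within a 32-bit word
    return ((imm8 >> rotate) | (imm8 << (32 - rotate))) & MASK


print(hex(decode_immediate(0xE3A00001)))  # MOV r0, #1
print(hex(decode_immediate(0xE3A004FF)))  # MOV r0, #0xFF000000
```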
ARM Instruction Format & Execution
Data Processing Instructions
• Reg-Reg
o Both source operands come from the register file.
o Rm is passed to the ALU through the barrel shifter.
o Rd = Rn operation Rm.
o r15 (PC) = AR + 4; AR = AR + 4.
[Figure: Register-Register operations]
ARM Instruction Format & Execution
Data Processing Instructions
• Reg-Imm
o One source operand is in a register and the other is an
immediate value obtained from the instruction in the pipeline.
o The immediate value is passed to the ALU through the
barrel shifter.
o Rd = Rn operation Imm.
o r15 (PC) = AR + 4; AR = AR + 4.
[Figure: Register-immediate operations]
ARM Instruction Format & Execution
Data Transfer Instructions
• Data transfer instructions move data between ARM registers
and memory.
• There are three basic forms of data transfer instruction in the ARM
instruction set:
- Single register load and store instructions: these instructions
provide the most flexible way to transfer single data items
between an ARM register and memory. The data item may be
a byte, a 16-bit half-word or a 32-bit word.
- Multiple register load and store instructions: these
instructions are less flexible than single register transfer
instructions, but enable large quantities of data to be
transferred more efficiently. They are used to copy blocks of
data around memory.
ARM Instruction Format & Execution
Data Transfer Instructions
- Single register swap instructions: these instructions allow a
value in a register to be exchanged with a value in memory,
effectively doing both a load and a store operation in one
instruction.
• It is quite possible to write any program for the ARM using the
single register load and store instructions, but there are
situations where the multiple register transfers are much
more efficient.

ARM Instruction Format & Execution
Data Transfer Instructions
Instruction Format
[Figure: single register data transfer instruction format, with the condition field labelled]

• Pre-indexed mode (P=1)
- LDR r0,[r1,#4]    ; W=0: r0 = mem[r1+4], r1 is unchanged.
- LDR r0,[r1,#4]!   ; W=1: r0 = mem[r1+4], then r1 = r1+4.
- A pre-indexed (P=1) addressing mode uses the computed
address for the load or store operation and then, when write-back
is requested (W=1), updates the base register to the computed value.
• Post-indexed mode (P=0)
- LDR r0,[r1],#4    ; irrespective of W: r0 = mem[r1], then r1 = r1+4.
- A post-indexed (P=0) addressing mode uses the unmodified
base register for the transfer and then updates the base
register to the computed address irrespective of the W bit.
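The three LDR forms above can be traced with a small Python model. This is a sketch under simplifying assumptions: memory is a dictionary keyed by byte address, registers are a dictionary keyed by name, and the `ldr` helper is hypothetical.

```python
def ldr(regs, mem, rd, rn, offset, pre_indexed, write_back):
    """Model LDR rd,[rn,#offset], LDR rd,[rn,#offset]! and LDR rd,[rn],#offset."""
    base = regs[rn]
    computed = base + offset
    if pre_indexed:                  # P=1: transfer uses the computed address
        regs[rd] = mem[computed]
        if write_back:               # W=1: base register is updated
            regs[rn] = computed
    else:                            # P=0: transfer uses the unmodified base
        regs[rd] = mem[base]
        regs[rn] = computed          # base is always updated, whatever W is


mem = {0x100: 11, 0x104: 22}

regs = {"r0": 0, "r1": 0x100}
ldr(regs, mem, "r0", "r1", 4, pre_indexed=True, write_back=False)
print(regs)   # LDR r0,[r1,#4]

regs = {"r0": 0, "r1": 0x100}
ldr(regs, mem, "r0", "r1", 4, pre_indexed=True, write_back=True)
print(regs)   # LDR r0,[r1,#4]!

regs = {"r0": 0, "r1": 0x100}
ldr(regs, mem, "r0", "r1", 4, pre_indexed=False, write_back=False)
print(regs)   # LDR r0,[r1],#4
```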
ARM Instruction Format & Execution
Data Transfer Instructions
• These instructions construct an address starting from a base
register (Rn), then adding (U=1) or subtracting (U=0) an
unsigned immediate or register offset.
• The base or computed address is used to load (L=1) or store
(L=0) an unsigned byte (B=1) or word (B=0) quantity, moving it
from memory to a register (Rd) or from the register to memory.
• When a byte is loaded into a register it is zero-extended to 32
bits.
• When a byte is stored into memory, the bottom 8 bits of the
register are stored into the addressed location.

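The roles of the U, L and B bits can be illustrated with a short Python sketch. It assumes a little-endian byte order for word transfers, always uses the computed address, and ignores write-back; the `transfer` helper and the register names are hypothetical.

```python
MASK = 0xFFFFFFFF


def transfer(regs, mem, rd, rn, offset, U, L, B):
    """Perform one single-register transfer and return the address used."""
    address = regs[rn] + offset if U else regs[rn] - offset   # U: add or subtract
    size = 1 if B else 4                                       # B: byte or word
    if L:   # load: a byte is zero-extended because the upper bits stay 0
        regs[rd] = int.from_bytes(mem[address:address + size], "little")
    else:   # store: a byte store writes only the bottom 8 bits of the register
        value = regs[rd] & (0xFF if B else MASK)
        mem[address:address + size] = value.to_bytes(size, "little")
    return address


mem = bytearray(16)
regs = {"r0": 0x12345678, "r1": 8, "r2": 0}
transfer(regs, mem, "r0", "r1", 4, U=0, L=0, B=1)   # store byte at address 8 - 4
transfer(regs, mem, "r2", "r1", 4, U=0, L=1, B=0)   # load the word at the same address
print(hex(regs["r2"]))
```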
ARM Instruction Format & Execution
Data Transfer Instructions
Execution:
• A data transfer (load or store) instruction computes a memory
address in a manner similar to the way a data processing instruction
computes its result.
• A register is used as the base address, to which is added (or from
which is subtracted) an offset, which again may be another register
or an immediate value.
• The address is sent to the address register, and in a second cycle the
data transfer takes place.
• So that the datapath is not idle during the data transfer
cycle, the ALU holds the address components from the first cycle
and is available to compute an auto-indexing modification to the
base register if required.
• If auto-indexing is not required, the computed value is not written
back to the base register in the second cycle.
ARM Instruction Format & Execution
Data Transfer Instructions
• Compute address
o AR = Rn op Disp
o r15 = AR + 4
• Store data
o AR = PC
o mem[AR] = Rd<x:y>
o If auto-indexing: Rn = Rn +/- 4

ARM Instruction Format & Execution
Branch Instructions
• These instructions neither process data nor move it
around; they simply determine which instructions get executed
next.
• The most common way to switch program execution from one
place to another is to use the branch instruction.
• The processor normally executes instructions sequentially, but
when it reaches the branch instruction it proceeds directly to
the instruction at the specified label instead of executing the
instruction immediately after the branch.
• Branch and branch with link instructions are the standard way
to cause a switch in the sequence of instruction execution.
ARM Instruction Format & Execution
Branch Instructions
• The ARM normally executes instructions from sequential word
addresses in memory, using conditional execution to skip over
individual instructions where required.
• Whenever the program must deviate from sequential
execution, a control flow instruction is used to modify the
program counter.
Instruction Format:
• A branch causes the processor to begin executing instructions from
an address computed by sign-extending the 24-bit offset
specified in the instruction, shifting it left 2 places to form a
word offset, then adding it to the PC, which contains the
address of the branch instruction + 8 bytes.
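The target address calculation described above can be written out in Python as a minimal sketch. The function name is hypothetical, and the result is wrapped to 32 bits.

```python
MASK = 0xFFFFFFFF


def branch_target(branch_address, offset24):
    """Return the address a branch at branch_address transfers control to."""
    offset = offset24 & 0xFFFFFF
    if offset & 0x800000:            # sign-extend the 24-bit field
        offset -= 0x1000000
    pc = branch_address + 8          # the PC holds the branch address + 8
    return (pc + (offset << 2)) & MASK   # shift left 2 places, then add


print(hex(branch_target(0x8000, 0x000001)))
print(hex(branch_target(0x8000, 0xFFFFFE)))
```

An offset field of 0xFFFFFE is -2 words, that is -8 bytes, which cancels the +8 in the PC, so the second call describes a branch to itself.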
ARM Instruction Format & Execution
Branch Instructions
• The assembler will compute the correct offset under normal
circumstances.
• The range of the branch instruction is +/- 32 Mbytes.
• The branch with link variant, which has the L bit (bit 24) set,
also moves the address of the instruction following the branch
into the link register (r14) of the current processor mode.
• This is normally used to perform a subroutine call, with the
return being caused by copying the link register back into the
PC.
• Both forms of the instruction may be executed conditionally
or unconditionally.
ARM Instruction Format & Execution
Branch Instructions
Execution:
• Branch instructions compute the target address in the first cycle.
• A 24-bit immediate field is extracted from the instruction and then
shifted left two bit positions to give a word-aligned offset, which is
added to the PC.
• The result is issued as an instruction fetch address, and while the
instruction pipeline refills, the return address is copied into the link
register (r14) if this is required (i.e. if the instruction is a branch with
link).
• The third cycle, which is required to complete the pipeline refilling, is
also used to make a small correction to the value stored in the link
register so that it points directly at the instruction which
follows the branch.
• This is necessary because r15 contains PC+8 whereas the address of
the next instruction is PC+4.
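The three cycles of a branch with link can be followed in a short Python sketch. It is a simplified model of the steps listed above rather than of the real datapath, and the function name is hypothetical.

```python
MASK = 0xFFFFFFFF


def branch_with_link(branch_address, offset24):
    """Return (target address, final r14) for a branch with link."""
    r15 = (branch_address + 8) & MASK        # r15 holds the branch address + 8
    offset = offset24 & 0xFFFFFF
    if offset & 0x800000:                    # sign-extend the 24-bit field
        offset -= 0x1000000
    target = (r15 + (offset << 2)) & MASK    # 1st cycle: compute the target
    r14 = r15                                # 2nd cycle: save the return address
    r14 = (r14 - 4) & MASK                   # 3rd cycle: correct it to branch address + 4
    return target, r14


target, link = branch_with_link(0x8000, 2)
print(hex(target), hex(link))
```

After the correction, r14 holds the address of the instruction that follows the branch, so copying r14 back into the PC resumes execution there.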
ARM Instruction Format & Execution
Branch Instructions
• Compute target address
o AR = PC + Disp, lsl #2
• Save return address (if required)
o r14 = PC
o AR = AR + 4
• Third cycle: make a small correction to the value stored in the
link register so that it points directly at the instruction
which follows the branch.
[Figure: (a) 1st cycle: compute branch target; (b) 2nd cycle: save return address]

The Advanced Microcontroller Bus Architecture

• ARM processor cores have bus interfaces that are optimized
for high-speed cache interfacing.
• Where a core is used, with or without a cache, as a component
on a complex system chip, some interfacing is required to
allow the ARM to communicate with other on-chip macrocells.
• ARM Limited specified the Advanced Microcontroller Bus
Architecture, AMBA, to standardize the on-chip connection of
different macrocells.
• Three buses are defined within the AMBA specification.

The Advanced Microcontroller Bus Architecture

- The Advanced High Performance Bus (AHB) is used to connect
high-performance system modules. It supports burst mode
data transfers and split transactions, and all timing is
referenced to a single clock edge.
- The Advanced System Bus (ASB) is used to connect
high-performance system modules. It supports burst mode data
transfers.
- The Advanced Peripheral Bus (APB) offers a simple interface for
low-performance peripherals.
• A typical AMBA-based microcontroller will incorporate either
an AHB or an ASB together with an APB.
• The APB is generally used as a local secondary bus which
appears as a single slave module on the AHB or ASB.
The Advanced Microcontroller Bus Architecture
Typical AMBA-based system
[Figure: block diagram. Blocks shown: ARM core/CPU, on-chip RAM, test interface controller, external bus interface, DMA controller, bridge, UART, timer and parallel interface. Buses shown: AHB or ASB, and APB.]

The Advanced Microcontroller Bus Architecture
• Arbitration: A bus transaction is initiated by a bus master,
which requests access from a central arbiter.
• The arbiter decides priorities when there are conflicting
requests, and its design is a system-specific issue.
• The ASB only specifies the protocol which must be followed:
- The master, x, issues a request (AREQx) to the central arbiter.
- When the bus is available, the arbiter issues a grant (AGNTx) to
the master.
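The request and grant handshake can be sketched in Python. Because the arbiter design is system-specific, the fixed-priority policy used here is only an assumption chosen for illustration, and the master names and the `arbitrate` helper are hypothetical.

```python
def arbitrate(requests, priority):
    """Return the master to be granted the bus, or None if nobody is requesting.

    requests maps a master name to True when its AREQx is asserted;
    priority lists the master names from highest to lowest priority.
    """
    for master in priority:
        if requests.get(master, False):
            return master      # the arbiter asserts AGNTx for this master
    return None


requests = {"cpu": True, "dma": True, "tester": False}
print(arbitrate(requests, ["dma", "cpu", "tester"]))
```

A real arbiter could just as well use a round-robin or other policy; only the request and grant signalling is fixed by the protocol.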
• Bus Transfers: When a master has been granted access to the
bus, it issues address and control information to indicate the
type of the transfer and the slave device which should
respond.
• The following signal is used to define the transaction timing:
- The bus clock, BCLK. This will usually be the same as MCLK, the
ARM processor clock.
The Advanced Microcontroller Bus Architecture
• The bus master which holds the grant then proceeds with the bus
transaction using the following signals:
- Bus transaction, BTRAN[1:0], indicates whether the next bus cycle
will be address-only, sequential or non-sequential.
- The address bus, BA[31:0].
- Bus transfer direction, BWRITE.
- Bus protection signals, BPROT[1:0], which indicate instruction or
data fetches and supervisor or user access.
- The transfer size, BSIZE[1:0], specifies a byte, half-word or word
transfer.
- Bus lock, BLOK, allows a master to retain the bus to complete a
read-modify-write transaction.
- The data bus, BD[31:0], used to transmit write data and to receive
read data. In an implementation with multiplexed address and data,
the address is also transmitted down this bus.
The Advanced Microcontroller Bus Architecture
• A slave unit may process the requested transaction immediately,
accepting write data or issuing read data on BD[31:0], or signal one
of the following responses:
- Bus wait, BWAIT, allows a slave module to insert wait states when it
cannot complete the transaction in the current cycle.
- Bus last, BLAST, allows a slave to terminate a sequential burst to
force the bus master to issue a new bus transaction request to
continue.
- Bus error, BERROR, indicates a transaction that cannot be
completed. If the master is a processor it should abort the transfer.
• Bus reset: The ASB supports a number of independent on-chip
modules, many of which may be able to drive the data bus (and
some control lines).
- Correct ASB power-up is ensured by imposing an asynchronous reset
mode that forces all drivers off the bus.
The Advanced Microcontroller Bus Architecture
• Test Interface: A possible use of the AMBA is to provide
support for a modular testing methodology through the Test
Interface Controller.
- This approach allows each module on the AMBA to be tested
independently by allowing an external tester to appear as a
bus master on the ASB.
- The only requirement for test mode to be supported is that
the tester has access to the ASB through a 32-bit bidirectional
port.

The Advanced Microcontroller Bus Architecture
• Advanced Peripheral Bus: The ASB offers a relatively
high-performance on-chip interconnect which suits processor,
memory and peripheral macrocells with some built-in
interface sophistication.
- The APB is a simple static bus which operates as a stub on an
ASB to offer a minimalist interface to very simple peripheral
macrocells.
- The bus includes address (PADDR[n:0]) and read and write
data (PRDATA[m:0] and PWDATA[m:0], where m is 7, 15 or 31)
buses which are no wider than necessary for the connected
peripherals, a read/write direction indicator (PWRITE),
individual peripheral select strobes (PSELx) and a peripheral
timing strobe (PENABLE).
- APB transfers are timed to PCLK and all APB devices are reset
with PRESETn.
The Advanced Microcontroller Bus Architecture
• Advanced High Performance Bus: The AHB is intended to
replace the ASB in very high performance systems.
• The following features differentiate the AHB from the ASB:
- It supports split transactions, where a slave with a long
response latency can free up the bus for other transfers while
it prepares its data for transmission.
- It uses a single clock edge to control all of its operations, which
aids design verification.
- It uses a centrally multiplexed bus scheme rather than a
bidirectional bus.
- It supports wider data bus configurations of 64 or 128 bits.
• The multiplexed bus scheme may appear to introduce a lot of
excess wiring, but bidirectional buses create a number of
problems for designers and even more for synthesis systems.
