RNSIT NOTES ARM EMBEDDED SYSTEMS

MODULE-I-ARM 32-BIT MICROCONTROLLER

1.1 INTRODUCTION
The ARM Cortex™-M3 processor, the first of the Cortex generation of processors released
by ARM in 2006, was primarily designed to target the 32-bit microcontroller market. The
Cortex-M3 processor provides excellent performance at low gate count and comes with
many new features previously available only in high-end processors. The Cortex-M3
addresses the requirements for the 32-bit embedded processor market in the following
ways:

• Greater performance efficiency: allowing more work to be done without
increasing the frequency or power requirements
• Low power consumption: enabling longer battery life, especially critical in
portable products including wireless networking applications
• Enhanced determinism: guaranteeing that critical tasks and interrupts are
serviced as quickly as possible and in a known number of cycles
• Improved code density: ensuring that code fits in even the smallest memory
footprints
• Ease of use: providing easier programmability and debugging for the growing
number of 8-bit and 16-bit users migrating to 32 bits
• Lower cost solutions: reducing 32-bit-based system costs close to those of legacy
8-bit and 16-bit devices and enabling low-end, 32-bit microcontrollers to be priced
at less than US$1 for the first time
• Wide choice of development tools: from low-cost or free compilers to full-
featured development suites from many development tool vendors

1.2 THUMB-2 TECHNOLOGY


The Thumb-2 technology extended the Thumb Instruction Set Architecture (ISA) into a
highly efficient and powerful instruction set that delivers significant benefits in terms of
ease of use, code size, and performance. The extended instruction set in Thumb-2 is a
superset of the previous 16-bit Thumb instruction set, with additional 16-bit instructions
alongside 32-bit instructions. It allows more complex operations to be carried out in the
Thumb state, thus allowing higher efficiency by reducing the number of state switches
between ARM state and Thumb state. Focused on small memory system devices such as
microcontrollers and reducing the size of the processor, the Cortex-M3 supports only the
Thumb-2 (and traditional Thumb) instruction set. Instead of using ARM instructions for
some operations, as in traditional ARM processors, it uses the Thumb-2 instruction set
for all operations. As a result, the Cortex-M3 processor is not backward compatible with
traditional ARM processors. Nevertheless, the Cortex-M3 processor can execute almost
all the 16-bit Thumb instructions, including all 16-bit Thumb instructions supported on
ARM7 family processors, making application porting easy. With support for both 16-bit
and 32-bit instructions in the Thumb-2 instruction set, there is no need to switch the
processor between Thumb state (16-bit instructions) and ARM state (32-bit instructions).
The Thumb-2 instruction set is a very important feature of the ARMv7 architecture.
Compared with the instructions supported on ARM7 family processors (ARMv4T
architecture), the Cortex-M3 processor instruction set has a large number of new
features. For the first time, a hardware divide instruction is available on an ARM processor,
and a number of multiply instructions are also available on the Cortex-M3 processor to
improve data-crunching performance. The Cortex-M3 processor also supports unaligned
data accesses, a feature previously available only in high-end processors.

Fig1. The Relationship between the Thumb Instruction Set in Thumb-2 Technology and the
Traditional Thumb

1.3 Cortex-M3 Processor Applications


With its high performance, high code density, and small silicon footprint, the Cortex-M3
processor is ideal for a wide variety of applications:

• Low-cost microcontrollers: The Cortex-M3 processor is ideally suited for low-cost
microcontrollers, which are commonly used in consumer products, from toys to
electrical appliances.
• Automotive: Another ideal application for the Cortex-M3 processor is in the
automotive industry. The Cortex-M3 processor has very high-performance
efficiency and low interrupt latency, allowing it to be used in real-time systems.
• Data communications: The processor’s low power and high efficiency, coupled
with instructions in Thumb-2 for bit-field manipulation, make the Cortex-M3 ideal
for many communications applications, such as Bluetooth and ZigBee.
• Industrial control: In industrial control applications, simplicity, fast response,
and reliability are key factors.
• Consumer products: In many consumer products, a high-performance
microprocessor (or several of them) is used. The Cortex-M3 processor, being a
small processor, is highly efficient and low in power and supports an MPU enabling
complex software to execute while providing robust memory protection.

1.3.1 FUNDAMENTALS OF CORTEX M3


• A 32-bit microprocessor.

• A 32-bit data path, a 32-bit register bank, and 32-bit memory interfaces.

• Harvard architecture: a separate instruction bus and data bus.

• Allows instructions and data accesses to take place at the same time, and as a
result of this, the performance of the processor increases because data accesses
do not affect the instruction pipeline.

• Multiple bus interfaces on Cortex-M3 with optimized usage and the ability to be
used simultaneously.
• However, the instruction and data buses share the same memory space (a unified
memory system).

• In other words, you cannot get 8 GB of memory space just because you have
separate bus interfaces.

1.4 ARCHITECTURE OF ARM CORTEX M3

Fig2. Architecture of ARM Cortex M3

• For complex applications that require more memory system features, the Cortex-
M3 processor has an optional Memory Protection Unit (MPU), and it is possible to
use an external cache if it’s required.

• Both little endian and big endian memory systems are supported.

• The Cortex-M3 processor includes a number of fixed internal debugging
components.

• These components provide debugging operation support and features, such as
breakpoints and watchpoints.

1.5 REGISTERS
The Cortex-M3 processor has registers R0 through R15 (see Fig. 3).

R13 (the stack pointer) is banked, with only one copy of the R13 visible at a time.

R0–R12: General-Purpose Registers. R0–R12 are 32-bit general-purpose registers for data
operations. Some 16-bit Thumb instructions can only access a subset of these registers (low
registers, R0–R7).

R13: Stack Pointers. The Cortex-M3 contains two stack pointers (R13). They are banked so that
only one is visible at a time. The two stack pointers are as follows:

• Main Stack Pointer (MSP): The default stack pointer, used by the operating system (OS) kernel
and exception handlers

• Process Stack Pointer (PSP): Used by user application code

The lowest 2 bits of the stack pointers are always 0, which means they are always word aligned.

Fig. 3. Registers in the Cortex-M3

R14: The Link Register. When a subroutine is called, the return address is stored in the link register.

R15: The Program Counter. The program counter is the current program address. This register can
be written to control the program flow.

Special Registers. The Cortex-M3 processor also has a number of special registers. They are as
follows:

• Program Status registers (PSRs)

• Interrupt Mask registers (PRIMASK, FAULTMASK, and BASEPRI)

• Control register (CONTROL)

These registers have special functions and can be accessed only by
special instructions. They cannot be used for normal data processing.

Fig3. Special Registers in the Cortex-M3.


1.5.1 STACK POINTER
R13 is the stack pointer (SP). In the Cortex-M3 processor, there are two SPs. This duality
allows two separate stack memories to be set up. When using the register name R13, you
can only access the current SP; the other one is inaccessible unless you use the special
instructions MSR (move to special register from general-purpose register) and MRS (move
special register to general-purpose register). The two SPs are as follows:

• Main Stack Pointer (MSP): This is the default SP; it is used by the operating system (OS)
kernel, exception handlers, and all application code that requires privileged access.

• Process Stack Pointer (PSP): This is used by the base-level application code (when not
running an exception handler).

In the Cortex-M3, the instructions for accessing stack memory are PUSH and POP. The
assembly language syntax is as follows:

PUSH {R0} ; R13 = R13 - 4, then Memory[R13] = R0
POP {R0} ; R0 = Memory[R13], then R13 = R13 + 4
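
The PUSH and POP behaviour described above can be mimicked on a host machine with a small C model. This is only a sketch: the `Stack` type, the `STACK_BYTES` size and the function names are hypothetical, and the bounds checks are an addition that the hardware does not perform.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64u              /* hypothetical stack size for this sketch */

typedef struct {
    uint32_t mem[STACK_BYTES / 4u];  /* simulated stack memory, one word per slot */
    uint32_t sp;                     /* byte offset into mem, stands in for R13 */
} Stack;

/* An empty full-descending stack starts just past the highest word. */
void stack_init(Stack *s) { s->sp = STACK_BYTES; }

/* PUSH {Rn}: R13 = R13 - 4, then Memory[R13] = Rn */
void stack_push(Stack *s, uint32_t value) {
    assert(s->sp >= 4u);             /* overflow guard (not done by hardware) */
    s->sp -= 4u;
    s->mem[s->sp / 4u] = value;
}

/* POP {Rn}: Rn = Memory[R13], then R13 = R13 + 4 */
uint32_t stack_pop(Stack *s) {
    assert(s->sp < STACK_BYTES);     /* underflow guard (not done by hardware) */
    uint32_t value = s->mem[s->sp / 4u];
    s->sp += 4u;
    return value;
}
```

Pushing two values and popping them returns them in reverse order, and the stack pointer ends where it started.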

1.5.2 LINK REGISTER


R14 is the link register (LR). Inside an assembly program, you can write it as either R14
or LR. LR is used to store the return program counter (PC) when a subroutine or function
is called—for example, when you’re using the branch and link (BL) instruction:

Main                    ; Main program
    ...
    BL function1        ; Call function1 using Branch with Link instruction.
                        ; PC = function1 and
                        ; LR = the next instruction in main
    ...
function1
    ...                 ; Program code for function 1
    BX LR               ; Return

1.5.3 PROGRAM COUNTER


R15 is the PC. You can access it in assembler code by either R15 or PC. Because of the
pipelined nature of the Cortex-M3 processor, when you read this register, you will find
that the value is different than the location of the executing instruction, normally by 4.
For example:

0x1000 : MOV R0, PC ; R0 = 0x1004

The program counter is the current program address. This register can be written to
control the program flow.

1.6 SPECIAL REGISTERS


The special registers in the Cortex-M3 processor include the following

• Program Status registers (PSRs)

• Interrupt Mask registers (PRIMASK, FAULTMASK, and BASEPRI)

• Control register (CONTROL)

Special registers can only be accessed via MSR and MRS
instructions; they do not have memory addresses:

MRS <reg>, <special_reg> ; Read special register

MSR <special_reg>, <reg> ; Write to special register

1.6.1 PROGRAM STATUS REGISTERS


The PSRs are subdivided into three status registers:

1. Application Program Status register (APSR)
2. Interrupt Program Status register (IPSR)
3. Execution Program Status register (EPSR)

The three PSRs can be accessed together or separately using the special register access
instructions MSR and MRS. When they are accessed as a collective item, the name xPSR
is used. You can read the PSRs using the MRS instruction. You can also change the APSR
using the MSR instruction, but EPSR and IPSR are read-only. For example:

MRS r0, APSR ; Read Flag state into R0

MRS r0, IPSR ; Read Exception/Interrupt state

MRS r0, EPSR ; Read Execution state

MSR APSR, r0 ; Write Flag state



Fig4. Program Status Registers (PSRs) in the Cortex-M3.

Fig5. Combined Program Status Registers (xPSR) in the Cortex-M3.
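
Because the figures are not reproduced here, the following C sketch shows how a combined xPSR value could be split into its main fields. The bit positions (N, Z, C, V, Q in bits 31 to 27, the Thumb bit in bit 24, the exception number in bits 8 to 0) are taken from the ARMv7-M architecture rather than from the text above, and the `XpsrFields` type and `xpsr_decode` name are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool n, z, c, v, q;    /* APSR flags, bits 31..27 */
    bool thumb;            /* EPSR T bit, bit 24 */
    uint32_t exception;    /* IPSR exception number, bits 8..0 */
} XpsrFields;

/* Split a raw xPSR value (for example one read with MRS r0, PSR). */
XpsrFields xpsr_decode(uint32_t xpsr) {
    XpsrFields f;
    f.n = ((xpsr >> 31) & 1u) != 0u;
    f.z = ((xpsr >> 30) & 1u) != 0u;
    f.c = ((xpsr >> 29) & 1u) != 0u;
    f.v = ((xpsr >> 28) & 1u) != 0u;
    f.q = ((xpsr >> 27) & 1u) != 0u;
    f.thumb = ((xpsr >> 24) & 1u) != 0u;
    f.exception = xpsr & 0x1FFu;
    return f;
}
```

The ICI/IT fields of the EPSR are left out to keep the sketch short.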

1.6.2 PRIMASK, FAULTMASK, AND BASEPRI REGISTERS


The PRIMASK and BASEPRI registers are useful for temporarily disabling interrupts in
timing-critical tasks. An OS could use FAULTMASK to temporarily disable fault handling
when a task has crashed. In this scenario, a number of different faults might be taking
place when a task crashes. Once the core starts cleaning up, it might not want to be
interrupted by other faults caused by the crashed process. Therefore, the FAULTMASK
gives the OS kernel time to deal with fault conditions.

To access the PRIMASK, FAULTMASK, and BASEPRI registers, a number of functions are
available in the device driver libraries provided by the microcontroller vendors. For
example:

x = __get_BASEPRI(); // Read BASEPRI register

x = __get_PRIMASK(); // Read PRIMASK register

x = __get_FAULTMASK(); // Read FAULTMASK register

__set_BASEPRI(x); // Set new value for BASEPRI

__set_PRIMASK(x); // Set new value for PRIMASK

__set_FAULTMASK(x); // Set new value for FAULTMASK

__disable_irq(); // Set PRIMASK, disable IRQ

__enable_irq(); // Clear PRIMASK, enable IRQ
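
To make the effect of the three mask registers concrete, here is a host-side C sketch of the masking decision. The rules encoded here follow the ARMv7-M behaviour and are not spelled out in the text above: a lower priority value means a higher priority, NMI has priority -2 and hard fault -1, PRIMASK blocks everything with configurable priority, FAULTMASK additionally blocks hard fault, and a nonzero BASEPRI blocks every exception whose priority value is greater than or equal to it. The `MaskRegs` type and `exception_is_masked` name are hypothetical, and the unimplemented low priority bits of a real device are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool primask;      /* 1 = block all exceptions with configurable priority */
    bool faultmask;    /* 1 = block everything except NMI */
    uint8_t basepri;   /* 0 = disabled, otherwise block priority values >= basepri */
} MaskRegs;

/* priority: -2 for NMI, -1 for hard fault, 0..255 for configurable exceptions */
bool exception_is_masked(const MaskRegs *m, int priority) {
    if (m->faultmask && priority >= -1) {
        return true;
    }
    if (m->primask && priority >= 0) {
        return true;
    }
    if (m->basepri != 0u && priority >= (int)m->basepri) {
        return true;
    }
    return false;
}
```

With BASEPRI set to 0x40, for instance, an interrupt at priority 0x20 still gets through while one at 0x60 is held back.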

1.6.3 CONTROL REGISTER


The control register has two bits: CONTROL[1] selects which stack pointer is in use, and
CONTROL[0] selects the privilege level in thread mode.

• CONTROL[1]
o In the Cortex-M3, the CONTROL[1] bit is always 0 in handler mode.
However, in the thread or base level, it can be either 0 or 1.
o This bit is writable only when the core is in thread mode and privileged.
o In the user state or handler mode, writing to this bit is not allowed.
o Aside from writing to this register, another way to change this bit is to
change bit 2 of the LR when in exception return.
• CONTROL[0]
o The CONTROL[0] bit is writable only in a privileged state.
o Once the processor enters the user state, the only way to switch back to privileged is to
trigger an interrupt and change this in the exception handler.

To access the control register in C, the following CMSIS functions are
available in CMSIS compliant device driver libraries:

x = __get_CONTROL(); // Read the current value of CONTROL
__set_CONTROL(x); // Set the CONTROL value to x

To access the control register in assembly, the MRS and MSR instructions are used:

MRS r0, CONTROL ; Read CONTROL register into R0

MSR CONTROL, r0 ; Write R0 into CONTROL register



1.7 OPERATION MODES


• The Cortex-M3 processor has two modes and two privilege levels.

• The operation modes (thread mode and handler mode) determine whether the
processor is running a normal program or running an exception handler like an
interrupt handler or system exception handler.

• The privilege levels (privileged level and user level) provide a mechanism for
safeguarding memory accesses to critical regions as well as providing a basic
security model.

• Software in the privileged access level can switch the program into the user access
level using the control register.

• When an exception takes place, the processor will always switch back to the
privileged state and return to the previous state when exiting the exception
handler.

• A user program cannot change back to the privileged state by writing to the control
register.

• It has to go through an exception handler that programs the control register to
switch the processor back into the privileged access level when returning to thread
mode.

• If the MPU is present, it can be used in conjunction with privilege levels to protect
critical memory locations, such as programs and data for the OS.
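
The privilege rules in the list above can be summarised as a small state model in C. This is a simplified sketch for a host machine: the `Core` type and function names are hypothetical, nested exceptions are ignored, and CONTROL[0] = 1 is taken to mean the user level in thread mode.

```c
#include <stdbool.h>

typedef enum { MODE_THREAD, MODE_HANDLER } Mode;

typedef struct {
    Mode mode;
    bool control0;   /* CONTROL[0]: 1 = user (unprivileged) level in thread mode */
} Core;

/* Handler mode is always privileged; thread mode depends on CONTROL[0]. */
bool is_privileged(const Core *c) {
    return c->mode == MODE_HANDLER || !c->control0;
}

/* Attempt to write CONTROL[0]; the write has no effect at user level.
   Returns true if the write took effect. */
bool write_control0(Core *c, bool value) {
    if (!is_privileged(c)) {
        return false;
    }
    c->control0 = value;
    return true;
}

/* Taking an exception switches to handler mode (privileged). */
void exception_entry(Core *c) { c->mode = MODE_HANDLER; }

/* Returning from the exception goes back to thread mode; the privilege
   level there is whatever CONTROL[0] now says. */
void exception_return(Core *c) { c->mode = MODE_THREAD; }
```

A thread that has dropped to user level cannot undo that on its own; only a handler that clears CONTROL[0] before returning restores privileged thread mode.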

Fig6. Operation Modes and Privilege Levels in Cortex-M3.

Fig7. Allowed Operation Mode Transitions.

Fig8. Switching of Operation Mode by Programming the Control Register or by Exceptions.

• In simple applications, there is no need to separate the privileged and user access
levels.

• In these cases, there is no need to use user access level and no need to program
the control register.

• It is recommended to separate the user application stack from the kernel stack
memory to avoid the possibility of crashing a system caused by stack operation
errors in user programs.

• With this arrangement, the user program (running in thread mode) uses the PSP,
and the exception handlers use the MSP. The switching of SPs is automatic upon
entering or leaving the exception handlers.

Fig9. Simple Applications Do Not Require User Access Level in Thread Mode.

Fig10. Switching Processor Mode at Interrupt.

Fig11. Switching Processor Mode and Privilege Level at Interrupt.

1.8 STACK IMPLEMENTATION
The Cortex-M3 uses a full-descending stack operation model. The SP points to the last
data pushed to the stack memory, and the SP decrements before a new PUSH operation.

1.8.1 ONE REGISTER IN EACH STACK OPERATION

1.8.2 MULTIPLE REGISTER STACK


1.8.3 COMBINING POP AND RETURN

For POP operations, the data is read from the memory location pointed to by the SP, and then
the SP is incremented. The contents in the memory location are unchanged but will be
overwritten when the next PUSH operation takes place. Each PUSH/POP operation
transfers 4 bytes of data (each register contains 1 word, or 4 bytes), so the SP
decrements/increments by 4 at a time, or by a multiple of 4 if more than 1 register is pushed
or popped.

In the Cortex-M3, R13 is defined as the SP. When an interrupt takes place, a number of
registers will be pushed automatically, and R13 will be used as the SP for this stacking
process. Similarly, the pushed registers will be restored/popped automatically when
exiting an interrupt handler, and the SP will also be adjusted.

1.8.4 THE TWO-STACK MODEL IN THE CORTEX-M3


The Cortex-M3 has two SPs: the MSP and the PSP.

• The SP register to be used is controlled by bit 1 of the control register (CONTROL[1]).
• When CONTROL[1] is 0, the MSP is used for both thread mode and handler mode.
In this arrangement, the main program and the exception handlers share the same
stack memory region. This is the default setting after power-up.
• When CONTROL[1] is 1, the PSP is used in thread mode. In this arrangement,
the main program and the exception handler can have separate stack memory
regions.
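
The selection rule for the two stack pointers can be written as a single function. The sketch below is a host-side model only; the `BankedSp` type and `active_sp` name are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { MODE_THREAD, MODE_HANDLER } Mode;

typedef struct {
    uint32_t msp;     /* Main Stack Pointer */
    uint32_t psp;     /* Process Stack Pointer */
    bool control1;    /* CONTROL[1]: 1 = thread mode uses the PSP */
    Mode mode;
} BankedSp;

/* Return the stack pointer that the name R13 refers to right now:
   the PSP only in thread mode with CONTROL[1] = 1, otherwise the MSP. */
uint32_t *active_sp(BankedSp *b) {
    if (b->mode == MODE_THREAD && b->control1) {
        return &b->psp;
    }
    return &b->msp;
}
```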

CONTROL[1] = 0: Both Thread Level and Handler Use Main Stack.

It is possible to perform read/write operations directly to the MSP and PSP, without any
confusion of which R13 you are referring to. Provided that you are in privileged level, you
can access MSP and PSP values:

In general, it is not recommended to change the stack address.

1.9 MEMORY MAP

• The Cortex-M3 has a predefined memory map.
• This allows the built-in peripherals, such as the interrupt controller and the debug
components, to be accessed by simple memory access instructions.
• Thus, most system features are accessible in C program code.
• The predefined memory map also allows the Cortex-M3 processor to be highly
optimized for speed and ease of integration in system-on-a-chip (SoC) designs.
• Overall, the 4 GB memory space can be divided into ranges.
• The Cortex-M3 design has an internal bus infrastructure optimized for this
memory usage.
• In addition, the design allows these regions to be used differently. For example,
data memory can still be put into the CODE region, and program code can be
executed from an external Random Access Memory (RAM) region.
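
Since the memory map figure is not reproduced here, the following C sketch shows one way to classify an address into its region. The boundaries are those of the standard Cortex-M3 memory map (six regions from 0x00000000 to 0xFFFFFFFF) rather than values stated in the text above, and the `Region` type and `region_of` name are hypothetical.

```c
#include <stdint.h>

typedef enum {
    REGION_CODE,             /* 0x00000000 - 0x1FFFFFFF */
    REGION_SRAM,             /* 0x20000000 - 0x3FFFFFFF */
    REGION_PERIPHERAL,       /* 0x40000000 - 0x5FFFFFFF */
    REGION_EXTERNAL_RAM,     /* 0x60000000 - 0x9FFFFFFF */
    REGION_EXTERNAL_DEVICE,  /* 0xA0000000 - 0xDFFFFFFF */
    REGION_SYSTEM            /* 0xE0000000 - 0xFFFFFFFF */
} Region;

/* Classify a 32-bit address by comparing against each region's upper bound. */
Region region_of(uint32_t addr) {
    if (addr < 0x20000000u) return REGION_CODE;
    if (addr < 0x40000000u) return REGION_SRAM;
    if (addr < 0x60000000u) return REGION_PERIPHERAL;
    if (addr < 0xA0000000u) return REGION_EXTERNAL_RAM;
    if (addr < 0xE0000000u) return REGION_EXTERNAL_DEVICE;
    return REGION_SYSTEM;
}
```

For example, the NVIC register at 0xE000E100 used later in these notes falls in the system region.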

1.10 BUILT IN NESTED VECTORED INTERRUPT CONTROLLER


The Cortex-M3 processor includes an interrupt controller called the Nested Vectored
Interrupt Controller (NVIC). It is closely coupled to the processor core and provides a
number of features as follows:

• Nested interrupt support
• Vectored interrupt support
• Dynamic priority changes support
• Reduction of interrupt latency
• Interrupt masking

1.10.1 NESTED INTERRUPT SUPPORT


The NVIC provides nested interrupt support. All the external interrupts and most of the
system exceptions can be programmed to different priority levels. When an interrupt
occurs, the NVIC compares the priority of this interrupt to the current running priority
level. If the priority of the new interrupt is higher than the current level, the interrupt
handler of the new interrupt will override the current running task.

1.10.2 VECTORED INTERRUPT SUPPORT


The Cortex-M3 processor has vectored interrupt support. When an interrupt is accepted,
the starting address of the interrupt service routine (ISR) is located from a vector table in
memory. There is no need to use software to determine and branch to the starting address
of the ISR. Thus, it takes less time to process the interrupt request.
1.10.3 DYNAMIC PRIORITY CHANGES SUPPORT
Priority levels of interrupts can be changed by software during run time. Interrupts that
are being serviced are blocked from further activation until the ISR is completed, so their
priority can be changed without risk of accidental reentry.

1.10.4 REDUCTION OF INTERRUPT LATENCY


The Cortex-M3 processor also includes a number of advanced features to lower the
interrupt latency. These include automatic saving and restoring some register contents,
reducing delay in switching from one ISR to another, and handling of late arrival
interrupts.

1.10.5 INTERRUPT MASKING


Interrupts and system exceptions can be masked based on their priority level or masked
completely using the interrupt masking registers BASEPRI, PRIMASK, and FAULTMASK.
They can be used to ensure that time-critical tasks can be finished on time without being
interrupted. The system-level memory region contains the interrupt controller and the
debug components. These devices have fixed addresses. By having fixed addresses for
these peripherals, you can port applications between different Cortex-M3 products much
more easily.

1.11 EXCEPTIONS AND INTERRUPTS


• The Cortex-M3 supports a number of exceptions, including a fixed number of
system exceptions and a number of interrupts, commonly called IRQs.
• The number of interrupt inputs on a Cortex-M3 microcontroller depends on the
individual design.
• The typical number of interrupt inputs is 16 or 32. However, you might find some
microcontroller designs with more (or fewer) interrupt inputs.
• Besides the interrupt inputs, there is also a nonmaskable interrupt (NMI) input
signal.
• The actual use of NMI depends on the design of the microcontroller or system-on-
chip (SoC) product you use.
• In most cases, the NMI could be connected to a watchdog timer or a voltage-
monitoring block that warns the processor when the voltage drops below a certain
level.
• The NMI exception can be activated any time, even right after the core exits reset.
• A number of the system exceptions are fault-handling exceptions that can be
triggered by various error conditions.

1.12 VECTOR TABLES


1. When an exception event takes place on the Cortex-M3 and is accepted by the
processor core, the corresponding exception handler is executed.
2. To determine the starting address of the exception handler, a vector table
mechanism is used.
3. The vector table is an array of word data inside the system memory, each
representing the starting address of one exception type.
4. The vector table is relocatable, and the relocation is controlled by a relocation
register in the NVIC.
5. After reset, this relocation control register is reset to 0; therefore, the vector table
is located in address 0x0 after reset.
6. For example, if the reset is exception type 1, the address of the reset vector is 1
times 4 (each word is 4 bytes), which equals 0x00000004, and NMI vector (type 2)
is located in 2 × 4 = 0x00000008.
7. The address 0x00000000 is used to store the starting value for the MSP.
8. The LSB of each exception vector indicates whether the exception is to be executed
in the Thumb state. Because the Cortex-M3 can support only Thumb instructions,
the LSB of all the exception vectors should be set to 1.
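
The address arithmetic in items 5 to 8 can be checked with a few lines of C. This is a minimal sketch with hypothetical function names; `table_base` stands for the value of the relocation register, which is 0 after reset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address of the vector for a given exception type: base + type * 4. */
uint32_t vector_address(uint32_t table_base, uint32_t exception_number) {
    return table_base + 4u * exception_number;
}

/* The LSB of a vector value marks Thumb state; it must be 1 on the Cortex-M3. */
bool vector_is_thumb(uint32_t vector_value) {
    return (vector_value & 1u) != 0u;
}

/* The handler itself starts at the vector value with the LSB cleared. */
uint32_t handler_address(uint32_t vector_value) {
    return vector_value & ~(uint32_t)1u;
}
```

With the table at address 0, entry 0 holds the initial MSP value, the reset vector sits at 0x00000004 and the NMI vector at 0x00000008, matching the examples in the list.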

1.12 THE BUS INTERFACE


The main bus interfaces are as follows:
• Code memory buses
• System bus
• Private peripheral bus
The code memory region access is carried out on the code memory buses, which
physically consist of two buses, one called I-Code and the other called D-Code.
The system bus is used to access memory and peripherals. This provides access to the
Static Random Access Memory (SRAM), peripherals, external RAM, external devices, and
part of the system-level memory regions.
The private peripheral bus provides access to a part of the system-level memory dedicated
to private peripherals, such as debugging components.

1.13 MPU
• The Cortex-M3 has an optional MPU.
• This unit allows access rules to be set up for privileged access and user program
access.
• When an access rule is violated, a fault exception is generated, and the fault
exception handler will be able to analyze the problem and correct it, if possible.
• The MPU can be used in various ways.
• In common scenarios, the OS can set up the MPU to protect data used by the OS
kernel and other privileged processes from untrusted user programs.
• The MPU can also be used to make memory regions read-only, to prevent
accidental erasing of data or to isolate memory regions between different tasks in
a multitasking system.
• Overall, it can help make embedded systems more robust and reliable.

1.14 THUMB-2 INSTRUCTION SET


• The Cortex-M3 supports the Thumb-2 instruction set.
• This is one of the most important features of the Cortex-M3 processor because it
allows 32-bit instructions and 16-bit instructions to be used together for high code
density and high efficiency.
• It is flexible and powerful yet easy to use.
• In previous ARM processors, the central processing unit (CPU) had two operation
states: a 32-bit ARM state and a 16-bit Thumb state.
• In the ARM state, the instructions are 32 bits and can execute all supported
instructions with very high performance.
• In the Thumb state, the instructions are 16 bits, so there is a much higher
instruction code density, but the Thumb state does not have all the functionality
of ARM instructions and may require more instructions to complete certain types
of operations.

1.14.1 BENEFITS OF THUMB-2 INSTRUCTIONS


• No state switching overhead, saving both execution time and instruction space
• No need to separate ARM code and Thumb code source files, making software
development and maintenance easier
• It’s easier to get the best efficiency and performance, in turn making it easier to
write software, because there is no need to worry about switching code between
ARM and Thumb to try to get the best density/performance

QUESTION BANK

MODULE 1

1. With a neat diagram explain the architecture of ARM Cortex M3 microcontroller. (6)
2. Give the applications of the Cortex-M3 processor. (6)
3. Explain the operation modes of ARM Cortex M3. (6)
4. Give the memory map of Cortex M3. (4)
5. Briefly describe the functions of the various units with the architectural block
diagram of ARM Cortex M3. (6)
6. Discuss the functions of R0 to R15 and other special registers in Cortex M3. (7)
7. Describe the functions of exceptions with a vector table and priorities. (6)
8. Explain two stack model and reset sequence in ARM Cortex M3. (7)
9. With a neat diagram explain the thumb-2 set architecture in comparison with
thumb and ARM. (4)
10. Discuss various profiles of ARM processors. (3)
11. Bring out the differences between 1) RISC and CISC architecture 2) Von Neumann
and Harvard architecture and 3) microprocessors and microcontrollers. (6)
12. Write a short note on interrupts and exceptions supported by Cortex M3. (3)
13. Give the two stack model in Cortex M3. (6)
14. Write short note on reset sequence. (4)
MODULE 2

ARM Cortex M3 Instruction Sets and Programming
2.1 Assembly Basics
Here, we introduce some basic syntax of ARM assembly to make it easier to understand
the rest of the code examples.
2.1.1 Assembler Language: Basic Syntax
In assembler code, the following instruction formatting is commonly used:

Label opcode operand1, operand2, ...; Comments


The label is optional. Some of the instructions might have a label in front of them so that
the address of the instructions can be determined using the label. Then, you will find the
opcode (the instruction) followed by a number of operands. Normally, the first operand is
the destination of the operation. The number of operands in an instruction depends on
the type of instruction, and the syntax format of the operand can also be different. For
example, immediate data are usually in the form #number, as shown here:
MOV R0, #0x12; Set R0 = 0x12 (hexadecimal)
MOV R1, #'A'; Set R1 = ASCII character A
The text after each semicolon (;) is a comment. These comments do not affect the program
operation, but they can make programs easier for humans to understand. You can define
constants using EQU, and then use them inside your program code. For example,
NVIC_IRQ_SETEN0 EQU 0xE000E100
NVIC_IRQ0_ENABLE EQU 0x1
...
LDR R0,=NVIC_IRQ_SETEN0 ; LDR here is a pseudo-instruction that the
                        ; assembler converts to a PC-relative load
MOV R1,#NVIC_IRQ0_ENABLE ; Move immediate data to register
STR R1,[R0] ; Enable IRQ 0 by writing R1 to address in R0
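
The two constants in the example above cover IRQ 0 only. As a sketch of how the same write generalises, the C helpers below compute the register address and bit mask for any IRQ number. The layout assumed here (one enable bit per interrupt, 32 interrupts per word, words starting at 0xE000E100) comes from the NVIC register map and is not stated in the text; the function names are hypothetical.

```c
#include <stdint.h>

#define NVIC_IRQ_SETEN0 0xE000E100u   /* first set-enable register, as in the EQU example */

/* Address of the set-enable word that holds the bit for this IRQ. */
uint32_t nvic_seten_address(uint32_t irq) {
    return NVIC_IRQ_SETEN0 + 4u * (irq / 32u);
}

/* Bit mask for this IRQ inside its set-enable word. */
uint32_t nvic_seten_mask(uint32_t irq) {
    return (uint32_t)1u << (irq % 32u);
}

/* On a real target the enable itself would be a single store, for example:
   *(volatile uint32_t *)nvic_seten_address(irq) = nvic_seten_mask(irq);
   That line is left as a comment because it only makes sense on the device. */
```

For IRQ 0 this gives address 0xE000E100 and mask 0x1, the same values as NVIC_IRQ_SETEN0 and NVIC_IRQ0_ENABLE.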

A number of data definition directives are available for insertion of constants inside
assembly code.
For example, DCI (Define Constant Instruction) can be used to code an instruction if your
assembler cannot generate the exact instruction that you want and if you know the binary
code for the instruction.
DCI 0xBE00 ; Breakpoint (BKPT 0), a 16-bit instruction
We can use DCB (Define Constant Byte) for byte size constant values, such as characters,
and DCD (Define Constant Data) for word size constant values to define binary data in
your code.
LDR R3,=MY_NUMBER ;Get the memory address value of MY_NUMBER
LDR R4,[R3] ; Get the value code 0x12345678 in R4
...
LDR R0,=HELLO_TXT ; Get the starting memory address of HELLO_TXT
BL PrintText; Call a function called PrintText to display string
...
MY_NUMBER
DCD 0x12345678
HELLO_TXT
DCB "Hello\n",0; null terminated string

2.1.2 Assembler Language: Use of Suffixes


In assembler for ARM processors, instructions can be followed by suffixes, as shown in
Table 4.1.
For the Cortex-M3, the conditional execution suffixes are usually used for branch
instructions. However, other instructions can also be used with the conditional execution
suffixes if they are inside an IF-THEN instruction block. In those cases, the S suffix and
the conditional execution suffixes can be used at the same time.

Table 4.1 Suffixes in Instructions

Suffix                      Description
S                           Update Application Program Status register (APSR) (flags);
                            for example: ADDS R0, R1 ; this will update APSR
EQ, NE, LT, GT, and so on   Conditional execution; EQ = Equal, NE = Not Equal,
                            LT = Less Than, GT = Greater Than, and so forth.
                            For example: BEQ <Label> ; Branch if equal
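
To show what the condition suffixes actually test, the C sketch below computes the flags that a compare (CMP a, b) would produce and evaluates four of the conditions against them. The flag tests used (EQ: Z set, NE: Z clear, LT: N differs from V, GT: Z clear and N equals V) are the standard ARM definitions rather than something given in the table, and all type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool n, z, c, v; } Flags;
typedef enum { COND_EQ, COND_NE, COND_LT, COND_GT } Cond;

/* Flags as set by CMP a, b, which computes a - b and discards the result. */
Flags flags_after_cmp(uint32_t a, uint32_t b) {
    uint32_t r = a - b;
    Flags f;
    f.n = (r >> 31) != 0u;                        /* result is negative */
    f.z = (r == 0u);                              /* result is zero */
    f.c = (a >= b);                               /* no borrow occurred */
    f.v = ((((a ^ b) & (a ^ r)) >> 31) != 0u);    /* signed overflow */
    return f;
}

/* Decide whether an instruction with the given condition suffix executes. */
bool condition_passes(Cond cond, Flags f) {
    switch (cond) {
    case COND_EQ: return f.z;
    case COND_NE: return !f.z;
    case COND_LT: return f.n != f.v;
    case COND_GT: return !f.z && (f.n == f.v);
    }
    return false;
}
```

LT and GT are signed comparisons, which is why they look at the overflow flag as well as the sign of the result.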

2.1.3 Assembler Language: Unified Assembler Language

To support and get the best out of the Thumb®-2 instruction set, the Unified Assembler
Language (UAL) was developed to allow selection of 16-bit and 32-bit instructions and to
make it easier to port applications between ARM code and Thumb code by using the same
syntax for both. (With UAL, the syntax of Thumb instructions is now the same as for ARM
instructions.)

ADD R0, R1 ; R0 = R0 + R1, using Traditional Thumb syntax
ADD R0, R0, R1 ; Equivalent instruction using UAL syntax

The traditional Thumb syntax can still be used. The choice between whether the
instructions are interpreted as traditional Thumb code or the new UAL syntax is normally
defined by the directive in the assembly file. For example, with ARM assembler tool, a
program code header with “CODE16” directive implies the code is in the traditional
Thumb syntax, and “THUMB” directive implies the code is in the new UAL syntax. One
thing to be careful about when reusing traditional Thumb code is that some instructions
change the flags in APSR, even if the S suffix is not used. However, when the UAL syntax
is used, whether the instruction changes the flag depends on the S suffix. For example,

AND R0, R1 ; Traditional Thumb syntax
ANDS R0, R0, R1 ; Equivalent UAL syntax (S suffix is added)

With the new instructions in Thumb-2 technology, some of the operations can be handled
by either a Thumb instruction or a Thumb-2 instruction. For example, R0 = R0 + 1 can
be implemented as a 16-bit Thumb instruction or a 32-bit Thumb-2 instruction. With
UAL, you can specify which instruction you want by adding suffixes:

ADDS R0, #1 ; Use 16-bit Thumb instruction by default
            ; for smaller size
ADDS.N R0, #1 ; Use 16-bit Thumb instruction (N=Narrow)
ADDS.W R0, #1 ; Use 32-bit Thumb-2 instruction (W=wide)

The .W (wide) suffix specifies a 32-bit instruction. If no suffix is given, the assembler tool
can choose either instruction but usually defaults to 16-bit Thumb code to get a smaller
size. Depending on tool support, you may also use the .N (narrow) suffix to specify a 16-bit
Thumb instruction. Again, this syntax is for ARM assembler tools. Other assemblers
might have slightly different syntax.

In most cases, applications will be coded in C, and the C compilers will use 16-bit
instructions if possible due to smaller code size. However, when the immediate data
exceed a certain range or when the operation can be better handled with a 32-bit Thumb-
2 instruction, the 32-bit instruction will be used.
The 32-bit Thumb-2 instructions can be half word aligned. For example, you can have a
32-bit instruction located in a half word location.

0x1000 : LDR r0,[r1] ; a 16-bit instruction (occupies 0x1000-0x1001)
0x1002 : RBIT.W r0 ; a 32-bit Thumb-2 instruction (occupies 0x1002-0x1005)

Most of the 16-bit instructions can only access registers R0–R7; 32-bit Thumb-2
instructions do not have this limitation. However, use of PC (R15) might not be allowed
in some of the instructions.

2.2 Instruction list

The supported instructions are listed in Tables 4.2 through 4.9.

Table 4.2 16-Bit Data Processing Instructions


Instruction Function
ADC Add with carry
ADD Add
ADR Add PC and an immediate value and put the result in a register
AND Logical AND
ASR Arithmetic shift right
BIC Bit clear (Logical AND one value with the logic inversion of another value)
CMN Compare negative (compare one data with two’s complement of another data and update flags)
CMP Compare (compare two data and update flags)
CPY Copy (available from architecture v6; move a value from one high or low register to another high or low register); synonym of MOV instruction
EOR Exclusive OR
LSL Logical shift left
LSR Logical shift right
MOV Move (can be used for register-to-register transfers or loading immediate data)
MUL Multiply
MVN Move NOT (obtain logical inverted value)
NEG Negate (obtain two’s complement value), equivalent to RSB

ORR Logical OR
RSB Reverse subtract
ROR Rotate right
SBC Subtract with carry
SUB Subtract
TST Test (use as logical AND; Z flag is updated but AND result is not stored)
REV Reverse the byte order in a 32-bit register (available from architecture v6)
REV16 Reverse the byte order in each 16-bit half word of a 32-bit register (available from architecture v6)
REVSH Reverse the byte order in the lower 16-bit half word of a 32-bit register and sign extend the result to 32 bits (available from architecture v6)
SXTB Signed extend byte (available from architecture v6)
SXTH Signed extend half word (available from architecture v6)
UXTB Unsigned extend byte (available from architecture v6)
UXTH Unsigned extend half word (available from architecture v6)
Table 4.3 16-Bit Branch Instructions

Instruction Function

B Branch
B<cond> Conditional branch
BL Branch with link; call a subroutine and store the return address in LR (this is actually a 32-bit instruction, but it is also available in Thumb in traditional ARM processors)
BLX Branch with link and change state (BLX <reg> only)
BX <reg> Branch with exchange state
CBZ Compare and branch if zero (architecture v7)
CBNZ Compare and branch if nonzero (architecture v7)
IT IF-THEN (architecture v7)
Table 4.4 16-Bit Load and Store Instructions

Instruction Function

LDR Load word from memory to register


LDRH Load half word from memory to register
LDRB Load byte from memory to register

LDRSH Load half word from memory, sign extend it, and put it in register
LDRSB Load byte from memory, sign extend it, and put it in register
STR Store word from register to memory
STRH Store half word from register to memory
STRB Store byte from register to memory
LDM/LDMIA Load multiple/Load multiple increment after
STM/STMIA Store multiple/Store multiple increment after
PUSH Push multiple registers
POP Pop multiple registers
Table 4.5 Other 16-Bit Instructions
Instruction Function
SVC Supervisor call
SEV Send event
WFE Sleep and wait for event
WFI Sleep and wait for interrupt
BKPT Breakpoint; if debug is enabled, it will enter debug mode (halted), or if debug monitor exception is enabled, it will invoke the debug exception; otherwise, it will invoke a fault exception
NOP No operation
CPSIE Enable PRIMASK (CPSIE i)/FAULTMASK (CPSIE f) register (set the register to 0)
CPSID Disable PRIMASK (CPSID i)/FAULTMASK (CPSID f) register (set the register to 1)
Table 4.6 32-Bit Data Processing Instructions
Instruction Function
ADC Add with carry
ADD Add
ADDW Add wide (#immed_12)
ADR Add PC and an immediate value and put the result in a register
AND Logical AND
ASR Arithmetic shift right
BIC Bit clear (logical AND one value with the logic inversion of another value)
BFC Bit field clear
BFI Bit field insert
CMN Compare negative (compare one data with two’s complement of another data and update flags)
CMP Compare (compare two data and update flags)
CLZ Count leading zero
EOR Exclusive OR
LSL Logical shift left
LSR Logical shift right
MLA Multiply accumulate
MLS Multiply and subtract
MOV Move
MOVW Move wide (write a 16-bit immediate value to register)
MOVT Move top (write an immediate value to the top half word of destination reg)
MVN Move NOT (obtain logical inverted value)
MUL Multiply
ORR Logical OR
ORN Logical OR NOT
RBIT Reverse bit
REV Byte reverse word
REV16 Byte reverse packed half word
REVSH Byte reverse signed half word
ROR Rotate right
RSB Reverse subtract
RRX Rotate right extended
SBC Subtract with carry
SBFX Signed bit field extract
SDIV Signed divide
SMLAL Signed multiply accumulate long
SMULL Signed multiply long
SSAT Signed saturate
SUB Subtract
SUBW Subtract wide (#immed_12)
SXTB Sign extend byte
SXTH Sign extend half word
TEQ Test equivalent (use as logical exclusive OR; flags are updated but result is not stored)
TST Test (use as logical AND; Z flag is updated but AND result is not stored)
UBFX Unsigned bit field extract
UDIV Unsigned divide
UMLAL Unsigned multiply accumulate long
UMULL Unsigned multiply long
USAT Unsigned saturate
UXTB Unsigned extend byte
UXTH Unsigned extend half word
Table 4.7 32-Bit Load and Store Instructions
Instruction Function
LDR Load word data from memory to register
LDRT Load word data from memory to register with unprivileged access
LDRB Load byte data from memory to register
LDRBT Load byte data from memory to register with unprivileged access
LDRH Load half word data from memory to register
LDRHT Load half word data from memory to register with unprivileged access
LDRSB Load byte data from memory, sign extend it, and put it to register
LDRSBT Load byte data from memory with unprivileged access, sign extend it, and put it to register
LDRSH Load half word data from memory, sign extend it, and put it to register
LDRSHT Load half word data from memory with unprivileged access, sign extend it, and put it to register
LDM/LDMIA Load multiple data from memory to registers
LDMDB Load multiple decrement before
LDRD Load double word data from memory to registers
STR Store word to memory
STRT Store word to memory with unprivileged access
STRB Store byte data to memory
STRBT Store byte data to memory with unprivileged access
STRH Store half word data to memory
STRHT Store half word data to memory with unprivileged access
STM/STMIA Store multiple words from registers to memory
STMDB Store multiple decrement before
STRD Store double word data from registers to memory
PUSH Push multiple registers
POP Pop multiple registers
Table 4.8 32-Bit Branch Instructions
Instruction Function
B Branch
B<cond> Conditional branch
BL Branch and link
TBB Table branch byte; forward branch using a table of single byte offset
TBH Table branch half word; forward branch using a table of half word offset
Table 4.9 Other 32-Bit Instructions
Instruction Function
LDREX Exclusive load word
LDREXH Exclusive load half word
LDREXB Exclusive load byte
STREX Exclusive store word
STREXH Exclusive store half word
STREXB Exclusive store byte
CLREX Clear the local exclusive access record of local processor
MRS Move special register to general-purpose register
MSR Move to special register from general-purpose register
NOP No operation
SEV Send event
WFE Sleep and wait for event
WFI Sleep and wait for interrupt
ISB Instruction synchronization barrier
DSB Data synchronization barrier
DMB Data memory barrier
Table 4.10 Unsupported Thumb Instructions for Traditional ARM Processors
Unsupported
Instruction Function
BLX label This is branch with link and exchange state. In a format with immediate
data, BLX always changes to ARM state. Because the Cortex-M3 does not support the
ARM state, instructions like this one that attempt to switch to the ARM state will result
in a fault exception called usage fault.
SETEND This Thumb instruction, introduced in architecture v6, switches the
endian configuration during run time. Since the Cortex-M3 does not
support dynamic endian, using the SETEND instruction will result in
a fault exception.

2.2.1 Unsupported Instructions


A number of Thumb instructions are not supported in the Cortex-M3; they are presented
in Table 4.10.

The ARM v7-M architecture allows Thumb-2 coprocessor instructions, but the Cortex-M3
processor does not have any coprocessor support. Therefore, executing the coprocessor
instructions shown in Table 4.11 will result in a fault exception (Usage Fault with the
No-Coprocessor “NOCP” bit in the Usage Fault Status Register in NVIC set to 1). Some of
the change process state (CPS) instructions are also not supported in the Cortex-M3 (see
Table 4.12). This is because the Program Status register (PSR) definition has changed,
so some bits defined in the ARM architecture v6 are not available in the Cortex-M3.

Table 4.11 Unsupported Coprocessor Instructions
Unsupported
Instruction Function
MCR Move to coprocessor from ARM processor
MCR2 Move to coprocessor from ARM processor
MCRR Move to coprocessor from two ARM register
MRC Move to ARM register from coprocessor
MRC2 Move to ARM register from coprocessor
MRRC Move to two ARM registers from coprocessor
LDC Load coprocessor; load memory data from a sequence of
consecutive memory addresses to a coprocessor
STC Store coprocessor; stores data from a coprocessor to a sequence of
consecutive memory addresses
Table 4.12 Unsupported Change Process State Instructions
Unsupported
Instruction Function
CPS<IE|ID>.W A There is no A bit in the Cortex-M3
CPS.W #mode There is no mode bit in the Cortex-M3 PSR
Table 4.13 Unsupported Hint Instructions
Unsupported
Instruction Function
DBG A hint instruction to debug and trace system
PLD Preload data; this is a hint instruction for cache memory, however,
since there is no cache in the Cortex-M3 processor, this instruction
behaves as NOP
PLI Preload instruction; this is a hint instruction for cache memory,
however, since there is no cache in the Cortex-M3 processor, this
instruction behaves as NOP
YIELD A hint instruction to allow multithreading software to indicate to
hardware that it is doing a task that can be swapped out to improve
overall system performance.
In addition, the hint instructions shown in Table 4.13 will behave as NOP in the Cortex-
M3. All other undefined instructions, when executed, will cause the usage fault exception
to take place.

2.3 Instruction Descriptions


Here, we introduce some of the commonly used syntax for ARM assembly code. Some of
the instructions have various options such as barrel shifter.

2.3.1 Assembler Language: Moving Data


One of the most basic functions in a processor is transfer of data. In the Cortex-M3, data
transfers can be of one of the following types:
• Moving data between register and register
• Moving data between memory and register
• Moving data between special register and register
• Moving an immediate data value into a register
The command to move data between registers is MOV (move). For example, moving data
from register R3 to register R8 looks like this:
MOV R8, R3
Another instruction, MVN (move NOT), writes the bitwise inverted value of the original
data to the destination register. The basic instructions for accessing memory are Load
and Store. Load (LDR) transfers data from memory to registers, and Store (STR) transfers
data from registers to memory. The transfers can be in different data sizes (byte, half
word, word, and double word), as outlined in Table 4.14.
Multiple Load and Store operations can be combined into single instructions called
LDM (Load Multiple) and STM (Store Multiple), as outlined in Table 4.15.

The exclamation mark (!) in the instruction specifies whether the register Rd should
be updated after the instruction is completed. For example, if R8 equals 0x8000:
STMIA.W R8!, {R0-R3} ; R8 changed to 0x8010 after store (increment by 4 words)
STMIA.W R8 , {R0-R3} ; R8 unchanged after store
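The effect of the “!” writeback can be sketched in C. This is only an illustrative model, not how the hardware is implemented: the function name stmia_model, the word array standing in for memory, and the mem_base parameter are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical model of STMIA Rn(!), {reg list}.
 * mem[] stands in for memory; mem[0] is the word at byte address mem_base.
 * Returns the value left in the base register after the instruction. */
uint32_t stmia_model(uint32_t *mem, uint32_t mem_base, uint32_t rn,
                     const uint32_t *regs, unsigned count, int writeback)
{
    uint32_t addr = rn;
    for (unsigned i = 0; i < count; i++) {
        mem[(addr - mem_base) / 4u] = regs[i]; /* store one word */
        addr += 4u;                            /* increment after (IA) */
    }
    /* With "!" the base register takes the final address;
     * without it the base register is left unchanged. */
    return writeback ? addr : rn;
}
```

With a base of 0x8000 and four registers in the list, the model returns 0x8010 when writeback is requested and 0x8000 otherwise, matching the two STMIA.W lines above.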
ARM processors also support memory accesses with preindexing and postindexing.
For preindexing, the register holding the memory address is adjusted. The memory
transfer then takes place with the updated address. For example,
LDR.W R0,[R1, #offset]! ; Read memory[R1+offset], with R1 update to R1+offset
Table 4.14 Commonly Used Memory Access Instructions
Example Description
LDRB Rd, [Rn, #offset]        Read byte from memory location Rn + offset
LDRH Rd, [Rn, #offset]        Read half word from memory location Rn + offset
LDR Rd, [Rn, #offset]         Read word from memory location Rn + offset
LDRD Rd1,Rd2, [Rn, #offset]   Read double word from memory location Rn + offset
STRB Rd, [Rn, #offset]        Store byte to memory location Rn + offset
STRH Rd, [Rn, #offset]        Store half word to memory location Rn + offset
STR Rd, [Rn, #offset]         Store word to memory location Rn + offset
STRD Rd1,Rd2, [Rn, #offset]   Store double word to memory location Rn + offset
Table 4.15 Multiple Memory Access Instructions
Example Description
LDMIA Rd!,<reg list>       Read multiple words from memory location specified by Rd;
                           address increment after (IA) each transfer (16-bit Thumb
                           instruction)
STMIA Rd!,<reg list>       Store multiple words to memory location specified by Rd;
                           address increment after (IA) each transfer (16-bit Thumb
                           instruction)
LDMIA.W Rd(!),<reg list>   Read multiple words from memory location specified by Rd;
                           address increment after each read (.W specifies that it is a
                           32-bit Thumb-2 instruction)
LDMDB.W Rd(!),<reg list>   Read multiple words from memory location specified by Rd;
                           address Decrement Before (DB) each read (.W specifies that it
                           is a 32-bit Thumb-2 instruction)
STMIA.W Rd(!),<reg list>   Write multiple words to memory location specified by Rd;
                           address increment after each write (.W specifies that it is a
                           32-bit Thumb-2 instruction)
STMDB.W Rd(!),<reg list>   Write multiple words to memory location specified by Rd;
                           address Decrement Before (DB) each write (.W specifies that it
                           is a 32-bit Thumb-2 instruction)

Table 4.16 Examples of Preindexing Memory Access Instructions


Example Description
LDR.W Rd, [Rn, #offset]!         Preindexing load instructions for various sizes (word, byte,
LDRB.W Rd, [Rn, #offset]!        half word, and double word)
LDRH.W Rd, [Rn, #offset]!
LDRD.W Rd1, Rd2,[Rn, #offset]!
LDRSB.W Rd, [Rn, #offset]!       Preindexing load instructions for various sizes with sign
LDRSH.W Rd, [Rn, #offset]!       extend (byte, half word)
STR.W Rd, [Rn, #offset]!         Preindexing store instructions for various sizes (word, byte,
STRB.W Rd, [Rn, #offset]!        half word, and double word)
STRH.W Rd, [Rn, #offset]!
STRD.W Rd1, Rd2,[Rn, #offset]!
The “!” indicates that the base register R1 is updated. The “!” is optional; without
it, the instruction would be just a normal memory transfer with an offset from a base
address. The preindexing memory access instructions include load and store instructions
of various transfer sizes (see Table 4.16).
Postindexing memory access instructions carry out the memory transfer using the base
address specified by the register and then update the address register afterward. For
example,

LDR.W R0,[R1], #offset ; Read memory[R1], with R1
; updated to R1+offset
When a postindexing instruction is used, there is no need to use the “!” sign, because all
postindexing instructions update the base address register, whereas in preindexing you
might choose whether to update the base address register or not. Similarly to
preindexing, postindexing memory access instructions are available for different transfer
sizes (see Table 4.17).
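The difference between plain offset, preindexed, and postindexed addressing can be sketched in C as follows. This is a simplified model under stated assumptions: the function names are hypothetical, and memory is represented by a word array in which mem[i] holds the word at byte address 4*i.

```c
#include <stdint.h>

/* LDR Rd, [Rn, #offset] : plain offset addressing, Rn is not changed. */
uint32_t ldr_offset(const uint32_t *mem, uint32_t rn, int32_t offset)
{
    return mem[(rn + (uint32_t)offset) / 4u];
}

/* LDR Rd, [Rn, #offset]! : preindexing. Rn is updated first,
 * then the transfer uses the updated address. */
uint32_t ldr_preindex(const uint32_t *mem, uint32_t *rn, int32_t offset)
{
    *rn += (uint32_t)offset;
    return mem[*rn / 4u];
}

/* LDR Rd, [Rn], #offset : postindexing. The transfer uses the
 * original address, then Rn is updated. */
uint32_t ldr_postindex(const uint32_t *mem, uint32_t *rn, int32_t offset)
{
    uint32_t value = mem[*rn / 4u];
    *rn += (uint32_t)offset;
    return value;
}
```

Starting from the same base address, the preindexed and postindexed forms leave the same value in the base register but read different memory locations.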

Table 4.17 Examples of Postindexing Memory Access Instructions

Example                          Description
LDR.W Rd, [Rn], #offset          Postindexing load instructions for various sizes (word, byte,
LDRB.W Rd, [Rn], #offset         half word, and double word)
LDRH.W Rd, [Rn], #offset
LDRD.W Rd1, Rd2,[Rn], #offset
LDRSB.W Rd, [Rn], #offset        Postindexing load instructions for various sizes with sign
LDRSH.W Rd, [Rn], #offset        extend (byte, half word)
STR.W Rd, [Rn], #offset          Postindexing store instructions for various sizes (word, byte,
STRB.W Rd, [Rn], #offset         half word, and double word)
STRH.W Rd, [Rn], #offset
STRD.W Rd1, Rd2,[Rn], #offset
Two other types of memory operation are stack PUSH and stack POP. For example,
PUSH {R0, R4-R7, R9} ; Push R0, R4, R5, R6, R7, R9 into
; stack memory
POP {R2,R3} ; Pop R2 and R3 from stack
Usually a PUSH instruction will have a corresponding POP with the same register list,
but this is not always necessary. For example, a common exception is when POP is used
as a function return:
PUSH {R0-R3, LR} ; Save register contents at beginning of
; subroutine
.... ; Processing
POP {R0-R3, PC} ; restore registers and return
In this case, instead of popping the LR register back and then branching to the address
in LR, we POP the address value directly in the program counter. The Cortex-M3 has a
number of special registers. To access these registers, we use the instructions MRS and
MSR. For example,

MRS R0, PSR ; Read Processor status word into R0

MSR CONTROL, R1 ; Write value of R1 into control register


Unless you’re accessing the APSR, you can use MSR or MRS to access other special
registers only in privileged mode. Moving immediate data into a register is a common
thing to do. For example, you might want to access a peripheral register, so you need to
put the address value into a register beforehand. For small values (8 bits or less), you
can use MOVS (move). For example,

MOVS R0, #0x12 ; Set R0 to 0x12


For a larger value (over 8 bits), you might need to use a Thumb-2 move instruction. For
example,
MOVW.W R0, #0x789A ; Set R0 to 0x789A
Or if the value is 32-bit, you can use two instructions to set the upper and lower halves:
MOVW.W R0,#0x789A ; Set R0 lower half to 0x789A
MOVT.W R0,#0x3456 ; Set R0 upper half to 0x3456. Now
; R0=0x3456789A
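How the MOVW and MOVT pair builds a 32-bit value can be sketched in C. The function names movw_model and movt_model are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* MOVW: write a 16-bit immediate to the register.
 * The upper half word of the result is zero. */
uint32_t movw_model(uint16_t imm16)
{
    return (uint32_t)imm16;
}

/* MOVT: write a 16-bit immediate to the top half word of the
 * register and keep the lower half word unchanged. */
uint32_t movt_model(uint32_t rd, uint16_t imm16)
{
    return (rd & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}
```

Because MOVT preserves the lower half word, the MOVW must come first; calling movt_model(movw_model(0x789A), 0x3456) yields 0x3456789A.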

Alternatively, you can also use LDR (a pseudo-instruction provided in ARM assembler).
For example,
LDR R0, =0x3456789A
This is not a real assembler command, but the ARM assembler will convert it into a
PC relative load instruction to produce the required data. To generate 32-bit immediate
data, using LDR is recommended rather than the MOVW.W and MOVT.W combination
because it gives better readability and the assembler might be able to reduce the memory
being used if the same immediate data are reused in several places of the same program.

2.3.2 LDR and ADR pseudo-Instructions


Both LDR and ADR pseudo-instructions can be used to set registers to a program address
value. They have different syntaxes and behaviors. For LDR, if the address is a program
address value, the assembler will automatically set the LSB to 1. For example,
LDR R0, =address1 ; R0 set to 0x4001
...
address1 ; address here is 0x4000
MOV R0, R1 ; address1 contains program code ...
You will find that the LDR instruction puts 0x4001 into R0; the LSB is set to 1 to
indicate that it is Thumb code. If address1 is a data address, the LSB will not be changed.
For example,
LDR R0, =address1 ; R0 set to 0x4000
...
address1 ; address here is 0x4000
DCD 0x0 ; address1 contains data ...
For ADR, you can load the address value of a program code into a register without
setting the LSB automatically. For example,
ADR R0, address1
...
address1 ; (address here is 0x4000)
MOV R0, R1 ; address1 contains program code ...
You will get 0x4000 in the ADR instruction. Note that there is no equal sign (=) in the
ADR statement.

LDR obtains the immediate data by putting the data in the program code and uses a
PC relative load to get the data into the register. ADR tries to generate the immediate
value by adding or subtracting instructions (for example, based on the current PC value).
As a result, it is not possible to create all immediate values using ADR, and the target
address label must be in a close range. However, using ADR can generate smaller code
sizes compared with LDR. The 16-bit version of ADR requires that the target address
must be word aligned (address value is a multiple of 4). If the target address is not word
aligned, you can use the 32-bit version of ADR instruction “ADR.W.” If the target address
is more than ± 4095 bytes of current PC, you can use “ADRL” pseudo-instruction, which
gives ±1 MB range.

2.3.3 Assembler Language: Processing Data

The Cortex-M3 provides many different instructions for data processing. A few basic ones
are introduced here. Many data operation instructions can have multiple instruction
formats. For example, an ADD instruction can operate between two registers or between
one register and an immediate data value:
ADD R0, R0, R1 ; R0 = R0 + R1
ADDS R0, R0, #0x12 ; R0 = R0 + 0x12
ADD.W R0, R1, R2 ; R0 = R1 + R2
These are all ADD instructions, but they have different syntaxes and binary coding. With
the traditional Thumb instruction syntax, when 16-bit Thumb code is used, an ADD
instruction can change the flags in the PSR. However, 32-bit Thumb-2 code can either
change a flag or keep it unchanged. To separate the two different operations, the S suffix
should be used if the following operation depends on the flags:

ADD.W R0, R1, R2 ; Flag unchanged
ADDS.W R0, R1, R2 ; Flag change
Aside from ADD instructions, the arithmetic functions that the Cortex-M3 supports
include subtract (SUB), multiply (MUL), and unsigned and signed divide (UDIV/SDIV).
Table 4.18 shows some of the most commonly used arithmetic instructions.

Table 4.18 Examples of Arithmetic Instructions

Instruction                                          Operation
ADD Rd, Rn, Rm ; Rd = Rn + Rm                        ADD operation
ADD Rd, Rd, Rm ; Rd = Rd + Rm
ADD Rd, #immed ; Rd = Rd + #immed
ADD Rd, Rn, #immed ; Rd = Rn + #immed
ADC Rd, Rn, Rm ; Rd = Rn + Rm + carry                ADD with carry
ADC Rd, Rd, Rm ; Rd = Rd + Rm + carry
ADC Rd, #immed ; Rd = Rd + #immed + carry
ADDW Rd, Rn, #immed ; Rd = Rn + #immed               ADD register with 12-bit immediate value
SUB Rd, Rn, Rm ; Rd = Rn - Rm                        SUBTRACT
SUB Rd, #immed ; Rd = Rd - #immed
SUB Rd, Rn, #immed ; Rd = Rn - #immed
SBC Rd, Rm ; Rd = Rd - Rm - borrow                   SUBTRACT with borrow (not carry)
SBC.W Rd, Rn, #immed ; Rd = Rn - #immed - borrow
SBC.W Rd, Rn, Rm ; Rd = Rn - Rm - borrow
RSB.W Rd, Rn, #immed ; Rd = #immed - Rn              Reverse subtract
RSB.W Rd, Rn, Rm ; Rd = Rm - Rn
MUL Rd, Rm ; Rd = Rd * Rm                            Multiply
MUL.W Rd, Rn, Rm ; Rd = Rn * Rm
UDIV Rd, Rn, Rm ; Rd = Rn/Rm                         Unsigned and signed divide
SDIV Rd, Rn, Rm ; Rd = Rn/Rm

These instructions can be used with or without the “S” suffix to determine if the APSR
should be updated. In most cases, if UAL syntax is selected and if “S” suffix is not used,
the 32-bit version of the instructions would be selected as most of the 16-bit Thumb
instructions update APSR.

The Cortex-M3 also supports 32-bit multiply instructions and multiply accumulate
instructions that give 64-bit results. These instructions support signed or unsigned
values (see Table 4.19).
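The 64-bit results of these instructions can be modeled in C with 64-bit integer types. The sketch below covers one unsigned multiply and one signed multiply accumulate; the function names are hypothetical, and the register pair {RdHi,RdLo} is represented by two 32-bit variables.

```c
#include <stdint.h>

/* UMULL RdLo, RdHi, Rn, Rm : {RdHi,RdLo} = Rn * Rm (unsigned). */
void umull_model(uint32_t rn, uint32_t rm, uint32_t *rdlo, uint32_t *rdhi)
{
    uint64_t product = (uint64_t)rn * (uint64_t)rm;
    *rdlo = (uint32_t)product;          /* lower 32 bits */
    *rdhi = (uint32_t)(product >> 32);  /* upper 32 bits */
}

/* SMLAL RdLo, RdHi, Rn, Rm : {RdHi,RdLo} += Rn * Rm (signed).
 * The accumulation is done on the unsigned bit pattern, which gives
 * the same two's complement result and wraps modulo 2^64. */
void smlal_model(int32_t rn, int32_t rm, uint32_t *rdlo, uint32_t *rdhi)
{
    uint64_t acc = ((uint64_t)*rdhi << 32) | (uint64_t)*rdlo;
    int64_t product = (int64_t)rn * (int64_t)rm; /* cannot overflow 64 bits */
    acc += (uint64_t)product;
    *rdlo = (uint32_t)acc;
    *rdhi = (uint32_t)(acc >> 32);
}
```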

Another group of data processing instructions is the logical operations, such as AND
and ORR (or), together with the shift and rotate functions. Table 4.20 shows some of
the most commonly used logical instructions. These instructions can be used with or
without the “S” suffix to determine if the APSR should be updated. If UAL syntax is
used and the “S” suffix is not used, the 32-bit version of the instructions would be
selected, as all of the 16-bit logic operation instructions update APSR.

The Cortex-M3 provides rotate and shift instructions. In some cases, the rotate
operation can be combined with other operations (for example, in memory address offset
calculation for load/store instructions). For standalone rotate/shift operations, the
instructions shown in Table 4.21 are provided. Again, a 32-bit version of the instruction
is used if “S” suffix is not used and if UAL syntax is used.

Table 4.19 32-Bit Multiply Instructions

Instruction Operation
SMULL RdLo, RdHi, Rn, Rm ; {RdHi,RdLo} = Rn * Rm     32-bit multiply instructions for signed values
SMLAL RdLo, RdHi, Rn, Rm ; {RdHi,RdLo} += Rn * Rm
UMULL RdLo, RdHi, Rn, Rm ; {RdHi,RdLo} = Rn * Rm     32-bit multiply instructions for unsigned values
UMLAL RdLo, RdHi, Rn, Rm ; {RdHi,RdLo} += Rn * Rm
Table 4.20 Logic Operation Instructions

Instruction Operation
AND Rd, Rn ; Rd = Rd & Rn                     Bitwise AND
AND.W Rd, Rn, #immed ; Rd = Rn & #immed
AND.W Rd, Rn, Rm ; Rd = Rn & Rm
ORR Rd, Rn ; Rd = Rd | Rn                     Bitwise OR
ORR.W Rd, Rn, #immed ; Rd = Rn | #immed
ORR.W Rd, Rn, Rm ; Rd = Rn | Rm
BIC Rd, Rn ; Rd = Rd & (~Rn)                  Bit clear
BIC.W Rd, Rn, #immed ; Rd = Rn & (~#immed)
BIC.W Rd, Rn, Rm ; Rd = Rn & (~Rm)
ORN.W Rd, Rn, #immed ; Rd = Rn | (~#immed)    Bitwise OR NOT
ORN.W Rd, Rn, Rm ; Rd = Rn | (~Rm)
EOR Rd, Rn ; Rd = Rd ^ Rn                     Bitwise Exclusive OR
EOR.W Rd, Rn, #immed ; Rd = Rn ^ #immed
EOR.W Rd, Rn, Rm ; Rd = Rn ^ Rm
Table 4.21 Shift and Rotate Instructions
Instruction                                   Operation
ASR Rd, Rn, #immed ; Rd = Rn >> immed         Arithmetic shift right
ASR Rd, Rn ; Rd = Rd >> Rn
ASR.W Rd, Rn, Rm ; Rd = Rn >> Rm
LSL Rd, Rn, #immed ; Rd = Rn << immed         Logical shift left
LSL Rd, Rn ; Rd = Rd << Rn
LSL.W Rd, Rn, Rm ; Rd = Rn << Rm
LSR Rd, Rn, #immed ; Rd = Rn >> immed         Logical shift right
LSR Rd, Rn ; Rd = Rd >> Rn
LSR.W Rd, Rn, Rm ; Rd = Rn >> Rm
ROR Rd, Rn ; Rd rot by Rn                     Rotate right
ROR.W Rd, Rn, #immed ; Rd = Rn rot by immed
ROR.W Rd, Rn, Rm ; Rd = Rn rot by Rm
RRX.W Rd, Rn ; {C, Rd} = {Rn, C}              Rotate right extended

[Figure 4.1: Shift and rotate operations, showing how bits move between the register and the carry flag C for Logical Shift Left (LSL), Logical Shift Right (LSR), Rotate Right (ROR), Arithmetic Shift Right (ASR), and Rotate Right eXtended (RRX)]

In UAL syntax, the rotate and shift operations can also update the carry flag if the S
suffix is used (and always update the carry flag if the 16-bit Thumb code is used). See
Figure 4.1.

If the shift or rotate operation shifts the register position by multiple bits, the value of
the carry flag C will be the last bit that shifts out of the register.
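The rule that the carry flag receives the last bit shifted out can be sketched in C for the logical shifts. This is a minimal model with hypothetical function names, and it assumes shift amounts from 1 to 31 only; other shift amounts are outside the scope of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* LSL by n bits: the last bit shifted out of the top is bit (32 - n). */
uint32_t lsl_model(uint32_t x, unsigned n, unsigned *carry)
{
    assert(n >= 1u && n <= 31u);   /* range covered by this sketch */
    *carry = (x >> (32u - n)) & 1u;
    return x << n;
}

/* LSR by n bits: the last bit shifted out of the bottom is bit (n - 1). */
uint32_t lsr_model(uint32_t x, unsigned n, unsigned *carry)
{
    assert(n >= 1u && n <= 31u);   /* range covered by this sketch */
    *carry = (x >> (n - 1u)) & 1u;
    return x >> n;
}
```

For example, shifting 0x90000001 left by 4 bits pushes the bits 1001 out of the register; the last of these is a 1, so the model sets the carry to 1.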

Why Is There Rotate Right But No Rotate Left?
The rotate left operation can be replaced by a rotate right operation with a different
rotate offset. For example, a rotate left by 4-bit operation can be written as a rotate
right by 28-bit instruction, which gives the same result and takes the same amount of
time to execute.
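The equivalence can be checked with a small C sketch. The function names are hypothetical; ror_model is a plain 32-bit rotate right, and rol_via_ror expresses a rotate left by n as a rotate right by 32 - n.

```c
#include <stdint.h>

/* Rotate right by n bits (n is reduced modulo 32). */
uint32_t ror_model(uint32_t x, unsigned n)
{
    n &= 31u;
    if (n == 0u) {
        return x;                     /* avoid an undefined 32-bit shift */
    }
    return (x >> n) | (x << (32u - n));
}

/* Rotate left by n bits, written as a rotate right by (32 - n). */
uint32_t rol_via_ror(uint32_t x, unsigned n)
{
    return ror_model(x, (32u - (n & 31u)) & 31u);
}
```

With this model, rol_via_ror(0x12345678, 4) and ror_model(0x12345678, 28) both give 0x23456781.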

Table 4.22 Sign Extend Instructions
Instruction                                Operation
SXTB Rd, Rm ; Rd = signext(Rm[7:0])        Sign extend byte data into word
SXTH Rd, Rm ; Rd = signext(Rm[15:0])       Sign extend half word data into word
Table 4.23 Data Reverse Ordering Instructions
Instruction                                Operation
REV Rd, Rn ; Rd = rev(Rn)                  Reverse bytes in word
REV16 Rd, Rn ; Rd = rev16(Rn)              Reverse bytes in each half word
REVSH Rd, Rn ; Rd = revsh(Rn)              Reverse bytes in bottom half word and sign extend the result
For conversion of signed data from byte or half word to word, the Cortex-M3 provides
the two instructions shown in Table 4.22. Both 16-bit and 32-bit versions are available.
The 16-bit version can only access low registers.
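The sign extend operations can be modeled in C as shown below. The function names are hypothetical; the results are returned as 32-bit bit patterns, and the extension is done with an XOR and subtract so that the code does not rely on implementation-defined signed conversions.

```c
#include <stdint.h>

/* SXTB: Rd = signext(Rm[7:0]). Bit 7 is copied into bits 31..8. */
uint32_t sxtb_model(uint32_t rm)
{
    uint32_t byte = rm & 0xFFu;
    return (byte ^ 0x80u) - 0x80u;
}

/* SXTH: Rd = signext(Rm[15:0]). Bit 15 is copied into bits 31..16. */
uint32_t sxth_model(uint32_t rm)
{
    uint32_t half = rm & 0xFFFFu;
    return (half ^ 0x8000u) - 0x8000u;
}
```

Only the selected byte or half word of the source matters: sxtb_model(0x12345680) gives 0xFFFFFF80, while sxtb_model(0x1234567F) gives 0x0000007F.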

Another group of data processing instructions is used for reversing data bytes in a
register (see Table4.23). These instructions are usually used for conversion between little
endian and big endian data. See Figure 4.2. Both 16-bit and 32-bit versions are available.
The 16-bit version can only access low registers.
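The three byte reverse operations can be written out in C to make the byte movement explicit. The function names are hypothetical and the code is only a model of the results described in Table 4.23.

```c
#include <stdint.h>

/* REV: reverse the four bytes of a word. */
uint32_t rev_model(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* REV16: reverse the two bytes inside each half word. */
uint32_t rev16_model(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}

/* REVSH: reverse the bytes of the bottom half word,
 * then sign extend the 16-bit result to 32 bits. */
uint32_t revsh_model(uint32_t x)
{
    uint32_t half = ((x >> 8) & 0xFFu) | ((x & 0xFFu) << 8);
    return (half ^ 0x8000u) - 0x8000u;
}
```

For the input 0x12345678, REV gives 0x78563412 (a little endian to big endian conversion of a word) and REV16 gives 0x34127856.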

The last group of data processing instructions is for bit field processing. They include
the instructions shown in Table 4.24. Examples of these instructions are provided in a
later part of this chapter.
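As a rough guide to what the bit field instructions compute, the sketch below models four of them in C. The function names and the field_mask helper are hypothetical, and the model assumes a field described by a least significant bit position (lsb) and a width with lsb + width no greater than 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a mask of 'width' ones in the low bits. */
static uint32_t field_mask(unsigned width)
{
    return (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* UBFX: copy a bit field from the source, zero extended. */
uint32_t ubfx_model(uint32_t rn, unsigned lsb, unsigned width)
{
    assert(width >= 1u && lsb + width <= 32u);
    return (rn >> lsb) & field_mask(width);
}

/* SBFX: copy a bit field from the source and sign extend it. */
uint32_t sbfx_model(uint32_t rn, unsigned lsb, unsigned width)
{
    uint32_t field = ubfx_model(rn, lsb, width);
    uint32_t sign = 1u << (width - 1u);
    return (field ^ sign) - sign;
}

/* BFC: clear a bit field within the destination register. */
uint32_t bfc_model(uint32_t rd, unsigned lsb, unsigned width)
{
    assert(width >= 1u && lsb + width <= 32u);
    return rd & ~(field_mask(width) << lsb);
}

/* BFI: insert the low 'width' bits of rn into rd at position lsb. */
uint32_t bfi_model(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width)
{
    assert(width >= 1u && lsb + width <= 32u);
    uint32_t mask = field_mask(width) << lsb;
    return (rd & ~mask) | ((rn << lsb) & mask);
}
```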

2.3.4 Assembler Language: Call and Unconditional Branch


The most basic branch instructions are as follows:
B label ; Branch to a labeled address
BX reg ; Branch to an address specified by a register
In BX instructions, the LSB of the value contained in the register determines the next
state (Thumb/ARM) of the processor. In the Cortex-M3, because it is always in Thumb
state, this bit should be set to 1. If it is zero, the program will cause a usage fault
exception because it is trying to switch the processor into ARM state (see Figure 4.2).
To call a function, the branch and link instructions should be used.

BL label ; Branch to a labeled address and save return
; address in LR

[Figure 4.2: Operation of REV.W (reverse bytes in word), REV16.W (reverse bytes in half word), and REVSH.W (reverse bytes in bottom half word and sign extend results) on the bytes Bit[31:24], Bit[23:16], Bit[15:8], and Bit[7:0]]

Table 4.24 Bit Field Processing and Manipulation Instructions

Instruction                          Operation
BFC.W Rd, #<lsb>, #<width>           Clear bit field within a register
BFI.W Rd, Rn, #<lsb>, #<width>       Insert bit field to a register
CLZ.W Rd, Rn                         Count leading zero
RBIT.W Rd, Rn                        Reverse bit order in register
SBFX.W Rd, Rn, #<lsb>, #<width>      Copy bit field from source and sign extend it
UBFX.W Rd, Rn, #<lsb>, #<width>      Copy bit field from source register
BLX reg ; Branch to an address specified by a register and
; save return address in LR.
With these instructions, the return address will be stored in the link register (LR) and
the function can be terminated using BX LR, which causes program control to return to
the calling process. However, when using BLX, make sure that the LSB of the register is
1. Otherwise the processor will produce a fault exception because it is an attempt to
switch to the ARM state. You can also carry out a branch operation using MOV
instructions and LDR instructions. For example,

MOV R15, R0 ; Branch to an address inside R0
LDR R15, [R0] ; Branch to an address in memory location specified by R0
POP {R15} ; Do a stack pop operation, and change the program counter value
; to the result value.
When using these methods to carry out branches, you also need to make sure that the
LSB of the new program counter value is 0x1. Otherwise, a usage fault exception will be
generated because it will try to switch the processor to ARM mode, which is not allowed
in the Cortex-M3.

Save the LR If You Need to Call a Subroutine
The BL instruction will destroy the current content of your LR. So, if your program
code needs the LR later, you should save your LR before you use BL. The common
method is to push the LR to stack in the beginning of your subroutine. For example,
main
...
BL functionA
...
functionA
PUSH {LR} ; Save LR content to stack
...
BL functionB
...
POP {PC} ; Use stacked LR content to return to main
functionB
PUSH {LR}
...
POP {PC} ; Use stacked LR content to return to functionA
In addition, if the subroutine you call is a C function, you might also need to save
the contents in R0–R3 and R12 if these values will be needed at a later stage.
According to AAPCS [Ref. 5], the contents in these registers could be changed by a
C function.
2.3.5 Assembler Language: Decisions and Conditional Branches
Most conditional branches in ARM processors use flags in the APSR to determine whether
a branch should be carried out. In the APSR, there are five flag bits; four of them are
used for branch decisions (see Table 4.25). There is another flag bit at bit[27], called the
Q flag. It is for saturation math operations and is not used for conditional branches.

Table 4.25 Flag Bits in APSR that Can Be Used for Conditional Branches
Flag PSR Bit Description
N 31 Negative flag (last operation result is a negative value)
Z 30 Zero (last operation result returns a zero value)
C 29 Carry (last operation returns a carry out or borrow)
V 28 Overflow (last operation results in an overflow)
Flags in ARM Processors
Often, data processing instructions change the flags in the PSR. The flags might be
used for branch decisions, or they can be used as part of the input for the next
instruction. The ARM processor normally contains at least the Z, N, C, and V flags,
which are updated by execution of data processing instructions.

• Z (Zero) flag: This flag is set when the result of an instruction has a zero value or
when a comparison of two data returns an equal result.
• N (Negative) flag: This flag is set when the result of an instruction has a negative
value (bit 31 is 1).
• C (Carry) flag: This flag is for unsigned data processing. For example, in add (ADD)
it is set when an unsigned overflow (carry out) occurs; in subtract (SUB) it is set when a
borrow did not occur (borrow is the inverse of carry).
• V (Overflow) flag: This flag is for signed data processing; for example, in an add
(ADD), when two positive values added together produce a negative value, or when
two negative values added together produce a positive value.

These flags can also have special results when used with shift and rotate instructions.
Refer to the ARM v7-M Architecture Application Level Reference Manual [Ref. 2] for
details.

With combinations of the four flags (N, Z, C, and V ), 15 branch conditions are defined
(see Table 4.26). Using these conditions, branch instructions can be written as, for
example,
BEQ label ; Branch to address 'label' if Z flag is set
You can also use the Thumb-2 version if your branch target is further away. For example,
BEQ.W label ; Branch to address 'label' if Z flag is set

Table 4.26 Conditions for Branches or Other Conditional Operations


Symbol   Condition                           Flag
EQ       Equal                               Z set
NE       Not equal                           Z clear
CS/HS    Carry set/unsigned higher or same   C set
CC/LO    Carry clear/unsigned lower          C clear
MI       Minus/negative                      N set
PL       Plus/positive or zero               N clear
VS       Overflow                            V set
VC       No overflow                         V clear
HI       Unsigned higher                     C set and Z clear
LS       Unsigned lower or same              C clear or Z set
GE       Signed greater than or equal        N set and V set, or N clear and V clear (N == V)
LT       Signed less than                    N set and V clear, or N clear and V set (N != V)
GT       Signed greater than                 Z clear, and either N set and V set, or N clear and V clear (Z == 0, N == V)
LE       Signed less than or equal           Z set, or N set and V clear, or N clear and V set (Z == 1 or N != V)
AL       Always (unconditional)              -
The defined branch conditions can also be used in IF-THEN-ELSE structures. For
example,
CMP R0, R1 ; Compare R0 and R1
ITTEE GT ; If R0 > R1 Then
; if true, first 2 statements execute,
; if false, other 2 statements execute
MOVGT R2, R0 ; R2 = R0
MOVGT R3, R1 ; R3 = R1
MOVLE R2, R1 ; Else R2 = R1
MOVLE R3, R0 ; R3 = R0
APSR flags can be affected by the following:
• Most of the 16-bit ALU instructions
• 32-bit (Thumb-2) ALU instructions with the S suffix; for example, ADDS.W
• Compare (e.g., CMP) and Test (e.g., TST, TEQ)

• Write to APSR/xPSR directly
Most of the 16-bit Thumb arithmetic instructions affect the N, Z, C, and V flags. With
32-bit Thumb-2 instructions, the ALU operation can either change flags or not change
flags. For example,
ADDS.W R0, R1, R2 ; This 32-bit Thumb instruction updates flag
ADD.W R0, R1, R2 ; This 32-bit Thumb instruction does not
; update flag
Be careful when reusing program code from old projects. If the old project is in traditional
Thumb syntax (for example, the “CODE16” directive is used with ARM assembler), then
ADD R0, R1 ; This 16-bit Thumb instruction updates flag
ADD R0, #0x1 ; This 16-bit Thumb instruction updates flag
However, if you used the same code in UAL syntax (that is, the “THUMB” directive is used
with ARM assembler), then
ADD R0, R1 ; This 16-bit Thumb instruction does not update flag
ADD R0, #0x1 ; This will become a 32-bit Thumb instruction that does not update flag
To make sure that the code works correctly with different tools, you should always use
the S suffix if the flags need to be updated for conditional operations such as conditional
branches. The compare (CMP) instruction subtracts two values and updates the flags
(just like SUBS), but the result is not stored in any registers. CMP can have the following
formats:

CMP R0, R1 ; Calculate R0 – R1 and update flag


CMP R0, #0x12 ; Calculate R0 – 0x12 and update flag
A similar instruction is the CMN (compare negative). It compares one value to the
negative (two’s complement) of a second value; the flags are updated, but the result is
not stored in any registers:
CMN R0, R1 ; Calculate R0 – (-R1) and update flag
CMN R0, #0x12 ; Calculate R0 – (-0x12) and update flag
The TST (test) instruction is more like the AND instruction. It ANDs two values and
updates the flags. However, the result is not stored in any register. Similarly to CMP, it
has two input formats:

TST R0, R1 ; Calculate R0 AND R1 and update flag


TST R0, #0x12 ; Calculate R0 AND 0x12 and update flag

2.3.6 Assembler Language: Combined Compare and Conditional Branch


With ARM architecture v7-M, two new instructions are provided on the Cortex-M3 to
supply a simple compare with zero and conditional branch operations. These are CBZ
(compare and branch if zero) and CBNZ (compare and branch if nonzero).

The compare and branch instructions only support forward branches. For example,
i = 5;
while (i != 0) {
    func1(); // call a function
    i--;
}

This can be compiled into the following:
    MOV R0, #5 ; Set loop counter
loop1
    CBZ R0, loop1exit ; if loop counter = 0 then exit the loop
    BL func1 ; call a function
    SUB R0, #1 ; loop counter decrement
    B loop1 ; next loop
loop1exit
The usage of CBNZ is similar to CBZ, apart from the fact that the branch is taken if
the register value is not zero. For example,
status = strchr(email_address, '@');
if (status == 0) { // status is 0 if @ is not in email_address
    show_error_message();
    exit(1);
}

This can be compiled into the following:


    ...
    BL strchr
    CBNZ R0, email_looks_okay ; Branch if result is not zero
    BL show_error_message
    BL exit
email_looks_okay
    ...
The APSR value is not affected by the CBZ and CBNZ instructions.

Assembler Language: Conditional Execution Using IT Instructions


The IT (IF-THEN) block is very useful for handling small conditional code. It avoids branch
penalties because there is no change to program flow. It can provide a maximum of four
conditionally executed instructions.
In IT instruction blocks, the first line must be the IT instruction, detailing the choice
of execution, followed by the condition it checks. The first statement after the IT
command must be TRUE-THEN-EXECUTE, which is why the instruction is always written
as ITxyz, where T means THEN and E means ELSE. The second through fourth statements
can be either THEN (true) or ELSE (false):
IT<x><y><z> <cond>                    ; IT instruction (<x>, <y>,
                                      ; <z> can be T or E)
instr1<cond> <operands>               ; 1st instruction (<cond>
                                      ; must be same as IT)
instr2<cond or not cond> <operands>   ; 2nd instruction (can be
                                      ; <cond> or <!cond>)
instr3<cond or not cond> <operands>   ; 3rd instruction (can be
                                      ; <cond> or <!cond>)
instr4<cond or not cond> <operands>   ; 4th instruction (can be
                                      ; <cond> or <!cond>)

If a statement is to be executed when <cond> is false, the suffix for the instruction
must be the opposite of the condition. For example, the opposite of EQ is NE, the opposite
of GT is LE, and so on. The following code shows an example of a simple conditional
execution:

if (R1<R2) then
R2=R2-R1
R2=R2/2
else
R1=R1-R2
R1=R1/2

In assembly,

CMP R1, R2 ; If R1 < R2 (less than)


ITTEE LT ; then execute instruction 1 and 2
; (indicated by T)
; else execute instruction 3 and 4
; (indicated by E)
SUBLT.W R2,R1 ; 1st instruction
LSRLT.W R2,#1 ; 2nd instruction
SUBGE.W R1,R2 ; 3rd instruction (notice the GE is
; opposite of LT)
LSRGE.W R1,#1 ; 4th instruction

You can have fewer than four conditionally executed instructions. The minimum is 1.
You need to make sure the number of T and E occurrences in the IT instruction matches
the number of conditionally executed instructions after the IT.

If an exception occurs during the IT instruction block, the execution status of the block
will be stored in the stacked PSR (in the IT/Interrupt-Continuable Instruction [ICI] bit
field). So, when the exception handler completes and the IT block resumes, the rest of
the instructions in the block can continue the execution correctly. In the case of using
multicycle instructions (for example, multiple load and store) inside an IT block, if an
exception takes place during the execution, the whole instruction is abandoned and
restarted after the interrupt process is completed.

2.3.7 Assembler Language: Instruction Barrier and Memory Barrier Instructions


The Cortex-M3 supports a number of barrier instructions. These instructions are needed
as memory systems get more and more complex. In some cases, if memory barrier
instructions are not used, race conditions could occur.

For example, if the memory map can be switched by a hardware register, after writing
to the memory switching register you should use the DSB instruction. Otherwise, if the
write to the memory switching register is buffered and takes a few cycles to complete,
and the next instruction accesses the switched memory region immediately, the access
could be using the old memory map. In some cases, this might result in an invalid access
if the memory switching and memory access happen at the same time. Using DSB in this
case will make sure that the write to the memory map switching register is completed
before a new instruction is executed.

The following are the three barrier instructions in the Cortex-M3:


• DMB
• DSB

• ISB
These instructions are described in Table 4.27.

The memory barrier instructions can be accessed in C using Cortex Microcontroller
Software Interface Standard (CMSIS) compliant device driver library as follows:
void __DMB(void); // Data Memory Barrier
void __DSB(void); // Data Synchronization Barrier
void __ISB(void); // Instruction Synchronization Barrier
The DSB and ISB instructions can be important for self-modifying code. For example,
if a program changes its own program code, the next executed instruction should be
based on the updated program. However, since the processor is pipelined, the modified
instruction location might have already been fetched. Using DSB and then ISB can
ensure that the modified program code is fetched again.

Architecturally, the ISB instruction should be used after updating the value of the
CONTROL register. In the Cortex-M3 processor, this is not strictly required. But if you
want to make sure your application is portable, you should ensure an ISB instruction is
used after updating the CONTROL register.

DMB is very useful for multi-processor systems. For example, tasks running on
separate processors might use shared memory to communicate with each other. In these
environments, the order of memory accesses to the shared memory can be very
important. DMB instructions can be inserted between accesses to the shared memory to
ensure that the memory access sequence is exactly the same as expected.

Table 4.27 Barrier Instructions


Instruction   Description
DMB           Data memory barrier; ensures that all memory accesses are completed
              before new memory access is committed
DSB           Data synchronization barrier; ensures that all memory accesses are
              completed before next instruction is executed
ISB           Instruction synchronization barrier; flushes the pipeline and ensures
              that all previous instructions are completed before executing new
              instructions
More details about memory barriers can be found in the ARM v7-M Architecture
Application Level Reference Manual [Ref. 2].

2.3.8 Assembly Language: Saturation Operations


The Cortex-M3 supports two instructions that provide signed and unsigned saturation
operations: SSAT and USAT (for signed data type and unsigned data type, respectively).
Saturation is commonly used in signal processing—for example, in signal amplification.
When an input signal is amplified, there is a chance that the output will be larger than
the allowed output range. If the value is adjusted simply by removing the unused MSB,
an overflowed result will cause the signal waveform to be completely deformed (see Figure
4.3).

The saturation operation does not prevent the distortion of the signal, but at least the
amount of distortion is greatly reduced in the signal waveform.

The instruction syntax of the SSAT and USAT instructions is outlined here and in
Table 4.28.

[Figure 4.3: a signal amplified beyond the dynamic range, shown without saturation and with signed saturation]

Table 4.28 Saturation Instructions


Instruction                               Description
SSAT.W <Rd>, #<immed>, <Rn>{, <shift>}    Saturation for signed value
USAT.W <Rd>, #<immed>, <Rn>{, <shift>}    Saturation for a signed value into an unsigned value

Rd: Destination register
Immed: Bit position where the saturation is carried out
Rn: Input value
Shift: Shift operation for input value before saturation; optional, can be LSL #N or ASR #N
Besides the destination register, the Q-bit in the APSR can also be affected by the
result. The Q flag is set if saturation takes place in the operation, and it can be cleared
by writing to the APSR (see Table 4.29). For example, if a 32-bit signed value is to be
saturated into a 16-bit signed value, the following instruction can be used:
SSAT.W R1, #16, R0
Similarly, if a 32-bit signed value is to be saturated into a 16-bit unsigned value, the
following instruction can be used:
USAT.W R1, #16, R0
This will provide a saturation feature that has the properties shown in Figure 4.4. For
the preceding 16-bit saturation example instruction, the output values shown in Table
4.30 can be observed.

Saturation instructions can also be used for data type conversions. For example, they
can be used to convert a 32-bit integer value to 16-bit integer value. However, C compilers
might not be able to directly use these instructions, so intrinsic function or assembler
functions (or embedded/inline assembler code) for the data conversion could be required.

Table 4.29 Examples of Signed Saturation Results

Input (R0) Output (R1) Q Bit
0x00020000 0x00007FFF Set
0x00008000 0x00007FFF Set
0x00007FFF 0x00007FFF Unchanged
0x00000000 0x00000000 Unchanged
0xFFFF8000 0xFFFF8000 Unchanged
0xFFFF7FFF 0xFFFF8000 Set
0xFFFE0000 0xFFFF8000 Set

[Figure 4.4: a signal amplified beyond the dynamic range, shown with unsigned saturation]

Table 4.30 Examples of Unsigned Saturation Results


Input (R0) Output (R1) Q Bit
0x00020000 0x0000FFFF Set
0x00008000 0x00008000 Unchanged
0x00007FFF 0x00007FFF Unchanged
0x00000000 0x00000000 Unchanged
0xFFFF8000 0x00000000 Set
0xFFFF8001 0x00000000 Set
0xFFFFFFFF 0x00000000 Set

2.4 Several Useful Instructions in the Cortex-M3

• MSR and MRS

• More on the IF-THEN Instruction Block

• SDIV and UDIV

• REV, REVH, and REVSH

• Reverse Bit

• SXTB, SXTH, UXTB, and UXTH

• Bit Field Clear and Bit Field Insert

• UBFX and SBFX

• LDRD and STRD

2.5 Memory Systems

• Memory System Features Overview


• Memory Maps
• Memory Access Attributes
• Default Memory Access Permissions
• Bit-Band Operations
• Unaligned Transfers
• Exclusive Accesses
• Endian Mode

2.5.1 Memory System Features Overview

• It has a predefined memory map that specifies which bus interface is to be used
when a memory location is accessed.
• Another feature of the memory system in the Cortex-M3 is the bit-band support.
• The bit-band operations are supported only in special memory regions.
• The Cortex-M3 memory system also supports unaligned transfers and exclusive
accesses.
• The Cortex-M3 supports both little endian and big endian memory configuration.

2.5.2 Memory Maps

• Some of the memory locations are allocated for private peripherals such as
debugging components.
• These debugging components include the following:

 Flash Patch and Breakpoint Unit (FPB)
 Data Watchpoint and Trace Unit (DWT)
 Instrumentation Trace Macrocell (ITM)
 Embedded Trace Macrocell (ETM)
 Trace Port Interface Unit (TPIU)
 ROM table

2.6 Bit-Band Operations

• Bit-band operation support allows a single load/ store operation to access (read/write) to a
single data bit.
• In the Cortex-M3, this is supported in two predefined memory regions called bit-band regions.

• One of them is located in the first 1 MB of the SRAM region, and the other is located in the first
1 MB of the peripheral region.
• These two memory regions can be accessed like normal memory, but they can also be accessed
via a separate memory region called the bit-band alias.

2.6.1 Bit Accesses to Bit-Band Region via the Bit-Band Alias

1. For example, to set bit 2 in word data in address 0x20000000, instead of using three instructions
to read the data, set the bit, and then write back the result, this task can be carried out by a single
instruction.
2. The assembler sequence for these two cases could be like the one shown in the figure.

2.6.2 Write to Bit-Band Alias

Example Assembler Sequence to write a Bit with and without Bit Band

• Cortex-M3 does not have special instructions for bit operation.
• Special memory regions are defined so that data accesses to these regions are automatically
converted into bit-band operations.

2.6.3 Read from the Bit-Band Alias

Example Assembler Sequence to Read a bit with and without Bit-Band

Remapping of Bit-Band Addresses in SRAM Region

Remapping of Bit-Band Addresses in Peripheral Memory Region
2.6.4 Advantages of Bit-Band Operations

 Implement serial data transfers in general-purpose input/output (GPIO) ports to serial devices.

 If a branch should be carried out based on 1 single bit in a status register in a peripheral,
instead of:
   Reading the whole register
   Masking the unwanted bits
   Comparing and branching
you can simplify the operations to:
   Reading the status bit via the bit-band alias (get 0 or 1)
   Comparing and branching

2.6.5 Bit-Band Operation of Different Data Sizes

 Bit-band operation is not limited to word transfers.

 It can be carried out as byte transfers or half word transfers as well.

 For example, when a byte access instruction (LDRB/STRB) is used to access a bit-band
alias address range, the accesses generated to the bitband region will be in byte size.

 The same applies to half word transfers (LDRH/ STRH).

 When you use nonword transfers to bit-band alias addresses, the address value should
still be word aligned.

2.7 Bit-Band Operations in C Programs


2.7.1 Unaligned Transfers

 The Cortex-M3 supports unaligned transfers on single accesses.


 Data memory accesses can be defined as aligned or unaligned.
 Traditionally, ARM processors (such as the ARM7/ARM9/ARM10) allow only aligned transfers.
 That means in accessing memory, a word transfer must have address bit[1] and bit[0] equal to 0, and a
half word transfer must have address bit[0] equal to 0.
 For example, word data can be located at 0x1000 or 0x1004, but it cannot be located at 0x1001, 0x1002,
or 0x1003.
 For half word data, the address can be 0x1000 or 0x1002, but it cannot be 0x1001.
Unaligned Transfer Examples (Word)

Unaligned Transfer Examples (Half Word)

Unaligned Transfer
 In the Cortex-M3, unaligned transfers are supported in normal memory accesses (such as LDR, LDRH,
STR, and STRH instructions).
 There are a number of limitations:
 Unaligned transfers are not supported in Load/ Store multiple instructions.
 Stack operations (PUSH/POP) must be aligned.
 Exclusive accesses (such as LDREX or STREX) must be aligned; otherwise, a fault exception (usage
fault) will be triggered.
 Unaligned transfers are not supported in bit-band operations. Results will be unpredictable if you
attempt to do so.

2.7.2 Cortex-M3 Programming

 The Cortex™-M3 can be programmed using either assembly language, C language, or other high-level
languages like National Instruments LabVIEW.
 For most embedded applications using the Cortex-M3 processor, the software can be written entirely
in C language.
 There are of course some people who prefer to use assembly language or a combination of C and
assembly language in their projects.

A Typical Development Flow

 Various software programs are available for developing Cortex-M3 applications.


 The concepts of code generation flow in terms of these tools are similar.
 For the most basic uses, you will need an assembler, a C compiler, a linker, and binary file generation
utilities.
 For ARM solutions, the RealView Development Suite (RVDS) or RealView Compiler Tools (RVCT)
provide a file generation flow.
Example Flow Using ARM Development Tools

2.7.6 Using Assembly

1. For small projects, it is possible to develop the whole application in assembly language.
2. Using assembler, best optimization is possible, but it increases the development time.
3. Handling complex data structures or function library management can be extremely difficult.

• Yet even when the C language is used in a project, in some situations part of the program is implemented in
assembly language as follows:
Functions that cannot be implemented in C, such as direct manipulation of stack data or special instructions
that cannot be generated by the C compiler in normal C-code.
Timing-critical routines.
Tight memory requirements, causing part of the program to be written in assembly to get the smallest
memory size.

The Interface between Assembly and C

• In various situations, assembly code and the C program interact. For example:
When embedded assembly is used in C program code.
When C program code calls a function or subroutine implemented in assembler in a separate file.
When an assembly program calls a C function or subroutine.


1. From the above cases, it is important to understand how parameters and return results are passed between
the calling program and the function being called.
2. For simple cases, when a calling program needs to pass parameters to a subroutine or function, it will use
registers R0–R3, where R0 is the first parameter, R1 is the second, and so on.
3. Similarly, R0 and R1 are used for returning a value at the end of a function.
4. R0–R3 and R12 can be changed by a function or subroutine whereas the contents of R4–R11 should be
restored to the previous state before entering the function, usually handled by stack PUSH and stack POP.

The First Step in Assembly Programming


1. The examples here are based on ARM assembler tools (armasm) in RVDS.
2. For users of Keil MDK-ARM, the command line options are slightly different.
3. For other assembler tools, the file format and instruction syntax will also need to be modified.
4. In addition, some development tools will actually do the startup code for you, so you might not need to
worry about creating your assembly start-up code.

2.7.7 Assembling the Code

1. Program can be assembled using.

2. The -o option specifies the output file name. The test1.o is an object file. We then need to use a linker
to create an executable image (ELF). This can be done by
3. Here, --ro-base 0x0 specifies that the read-only region (program ROM) starts at address 0x0;
--rw-base specifies that the read/write region (data memory) starts at address 0x20000000.

1. The --map option creates an image map, which is useful for understanding the memory layout of the
compiled image.
2. Finally, we need to create the binary image

3. For checking that the image looks like what we wanted, we can also generate a disassembled code list
file by

4. If everything works fine, you can then load your ELF image or binary image into your hardware or
instruction set simulator for testing.

2.7.8 Producing Outputs

1. It is always more fun when you can connect your microcontroller to the outside world.
2. The simplest way to do that is to turn on/off the LEDs.
3. However, this practice is quite limiting because it can only represent very limited information.
4. One of the most common output methods is to send text messages to a console.

Low-Cost Test Environment for Outputting Text Messages

UART interface

5. Cortex-M3 processor does not contain a UART interface, but most Cortex-M3 microcontrollers come with
UART provided by the chip manufacturers.
6. Our next example assumes that a UART is available and has a status flag to indicate whether the transmit
buffer is ready for sending out new data.
7. A level shifter is needed in the connection because RS-232 has a different voltage level than the
microcontroller I/O pins.

The “Hello World” Example

1. Before we try to write a “Hello world” program, we should figure out how to send one character through
the UART.

2.7.9 Using Data Memory

1. Back to our first example: When we were doing the linking stage, we specified the read/write memory
region.
2. How do we put data there? The method is to define a data region in your assembly file.

3. Using the same example from the beginning, we can store the data in the data memory at 0x20000000
(the SRAM region).
4. The location of the data section is controlled by a command-line option when you run the linker:

2.8 CMSIS: Cortex Microcontroller Software Interface Standard

2.8.1 The aims of CMSIS are to:

1. Improve software portability and reusability
2. Enable software solution suppliers to develop products that can work seamlessly with
device libraries from various silicon vendors
3. Allow embedded developers to develop software quicker with an easy-to-use and
standardized software interface
4. Allow embedded software to be used on multiple compiler products
5. Avoid device driver compatibility issues when using software solutions from multiple
sources

2.8.2 Areas of standardization

 Hardware Abstraction Layer (HAL) for Cortex-M processor registers: This includes
standardized register definitions for NVIC, System Control Block registers,
SYSTICK register, MPU registers, and a number of NVIC and core feature access
functions.
 Standardized system exception names: This allows OS and middleware to use
system exceptions easily without compatibility issues.
 Standardized method of header file organization: This makes it easier for users to
learn new Cortex microcontroller products and improve software portability.
 Common method for system initialization: Each Microcontroller Unit (MCU)
vendor provides a SystemInit() function in their device driver library for essential
setup and configuration, such as initialization of clocks.
 Standardized intrinsic functions: By having standardized intrinsic functions,
software reusability and portability are considerably improved.
 Common access functions for communication: This provides a set of software
interface functions for common communication interfaces including universal
asynchronous receiver/transmitter (UART), Ethernet, and Serial Peripheral
Interface (SPI).
 Standardized way for embedded software to determine system clock frequency: A
software variable called SystemFrequency is defined in device driver code. This
allows embedded OS to set up the SYSTICK unit based on the system clock
frequency.

2.8.3 Organization of CMSIS

The CMSIS is divided into multiple layers as follows:


1. Core Peripheral Access Layer: Name definitions, address definitions, and helper
functions to access core registers and core peripherals
2. Middleware Access Layer: Common method to access peripherals for the software
industry. Targeted communication interfaces include Ethernet, UART and SPI.
Allows portable software to perform communication tasks on any Cortex
microcontrollers that support the required communication interface.
3. Device Peripheral Access Layer (MCU specific): Name definitions, address definitions,
and driver code to access peripherals
4. Access Functions for Peripherals (MCU specific): Optional additional helper functions
for peripherals

2.8.4 Using CMSIS


Since the CMSIS is incorporated inside the device driver library, there is no special setup
requirement for using CMSIS in projects.
For each MCU device, the MCU vendor provides a header file, which pulls in additional
header files required by the device driver library, including the Core Peripheral Access Layer
defined by ARM.
The file core_cm3.h contains the peripheral register definitions and access functions for the
Cortex-M3 processor peripherals like NVIC, System Control Block registers, and SYSTICK
registers.

The core_cm3.h file also contains declaration of CMSIS intrinsic functions to allow C
applications to access instructions that cannot be generated using IEC/ISO C language. In
addition, this file also contains a function for outputting a debug message via the
Instrumentation Trace Module (ITM).

The system_<device>.h file contains microcontroller specific interrupt number definitions, and
peripheral register definitions.

The system_<device>.c file contains a microcontroller specific function called SystemInit for
system initialization.

In addition, CMSIS compliant device drivers also contain start-up code (which contains the
vector table).

2.8.5 CMSIS Files:

2.8.6 Benefits of CMSIS

Better software portability and reusability.
Allows software to be quickly ported between Cortex-M3 and other Cortex-M processors,
reducing time to market. For embedded OS vendors and middleware providers, the
advantages of the CMSIS are significant.
By using the CMSIS, their software products can become compatible with device drivers
from multiple microcontroller vendors. Without the CMSIS, the software vendors either have
to include a small library for Cortex-M3 core functions or develop multiple configurations of
their product so that it can work with device libraries from different microcontroller vendors.

2.8.7 CMSIS Avoids Overlapping Driver Code.

QUESTION BANK

1. With a neat diagram, explain the Thumb-2 instruction set architecture in comparison
with Thumb and ARM. (4)
2. With a neat diagram, explain the operation of the reversing instructions. Give an example
for each. (6)
3. What do you mean by addressing modes? Explain the various addressing modes used
in the unified assembler language. Give examples for each. (6)
4. Elaborate on the memory map of the Cortex-M3 with a neat diagram. (6)
5. Explain the 16-bit instructions in the Cortex-M3: ADC, RSB, TST, BL, LDR, MOV, SVC
and PUSH. (6)
6. Write an ALP to find the sum of the first 10 integers. (4)
7. Draw the memory map of the Cortex-M3 and briefly explain bit-band operations. (5)
8. Explain the 32-bit instructions in the Cortex-M3: AND, CMN, MLA, SDIV, STR, MRS and
POP. (7)
9. Write a C program to toggle an LED with a small delay on the Cortex-M3. (5)
10. With a diagram, explain the organisation of CMSIS. (4)
11. With examples, explain the instructions ASR, LSR, and ROR. (4)
12. What is meant by array instructions? Explain one, two and three operand
instructions. (4)
13. List all the instructions in the instruction set of the ARM Cortex microcontroller. (6)
14. Differentiate between assembly and embedded language. (4)

Module-3

Introduction to embedded systems


Learning objectives

1. Learn what an Embedded System is


2. Learn the difference between Embedded Systems and General Computing Systems
3. Know the history of Embedded Systems
4. Learn the classification of Embedded Systems based on performance, complexity and
the era in which they evolved
5. Know the domains and areas of applications of Embedded Systems
6. Understand the different purposes of Embedded Systems
7. Analysis of a real life example on the bonding of embedded technology with human life

Our day-to-day life is becoming more and more dependent on "embedded systems" and
digital techniques. Embedded technologies are bonding into our daily activities even
without our knowledge. Do you know the fact that the refrigerator, washing machine,
microwave oven, air conditioner, television, DVD players, and music systems that we use
in our home are built around an embedded system? You may be traveling by a 'Honda' or
a 'Toyota' or a 'Ford' vehicle, but have you ever thought of the genius players working
behind the special features and security systems offered by the vehicle to you? It is
nothing but an intelligent embedded system. In your vehicle itself, the specialized
embedded systems range from intelligent head lamp controllers, engine
controllers and ignition control systems to complex air bag control systems that protect you
in case of a severe accident. People experience the power of embedded systems and enjoy
the features and comfort provided by them. Most of us are totally unaware or ignorant of
the intelligent embedded systems giving us so much comfort and security. Embedded
systems are like reliable servants: they don't like to reveal their identity, nor do they
complain about their workloads to their owners or bosses. They always sit in a
hidden place and are dedicated to their assigned task till their last breath. This book gives
you an overview of embedded systems, the various steps involved in their design and
development and the major domains where they are deployed.

1.1 WHAT IS AN EMBEDDED SYSTEM?

An embedded system is an electronic/electro-mechanical system designed to perform a
specific function and is a combination of both hardware and firmware (software).
Every embedded system is unique, and the hardware as well as the firmware is highly
specialized to the application domain. Embedded systems are becoming an inevitable part
of any product or equipment in all fields including household appliances,
telecommunications, medical equipment, industrial control, consumer products, etc.

1.2 EMBEDDED SYSTEMS vs. GENERAL COMPUTING SYSTEM

The computing revolution began with the general purpose computing requirements. Later
it was realized that the general computing requirements are not sufficient for the embedded
computing requirements. The embedded computing requirements demand "something
special" in terms of response to stimuli, meeting the computational deadlines, power
efficiency, limited memory availability, etc. Let's take the case of your personal computer,
which may be either a desktop PC or a laptop PC or a palmtop PC. It is built around a
general purpose processor like an Intel® Centrino or a Duo/Quad core or an AMD Turion™
processor and is designed to support a set of multiple peripherals like multiple USB 2.0
ports, Wi-Fi, Ethernet, video port, IEEE 1394, SD/CF/MMC external interfaces, Bluetooth,
etc., along with additional components like a CD reader/writer, on-board Hard Disk Drive (HDD),
gigabytes of RAM, etc. You can load any supported operating system (like Windows®
XP/Vista/7, or Red Hat Linux/Ubuntu Linux, UNIX, etc.) onto the hard disk of your PC.
You can write or purchase a multitude of applications for your PC and can use your PC for
running a large number of applications (like printing your dear ones' photos using a printer
device connected to the PC and printer software, creating a document using the Microsoft®
Office Word tool, etc.). Now let us think about the DVD player you use for playing DVD
movies. Is it possible for you to change the operating system of your DVD? Is it possible for
you to write an application and download it to your DVD player for executing? Is it possible
for you to add printer software to your DVD player and connect a printer to your DVD
player to take a printout? Is it possible for you to change the functioning of your DVD
player to a television by changing the embedded software? The answers to all these
questions are 'NO'. Can you see any general purpose interface like Bluetooth or Wi-Fi on
your DVD player? Of course 'NO'. The only interfaces you can find on the DVD player are
the interface for connecting the DVD player with the display screen and one for controlling
the DVD player through a remote (maybe an IR or any other specific wireless interface).
Indeed your DVD player is an embedded system designed specifically for decoding digital
video and generating a video signal as output to your TV or any other display screen which
supports the display interface supported by the DVD Player. Let us summarize our findings
from the comparison of embedded system and general purpose computing system with the
help of a table:

| General Purpose Computing System | Embedded System |
|---|---|
| A system which is a combination of generic hardware and a General Purpose Operating System for executing a variety of applications | A system which is a combination of special purpose hardware and embedded OS for executing a specific set of applications |
| Contains a General Purpose Operating System (GPOS) | May or may not contain an operating system for functioning |
| Applications are alterable (programmable) by the user (it is possible for the end user to re-install the operating system, and also add or remove user applications) | The firmware of the embedded system is pre-programmed and it is non-alterable by the end-user (there may be exceptions for systems supporting kernel image flashing through special hardware settings) |
| Performance is the key deciding factor in the selection of the system. Always, 'Faster is Better' | Application-specific requirements (like performance, power requirements, memory usage, etc.) are the key deciding factors |
| Less/not at all tailored towards reduced operating power requirements, options for different levels of power management | Highly tailored to take advantage of the power saving modes supported by the hardware and the operating system |
| Response requirements are not time-critical | For certain categories of embedded systems like mission critical systems, the response time requirement is highly critical |
| Need not be deterministic in execution behavior | Execution behavior is deterministic for certain types of embedded systems like 'Hard Real Time' systems |

However, the demarcation between desktop systems and embedded systems is shrinking
in certain areas of embedded applications. Smart phones are typical
examples of this. Nowadays smart phones are available with RAM up to 256 MB and users
can extend most of their desktop applications to the smart phones and it waives the clause
"Embedded systems are designed for a specific application" from the characteristics of the
embedded system for the mobile embedded device category. However, smart phones come
with a built-in operating system and it is not modifiable by the end user. It makes the
clause: "The firmware of the embedded system is unalterable by the end user", still a valid
clause in the mobile embedded device category.

1.3 HISTORY OF EMBEDDED SYSTEMS

Embedded systems were in existence even before the IT revolution. In the olden days
embedded systems were built around the old vacuum tube and transistor technologies and
the embedded algorithm was developed in low level languages. Advances in semiconductor
and nano-technology and the IT revolution gave way to the development of miniature embedded
systems. The first recognized modern embedded system is the Apollo Guidance Computer
(AGC) developed by the MIT Instrumentation Laboratory for the lunar expedition. It ran
the inertial guidance systems of both the Command Module (CM) and the Lunar Excursion
Module (LEM). The Command Module was designed to encircle the moon while the Lunar
Module and its crew were designed to go down to the moon surface and land there safely.
The Lunar Module featured in total 18 engines. There were 16 reaction control thrusters,
a descent engine and an ascent engine. The descent engine was designed to provide thrust
to the lunar module out of the lunar orbit and land it safely on the moon. MIT's original
design was based on 4K words of fixed memory (Read Only Memory) and 256 words of
erasable memory (Random Access Memory). By June 1963, the figures reached 10K of fixed
and 1K of erasable memory. The final configuration was 36K words of fixed memory and
2K words of erasable memory. The clock frequency of the first microchip proto model used
in AGC was 1.024 MHz and it was derived from a 2.048 MHz crystal clock. The computing
unit of AGC consisted of approximately 11 instructions and 16 bit word logic. Around 5000
ICs (3-input NOR gates, RTL logic) supplied by Fairchild Semiconductor were used in this
design. The user interface unit of AGC is known as DSKY (display/keyboard). DSKY looked
like a calculator type keypad with an array of numerals. It was used for inputting the
commands to the module numerically.

The first mass-produced embedded system was the guidance computer for the Minuteman-I
missile in 1961. It was the Autonetics D-17 guidance computer, built using discrete
transistor logic and a hard disk for main memory. The first integrated circuit was produced
in September 1958 but computers using them didn't begin to appear until 1963. Some of
their early uses were in embedded systems, notably by NASA for the Apollo Guidance
Computer and by the US military in the Minuteman-II intercontinental ballistic missile.

1.4 CLASSIFICATION OF EMBEDDED SYSTEMS

It is possible to have a multitude of classifications for embedded systems, based on different
criteria. Some of the criteria used in the classification of embedded systems are:
1. Based on generation
2. Complexity and performance requirements
3. Based on deterministic behaviour
4. Based on triggering.
The classification based on deterministic system behaviour is applicable for 'Real Time'
systems. The application/task execution behaviour for an embedded system can be either
deterministic or non-deterministic. Based on the execution behaviour, Real Time
embedded systems are classified into Hard and Soft. We will discuss hard and soft
real time systems in a later chapter. Embedded systems which are 'Reactive' in nature (like
process control systems in industrial control applications) can be classified based on the
trigger. Reactive systems can be either event triggered or time triggered.

1.4.1 CLASSIFICATION BASED ON GENERATION

This classification is based on the order in which the embedded processing systems evolved
from the first version to where they are today. As per this criterion, embedded systems can
be classified into:
1.4.1.1 First Generation The early embedded systems were built around 8-bit
microprocessors like 8085 and Z80, and 4-bit microcontrollers. They were simple in hardware
circuits, with firmware developed in Assembly code. Digital telephone keypads, stepper motor
control units, etc. are examples of this.

1.4.1.2 Second Generation These are embedded systems built around 16-bit
microprocessors and 8 or 16-bit microcontrollers, following the first generation embedded
systems. The instruction set for the second generation processors/controllers was much
more complex and powerful than that of the first generation processors/controllers. Some of the
second generation embedded systems contained embedded operating systems for their
operation. Data Acquisition Systems, SCADA systems, etc. are examples of second
generation embedded systems.
1.4.1.3 Third Generation With advances in processor technology, embedded system
developers started making use of powerful 32-bit processors and 16-bit microcontrollers for
their design. A new concept of application and domain specific processors/controllers like
Digital Signal Processors (DSP) and Application Specific Integrated Circuits (ASICs) came
into the picture. The instruction set of processors became more complex and powerful and
the concept of instruction pipelining also evolved. The processor market was flooded with
different types of processors from different vendors. Processors like Intel Pentium, Motorola
68K, etc, gained attention in high performance embedded requirements. Dedicated
embedded real time and general purpose operating systems entered into the embedded
market. Embedded systems spread their ground to areas like robotics, media, industrial
process control, networking, etc.

1.4.1.4 Fourth Generation The advent of System on Chips (SoC), reconfigurable processors
and multicore processors is bringing high performance, tight integration and
miniaturisation into the embedded device market. The SoC technique implements a total
system on a chip by integrating different functionalities with a processor core on an
integrated circuit. We will discuss SoCs in a later chapter. The fourth generation
embedded systems are making use of high performance real time embedded operating
systems for their functioning. Smart phone devices, mobile internet devices (MIDs), etc. are
examples of fourth generation embedded systems.

1.4.1.5 What Next? The processor and embedded market is highly dynamic and
demanding. So 'what will be the next smart move in the next embedded generation?' Let's
wait and see.

1.4.2 CLASSIFICATION BASED ON COMPLEXITY AND PERFORMANCE

This classification is based on the complexity and system performance requirements.
According to this classification, embedded systems can be grouped into:
1.4.2.1 Small-Scale Embedded Systems Embedded systems which are simple in
application needs and where the performance requirements are not time critical fall under
this category. An electronic toy is a typical example of a small-scale embedded system.
Small-scale embedded systems are usually built around low performance and low cost 8 or
16 bit microprocessors/microcontrollers. A small-scale embedded system may or may not
contain an operating system for its functioning.
1.4.2.2 Medium-Scale Embedded Systems Embedded systems which are slightly complex
in hardware and firmware (software) requirements fall under this category. Medium-scale
embedded systems are usually built around medium performance, low cost 16 or 32 bit
microprocessors/microcontrollers or digital signal processors. They usually contain an
embedded operating system (either general purpose or real time operating system) for
functioning.
1.4.2.3 Large-Scale Embedded Systems/Complex Systems Embedded systems which
involve highly complex hardware and firmware requirements fall under this category. They
are employed in mission critical applications demanding high performance. Such systems
are commonly built around high performance 32 or 64 bit RISC processors/controllers or
Reconfigurable System on Chip (RSoC) or multi-core processors and programmable logic
devices. They may contain multiple processors/controllers and co-units/hardware
accelerators for offloading the processing requirements from the main processor of the
system. Decoding/encoding of media, cryptographic function implementation, etc. are
examples for processing requirements which can be implemented using a co-
processor/hardware accelerator. Complex embedded systems usually contain a high
performance Real Time Operating System (RTOS) for task scheduling, prioritization and
management.

1.5 MAJOR APPLICATION AREAS OF EMBEDDED SYSTEMS

We are living in a world where embedded systems play a vital role in our day-to-day life,
starting from home to the computer industry, where most of the people find their job for a
livelihood. Embedded technology has acquired a new dimension from its first generation
model, the Apollo guidance computer, to the latest radio navigation system combined with
in-car entertainment technology and the microprocessor based "Smart" running shoes
launched by Adidas in April 2005. The application areas and the products in the embedded
domain are countless. A few of the important domains and products are listed below:

1. Consumer electronics: Camcorders, cameras, etc.

2. Household appliances: Television, DVD players, washing machine, fridge, microwave
oven, etc.
3. Home automation and security systems: Air conditioners, sprinklers, intruder
detection alarms, closed circuit television cameras, fire alarms, etc.
4. Automotive industry: Anti-lock braking systems (ABS), engine control, ignition
systems, automatic navigation systems, etc.
5. Telecom: Cellular telephones, telephone switches, handset multimedia applications,
etc.
6. Computer peripherals: Printers, scanners, fax machines, etc.
7. Computer networking systems: Network routers, switches, hubs, firewalls, etc.
8. Healthcare: Different kinds of scanners, EEG, ECG machines etc.
9. Measurement & Instrumentation: Digital multimeters, digital CROs, logic analyzers,
PLC systems, etc.
10. Banking & Retail: Automatic teller machines (ATM) and currency counters, point of
sales (POS)
11. Card Readers: Barcode, smart card readers, hand held devices, etc.

1.6 PURPOSE OF EMBEDDED SYSTEMS

As mentioned in the previous section, embedded systems are used in various domains like
consumer electronics, home automation, telecommunications, automotive industry,
healthcare, control & instrumentation, retail and banking applications, etc. Within the
domain itself, according to the application usage context, they may have different
functionalities. Each embedded system is designed to serve the purpose of any one or a
combination of the following tasks:
1. Data collection/Storage/Representation
2. Data communication
3. Data (signal) processing
4. Monitoring
5. Control
6. Application specific user interface
DATA COLLECTION/STORAGE/REPRESENTATION

An embedded system designed for the purpose of data collection performs acquisition of
data from the external world. Data collection is usually done for storage, analysis,
manipulation and transmission. The term "data" refers to all kinds of information, viz. text,
voice, image, video, electrical signals and any other measurable quantities. Data can be
either analog (continuous) or digital (discrete). Embedded systems with analog data
capturing techniques collect data directly in the form of analog signals, whereas embedded
systems with digital data collection mechanisms convert the analog signal to the corresponding
digital signal using analog to digital (A/D) converters and then collect the binary equivalent
of the analog data. If the data is digital, it can be directly captured without any additional
interface by digital embedded systems.
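
The analog-to-digital step described above can be illustrated with a minimal C sketch. It assumes an ideal N-bit converter that maps the input range 0 to vref onto the codes 0 to 2^N - 1; the function name adc_quantize and its parameters are hypothetical and are not taken from any particular device or library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an ideal N-bit A/D converter (assumption, not a
 * real device driver). Inputs at or below 0 V give code 0, inputs at or
 * above vref give the maximum code, and values in between are scaled
 * linearly and truncated to an integer code. */
uint32_t adc_quantize(double volts, double vref, unsigned bits)
{
    assert(vref > 0.0 && bits >= 1 && bits <= 16);

    uint32_t max_code = (1u << bits) - 1u;   /* e.g. 1023 for 10 bits */

    if (volts <= 0.0) {
        return 0;
    }
    if (volts >= vref) {
        return max_code;
    }
    /* Scale by the number of quantization levels, 2^bits. */
    return (uint32_t)(volts / vref * (double)(max_code + 1u));
}
```

For example, with an assumed 4.0 V reference and a 10-bit converter, an input of 2.0 V maps to code 512, the "binary equivalent" that the embedded system then collects.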
The collected data may be stored directly in the system or may be transmitted to some other
systems or it may be processed by the system or it may be deleted instantly after giving a
meaningful representation. These actions are purely dependent on the purpose for which
the embedded system is designed. Embedded systems designed for pure measurement
applications without storage, used in the control and instrumentation domain, collect data
and give a meaningful representation of the collected data by means of graphical
representation or quantity value, and delete the collected
data when new data arrives at the data collection terminal.
Analog and digital CROs without storage memory are
typical examples of this. Any measuring equipment used in
the medical domain for monitoring without storage
functionality also comes under this category.
Some embedded systems store the collected data for
processing and analysis. Such systems incorporate a built-in/plug-in storage memory for
storing the captured data. Some of them give the user a meaningful representation of the
collected data by visual (graphical/quantitative) or audible means using display units
[Liquid Crystal Display (LCD), Light Emitting Diode (LED), etc.], buzzers, alarms, etc.
Examples are: measuring instruments with storage memory and monitoring instruments
with storage memory used in medical applications. Certain embedded systems store the
data and will not give a representation of the same to the user, whereas the data is used
for internal processing.
A digital camera is a typical example of an embedded system with data
collection/storage/ representation of data. Images are captured and the captured image
may be stored within the memory of the camera. The captured image can also be presented
to the user through a graphic LCD unit.

[Fig. 1.1] A digital camera for image capturing/storage/display

DATA COMMUNICATION

Embedded data communication systems are deployed in applications ranging from
complex satellite communication systems to simple home networking systems. As
mentioned earlier in this chapter, the data collected by an embedded terminal may
need to be transferred to some other system located remotely. The transmission is achieved
either by a wire-line medium or by a wireless medium. Wire-line medium was the most common
choice in older embedded systems. As technology changes, wireless medium is becoming
the de facto standard for data communication in embedded systems. A wireless medium
offers cheaper connectivity solutions and makes the communication link free from the hassle
of wire bundles. Data can be transmitted either by analog means or by digital means.
Modern industry trends are settling towards digital communication.

The data collecting embedded terminal itself can incorporate data communication units like
wireless modules (Bluetooth, ZigBee, Wi-Fi, EDGE, GPRS, etc.) or wire-line modules (RS-
232C, USB, TCP/IP, PS2, etc.). Certain embedded systems act as a dedicated transmission
unit between the sending and receiving terminals, offering sophisticated functionalities like
data packetizing, encrypting and decrypting. Network hubs, routers, switches, etc. are
typical examples of dedicated data transmission embedded systems. They act as mediators
in data communication and provide various features like data security, monitoring etc.

1.6.1 DATA (SIGNAL) PROCESSING

As mentioned earlier, the data (voice, image, video, electrical signals and other
measurable quantities) collected by embedded systems may be used for various
kinds of data processing. Embedded systems with signal processing
functionalities are employed in applications demanding signal processing like
speech coding, synthesis, audio video codec, transmission applications, etc.
A digital hearing aid is a typical example of an embedded system employing data
processing. A digital hearing aid improves the hearing capacity of
hearing-impaired persons.

[Fig.1.3] A digital hearing aid employing signal processing technique

1.6.2 MONITORING

Embedded systems falling under this category are specifically designed for monitoring
purposes. Almost all embedded products in the medical domain have
monitoring functions only. They are used for determining the state of some variables using
input sensors. They cannot impose control over variables. A very good example is the
electrocardiogram (ECG) machine for monitoring the heartbeat of a patient. The machine is
intended to do the monitoring of the heartbeat. It cannot impose control over the heartbeat.
The sensors used in ECG are the different electrodes connected to the patient's body.
Some other examples of embedded systems with monitoring function are measuring
instruments like digital CROs, digital multimeters, logic analyzers, etc. used in Control
& Instrumentation applications. They are used for knowing (monitoring) the status of some
variables like current, voltage, etc. They cannot control the variables in turn.

1.6.3 CONTROL

Embedded systems with control functionalities impose control over
some variables according to the changes in input variables. A system
with control functionality contains both sensors and actuators.
Sensors are connected to the input port for capturing the changes
in the environmental variable or measuring variable. The actuators connected to the output port
are controlled according to the changes in the input variable to put an impact on the controlling
variable to bring the controlled variable to the specified range.
An air conditioner system used in our home to control the room temperature to a specified
limit is a typical example of an embedded system for control purposes. An air conditioner
contains a room temperature sensing element (sensor) which may be a thermistor and a
handheld unit for setting up (feeding) the desired temperature. The handheld unit may be
connected to the central embedded unit residing inside the air conditioner through a
wireless link or through a wired link. The air compressor unit acts as the actuator. The
compressor is controlled according to the current room temperature and the desired
temperature set by the end user.
Here the input variable is the current room temperature and the controlled variable is also
the room temperature. The controlling variable is cool air flow by the compressor unit. If
the controlled variable and input variable are not at the same value, the controlling variable
tries to equalise them through taking actions on the cool air flow.
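
The air conditioner example above can be sketched as a simple on/off control decision in C. This is a minimal sketch under stated assumptions: the function name compressor_next_state is hypothetical, and the small dead band around the set point (hysteresis) is a common refinement added here for illustration, not something described in the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical on/off controller for the compressor (the actuator).
 * room_c        : current room temperature from the sensor (input variable)
 * setpoint_c    : desired temperature fed in from the handheld unit
 * band_c        : assumed dead band around the set point, to avoid rapid
 *                 switching when the room is close to the set point
 * compressor_on : present state of the compressor
 * Returns the state the compressor should take next. */
bool compressor_next_state(double room_c, double setpoint_c,
                           double band_c, bool compressor_on)
{
    assert(band_c >= 0.0);

    if (room_c > setpoint_c + band_c) {
        return true;            /* too warm: start or keep cooling */
    }
    if (room_c < setpoint_c - band_c) {
        return false;           /* cool enough: stop the compressor */
    }
    return compressor_on;       /* inside the band: keep current state */
}
```

With a set point of 24 °C and a 0.5 °C band, a room at 27 °C switches the compressor on and a room at 23 °C switches it off; at 24.2 °C the compressor simply keeps whatever state it already has.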

1.6.4 APPLICATION SPECIFIC USER INTERFACE

These are embedded systems with application-specific user interfaces like buttons,
switches, keypad, lights, bells, display units, etc. A mobile phone is an example of this. In
a mobile phone the user interface is provided through the keypad, graphic LCD module,
system speaker, vibration alert, etc.

Fig. 1.6 An embedded system with an application-specific user interface (photo courtesy
of Nokia mobile handsets (www.nokia.com))

1.7 'SMART' RUNNING SHOES FROM ADIDAS: THE INNOVATIVE BONDING OF LIFESTYLE WITH
EMBEDDED TECHNOLOGY

After three years of extensive research work, Adidas launched
the "Smart" running shoes in the market in April 2005. The
term "Smart Shoe" may sound gimmicky. But adaptive
cushioning provided by the shoe makes sense, and the design
engineering behind the shoes is very impressive. The shoe
constantly adapts its shock-absorbing characteristics to customize its value to the
individual runner, depending on the running style, pace, body weight, and running
surface. The shoe uses a magnetic sensing system to measure cushioning level, which
is adjusted via a digital signal processing unit that controls a motor-driven cable
system.

A Hall effect sensor is positioned at the top of the "cushioning element", and the magnet
is placed at the bottom of the element. As the cushioning compresses on each impact, the
sensor measures the distance from top to bottom of mid-sole (accurate to 0.1 mm). About
1000 readings per second are taken and relayed to the shoe's microprocessor. The
Microprocessor (MPU) is positioned under the arch of the shoe. It runs an algorithm that
compares the compression messages received from the sensor to a preset range of proper
cushioning levels, so it understands if the shoe is too soft or too firm. Then the MPU sends
a command to a micro motor, housed in the mid-foot. The micro motor turns a lead screw
to lengthen or shorten a cable secured to the walls of a plastic-cushioning element. When
the cable is shortened, the cushioning element is pulled taut and compresses very little. A
longer cable allows for a more cushioned feel. A replaceable 3-V battery powers the motor
and lasts for about 100 hours of running.
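
The comparison step described above (measured compression checked against a preset range, followed by a motor command) can be sketched in C as follows. This is only an illustration of the idea: the actual Adidas algorithm is not given in the text, so the type motor_cmd_t, the function cushioning_command and the range limits are all hypothetical. Compression is expressed in tenths of a millimetre, matching the 0.1 mm sensor accuracy quoted above.

```c
#include <assert.h>

/* Hypothetical commands sent from the MPU to the micro motor. */
typedef enum {
    MOTOR_HOLD,            /* cushioning is within the preset range */
    MOTOR_SHORTEN_CABLE,   /* shoe too soft: pull the element taut   */
    MOTOR_LENGTHEN_CABLE   /* shoe too firm: allow more compression  */
} motor_cmd_t;

/* compression : measured mid-sole compression, in units of 0.1 mm
 * min_ok      : lower limit of the assumed "proper cushioning" range
 * max_ok      : upper limit of the assumed "proper cushioning" range */
motor_cmd_t cushioning_command(int compression, int min_ok, int max_ok)
{
    assert(min_ok <= max_ok);

    if (compression > max_ok) {
        return MOTOR_SHORTEN_CABLE;   /* compresses too much: too soft */
    }
    if (compression < min_ok) {
        return MOTOR_LENGTHEN_CABLE;  /* compresses too little: too firm */
    }
    return MOTOR_HOLD;
}
```

With an assumed acceptable range of 30 to 50 (3.0 mm to 5.0 mm), a reading of 62 would shorten the cable to firm up the shoe, while a reading of 21 would lengthen it for a more cushioned feel.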
The Portland, Oregon-based Adidas Innovation Team that developed the shoe was led by
Christian DiBenedetto. It also included electromechanical engineer Mark Oleson, as well
as a footwear developer and two industrial designers. Oleson explains that the team chose
a magnetic sensor because it could measure the amount of compression in addition to the
time it took to reach full compression. Gathering sensor data, he says, meant little without
building a comparative "running context". So one of the first steps in developing the MPU
algorithms was building this database. Runners wore test shoes that gathered information
about various compression levels during a run. Then the runners were interviewed to learn
their thoughts about the different cushion levels. "When the two matched up that helped
validate our sensor," says Oleson.
Adaptations in the cushioning element account for the change of running surface and pace
of the runner, and they're made gradually over an average of four running steps. The goal
is for the runner not to feel any sudden changes. Adaptations are made during the "swing"
phase rather than the "stance" phase of the stride (i.e. when the foot is off the ground). If
the shoe's owner prefers a more cushioned or a firmer "ride," adjustments can be made via
"+" and "-" buttons that also activate the intelligent functions of the shoe.
LED indicators confirm when the electronics are turned on (the lights do not remain on
when the shoes are in use). If the shoes aren't turned on, they operate like old-fashioned
"manual" running shoes. The shoes turn off if their owner is either inactive or at a walking
pace for 10 minutes.
The Typical Embedded System

[Fig.] Typical Embedded System (block diagram showing the System Core, which may be an
FPGA/ASIC/DSP/SoC or a Microprocessor/controller, together with the Embedded Firmware,
Memory, Communication Interface, I/p Ports (Sensors), O/p Ports (Actuators) and other
supporting Integrated Circuits & subsystems; the Embedded System interacts with the
Real World)

A typical embedded system contains a single chip controller, which acts as the master
brain of the system. The controller can be a Microprocessor (e.g. Intel 8085) or a
microcontroller (e.g. Atmel AT89C51) or a Field Programmable Gate Array (FPGA) device (e.g.
Xilinx Spartan) or a Digital Signal Processor (DSP) (e.g. Blackfin® Processors from Analog
Devices) or an Application Specific Integrated Circuit (ASIC)/Application Specific Standard
Product (ASSP) (e.g. ADE7760 Single Phase Energy Metering IC from Analog Devices for energy
metering applications).

Embedded hardware/software systems are basically designed to regulate a physical
variable or to manipulate the state of some devices by sending some control signals to the
Actuators or devices connected to the O/p ports of the system, in response to the input
signals provided by the end users or Sensors which are connected to the input ports. Hence
an embedded system can be viewed as a reactive system. The control is achieved by
processing the information coming from the sensors and user interfaces, and controlling
some actuators that regulate the physical variable.

Keyboards, push button switches, etc. are examples of common user interface input
devices whereas LEDs, liquid crystal displays, piezoelectric buzzers, etc. are examples of
common user interface output devices for a typical embedded system. It should be noted
that it is not necessary that all embedded systems should incorporate these I/O user
interfaces. It solely depends on the type of the application for which the embedded system
is designed. For example, if the embedded system is designed for any handheld application,
such as a mobile handset application, then the system should contain user interfaces like
a keyboard for performing input operations and a display unit for providing users the status
of various activities in progress.

Some embedded systems do not require any manual intervention for their operation. They
automatically sense the variations in the input parameters in accordance with the changes
in the real world, with which they interact through the sensors connected to
the input port of the system. The sensor information is passed to the processor after signal
conditioning and digitization. Upon receiving the sensor data this processor or brain of the
embedded system performs some pre-defined operations with the help of the firmware
embedded in the system and sends some actuating signals to the actuator connected to the
output port of the embedded system, which in turn acts on the controlling variable to bring
the controlled variable to the desired level to make the embedded system work in the desired
manner.

The Memory of the system is responsible for holding the control algorithm and other
important configuration details. For most of embedded systems, the memory for storing the
algorithm or configuration data is of fixed type, which is a kind of Read Only Memory (ROM)
and it is not available for the end user for modifications, which means the memory is
protected from unwanted user interaction by implementing some kind of memory protection
mechanism. The most common types of memories used in embedded systems for control
algorithm storage are OTP, PROM, UVEPROM, EEPROM and FLASH. Depending on the
control application, the memory size may vary from a few bytes to megabytes. We will
discuss them in detail in the coming sections. Sometimes the system requires temporary
memory for performing arithmetic operations or control algorithm execution and this type
of memory is known as “working memory”. Random Access Memory (RAM) is used in most
of the systems as the working memory. Various types of RAM like SRAM, DRAM and NVRAM
are used for this purpose. The size of the RAM also varies from a few bytes to kilobytes or
megabytes depending on the application. The details given under the section “Memory” will
give you a more detailed description of the working memory.

An embedded system without a control algorithm in its memory is just like a newborn
baby: it has all the peripherals but is not capable of making any decision in response to
situational or real-world changes. The only difference is that the memory of a newborn
baby is self-adaptive, meaning that the baby will try to learn from its surroundings and
from the mistakes it commits. For embedded systems it is the responsibility of the designer
to impart intelligence to the system.

In a controller-based embedded system, the controller may contain internal memory for
storing the control algorithm; it may be an EEPROM or FLASH memory varying from a few
kilobytes to megabytes. Such controllers are called controllers with on-chip ROM, e.g.
Atmel AT89C51. Some controllers do not contain on-chip memory and require an external
(off-chip) memory for holding the control algorithm, e.g. Intel 8031AH.

2.1 CORE OF THE EMBEDDED SYSTEM

Embedded systems are domain and application specific and are built around a
central core. The core of the embedded system falls into any one of the following categories:

1. General Purpose and Domain Specific Processor


1.1 Microprocessors
1.2 Microcontrollers
1.3 Digital Signal Processors

2. Application Specific Integrated Circuits (ASICs)

3. Programmable Logic Devices (PLDs)

4. Commercial off-the-shelf Components (COTS)

If you examine any embedded system you will find that it is built around any of the
core units mentioned above.

2.1.1 General Purpose and Domain Specific Processors

Almost 80% of embedded systems are processor/controller based. The processor may be a
microprocessor, a microcontroller or a digital signal processor, depending on the domain
and application. Most embedded systems in industrial control and monitoring applications
make use of commonly available microprocessors or microcontrollers, whereas domains which
require signal processing, such as speech coding and speech recognition, make use of
special digital signal processors supplied by manufacturers like Analog Devices, Texas
Instruments, etc.

2.1.1.1 Microprocessors

A microprocessor is a silicon chip representing a central processing unit (CPU), which is
capable of performing arithmetic as well as logical operations according to a pre-defined
set of instructions, which is specific to the manufacturer. In general the CPU contains the
Arithmetic and Logic Unit (ALU), control unit and working registers. A microprocessor is a
dependent unit and it requires the combination of other hardware like memory, timer unit
and interrupt controller for proper functioning. Intel claims the credit for developing
the first microprocessor unit, the Intel 4004, a 4-bit processor which was released in
November 1971. It featured 1K data memory, a 12-bit program counter and 4K program memory,
sixteen 4-bit general purpose registers and 46 instructions. It ran at a clock speed of 740
kHz and was designed for the calculators of the day. In 1972, 14 more instructions were
added to the 4004 instruction set and the program space was upgraded to 8K. Interrupt
capabilities were also added and it was renamed the Intel 4040. It was quickly replaced in
April 1972 by the Intel 8008, which was similar to the Intel 4040; the only difference was
that its program counter was 14 bits wide. The 8008 served as a terminal controller. In
April 1974 Intel launched the first 8-bit processor, the Intel 8080, with a 16-bit address
bus and program counter and seven 8-bit registers (A-E, H and L; the BC, DE and HL pairs
formed the 16-bit registers of this processor). The Intel 8080 was the most commonly used
processor for industrial control and other embedded applications in the 1970s. Since the
processor required other hardware components, as mentioned earlier, for its proper
functioning, the systems built around it were bulky and lacked compactness.
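
The link between the width of a program counter or address bus and the amount of memory it
can reach is simple arithmetic: n address bits select 2^n locations. A short Python sketch
(the function name is made up for this illustration) makes it easy to check the widths
quoted above:

```python
# Number of locations a processor can address with a given address width.
# The widths below are the ones quoted in the text.

def addressable_locations(address_bits):
    """Each extra address bit doubles the number of addressable locations."""
    return 2 ** address_bits


if __name__ == "__main__":
    for name, bits in [("12-bit program counter", 12),
                       ("14-bit program counter", 14),
                       ("16-bit address bus", 16)]:
        locations = addressable_locations(bits)
        print(f"{name}: {locations} locations ({locations // 1024}K)")
```

A 12-bit program counter therefore matches a 4K program space, and a 16-bit address bus
reaches 64K locations.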

Immediately after the release of Intel 8080, Motorola also entered the market with their
processor, Motorola 6800 with a different architecture and instruction set compared to
8080.

In 1976 Intel came up with an upgraded version of the 8080, the Intel 8085, with two newly
added instructions, three interrupt pins and serial I/O. The clock generator and bus
controller circuits were built in and the power supply was simplified to a single +5 V
supply.

In July 1976 Zilog entered the microprocessor market with its Z80 processor as a competitor
to Intel. It was designed by an ex-Intel designer, Federico Faggin, and it was an improved
version of Intel’s 8080 processor, maintaining the original 8080 architecture and
instruction set with an 8-bit data bus and a 16-bit address bus, and it was capable of
executing all instructions of the 8080. It included 80 new instructions and it brought out
the concept of register banking by doubling the register set. The Z80 also included two
sets of index registers for flexible design.

Technical advances in the semiconductor industry brought a new dimension to the
microprocessor market, and the twentieth century witnessed fast growth in processor
technology. 16, 32 and 64-bit processors replaced the conventional 8-bit processors. The
initial 2 MHz clock is now an old story; today processors with clock speeds up to 2.4 GHz
are available in the market. More and more competitors entered the processor market,
offering high speed, high performance and low cost processors for customer design needs.

Intel, AMD, Freescale, IBM, TI, Cyrix, Hitachi, NEC, LSI Logic, etc. are the key players in
the processor market. Intel still leads the market with cutting edge technologies in the
processor industry.

Different instruction sets and system architectures are available for the design of a
microprocessor. Harvard and Von-Neumann are the two common system architectures for
processor design. Processors based on the Harvard architecture contain separate buses for
program memory and data memory, whereas processors based on the Von-Neumann architecture
share a single system bus for program and data memory. We will discuss these architectures
later, under a separate topic. Reduced Instruction Set Computing (RISC) and Complex
Instruction Set Computing (CISC) are the two common Instruction Set Architectures (ISA)
available for processor design. We will discuss them under a separate topic in this
section.

2.1.1.2 General Purpose Processor (GPP) vs. Application-Specific Instruction Set Processor (ASIP)

A General Purpose Processor or GPP is a processor designed for general computational
tasks. The processor running inside your laptop or desktop (Pentium 4, AMD Athlon, etc.) is
a typical example of a general purpose processor. They are produced in large volumes and
target the general market. Due to the high volume production, the per unit cost of a chip
is low compared to an ASIC or other specific ICs. A typical general purpose processor
contains an Arithmetic and Logic Unit (ALU) and a Control Unit (CU). On the other hand,
Application Specific Instruction Set Processors (ASIPs) are processors with an architecture
and instruction set optimized for specific domain/application requirements like network
processing, automotive, telecom, media applications, digital signal processing, control
applications, etc. ASIPs fill the architectural spectrum between general purpose processors
and Application Specific Integrated Circuits (ASICs). The need for an ASIP arises when
traditional general purpose processors are unable to meet increasing application needs.
Most embedded systems are built around application specific instruction set processors.
Some microcontrollers (like the automotive AVR and USB AVR from Atmel), systems on chip,
digital signal processors, etc. are examples of ASIPs. An ASIP incorporates a processor,
the on-chip peripherals demanded by the application requirement, and program and data
memory.

2.1.1.3 Microcontrollers

A microcontroller is a highly integrated chip that contains a CPU, scratch pad RAM, special
and general purpose register arrays, on-chip ROM/FLASH memory for program storage, timer
and interrupt control units and dedicated I/O ports. Microcontrollers can be considered a
superset of microprocessors. Since a microcontroller contains all the functional blocks
necessary for independent working, microcontrollers have found a greater place in the
embedded domain than microprocessors. Apart from this, they are cost effective and readily
available in the market.

Texas Instruments’ TMS 1000 is considered the world’s first microcontroller, although it
cannot be called a fully functional microcontroller when compared with modern
microcontrollers. TI followed Intel’s 4004/4040 4-bit processor design and added some
amount of RAM, program storage memory (ROM) and I/O support on a single chip, thereby
eliminating the requirement of multiple hardware chips for self-functioning. Provision to
add custom instructions to the CPU was another innovative feature of the TMS 1000. The TMS
1000 was released in 1974.

In 1977 Intel entered the microcontroller market with a family of controllers under one
umbrella, the MCS-48™ family. The processors in this family were the 8038HL, 8039HL,
8040AHL, 8048H, 8049H and 8050AH. The Intel 8048 is recognized as Intel’s first
microcontroller and it was the most prominent member of the MCS-48™ family. It was used in
the original IBM PC keyboard. The inspiration behind the 8048 was Fairchild’s F8
microprocessor and Intel’s goal of developing a low cost and small size processor. The
design of the 8048 adopted a true Harvard architecture where program and data memory shared
the same address bus and were differentiated by the related control signals. MCS-48™ is a
trademark owned by Intel.

Eventually Intel came out with its most fruitful design in the 8-bit microcontroller
domain: the 8051 family and its derivatives. It is the most popular and powerful 8-bit
microcontroller ever built. It was developed in the 1980s and was put under the MCS-51
family. Almost 75% of the microcontrollers used in the embedded domain during the 1980s and
1990s were 8051 family based controllers. 8051 processor cores are used in more than 100
devices by more than 20 independent manufacturers like Maxim, Philips, Atmel, etc. under
license from Intel. Due to their low cost, wide availability, memory efficient instruction
set, mature development tools and Boolean processing (bit manipulation) capability, 8051
family derivative microcontrollers are much used in high-volume consumer electronic
devices, the entertainment industry and other gadgets where cost-cutting is essential.

Another important family of microcontrollers used in industrial control and embedded
applications is the PIC family of microcontrollers from Microchip Technologies (it will be
discussed in detail in a later section of this book). It is a high performance RISC
microcontroller complementing the CISC (Complex Instruction Set Computing) features of the
8051. The terms RISC and CISC will be explained in detail under a separate heading.

Some embedded system applications require only 8-bit controllers, whereas applications
requiring superior performance and computational capability demand 16/32-bit
microcontrollers. Infineon, Freescale, Philips, Atmel, Maxim, Microchip, etc. are the key
suppliers of 16-bit microcontrollers. Philips tried to extend the 8051 family to 16-bit
applications by developing the Philips XA (eXtended Architecture) microcontroller series.

8-bit microcontrollers are commonly used in embedded systems where processing power is
not a big constraint. As mentioned earlier, more than 20 companies are producing different
flavors of the 8051 family microcontroller. They try to add more and more functionality,
like built-in SPI and I2C serial buses, USB controller, ADC, networking capability, etc.,
so the competitive market is driving towards a one-stop solution chip in the
microcontroller domain. High processing speed microcontroller families like the ARM11
series are also available in the market, which provide solutions to applications requiring
hardware acceleration and high processing capability.

Freescale, NEC, Zilog, Hitachi, Mitsubishi, Infineon, ST Micro Electronics, National, Texas
Instruments, Toshiba, Philips, Microchip, Analog Devices, Daewoo, Intel, Maxim, Sharp,
Silicon Laboratories, TDK, Triscend, Winbond, Atmel, etc. are the key players in the
microcontroller market. Of these, Atmel has special significance. It manufactures a
variety of Flash memory based microcontrollers and also provides In-System Programmability
(which will be discussed in detail in a later section of this book) for the controller.
Flash memory allows fast reprogramming of the chip and thereby reduces the product
development time. Atmel also provides another special family of microcontrollers called AVR
(it will be discussed in detail in a later chapter), an 8-bit RISC Flash microcontroller
that is fast enough to execute powerful instructions in a single clock cycle and provides
the latitude needed to optimize power consumption.

The instruction set architecture of a microcontroller can be either RISC or CISC.


Microcontrollers are designed for either general purpose application requirement (general
purpose controller) or domain specific application requirement (application specific
instruction set processor). The Intel 8051 microcontroller is a typical example for a general
purpose microcontroller, whereas the automotive AVR microcontroller family from Atmel
Corporation is a typical example for ASIP specifically designed for the automotive domain.

2.1.1.4 Microprocessor vs Microcontroller

The following table summarizes the differences between a microprocessor and a
microcontroller.

Microprocessor                                Microcontroller

A silicon chip representing a Central         A highly integrated chip that contains a
Processing Unit (CPU), which is capable       CPU, scratch pad RAM, special and general
of performing arithmetic as well as           purpose register arrays, on-chip ROM/FLASH
logical operations according to a             memory for program storage, timer and
pre-defined set of instructions               interrupt control units and dedicated I/O
                                              ports

It is a dependent unit. It requires the       It is a self-contained unit and it doesn’t
combination of other chips like timers,       require an external interrupt controller,
program and data memory chips, interrupt      timer, UART, etc. for its functioning
controllers, etc. for functioning

Most of the time general purpose in           Mostly application oriented or domain
design and operation                          specific

Doesn’t contain a built-in I/O port. The      Most of the processors contain multiple
I/O port functionality needs to be            built-in I/O ports which can be operated
implemented with the help of external         as a single 8, 16 or 32-bit port or as
Programmable Peripheral Interface chips       individual port pins
like 8255

Targeted for high end market where            Targeted for embedded market where
performance is important                      performance is not so critical (at present
                                              this demarcation is invalid)

Limited power saving options compared to      Includes a lot of power saving features
microcontrollers

2.1.1.6 RISC vs. CISC Processors/Controllers

The term RISC stands for Reduced Instruction Set Computing. As the name implies,
all RISC processors/controllers have a smaller number of instructions, typically in the
range of 30 to 40. CISC stands for Complex Instruction Set Computing. From the definition
itself it is clear that the instruction set is complex and the instructions are high in
number. From a programmer’s point of view RISC processors are comfortable, since s/he needs
to learn only a few instructions, whereas for a CISC processor s/he needs to learn a larger
number of instructions and should understand the context of usage of each instruction.
(This scenario applies to a programmer coding in Assembly Language. For a programmer coding
in C it doesn’t matter, since the cross-compiler is responsible for the conversion of the
high level language instructions to machine dependent code.) The Atmel AVR microcontroller
is an example of a RISC processor and its instruction set contains only 32 instructions.
The original version of the 8051 microcontroller (e.g. AT89C51) is a CISC controller and
its instruction set contains 255 instructions. There are some other factors, like
pipelining features and instruction set type, for determining the RISC/CISC criteria.
Some of the important criteria are listed below:

RISC                                          CISC

Lesser number of instructions                 Greater number of instructions

Instruction pipelining and increased          Generally no instruction pipelining
execution speed                               feature

Orthogonal instruction set (allows each       Non-orthogonal instruction set (not all
instruction to operate on any register        instructions are allowed to operate on any
and use any addressing mode)                  register or use any addressing mode; it is
                                              instruction-specific)

Operations are performed on registers         Operations are performed on registers or
only; the only memory operations are          memory depending on the instruction
load and store

A large number of registers are available     Limited number of general purpose
                                              registers

Programmer needs to write more code to        Instructions are like macros in C language.
execute a task since the instructions are     A programmer can achieve the desired
simpler ones                                  functionality with a single instruction,
                                              which in turn provides the effect of using
                                              several simpler instructions in RISC

Single, fixed length instructions             Variable length instructions

Less silicon usage and pin count              More silicon usage since additional
                                              decoder logic is required to implement the
                                              complex instruction decoding

With Harvard Architecture                     Can be Harvard or Von-Neumann
                                              Architecture

I hope the terms RISC and CISC in processor technology are now clear.

2.1.1.7 Harvard vs. Von-Neumann Processor/Controller Architecture

The terms Harvard and Von-Neumann refer to the processor architecture design.
Microprocessors/controllers based on the Von-Neumann architecture share a single common
bus for fetching both instructions and data. Program instructions and data are stored in a
common main memory. Von-Neumann architecture based processors/controllers first fetch an
instruction and then fetch the data to support the instruction from the same memory. The
two separate fetches slow down the controller’s operation. The Von-Neumann architecture is
also referred to as the Princeton architecture, since it was developed at Princeton
University.

Von-Neumann Architecture: I/O, CPU and memory connected by a single shared bus
Harvard Architecture: CPU connected to separate program memory and data memory

Microprocessors/controllers based on the Harvard architecture have separate data and
instruction buses. This allows data transfer and program fetching to occur simultaneously
on the two buses. With the Harvard architecture, the data memory can be read and written
while the program memory is being accessed. These separate data memory and code memory
buses allow one instruction to execute while the next instruction is fetched
(“pre-fetching”).

The pre-fetch theoretically allows much faster execution than the Von-Neumann
architecture. Since some additional hardware logic is required to generate the control
signals for this type of operation, it adds silicon complexity to the system. The figure
illustrates the Harvard and Von-Neumann architecture concepts.
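
The effect of the overlapped fetch can be estimated with a rough bus cycle count. The
Python sketch below is an idealized model and not a description of any real processor: it
assumes every bus transfer takes one cycle, each instruction makes at most one data access,
and execution time inside the CPU is ignored. The function names and the sample program are
made up for the illustration.

```python
# Idealized bus-cycle count for the two architectures (assumptions: one cycle
# per bus transfer, at most one data access per instruction, CPU time ignored).
# A program is a list of booleans: True if the instruction accesses data memory.

def von_neumann_cycles(program):
    """Single shared bus: instruction fetches and data accesses are serialized."""
    cycles = 0
    for needs_data in program:
        cycles += 1              # instruction fetch on the shared bus
        if needs_data:
            cycles += 1          # data transfer has to wait for the same bus
    return cycles


def harvard_cycles(program):
    """Separate buses: a data access overlaps the fetch of the next instruction."""
    cycles = 0
    pending_data = False         # data access of the previously fetched instruction
    for needs_data in program:
        cycles += 1              # fetch; any pending data access runs in parallel
        pending_data = needs_data
    if pending_data:
        cycles += 1              # the last data access has no fetch to overlap with
    return cycles


if __name__ == "__main__":
    # load, load, add (registers only), store
    sample = [True, True, False, True]
    print("Von-Neumann:", von_neumann_cycles(sample))
    print("Harvard:    ", harvard_cycles(sample))
```

Under these assumptions the saving grows with the share of instructions that touch data
memory; real devices add wait states, caches and deeper pipelines that this model leaves out.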

The following table highlights the differences between Harvard and Von-Neumann
architecture.

Harvard Architecture Von-Neumann Architecture


Separate buses for instruction and data Single shared bus for instruction and
fetching data fetching
Easier to pipeline, so high performance can Low performance compared to Harvard
be achieved architecture
Comparatively high cost Cheaper
No memory alignment problems Allows self-modifying codes

Since data memory and program memory Since data memory and program memory
are stored physically in different locations, are stored physically in the same chip,
no chances for accidental corruption of chances for accidental corruption of
program memory program memory

2.1.1.8 Big-Endian vs. Little-Endian Processors/Controllers

Endianness specifies the order in which data is stored in memory by processor operations
in a multi-byte system (processors whose word size is greater than one byte). Suppose the
word length is two bytes; then data can be stored in memory in two different ways:

1. Higher order data byte at the higher memory location and lower order data byte at the
location just below it.

2. Lower order data byte at the higher memory location and higher order data byte at the
location just below it.

Little-endian means the lower-order byte of the data is stored in memory at the lowest
address, and the higher-order byte at the highest address. (The little end comes first.) For
example, a 4 byte long integer Byte3 Byte2 Byte1 Byte0 will be stored in the memory as
shown below:

Offset              Content    Example address
Base Address + 0    Byte 0     0x20000
Base Address + 1    Byte 1     0x20001
Base Address + 2    Byte 2     0x20002
Base Address + 3    Byte 3     0x20003

Little-endian Operation

Big-endian means the higher-order byte of the data is stored in memory at the lowest
address, and the lower-order byte at the highest address. (The big end comes first.) The
same 4 byte integer will be stored in the memory as shown below:

Offset              Content    Example address
Base Address + 0    Byte 3     0x20000
Base Address + 1    Byte 2     0x20001
Base Address + 2    Byte 1     0x20002
Base Address + 3    Byte 0     0x20003

Big-endian Operation
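
The two byte orders can be reproduced with a few lines of Python. This is a small sketch
for illustration: the helper name `byte_layout` and the sample value 0x12345678 are
assumptions, while `int.to_bytes` and `sys.byteorder` are standard Python features. The
base address 0x20000 is the example address used in the tables.

```python
import sys


def byte_layout(value, byteorder, size=4):
    """Return the bytes of value in the order they would sit in memory,
    starting at the base address."""
    return list(value.to_bytes(size, byteorder))


if __name__ == "__main__":
    word = 0x12345678   # Byte3 = 0x12, Byte2 = 0x34, Byte1 = 0x56, Byte0 = 0x78
    base = 0x20000      # example base address
    for order in ("little", "big"):
        print(order + "-endian")
        for offset, byte in enumerate(byte_layout(word, order)):
            print(f"  0x{base + offset:05X}: 0x{byte:02X}")
    print("This machine is", sys.byteorder + "-endian")
```

The little-endian listing starts with Byte0 at the base address, while the big-endian
listing starts with Byte3.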

2.1.1.9 Load Store Operation and Instruction Pipelining

As mentioned earlier, in a RISC processor the operations are performed on registers only.
Memory access is performed by the special instructions load and store. If an operand is
specified as a memory location, its content is loaded into a register using the load
instruction. The store instruction stores data from a specified register to a specified
memory location. The concept of the Load Store Architecture is illustrated with the
following example:

    load  R1, x        (step 1)
    load  R2, y        (step 2)
    add   R3, R1, R2   (step 3)
    store R3, z        (step 4)

Load Store Operation (memory locations x, y and z; registers R1, R2 and R3; ALU)

Suppose x, y and z are memory locations and we want to add the contents of x and y and
store the result in location z. Under the load store architecture this is achieved with 4
instructions, as shown in the figure.

The first instruction, load R1, x, loads register R1 with the content of memory location x;
the second instruction, load R2, y, loads register R2 with the content of memory location
y. The instruction add R3, R1, R2 adds the contents of registers R1 and R2 and stores the
result in register R3. The next instruction, store R3, z, stores the content of register R3
in memory location z.
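
The four-instruction sequence can be traced with a toy interpreter. The sketch below is
written in Python purely for illustration; the tuple encoding of instructions, the `run`
function and the sample memory contents are assumptions and do not correspond to any real
instruction set.

```python
# Toy load-store machine: only load and store touch memory, add works on
# registers only. The encoding is hypothetical and chosen for readability.

def run(program, memory):
    """Execute a list of (opcode, operands...) tuples and return the registers."""
    registers = {}
    for instruction in program:
        opcode, *operands = instruction
        if opcode == "load":            # load Rd, address
            rd, address = operands
            registers[rd] = memory[address]
        elif opcode == "add":           # add Rd, Rs1, Rs2
            rd, rs1, rs2 = operands
            registers[rd] = registers[rs1] + registers[rs2]
        elif opcode == "store":         # store Rs, address
            rs, address = operands
            memory[address] = registers[rs]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return registers


PROGRAM = [
    ("load", "R1", "x"),
    ("load", "R2", "y"),
    ("add", "R3", "R1", "R2"),
    ("store", "R3", "z"),
]

if __name__ == "__main__":
    memory = {"x": 5, "y": 7, "z": 0}   # arbitrary example contents
    run(PROGRAM, memory)
    print(memory)
```

Note that the add step never sees a memory address; that restriction is what makes the
machine a load-store design.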

The conventional instruction execution by the processor follows the fetch-decode-execute
sequence. The fetch stage fetches the instruction from program memory or code memory and
the decode stage decodes the instruction to generate the necessary control signals. The
execute stage reads the operands, performs ALU operations and stores the result. In
conventional program execution, the fetch and decode operations are performed in sequence.
For simplicity let’s consider decode and execution together. During the decode operation
the memory address bus is available, and if it can be used for an instruction fetch, the
processing speed can be increased. In its simplest form, instruction pipelining refers to
the overlapped execution of instructions. Under normal program flow it is meaningful to
fetch the next instruction while decoding and execution of the current instruction is in
progress. If the current instruction is a program control flow transfer instruction, like
a jump or call instruction, there is no point in fetching the instruction that follows it.
In such cases the fetched instruction is flushed and a new fetch is performed to get the
instruction at the target address. While the current instruction is executing, the program
counter is loaded with the address of the next instruction. In the case of a jump or
branch instruction, the new location is known only after completion of the jump or branch
instruction. Depending on the stages involved in an instruction (fetch, read register and
decode, execute instruction, access an operand in data memory, write back the result to
register, etc.), there can be multiple levels of instruction pipelining. The figure
illustrates the concept of instruction pipelining for single stage pipelining.

                 Machine Cycle 1    Machine Cycle 2    Machine Cycle 3
Fetch            PC                 PC + 1             PC + 2
Execute          PC - 1             PC                 PC + 1

PC: Program Counter

The single stage pipelining concept
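
The gain from overlapping fetch and execute, and the penalty of a flush, can be seen in a
small cycle-counting model. The Python sketch below is a simplification under stated
assumptions: fetch takes one cycle, decode and execute together take one cycle, and a jump
to anything other than the next address costs one extra fetch cycle. The program encoding
and function names are invented for the example.

```python
# Cycle count for single stage pipelining (fetch overlapped with execute).
# A program is a list of ("op",) or ("jump", target_index) tuples.

def run_pipelined(program, max_steps=1000):
    """Return (executed_addresses, cycles, flushes) for the toy pipeline."""
    executed, flushes = [], 0
    if not program:
        return executed, 0, flushes
    cycles = 1                      # first cycle: fetch the first instruction
    pc = 0                          # address of the instruction already fetched
    while 0 <= pc < len(program) and len(executed) < max_steps:
        cycles += 1                 # execute pc while prefetching pc + 1
        executed.append(pc)
        instruction = program[pc]
        if instruction[0] == "jump" and instruction[1] != pc + 1:
            flushes += 1            # the prefetched instruction is discarded
            cycles += 1             # extra cycle to fetch the jump target
            pc = instruction[1]
        else:
            pc += 1                 # the prefetched instruction is the right one
    return executed, cycles, flushes


def unpipelined_cycles(instruction_count):
    """Without overlap every instruction needs a fetch cycle and an execute cycle."""
    return 2 * instruction_count


if __name__ == "__main__":
    program = [("op",), ("jump", 3), ("op",), ("op",)]
    executed, cycles, flushes = run_pipelined(program)
    print("executed:", executed, "cycles:", cycles, "flushes:", flushes)
    print("without pipelining:", unpipelined_cycles(len(executed)))
```

The `max_steps` guard only exists so that a program with a backward jump cannot loop
forever in the simulation.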

2.1.2 Application Specific Integrated Circuits (ASICs)

An Application Specific Integrated Circuit (ASIC) is a microchip designed to perform a
specific or unique application. It is used as a replacement for conventional general
purpose logic chips. It integrates several functions into a single chip and thereby reduces
the system development cost. Most ASICs are proprietary products. As a single chip, an ASIC
consumes a very small area in the total system and thereby helps in the design of smaller
systems with high capabilities/functionalities.

ASICs can be pre-fabricated for a special application or custom fabricated using
components from a re-usable ‘building block’ library for a particular customer application.
ASIC based systems are profitable only for large volume commercial production. Fabrication
of ASICs requires a non-refundable initial investment for the process technology and
configuration expenses. This investment is known as the Non-Recurring Engineering Charge
(NRE) and it is a one-time investment.
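
Why an ASIC pays off only at large volumes can be shown with a break-even calculation. The
Python sketch below uses a simple cost model (one-time NRE plus a fixed cost per unit); the
function names and every figure in it are hypothetical and serve only to show the
arithmetic.

```python
import math


def total_cost(units, nre, unit_cost):
    """One-time NRE charge plus the recurring cost of every unit produced."""
    return nre + units * unit_cost


def break_even_units(asic_nre, asic_unit_cost, alternative_unit_cost):
    """Smallest production volume at which the ASIC becomes the cheaper option.

    The alternative (for example an off-the-shelf part) is assumed to have no
    NRE charge. Returns None if the ASIC never catches up.
    """
    saving_per_unit = alternative_unit_cost - asic_unit_cost
    if saving_per_unit <= 0:
        return None
    return math.floor(asic_nre / saving_per_unit) + 1


if __name__ == "__main__":
    # Hypothetical figures: 100,000 NRE, 1 per ASIC, 5 per alternative part.
    volume = break_even_units(100_000, 1, 5)
    print("ASIC is cheaper from", volume, "units onwards")
    print("ASIC cost:       ", total_cost(volume, 100_000, 1))
    print("Alternative cost:", total_cost(volume, 0, 5))
```

Below the break-even volume the NRE charge dominates, which is the reason small production
runs usually avoid custom silicon.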

If the Non-Recurring Engineering Charge (NRE) is borne by a third party and the
Application Specific Integrated Circuit (ASIC) is made openly available in the market, the
ASIC is referred to as an Application Specific Standard Product (ASSP). The ASSP is
marketed to multiple customers just as a general-purpose product is, but to a smaller
number of customers since it is for a specific application. The ADE7760 Energy Meter ASIC
developed by Analog Devices for energy metering applications is a typical example of an
ASSP.

Since Application Specific Integrated Circuits (ASICs) are proprietary products, the
developers of such chips may not be interested in revealing the internal details of it and
hence it is very difficult to point out an example of it. Moreover it will create legal disputes
if an illustration of such an ASIC product is given without getting prior permission from the
manufacturer of the ASIC. For the time being, let us forget about it. We will come back to it
in another part of this book series (Namely, Designing Advanced Embedded Systems).

2.1.3 Programmable Logic Devices

Logic devices provide specific functions, including device-to-device interfacing, data
communication, signal processing, data display, timing and control operations, and almost
every other function a system must perform. Logic devices can be classified into two broad
categories: fixed and programmable. As the name indicates, the circuits in a fixed logic
device are permanent; they perform one function or set of functions, and once manufactured
they cannot be changed. On the other hand, Programmable Logic Devices (PLDs) offer
customers a wide range of logic capacity, features, speed and voltage characteristics, and
these devices can be re-configured to perform any number of functions at any time.

With programmable logic devices, designers use inexpensive software tools to quickly
develop, simulate and test their designs. Then, a design can be quickly programmed into a
device and immediately tested in a live circuit. The PLD that is used for this prototyping
is the exact same PLD that will be used in the final production of a piece of end
equipment, such as a network router, a DSL modem, a DVD player or an automotive navigation
system. There are no NRE costs and the final design is completed much faster than that of
a custom, fixed logic device. Another key benefit of using PLDs is that during the design
phase customers can change the circuitry as often as they want until the design operates to
their satisfaction. That’s because PLDs are based on re-writable memory technology: to
change the design, the device is simply reprogrammed. Once the design is final, customers
can go into immediate production by simply programming as many PLDs as they need with the
final software design file.

Advantages of PLD:

Programmable logic devices offer a number of important advantages over fixed logic devices,
including:

 PLDs offer customers much more flexibility during the design cycle because design
iterations are simply a matter of changing the programming file, and the results of
design changes can be seen immediately in working parts.
 PLDs do not require long lead times for prototypes or production parts; the PLDs are
already on a distributor’s shelf and ready for shipment.
 PLDs do not require customers to pay for large NRE costs and purchase expensive
mask sets. PLD suppliers incur those costs when they design their programmable
devices and are able to amortize those costs over the multi-year lifespan of a given
line of PLDs.
 PLDs allow customers to order just the number of parts they need, when they need
them, allowing them to control inventory. Customers who use fixed logic devices
often end up with excess inventory which must be scrapped, or if demand for their
product surges, they may be caught short of parts and face production delays.
 PLDs can be reprogrammed even after a piece of equipment is shipped to a customer.
In fact, thanks to programmable logic devices, a number of equipment
manufacturers now tout the ability to add new features or upgrade products that
already are in the field. To do this, they simply upload a new programming file to the
PLD, via the Internet, creating new hardware logic in the system.

Over the last few years programmable logic suppliers have made such phenomenal technical
advances that PLDs are now seen as the logic solution of choice by many designers. One
reason for this is that PLD suppliers such as Xilinx are “fabless” companies; instead of
owning chip manufacturing foundries, Xilinx outsources that job to partners like Toshiba
and UMC, whose chief occupation is making chips. This strategy allows Xilinx to focus on
designing new product architectures, software tools and intellectual property cores while
having access to the most advanced semiconductor process technologies. Advanced process
technologies help PLDs in a number of key areas: faster performance, integration of more
features, reduced power consumption and lower cost.

FPGAs are especially popular for prototyping ASIC designs where the designer can
test his design by downloading the design file into an FPGA device. Once the design is set,
hardwired chips are produced for faster performance.

Just a few years ago, for example, the largest FPGA was measured in tens of
thousands of system gates and operated at 40 MHz. Older FPGAs also were relatively
expensive, costing often more than $150 for the most advanced parts at the time. Today,
however, FPGAs with advanced features offer millions of gates of logic capacity, operate at
300 MHz, can cost less than $10, and offer a new level of integrated functions such as
processors and memory.

2.1.3.1 CPLDs and FPGAs

The two major types of programmable logic devices are Field Programmable Gate
Arrays (FPGAs) and Complex Programmable Logic Devices (CPLDs). Of the two, FPGAs offer
the highest amount of logic density, the most features, and the highest performance. The
largest FPGA now shipping, part of the Xilinx Virtex™ line of devices, provides eight million
“system gates” (the relative density of logic). These advanced devices also offer features such
as built-in hardwired processors (such as the IBM PowerPC), substantial amounts of
memory, clock management systems, and support for many of the latest, very fast device-
to-device signaling technologies. FPGAs are used in a wide variety of applications ranging
from data processing and storage, to instrumentation, telecommunications, and digital
signal processing.

CPLDs, by contrast, offer much smaller amounts of logic, up to about 10,000 gates. But
CPLDs offer very predictable timing characteristics and are therefore ideal for critical control
applications. CPLDs such as the Xilinx CoolRunner™ series also require extremely low amounts
of power and are very inexpensive, making them ideal for cost-sensitive, battery-operated,
portable applications such as mobile phones and digital handheld assistants.

2.1.4 Commercial Off-the-Shelf Components (COTS)

A Commercial Off-the-Shelf (COTS) product is one which is used ‘as-is’. COTS products are
designed in such a way to provide easy integration and interoperability with existing system
components. The COTS component itself may be developed around a general purpose or
domain specific processor or an Application Specific Integrated circuit or a programmable
logic device. Typical examples of COTS hardware unit are remote controlled toy car control
units including the circuitry part, high performance, high frequency microwave electronics
(2-200 GHz), high bandwidth analog-to-digital converters, devices and components for
operation at very high temperatures, electro-optic IR imaging arrays, UV/IR detectors, etc.
The major advantage of using COTS is that they are readily available in the market, are
cheap and a developer can cut down his/her development time to a great extent. This in turn
reduces the time to market of your embedded systems.

The TCP/IP plug-in modules available from various manufacturers like ‘WIZnet’,
‘Freescale’, ‘Dynalog’, etc. are very good examples of COTS products. This network plug-in
module gives TCP/IP connectivity to the system you are developing. There is no need
to design this module yourself and write the firmware for the TCP/IP protocol
and data transfer. Everything will be readily supplied by the COTS manufacturer. What you
need to do is identify the COTS for your system and give the plug-in option on your board
according to the hardware COTS. Though multiple vendors supply COTS for the same
application, the major problem faced by the end user is that there are no operational and
manufacturing standards. A Commercial Off-the-Shelf (COTS) component manufactured by
one vendor need not have hardware plug-in and firmware interface compatibility with one
manufactured by a second vendor for the same application. This forces the end user to
stick to a particular vendor for a particular COTS. This greatly affects the product design.

The major drawback of using COTS components in embedded design is that the
manufacturer of the COTS component may withdraw the product or discontinue the
production of the COTS at any time if a rapid change in technology occurs, and this will
adversely affect a commercial manufacturer of the embedded system which makes use of
the specific COTS product.

2.2 MEMORY

Memory is an important part of a processor/controller based embedded system. Some of
the processors/controllers contain built-in memory and this memory is referred to as on-chip
memory. Others do not contain any memory inside the chip and require external memory
to be connected with the controller/processor to store the control algorithm. It is called
off-chip memory. Also some working memory is required for holding data temporarily during
certain operations. This section deals with the different types of memory used in embedded
system applications.

2.2.1 Program Storage Memory (ROM)

The program memory or code storage of an embedded system stores the program
instructions and it can be classified into different types as per the block diagram
representation given in Fig.

Fig. Classification of Program Memory (ROM): Code Memory (ROM) is divided into FLASH,
NVRAM, PROM (OTP), Masked ROM (MROM), EPROM and EEPROM

The code memory retains its contents even after the power to it is turned off. It is
generally known as non-volatile storage memory. Depending on the fabrication, erasing and
programming techniques they are classified into the following types.

2.2.1.1 Masked ROM (MROM)

Masked ROM is a one-time programmable device. Masked ROM makes use of the hardwired
technology for storing data. The device is factory programmed by masking and metallization
process at the time of production itself, according to the data provided by the end user. The
primary advantage of this is low cost for high volume production. They are the least
expensive type of solid state memory. Different mechanisms are used for the masking
process of the ROM, like:

1. Creation of an enhancement or depletion mode transistor through channel implant.

2. By creating the memory cell either using a standard transistor or a high threshold
transistor. In the high threshold mode, the supply voltage required to turn ON the transistor
is above the normal ROM IC operating voltage. This ensures that the transistor is always off
and the memory cell always stores logic 0.

Masked ROM is a good candidate for storing the embedded firmware for low cost embedded
devices. Once the design is proven and the firmware requirements are tested and frozen, the
binary data (the firmware cross compiled/assembled to target processor specific machine
code) corresponding to it can be given to the MROM fabricator. The limitation with MROM
based firmware storage is the inability to modify the device firmware for firmware
upgrades. Since the MROM is permanent in bit storage, it is not possible to alter the bit
information.

2.2.1.2 Programmable Read Only Memory (PROM) / (OTP)

Unlike Masked ROM, One Time Programmable Memory (OTP) or PROM is not pre-programmed
by the manufacturer. The end user is responsible for programming these
devices. This memory has nichrome or polysilicon wires arranged in a matrix. These wires can
be functionally viewed as fuses. It is programmed by a PROM programmer which selectively
burns the fuses according to the bit pattern to be stored. Fuses which are not blown/burned
represent a logic “1” whereas fuses which are blown/burned represent a logic “0”. The
default state is logic “1”. OTP is widely used for commercial production of embedded systems
whose prototyped versions are proven and the code is finalized. It is a low cost solution for
commercial production. OTPs cannot be reprogrammed.

2.2.1.3 Erasable Programmable Read Only Memory (EPROM)

OTPs are not suitable for development purposes. During the development phase the
code is subject to continuous changes and using an OTP each time to load the code is not
economical. Erasable Programmable Read Only Memory (EPROM) gives the flexibility to
re-program the same chip. EPROM stores the bit information by charging the floating gate of
an FET. Bit information is stored using an EPROM programmer, which applies high voltage
to charge the floating gate. EPROM contains a quartz crystal window for erasing the stored
information. If the window is exposed to ultraviolet rays for a fixed duration, the entire
memory will be erased. Even though the EPROM chip is flexible in terms of re-
programmability, it needs to be taken out of the circuit board and put in a UV eraser device
for 20 to 30 minutes. So it is a tedious and time-consuming process.

2.2.1.4 Electrically Erasable Programmable Read Only Memory (EEPROM)

As the name indicates, the information contained in the EEPROM memory can be
altered by using electrical signals at the register/byte level. They can be erased and
reprogrammed in-circuit. These chips include a chip erase mode and in this mode they can
be erased in a few milliseconds. It provides greater flexibility for system design. The only
limitation is that their capacity is limited (a few kilobytes) when compared with the
standard ROM.

2.2.1.5 FLASH

FLASH is the latest ROM technology and is the most popular ROM technology used
in today’s embedded designs. FLASH memory is a variation of EEPROM technology. It
combines the re-programmability of EEPROM and the high capacity of standard ROM.
FLASH memory is organized as sectors (blocks) or pages. FLASH memory stores information
in an array of floating gate MOSFET transistors. The erasing of memory can be done at
sector level or page level without affecting the other sectors or pages. Each sector/page
should be erased before re-programming. The typical erasable capacity of FLASH is 1000
cycles. W27C512 from WINBOND (www.winbond.com) is an example of 64KB FLASH
memory.
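The erase-before-reprogram rule can be illustrated with a small software model. This is only a sketch: the class name `FlashModel`, the sector count and the sector size are hypothetical, and it assumes the usual flash behaviour that an erased byte reads as 0xFF and that programming can only change bits from 1 to 0.

```python
ERASED = 0xFF  # assumption: an erased flash byte reads as all 1s


class FlashModel:
    """Toy model of a sector-organized flash memory (not a device driver)."""

    def __init__(self, sectors=4, sector_size=16):
        self.sector_size = sector_size
        self.data = bytearray([ERASED] * (sectors * sector_size))
        self.erase_counts = [0] * sectors  # erase cycles used per sector

    def erase_sector(self, sector):
        """Erase one sector; all other sectors are left untouched."""
        start = sector * self.sector_size
        self.data[start:start + self.sector_size] = bytes([ERASED]) * self.sector_size
        self.erase_counts[sector] += 1

    def program(self, address, value):
        """Programming can only clear bits (1 -> 0), so old bits are ANDed in."""
        self.data[address] &= value


flash = FlashModel()
flash.program(0, 0x5A)
first = flash.data[0]     # erased byte, so the value is stored as written

flash.program(0, 0xA5)    # re-programming WITHOUT erasing first
second = flash.data[0]    # 0x5A & 0xA5, not the intended 0xA5

flash.erase_sector(0)     # erase the sector, then program again
flash.program(0, 0xA5)
third = flash.data[0]
```

Counting erases per sector, as `erase_counts` does here, is one simple way to keep track of how much of the limited erase capacity has been used.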

2.2.1.6 NVRAM

Non-volatile RAM is a random access memory with battery backup. It contains static
RAM based memory and a minute battery for providing supply to the memory in the absence
of external power supply. The memory and battery are packed together in a single package.
The life span of NVRAM is expected to be around 10 years. DS1644 from Maxim/Dallas is
an example of 32KB NVRAM.

2.2.2 Read-Write Memory/ Random Access Memory (RAM)

RAM is the data memory or working memory of the controller/processor. The
controller/processor can read from it and write to it. RAM is volatile, meaning when the
power is turned off, all the contents are destroyed. RAM is a direct access memory, meaning
we can access the desired memory location directly, without the need for traversing through
the entire memory to reach the desired memory position (i.e. random access of
memory location). This is in contrast to the Sequential Access Memory (SAM), where the
desired memory location is accessed by either traversing through the entire memory or
through a ‘seek’ method. Magnetic tapes, CD-ROMs, etc. are examples of sequential access
memories. RAM generally falls into three categories: Static RAM (SRAM), Dynamic RAM
(DRAM) and Non-Volatile RAM (NVRAM).

Fig. Classification of working memory (RAM): Read/Write Memory (RAM) is divided into
SRAM, DRAM and NVRAM

2.2.2.1 Static RAM (SRAM)

Static RAM stores data in the form of voltage. They are made up of flip-flops. Static RAM is
the fastest form of RAM available. In a typical implementation, an SRAM cell (bit) is realized
using six transistors (or 6 MOSFETs). Four of the transistors are used for building the latch
(flip-flop) part of the memory cell and two for controlling the access. SRAM is fast in
operation due to its resistive networking and switching capabilities. In its simplest
representation an SRAM cell can be visualized as shown in Fig:

Fig. SRAM cell implementation (transistors Q1 to Q6, Bit Line B, Bit Line B\, Word Line, Vcc)

This implementation in its simpler form can be visualized as two cross coupled
inverters with read/ write control through transistors. The four transistors in the middle
form the cross-coupled inverters. This can be visualized as shown in Fig.

From the SRAM implementation diagram, it is clear that access to the memory cell is
controlled by the Word Line, which controls the access transistors (MOSFETs) Q5 and
Q6. The access transistors control the connection to bit lines B and B\. In order to write a
value to the memory cell, apply the desired value to the bit control lines (for writing 1,
make B = 1 and B\ = 0; for writing 0, make B = 0 and B\ = 1) and assert the Word Line (make
the Word Line high). This operation latches the bit written in the flip-flop. For reading the
content of the memory cell, assert both B and B\ bit lines to 1 and set the Word Line to 1.
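The write and read sequence described above can be sketched as a purely logical model. The class name `SramCell` and its methods are hypothetical, and the model works with logic levels only; it ignores all electrical detail such as transistor sizing and sense amplifiers.

```python
class SramCell:
    """Logic-level model of a 6-transistor SRAM cell (illustrative only)."""

    def __init__(self):
        # q is the output of one inverter of the cross coupled pair;
        # the other inverter always holds the complement of q.
        self.q = 0

    def write(self, b, b_bar, word_line):
        """Drive B and its complement, then assert the Word Line to latch."""
        if word_line == 1 and b != b_bar:
            self.q = b

    def read(self, word_line):
        """Both bit lines start at 1; the side storing 0 pulls its line low."""
        b, b_bar = 1, 1
        if word_line == 1:
            b, b_bar = self.q, 1 - self.q
        return b, b_bar


cell = SramCell()
cell.write(b=1, b_bar=0, word_line=1)   # write logic 1
after_write_1 = cell.read(word_line=1)

cell.write(b=0, b_bar=1, word_line=0)   # Word Line low: cell is not accessed
unchanged = cell.read(word_line=1)

cell.write(b=0, b_bar=1, word_line=1)   # write logic 0
after_write_0 = cell.read(word_line=1)
```

With the Word Line low the access transistors are off, so the bit lines simply stay at the level to which they were set.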

The major limitations of SRAM are low capacity and high cost. Since a minimum of
six transistors are required to build a single memory cell, imagine how many memory cells
we can fabricate on a silicon wafer.

2.2.2.2 Dynamic RAM (DRAM)

Dynamic RAM stores data in the form of charge. They are made up of MOS transistor
gates. The advantages of DRAM are its high density and low cost compared to SRAM. The
disadvantage is that since the information is stored as charge it gets leaked off with time
and to prevent this they need to be refreshed periodically. Special circuits called DRAM
controllers are used for the refreshing operation. The refresh operation is done periodically
at intervals of milliseconds. Figure illustrates the typical implementation of a DRAM cell.

Fig. DRAM cell implementation (a MOSFET, controlled by the Word Line, connects the storage
capacitor to Bit Line B)
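Why a DRAM cell needs periodic refreshing can be shown with a rough numerical sketch. The leak rate (5% of the charge per millisecond), the read threshold (half of full charge) and all names below are assumptions chosen only for illustration; they are not figures for any real device.

```python
class DramCell:
    """Toy model: the bit is stored as charge that leaks away with time."""

    def __init__(self, leak_per_ms=0.05, threshold=0.5):
        self.leak_per_ms = leak_per_ms  # assumed fraction of charge lost per ms
        self.threshold = threshold      # assumed level above which a 1 is read
        self.charge = 0.0

    def write(self, bit):
        self.charge = 1.0 if bit else 0.0

    def tick(self, ms=1):
        """Let the stored charge leak for the given number of milliseconds."""
        self.charge *= (1.0 - self.leak_per_ms) ** ms

    def read(self):
        return 1 if self.charge > self.threshold else 0

    def refresh(self):
        """Read the cell and write the same value back at full strength."""
        self.write(self.read())


def hold(bit, total_ms, refresh_every_ms=None):
    """Store a bit, wait total_ms, and return what is read back."""
    cell = DramCell()
    cell.write(bit)
    for elapsed in range(1, total_ms + 1):
        cell.tick(1)
        if refresh_every_ms and elapsed % refresh_every_ms == 0:
            cell.refresh()
    return cell.read()


lost = hold(1, total_ms=24)                      # no refresh: the 1 decays
kept = hold(1, total_ms=24, refresh_every_ms=8)  # refreshed in time
```

In a real system this periodic read and write-back is the job of the DRAM controller mentioned above.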

The MOSFET acts as the gate for the incoming and outgoing data whereas the
capacitor acts as the bit storage unit. Table given below summarizes the relative merits and
demerits of SRAM and DRAM technology.

SRAM Cell                                         | DRAM Cell
Made up of 6 CMOS transistors (MOSFET)            | Made up of a MOSFET and a capacitor
Doesn’t require refreshing                        | Requires refreshing
Low capacity (less dense)                         | High capacity (highly dense)
More expensive                                    | Less expensive
Fast in operation. Typical access time is 10ns    | Slow in operation due to refresh requirements. Typical access time is 60ns. Write operation is faster than read operation.

2.2.2.3 NVRAM

Non-volatile RAM is a random access memory with battery backup. It contains static
RAM based memory and a minute battery for providing supply to the memory in the absence
of external power supply. The memory and battery are packed together in a single package.
NVRAM is used for the non-volatile storage of results of operations or for setting up of flags,
etc. The life span of NVRAM is expected to be around 10 years. DS1744 from Maxim/Dallas
is an example of 32KB NVRAM.

2.3 SENSORS AND ACTUATORS


At the very beginning of this chapter it was mentioned that an embedded system
is in constant interaction with the real world and the controlling/monitoring functions
executed by the embedded system are achieved in accordance with the changes happening
to the real world. The changes in system environment or variables are detected by the
sensors connected to the input port of the embedded system. If the embedded system is
designed for any controlling purpose, the system will produce some changes in the
controlling variable to bring the controlled variable to the desired value. It is achieved
through an actuator connected to the output port of the embedded system. If the embedded
system is designed for monitoring purpose only, then there is no need for including an
actuator in the system. For example, take the case of an ECG machine. It is designed to
monitor the heart beat status of a patient and it cannot impose a control over the patient’s
heart beat and its order. The sensors used here are the different electrode sets connected to
the body of the patient. The variations are captured and presented to the user (may be a
doctor) through a visual display or some printed chart.

2.3.1 Sensors

Sensor is a transducer device that converts energy from one form to another for any
measurement or control purpose. This is what I “by-hearted” during my engineering degree
from the transducers paper. If we look back to the “Smart” running shoe example given at
the end of Chapter 1, we can identify that the sensor which measures the distance between
the cushion and magnet in the smart running shoe is a magnetic Hall Effect sensor.

2.3.2 Actuators
Actuator is a form of transducer device (mechanical or electrical) which converts
signals to corresponding physical action (motion). Actuator acts as an output device.
Looking back to the “Smart” running shoe example given at the end of Chapter 1, we can
see that the actuator used for adjusting the position of the cushioning element is a micro
stepper motor.

2.3.3 The I/O Subsystem


I/O subsystem of the embedded system facilitates the interaction of the embedded
system with the external world. As mentioned earlier the interaction happens through the
sensors and actuators connected to the input and output ports respectively of the
embedded system. The sensors may not be directly interfaced to the input ports, instead
they may be interfaced through signal conditioning and translating systems like ADC,
optocouplers, etc. This section illustrates some of the sensors and actuators used in
embedded systems and the I/O systems to facilitate the interaction of embedded systems
with external world.

2.3.3.1 Light Emitting Diode (LED)

Light Emitting Diode (LED) is an important output device for visual indication in any
embedded system. LED can be used as an indicator for the status of various signals or
situations. Typical examples are indicating the presence of power conditions like ‘Device ON’,
‘Battery low’ or ‘Charging of battery’ for battery operated handheld embedded devices.

Light Emitting Diode is a p-n junction diode (refer to Analog Electronics fundamentals
to refresh your memory of the p-n junction diode) and it contains an anode and a cathode. For
proper functioning of the LED, the anode of it should be connected to the +ve terminal of the
supply voltage and the cathode to the -ve terminal of the supply voltage. The current flowing
through the LED must be limited to a value below the maximum current that it can conduct.
A resistor is used in series between the power supply and the LED to limit the current
through the LED. The ideal LED interfacing circuit is shown in Figure.

Fig. LED interfacing (Vcc, current limiting resistor R, LED, GND)
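The value of the series resistor follows from Ohm's law applied to the voltage left over after the LED's forward drop: R = (Vcc - Vf) / I. The short sketch below works this out; the function name and the example figures (5 V supply, 2 V forward drop, 10 mA) are assumptions for illustration, and the real values should be taken from the LED's data sheet.

```python
def led_series_resistor(supply_v, forward_v, current_a):
    """Return the series resistance in ohms: R = (Vcc - Vf) / I."""
    if current_a <= 0:
        raise ValueError("LED current must be positive")
    if supply_v <= forward_v:
        raise ValueError("supply voltage must exceed the LED forward voltage")
    return (supply_v - forward_v) / current_a


# Assumed example values: 5 V supply, 2 V forward drop, 10 mA LED current
resistance = led_series_resistor(5.0, 2.0, 0.010)  # about 300 ohms
```

In practice the next higher standard resistor value is normally chosen, so that the current stays at or below the intended value.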

LEDs can be interfaced to the port pin of a processor/controller in two ways. In the
first method, the anode is directly connected to the port pin and the port pin drives the LED.
In this approach the port pin ‘sources’ current to the LED when the port pin is at logic High
(Logic ‘1’). In the second method, the cathode of the LED is connected to the port pin of the
processor/controller and the anode to the supply voltage through a current limiting resistor.
The LED is turned on when the port pin is at logic Low (Logic ‘0’). Here the port pin ‘sinks’
current. If the LED is directly connected to the port pin, depending on the maximum current
that a port pin can source, the brightness of the LED may not be to the required level. In the
second approach, the current is directly sourced by the power supply and the port pin acts
as sink for current. Here we will get the required brightness for the LED.
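The two interfacing methods differ only in which logic level lights the LED. A minimal sketch of that relationship is given below; the function name and the labels "source" and "sink" are hypothetical names used only in this example.

```python
def led_is_on(wiring, pin_level):
    """Return True if the LED is lit for the given wiring and port pin level.

    wiring    -- "source": anode on the port pin, the pin drives the LED
                 "sink":   cathode on the port pin, anode to the supply
                           through a current limiting resistor
    pin_level -- logic level written to the port pin (0 or 1)
    """
    if wiring == "source":
        return pin_level == 1   # pin sources current at logic High
    if wiring == "sink":
        return pin_level == 0   # pin sinks current at logic Low
    raise ValueError("wiring must be 'source' or 'sink'")


sourcing_on = led_is_on("source", 1)
sinking_on = led_is_on("sink", 0)
```

Firmware written for the sink arrangement therefore has to treat the LED as an active-low output.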

2.3.3.2 7-Segment LED Display


The 7-segment LED display is an output device for displaying alphanumeric characters. It
contains 8 light-emitting diode (LED) segments arranged in a special form. Out of the 8 LED
segments, 7 are used for displaying alphanumeric characters and 1 is used for representing
the ‘decimal point’ in decimal number display. Figure explains the arrangement of LED
segments in a 7-segment LED display.

The LED segments are named A to G and the decimal point LED segment is named DP.
The LED segments A to G and DP should be lit accordingly to display numbers and
characters. For example, for displaying the number 4, the segments F, G, B and C are lit. For
displaying 3, the segments A, B, C, D and G are lit, with DP lit in addition only if a decimal
point is to be shown after the digit. For displaying the character ‘d’, the
segments B, C, D, E and G are lit. All these 8 LED segments need to be connected to one
port of the processor/controller for displaying alphanumeric digits. The 7-segment LED
displays are available in two different configurations, namely Common Anode and Common
Cathode. In the common anode configuration, the anodes of the 8 segments are connected
commonly whereas in the common cathode configuration, the 8 LED segments share a
common cathode line. Figure 2.15 illustrates the Common Anode and Cathode
configurations.

Based on the configuration of the 7-segment LED unit, the LED segment’s anode or
cathode is connected to the port of the processor/controller in the order ‘A’ segment to the
least significant port pin and DP segment to the most significant port pin.

Fig. 2.15 Common Anode and Common Cathode configurations of a 7-segment LED display
(segments DP, G, F, E, D, C, B, A)
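With segment ‘A’ on the least significant port pin and DP on the most significant port pin, the byte to be written to the port for a character can be worked out as in the sketch below. The names `SEGMENT_BIT` and `port_value` are hypothetical. The sketch assumes that in the common cathode configuration a segment is lit by logic 1 on its port pin and in the common anode configuration by logic 0.

```python
# Bit position of each segment: A on the least significant pin, DP on the most
SEGMENT_BIT = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4, "F": 5, "G": 6, "DP": 7}


def port_value(segments, common_anode=False):
    """Return the port byte that lights exactly the given segments."""
    value = 0
    for name in segments:
        value |= 1 << SEGMENT_BIT[name]
    if common_anode:
        value ^= 0xFF   # a segment is lit by logic 0, so invert every bit
    return value


four_cc = port_value(["F", "G", "B", "C"])                     # digit 4
four_ca = port_value(["F", "G", "B", "C"], common_anode=True)
three_cc = port_value(["A", "B", "C", "D", "G"])               # digit 3
```

A lookup table of such bytes, one per character, is the usual way of driving the display from firmware.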


The current flow through each of the LED segments should be limited to the
maximum value supported by the LED display unit. The typical value for the current is
around 20mA. The current through each segment can be limited by connecting
a current limiting resistor to the anode or cathode of each segment. The value for the current
limiting resistors can be calculated using the current value from the electrical parameter
listing of the LED display.

For common cathode configurations, the anode of each LED segment is connected to
the port pins of the port to which the display is interfaced. The anode of the common anode
LED display is connected to the 5V supply voltage through a current limiting resistor and
the cathode of each LED segment is connected to the respective port pin line. For an LED
segment to light in the common anode LED configuration, the port pin to which the cathode
of the LED segment is connected should be set at logic 0. The 7-segment LED display is a
popular choice for low cost embedded applications like public telephone call
monitoring devices, point of sale terminals, etc.

2.3.3.3 Opto-coupler

Opto-coupler is a solid state device to isolate two parts of a circuit. An optocoupler
combines an LED and a photo-transistor in a single housing (package).

In electronic circuits, an optocoupler is used for suppressing interference in data
communication, circuit isolation, high voltage separation, simultaneous separation and
signal intensification, etc. Optocouplers can be used in either input circuits or in output
circuits. Figure illustrates the usage of an optocoupler in the input circuit and output circuit
of an embedded system with a microcontroller as the system core. Optocouplers are available
as ICs from different semiconductor manufacturers. The MCT2M IC from Fairchild Semiconductor
is an example of an optocoupler IC.

The figure ‘Optocoupler device’ illustrates the functioning of an optocoupler device.

Fig. Opto-coupler in input and output circuit: an opto-coupler IC (MCT2M, containing an LED
and a photo-transistor) on the I/p interface and another on the O/p interface, each connected
to a port pin of the AT89C51 microcontroller

2.3.3.5 Relay

Relay is an electro-mechanical device. In embedded applications, the ‘Relay’ unit acts
as a dynamic path selector for signals and power. The ‘Relay’ unit contains a relay coil made
up of insulated wire on a metal core and a metal armature with one or more contacts.

‘Relay’ works on electromagnetic principle. When a voltage is applied to the relay coil,
current flows through the coil, which in turn generates a magnetic field. The magnetic field
attracts the armature core and moves the contact point. The movement of the contact point
changes the power/signal flow path. ‘Relays’ are available in different configurations. Figure
given below illustrates the widely used relay configurations for embedded applications.

Fig. Relay configurations: Single Pole Single Throw Normally Open, Single Pole Single Throw
Normally Closed, and Single Pole Double Throw (each with its relay coil)
The Single Pole Single Throw configuration has only one path for information flow.
The path is either open or closed in normal condition. For normally Open Single Pole Single
Throw relay, the circuit is normally open and it becomes closed when the relay is energized.
For normally closed Single Pole Single Throw configuration, the circuit is normally closed
and it becomes open when the relay is energized. For Single Pole Double Throw Relay, there
are two paths for information flow and they are selected by energizing or de-energizing the
relay.
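The behaviour of the three configurations can be summarised in a few lines of code. This is only a logical sketch: the configuration labels and the function name are hypothetical, and for the Single Pole Double Throw relay it assumes the common convention that the pole rests on the normally closed contact and moves to the normally open contact when the coil is energized.

```python
def relay_path(configuration, energized):
    """Return the state of the signal path for a relay configuration.

    configuration -- "SPST-NO", "SPST-NC" or "SPDT" (labels used only here)
    energized     -- True when voltage is applied to the relay coil
    """
    if configuration == "SPST-NO":
        return "closed" if energized else "open"
    if configuration == "SPST-NC":
        return "open" if energized else "closed"
    if configuration == "SPDT":
        # Assumed convention: the pole rests on the normally closed contact
        return "normally open contact" if energized else "normally closed contact"
    raise ValueError("unknown relay configuration")


no_relay_energized = relay_path("SPST-NO", True)
nc_relay_energized = relay_path("SPST-NC", True)
spdt_relay_idle = relay_path("SPDT", False)
```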

The Relay is normally controlled using a relay driver circuit connected to the port pin
of the processor/controller. A transistor is used for building the relay driver circuit. Figure
illustrates the same.

Fig. Transistor based relay driving circuit (Vcc, freewheeling diode across the relay coil,
transistor driven by the port pin, relay unit switching the load)


A free-wheeling diode is used for free-wheeling the voltage produced in the opposite
direction when the relay coil is de-energized. The freewheeling diode is essential for
protecting the relay and the transistor.

Most of the industrial relays are bulky and require high voltage to operate. Special
relays called ‘Reed’ relays are available for embedded application requiring switching of low
voltage DC signals.

2.3.3.6 Piezo Buzzer

Piezo buzzer is a piezoelectric device for generating audio indications in embedded
applications. A piezoelectric buzzer contains a piezoelectric diaphragm which produces
audible sound in response to the voltage applied to it. Piezoelectric buzzers are available in
two types: ‘Self-driving’ and ‘External driving’. The ‘Self-driving’ circuit contains all the
necessary components to generate sound at a predefined tone. It will generate a tone on
applying the voltage. External driving piezo buzzers support the generation of different
tones. The tone can be varied by applying a variable pulse train to the piezoelectric buzzer.
A piezo buzzer can be directly interfaced to the port pin of the processor/controller. Depending
on the driving current requirements, the piezo buzzer can also be interfaced using a
transistor based driver circuit as in the case of a ‘Relay’.
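For an external driving buzzer, the tone is set by the frequency of the pulse train applied to it. The sketch below assumes a square wave with equal HIGH and LOW times and computes how long the port pin must be held at each level; the function names are hypothetical and the result is only a description of the waveform, not code that toggles a real pin.

```python
def half_period_us(tone_hz):
    """Time in microseconds to hold each level for a square wave of tone_hz."""
    if tone_hz <= 0:
        raise ValueError("tone frequency must be positive")
    return round(1_000_000 / (2 * tone_hz))


def pulse_train(tone_hz, cycles):
    """Return a list of (pin_level, duration_us) pairs for the given cycles."""
    half = half_period_us(tone_hz)
    return [(level, half) for _ in range(cycles) for level in (1, 0)]


one_khz_half_period = half_period_us(1000)   # 500 microseconds per level
two_cycles = pulse_train(2000, cycles=2)
```

Changing `tone_hz` between calls gives the variable pulse train needed for producing different tones.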

2.3.3.7 Push Button Switch

It is an input device. Push button switch comes in two configurations, namely ‘Push
to Make’ and ‘Push to Break’. In the ‘Push to Make’ configuration, the switch is normally
in the open state and it makes a circuit contact when it is pushed or pressed. In the ‘Push
to Break’ configuration, the switch is normally in the closed state and it breaks the circuit
contact when it is pushed or pressed. The push button stays in the ‘closed’ (For Push to
Make type) or ‘open’ (For Push to Break type) state as long as it is kept in the pushed state
and it breaks/makes the circuit connection when it is released.

The Push button is normally connected to the port pin of the host processor/controller.
Depending on the way in which the push button is interfaced to the controller, it can
generate either a ‘HIGH’ pulse or a ‘LOW’ pulse. Figure illustrates how the push button
can be used for generating ‘LOW’ and ‘HIGH’ pulses.
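One common way of obtaining the two kinds of pulses, assumed here because the figure itself is not reproduced, is to use a pull-up resistor with the switch connected to ground for a ‘LOW’ pulse and a pull-down resistor with the switch connected to the supply for a ‘HIGH’ pulse. The function names and labels in the sketch are hypothetical.

```python
def switch_is_closed(kind, pushed):
    """Contact state of the switch for the two push button configurations."""
    if kind == "push_to_make":
        return pushed          # normally open, closes while pushed
    if kind == "push_to_break":
        return not pushed      # normally closed, opens while pushed
    raise ValueError("kind must be 'push_to_make' or 'push_to_break'")


def port_pin_level(pull, closed):
    """Logic level seen at the port pin.

    pull -- "up":   resistor to the supply, switch connects the pin to ground
            "down": resistor to ground, switch connects the pin to the supply
    """
    if pull == "up":
        return 0 if closed else 1
    if pull == "down":
        return 1 if closed else 0
    raise ValueError("pull must be 'up' or 'down'")


# Push to Make button with a pull-up resistor: pressing gives a LOW pulse
idle_level = port_pin_level("up", switch_is_closed("push_to_make", False))
pressed_level = port_pin_level("up", switch_is_closed("push_to_make", True))
```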

2.4 COMMUNICATION INTERFACE
Communication interface is essential for communicating with various subsystems of
the embedded system and with the external world. For an embedded product, the
communication interface can be viewed from two different perspectives, namely the
device/board level communication interface (On-board Communication Interface) and the
product level communication interface (External Communication Interface). An embedded
product is a combination of different types of components (chips/devices) arranged on a
printed circuit board (PCB). The communication channel which interconnects the various
components within an embedded product is referred to as the device/board level communication
interface (on-board communication interface). Serial interfaces like I2C, SPI, UART, 1-Wire,
etc. and the parallel bus interface are examples of On-board Communication Interface.

Some embedded systems are self-contained units and they don’t require any
interaction and data transfer with other sub-systems or the external world. On the other hand,
certain embedded systems may be a part of a large distributed system and they require
interaction and data transfer between various devices and sub-modules. The product level
communication interface (external communication interface) can use either a wired or a
wireless medium and it can be a serial or a parallel interface. Infrared (IR),
Bluetooth (BT), Wireless LAN (Wi-Fi), Radio Frequency waves (RF), GPRS, etc. are examples
of wireless communication interfaces. RS-232C/RS-422/RS-485, USB, Ethernet, IEEE 1394
port, Parallel port, CF-II interface, SDIO, PCMCIA, etc. are examples of wired interfaces. It
is not mandatory that an embedded system should contain an external communication
interface. Mobile communication equipment is an example of an embedded system with
an external communication interface.

The following section gives you an overview of the various ‘On-board’ and ‘External’
communication interfaces for an embedded product. We will discuss the various
physical interfaces, firmware requirements, and initialization and communication sequences
for these interfaces in a dedicated book titled Device Interfacing, which is planned under
this series.

2.4.1 On-board Communication Interfaces

On-board Communication Interface refers to the different communication
channels/buses for interconnecting the various integrated circuits and other peripherals
within the embedded system. The following section gives an overview of the various
interfaces for on-board communication.

2.4.1.1 Inter Integrated Circuit (I2C) Bus

The Inter Integrated Circuit Bus (I2C, pronounced ‘I square C’) is a synchronous
bi-directional half duplex (one-directional communication at a given point of time) two wire
serial interface bus. The concept of the I2C bus was developed by ‘Philips Semiconductors’ in
the early 1980s. The original intention of I2C was to provide an easy way of connection
between a microprocessor/microcontroller system and the peripheral chips in television
sets. The I2C bus comprises two bus lines, namely Serial Clock (SCL) and Serial Data (SDA).
The SCL line is responsible for generating synchronization clock pulses and SDA is responsible
for transmitting the serial data across devices. The I2C bus is a shared bus system to which
any number of I2C devices can be connected. Devices connected to the I2C bus can act
as either ‘Master’ device or ‘Slave’ device. The ‘Master’ device is responsible for controlling
the communication by initiating/terminating data transfer, sending data and generating
the necessary synchronization clock pulses. ‘Slave’ devices wait for the commands from the
master and respond upon receiving the commands. ‘Master’ and ‘Slave’ devices can act as
either transmitter or receiver. Regardless of whether a master is acting as transmitter or
receiver, the synchronization clock signal is generated by the ‘Master’ device only. I2C
supports multiple masters on the same bus.

The following bus interface diagram shown in Fig. illustrates the connection of master and
slave devices on the I2C bus.

Fig. I2C Bus Interfacing: the port pins of the master (microprocessor/controller) and the
slave I2C devices (Slave 1, e.g. serial EEPROM, and Slave 2) share the SCL and SDA lines of
the I2C bus, each line pulled up to Vcc through a 2.2K resistor

The I2C bus interface is built around an input buffer and an open drain or open collector
transistor. When the bus is in the idle state, the open drain/collector transistor will be in
the floating state and the output lines (SDA and SCL) switch to the ‘High Impedance’ state.
For proper operation of the bus, both the bus lines should be pulled to the supply voltage
(+5V for TTL family and +3.3V for CMOS family devices) using pull-up resistors. The typical
value of resistors used in pull-up is 2.2K. With pull-up resistors, the output lines of the bus
in the idle state will be ‘HIGH’.

The address of an I2C device is assigned by hardwiring the address lines of the device
to the desired logic level. The addresses of the various I2C devices in an embedded device are
assigned and hardwired at the time of designing the embedded hardware. The sequence of
operations for communicating with an I2C slave device is listed below:

1. The master device pulls the clock line (SCL) of the bus to ‘HIGH’

2. The master device pulls the data line (SDA) ‘LOW’, when the SCL line is at logic ‘HIGH’
(This is the ‘Start’ condition for data transfer)

3. The master device sends the address (7 bit or 10 bit wide) of the ‘slave’ device to which it
wants to communicate, over the SDA line. Clock pulses are generated at the SCL line for
synchronizing the bit reception by the slave device. The MSB of the data is always
transmitted first. The data in the bus is valid during the ‘HIGH’ period of the clock signal

4. The master device sends the Read or Write bit (Bit value = 1: Read operation; Bit value = 0:
Write operation) according to the requirement.

5. The master device waits for the acknowledgement bit from the slave device whose address
is sent on the bus along with the Read/Write operation command.

6. Slave devices connected to the bus compare the address received with the address
assigned to them. The slave device with the address requested by the master device
responds by sending an acknowledge bit (SDA pulled ‘LOW’, Bit value = 0) over the SDA line.

7. Upon receiving the acknowledge bit, the master device sends the 8-bit data to the slave
device over the SDA line, if the requested operation is ‘Write to device’. If the requested operation
is ‘Read from device’, the slave device sends data to the master over the SDA line.

8. The master device waits for the acknowledgement bit from the device upon byte transfer
complete for a write operation and sends an acknowledge bit to the Slave device for a read
operation

9. The master device terminates the transfer by pulling the SDA line ‘HIGH’ when the clock
line SCL is at logic ‘HIGH’ (Indicating the ‘STOP’ condition)
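
Steps 3 and 4 above can be sketched in a few lines of Python. This is only an illustration of how a 7-bit slave address and the Read/Write bit might be packed into the first byte and placed on the SDA line MSB first. The function names and the slave address 0x50 are assumptions made for the example; they do not come from any particular device or library.

```python
def i2c_address_byte(address_7bit, read):
    """Pack a 7-bit slave address and the R/W bit (1 = read, 0 = write) into one byte."""
    if not 0 <= address_7bit <= 0x7F:
        raise ValueError("address must fit in 7 bits")
    return (address_7bit << 1) | (1 if read else 0)


def bits_msb_first(byte):
    """Return the 8 bits of a byte in the order they are clocked out on SDA."""
    return [(byte >> shift) & 1 for shift in range(7, -1, -1)]


# Hypothetical slave address, chosen only for illustration.
SLAVE_ADDRESS = 0x50

write_frame = bits_msb_first(i2c_address_byte(SLAVE_ADDRESS, read=False))
read_frame = bits_msb_first(i2c_address_byte(SLAVE_ADDRESS, read=True))
print(write_frame)
print(read_frame)
```

The two frames differ only in the last bit, which is the Read/Write bit sent in step 4.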

The I2C bus supports three different data rates. They are: Standard mode (Data rate up
to 100kbits/sec (100 kbps)), Fast mode (Data rate up to 400kbits/sec (400 kbps)) and High
Speed mode (Data rate up to 3.4Mbits/sec (3.4 Mbps)). The first generation I2C devices were
designed to support data rates only up to 100kbps. The new generation I2C devices are designed to
operate at data rates up to 3.4Mbits/sec.
2.4.1.2 Serial Peripheral Interface (SPI) Bus

The Serial Peripheral Interface Bus (SPI) is a synchronous bi-directional full
duplex four-wire serial interface bus. The concept of SPI was introduced by Motorola. SPI is
a single master multi-slave system. It is possible to have a system where more than one SPI
device can be master, provided that only one master device is active at any given
point of time. SPI requires four signal lines for communication. They are:

Master Out Slave In (MOSI): Signal line carrying the data from master to slave device. It is
also known as Slave Input/Slave Data In (SI/SDI)

Master In Slave Out (MISO): Signal line carrying the data from slave to master device. It is
also known as Slave Output/Slave Data Out (SO/SDO)

Serial Clock (SCLK): Signal line carrying the clock signals

Slave Select (SS): Signal line for slave device select. It is an active low signal

The bus interface diagram shown in the figure below illustrates the connection of master and
slave devices on the SPI bus.

[Figure: SPI Bus Interfacing. The Master (Microprocessor/Controller) shares the MOSI, SCL
and MISO lines of the SPI bus with Slave 1 (SPI device, e.g. serial EEPROM) and Slave 2
(SPI device, e.g. LCD). Separate active low select lines SS1\ and SS2\ from the master
drive the SS\ input of each slave.]

The master device is responsible for generating the clock signal. It selects the required
slave device by asserting the corresponding slave device’s slave select signal ‘LOW’. The data
out line (MISO) of all the slave devices when not selected floats at high impedance state.

The serial data transmission through the SPI bus is fully configurable. SPI devices
contain a certain set of registers for holding these configurations. The serial peripheral
control register holds the various configuration parameters like master/slave selection for
the device, baud rate selection for communication, clock signal control, etc. The status
register holds the status of various conditions for transmission and reception.

SPI works on the principle of a ‘Shift Register’. The master and slave devices contain a
special shift register for the data to transmit or receive. The size of the shift register is device
dependent. Normally it is a multiple of 8. During transmission from the master to slave, the
data in the master’s shift register is shifted out to the MOSI pin and it enters the shift
register of the slave device through the MOSI pin of the slave device. At the same time the
shifted out data bit from the slave device’s shift register enters the shift register of the master
device through the MISO pin. In summary, the shift registers of the ‘master’ and ‘slave’ devices form
a circular buffer. For some devices, the decision on whether the LS/MS bit of data needs to
be sent out first is configurable through a configuration register (e.g. LSBF bit of the SPI
control register for Motorola’s 68HC12 controller). When compared to I2C, the SPI bus is more
suitable for applications requiring transfer of data in ‘streams’. The only limitation is that SPI
doesn’t support an acknowledgement mechanism.
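
The circular buffer behaviour described above can be simulated with a short sketch. It assumes 8-bit shift registers and MSB-first shifting; both are assumptions made for the example, since register size and bit order are device dependent. The function name spi_exchange is hypothetical.

```python
def spi_exchange(master_reg, slave_reg, width=8):
    """Simulate one complete SPI transfer between two shift registers (MSB first)."""
    mask = (1 << width) - 1
    for _ in range(width):                          # one iteration per clock pulse
        mosi = (master_reg >> (width - 1)) & 1      # bit shifted out of the master
        miso = (slave_reg >> (width - 1)) & 1       # bit shifted out of the slave
        master_reg = ((master_reg << 1) & mask) | miso
        slave_reg = ((slave_reg << 1) & mask) | mosi
    return master_reg, slave_reg


master_after, slave_after = spi_exchange(0xA5, 0x3C)
print(hex(master_after), hex(slave_after))
```

After as many clock pulses as the register width, the two registers have swapped contents, which is why every SPI write is also a read.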

2.4.1.3 Universal Asynchronous Receiver Transmitter (UART)

Universal Asynchronous Receiver Transmitter (UART) based data transmission is an
asynchronous form of serial data transmission. UART based serial data transmission
doesn’t require a clock signal to synchronize the transmitting end and receiving end for
transmission. Instead it relies upon a pre-defined agreement between the transmitting
device and receiving device. The serial communication settings (Baud rate, number of bits
per byte, parity, number of start bits and stop bits and flow control) for both transmitter and
receiver should be set as identical. The start and stop of communication is indicated by
inserting special bits in the data stream. While sending a byte of data, a start bit is added
first and a stop bit is added at the end of the bit stream. The least significant bit of the data
byte follows the ‘start’ bit.

The ‘start’ bit informs the receiver that a data byte is about to arrive. The receiver
device starts polling its ‘receive line’ as per the baud rate settings. If the baud rate is ‘x’ bits
per second, the time slot available for one bit is 1/x seconds. The receiver unit polls the
receiver line at exactly half of the time slot available for the bit. If parity is enabled for
communication, the UART of the transmitting device adds a parity bit (bit value is 1 for odd
number of 1s in the transmitted bit stream and 0 for even number of 1s). The UART of the
receiving device calculates the parity of the bits received and compares it with the received
parity bit for error checking. The UART of the receiving device discards the ‘Start’, ‘Stop’ and
‘Parity’ bits from the received bit stream and converts the received serial bit data to a word
(In the case of 8 bits/byte, the byte is formed with the received 8 bits with the first received
bit as the LSB and last received data bit as MSB).
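
The framing described above can be sketched as follows. The sketch assumes the usual convention of a logic 0 start bit and a logic 1 stop bit, which the text above does not state, and uses the parity rule given above (parity bit is 1 for an odd number of 1s). The function names are hypothetical.

```python
def uart_frame(byte, parity=True):
    """Build the bit sequence for one 8-bit character as it appears on the line."""
    data = [(byte >> i) & 1 for i in range(8)]   # LSB follows the start bit
    frame = [0] + data                           # start bit (assumed logic 0)
    if parity:
        frame.append(sum(data) % 2)              # 1 for odd number of 1s, else 0
    frame.append(1)                              # stop bit (assumed logic 1)
    return frame


def uart_decode(frame, parity=True):
    """Discard start, parity and stop bits and rebuild the byte, checking parity."""
    data = frame[1:9]
    if parity and frame[9] != sum(data) % 2:
        raise ValueError("parity error")
    return sum(bit << i for i, bit in enumerate(data))


frame = uart_frame(0x41)
bit_time = 1 / 9600        # time slot per bit in seconds at a baud rate of 9600
print(frame, uart_decode(frame), bit_time)
```

With parity enabled the frame is 11 bits long, so the line carries 11 bit times for every 8 bits of payload.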

For proper communication, the ‘Transmit line’ of the sending device should be
connected to the ‘Receive line’ of the receiving device. The figure below illustrates the same.

[Figure: UART interfacing. The TXD line of each UART is connected to the RXD line of the
other UART. TXD: Transmitter Line, RXD: Receiver Line]

In addition to the serial data transmission function, UART provides hardware handshaking
signal support for controlling the serial data flow. UART
chips are available from different semiconductor manufacturers. National Semiconductor’s
8250 UART chip is considered as the standard setting UART. It was used in the original IBM
PC.

Nowadays most of the microprocessors/controllers are available with integrated
UART functionality and they provide built-in instruction support for serial data
transmission and reception.

2.4.1.4 1–Wire Interface

1-wire interface is an asynchronous half-duplex communication protocol developed
by Maxim Dallas Semiconductor (https://www.maxim-ic.com). It is also known as Dallas
1-Wire® protocol. It makes use of only a single signal line (wire) called DQ for
communication and follows the master-slave communication model. One of the key features
of the 1-wire bus is that it allows power to be sent along the signal wire as well. The 1-wire slave
devices incorporate an internal capacitor (typically of the order of 800 pF) to power the device
from the signal line. The 1-wire interface supports a single master and one or more slave
devices on the bus. The bus interface diagram shown in the figure below illustrates the connection of
master and slave devices on the 1-wire bus.

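[Figure: 1-Wire Interface. A port pin of the Master (Microprocessor/Controller) drives the
DQ line, which is pulled up to Vcc through a 4.7K resistor and connects to the DQ pin of
Slave 1 (1-Wire device, e.g. DS2760 battery monitor IC) and Slave 2 (1-Wire device, e.g.
DS2431 1024 bit EEPROM). All devices share GND.]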

Every 1-wire device contains a globally unique 64bit identification number stored
within it. The unique identification number can be used for addressing individual devices
present on the bus in case there are multiple slave devices connected to the 1-wire bus. The
identifier has three parts: an 8bit family code, a 48bit serial number and an 8bit CRC
computed from the first 56 bits. The sequence of operation for communicating with a 1-wire
slave device is listed below:

1. The master device sends a ‘Reset’ pulse on the 1-wire bus.

2. The slave device(s) present on the bus respond with a ‘Presence’ pulse.

3. The master device sends a ROM command (Net Address Command followed by the 64bit
address of the device). This addresses the slave device(s) to which it wants to initiate a
communication.

4. The master device sends a read/write function command to read/write the internal
memory or register of the slave device.

5. The master initiates a Read data/Write data from the device or to the device
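
The 64bit identifier layout described earlier (8bit family code, 48bit serial number, 8bit CRC over the first 56 bits) can be sketched as below. The CRC used is the 8bit CRC with polynomial x^8 + x^5 + x^4 + 1 that Maxim documents for 1-Wire devices. The byte ordering (family code first, serial number least significant byte first), the helper names and the sample family code and serial number are assumptions made only for illustration.

```python
def crc8_maxim(data):
    """8-bit CRC, polynomial x^8 + x^5 + x^4 + 1, processed LSB first, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0x8C   # 0x8C is the bit-reversed polynomial
            else:
                crc >>= 1
    return crc


def build_rom_id(family_code, serial):
    """Assemble 8 bytes: family code, 48-bit serial number, CRC of those 56 bits."""
    body = bytes([family_code]) + serial.to_bytes(6, "little")
    return body + bytes([crc8_maxim(body)])


def rom_id_is_valid(rom_id):
    """Check the length and the trailing CRC byte of a received identifier."""
    return len(rom_id) == 8 and crc8_maxim(rom_id[:7]) == rom_id[7]


# Hypothetical family code and serial number.
rom_id = build_rom_id(0x28, 0x0000012345AB)
print(rom_id.hex(), rom_id_is_valid(rom_id))
```

A master that reads an identifier whose CRC does not match can conclude that the read was corrupted and repeat the ROM command.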

All communication over the 1-wire bus is master initiated. The communication over
the 1-wire bus is divided into timeslots of 60 microseconds. The ‘Reset’ pulse occupies 8
time slots. For starting a communication, the master asserts the reset pulse by pulling the
1-wire bus ‘LOW’ for at least 8 time slots (480µs). If a ‘slave’ device is present on the bus and is ready
for communication, it should respond to the master with a ‘Presence’ pulse within 60µs of
the release of the ‘Reset’ pulse by the master. The slave device(s) responds with a ‘Presence’
pulse by pulling the 1-wire bus ‘LOW’ for a minimum of 1 time slot (60µs). For writing a bit
value of 1 on the 1-wire bus, the bus master pulls the bus ‘LOW’ for 1 to 15µs and then releases
the bus for the rest of the time slot. A bit value of ‘0’ is written on the bus by the master pulling
the bus ‘LOW’ for a minimum of 1 time slot (60µs) and a maximum of 2 time slots (120µs). To
read a bit from the slave device, the master pulls the bus ‘LOW’ for 1 to 15µs. If the slave
wants to send a bit value ‘1’ in response to the read request from the master, it simply
releases the bus for the rest of the time slot. If the slave wants to send a bit value ‘0’, it pulls
the bus ‘LOW’ for the rest of the time slot.

2.4.1.5 Parallel Interface

The on-board parallel interface is normally used for communicating with peripheral
devices which are memory mapped to the host of the system. The host processor/controller
of the embedded system contains a parallel bus and a device which supports the parallel bus
can directly connect to this bus system. The communication through the parallel bus is
controlled by the control signal interface between the device and the host. The ‘Control
Signals’ for communication include the ‘Read/Write’ signals and the device select signal. The device
normally contains a device select line and the device becomes active only when this line is
asserted by the host processor. The direction of data transfer (Host to Device or Device to
Host) can be controlled through the control signal lines for ‘Read’ and ‘Write’. Only the host
processor has control over the ‘Read’ and ‘Write’ control signals. The device is normally
memory mapped to the host processor and a range of addresses is assigned to it. An address
decoder circuit is used for generating the chip select signal for the device. When the address
selected by the processor is within the range assigned for the device, the decoder circuit
activates the chip select line and thereby the device becomes active. The processor can then
read from or write to the device by asserting the corresponding control line (RD\ and WR\
respectively). Strict timing characteristics are followed for parallel communication. As
mentioned earlier, parallel communication is host processor initiated. If a device wants to
initiate the communication, it can inform the processor through interrupts. For
this, the interrupt line of the device is connected to the interrupt line of the processor and
the corresponding interrupt is enabled in the host processor. The width of the parallel
interface is determined by the data bus width of the host processor. It can be 4bit, 8bit,
16bit, 32bit or 64bit etc. The bus width supported by the device should be the same as that of
the host processor.
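
The role of the address decoder described above can be sketched as a simple range check. The memory map used here (a device occupying 0x8000 to 0x80FF) and the function names are hypothetical; a real decoder is a combinational circuit on the address lines, and this sketch only models its output.

```python
# Hypothetical memory map: the peripheral occupies addresses 0x8000 to 0x80FF.
DEVICE_BASE = 0x8000
DEVICE_SIZE = 0x100


def chip_select_active(address, base=DEVICE_BASE, size=DEVICE_SIZE):
    """Return True when the address lies in the range assigned to the device."""
    return base <= address < base + size


def cs_pin_level(address):
    """Chip select is active low: drive 0 when selected, 1 otherwise."""
    return 0 if chip_select_active(address) else 1


for addr in (0x7FFF, 0x8000, 0x80FF, 0x8100):
    print(hex(addr), cs_pin_level(addr))
```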

The bus interface diagram shown in the figure below illustrates the interfacing of devices through
the parallel interface.

[Figure: The Host (Microprocessor/Controller) connects to the Peripheral Device (e.g. ADC)
through the Data Bus (D0 to Dx-1) and the control signals RD\ and WR\. The Address Bus
(A0 to Ay-1) feeds an Address Decoder Circuit, which generates the Chip Select (CS\) signal
for the device. x: Data bus width, y: Address bus width]

Fig: Interfacing of devices through parallel interface

2.4.2 External Communication Interfaces:

The External Communication Interface refers to the different communication
channels/buses used by the embedded system to communicate with the external world. The
following section gives an overview of the various interfaces for external communication.

2.4.2.1 RS-232 & RS-485:

RS-232 C (Recommended Standard number 232, revision C from the Electronic
Industries Association) is a legacy, full duplex, wired, asynchronous serial communication
interface. The RS-232 interface was developed by the Electronic Industries Association (EIA)
during the early 1960s. RS-232 extends the UART communication signals for external data
communication. UART uses the standard TTL/CMOS logic (Logic ‘High’ corresponds to bit
value 1 and Logic ‘Low’ corresponds to bit value 0) for bit transmission whereas RS-232
follows the EIA standard for bit transmission. As per the EIA standard, a logic ‘0’ is
represented with a voltage between +3 and +25V and a logic ‘1’ is represented with a voltage
between -3 and -25V. In the EIA standard, logic ‘0’ is known as ‘Space’ and logic ‘1’ as ‘Mark’.
The RS-232 interface defines various handshaking and control signals for communication
apart from the ‘Transmit’ and ‘Receive’ signal lines for data communication. RS-232
supports two different types of connectors, namely; DB-9: 9-Pin connector and DB-25:
25-Pin connector. The figure below illustrates the connector details for DB-9 and DB-25.

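[Figure: DB-9 connector (pins 1 to 5 in the top row, pins 6 to 9 in the bottom row) and
DB-25 connector (pins 1 to 13 in the top row, pins 14 to 25 in the bottom row)]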
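
The EIA signalling levels described above can be sketched as a small decoding function. Treating voltages outside the two stated ranges as undefined (returning None) is an assumption of this sketch, and the function name is hypothetical.

```python
def rs232_level_to_bit(voltage):
    """Map an RS-232 line voltage (in volts) to a logic value, or None if undefined."""
    if 3.0 <= voltage <= 25.0:
        return 0          # 'Space'
    if -25.0 <= voltage <= -3.0:
        return 1          # 'Mark'
    return None           # outside both ranges stated in the text


for volts in (12.0, -12.0, 0.0):
    print(volts, rs232_level_to_bit(volts))
```

Note the inversion relative to TTL/CMOS logic: the positive voltage is logic ‘0’ and the negative voltage is logic ‘1’.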

The pin details for the DB-9 connectors are explained in the following table:

Pin Name   Pin No: (For DB-9 Connector)   Description

TXD        3                              Transmit Pin. Used for transmitting serial data
RXD        2                              Receive Pin. Used for receiving serial data
RTS        7                              Request To Send
CTS        8                              Clear To Send
DSR        6                              Data Set Ready
GND        5                              Signal Ground
DCD        1                              Data Carrier Detect
DTR        4                              Data Terminal Ready
RI         9                              Ring Indicator

RS-232 is a point-to-point communication interface and the devices involved in RS-232
communication are called “Data Terminal Equipment (DTE)” and “Data Communication
Equipment (DCE)”. If no data flow control is required, only the TXD and RXD signal lines and
the ground line (GND) are required for data transmission and reception. The RXD pin of the DCE
should be connected to the TXD pin of the DTE and vice versa for proper data transmission.

If hardware data flow control is required for serial transmission, various control
signal lines of the RS-232 connection are used appropriately. The control signals are
implemented mainly for modem communication and some of them may not be relevant for
other types of devices. The Request to Send (RTS) and Clear to Send (CTS) signals co-ordinate
the communication between DTE and DCE. Whenever the DTE has data to send, it
activates the RTS line and if the DCE is ready to accept the data, it activates the CTS line.

The Data Terminal Ready (DTR) signal is activated by DTE when it is ready to accept
data. The Data Set Ready (DSR) is activated by DCE when it is ready for establishing a
communication link. DTR should be in the activated state before the activation of DSR.

The Data Carrier Detect (DCD) control signal is used by the DCE to indicate to the DTE
that a good signal is being received.

Ring Indicator (RI) is a modem specific signal line for indicating an incoming call on
the telephone line.

The 25 pin DB connector contains two sets of signal lines for transmit, receive and
control lines. Nowadays the DB-25 connector is obsolete and most of the desktop systems are
available with DB-9 connectors only.

As per the EIA standard, RS-232 C supports baud rates up to 20Kbps (Upper limit
19.2 Kbps). The commonly used baud rates by devices are 300bps, 1200bps, 2400bps,
9600bps, 11.52Kbps and 19.2Kbps. 9600 is the popular baud rate setting used for PC
communication. The maximum operating distance supported by RS-232 is 50 feet at the
highest supported baud rate.
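
As a rough illustration of what these baud rates mean in terms of payload, the sketch below estimates the number of data bytes transferred per second. It assumes a frame of one start bit, 8 data bits, no parity bit and one stop bit; this framing is an assumption of the example, and the function name is hypothetical.

```python
def bytes_per_second(baud_rate, data_bits=8, parity_bits=0, stop_bits=1):
    """Estimate payload throughput: every character also carries a start bit."""
    bits_per_frame = 1 + data_bits + parity_bits + stop_bits
    return baud_rate / bits_per_frame


for baud in (300, 1200, 2400, 9600, 19200):
    print(baud, bytes_per_second(baud))
```

At 9600 baud with this framing, 10 bit times are spent on each byte, so the payload rate is 960 bytes per second.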

Embedded devices contain a UART for serial communication and they generate signal
levels conforming to TTL/CMOS logic. A level translator IC like the MAX 232 from Maxim Dallas
Semiconductor is used for converting the signal lines from the UART to RS-232 signal lines
for communication. On the receiving side the received data is converted back to digital logic
level by a converter IC. Converter chips contain converters for both transmitter and receiver.

Though RS-232 was once the most popular communication interface, the advent of
other communication techniques like Bluetooth, USB, FireWire, etc. is
pushing RS-232 out of the scene. Still, RS-232 is popular in certain legacy industrial
applications.

RS-232 supports only point-to-point communication and is not suitable for multi-drop
communication. It uses a single ended data transfer technique for signal transmission and
is thereby more susceptible to noise, which greatly reduces the operating distance.

RS-422 is another serial interface standard from EIA for differential data
communication. It supports data rates up to 100 Kbps and distance up to 400 ft. The same
RS-232 connector is used at the device end and an RS-232 to RS-422 converter is plugged
in the transmission line. At the receiver end the conversion from RS-422 to RS-232 is
performed. RS-422 supports multi-drop communication with one transmitter device and
up to 10 receiver devices.

RS-485 is the enhanced version of RS-422 and it supports multi-drop communication
with up to 32 transmitting devices (drivers) and 32 receiving devices on the bus. The
communication between devices on the bus uses the ‘addressing’ mechanism to identify
slave devices.

2.4.2.2 Universal Serial Bus (USB):

Universal Serial Bus (USB) is a wired high speed serial bus for data communication.
The first version of USB (USB 1.0) was released in 1995 and was created by the USB core
group members consisting of Intel, Microsoft, IBM, Compaq, Digital and Northern Telecom.
The USB communication system follows a star topology with a USB host at the center and
one or more USB peripheral devices/USB hosts connected to it. A USB host can support
up to 127 connections, including slave peripheral devices and other USB hosts.

The figure below illustrates the star topology for USB device connection.

[Figure: USB Device Connection Topology. A USB Host (Hub) at the center connects to
Peripheral Devices 1, 2 and 3 and to a second USB Host (Hub), which in turn connects to
Peripheral Devices 4 and 5.]

USB transmits data in packet format. Each data packet has a standard format. The
USB communication is a host initiated one. The USB host contains a host controller which
is responsible for controlling the data communication, including establishing connectivity
with USB slave devices, packetizing and formatting the data. There are different standards
for implementing the USB host controller interface, namely Open Host Controller Interface
(OHCI) and Universal Host Controller Interface (UHCI).

USB uses differential signals for data transmission. It improves the noise immunity.
USB interface has the ability to supply power to the connecting devices. Two connection
lines (Ground and Power) of the USB interface are dedicated for carrying power. It can supply
power up to 500 mA at 5 V. It is sufficient to operate low power devices. Mini and Micro USB
connectors are available for small form factor devices like portable media players.

The pin details for connectors are listed below:

Pin No: Pin Name Description


1 VBUS Carries power (5V)
2 D- Differential data carrier line
3 D+ Differential data carrier line
4 GND Ground signal line

Each USB device contains a Product ID (PID) and a Vendor ID (VID). The PID and
VID are embedded into the USB chip by the USB device manufacturer. The VID for a device
is supplied by the USB standards forum. PID and VID are essential for loading the drivers
corresponding to a USB device for communication.

USB supports four different types of data transfers, namely; Control, Bulk,
Isochronous and Interrupt. Control transfer is used by USB system software to query,
configure and issue commands to the USB device. Bulk transfer is used for sending a block
of data to a device. Bulk transfer supports error checking and correction. Transferring data
to a printer is an example of bulk transfer. Isochronous data transfer is used for real-time
data communication. In Isochronous transfer, data is transmitted as streams in real-time.
Isochronous transfer doesn’t support error checking and re-transmission of data in case of
any transmission loss. All streaming devices like audio devices and medical equipment for
data collection make use of the isochronous transfer. Interrupt transfer is used for
transferring small amounts of data. The interrupt transfer mechanism makes use of a polling
technique to see whether the USB device has any data to send. The frequency of polling is
determined by the USB device and it varies from 1 to 255 milliseconds. Devices like Mouse
and Keyboard, which transmit small amounts of data, use interrupt transfer.

USB.ORG (www.usb.org) is the standards body for defining and controlling the
standards for USB communication. Presently USB supports four different data rates
namely; Low Speed (1.5Mbps), Full Speed (12Mbps), High Speed (480Mbps) and Super Speed
(4.8Gbps). The Low Speed and Full Speed specifications are defined by USB 1.0 and the
High Speed specification is defined by USB 2.0. USB 3.0 defines the specifications for Super
Speed. USB 3.0 is expected to be in action by year 2009. There is a move happening towards
wireless USB for data transmission using Ultra Wide Band (UWB) technology. Some laptops
are already available in the market with wireless USB support.

2.4.2.3 IEEE 1394 (FireWire):

IEEE 1394 is a wired, isochronous high speed serial communication bus. It is also
known as High Performance Serial Bus (HPSB). The research on 1394 was started by Apple
Inc. in 1985 and the standard for this was coined by IEEE. Implementations of it are
available from various players under different names. Apple Inc.’s (www.apple.com)
implementation of the 1394 protocol is popularly known as FireWire. i.LINK is the 1394
implementation from Sony Corporation (www.sony.net) and Lynx is the implementation
from Texas Instruments (www.ti.com). 1394 supports peer-to-peer connection and point-to-
multipoint communication, allowing 63 devices to be connected on the bus in a tree topology.
1394 is a wired serial interface and it can support a cable length of up to 15 feet for
interconnection.

The 1394 standard has evolved a lot from the first version IEEE 1394-1995 released
in 1995 to the recent version IEEE 1394-2008 released in June 2008. The 1394 standard
supports a data rate of 400 to 3200Mbits/second. IEEE 1394 uses differential data
transfer (The information is sent using differential signals through a pair of twisted cables.
It increases the noise immunity) and the interface cable supports 3 types of connectors,
namely; 4-pin connector, 6-pin connector (alpha connector) and 9-pin connector (beta
connector). The 6 and 9 pin connectors also carry power to support external devices (In case
an embedded device is connected to a PC through an IEEE 1394 cable with a 6 or 9 pin
connector interface, it can operate from the power available through the connector.) It can
supply unregulated power in the range of 24 to 30V. (The Apple implementation is for
battery operated devices and it can supply a voltage in the range 9 to 12V.) The table
given below illustrates the pin details for 4, 6 and 9 pin connectors.

Pin Name        Pin No:       Pin No:       Pin No:       Description
                (4 Pin        (6 Pin        (9 Pin
                Connector)    Connector)    Connector)

Power           -             1             8             Unregulated DC supply. 24 to 30V
Signal Ground   -             2             6             Ground connection
TPB-            1             3             1             Differential Signal line for Signal Line B
TPB+            2             4             2             Differential Signal line for Signal Line B
TPA-            3             5             3             Differential Signal line for Signal Line A
TPA+            4             6             4             Differential Signal line for Signal Line A
TPA(S)          -             -             5             Shield for the differential signal line A. Normally grounded
TPB(S)          -             -             9             Shield for the differential signal line B. Normally grounded
NC              -             -             7             No connection

There are two differential data transfer lines A and B per connector. In a 1394 cable,
normally the differential lines of A are connected to B (TPA+ to TPB+ and TPA- to TPB-) and
vice versa.

1394 is a popular communication interface for connecting embedded devices like
Digital Cameras, Camcorders and Scanners to desktop computers for data transfer and
storage.

Unlike the USB interface (except USB OTG), IEEE 1394 doesn’t require a host for
communicating between devices. For example, you can directly connect a scanner with a
printer for printing. The data rate supported by 1394 is far higher than the one supported
by the USB 2.0 interface. The 1394 hardware implementation is much costlier than the USB
implementation.

2.4 IrDA (Infrared):

Infrared (IrDA) is a serial, half duplex, line of sight based wireless technology for data
communication between devices. It has been in use since the early days of communication and you
may be very familiar with it. The remote control of your TV, VCD player, etc. works on the
infrared data communication principle. The infrared communication technique uses infrared
waves of the electromagnetic spectrum for transmitting the data. IrDA supports point-to-point
and point-to-multipoint communication, provided all devices involved in the communication
are within the line of sight. The typical communication range for IrDA lies in the range 10
cm to 1 m. The range can be increased by increasing the transmitting power of the IR device.
IR supports data rates ranging from 9600bits/second to 16Mbps. Depending on the speed
of data transmission IR is classified into Serial IR (SIR), Medium IR (MIR), Fast IR (FIR),
Very Fast IR (VFIR) and Ultra-Fast IR (UFIR). SIR supports transmission rates ranging from
9600bps to 115.2kbps. MIR supports data rates of 0.576Mbps and 1.152Mbps. FIR
supports data rates up to 4Mbps. VFIR is designed to support high data rates up to 16Mbps.
The UFIR specs are under development and it is targeting a data rate up to 100Mbps.
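
The speed classes listed above can be summarized in a small lookup sketch. The boundaries are taken directly from the figures in the paragraph above; how rates that fall between the listed values are classified, and the function name itself, are assumptions of this example.

```python
def irda_class(rate_bps):
    """Return the IrDA speed class for a data rate given in bits per second."""
    if rate_bps < 9600:
        raise ValueError("below the lowest IrDA data rate")
    if rate_bps <= 115_200:
        return "SIR"
    if rate_bps in (576_000, 1_152_000):
        return "MIR"
    if rate_bps <= 4_000_000:
        return "FIR"
    if rate_bps <= 16_000_000:
        return "VFIR"
    if rate_bps <= 100_000_000:
        return "UFIR"      # targeted rate, specification under development
    raise ValueError("above the rates described for IrDA")


for rate in (9600, 115_200, 576_000, 4_000_000, 16_000_000):
    print(rate, irda_class(rate))
```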

IrDA communication involves a transmitter unit for transmitting the data over IR and
a receiver for receiving the data. An Infrared Light Emitting Diode (LED) is the IR source for the
transmitter and at the receiving end a photodiode acts as the receiver. Both transmitter and
receiver units will be present in each device supporting IrDA communication for bidirectional
data transfer. Such IR units are known as ‘Transceivers’. Certain devices like a TV remote
control always require unidirectional communication and so they contain either the
transmitter or the receiver unit (The remote control unit contains the transmitter unit and the TV
contains the receiver unit).

‘Infra-red Data Association’ (IrDA https://www.irda.org/) is the regulatory body
responsible for defining and licensing the specifications for IR data communication. IrDA
communication has two essential parts; a physical link part and a protocol part. The
physical link is responsible for the physical transmission of data between devices supporting
IR communication and the protocol part is responsible for defining the rules of communication.
The physical link works on the wireless principle making use of Infrared for communication.
The IrDA specifications include the standard for both the physical link and the protocol layer.

The IrDA control protocol contains implementations for Physical Layer (PHY), Media
Access Control (MAC) and Logical Link Control (LLC). The Physical Layer defines the
physical characteristics of communication like range, data rates, power, etc.

IrDA is a popular interface for file exchange and data transfer in low cost devices.
IrDA was the prominent communication channel in mobile phones before Bluetooth’s
existence. Even now most of the mobile phone devices support IrDA.

2.5 Bluetooth (BT):

Bluetooth is a low cost, low power, short range wireless technology for data and voice
communication. Bluetooth was first proposed by ‘Ericsson’ in 1994. Bluetooth operates at
2.4GHz of the Radio Frequency spectrum and uses the Frequency Hopping Spread
Spectrum (FHSS) technique for communication. It supports a data rate of up to 1Mbps
and a range of approximately 30 feet for data communication. Like IrDA, Bluetooth
communication also has two essential parts; a physical link part and a protocol part. The
physical link is responsible for the physical transmission of data between devices supporting
Bluetooth communication and the protocol part is responsible for defining the rules of
communication. The physical link works on the wireless principle making use of RF waves
for communication. Bluetooth enabled devices essentially contain a Bluetooth wireless radio
for the transmission and reception of data. The rules governing Bluetooth
communication are implemented in the ‘Bluetooth protocol stack’. The Bluetooth
communication IC holds the stack. Each Bluetooth device will have a 48 bit unique
identification number. Bluetooth communication follows packet based data transfer.

Bluetooth supports point-to-point (device to device) and point-to-multipoint (device
to multiple device broadcasting) wireless communication. The point-to-point
communication follows the master slave relationship. A Bluetooth device can function as
either master or slave. When a network is formed with one Bluetooth device as master and
more than one device as slaves, it is called a Piconet. A Piconet supports a maximum of
seven slave devices.

Bluetooth is the favourite choice for short range data communication in handheld
embedded devices. Bluetooth technology is very popular among cell phone users as it is
the easiest communication channel for transferring ringtones, music files, pictures, media
files, etc. between neighbouring Bluetooth enabled phones.

The Bluetooth standard specifies the minimum requirements that a Bluetooth device
must support for a specific usage scenario. The Generic Access Profile (GAP) defines the
requirements for detecting a Bluetooth device and establishing a connection with it. All other
specific usage profiles are based on GAP. Serial Port Profile (SPP) for serial data
communication, File Transfer Profile (FTP) for file transfer between devices and Human Interface
Device (HID) for supporting human interface devices like keyboard and mouse are examples
of Bluetooth profiles.

The specifications for Bluetooth communication are defined and licensed by the
standards body ‘Bluetooth Special Interest Group (SIG)’. For more information, please visit
the website www.bluetooth.org.

2.6 WI-FI:

Wi-Fi or Wireless Fidelity is the popular wireless communication technique for
networked communication of devices. Wi-Fi follows the IEEE 802.11 standard. Wi-Fi is
intended for network communication and it supports Internet Protocol (IP) based
communication. It is essential to have device identities in a multipoint communication to
address specific devices for data communication. In an IP based communication each device
is identified by an IP address, which is unique to each device on the network. Wi-Fi based
communications require an intermediate agent called a Wi-Fi router/Wireless Access Point to
manage the communications. The Wi-Fi router is responsible for restricting the access to a
network, assigning IP addresses to devices on the network, and routing data packets to the
intended devices on the network. Wi-Fi enabled devices contain a wireless adaptor for
transmitting and receiving data in the form of radio signals through an antenna. The
hardware part of it is known as the Wi-Fi radio.

Wi-Fi operates in the 2.4GHz or 5GHz band of the radio spectrum and co-exists with other ISM
band devices like Bluetooth. The figure illustrates the typical interfacing of devices in a Wi-Fi
network.

[Figure: Wi-Fi network with Device 1, Device 2 and Device 3 connected to a Wi-Fi router]

For communicating with devices over a Wi-Fi network, the device, when its Wi-Fi radio
is turned ON, searches for the available Wi-Fi networks in its vicinity and lists the Service
Set Identifier (SSID) of each available network. If the network is security enabled, a password
may be required to connect to a particular SSID. Wi-Fi employs different security
mechanisms like Wired Equivalent Privacy (WEP), Wi-Fi Protected Access (WPA), etc. for
securing the data communication.

Wi-Fi supports data rates ranging from 1 Mbps to 150 Mbps (growing towards higher
rates as technology progresses) depending on the standard (802.11a/b/g/n) and
access/modulation method. Depending on the type of antenna and usage location
(indoor/outdoor), Wi-Fi offers a range of 100 to 300 feet.

2.7 ZigBee:

ZigBee is a low power, low cost, wireless network communication protocol based on
the IEEE 802.15.4-2006 standard. ZigBee is targeted at low power, low data rate and secure
applications for Wireless Personal Area Networking (WPAN). The ZigBee specifications support a
robust mesh network containing multiple nodes. This networking strategy makes the
network reliable by permitting messages to travel through a number of different paths to get
from one node to another.

ZigBee operates worldwide in the unlicensed bands of the radio spectrum, mainly at
2.400 to 2.484 GHz, 902 to 928 MHz and 868.0 to 868.6 MHz. ZigBee supports an
operating distance of up to 100 meters and a data rate of 20 to 250 Kbps.

ZigBee Coordinator (ZC)/Network Coordinator: The ZigBee coordinator acts as the root of
the ZigBee network. The ZC is responsible for initiating the ZigBee network and it has the
capability to store information about the network.

ZigBee Router (ZR)/Full Function Device (FFD): Responsible for passing information from
one device to another device or to another ZR.

ZigBee End Device (ZED)/Reduced Function Device (RFD): End device containing ZigBee
functionality for data communication. It can talk only with a ZR or ZC and doesn’t have the
capability to act as a mediator for transferring data from one device to another.

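[Figure: ZigBee network model with a ZigBee Coordinator (ZC) at the centre, ZigBee Routers (ZR) attached to it, and ZigBee End Devices (ZED) attached to the ZC and ZRs]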

2.5 EMBEDDED FIRMWARE

Embedded firmware refers to the control algorithm (program instructions) and/or the
configuration settings that an embedded system developer dumps into the code (program)
memory of the embedded system. It is an unavoidable part of an embedded system. There
are various methods available for developing the embedded firmware. They are listed below.

1. Write the program in a high level language like Embedded C/C++ using an Integrated
Development Environment (IDE). The IDE will contain an editor, compiler, linker, debugger,
simulator, etc. IDEs are different for different families of processors/controllers. For example,
the Keil µVision3 IDE is used for all family members of the 8051 microcontroller, since it
contains the generic 8051 compiler C51.

2. Write the program in Assembly language using the instructions supported by your
application’s target processor/controller.

The instruction set for each family of processor/controller is different, and the
program written in either of the methods given above should be converted into
processor-understandable machine code before loading it into the program memory. The process of
converting the program written in either a high level language or processor/controller
specific Assembly code to machine readable binary code is called ‘HEX File Creation’. The
methods used for ‘HEX File Creation’ differ depending on the programming technique
used. If the program is written in Embedded C/C++ using an IDE, the cross compiler included
in the IDE converts it into the corresponding processor/controller understandable ‘HEX File’. If
you are following the Assembly language based programming technique (method 2), you can
use the utilities supplied by the processor/controller vendors to convert the source code
into a ‘HEX File’. Third party tools, some of which are free of cost, are also available for this
conversion.
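
The notes do not say which ‘HEX File’ format is meant. A widely used one is Intel HEX, so the sketch below assumes that format. It formats a single record and computes its checksum; the function name `ihex_format_record` is hypothetical and is not taken from any vendor tool.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format one Intel HEX record into `out`.
 * Record layout: ':' LL AAAA TT DD... CC, all fields in upper-case hex.
 * The checksum CC is the two's complement of the 8-bit sum of all
 * preceding bytes (LL, both address bytes, TT and the data bytes).
 * Returns the number of characters written, or -1 if `out` is too small. */
int ihex_format_record(char *out, size_t out_size, uint8_t type,
                       uint16_t addr, const uint8_t *data, uint8_t len)
{
    size_t needed = 1 + 2 + 4 + 2 + (size_t)len * 2 + 2 + 1; /* includes NUL */
    if (out_size < needed)
        return -1;

    uint8_t sum = (uint8_t)(len + (addr >> 8) + (addr & 0xFF) + type);
    int pos = sprintf(out, ":%02X%04X%02X",
                      (unsigned)len, (unsigned)addr, (unsigned)type);
    for (uint8_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
        pos += sprintf(out + pos, "%02X", (unsigned)data[i]);
    }
    pos += sprintf(out + pos, "%02X", (unsigned)(uint8_t)(0x100u - sum));
    return pos;
}
```

For example, the three bytes 02 00 30 placed at address 0x0000 as a data record (type 00) give `:03000000020030CB`, and the end-of-file record (type 01, no data) is `:00000001FF`.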

For a beginner in the embedded software field, it is strongly recommended to use the
high level language based development technique. The reasons are as follows. Writing code
in a high level language is easy. Code written in a high level language is highly portable,
which means you can use the same code on a different processor/controller with little
or no modification. The only thing you need to do is re-compile the program with the
required processor’s IDE, after replacing the include files for that particular processor. Also,
programs written in high level languages are not developer dependent. Any skilled
programmer can trace out the functionality of the program by just having a look at the
program. It will be much easier if the source code contains the necessary comments and
documentation lines. It is very easy to debug, and the overall system development time will
be reduced to a great extent.

The embedded software development process in assembly language is tedious and
time consuming. The developer needs to know the entire instruction set of the
processor/controller, or at least should carry an instruction set reference manual with
her/him. A programmer using the assembly language technique writes the program according
to his/her own view and taste. Often he/she may implement in two or three instructions, in
his/her own style, a functionality which an experienced person would achieve with a single
instruction. So the program will be highly dependent on the
developer. It is very difficult for a second person to understand code written in Assembly
even if it is well documented.

We will discuss both approaches of embedded software development in detail in a later chapter
dealing with the design of embedded firmware. Two types of control algorithm design
exist in embedded firmware development. The first type of control algorithm development is
known as the infinite loop or ‘super loop’ based approach, where the control flow runs from
top to bottom and then jumps back to the top of the program in a conventional procedure.
It is similar to the while (1) { }; based technique in C. The second method deals with splitting
the functions to be executed into tasks and running these tasks using a scheduler which is
part of a General Purpose or Real Time Embedded Operating System (GPOS/RTOS). We will
discuss both of these approaches in separate chapters of this book.
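
A minimal sketch of the super loop structure follows. The three step functions are hypothetical stand-ins for reading input ports, running the control algorithm and driving output ports. A bounded runner is included only so that the sketch can be exercised on a host machine.

```c
#include <stdint.h>

/* Hypothetical application state; on real hardware these steps would
 * read input ports, run the control algorithm and drive output ports. */
struct app_state {
    uint32_t input;    /* last sampled input */
    uint32_t output;   /* last computed output */
    uint32_t passes;   /* number of completed loop passes */
};

static void read_inputs(struct app_state *s)   { s->input = s->passes; }      /* stand-in sensor read */
static void process(struct app_state *s)       { s->output = s->input * 2u; } /* stand-in control algorithm */
static void write_outputs(struct app_state *s) { s->passes++; }               /* stand-in actuator update */

/* One top-to-bottom pass of the super loop. */
void super_loop_pass(struct app_state *s)
{
    read_inputs(s);
    process(s);
    write_outputs(s);
}

/* On the target, the firmware entry point would simply be:
 *
 *     while (1) { super_loop_pass(&state); }
 *
 * The bounded variant below exists only for running on a host. */
void super_loop_run(struct app_state *s, uint32_t max_passes)
{
    while (s->passes < max_passes)
        super_loop_pass(s);
}
```

Every task runs once per pass and in a fixed order, so the time taken by one pass bounds how quickly the system can react to an input change.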

2.6 OTHER SYSTEM COMPONENTS

The other system components refer to the components/circuits/ICs which are
necessary for the proper functioning of the embedded system. Some of these circuits may
be essential for the proper functioning of the processor/controller and firmware execution.
Watchdog timer, reset IC (or passive circuit), brown-out protection IC (or passive circuit),
etc. are examples of circuits/ICs which are essential for the proper functioning of
processors/controllers. Some controllers or SoCs integrate these components within
a single IC and don’t require such components to be externally connected to the chip for proper
functioning. Depending on the system requirement, the embedded system may include other
integrated circuits for performing specific functions, level translator ICs for interfacing
circuits with different logic levels, etc. The following section explains the essential circuits
for the proper functioning of the processor/controller of the embedded system.

2.6.1 Reset Circuit

The reset circuit is essential to ensure that the device is not operating at a voltage
level where the device is not guaranteed to operate, during system power ON. The reset
signal brings the internal registers and the different hardware systems of the
processor/controller to a known state and starts the firmware execution from the reset
vector (normally from vector address 0x0000 for conventional processors/controllers; the
reset vector can be relocated to another address for processors/controllers supporting a
bootloader). The reset signal can be either active high (the processor undergoes reset when
the reset pin of the processor is at logic high) or active low (the processor undergoes reset
when the reset pin of the processor is at logic low). Since the processor operation is
synchronized to a clock signal, the reset pulse should be wide enough to give time for the
clock oscillator to stabilize before the internal reset state starts. The reset signal to the
processor can be applied at power ON through an external passive reset circuit comprising a
capacitor and resistor or through a standard reset IC like MAX810 from Maxim Dallas
(www.maxim-ic.com). Select the reset IC based on the type of reset signal and logic level
(CMOS/TTL) supported by the processor/controller in use. Some microprocessors/controllers
contain built-in internal reset circuitry and they don’t require external reset circuitry.

The figure illustrates a resistor-capacitor based passive reset circuit for the active high and
active low configurations:

RC based Reset Circuit
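
As a rough worked estimate of the reset pulse width, assume the active high configuration has capacitor C between Vcc and the reset pin and resistor R between the reset pin and ground. This topology, the ideal step in Vcc at power ON and all component values below are assumptions, since the figure is not reproduced here. Under these assumptions the reset pin starts at Vcc and decays as the capacitor charges, and the reset stays asserted until the pin falls below the logic high input threshold V_IH:

```latex
V_{RST}(t) = V_{CC}\, e^{-t/RC}
\qquad\Longrightarrow\qquad
t_{reset} = RC \,\ln\!\left(\frac{V_{CC}}{V_{IH}}\right)

% Example with assumed values R = 8.2 k\Omega, C = 10 \mu F, V_{CC} = 5 V, V_{IH} = 3.5 V:
RC = 8.2\times10^{3} \times 10\times10^{-6} = 82\ \text{ms},
\qquad
t_{reset} \approx 82\ \text{ms} \times \ln(1.43) \approx 82\ \text{ms} \times 0.357 \approx 29\ \text{ms}
```

The estimated width should be compared with the oscillator start-up time and the minimum reset pulse width given in the datasheet of the processor/controller in use.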

2.6.2 Brown-out Protection Circuit

The brown-out protection circuit prevents the processor/controller from unexpected program
execution behavior when the supply voltage to the processor/controller falls below a specified
voltage. It is essential for battery powered devices since there are greater chances for the
battery voltage to drop below the required threshold. The processor behavior may not be
predictable if the supply voltage falls below the recommended operating voltage. It may lead to
situations like data corruption. A brown-out protection circuit holds the processor/controller in
reset state when the operating voltage falls below the threshold, until it rises above the
threshold voltage. Certain processors/controllers support a built-in brown-out protection circuit
which monitors the supply voltage internally. If the processor/controller doesn’t integrate a
built-in brown-out protection circuit, the same can be implemented using external passive
circuits or supervisor ICs. The figure illustrates a brown-out circuit implementation using a Zener
diode and transistor for a processor/controller with active low reset logic.

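[Figure: Brown-out protection circuit with active low output, built from Zener diode DZ (Zener voltage Vz), transistor Q (base-emitter voltage VBE) and resistors R1, R2 and R3 between Vcc and GND; the output is the active low reset pulse]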

The Zener diode Dz and transistor Q form the heart of this circuit. The transistor
always conducts when the supply voltage Vcc is greater than the sum of VBE and Vz
(Zener voltage). The transistor stops conducting when the supply voltage falls below the sum
of VBE and Vz. Select the Zener diode with the required voltage for setting the low threshold value
for Vcc. The values of R1, R2, and R3 can be selected based on the electrical characteristics
(absolute maximum current and voltage ratings) of the transistor in use. A microprocessor
supervisor IC like DS1232 from Maxim Dallas (www.maxim-ic.com) also provides brown-out
protection.
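
The threshold rule described above (the transistor conducts while Vcc is greater than VBE + Vz) can be written as a small calculation. This is only a sketch: the function names are hypothetical, the voltages are in millivolts to avoid floating point, and the example values (VBE = 0.7 V, Vz = 3.3 V) are assumed for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Supply voltage below which the transistor stops conducting: VBE + Vz.
 * All voltages are expressed in millivolts. */
uint32_t brownout_threshold_mv(uint32_t vbe_mv, uint32_t vz_mv)
{
    return vbe_mv + vz_mv;
}

/* True when the circuit would hold the processor in reset,
 * that is, when Vcc has fallen below VBE + Vz. */
bool brownout_reset_asserted(uint32_t vcc_mv, uint32_t vbe_mv, uint32_t vz_mv)
{
    return vcc_mv < brownout_threshold_mv(vbe_mv, vz_mv);
}
```

With the assumed values the threshold is 4.0 V, so a 5.0 V supply runs normally while a supply that has sagged to 3.9 V holds the processor in reset.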

2.6.3 Oscillator Unit

A microprocessor/microcontroller is a digital device made up of digital combinational
and sequential circuits. The instruction execution of a microprocessor/controller occurs in
sync with a clock signal. It is analogous to the heartbeat of a living being, which synchronizes
the execution of life. For a living being the heart is responsible for the generation of the beat,
whereas the oscillator unit of the embedded system is responsible for generating the precise
clock for the processor. Certain processors/controllers integrate a built-in oscillator unit
and simply require an external ceramic resonator/quartz crystal for producing the necessary
clock signals. Quartz crystals and ceramic resonators are equivalent in operation; however,
they differ physically. A quartz crystal is normally mounted in a hermetically
sealed metal case with two leads protruding out of the case. Certain devices may not contain
a built-in oscillator unit and require the clock pulses to be generated and supplied
externally. Quartz crystal oscillators are available in the form of chips and they can be used
for generating the clock pulses in such cases. The speed of operation of a processor is
primarily dependent on the clock frequency. However, we cannot increase the clock
frequency blindly for increasing the speed of execution. The logic circuits inside the
processor always have an upper threshold value for the maximum clock at which the system
can run, beyond which the system becomes unstable and non-functional. The total system
power consumption is directly proportional to the clock frequency, so the power consumption
increases with increase in clock frequency. The accuracy of program execution depends on
the accuracy of the clock signal. The accuracy of the crystal oscillator or ceramic resonator
is normally expressed in terms of +/-ppm (parts per million).
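
To make the ppm figure concrete, the sketch below converts a tolerance in ppm into a worst-case frequency error and into a worst-case timekeeping drift per day. The function names and the example values (a 12 MHz crystal with a tolerance of +/-20 ppm) are assumptions chosen for illustration.

```c
/* Worst-case frequency error in hertz for a clock of nominal_hz
 * whose tolerance is +/- ppm parts per million. */
double clock_error_hz(double nominal_hz, double ppm)
{
    return nominal_hz * ppm / 1e6;
}

/* Worst-case timekeeping drift in seconds per day for the same tolerance. */
double clock_drift_seconds_per_day(double ppm)
{
    const double seconds_per_day = 24.0 * 60.0 * 60.0; /* 86400 */
    return seconds_per_day * ppm / 1e6;
}
```

For the assumed 12 MHz crystal at +/-20 ppm the frequency may be off by up to 240 Hz, and a clock derived from it may gain or lose up to about 1.73 seconds per day.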

The figure illustrates the usage of a quartz crystal/ceramic resonator and an external
oscillator chip for clock generation.

[Figure: Oscillator circuitry using quartz crystal and quartz crystal oscillator. Left: a microcontroller with a built-in oscillator unit connected to a quartz crystal resonator Y and two capacitors C. Right: a microprocessor whose clock input pin is driven by an external crystal oscillator unit. C: capacitor, Y: resonator]


2.6.4 Real-Time Clock (RTC)

The Real-Time Clock (RTC) is a system component responsible for keeping track of time.
The RTC holds information like current time (in hours, minutes and seconds) in 12 hour/24
hour format, date, month, year, day of the week, etc. and supplies a timing reference to the
system. The RTC is intended to function even in the absence of power. RTCs are available in the
form of integrated circuits from different semiconductor manufacturers like Maxim/Dallas,
ST Microelectronics, etc. The RTC chip contains a microchip for holding the time and date
related information and a backup battery cell for functioning in the absence of power, in a
single IC package. The RTC chip is interfaced to the processor or controller of the embedded
system. For operating system based embedded devices, a timing reference is essential for
synchronizing the operations of the OS kernel. The RTC can interrupt the OS kernel by
asserting the interrupt line of the processor/controller to which the RTC interrupt line is
connected. The OS kernel identifies the interrupt in terms of the Interrupt Request (IRQ)
number generated by an interrupt controller. One IRQ can be assigned to the RTC interrupt,
and the kernel can perform the necessary operations like updating the system date and time,
managing software timers, etc. when an RTC timer tick interrupt occurs. The RTC can be configured
to interrupt the processor at predefined intervals or to interrupt the processor when the RTC
register reaches a specified value (used as an alarm interrupt).
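
The tick and alarm behaviour described above can be sketched as the body of an interrupt handler. The sketch assumes a once-per-second tick and a 24 hour format. The structure and function names are hypothetical and do not correspond to any particular RTC chip or kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software copy of the time of day in 24 hour format. */
struct rtc_time {
    uint8_t hours;    /* 0..23 */
    uint8_t minutes;  /* 0..59 */
    uint8_t seconds;  /* 0..59 */
};

/* Hypothetical kernel-side state updated from the RTC interrupt. */
struct rtc_state {
    struct rtc_time now;
    struct rtc_time alarm;
    bool alarm_enabled;
    bool alarm_fired;
};

/* Handler body for a once-per-second RTC tick interrupt: advance the
 * time of day and raise the alarm flag when the time matches the alarm. */
void rtc_tick_handler(struct rtc_state *s)
{
    if (++s->now.seconds == 60) {
        s->now.seconds = 0;
        if (++s->now.minutes == 60) {
            s->now.minutes = 0;
            if (++s->now.hours == 24)
                s->now.hours = 0;
        }
    }
    if (s->alarm_enabled &&
        s->now.hours == s->alarm.hours &&
        s->now.minutes == s->alarm.minutes &&
        s->now.seconds == s->alarm.seconds)
        s->alarm_fired = true;
}
```

Starting from 23:59:58 with an alarm set for 00:00:00, the second tick rolls the time over to midnight and sets `alarm_fired`.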

2.6.5 Watchdog Timer

In desktop Windows systems, if we feel our application is behaving in an abnormal
way or if the system hangs up, we have ‘Ctrl + Alt + Del’ to come out of the situation.
What if it happens to our embedded system? Do we really have a ‘Ctrl + Alt + Del’ to take
control of the situation? Of course not, but we have a watchdog to monitor the firmware
execution and reset the system processor/microcontroller when the program execution
hangs up. A watchdog timer, or simply a watchdog, is a hardware timer for monitoring the
firmware execution. Depending on the internal implementation, the watchdog timer
increments or decrements a free running counter with each clock pulse and generates a
reset signal to reset the processor if the count reaches zero for a down counting watchdog,
or the highest count value for an up counting watchdog. If the watchdog counter is in the
enabled state, the firmware can write a zero (for an up counting watchdog implementation) to it
before starting the execution of a piece of code (a subroutine or portion of code which is
susceptible to execution hang up) and the watchdog will start counting. If the firmware
execution doesn’t complete, due to malfunctioning, within the time required by the watchdog
to reach the maximum count, the counter will generate a reset pulse and this will reset the
processor. If the firmware execution completes before the watchdog expires, the firmware can
reset the count by writing a zero (for an up counting watchdog) to the watchdog timer register.
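
The behaviour of an up counting watchdog can be modelled on a host machine as follows. This is a simplified sketch and not the register interface of any real device: the structure, the function names and the small maximum count are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side model of an up-counting watchdog timer. */
struct watchdog {
    uint16_t count;       /* free running counter */
    uint16_t max_count;   /* count value that triggers the reset */
    bool enabled;
    bool reset_asserted;  /* models the reset line to the processor */
};

/* Firmware restarts the count by writing zero to the counter. */
void watchdog_kick(struct watchdog *w)
{
    w->count = 0;
}

/* Called once per watchdog clock pulse. */
void watchdog_clock_pulse(struct watchdog *w)
{
    if (!w->enabled || w->reset_asserted)
        return;
    if (++w->count >= w->max_count)
        w->reset_asserted = true;
}
```

As long as the firmware calls `watchdog_kick` more often than every `max_count` clock pulses, the reset line is never asserted; if the firmware hangs and stops kicking, the counter reaches `max_count` and the model asserts reset.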

Most processors implement the watchdog as a built-in component and provide a
status register to control the watchdog timer (like enabling and disabling watchdog
functioning) and a watchdog timer register for writing the count value. If the
processor/controller doesn’t contain a built-in watchdog timer, the same can be
implemented using an external watchdog timer IC circuit. The external watchdog timer uses
hardware logic for enabling/disabling, resetting the watchdog count, etc. instead of the
firmware based ‘writing’ to the status and watchdog timer registers. The microprocessor
supervisor IC DS1232 integrates a hardware watchdog timer. In modern systems
running on embedded operating systems, the watchdog can be implemented in such a way
that when a watchdog timeout occurs, an interrupt is generated instead of resetting the
processor. The interrupt handler for this handles the situation in an appropriate fashion.

The figure illustrates the implementation of an external watchdog timer based
microprocessor supervisor circuit for a small scale embedded system.

[Figure: Watchdog timer for firmware execution supervision. A watchdog free running counter, driven by the system clock, drives the reset pin of the microprocessor/controller; the microprocessor/controller provides the watchdog reset signal to the counter]

2.7 PCB AND PASSIVE COMPONENTS

The Printed Circuit Board (PCB) is the backbone of every embedded system. After
finalizing the components and the inter-connections among them, a schematic design is
created and the PCB is fabricated according to the schematic. This will be described in detail in
a chapter dedicated to “Embedded Hardware Design and Development”. The PCB acts as a
platform for mounting all the necessary components as per the design requirement. It also
acts as a platform for testing your embedded firmware. Apart from the above-mentioned
important subsystems of an embedded system, you can find some passive electronic
components like resistors, capacitors, diodes, etc. on your board. They are the co-workers of
the various chips contained in your embedded hardware. They are essential for the proper
functioning of your embedded system. For example, for providing a regulated ripple-free
supply voltage to the system, a regulator IC and spike suppressor filter capacitors are
essential.

QUESTION BANK
1. Explain the 6 purposes of embedded systems with an example for each. (4)
2. Differentiate between (i) General Computing Systems and Embedded Systems and (ii)
RISC and CISC architectures. (4)
3. Explain the 3 classifications of embedded systems based on complexity and
performance. (6)
4. Mention the applications of embedded systems with an example for each. (4)
5. Explain the functions of an optocoupler and the SPI bus with diagrams. (3)
6. Write a note on embedded firmware. (4)
7. Explain SRAM design and features with a diagram. (6)
8. Explain USAT and SSAT with an example. (5)
9. Mention the role of the watchdog timer in an embedded system with relevant examples. (5)
10. Discuss the I2C communication interface with a neat diagram. (5)
11. Elaborate the working of the SPI bus with a neat interfacing diagram. (6)
12. Compare PLD, ASIC and COTS. (5)
13. Explain the working of a relay driver with a diagram. (4)
14. Explain the operation of UART. Compare UART and USB. (4)
15. Compare serial and parallel communication. (4)
16. List the various purposes of an embedded system. (3)
17. Write a note on the 1-wire bus. Explain its advantages and disadvantages. (4)
18. Write a note on embedded firmware. (5)
19. Explain the working of the following instructions: DMB, WFI, SVC. (6)
20. Explain the reset and brown-out protection circuits, their significance and
application in an embedded system. (5)

Module 4
CHARACTERISTICS OF AN EMBEDDED SYSTEM
Unlike general purpose computing systems, embedded systems possess certain specific
characteristics, and these characteristics are unique to each embedded system. Some of the
important characteristics of embedded systems are:

1. Application and domain specific
2. Reactive and real time
3. Operates in harsh environments
4. Distributed
5. Small size and weight
6. Power concerns

Application and domain specific


If you closely observe any embedded system, you will find that each embedded system
has certain functions to perform and is developed in such a manner as to do the
intended functions only. It cannot be used for any other purpose. This is the major criterion
which distinguishes an embedded system from a general purpose system.
For example, you cannot replace the embedded control unit of your microwave oven with your
air conditioner’s embedded control unit, because the embedded control units of the microwave
oven and the air conditioner are specifically designed to perform certain specific tasks.

Reactive and real time

Embedded systems are in constant interaction with the real world through sensors and
user-defined input devices which are connected to the input ports of the system.

Any change happening in the real world is captured by the sensors or input devices in real
time, and the control algorithm running inside the unit reacts in a designed manner to bring the
controlled output variables to the desired level. The event may be a periodic one or an
unpredicted one. If the event is an unpredicted one, the system should be designed in such a
way that it is scheduled to capture the event without missing it. Embedded systems
produce changes in the output in response to changes in the input, so they are generally
referred to as reactive systems.

Real time system operation means that the timing behavior of the system should be
deterministic, meaning the system should respond to requests or tasks in a known amount
of time. A real time system should not miss any deadlines for tasks or operations. It is not
necessary that all embedded systems should be real time in operation.

Operates in harsh environment

It is not necessary that all embedded systems are deployed in controlled environments.
The environment in which the embedded system is deployed may be a dusty one, a high
temperature zone or an area subject to vibrations and shock. Systems placed in such areas
should be capable of withstanding all these adverse operating conditions. The design should
take care of the operating conditions of the area where the system is going to be deployed.

Distributed

The term distributed means that an embedded system may be a part of a larger system. A
number of such distributed embedded systems form a single large embedded control unit.

An automatic vending machine is a typical example of this. The vending machine contains
a card reader, a vending unit, etc. Each of them is an independent embedded unit, but they
work together to perform the overall vending function.

Small size and weight

Product aesthetics is another important factor in choosing a product. For example, when
you plan to buy a new mobile phone, you make a comparative study of the pros and cons
of the products available in the market.

Definitely the product aesthetics will be one of the deciding factors in choosing a product.

Power concerns

Power management is another important factor that needs to be considered in designing
embedded systems. Embedded systems should be designed in such a way as to minimize
the heat dissipation by the system.

The production of a high amount of heat demands cooling arrangements like cooling fans, which
in turn occupy additional space and make the system bulky.

Operational quality attributes

The operational quality attributes represent the relevant quality attributes related to the
embedded system when it is in the operational mode or ‘online’ mode. They are:

1. Response
2. Throughput
3. Reliability
4. Maintainability
5. Security
6. Safety

Response: Response is a measure of the quickness of the system. It gives you an idea about
how fast your system is tracking the changes in input variables.

Most embedded systems demand a fast response which should be almost real time.

Throughput: Throughput deals with the efficiency of a system. In general it can be defined
as the rate of production or operation of a defined process over a stated period of time.

The rates can be expressed in terms of units of products, batches produced, or any other
meaningful measurements. Throughput is generally measured in terms of a ‘benchmark’.

Reliability: Reliability is a measure of how much, in percentage terms, you can rely upon the
proper functioning of the system, or what the percentage susceptibility of the system to failures is.

Mean time between failures (MTBF) and mean time to repair (MTTR) are the terms used in
defining system reliability. MTBF gives the frequency of failures in hours/weeks/months.

MTTR specifies how long the system is allowed to be out of order following a failure.

Maintainability: Maintainability deals with support and maintenance to the end user or
client in case of technical issues and product failure, or on the basis of a routine system
check-up. Reliability and maintainability are considered as two complementary disciplines.
A more reliable system means a system with less corrective maintainability requirements
and vice versa.

As the reliability of the system increases, the chances of failure and non-functioning
reduce, and thereby the need for maintainability is also reduced. Maintainability is closely
related to the system availability.
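
The link between reliability, maintainability and availability can be illustrated with the commonly used steady-state availability relation, availability = MTBF / (MTBF + MTTR). The notes above do not state this formula, so treat it, the function names and the example figures below as assumptions for illustration.

```c
/* Steady-state availability as a fraction between 0 and 1.
 * Both arguments must use the same time unit (for example hours). */
double availability(double mtbf, double mttr)
{
    return mtbf / (mtbf + mttr);
}

/* Expected downtime in hours over a year of 8760 hours
 * for a given availability fraction. */
double downtime_hours_per_year(double avail)
{
    return (1.0 - avail) * 8760.0;
}
```

With an assumed MTBF of 990 hours and MTTR of 10 hours the availability is 0.99, which corresponds to roughly 87.6 hours of downtime per year. Increasing MTBF (better reliability) or reducing MTTR (better maintainability) both raise the availability.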

Security: ‘Confidentiality’, ‘integrity’ and ‘availability’ are the three major measures of
information security. Confidentiality deals with the protection of data and application from
unauthorised disclosure.

Integrity deals with the protection of data and application from unauthorized modification.
Availability deals with the protection of data and application from unauthorized users. A very good
example of the security aspect in an embedded product is a Personal Digital Assistant (PDA).

Safety: Safety and security are two confusing terms. Sometimes you may feel both of them
are a single attribute, but they represent two unique aspects of quality attributes. Safety
deals with the possible damage that can happen to the operators, the public and the
environment due to the breakdown of an embedded system or due to the emission of
radioactive or hazardous material from the embedded product. Safety analysis is a must in
product engineering to evaluate the anticipated damage and determine the best course of
action to bring down the consequences of the damage to an acceptable level.

Non-Operational Quality Attributes

The quality attributes that need to be addressed for the product ‘not’ on the basis of
operational aspects are grouped under this category. The important quality attributes
coming under this category are listed below.

1. Testability & Debug-ability

2. Evolvability

3. Portability

4. Time to prototype and market

5. Per unit and total cost

Testability & Debug-ability

Testability deals with how easily one can test his/her design or application, and by which
means he/she can test it. For an embedded product, testability is applicable
to both the embedded hardware and firmware. Embedded hardware testing ensures that the
peripherals and the total hardware function in the desired manner, whereas firmware
testing ensures that the firmware is functioning in the expected way. Debug-ability is a means
of debugging the product as such for figuring out the probable sources that create the
unexpected behaviour in the total system. Debug-ability has two aspects in the embedded
system development context, namely, hardware level debugging and firmware level
debugging. Hardware debugging is used for figuring out the issues created by hardware
problems, whereas firmware debugging is employed to figure out the probable errors that appear
as a result of flaws in the firmware.

Evolvability: Evolvability is a term which is closely related to Biology, where it is referred
to as the non-heritable variation. For an embedded system, the quality attribute refers to the
ease with which the embedded product (including firmware and hardware) can be modified to take
advantage of new firmware or hardware technologies.

Portability

Portability is a measure of system independence. An embedded product is said to be portable
if the product is capable of functioning ‘as such’ in various environments, target
processors/controllers and embedded operating systems. The ease with which an embedded
product can be ported on to a new platform is a direct measure of the re-work required. A
standard embedded product should always be flexible and portable. In embedded products,
the term ‘porting’ represents the migration of the embedded firmware written for one target
processor (e.g. Intel x86) to a different target processor (say a Hitachi SH3 processor). If the
firmware is written in a high level language like ‘C’ with few target processor-specific
functions (operating system extensions or compiler specific utilities), it is very easy to port
the firmware to the new processor by replacing those ‘target processor-specific functions’
with the ones for the new target processor and re-compiling the program with the new target
processor specific settings. Re-compiling the program for the new target processor generates
the new target processor-specific machine code. If the firmware is written in Assembly
Language for a particular family of processors (say the x86 family), it will be very difficult to
translate the assembly language instructions to the new target processor specific language,
and so the portability is poor.
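
One common way to keep the ‘target processor-specific functions’ easy to replace is to hide them behind a small interface, so that the application logic never touches hardware directly. The sketch below illustrates the idea with a hypothetical LED port. All names are made up for illustration, and the target-specific layer shown is only a host stand-in.

```c
#include <stdint.h>

/* Target-independent interface: the only part the application sees. */
struct led_port {
    void (*write)(uint8_t value);   /* supplied by the target-specific layer */
};

/* Portable application logic: contains no processor-specific code. */
void blink_pattern(const struct led_port *port, uint8_t step)
{
    port->write((uint8_t)(1u << (step % 8u)));   /* light one LED out of eight */
}

/* Target-specific layer for a host build (hypothetical stand-in).
 * A port to a new processor would replace only this part, for example
 * with a function that writes to that processor's output port register. */
static uint8_t host_led_latch;
static void host_led_write(uint8_t value) { host_led_latch = value; }

const struct led_port host_led_port = { host_led_write };

uint8_t host_led_read(void) { return host_led_latch; }
```

Porting `blink_pattern` to another processor then means supplying a different `struct led_port` and re-compiling; the application function itself stays unchanged.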

If you look into the various programming languages used for developing desktop
applications, you will see that applications developed in certain languages run only
on specific operating systems, while others run independent of the desktop operating
system. For example, applications developed using Microsoft technologies (e.g. Microsoft
Visual C++ using Visual Studio) are capable of running only on Microsoft platforms and will
not function on other operating systems, whereas applications developed using ‘Java’ from
Sun Microsystems work on any operating system that supports Java standards.

Time-to-Prototype and Market

Time-to-market is the time elapsed between the conceptualisation of a product and the time
at which the product is ready for selling (for commercial products) or use (for non-commercial
products). The commercial embedded product market is highly competitive and time to market
the product is a critical factor in the success of a commercial embedded product. There may
be multiple players in the embedded industry who develop products of the same category
(like mobile phones, portable media players, etc.). If you come up with a new design and it
takes a long time to develop and market it, a competitor may take advantage of the delay
with their own product. Also, embedded technology is one where rapid technology change is
happening. If you start your design by making use of a new technology and it takes a long
time to develop and market the product, by the time you market the product, the technology
might have been superseded by a newer technology. Product prototyping helps a lot in reducing
time-to-market. Whenever you have a product idea, you may not be certain about the feasibility of
the idea. Prototyping is an informal kind of rapid development in which the important features
of the product under consideration are developed. The time to prototype is also another critical
factor. If the prototype is developed faster, the actual estimated development time can be
brought down significantly. In order to shorten the time to prototype, make use of all possible
options like the use of off-the-shelf components, re-usable assets, etc.

Per Unit Cost and Revenue

Cost is a factor which is closely monitored by both the end user (those who buy the product)
and the product manufacturer (those who build the product). Cost is a highly sensitive factor
for commercial products. Any failure to position the cost of a commercial product at a nominal
rate may lead to the failure of the product in the market. Proper market study and cost benefit
analysis should be carried out before taking a decision on the per-unit cost of the embedded
product. From a designer/product development company perspective the ultimate aim of
a product is to generate marginal profit. So the budget and total system cost should be
properly balanced to provide a marginal profit. Every embedded product has a product life cycle
which starts with the design and development phase. The product idea generation,
prototyping, definition, actual product design and development are the activities
carried out during this phase. During the design and development phase there is only investment
and no returns. Once the product is ready to sell, it is introduced to the market. This stage
is known as the Product Introduction stage. During the initial period the sales and revenue
will be low. There won’t be much competition and the product sales and revenue increase
with time. In the growth phase, the product grabs a high market share. During the maturity phase,
the growth and sales will be steady and the revenue reaches its peak. The Product
Retirement/Decline phase starts with the drop in sales volume, market share and revenue.
The decline happens due to various reasons like competition from similar products with
enhanced features or technology changes, etc. At some point of the decline stage, the
manufacturer announces discontinuing of the product. The different stages of the embedded
product life cycle (revenue, unit cost and profit in each stage) are represented in the
following Product Life-cycle graph.

“Product Life Cycle(PLC) curve”

From the graph, it is clear that the total revenue increases from the product introduction
stage to the product maturity stage. The revenue peaks at the maturity stage and starts
falling in the decline/retirement stage. The unit cost is very high during the introductory
stage. A typical example is the cell phone: if you buy a new model of cell phone during its launch
time, the price will be high, and you will get the same model at a much reduced price
three or four months after its launch. The profit increases with increase in sales, attains
a steady value and then falls with a dip in sales. You can see a negative value for profit
during the initial period. It is because during the product development phase there is only
investment and no returns. Profit occurs only when the total returns exceed the investment
and operating cost.
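
The relation between returns, investment and operating cost can be made concrete with a small
C sketch. The function names and the per-period arrays are assumptions introduced only for
illustration; real figures would come from the product's own accounts.

```c
/* Cumulative profit after `periods` sales periods:
 * total returns minus the initial investment and the operating cost. */
long plc_cumulative_profit(long investment, const long *revenue,
                           const long *operating_cost, int periods)
{
    long profit = -investment;   /* development phase: investment, no returns */
    for (int i = 0; i < periods; i++) {
        profit += revenue[i] - operating_cost[i];
    }
    return profit;
}

/* Index (0-based) of the first period at whose end the cumulative profit
 * becomes positive, or -1 if the product never breaks even. */
int plc_break_even_period(long investment, const long *revenue,
                          const long *operating_cost, int periods)
{
    long profit = -investment;
    for (int i = 0; i < periods; i++) {
        profit += revenue[i] - operating_cost[i];
        if (profit > 0) {
            return i;
        }
    }
    return -1;
}
```

With an investment of 100 and per-period net returns of 10, 40, 60 and 40, the cumulative
profit stays negative for the first two periods and turns positive in the third, which is the
negative initial region seen on the profit curve.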

Summary

1. There exists a set of characteristics which are unique to each embedded system.

2. Embedded systems are application and domain specific.

3. Quality attributes of a system represent the non-functional requirements that need to
be documented properly in any system design.

4. The operational quality attributes of an embedded system refer to the non-functional
requirements that need to be considered for the operational mode of the system. Response,
Throughput, Reliability, Maintainability, Security, Safety, etc. are examples of operational
quality attributes.

5. The non-operational quality attributes of an embedded system refer to the non-functional
requirements that need to be considered for the non-operational mode of the
system. Testability, debug-ability, evolvability, portability, time-to-prototype and market,
per unit cost and revenue, etc. are examples of non-operational quality attributes.

6. The product life cycle curve (PLC) is the graphical representation of the unit cost, product
sales and profits with respect to the various life cycle stages of the product starting from
conception to disposal.

7. For a commercial embedded product, the unit cost peaks at the introductory stage and
falls in the maturity stage.

8. The revenue of a commercial embedded product is at its peak during the maturity stage.

Keywords

Quality attributes : The non-functional requirements that need to be addressed in any
system design

Reactive system : An embedded system which produces changes in output in response to
the changes in input

Real-Time system : A system which adheres to strict timing behaviour and responds to
requests in a known amount of time

Response : It is a measure of the quickness of the system

Throughput : The rate of production or operation of a defined process over a stated period
of time

Reliability : It is a measure of how much (in %) one can rely upon the proper functioning
of the system

EMBEDDED SYSTEMS - APPLICATION- AND DOMAIN-SPECIFIC
As mentioned in the previous chapter on the characteristics of embedded systems,
embedded systems are application and domain specific, meaning that they are specifically built
for certain applications in certain domains like consumer electronics, telecom, automotive,
industrial control, etc. In general purpose computing, it is possible to replace a system with
another system which closely matches the existing system, whereas this is not the
case with embedded systems. Hence it is not possible to replace an embedded system
developed for a specific application in a specific domain with another embedded system
designed for a different application or domain. The following sections give some idea of
the application and domain specific characteristics of embedded systems.

4.1 WASHING MACHINE - APPLICATION-SPECIFIC EMBEDDED SYSTEM

People experience the power of embedded systems and enjoy the features and comfort
provided by them, but they are totally unaware or ignorant of the intelligent embedded
players working behind the products providing enhanced features and comfort. The washing
machine is a typical example of an embedded system providing extensive support in home
automation applications (Fig. 4.1).

fig 4.1

As mentioned in an earlier chapter, an embedded system contains sensors, actuators, a control
unit and application-specific user interfaces like keyboards, display units, etc. You can see
all these components in a washing machine if you have a closer look at it. Some of them are
visible and some of them may be invisible to you.

The actuator part of the washing machine consists of a motorized agitator, tumble tub,
water drawing pump and inlet valve. The sensor part consists of the water temperature sensor,
level sensor, etc. The control part contains a microprocessor/controller based board with
interfaces to the sensors and actuators. The sensor data is fed back to the control unit. The
control unit also provides connectivity to user interfaces like the keypad for setting the washing
time and selecting the type of cloth to be washed, like light, medium, heavy duty, etc. User
feedback is reflected through the display unit and LEDs connected to the control board. The
functional block diagram of a washing machine is shown in Fig. 4.2.

fig 4.2

Washing machines come in two models, namely, top loading and front loading
machines. In top loading models the agitator of the machine twists back and forth and
pulls the clothes down to the bottom of the tub. On reaching the bottom of the tub the clothes
work their way back up to the top of the tub, where the agitator grabs them again and
repeats the mechanism. In front loading machines, the clothes are tumbled and plunged
into the water over and over again. This is the first phase of washing.

In the second phase of washing, water is pumped out from the tub and the inner tub
uses centrifugal force to wring out more water from the clothes by spinning at several
hundred Rotations Per Minute (RPM). This is called the ‘Spin phase’. If you look at the
keyboard panel of your washing machine you can see three buttons, namely, Wash, Spin
and Rinse. You can use these buttons to configure the washing stages. As you can see
from the picture, the inner tub of the machine contains a number of holes. During the
spin cycle the inner tub spins and forces the water out through these holes to the
stationary outer tub, from which it is drained off through the outlet.

It is to be noted that the design of washing machines may vary from manufacturer to
manufacturer, but the general principle underlying the working of the washing machine
remains the same. The basic controls consist of a timer, cycle selector mechanism, water
temperature selector, load size selector and start button. The mechanism includes the
motor, transmission, clutch, pump, agitator, inner tub, outer tub and water inlet valve.
The water inlet valve connects to the water supply line of the home and regulates the flow of
water into the tub.

The integrated control panel consists of a microprocessor/controller based board with
I/O interfaces and a control algorithm running in it. The input interface includes the keyboard,
which consists of the wash type selector (Wash, Spin and Rinse), the cloth type selector
(Light, Medium and Heavy duty), the washing time setting, etc. The output interface
consists of LED/LCD displays, status indication LEDs, etc. connected to the I/O bus of
the controller. It is to be noted that this interface may vary with manufacturer and model.
The other types of I/O interfaces, which are invisible to the end user, are different kinds of
sensor interfaces (water temperature sensor, water level sensor, etc.) and the actuator
interface, including motor control for agitator and tub movement control, inlet water flow
control, etc.
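
The control algorithm running on such a board can be pictured as a small state machine that
reads the sensors and keypad on every tick and drives the actuators. The C sketch below is a
simplified illustration, not any manufacturer's firmware: the type and function names (wm_step,
wm_inputs_t and so on), the six states and the tick-based timing are all assumptions, and the
rinse stage is left out for brevity.

```c
#include <stdbool.h>

typedef enum { WM_IDLE, WM_FILL, WM_WASH, WM_DRAIN, WM_SPIN, WM_DONE } wm_state_t;

/* Keypad and sensor inputs sampled once per control tick (hypothetical). */
typedef struct {
    bool start_pressed;   /* start button on the keypad              */
    bool tub_full;        /* water level sensor: fill level reached  */
    bool tub_empty;       /* water level sensor: tub drained         */
} wm_inputs_t;

/* Actuator commands produced by the control unit. */
typedef struct {
    bool inlet_valve_open;
    bool agitator_on;
    bool drain_pump_on;
    bool spin_motor_on;
} wm_outputs_t;

typedef struct {
    wm_state_t state;
    unsigned   ticks_left;   /* remaining ticks of a timed phase  */
    unsigned   wash_ticks;   /* washing time selected by the user */
    unsigned   spin_ticks;   /* spin time selected by the user    */
} wm_controller_t;

void wm_init(wm_controller_t *c, unsigned wash_ticks, unsigned spin_ticks)
{
    c->state = WM_IDLE;
    c->ticks_left = 0;
    c->wash_ticks = wash_ticks;
    c->spin_ticks = spin_ticks;
}

/* One control iteration: read inputs, update the state, drive the actuators. */
wm_outputs_t wm_step(wm_controller_t *c, wm_inputs_t in)
{
    switch (c->state) {
    case WM_IDLE:
        if (in.start_pressed) c->state = WM_FILL;
        break;
    case WM_FILL:
        if (in.tub_full) { c->state = WM_WASH; c->ticks_left = c->wash_ticks; }
        break;
    case WM_WASH:
        if (c->ticks_left > 0) c->ticks_left--;
        if (c->ticks_left == 0) c->state = WM_DRAIN;
        break;
    case WM_DRAIN:
        if (in.tub_empty) { c->state = WM_SPIN; c->ticks_left = c->spin_ticks; }
        break;
    case WM_SPIN:
        if (c->ticks_left > 0) c->ticks_left--;
        if (c->ticks_left == 0) c->state = WM_DONE;
        break;
    case WM_DONE:
        break;
    }

    /* Actuator outputs depend only on the state reached in this tick. */
    wm_outputs_t out = {
        .inlet_valve_open = (c->state == WM_FILL),
        .agitator_on      = (c->state == WM_WASH),
        .drain_pump_on    = (c->state == WM_DRAIN || c->state == WM_SPIN),
        .spin_motor_on    = (c->state == WM_SPIN),
    };
    return out;
}
```

A real controller would call a function like wm_step from a periodic timer and would also
handle temperature, load size, rinse cycles and fault conditions.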

4.2 AUTOMOTIVE - DOMAIN-SPECIFIC EXAMPLES OF EMBEDDED SYSTEM
The major application domains of embedded systems are consumer, industrial,
automotive, telecom, etc., of which the telecom and automotive industries hold a big market
share.

Figure 4.3 gives an overview of the various types of electronic control units employed in
automotive applications.

4.2.1 Inner workings of automotive embedded systems


Automotive embedded systems are the ones where electronics take control over the
mechanical systems. The presence of automotive embedded systems in a vehicle varies from
simple mirror and wiper controls to complex air bag controllers and antilock brake systems
(ABS). Automotive embedded systems are normally built around microcontrollers or DSPs
or a hybrid of the two and are generally known as Electronic Control Units (ECUs). The
number of embedded controllers in an ordinary vehicle varies from 20 to 40, whereas a
luxury vehicle like the Mercedes S and BMW 7 may contain 75 to 100 embedded
controllers. Government regulations on fuel economy, environmental factors and emission
standards, and increasing customer demands on safety, comfort and infotainment force the
automotive manufacturers to opt for sophisticated embedded control units within the vehicle.
The first embedded system used in an automotive application was the microprocessor based
fuel injection system introduced in the Volkswagen 1600 in 1968. The various types of
electronic control units (ECUs) used in the automotive embedded industry can be broadly
classified into two: High-speed embedded control units and Low-speed embedded control
units.

4.2.1.1 High-Speed Electronic Control Units (HECUs): High-speed electronic control
units (HECUs) are deployed in critical control units requiring fast response. They include
fuel injection systems, antilock brake systems, engine control, electronic throttle,
steering controls, transmission control unit and central control unit.

4.2.1.2 Low-Speed Electronic Control Units (LECUs): Low-speed electronic control units
(LECUs) are deployed in applications where response time is not so critical. They are generally
built around low-cost microcontrollers/microprocessors and digital signal processors. Audio
controllers, passenger and driver door locks, door glass controls (power windows) and wiper
control are examples of LECUs.

4.2.2 Automotive Communication Buses


Automotive applications make use of serial buses for communication, which greatly reduces the
amount of wiring required inside a vehicle. The following section will give you an overview of the
different types of serial interface buses deployed in automotive embedded applications.

4.2.2.1 Controller Area Network (CAN): The CAN bus was originally proposed by Robert
Bosch, a pioneer among automotive embedded solution providers. It supports medium speed
(ISO 11519, class B, with data rates up to 125 Kbps) and high speed (ISO 11898, class C, with data
rates up to 1 Mbps) data transfer. CAN is an event-driven protocol interface with support for error
handling in data transmission. It is generally employed in safety systems like airbag control;
power train systems like engine control and Antilock Brake System (ABS); and navigation
systems like GPS. The protocol format and interface application development for the CAN bus will be
explained in detail in another volume of this book series.

4.2.2.2 Local Interconnect Network (LIN): The LIN bus is a single master, multiple slave (up to 16
independent slave nodes) communication interface. LIN is a low speed, single wire
communication interface with support for data rates up to 20 Kbps and is used for
sensor/actuator interfacing. The LIN bus follows the master communication triggering technique to
eliminate the possible bus arbitration problem that can occur by the simultaneous talking of
different slave nodes connected to a single interface bus. The LIN bus is employed in applications
like mirror controls, fan controls, seat positioning controls, window controls, and position
controls where response time is not a critical issue.
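
Master-triggered communication can be sketched in C as below. This is a host-side simulation
written only to show why arbitration is not needed; it does not model the real LIN frame
format, and the names lin_slave_t, lin_master_poll and lin_run_schedule as well as the frame
identifiers are assumptions made for the example.

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_SLAVES 16   /* limit of 16 slave nodes per master, as stated in the text */

/* A simulated slave: it answers only when the master's header carries its ID. */
typedef struct {
    uint8_t frame_id;   /* identifier this slave responds to (hypothetical) */
    uint8_t data;       /* e.g. a mirror position or fan speed reading      */
} lin_slave_t;

/* The master sends a header with frame_id. Returns how many slaves answered
 * and stores the response byte. With unique IDs the count is at most 1, so
 * two slaves never talk on the single wire at the same time. */
int lin_master_poll(const lin_slave_t *slaves, size_t n,
                    uint8_t frame_id, uint8_t *response)
{
    int responders = 0;
    for (size_t i = 0; i < n && i < MAX_SLAVES; i++) {
        if (slaves[i].frame_id == frame_id) {
            *response = slaves[i].data;
            responders++;
        }
    }
    return responders;
}

/* The master walks a fixed schedule table, issuing one header per slot.
 * Returns the number of response bytes collected into out[]. */
size_t lin_run_schedule(const lin_slave_t *slaves, size_t n,
                        const uint8_t *schedule, size_t slots, uint8_t *out)
{
    size_t received = 0;
    for (size_t s = 0; s < slots; s++) {
        uint8_t byte;
        if (lin_master_poll(slaves, n, schedule[s], &byte) == 1) {
            out[received++] = byte;
        }
    }
    return received;
}
```

Because a slave transmits only in the slot the master opens for it, the order and timing of
the traffic are fixed by the schedule table alone.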

4.2.2.3 Media-Oriented System Transport (MOST) Bus: The Media-Oriented System Transport
(MOST) bus is targeted for automotive audio/video equipment interfacing, used primarily in
European cars. A MOST bus is a multimedia fibre-optic point-to-point network implemented in
a star, ring or daisy-chained topology over optical fibre cables. The MOST bus specifications
define the physical (electrical and optical parameters) layer as well as the application layer,
network layer, and media access control. MOST bus is an optical fibre cable connected between
the Electrical Optical Converter (EOC) and Optical Electrical Converter (OEC), which would
translate into the optical cable MOST bus.

4.2.3 Key Players of the Automotive Embedded Market

The key players of the automotive embedded market can be visualized in three verticals namely,
silicon providers, solution providers and tools and platform providers.

4.2.3.1 Silicon Providers: Silicon providers are responsible for providing the necessary chips
which are used in the control application development. The chip may be a standard product like
a microcontroller, DSP or ADC/DAC chip. Some applications may require specific chips and
they are manufactured as Application Specific Integrated Chips (ASICs). The leading silicon
providers in the automotive industry are:

Analog Devices (www.analog.com): Provider of world class digital signal processing chips,
precision analog microcontrollers, programmable inclinometers/accelerometers, LED drivers, etc.
for automotive signal processing applications, driver assistance systems, audio systems,
GPS/navigation systems, etc.

Xilinx (www.xilinx.com): Supplier of high performance FPGAs,
CPLDs and automotive specific IP cores for GPS navigation systems, driver information systems,
distance control, collision avoidance, rear seat entertainment, adaptive cruise control, voice
recognition, etc.

Atmel (www.atmel.com): Supplier of cost-effective high-density Flash controllers and memories.
Atmel provides a series of high performance microcontrollers, namely, ARM, AVR, and 80C51.
A wide range of Application Specific Standard Products (ASSPs) for chassis, body electronics,
security, safety and car infotainment and automotive networking products for CAN, LIN and
FlexRay are also supplied by Atmel.

Maxim/Dallas (www.maxim-ic.com): Supplier of world class analog, digital and mixed signal
products (Microcontrollers, ADC/DAC, amplifiers, comparators, regulators, etc), RF
components, etc. for all kinds of automotive solutions.

NXP Semiconductors (www.nxp.com): Supplier of 8/16/32-bit Flash microcontrollers.


Renesas (www.renesas.com): Provider of high speed microcontrollers and Large Scale
Integration (LSI) technology for car navigation systems accommodating three transfer speeds:
high, medium and low.

Texas Instruments (www.ti.com): Supplier of microcontrollers, digital signal processors and
automotive communication control chips for Local Interconnect Network (LIN) bus products.

Fujitsu (www.fmal.fujitsu.com): Supplier of fingerprint sensors for security applications, graphic
display controllers for instrumentation applications, AGPS/GPS for vehicle navigation systems and
different types of microcontrollers for automotive control applications.

Infineon (www.infineon.com): Supplier of high performance microcontrollers and customised
application specific chips.

NEC (www.nec.co.jp): Provider of high performance microcontrollers.

There are lots of other silicon manufacturers which provide various automotive support systems
like power supplies, sensors/actuators, optoelectronics, etc. Describing all of them is out of the
scope of this book. Readers are requested to use the Internet for finding more information on
them.

4.2.3.2 Tools and Platform Providers: Tools and platform providers are manufacturers and
suppliers of various kinds of development tools and Real Time Embedded Operating Systems for
developing and debugging different control unit related applications. Tools fall into two
categories, namely embedded software application development tools and embedded hardware
development tools. Sometimes the silicon suppliers provide the development suite for application
development using their chip. Some third party suppliers may also provide development kits
and libraries. Some of the leading suppliers of tools and platforms in automotive embedded
applications are listed below.

ENEA (www.enea.com): ENEA Embedded Technology is the developer of the OSE Real-Time
operating system. The OSE RTOS supports both CPU and DSP and has also been specially
developed to support multi-core and fault-tolerant system development.

The MathWorks (www.mathworks.com): It is the world's leading developer and supplier of
technical software. It offers a wide range of tools, consultancy and training for numeric
computation, visualization, modelling and simulation across many different industries.
MathWorks’ breakthrough product is MATLAB, a high-level programming language and environment
for technical computation and numerical analysis. Together MATLAB, Simulink, Stateflow and
Real-Time Workshop provide top quality tools for data analysis, test & measurement, application
development and deployment, image processing and development of dynamic and reactive
systems for DSP and control applications.

Keil Software (www.keil.com): The Integrated Development Environment Keil µVision from
Keil Software is a powerful embedded software design tool for the 8051 & C166 families of
microcontrollers.

Lauterbach (http://www.lauterbach.com/): It is the world’s number one supplier of debug tools,
providing support for processors from multiple silicon vendors in the automotive market.

ARTiSAN (www.artisansw.com): It is the leading supplier of collaborative modelling tools for
requirement analysis, specification, design and development of complex applications.

Microsoft (www.microsoft.com): It is a platform provider for automotive embedded applications.
Microsoft’s Windows CE is a powerful RTOS platform for automotive applications. Automotive
features are included in the new WinCE version for providing support for automotive application
developers.

4.2.3.3 Solution Providers: Solution providers supply OEM and complete solutions for
automotive applications making use of the chips, platforms and different development tools. The
major players of this domain are listed below.

Bosch Automotive (www.boschindia.com): Bosch provides complete automotive solutions
ranging from body electronics, diesel engine control, gasoline engine control and powertrain
systems to safety systems, in-car navigation systems and infotainment systems.

DENSO Automotive (www.globaldensoproducts.com): Denso is an Original Equipment
Manufacturer (OEM) and solution provider for engine management, climate control, body
electronics, driving control & safety, hybrid vehicles, embedded infotainment and
communications.

Infosys Technologies (www.infosys.com): Infosys is a solution provider for automotive
embedded hardware and software. Infosys provides the competitive edge in integrating
technology change through cost-effective solutions.

Delphi (www.delphi.com): Delphi is a complete solution provider for engine control, safety,
infotainment, etc., and an OEM for spark plugs, bearings, etc.

There are many more, and the list above is incomplete. Describing all providers is out of the
scope of this book.

Hardware Software Co-Design and Program Modelling
In the traditional embedded system development approach, the hardware software
partitioning is done at an early stage. Engineers from the software group take care of the
software architecture development and implementation, whereas engineers from the
hardware group are responsible for building the hardware required for the product. There
is less interaction between the two teams and the development happens either serially or in
parallel. Once the hardware and software are ready, the integration is performed. The
increasing competition in the commercial market and the need for reduced ‘time-to-market’ of the
product call for a novel approach for embedded system design in which the hardware and
software are co-developed instead of being developed independently.
During the co-design process, the product requirements captured from the customer are
converted into system level needs or processing requirements. At this point of time a requirement
is not segregated as either a hardware requirement or a software requirement; instead it is
specified as a functional requirement. The system level processing requirements are then
transferred into functions which can be simulated and verified against performance and
functionality. The architecture design follows the system design. The partition of system level
processing requirements into hardware and software takes place during the architecture design phase.
Each system level processing requirement is mapped as either a hardware and/or software
requirement. The partitioning is performed based on the hardware-software trade-offs. We
will discuss the various hardware software trade-offs in hardware software co-design in a
separate topic. The architectural design results in the detailed behavioural description of
the hardware requirement and the definition of the software required for the hardware. The
processing requirement behaviour is usually captured using computational models and
ultimately the models representing the software processing requirements are translated into
firmware implementation using programming languages.

7.1 FUNDAMENTAL ISSUES IN HARDWARE SOFTWARE CO-DESIGN

The hardware software co-design is a problem statement and when we try to solve this
problem statement in real life we may come across multiple issues in the design. The
following section illustrates some of the fundamental issues in hardware software co-design.
Selecting the model: In hardware software co-design, models are used for capturing and
describing the system characteristics. A model is a formal system consisting of objects and
composition rules. It is hard to make a decision on which model should be followed in a
particular system design. Most often designers switch between a variety of models from the
requirements specification to the implementation aspect of the system design, the reason
being that the objective varies with each phase. For example, at the specification stage, only
the functionality of the system is in focus and not the implementation information. When
the design moves to the implementation aspect, the information about the system
components is revealed and the designer has to switch to a model capable of capturing the
system’s structure. We will discuss the different models in a later section of this
chapter.
Selecting the Architecture: A model only captures the system characteristics and does not
provide information on ‘how the system can be manufactured?’. The architecture specifies
how a system is going to be implemented in terms of the number and types of different
components and the interconnection among them. Controller architecture, Data path
architecture, Complex Instruction Set Computing (CISC), Reduced Instruction Set
Computing (RISC), Very Long Instruction Word Computing (VLIW), Single Instruction
Multiple Data (SIMD), Multiple Instruction Multiple Data (MIMD), etc. are the commonly
used architectures in system design. Some of them fall into the Application Specific
Architecture class (like controller architecture), while others fall into either the general purpose
architecture class (CISC, RISC, etc.) or the Parallel processing class (like VLIW, SIMD, MIMD,
etc.).
The controller architecture implements the finite state machine model (which we will discuss
in a later section) using a state register and two combinational circuits (we will discuss
combinational circuits in a later chapter). The state register holds the present state
and the combinational circuits implement the logic for next state and output.
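
The split into a state register and two combinational circuits can be mimicked in C with one
stored variable and two pure functions. The sketch below is an assumed example, not a circuit
from the text: it detects three consecutive 1s on a one-bit input, and all names (fsm_state_t,
next_state_logic, output_logic, controller_clock) are hypothetical.

```c
typedef enum { S0, S1, S2 } fsm_state_t;   /* number of consecutive 1s seen, capped at 2 */

/* Combinational circuit 1: next-state logic, a pure function of
 * the present state and the current input. */
static fsm_state_t next_state_logic(fsm_state_t present, int in)
{
    if (!in) return S0;
    return (present == S0) ? S1 : S2;
}

/* Combinational circuit 2: output logic. The output is 1 when the
 * current input completes a run of at least three 1s. */
static int output_logic(fsm_state_t present, int in)
{
    return (present == S2) && in;
}

/* The state register: the only element that stores anything. */
typedef struct { fsm_state_t state_register; } controller_t;

/* One clock edge: the output is taken from the present state and input,
 * then the register is loaded with the computed next state. */
int controller_clock(controller_t *c, int in)
{
    int out = output_logic(c->state_register, in);
    c->state_register = next_state_logic(c->state_register, in);
    return out;
}
```

Feeding the input sequence 1, 1, 1, 1, 0, 1, 1, 1 into a controller that starts in S0 yields the
outputs 0, 0, 1, 1, 0, 0, 0, 1.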
The data path architecture is best suited for implementing the data flow graph model where
the output is generated as a result of a set of predefined computations on the input data. A
data path represents a channel between the input and output. In the data path architecture
the data path may contain registers, counters, register files, memories and ports along with
high speed arithmetic units. Ports connect the data path to multiple buses. Most of the time
the arithmetic units are connected in parallel with pipelining support for bringing high
performance.

The Finite State Machine Data path (FSMD) architecture combines the controller
architecture with the data path architecture. It implements a controller with a data path. The
controller generates the control input whereas the data path processes the data. The data
path contains two types of I/O ports, out of which one acts as the control port for
receiving/sending the control signals from/to the controller unit and the second I/O port
interfaces the data path with the external world for data input and data output. Normally the
data path is implemented in a chip and the I/O pins of the chip act as the data input
output ports for the chip resident data path.
The Complex Instruction Set Computing (CISC) architecture uses an instruction set
representing complex operations. It is possible for a CISC instruction set to perform a large
complex operation (e.g. reading a register value, comparing it with a given value and
then transferring the program execution to a new address location, as the CJNE instruction of
the 8051 ISA does) with a single instruction. The use of a single complex instruction in place of
multiple simple instructions greatly reduces the program memory access and program
memory size requirement. However, it requires additional silicon for implementing the
microcode decoder for decoding the CISC instruction. The data path for the CISC processor
is complex. On the other hand, the Reduced Instruction Set Computing (RISC) architecture uses
an instruction set representing simple operations and it requires the execution of multiple
RISC instructions to perform a complex operation. The data path of the RISC architecture
contains a large register file for storing the operands and output. The RISC instruction set is
designed to operate on registers. The RISC architecture supports extensive pipelining.
The Very Long Instruction Word (VLIW) architecture implements multiple functional units
(ALUs, multipliers, etc.) in the data path. The VLIW instruction packages one standard
instruction per functional unit of the data path.
Parallel processing architecture implements multiple concurrent Processing Elements (PEs)
and each processing element may have an associated data path containing registers and local
memory. Single Instruction Multiple Data (SIMD) and Multiple Instruction Multiple Data
(MIMD) architectures are examples of parallel processing architecture. In the SIMD
architecture, a single instruction is executed in parallel with the help of the Processing
Elements. The scheduling of the instruction execution and the controlling of each PE is
performed through a single controller. The SIMD architecture forms the basis of the
re-configurable processor (we will discuss re-configurable processors in a later chapter).
On the other hand, the processing elements of the MIMD architecture execute different
instructions at a given point of time. The MIMD architecture forms the basis of the
multiprocessor system. The PEs in a multiprocessor system communicate through
mechanisms like shared memory and message passing.
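
The difference between the two classes can be sketched in C as below. The loops run one PE
after another on a host machine, so the sketch shows only which instruction each PE receives,
not true concurrency. NUM_PE, the instruction_t type and the four sample operations are
assumptions chosen for the illustration.

```c
#include <stddef.h>

#define NUM_PE 4   /* number of processing elements (assumed) */

typedef int (*instruction_t)(int);   /* an "instruction" acting on a PE's local data */

int op_double(int x)    { return 2 * x; }
int op_increment(int x) { return x + 1; }
int op_negate(int x)    { return -x; }
int op_square(int x)    { return x * x; }

/* SIMD: a single controller issues ONE instruction and every PE
 * applies it to its own local data. */
void simd_execute(instruction_t instr, int local_data[NUM_PE])
{
    for (size_t pe = 0; pe < NUM_PE; pe++) {
        local_data[pe] = instr(local_data[pe]);
    }
}

/* MIMD: every PE has its own instruction, so different PEs perform
 * different operations at the same point of time. */
void mimd_execute(const instruction_t program[NUM_PE], int local_data[NUM_PE])
{
    for (size_t pe = 0; pe < NUM_PE; pe++) {
        local_data[pe] = program[pe](local_data[pe]);
    }
}
```

Calling simd_execute(op_double, data) on {1, 2, 3, 4} gives {2, 4, 6, 8}, whereas
mimd_execute with the program {op_double, op_increment, op_negate, op_square} on the same
data gives {2, 3, -3, 16}.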
Selecting the language: A programming language captures a ‘Computational Model’ and
maps it into architecture. There is no hard and fast rule specifying that a particular language
should be used for capturing a particular model. A model can be captured using multiple programming
languages like C, C++, C#, Java, etc. for software implementations and languages like VHDL,
SystemC, Verilog, etc. for hardware implementations. On the other hand, a single language
can be used for capturing a variety of models. Certain languages are good at capturing
certain computational models. For example, C++ is a good candidate for capturing an object
oriented model. The only pre-requisite in selecting a programming language for capturing a
model is that the language should capture the model easily.
Partitioning System Requirements into hardware and software: So far we have discussed
the models for capturing the system requirements and the architecture for implementing
the system. From an implementation perspective, it may be possible to implement the system
requirements in either hardware or software (firmware). It is a tough decision making task
to figure out which one to opt for. Various hardware software trade-offs are used for making a
decision on the hardware-software partitioning. We will discuss them in detail in a later section
of this chapter.

7.20 COMPUTATIONAL MODELS IN EMBEDDED DESIGN


Data Flow Graph (DFG) model, State Machine model, Concurrent Process model, Sequential
Program model, Object Oriented model, etc. are the commonly used computational models
in embedded system design. The following sections give an overview of these models.

7.1.1 Data Flow Graph/Diagram (DFG) Model


The Data Flow Graph (DFG) model translates the data processing requirements into a data
flow graph. It is a data driven model in which the program execution is determined by data.
This model emphasises the data and the operations on the data which transform the input
data to output data. The DFG is a visual model in which the operation on the data (process)
is represented using a block (circle) and data flow is represented using arrows. An inward
arrow to the process (circle) represents input data and an outward arrow from the process
(circle) represents output data in DFG notation.

Embedded applications which are computationally intensive and data driven are modelled
using the DFG model. DSP applications are typical examples. Now let’s have a look at
the implementation of a DFG. Suppose one of the functions in our application contains
the computational requirement x = a + b; and y = x - c. Figure 7.1 illustrates the
implementation of a DFG model for these requirements.

Figure 7.1 Data flow graph (DFG) model

In a DFG model, a data path is the data flow path from input to output. A DFG model is said
to be an acyclic DFG (ADFG) if it doesn’t contain multiple values for the input variable and
multiple output values for a given set of input(s). Feedback inputs (output is fed back to
input), events, etc. are examples of non-acyclic inputs. A DFG model translates the program
as a single sequential process execution.
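
As a rough illustration, the data path of such a DFG can be written in C with one function per
node. This is only a sketch, assuming the requirement reads x = a + b and y = x - c; the
function names dfg_add, dfg_sub, dfg_compute and dfg_self_check are hypothetical and are not
taken from the text.

```c
#include <assert.h>

/* Node 1: consumes inputs a and b, produces x = a + b. */
int dfg_add(int a, int b)
{
    return a + b;
}

/* Node 2: consumes x and c, produces y = x - c. */
int dfg_sub(int x, int c)
{
    return x - c;
}

/* Data path from the inputs (a, b, c) to the output y.
   There is no feedback from output to input, so the graph is acyclic. */
int dfg_compute(int a, int b, int c)
{
    int x = dfg_add(a, b);   /* the arrow from node 1 to node 2 carries x */
    return dfg_sub(x, c);
}

/* Small usage example with arbitrarily chosen input values. */
void dfg_self_check(void)
{
    assert(dfg_compute(5, 3, 2) == 6);   /* (5 + 3) - 2 */
}
```

Each function fires only when its input data is available, which is the sense in which the
model is data driven: the order of execution follows the arrows, not a control decision.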

7.2.2. Control Data Flow Graph/Diagram (CDFG)

Figure 7.2 Control Data Flow Graph (CDFG) model (control node: flag = 1?; data flow nodes: + and -)

We have seen that the DFG model is a data driven model in which the execution is controlled
by data and it doesn’t involve any control operations (conditionals). The Control DFG (CDFG)
model is used for modelling applications involving conditional program execution. CDFG
models contain both data operations and control operations. The CDFG uses the Data Flow
Graph (DFG) as an element and conditionals (constructs) as decision makers. A CDFG contains
both data flow nodes and decision nodes, whereas a DFG contains only data flow nodes. Let us
have a look at the implementation of the CDFG for the following requirement.
If flag = 1, x = a + b; else y = a - b;
This requirement contains a decision making process. The CDFG model for the same is
given in Fig. 7.2. The control node is represented by a ‘Diamond’ block, which is the decision
making element in formal flow chart based design. The CDFG translates the requirement, which
is modelled, to a concurrent process model. The decision on which process is to be executed
is determined by the control node.
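
The requirement above maps directly onto a conditional in C. The sketch below is illustrative
only; the structure cdfg_result and the functions cdfg_step and cdfg_self_check are
hypothetical names introduced for this example.

```c
#include <assert.h>

/* Outputs of the two data flow nodes; only one of them is written per call. */
struct cdfg_result {
    int x;   /* written when flag == 1 */
    int y;   /* written otherwise      */
};

/* The if statement plays the role of the control node (the diamond):
   it decides which data flow node executes. */
void cdfg_step(int flag, int a, int b, struct cdfg_result *r)
{
    if (flag == 1) {
        r->x = a + b;    /* data flow node '+' */
    } else {
        r->y = a - b;    /* data flow node '-' */
    }
}

/* Small usage example with arbitrarily chosen input values. */
void cdfg_self_check(void)
{
    struct cdfg_result r = {0, 0};

    cdfg_step(1, 7, 2, &r);
    assert(r.x == 9 && r.y == 0);   /* only x was updated */

    cdfg_step(0, 7, 2, &r);
    assert(r.x == 9 && r.y == 5);   /* only y was updated */
}
```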

A real world example of modelling an embedded application using CDFG is the capturing
and saving of the image to a format set by the user in a digital still camera, where everything
is data driven, starting from the Analog Front End which converts the CCD sensor generated
analog signal to a digital signal, and the task which stores the data from the ADC to a frame
buffer for the use of a media processor which performs various operations like auto
correction, white balance adjusting, etc. The decision on the format in which the image is
stored (formats like JPEG, TIFF, BMP, etc.) is controlled by the camera settings configured by
the user.

Embedded Firmware Design and Development

LEARNING OBJECTIVES
1. Learn the different steps involved in the design and development of firmware for
embedded systems

2. Learn about the different approaches for embedded firmware design and
development, and the merits and limitations of each

3. Learn about the different languages for embedded firmware development and the
merits and limitations of each

4. Learn about assembly language and instruction mnemonics

5. Learn the steps involved in converting an Assembly Language program to machine
executable code

6. Learn about the assembler, linker, locater and object to hex file converter

7. Learn the advantages and drawbacks of Assembly language based firmware
development

8. Learn the various steps involved in the conversion of a program written
in high level language to machine executable code

9. Learn about the advantages and limitations of high level language based embedded
firmware development

10. Learn the different ways of mixing assembly language with high level language for
embedded application development

11. Learn about the fundamentals of embedded firmware design using Embedded ’C’

12. Learn the similarities and differences between conventional ’C’ programming and ’C’
programming for embedded application development

13. Learn the difference between native and cross-platform development

14. Learn about Keywords and Identifiers, Data types, Storage Classes, Arithmetic and
Logic Operations, Relational Operations, Branching Instructions, Looping Instructions,
Arrays and Pointers, Characters and Strings, Functions, Function Pointers, Structures
and Unions, Pre-processors and Macros, Constant Declarations, Volatile Variables,
Delay generation and Infinite loops, Bit manipulation operations, Coding Interrupt
Service Routines, Recursive and re-entrant functions, and Dynamic memory allocation
in Embedded C

The embedded firmware is responsible for controlling the various
peripherals of the embedded hardware and generating responses in accordance with the
functional requirements mentioned in the requirements for the particular embedded
product. Firmware is considered the master brain of the embedded system. Imparting
intelligence to an embedded system is a one-time process and it can happen at any stage;
it can be immediately after the fabrication of the embedded hardware or at a later stage.
Once intelligence is imparted to the embedded product, by embedding the firmware in
the hardware, the product starts functioning properly and will continue serving the
assigned task till a hardware breakdown or a corruption of the embedded firmware
occurs. In case of hardware breakdown, the damaged component may need to be
replaced by a new component, and for firmware corruptions the firmware should be
reloaded to bring the embedded product back to normal functioning. Coming back
to the new-born baby example, the new-born baby is very adaptive in terms of
intelligence, meaning it learns from mistakes and updates its memory each time a
mistake or a deviation in expected behaviour occurs, whereas most embedded
systems are less adaptive or non-adaptive. For most embedded products the
embedded firmware is stored in a permanent memory (ROM) and is not alterable
by end users. Some of the embedded products used in the Control and Instrumentation
domain are adaptive. This adaptability is achieved by making use of configurable
parameters which are stored in the alterable permanent memory area (like
NVRAM/FLASH). The parameters get updated in accordance with the deviations from
expected behaviour and the firmware makes use of these parameters for creating the
response next time for similar variations.

Designing embedded firmware requires
understanding of the particular embedded product hardware, like various component
interfacing, memory map details, I/O port details, configuration and register details of
various hardware chips used, and some programming language (either target processor/
controller specific low level assembly language or a high level language like
C/C++/JAVA). The embedded firmware development process starts with the conversion of
the firmware requirements into a program model using modelling tools like UML or flow
chart based representation. The UML diagrams or flow chart give a diagrammatic
representation of the decision items to be taken and the tasks to be performed (Fig. 9.1).
Once the program model is created, the next step is the implementation of the tasks and
actions by capturing the model using a language which is understandable by the target
processor/controller. The following sections are designed to give an overview of the
various steps involved in embedded firmware design and development.

9.1 EMBEDDED FIRMWARE DESIGN APPROACHES

The firmware design approach for an embedded product is purely dependent on the
complexity of the functions to be performed, the speed of operation required, etc. Two
basic approaches are used for embedded firmware design. They are ‘Conventional
Procedural Based Firmware Design’ and ‘Embedded Operating System (OS) Based Design’. The
conventional procedural based design is also known as the ‘Super Loop Model’. We will
discuss each of them in detail in the following sections.
9.1.1 The Super Loop Based Approach

The Super Loop based firmware development
approach is adopted for applications that are not time critical and where the response
time is not so important (embedded systems where missing deadlines is acceptable). It
is very similar to conventional procedural programming, where the code is executed
task by task. The task listed at the top of the program code is executed first and the
tasks just below the top are executed after completing the first task. This is a truly
procedural approach. In a multiple task based system, each task is executed in serial in
this approach. The firmware execution flow for this will be

1. Configure the common parameters and perform initialization for various hardware
components, memory, registers, etc.

2. Start the first task and execute it

3. Execute the second task

4. Execute the next task

5. Execute the last defined task

6. Jump back to the first task and follow the same flow

From the firmware execution
sequence, it is obvious that the order in which the tasks are to be executed is fixed and
hard coded in the code itself. Also, the operation is an infinite loop based approach.
We can visualize the operational sequence listed above in terms of a ‘C’ program code as

    void main(void)
    {
        Configuration();
        Initializations();
        while (1)
        {
            Task_1();
            Task_2();
            /* : : */
            Task_n();
        }
    }

Almost all tasks in embedded applications are non-ending and are repeated infinitely
throughout the operation. From the above ‘C’ code you can see that the tasks 1 to n are
performed one after another and when the last task (nth task) is executed, the firmware
execution is again re-directed to Task 1 and it is repeated forever in the loop. This
repetition is achieved by using an infinite loop, here the while (1) { } loop. This approach
is also referred to as the ‘Super loop based Approach’.

Since the tasks are running inside an
infinite loop, the only way to come out of the loop is either a hardware reset or an
interrupt assertion. A hardware reset brings the program execution back to the main
loop, whereas an interrupt request suspends the task execution temporarily,
performs the corresponding interrupt routine and, on completion of the interrupt routine,
restarts the task execution from the point where it got interrupted.

The ‘Super loop
based design’ doesn’t require an operating system, since there is no need for scheduling
which task is to be executed and assigning priority to each task. In a super loop based
design, the priorities are fixed and the order in which the tasks are to be executed is also
fixed. Hence the code for performing these tasks will be residing in the code memory
without an operating system image.

This type of design is deployed in low-cost
embedded products and products where response time is not critical. Some
embedded products demand this type of approach if some tasks are themselves sequential.
For example, reading/writing data to and from a card using a card reader requires a
sequence of operations like checking the presence of the card, authenticating the operation,
reading/writing, etc. It should strictly follow a specified sequence and the combination of
this series of tasks constitutes a single task, namely data read/write. There is no use
in putting the sub-tasks into independent tasks and running them in parallel. It won’t work
at all.

A typical example of a ‘Super loop based’ product is an electronic video game toy
containing a keypad and display unit. The program running inside the product may be
designed in such a way that it reads the keys to detect whether the user has given any
input and, if any key press is detected, the graphic display is updated. The keyboard
scanning and display updating happen at a reasonably high rate. Even if the application
misses a key press, it won’t create any critical issues; rather it will be treated as a bug
in the firmware. It is not economical to embed an OS into low cost products and it is
an utter waste to do so if response requirements are not crucial.

The ‘Super loop based
design’ is simple and straightforward without any OS related overheads. The major
drawback of this approach is that any failure in any part of a single task will affect the
total system. If the program hangs up at some point while executing a task, it will remain
there forever and ultimately the product stops functioning. There are remedial measures
for overcoming this. Use of hardware and software Watch Dog Timers (WDTs) helps in
coming out of the loop when an unexpected failure occurs or when the processor
hangs up. This, in turn, may cause additional hardware cost and firmware overheads.

Another major drawback of the ‘Super loop’ design approach is the lack of real timeliness.
If the number of tasks to be executed within an application increases, the time at which each
task is repeated also increases. This brings the probability of missing out some events.
For example, in a system with keypads, according to the ‘Super loop design’, there will be
a task for monitoring the keypad connected I/O lines and this need not be the task
running while you press the keys (that is, the key pressing event may not be in sync with
the keypad press monitoring task within the firmware). In order to identify the key press,
you may have to press the keys for a sufficiently long time till the keypad status
monitoring task is executed internally by the firmware. This leads to the lack
of real timeliness. There are corrective measures for this also. The best advised option is
to use interrupts for external events requiring real time attention. Advances in processor
technology bring out low cost, high speed processors/controllers; use of such processors
in super loop design greatly reduces the time required to service different tasks and
thereby provides nearly real time attention to external events.

Throughout this book, under the title ‘Embedded Firmware Design and Development’, we
will be discussing only the ‘Super loop based design’. Again, the discussion is narrowed
to super loop based firmware development for the 8051 controller.
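
The corrective measure described above (using an interrupt for the key press while the tasks
stay in the super loop) can be sketched as follows. This is a host-side simulation written
under stated assumptions, not target code: keypad_isr stands in for a real interrupt service
routine and is called like an ordinary function, and all names (keypad_isr, super_loop_pass,
task_keypad, task_display and the getter functions) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Written by the (simulated) keypad interrupt, read by the main loop.
   volatile: these change outside the normal program flow. */
static volatile bool key_pending = false;
static volatile unsigned char key_code = 0;

static unsigned keys_handled = 0;
static unsigned display_updates = 0;
static unsigned char last_key = 0;

/* Hypothetical ISR: latches the key so that a short key press is not lost
   even if the keypad task is not the one running at that moment. */
void keypad_isr(unsigned char code)
{
    key_code = code;
    key_pending = true;
}

/* Task 1: consume a key press latched by the ISR, if any. */
static void task_keypad(void)
{
    if (key_pending) {
        key_pending = false;
        last_key = key_code;
        keys_handled++;
    }
}

/* Task 2: refresh the display (stub that only counts calls). */
static void task_display(void)
{
    display_updates++;
}

/* One pass through the super loop body; firmware would call this
   from inside while (1) { }. */
void super_loop_pass(void)
{
    task_keypad();
    task_display();
}

unsigned get_keys_handled(void)    { return keys_handled; }
unsigned get_display_updates(void) { return display_updates; }
unsigned char get_last_key(void)   { return last_key; }

/* Usage example: one key press is handled exactly once. */
void super_loop_self_check(void)
{
    unsigned before = get_keys_handled();

    keypad_isr('5');       /* key pressed at an arbitrary moment */
    super_loop_pass();
    super_loop_pass();
    assert(get_keys_handled() == before + 1);
}
```

The ISR does the minimum possible work (it only latches the event), and the task inside the
loop does the processing on its next turn, so the order of the tasks remains fixed as in the
plain super loop.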

9.1.2 The Embedded Operating System (OS) Based Approach

The Operating System (OS) based approach contains operating systems, which can be
either a General Purpose Operating System (GPOS) or a Real Time Operating System
(RTOS), to host the user written application firmware. The General Purpose OS (GPOS)
based design is very similar to conventional PC based application development, where
the device contains an operating system (Windows/Unix/Linux, etc. for Desktop PCs)
and you will be creating and running user applications on top of it. An example of a GPOS
used in embedded product development is Microsoft® Windows XP Embedded. Examples
of embedded products using the Microsoft® Windows XP OS are Personal Digital Assistants
(PDAs), hand held devices/portable devices and Point of Sale (PoS) terminals. Use of a
GPOS in embedded products merges the demarcation of embedded systems and general
computing systems in terms of OS. For developing applications on top of the OS, the OS
supported APIs are used. Similar to the different hardware specific drivers, OS based
applications also require ‘Driver software’ for the different hardware present on the board
to communicate with them.

The Real Time Operating System (RTOS) based design approach is
employed in embedded products demanding real-time response. An RTOS responds in a
timely and predictable manner to events. A Real Time Operating System contains a Real
Time kernel responsible for performing pre-emptive multitasking, a scheduler for
scheduling tasks, multiple threads, etc. An RTOS allows
flexible scheduling of system resources like the CPU and memory and offers some way
to communicate between tasks. We will discuss the basics of RTOS based system design
in a later chapter titled ‘Designing with Real Time Operating Systems (RTOS)’. ‘Windows
CE’, ‘pSOS’, ‘VxWorks’, ‘ThreadX’, ‘MicroC/OS-II’, ‘Embedded Linux’, ‘Symbian’, etc. are
examples of RTOS employed in embedded product development. Mobile phones, PDAs
(based on Windows CE/Windows Mobile platforms), handheld devices, etc. are examples
of ‘Embedded Products’ based on RTOS. Most of the mobile phones are built around the
popular RTOS ‘Symbian’.

9.2 EMBEDDED FIRMWARE DEVELOPMENT LANGUAGES

As mentioned in Chapter 2, for embedded firmware development you can use either a target
processor/controller specific language (generally known as Assembly language or low level
language) or a target processor/controller independent language (like C, C++, JAVA, etc.,
commonly known as High Level Language) or a combination of Assembly and High Level
Language. We will discuss where each approach is used and the relative merits and de-merits
of each in the following sections.

9.2.1 Assembly Language based Development

‘Assembly language’ is the human
readable notation of ‘machine language’, whereas ‘machine language’ is a processor
understandable language. Processors deal only with binaries (1s and 0s). Machine
language is a binary representation and it consists of 1s and 0s. Machine language is
made readable by using specific symbols called ‘mnemonics’. Hence assembly language can
be considered as an interface between processor and programmer. Assembly language and
machine languages are processor/controller dependent and an assembly program written for
one processor/controller family will not work with others.

Assembly language programming
is the task of writing processor specific machine code in mnemonic form and converting the
mnemonics into actual processor instructions (machine language) and associated data
using an assembler. Assembly language programming was the most common type of
programming adopted in the beginning of the software revolution. If we look back to the
history of programming, we can see that a large number of programs were written
entirely in assembly language. Even in the 1990s, the majority of console video games
were written in assembly language, including most popular games written for the Sega
Genesis and the Super Nintendo Entertainment System. The popular arcade game NBA
Jam released in 1993 was also coded entirely using assembly language. Even today
almost all low level, system related programming is carried out using assembly
language. Some Operating System dependent tasks require low-level languages. In
particular, assembly language is often used in writing the low level interaction between
the operating system and the hardware, for instance in device drivers.

The general
format of an assembly language instruction is an Opcode followed by Operands. The
Opcode tells the processor/controller what to do and the Operands provide the data and
information required to perform the action specified by the opcode. It is not necessary
that all opcodes should have operands following them. Some opcodes implicitly
contain the operand and in such situations no operand is required. The operand may be
a single operand, dual operand or more. We will analyse each of them with the 8051
ASM instructions as an example.

    MOV A, #30

This instruction mnemonic moves decimal
value 30 to the 8051 Accumulator register. Here MOV A is the Opcode and 30 is the
operand (single operand). The same instruction when written in machine language will
look like

    01110100 00011110

where the first 8-bit binary value 01110100 represents
the opcode MOV A and the second 8-bit binary value 00011110 represents the operand
30. The mnemonic INC A is an example of an instruction holding the operand implicitly in
the Opcode. The machine language representation of the same is 00000100. This instruction
increments the 8051 Accumulator register content by 1. The mnemonic MOV A, #30
explained above is an example of a single operand instruction. LJMP 16bit address is an
example of a dual operand instruction. The machine language for the same is

    00000010 addr_bit15 to addr_bit8 addr_bit7 to addr_bit0

The first binary data is the
representation of the LJMP machine code. The first operand that immediately follows the
opcode represents the bits 8 to 15 of the 16bit address to which the jump is required
and the second operand represents the bits 0 to 7 of the address to which the jump is
targeted.

Assembly language instructions are written one per line. A machine code
program thus consists of a sequence of assembly language instructions, where each
statement contains a mnemonic (Opcode + Operand). Each line of an assembly language
program is split into four fields as given below

    LABEL    OPCODE    OPERAND    COMMENTS

LABEL is an optional field. A ‘LABEL’ is an identifier used extensively in
programs to reduce the reliance on programmers for remembering where data or code is
located. LABEL is commonly used for representing a memory location, address of a
program, sub-routine, code portion, etc.
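
Returning to the instruction encodings described earlier in this section, the byte patterns
for MOV A, #data, INC A and LJMP addr16 can be reproduced with a few lines of C. This is a
small illustrative sketch and not an assembler; the opcode values are the ones quoted in the
text, while the macro and function names (OP_MOV_A_IMM, encode_mov_a_imm, encode_inc_a,
encode_ljmp, encoding_self_check) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 8051 opcodes quoted in the text. */
#define OP_MOV_A_IMM 0x74u   /* 01110100 : MOV A, #data */
#define OP_INC_A     0x04u   /* 00000100 : INC A        */
#define OP_LJMP      0x02u   /* 00000010 : LJMP addr16  */

/* MOV A, #value : opcode byte followed by one operand byte. */
size_t encode_mov_a_imm(uint8_t value, uint8_t out[2])
{
    out[0] = OP_MOV_A_IMM;
    out[1] = value;
    return 2;
}

/* INC A : the operand (the Accumulator) is implied by the opcode. */
size_t encode_inc_a(uint8_t out[1])
{
    out[0] = OP_INC_A;
    return 1;
}

/* LJMP addr16 : opcode, then address bits 15..8, then address bits 7..0. */
size_t encode_ljmp(uint16_t addr, uint8_t out[3])
{
    out[0] = OP_LJMP;
    out[1] = (uint8_t)(addr >> 8);
    out[2] = (uint8_t)(addr & 0xFFu);
    return 3;
}

/* Usage example: MOV A, #30 gives 01110100 00011110 (0x74 0x1E). */
void encoding_self_check(void)
{
    uint8_t buf[3];

    assert(encode_mov_a_imm(30, buf) == 2);
    assert(buf[0] == 0x74 && buf[1] == 0x1E);
}
```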

The maximum length of a label differs between assemblers. Assemblers insist on strict
formats for labelling. Labels are always suffixed by a colon and begin with a valid
character. Labels can contain numbers from 0 to 9 and the special character _ (underscore).
Labels are used for representing subroutine names and jump locations in Assembly
language programming. It is to be noted that ‘LABEL’ is not a mandatory field; it is
optional only. The sample code given below using 8051 Assembly language illustrates
structured assembly language programming.
    ;####################################################################
    ; SUBROUTINE FOR GENERATING DELAY
    ; DELAY PARAMETER PASSED THROUGH REGISTER R1
    ; RETURN VALUE NONE
    ; REGISTERS USED: R0, R1
    ;####################################################################
    DELAY:  MOV  R0, #255      ; Load Register R0 with 255
            DJNZ R1, DELAY     ; Decrement R1 and loop till R1 = 0
            RET                ; Return to calling program
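
To see how the delay parameter in R1 controls this loop, the listing can be modelled in C.
The sketch below is only a model under stated assumptions: it counts passes through the loop
rather than machine cycles, it treats R1 as an 8-bit register that wraps from 0 to 255, and
the function names delay_loop_passes and delay_self_check are hypothetical. Note that in the
listing as given R0 is reloaded on every pass but never decremented, so the number of passes
depends on R1 alone.

```c
#include <assert.h>
#include <stdint.h>

/* Returns how many times the loop body of the DELAY listing executes
   for a given initial value of R1. */
unsigned delay_loop_passes(uint8_t r1)
{
    unsigned passes = 0;
    uint8_t r0;

    do {
        r0 = 255;          /* MOV R0, #255 */
        (void)r0;          /* R0 is not used further in the listing */
        passes++;
        r1--;              /* DJNZ: decrement first (8-bit wrap-around) ... */
    } while (r1 != 0);     /* ... then jump back to DELAY if not zero */

    return passes;         /* RET */
}

/* Usage example: DJNZ decrements before testing, so R1 = 0 on entry
   wraps to 255 and gives the longest delay, not the shortest. */
void delay_self_check(void)
{
    assert(delay_loop_passes(10) == 10);
    assert(delay_loop_passes(0) == 256);
}
```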

The Assembly program contains a main routine which starts at address 0000H and it
may or may not contain subroutines. The example given above is a subroutine; in
the main program the subroutine is invoked by the Assembly instruction LCALL DELAY.
Executing this instruction transfers the program flow to the memory address referenced
by the ‘LABEL’ DELAY. It is a good practice to provide comments at the beginning of your
subroutines, indicating the purpose of the subroutine, what the input
parameters are and how they are passed to the subroutine, what the return values are,
how they are returned to the calling function, etc. While assembling the code, a ‘;’ informs
the assembler that the rest of the line after the ‘;’ symbol is a comment
and should simply be ignored. Each Assembly instruction should be written on a separate line.
Unlike C and other high level languages, more than one ASM instruction is not allowed
on a single line. In the above example the LABEL DELAY represents the reference to the
start of the subroutine DELAY. You can directly replace this LABEL by putting the
desired address first and then writing the Assembly code for the routine as given below.

            ORG  0100H
            MOV  R0, #255      ; Load Register R0 with 255
            DJNZ R1, 0100H     ; Decrement R1 and loop till R1 = 0
            RET                ; Return to calling program

The advantage of using a label is that the required address is calculated
by the assembler at the time of assembling the program and it replaces the label. Hence
even if you add some code above the LABEL ‘DELAY’ at a later stage, it won’t create any
issues like code overlapping, whereas in the second method you are explicitly
telling the assembler that this subroutine should start at the specified address (0100H in
the above example). If the code written above this subroutine itself crosses the
0100H mark of the program memory, it will be overwritten by the subroutine code and
it will generate unexpected results. Hence for safety don’t assign any address by
yourself; refer to the required address by using labels and let the assembler handle
the responsibility of finding out the address where the code can be placed. In the above
example you can see that the label DELAY is used for calling the subroutine as well
as for looping (using the decision based jump instruction DJNZ). You can also use the
normal jump instruction to jump to the label by calling LJMP DELAY.

The statement ORG 0100H in the above example is not an assembly language
instruction; it is an assembler directive instruction. It tells the assembler that the
instructions from here onward should be placed at locations starting from 0100H. The
assembler directive instructions are known as ‘pseudo-ops’. They are used for

1. Determining the start address of the program (e.g. ORG 0000H)

2. Determining the entry address of the program (e.g. ORG 0100H)

3. Reserving memory for data variables, arrays and structures (e.g. var EQU 70H)

4. Initializing variable values (e.g. Val DATA 12H)

The EQU directive is used for allocating memory to a variable and the DATA directive is used
for initialising a variable with data. No machine codes are generated for the ‘pseudo-ops’.

So far we have discussed Assembly language and how it is used for writing programs.
Now let us have a look at how assembly programs are organised and how they are
translated into machine readable code.

The Assembly language program written in assembly code is saved as a .asm (Assembly file)
file or a .src (source) file or in a format supported by the tool chain/assembler.
Any text editor like ‘Notepad’ or ‘WordPad’ from Microsoft® or the text editor provided by an
Integrated Development Environment (IDE) tool can be used for writing the assembly
instructions.

Similar to ‘C’ and other high level language programming, you can have multiple source
files called modules in assembly language programming. Each module is represented by a
‘.asm’ or ‘.src’ file or a file with an extension specific to the tool
chain/assembler used, similar to the ‘.c’ files in C programming. This approach is known as
‘Modular Programming’.

Modular programming is employed when the program is too complex or too big. In
‘Modular Programming’, the entire code is divided into sub modules and each module
is made re-usable. Modular programs are usually easy to code, debug and alter.
Conversion of the assembly language to machine language is carried out by a
sequence of operations, as illustrated below.

9.2.1.1 Source File to Object File Translation

Translation of assembly code to machine code is performed by an assembler. The assemblers
for different target machines are different and it is common that assemblers from multiple
vendors are available in the market for the same target machine. Some target
processors'/controllers' assemblers may be proprietary and supplied by a single vendor only.
Some assemblers are freely available on the internet for downloading. Some assemblers are
commercial and require a licence from the vendor. The A51 Macro Assembler from Keil Software
is a popular assembler for the 8051 family of microcontrollers. The various steps involved in
the conversion of a program written in assembly language to the corresponding binary
file/machine language are illustrated in Fig. 9.1.

Each source module is written in Assembly and is stored as a .src file or a .asm file. Each
file can be assembled separately to examine the syntax errors and incorrect assembly
instructions. On successful assembling, a corresponding object file is created with extension
‘.obj’. The object file does not contain the
absolute address of where the generated code needs to be placed in the program memory
and hence it is called a re-locatable segment. It can be placed at any code memory location
and it is the responsibility of the linker/locater to assign an absolute address to this
module. Absolute address allocation is done at the absolute object file creation stage.
Modules can share variables and subroutines (functions) among them. Exporting a
variable/function from a module (making a variable/function of a module available to all
other modules) is done by declaring that variable/function as PUBLIC in the source module.
Importing a variable or function from a module (taking a variable or function from any one
of the other modules) is done by declaring that variable or function as EXTRN (EXTERN) in
the module where it is going to be accessed. The ‘PUBLIC’ keyword informs the assembler
that the variables or functions declared as ‘PUBLIC’ need to be exported. Similarly the
‘EXTRN’ keyword tells the assembler that the variables or functions declared as ‘EXTRN’
need to be imported from some other module. While assembling a module, on seeing
variables/functions with the keyword ‘EXTRN’, the assembler understands that these
variables or functions come from an external module and it proceeds with assembling the
entire module without throwing any errors, though the assembler cannot find the definition
of the variables and the implementation of the functions. Corresponding to a variable or
function declared as ‘PUBLIC’ in a module, there can be one or more modules using these
variables or functions with the ‘EXTRN’ keyword. For all those modules using variables or
functions with the ‘EXTRN’ keyword, there should be one and only one module which exports
those variables or functions with the ‘PUBLIC’ keyword. If more than one module in a project
tries to export variables or functions with the same name using the ‘PUBLIC’ keyword, it will
generate ‘linker’ errors.
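
Readers who know C may find the following analogy helpful: a definition with external linkage
plays roughly the role of PUBLIC, and an extern declaration plays roughly the role of EXTRN.
The sketch below is an analogy in C, not A51 syntax; it keeps both ‘modules’ in one source
file so that it can be compiled as it stands, and the names tx_count, put_char, send_two and
linkage_self_check are hypothetical.

```c
#include <assert.h>

/* ---- Would live in the importing module (the EXTRN side) ------------- */
/* Declarations only: they tell the compiler that the definitions exist
   somewhere else and leave it to the linker to find them. */
extern int tx_count;
extern int put_char(int c);

int send_two(void)
{
    put_char('O');
    put_char('K');
    return tx_count;
}

/* ---- Would live in the exporting module (the PUBLIC side) ------------ */
/* Exactly one definition of each symbol is allowed in the whole program;
   a second definition elsewhere would be reported by the linker. */
int tx_count = 0;

int put_char(int c)
{
    /* a real driver would write c to an output device here */
    tx_count++;
    return c;
}

/* Usage example: two characters sent means the shared counter grows by 2. */
void linkage_self_check(void)
{
    int before = tx_count;

    send_two();
    assert(tx_count == before + 2);
}
```

In a real project the two halves would be separate source files; removing the definitions
would then lead to an unresolved symbol at link time, and duplicating them in two files would
lead to a multiple definition error, which mirrors the two linker situations described in
this section.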

Illustrative example for A51 Assembler - Usage of ‘PUBLIC’ for exporting variables with the
same name from different modules. The target application (Simulator) contains three modules,
namely ASAMPLE1.A51, ASAMPLE2.A51 and ASAMPLE3.A51 (the file extension .A51 is
the .asm extension specific to the A51 assembler). The modules ASAMPLE2.A51 and
ASAMPLE3.A51 contain a function named PUTCHAR. Both of these modules try to export
this function by declaring the function as ‘PUBLIC’ in the respective modules. While linking
the modules, the linker identifies that two modules are exporting the function with name
PUTCHAR. This confuses the linker and it throws the error ‘MULTIPLE PUBLIC
DEFINITIONS’.

    Build target ‘Simulator’
    assembling ASAMPLE1.A51...
    assembling ASAMPLE2.A51...
    assembling ASAMPLE3.A51...
    linking...
    MODULE: ASAMPLE3.obj (CHAR_IO)

If a variable or function is declared as ‘EXTRN’ in one or more modules, there should be one
module defining these variables or functions and exporting them using the ‘PUBLIC’ keyword.
If no module in a project exports the variables or functions which are declared as ‘EXTRN’
in other modules, it will generate ‘linker’ warnings or errors depending on the error
level/warning level settings of the linker.

Illustrative example for A51 Assembler - Usage of EXTRN without variables exported. The
target application (Simulator) contains three modules, namely ASAMPLE1.A51,
ASAMPLE2.A51 and ASAMPLE3.A51 (the file extension .A51 is the .asm extension specific
to the A51 assembler). The module ASAMPLE1.A51 imports a function named PUT_CRLF,
which is declared as ‘EXTRN’ in the current module, and it expects one of the other two
modules to export it using the keyword ‘PUBLIC’. But none of the other modules export this
function by declaring it as ‘PUBLIC’. While linking the modules, the linker identifies that
no module exports this function. The linker generates the warning or error message
‘UNRESOLVED EXTERNAL SYMBOL’ depending on the linker ‘level’ settings.

9.2.1.2 Library File Creation and Usage

Libraries are specially formatted, ordered collections of object modules that may
be used by the linker at a later time. When the linker processes a library, only those object
modules in the library that are necessary to create the program are used. Library files are
generated with the extension ‘.lib’. A library file is also a way of hiding source code. If
you don’t want to reveal the source code behind the functions you have written, but still
want to distribute them to application developers for use in their applications, you can
supply them as library files together with the details of the public functions available
from the library (function name, function input/output, etc.). To use a library file in a
project, add the library to the project. If you are using a commercial version of the
assembler/compiler suite for your development, the vendor may provide pre-written library
files for performing multiplication, floating point arithmetic, etc. as an add-on utility
or as a bonus.

‘LIB51’ from Keil Software is an example for a library creator and it is used for creating
library files for A51 Assembler/C51 Compiler for 8051 specific controller.

9.2.1.3 Linker and Locator

The Linker and Locator is another software utility, responsible for “linking the various
object modules in a multi-module project and assigning an absolute address to each module”.
The linker generates an absolute object module by extracting the object modules from the
library, if any, and from the obj files that the assembler creates when it assembles the
individual modules of a project. It is the responsibility of the linker to link any external
dependent variables or functions declared in the various modules and to resolve the external
dependencies among the modules. An absolute object file or module does not contain any
re-locatable code or data. All code and data reside at fixed memory locations. The absolute
object file is used for creating hex files for dumping into the code memory of the
processor/controller.

‘BL51’ from Keil Software is an example of a Linker & Locator for the A51 Assembler/C51
Compiler for 8051 specific controllers.

9.2.1.4 Object to Hex File Converter

This is the final stage in the conversion of Assembly language (mnemonics) to machine
understandable language (machine code). The hex file is the representation of the machine
code, and it is dumped into the code memory of the processor/controller. The hex
file representation varies depending on the make of the target processor/controller. For
Intel processors/controllers the target hex file format is ‘Intel HEX’, and for Motorola the
hex file should be in ‘Motorola HEX’ format. HEX files are ASCII files that contain a
hexadecimal representation of the target application. The hex file is created from the final
‘Absolute Object File’ using the Object to Hex File Converter utility.

‘OH51’ from Keil Software is an example of an Object to Hex File Converter utility for the
A51 Assembler/C51 Compiler for 8051 specific controllers.
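To make the ‘Intel HEX’ representation concrete, here is a minimal sketch in C that formats one record as ASCII text. The function name ihex_record and its parameters are hypothetical and are not part of any Keil tool. The record layout (a colon, byte count, 16-bit address, record type, data bytes, and a two's complement checksum) is the standard Intel HEX layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format one Intel HEX record into 'out' as a NUL-terminated string.
 * Returns the number of characters written, or -1 if the buffer is too
 * small or the arguments are inconsistent. (Hypothetical helper.) */
int ihex_record(char *out, size_t out_size, uint8_t type, uint16_t addr,
                const uint8_t *data, uint8_t len)
{
    /* ':' + 2 (length) + 4 (address) + 2 (type) + 2*len (data) + 2 (checksum) + NUL */
    size_t needed = 11u + 2u * (size_t)len + 1u;
    if (out == NULL || out_size < needed || (len > 0 && data == NULL)) {
        return -1;
    }

    /* The checksum covers the length, both address bytes, the type and the data. */
    uint8_t sum = (uint8_t)(len + (addr >> 8) + (addr & 0xFFu) + type);
    int n = snprintf(out, out_size, ":%02X%04X%02X",
                     (unsigned)len, (unsigned)addr, (unsigned)type);

    for (uint8_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
        n += snprintf(out + n, out_size - (size_t)n, "%02X", (unsigned)data[i]);
    }

    /* Two's complement of the sum, so that all bytes of the record add up to 0. */
    n += snprintf(out + n, out_size - (size_t)n, "%02X",
                  (unsigned)(uint8_t)(0u - sum));
    return n;
}
```

Called with record type 0x01, address 0x0000 and no data, the function produces the end-of-file record :00000001FF that terminates a hex file.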

9.2.1.5 Advantages of Assembly Language Based Development

Assembly Language based development was (and still is) the most common technique adopted
from the beginning of embedded technology development. A thorough understanding of the
processor architecture, memory organisation, register sets and mnemonics is essential
for Assembly Language based development. If you master one processor architecture and
its assembly instructions, you can make the processor as flexible as a gymnast. The major
advantages of Assembly Language based development are listed below.

Efficient Code Memory and Data Memory Usage (Memory Optimisation)

Since the developer is well versed with the target processor architecture and memory
organisation, optimised code can be written for performing operations. This leads to less
utilisation of code memory and efficient utilisation of data memory. Remember that memory
is a primary concern in any embedded product (though silicon is cheaper and new memory
techniques make memory less costly, external memory operations impact directly on system
performance).

High Performance

Optimised code not only improves the code memory usage but also improves the total system
performance. Through effective assembly coding, optimum performance can be achieved for
a target application.

Low Level Hardware Access

Most of the code for low level programming, such as accessing external device specific
registers from the operating system kernel, device drivers and low level input routines,
makes use of direct assembly coding, since low level device specific operation support is
not commonly available with most high-level language cross compilers.

Code Reverse Engineering

Reverse engineering is the process of understanding the technology behind a product
by extracting the information from a finished product. Reverse engineering is performed by
‘hackers’ to reveal the technology behind ‘Proprietary Products’. Though most
products employ code memory protection, if it is possible to break the memory
protection and read the code memory, its contents can easily be converted into assembly
code using a dis-assembler program for the target machine.

9.2.1.6 Drawbacks of Assembly Language Based Development

Every technology has its own pros and cons. From certain technology aspects, assembly
language development is the most efficient technique. But it also has the following
technical limitations.

High Development Time

Assembly language is much harder to program than high level languages. The developer must
pay attention to more details and must have thorough knowledge of the architecture, memory
organisation and register details of the target processor in use. Learning the inner details
of the processor and its assembly instructions is highly time consuming and delays product
development. One probable solution is to use a readily available developer who is well
versed in the target processor architecture and its assembly instructions. Also, more lines
of assembly code are required to perform an action that can be done with a single statement
in a high-level language like ‘C’.

Developer Dependency

There is no common written rule for developing assembly language based applications,
whereas all high level languages impose a certain set of rules for application development.
In assembly language programming, the developer has the freedom to choose different memory
locations and registers. The programming approach also varies from developer to developer
depending on his/her taste. For example, moving data from a memory location to the
accumulator can be achieved through different approaches. If the approach taken by a
developer is not documented properly at the development stage, he/she may not be able to
recollect at a later stage why this approach was followed, and a new developer instructed
to analyse the code may not be able to understand what is done and why it is done. Hence
upgrading or modifying an assembly program at a later stage is very difficult. Documenting
the assembly code well is a solution for reducing the developer dependency in assembly
language programming. If the code is too large and complex, documenting all lines of code
may not be productive.

Non-Portable

Target applications written in assembly instructions are valid only for that particular
family of processors (e.g. an application written for the Intel x86 family of processors)
and cannot be re-used for other target processors/controllers (say, the ARM11 family of
processors). If the target processor/controller changes, a complete re-writing of the
application using the assembly instructions for the new target processor/controller is
required. This is the major drawback of assembly language programming and it makes
assembly language applications non-portable.

“Though Assembly Language programming possesses a lot of drawbacks, as a developer, from
my personal experience I prefer assembly language based development. Once you master
the internals of a processor/controller, you can really perform magic with the
processor/controller and can extract the maximum out of it.”

9.2.2 High Level Language Based Development

As we have seen in the earlier section, Assembly language based programming is highly time
consuming, tedious and requires skilled programmers with sound knowledge of the target
processor architecture. Also, applications developed in Assembly language are non-portable.
Here comes the role of high level languages. Any high level language (like C, C++ or Java)
with a supported cross compiler for the target processor can be used for embedded firmware
development. (The cross compiler converts the application developed in the high level
language to target processor specific assembly code; we will discuss cross-compilers in
detail in a later section.) The most commonly used high level language for embedded
firmware application development is ‘C’. You may be wondering why ‘C’ is the popular
embedded firmware development language. The answer is “C is a well-defined, easy to use
high level language with extensive cross platform development tool support”. Nowadays
cross-compilers for C++ are also emerging, and embedded developers are making use of C++
for embedded application development.

The various steps involved in high level language based embedded firmware development are
the same as those of assembly language based development, except that the conversion of a
source file written in a high level language to an object file is done by a cross-compiler,
whereas in Assembly language based development it is carried out by an assembler. The
various steps involved in the conversion of a program written in a high level language to
the corresponding binary file/machine language are illustrated in Fig. 9.2.

The program written in any high level language is saved with the corresponding
language extension (.c for C, .cpp for C++, etc.). Any text editor like ‘Notepad’ or
‘WordPad’ from Microsoft, or the text editor provided by an Integrated Development
Environment (IDE) tool supporting the high level language in use, can be used for writing
the program. Most high level languages support a modular programming approach, and hence
you can have multiple source files, called modules, written in the corresponding high level
language. The source file corresponding to each module is represented by a file with the
corresponding language extension. Translation of high level source code to executable
object code is done by a cross-compiler. The cross-compilers for different high level
languages for the same target processor are different. It should be noted that each high
level language should have a cross-compiler for converting the high level source code into
the target processor machine code. Without cross-compiler support a high level language
cannot be used for embedded firmware development. The C51 cross-compiler from Keil
Software is an example of a cross-compiler. C51 is a popular cross-compiler for the ‘C’
language for the 8051 family of microcontrollers. Conversion of each module’s source code
to the corresponding object file is performed by the cross compiler. The rest of the steps
involved in the conversion of high level language to the target processor’s machine code
are the same as the steps involved in assembly language based development.

How ‘Embedded C’ is used for embedded firmware development, as an example of high level
language based development, is discussed in a later section of this chapter.

Fig 9.2 High level language to machine language conversion process

9.2.2.1 Advantages of High Level Language Based Development

Reduced Development Time

The developer requires little knowledge of the internal hardware details and architecture
of the target processor/controller. A bare minimum knowledge of the memory organisation
and register details of the target processor in use, and of the syntax of the high level
language, are the only pre-requisites for high level language based firmware development.
The rest is taken care of by the cross-compiler used for the high level language. Thus the
ramp up time the developer needs to understand the target hardware and the target machine's
assembly instructions is taken off by the cross compiler, which reduces the development
time through a significant reduction in developer effort. High level language based
development also widens the scope of embedded firmware development from a team of
specialised architects to anyone who knows the syntax of the language and is willing to put
a little effort into understanding the minimal hardware details. With a high level
language, each task can be accomplished in fewer lines of code compared to target
processor/controller specific Assembly language based development.

Developer Independency

The syntax used by most high level languages is universal, and a program written in a high
level language can easily be understood by a second person who knows the syntax of the
language. Certain instructions may require a little knowledge of the target hardware
details like register set, memory map, etc. Apart from these, high level language based
firmware development makes the firmware developer independent. High level languages always
impose a certain set of rules for writing the code and commenting it. If the developer
strictly adheres to the rules, the firmware will be 100% developer independent.

Portability

Target applications written in high level languages are converted to a target
processor/controller understandable format (machine code) by a cross-compiler. An
application written in a high level language for a particular target processor can be
converted to another target processor/controller specific application with little effort,
by simply recompiling it, or by making small code modifications and then recompiling it,
for the required target processor/controller, provided the cross-compiler has support for
the processor/controller selected. This makes applications written in a high level language
highly portable. A little effort may be required in the existing code to replace the target
processor specific header files with new header files, register definitions with new ones,
etc. This is the major flexibility offered by high level language based design.

9.2.2.2 Limitations of High Level Language Based Development

The merits offered by high level language based design outweigh its limitations.
Some cross-compilers available for high level languages may not be efficient in generating
optimised target processor specific instructions. Target images created by such compilers
may be messy and non-optimised in terms of performance as well as code size. For example,
the task achieved by cross-compiler generated machine instructions from a high level
language may be achieved with fewer instructions if the same task is hand
coded using target processor specific machine codes. The time required to execute a task
also increases with the number of instructions. However, modern cross-compilers
increasingly incorporate optimisation techniques for both code size and
performance. High level language based code snippets may not be efficient in accessing low
level hardware where hardware access timing is critical (of the order of nano or micro
seconds).

The investment required for high level language based development tools (Integrated
Development Environment incorporating cross-compiler) is high compared to Assembly
Language based firmware development tools.

9.2.3 Mixing Assembly and High Level Language

Certain embedded firmware development situations may demand the mixing of high level
language with Assembly and vice versa. High level language and assembly languages are
usually mixed in three ways: mixing assembly language with high level language,
mixing high level language with assembly, and in-line assembly programming.

9.2.3.1 Mixing Assembly with High level language (e.g. Assembly Language with ‘C’)

Assembly routines are mixed with ‘C’ in situations where the entire program is written in
‘C’ and the cross compiler in use does not have built-in support for implementing certain
features like Interrupt Service Routine (ISR) functions, or where the programmer wants to
take advantage of the speed and optimised code offered by hand written assembly rather
than cross compiler generated machine code. When accessing certain low level hardware, the
timing specifications may be very critical, and a cross compiler generated binary may not
be able to meet the required timing specifications accurately. Writing the
hardware/peripheral access routine in processor/controller specific Assembly language and
invoking it from ‘C’ is the most advised method to handle such situations.

Mixing ‘C’ and Assembly is a little complicated in the sense that the programmer must be
aware of how parameters are passed from the ‘C’ routine to Assembly, how values are
returned from the assembly routine to ‘C’, and how the assembly routine is invoked from the
‘C’ code. All of these are cross compiler dependent. There is no universal written rule for
this. You must get this information from the documentation of the cross compiler you are
using. Different cross compilers implement these features in different ways depending on
the general purpose registers and the memory supported by the target processor/controller.
Let’s examine this by taking the Keil C51 cross compiler for the 8051 controller. The
objective of this example is to give an idea of how the C51 cross compiler performs the
mixing of Assembly code with ‘C’.

1. Write a simple function in C that passes parameters and returns values the way you
want your assembly routine to.

2. Use the SRC directive (#pragma SRC at the top of the file) so that the C compiler
generates an .SRC file instead of an .OBJ file.

3. Compile the C file. Since the SRC directive is specified, the .SRC file is generated. The
.SRC file contains the assembly code generated for the C code you wrote.

4. Rename the .SRC file to .A51.

5. Edit the .A51 file and insert the assembly code you want to execute in the body of the
assembly function shell included in the .A51 file.

As an example consider the following sample code (Extracted from Keil C51 documentation)

#pragma SRC
unsigned char my_assembly_func (unsigned int argument)
{
    return (argument + 1);
}
This C function, on cross compilation, generates the following assembly SRC file.

NAME TESTCODE
?PR?_my_assembly_func?TESTCODE SEGMENT CODE
PUBLIC _my_assembly_func
; #pragma SRC
; unsigned char my_assembly_func (
RSEG ?PR?_my_assembly_func?TESTCODE
USING 0
_my_assembly_func:
;---- Variable ‘argument?040’ assigned to Register ‘R6/R7’ ----
; SOURCE LINE # 2
; unsigned int argument)
; {
; SOURCE LINE # 4
; return (argument + 1); // Insert dummy lines to access all args
; and retvals
; SOURCE LINE # 5
MOV A, R7
INC A
MOV R7, A
; }
; SOURCE LINE # 6
?C0001:
RET
; END OF _my_assembly_func
END
The special compiler directive SRC generates the Assembly code corresponding to the ‘C’
function, and each line of the source code is converted to the corresponding Assembly
instructions. You can easily identify the Assembly code generated for each line of the
source code, since each source line appears as a comment in the generated .SRC file. By
inspecting these code segments you can find out which registers are used for holding the
variables of the ‘C’ function, and you can then modify the generated code by adding the
assembly routine you want.
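One detail of the generated listing is worth a second look: only the low byte register R7 is incremented, although the argument is a 16-bit value held in R6/R7. A plausible reading is that the high byte does not matter because the return type is unsigned char. The sketch below is the same function without the Keil specific #pragma, so that it builds with any hosted C compiler and the truncation can be checked (8-bit chars are assumed).

```c
/* Portable version of the function from the text (no #pragma SRC). */
unsigned char my_assembly_func(unsigned int argument)
{
    /* The sum is truncated to 8 bits on return, so only the low byte of
     * 'argument' can influence the result. */
    return (unsigned char)(argument + 1);
}
```

For instance, an argument of 0x1234 returns 0x35, and an argument of 0x00FF wraps around to 0x00, which is exactly what an increment of the low byte alone produces.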

9.2.3.2 Mixing High level language with Assembly (e.g. ‘C’ with Assembly Language)

Mixing the code written in a high level language like ‘C’ and Assembly language is useful in
the following scenarios:

1. The source code is already available in Assembly language and a routine written in a
high level language like ‘C’ needs to be included in the existing code.

2. The entire source code is planned in Assembly for various reasons like optimised
code, optimal performance, efficient code memory utilisation and proven expertise in
handling Assembly. But some portions of the code may be very difficult and tedious
to code in Assembly, for example 16-bit multiplication and division in 8051 Assembly
Language.

3. To include built-in library functions written in ‘C’ and provided by the cross
compiler, for example built-in graphics library functions and string operations supported
by ‘C’.
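The 16-bit multiplication mentioned in the second scenario shows why a ‘C’ routine can be attractive inside an Assembly project. The 8051 multiply instruction works on 8-bit operands only, so a 16-bit product has to be assembled from partial products by hand. The sketch below models that bookkeeping in C next to the single expression that ‘C’ needs. The function names are hypothetical and chosen for illustration only.

```c
#include <stdint.h>

/* 8 x 8 -> 16 bit multiply, modelling the 8-bit hardware multiply. */
static uint16_t mul8(uint8_t a, uint8_t b)
{
    return (uint16_t)((uint16_t)a * b);
}

/* 16 x 16 -> 32 bit multiply built from four 8-bit partial products,
 * roughly the work an assembly programmer has to spell out by hand. */
uint32_t mul16_by_parts(uint16_t x, uint16_t y)
{
    uint8_t xl = (uint8_t)(x & 0xFFu), xh = (uint8_t)(x >> 8);
    uint8_t yl = (uint8_t)(y & 0xFFu), yh = (uint8_t)(y >> 8);

    uint32_t result = mul8(xl, yl);          /* low  x low               */
    result += (uint32_t)mul8(xl, yh) << 8;   /* low  x high, shifted 8   */
    result += (uint32_t)mul8(xh, yl) << 8;   /* high x low,  shifted 8   */
    result += (uint32_t)mul8(xh, yh) << 16;  /* high x high, shifted 16  */
    return result;
}

/* The same operation in 'C' is a single expression. */
uint32_t mul16_in_c(uint16_t x, uint16_t y)
{
    return (uint32_t)x * y;
}
```

Both functions return the same product for every pair of inputs. The difference lies only in how much of the work the programmer has to write out.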

Most often, functions written in ‘C’ take parameters and return value(s) to the calling
function. The major question that needs to be addressed in mixing a ‘C’ function with
Assembly is how the parameters are passed to the function, how values are returned from
the function and how the function is invoked from the assembly language environment.
Parameters are passed to the function and values are returned from the function using CPU
registers, stack memory and fixed memory. The implementation is cross compiler dependent
and varies across cross compilers. A typical example is given below for the Keil C51 cross
compiler. C51 allows passing of a maximum of three arguments through general purpose
registers R2 to R7. If the three arguments are char variables, they are passed to the
function using registers R7, R5 and R3 respectively.

If the parameters are int values, they are passed using register pairs (R7, R6), (R5, R4) and
(R3, R2). If the number of arguments is greater than three, the first three arguments are
passed through registers and the rest are passed through fixed memory locations. Refer to
the C51 documentation for more details. Return values are usually passed through general
purpose registers. R7 is used for returning a char value and the register pair (R7, R6) is
used for returning an int value. The ‘C’ subroutine can be invoked from the assembly
program using the subroutine call Assembly instruction (again cross compiler dependent).

E.g. LCALL _Cfunction

where Cfunction is a function written in ‘C’. The prefix _ informs the cross compiler that
the parameters to the function are passed through registers. If the function is invoked
without the _ prefix, it is understood that the parameters are passed through fixed memory
locations.

9.2.3.3 Inline Assembly

Inline assembly is another technique for inserting target processor/controller specific
Assembly instructions at any location of source code written in the high level language
‘C’. This avoids the delay in calling an assembly routine from ‘C’ code (which is incurred
if the Assembly instructions to be inserted are put in a subroutine, as mentioned in the
section on mixing assembly with ‘C’). Special keywords are used to indicate the start and
end of the Assembly instructions. The keywords are cross-compiler specific. C51 uses the
keywords #pragma asm and #pragma endasm to indicate a block of code written in assembly.

E.g. #pragma asm

MOV A, #13H

#pragma endasm

Important Note:

The examples used for illustration throughout the section Mixing Assembly & High Level
Language are Keil C51 cross compiler specific. The operation is cross compiler dependent
and it varies from cross compiler to cross compiler. The intention of the author is just to
give an overall idea about the mixing of Assembly code and the high level language ‘C’ in
writing embedded programs. Readers are advised to go through the documentation of the
cross compiler they are using to understand the procedure adopted by that cross compiler.

QUESTION BANK
1. Discuss characteristics of an embedded system. (6)
2. Explain non-operational quality attributes. (6)
3. Explain data flow graph and control data flow graph. (5)
4. What is operational quality attribute? Explain the important non-operational
quality attributes to be considered in any embedded system design. (8)
5. What is hardware software co-design? Explain the fundamental issues in
hardware software co-design. (7)
6. With a state diagram explain automatic seat belt control problem. (5)
7. Discuss the embedded system used in washing machine. (6)
8. Discuss the embedded system used in automotive machine.(6)
9. With a neat block diagram explain assembly to hex file creation. (5)
10. List the advantages and drawbacks of assembly to hex file creation. (4)
11. Differentiate between compiler and cross compiler. (4)
12. Explain the embedded firmware design approaches.

Real-Time Operating System (RTOS) based
Embedded System Design

The super loop executes the tasks sequentially in the order in which the tasks are listed
within the loop. Here every task is repeated at regular intervals and the task execution is
non-real time. As the number of tasks increases, the time interval at which a task gets
serviced also increases. If some of the tasks involve waiting for external events or I/O
device usage, the task execution time also gets pushed off in accordance with the ‘wait’
time consumed by the task. In a super loop based execution, the priority in which a task is
to be executed is fixed and is determined by the task placement within the loop. This type
of firmware execution is suited for embedded devices where the response time for a task is
not time critical. Typical examples are electronic toys and video gaming devices. Here any
response delay is acceptable and it will not create any operational issues or potential
hazards. Whereas certain applications demand time critical response to tasks/events, and
any delay in the response may become catastrophic. Flight control systems, air bag control
and Anti-lock Braking Systems (ABS) for vehicles, nuclear monitoring devices, etc. are
typical examples of applications/devices demanding time critical task response.
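The super loop behaviour described above can be sketched in a few lines of C. The task names and the run_log buffer are hypothetical and exist only so that the execution order can be observed on a PC. Real firmware would call super_loop_pass from an endless while (1) loop.

```c
#include <stddef.h>

typedef void (*task_fn)(void);

/* Small log so the order of execution can be inspected (demo only). */
static char   run_log[16];
static size_t run_len;

static void log_task(char id)
{
    if (run_len < sizeof run_log - 1) {
        run_log[run_len++] = id;
        run_log[run_len] = '\0';
    }
}

static void task_read_keys(void)      { log_task('K'); }
static void task_update_display(void) { log_task('D'); }
static void task_blink_led(void)      { log_task('L'); }

/* The position in this table is the only form of "priority". */
static const task_fn demo_tasks[] = {
    task_read_keys, task_update_display, task_blink_led
};

/* One pass of the super loop: run every task once, in table order.
 * A slow or waiting task delays every task that comes after it. */
void super_loop_pass(const task_fn *tasks, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        tasks[i]();
    }
}
```

Every additional entry in the table lengthens one pass, which is why the service interval of each task grows with the number of tasks.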
How is the increasing need for time critical response to tasks/events addressed in
embedded applications? Well, the answer is:
1. Assign priority to tasks and execute the high priority task when the task is ready to
execute.
2. Dynamically change the priorities of tasks if required on a need basis.
3. Schedule the execution of tasks based on the priorities.
4. Switch the execution of task when a task is waiting for an external event or a system
resource including I/O device operation.

The introduction of operating system based firmware execution in embedded devices can
address these needs to a greater extent.
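The four points listed above come down to one recurring decision: among the tasks that are ready, pick the one with the highest priority. The following is a minimal sketch of that decision in C. The task_t structure, its fields and the convention that a larger number means a more urgent task are assumptions made for illustration and are not taken from any particular RTOS.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int  priority;  /* larger value = more urgent (assumption of this sketch) */
    bool ready;     /* false while the task waits for an event or resource    */
} task_t;

/* Return the index of the highest-priority ready task, or -1 if no task
 * is ready to run. */
int pick_next_task(const task_t *tasks, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].ready &&
            (best < 0 || tasks[i].priority > tasks[best].priority)) {
            best = (int)i;
        }
    }
    return best;
}
```

Clearing the ready flag models a task that waits for an external event, and writing a new value to priority models a dynamic priority change. In both cases the next call to pick_next_task reflects the new situation.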

10.1 OPERATING SYSTEM BASICS

The operating system acts as a bridge between the user applications/tasks and the
underlying system resources through a set of system functionalities and services. The OS
manages the system resources and makes them available to the user applications/tasks on
a need basis. A normal computing system is a collection of different I/O subsystems,
working memory and storage memory. The primary functions of an operating system are to:
• Make the system convenient to use
• Organize and manage the system resources efficiently and correctly
Figure 10.1 gives an insight into the basic components of an operating system and their
interfaces with the rest of the world.

ARM INTERNAL TEST PAPERS

1.a. Explain the TBB instruction with an example.

A. Table Branch Byte (TBB) and Table Branch Halfword (TBH) are for implementing
branch tables. The TBB instruction uses a branch table of byte-size offsets. Since bit 0 of
the program counter is always zero, the value in the branch table is multiplied by two
before it is added to PC. Furthermore, because the PC value is the current instruction
address plus four, the branch range for TBB is (2 × 255) + 4 = 514, and the branch range
for TBH is (2 × 65535) + 4 = 131074.

TBB has this general syntax:

TBB.W [Rn, Rm]

where Rn is the base memory offset and Rm is the branch table index. The branch table
item for TBB is located at Rn + Rm. Assuming PC is used for Rn, the operation is as
illustrated by the following example.

In ARM assembler (armasm), the TBB branch table can be created in the following way:

    TBB.W [pc, r0]                  ; when executing this instruction, PC equals
                                    ; branchtable
branchtable
    DCB ((dest0 - branchtable)/2)   ; Note that DCB is used because
                                    ; the value is 8-bit
    DCB ((dest1 - branchtable)/2)
    DCB ((dest2 - branchtable)/2)
    DCB ((dest3 - branchtable)/2)
dest0
    ...                             ; Execute if r0 = 0
dest1
    ...                             ; Execute if r0 = 1
dest2
    ...                             ; Execute if r0 = 2
dest3
    ...                             ; Execute if r0 = 3
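The address arithmetic behind TBB [PC, Rm] can be modelled with a short C function. This is only a host-side sketch of the calculation described in the answer (PC reads as the instruction address plus four, and each table entry counts halfwords). The function name tbb_target and its parameters are hypothetical.

```c
#include <stdint.h>

/* Model of the address that TBB [PC, Rm] branches to.
 *   instr_addr : address of the TBB instruction itself
 *   table      : byte offsets stored immediately after the instruction
 *   index      : value of Rm */
uint32_t tbb_target(uint32_t instr_addr, const uint8_t *table, uint32_t index)
{
    uint32_t pc = instr_addr + 4u;            /* PC = instruction address + 4 */
    return pc + 2u * (uint32_t)table[index];  /* entries are in halfwords     */
}
```

With the largest possible table entry of 255, the target lies 514 bytes beyond the instruction address, which matches the branch range stated for TBB.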
1.b. What is a bit-band memory region? Explain the advantages of using bit-band
addressing.

A.
• Bit-band operation support allows a single load/store operation to access (read/write)
a single data bit.
• In the Cortex-M3, this is supported in two predefined memory regions called bit-band
regions.
• One of them is located in the first 1 MB of the SRAM region, and the other is located in
the first 1 MB of the peripheral region.
• These two memory regions can be accessed like normal memory, but they can also be
accessed via a separate memory region called the bit-band alias.

Bit Accesses to Bit-Band Region via the Bit-Band Alias

• For example, to set bit 2 in word data in address 0x20000000, instead of using three
instructions to read the data, set the bit, and then write back the result, this task can
be carried out by a single instruction.

• The assembler sequence for these two cases could be like the one shown in Figure.

Write to Bit-Band Alias

Example Assembler Sequence to Write a Bit with and without BitBand

• Cortex-M3 does not have special instructions for bit operation.

• Special memory regions are defined so that data accesses to these regions are
automatically converted into bit-band operations.
Read from the Bit-Band Alias
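The mapping from a bit in a bit-band region to its word in the bit-band alias can be written as a small C function. The sketch assumes the standard Cortex-M3 memory map, in which the SRAM bit-band region starts at 0x20000000 with its alias at 0x22000000, and the peripheral bit-band region starts at 0x40000000 with its alias at 0x42000000. The function name bitband_alias is hypothetical, and a word-aligned address with a bit number from 0 to 31 is assumed.

```c
#include <stdint.h>

#define SRAM_BITBAND_BASE    0x20000000u
#define SRAM_ALIAS_BASE      0x22000000u
#define PERIPH_BITBAND_BASE  0x40000000u
#define PERIPH_ALIAS_BASE    0x42000000u
#define BITBAND_REGION_SIZE  0x00100000u   /* first 1 MB of each region */

/* Return the alias address for one bit of a word in a bit-band region,
 * or 0 if the address is outside both regions or the bit number is
 * larger than 31. */
uint32_t bitband_alias(uint32_t addr, unsigned bit)
{
    uint32_t region_base, alias_base;

    if (bit > 31u) {
        return 0u;
    }
    if (addr >= SRAM_BITBAND_BASE &&
        addr - SRAM_BITBAND_BASE < BITBAND_REGION_SIZE) {
        region_base = SRAM_BITBAND_BASE;
        alias_base  = SRAM_ALIAS_BASE;
    } else if (addr >= PERIPH_BITBAND_BASE &&
               addr - PERIPH_BITBAND_BASE < BITBAND_REGION_SIZE) {
        region_base = PERIPH_BITBAND_BASE;
        alias_base  = PERIPH_ALIAS_BASE;
    } else {
        return 0u;
    }

    /* Every bit of the region occupies one 32-bit word in the alias. */
    return alias_base + (addr - region_base) * 32u + bit * 4u;
}
```

For bit 2 of the word at 0x20000000, the function returns 0x22000008. On a real device, writing 1 to that alias word sets the bit with a single store.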

Advantages of Bit-Band Operations

• Implement serial data transfers in general-purpose input/output (GPIO) ports to serial
devices.

• If a branch should be carried out based on a single bit in a status register in a
peripheral, instead of:

 Reading the whole register

 Masking the unwanted bits

 Comparing and branching

• You can simplify the operations to:

 Reading the status bit via the bit-band alias (get 0 or 1)

 Comparing and branching


2.a. What is CMSIS? Discuss its advantages and explain the CMSIS structure.

A. Advantages of CMSIS (Cortex Microcontroller Software Interface Standard):
• improve software portability and reusability
• enable software solution suppliers to develop products that can work seamlessly with
device libraries from various silicon vendors
• allow embedded developers to develop software quicker with an easy-to-use and
standardized software interface
• allow embedded software to be used on multiple compiler products
• avoid device driver compatibility issues when using software solutions from multiple
sources

The CMSIS is divided into multiple layers as follows:

Core Peripheral Access Layer
• Name definitions, address definitions, and helper functions to access core registers and
core peripherals

Middleware Access Layer
• Common method to access peripherals for the software industry (work in progress)
• Targeted communication interfaces include Ethernet, UART, and SPI.
• Allows portable software to perform communication tasks on any Cortex microcontrollers
that support the required communication interface

Device Peripheral Access Layer (MCU specific)
• Name definitions, address definitions, and driver code to access peripherals

Access Functions for Peripherals (MCU specific)
• Optional additional helper functions for peripherals

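2.b. Write an IT statement for the following code:

if (R0 > R1) {
    R3 = R0 + R4;
    R5 = R3 / 2;
} else {
    R3 = R0 - R4;
    R5 = R3 * 2;
}

A. Conditional branch instructions can be replaced with an IF-THEN (IT) instruction block.
For this code, a CMP R0, R1 followed by ITTEE GT makes the next two instructions execute
when R0 > R1 and the two after them execute otherwise.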

3.a. Discuss any two methods used to access the peripheral registers.

A. Accessing Memory-Mapped Registers in C

• There are various ways to access memory-mapped peripheral registers in C language.

• For illustration, we will use the System Tick (SYSTICK) Timer in the Cortex-M3 as an
example peripheral to demonstrate different access methods in C language.

• The SYSTICK is a 24-bit timer which contains only four registers.

Accessing Peripheral Registers as Pointers

• The easiest method is to define each register as a pointer.

Alternative Way of Accessing Peripheral Registers as Pointers

• We can define a macro to convert address values to C pointer.

• The C-code looks a bit different, but the generated code is the same as previous
implementation.
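
Both pointer-based styles can be sketched as follows. This is a minimal sketch that assumes the standard Cortex-M3 SYSTICK register addresses (0xE000E010 to 0xE000E01C); the macro names `SYSTICK_CTRL`, `HW_REG` and the function names are illustrative, not taken from a particular device header. The extra `uintptr_t` cast only keeps the code warning-free on hosts where pointers are wider than 32 bits.

```c
#include <stdint.h>

/* Style 1: each register defined as a dereferenced pointer constant. */
#define SYSTICK_CTRL  (*((volatile uint32_t *)(uintptr_t)0xE000E010u))
#define SYSTICK_LOAD  (*((volatile uint32_t *)(uintptr_t)0xE000E014u))
#define SYSTICK_VAL   (*((volatile uint32_t *)(uintptr_t)0xE000E018u))
#define SYSTICK_CALIB (*((volatile uint32_t *)(uintptr_t)0xE000E01Cu))

/* Style 2: one macro converts any address value into a register access. */
#define HW_REG(addr)  (*((volatile uint32_t *)(uintptr_t)(addr)))

/* Start the timer using style 1. Only meaningful on a Cortex-M3 target:
 * calling this on a desktop host would access invalid memory. */
void systick_start(uint32_t reload)
{
    SYSTICK_LOAD = reload & 0x00FFFFFFu; /* 24-bit reload value */
    SYSTICK_VAL  = 0u;                   /* clear the current value */
    SYSTICK_CTRL = 0x5u;                 /* ENABLE | CLKSOURCE, no interrupt */
}

/* The same sequence using style 2; the generated code should be equivalent. */
void systick_start_alt(uint32_t reload)
{
    HW_REG(0xE000E014u) = reload & 0x00FFFFFFu;
    HW_REG(0xE000E018u) = 0u;
    HW_REG(0xE000E010u) = 0x5u;
}
```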

Accessing Peripheral Registers as Pointers to Elements in a Data Structure

• Method 2 is to define the registers as a data structure, and then define a pointer of the
defined structure.

• This is the method used in CMSIS compliant device driver libraries.
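
A minimal sketch of the data-structure method for the SYSTICK timer is shown below. The type name `SysTickRegs` and the macro names are hypothetical (CMSIS headers define their own equivalents); the base address 0xE000E010 and the register order CTRL, LOAD, VAL, CALIB are the standard Cortex-M3 layout.

```c
#include <stddef.h>
#include <stdint.h>

/* The four SYSTICK registers laid out in address order. */
typedef struct {
    volatile uint32_t CTRL;   /* offset 0x00: control and status */
    volatile uint32_t LOAD;   /* offset 0x04: reload value */
    volatile uint32_t VAL;    /* offset 0x08: current value */
    volatile uint32_t CALIB;  /* offset 0x0C: calibration value */
} SysTickRegs;

#define SYSTICK_BASE 0xE000E010u
/* A pointer to the structure placed at the peripheral base address. */
#define SYSTICK      ((SysTickRegs *)(uintptr_t)SYSTICK_BASE)

/* Stop the timer. Only meaningful on a Cortex-M3 target. */
void systick_stop(void)
{
    SYSTICK->CTRL = 0u;
}
```

Because the compiler computes each register address as base plus member offset, one base-address definition covers the whole peripheral, and the same structure type can be reused when a device has several instances of a peripheral.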

3.b. WCP to turn on the LED.

4.a. With a neat diagram explain the detailed memory map of ARM cortex M3.

A. Memory Maps

• Some of the memory locations are allocated for private peripherals such as debugging
components.
• These debugging components include the following:

  - Flash Patch and Breakpoint Unit (FPB)
  - Data Watchpoint and Trace Unit (DWT)
  - Instrumentation Trace Macrocell (ITM)
  - Embedded Trace Macrocell (ETM)
  - Trace Port Interface Unit (TPIU)
  - ROM table
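
Since the memory map figure is not reproduced here, the predefined Cortex-M3 regions can be summarized as a small address classifier. This is a sketch under the standard Cortex-M3 map; the enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    REGION_CODE,             /* 0x00000000 - 0x1FFFFFFF */
    REGION_SRAM,             /* 0x20000000 - 0x3FFFFFFF */
    REGION_PERIPHERAL,       /* 0x40000000 - 0x5FFFFFFF */
    REGION_EXTERNAL_RAM,     /* 0x60000000 - 0x9FFFFFFF */
    REGION_EXTERNAL_DEVICE,  /* 0xA0000000 - 0xDFFFFFFF */
    REGION_INTERNAL_PPB,     /* 0xE0000000 - 0xE003FFFF: ITM, DWT, FPB, NVIC */
    REGION_EXTERNAL_PPB,     /* 0xE0040000 - 0xE00FFFFF: TPIU, ETM, ROM table */
    REGION_VENDOR_SPECIFIC   /* 0xE0100000 - 0xFFFFFFFF */
} Cm3Region;

/* Hypothetical helper: classify a 32-bit address into its memory map region. */
Cm3Region cm3_region(uint32_t addr)
{
    if (addr <= 0x1FFFFFFFu) return REGION_CODE;
    if (addr <= 0x3FFFFFFFu) return REGION_SRAM;
    if (addr <= 0x5FFFFFFFu) return REGION_PERIPHERAL;
    if (addr <= 0x9FFFFFFFu) return REGION_EXTERNAL_RAM;
    if (addr <= 0xDFFFFFFFu) return REGION_EXTERNAL_DEVICE;
    if (addr <= 0xE003FFFFu) return REGION_INTERNAL_PPB;
    if (addr <= 0xE00FFFFFu) return REGION_EXTERNAL_PPB;
    return REGION_VENDOR_SPECIFIC;
}
```

For example, the DWT at 0xE0001000 falls in the internal private peripheral bus range, while the TPIU at 0xE0040000 and the ROM table at 0xE00FF000 fall in the external private peripheral bus range.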

4.b. Explain the components of a typical Embedded System with a neat diagram.

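A. [Figure: Components of a typical embedded system. The system core (microprocessor/controller,
FPGA/ASIC/DSP/SoC, embedded firmware) is connected to memory, a communication interface,
input ports (sensors), output ports (actuators), and other supporting integrated circuits and
subsystems. The embedded system interacts with the real world through the sensors and actuators.]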

5.a. What are the different processors used in Embedded System?

A. The Core of the Embedded Systems

The core of the embedded system falls into any one of the following categories.

General Purpose and Domain Specific Processors

• Microprocessors
• Microcontrollers
• Digital Signal Processors
• Programmable Logic Devices (PLDs)
• Application Specific Integrated Circuits (ASICs)
• Commercial off the shelf Components (COTS)

A General Purpose Processor (GPP) is a processor designed for general computational tasks.
GPPs are produced in large volumes and target the general market. Due to the high-volume
production, the per-unit cost of a chip is low compared to an ASIC or other specific ICs.
A typical general purpose processor contains an Arithmetic and Logic Unit (ALU) and a Control
Unit (CU).

Application Specific Instruction Set Processors (ASIPs) are processors with an architecture
and instruction set optimized for specific domain/application requirements such as network
processing, automotive, telecom, media applications, digital signal processing and control
applications.

ASIPs fill the architectural spectrum between General Purpose Processors and Application
Specific Integrated Circuits (ASICs). The need for an ASIP arises when traditional general
purpose processors are unable to meet the increasing application needs. Some
microcontrollers (like the Automotive AVR and USB AVR from Atmel), Systems on Chip and Digital
Signal Processors are examples of ASIPs. An ASIP incorporates a processor, the on-chip
peripherals demanded by the application requirement, and program and data memory.

5.b. Mention the different applications of an embedded system.

A. Major Application Areas of Embedded Systems

• Consumer Electronics: Camcorders, Cameras etc.
• Household Appliances: Television, DVD players, Washing machine, Fridge,
Microwave Oven etc.
• Home Automation and Security Systems: Air conditioners, sprinklers, Intruder
detection alarms, Closed Circuit Television Cameras, Fire alarms etc.
• Automotive Industry: Anti-lock braking systems (ABS), Engine Control, Ignition
Systems, Automatic Navigation Systems etc.
• Telecom: Cellular Telephones, Telephone switches, Handset Multimedia Applications
etc.
• Computer Peripherals: Printers, Scanners, Fax machines etc.
• Computer Networking Systems: Network Routers, Switches, Hubs, Firewalls etc.
• Health Care: Different kinds of Scanners, EEG, ECG Machines etc.
• Measurement & Instrumentation: Digital multimeters, Digital CROs, Logic Analyzers,
PLC systems etc.
• Banking & Retail: Automatic Teller Machines (ATM), Currency counters, Point of
Sales (POS)
• Card Readers: Barcode, Smart Card Readers, Hand held Devices etc.

6.a.Discuss the different RAM memories used in embedded systems with relevant
circuit diagrams.

A. Memory – Read-Write Memory/Random Access Memory (RAM)

• RAM is the data memory or working memory of the controller/processor
• RAM is volatile, meaning when the power is turned off, all the contents are destroyed
• RAM is a direct access memory, meaning we can access the desired memory location
directly without the need for traversing through the entire memory locations to reach
the desired memory position (i.e. Random Access of memory location)

[Figure: Read/Write Memory (RAM) is classified into SRAM, DRAM and NVRAM]

Memory – RAM – Static RAM (SRAM)

• Static RAM stores data in the form of voltage. SRAM cells are made up of flip-flops
• In a typical implementation, an SRAM cell (bit) is realized using 6 transistors (or 6
MOSFETs). Four of the transistors are used for building the latch (flip-flop) part of
the memory cell and 2 for controlling the access.
• Static RAM is the fastest form of RAM available. SRAM is fast in operation due to its
resistive networking and switching capabilities

[Figure: SRAM cell implementation with transistors Q1 to Q6, bit lines B and B\, Vcc and the
word line]

Memory – RAM – Dynamic RAM (DRAM)

• Dynamic RAM stores data in the form of charge. DRAM cells are made up of MOS transistor
gates
• The advantages of DRAM are its high density and low cost compared to SRAM
• The disadvantage is that since the information is stored as charge, it leaks away
with time; to prevent this, the cells need to be refreshed periodically
• Special circuits called DRAM controllers are used for the refreshing operation. The
refresh operation is done periodically at intervals of milliseconds

[Figure: DRAM cell implementation with bit line B, word line and storage capacitor]

Memory – RAM – Non Volatile RAM (NVRAM)

• Random access memory with battery backup
• It contains Static RAM based memory and a minute battery for providing supply to
the memory in the absence of external power supply
• The memory and battery are packed together in a single package
• NVRAM is used for the non-volatile storage of results of operations or for setting up
of flags etc.
• The life span of NVRAM is expected to be around 10 years
• DS1744 from Maxim/Dallas is an example of a 32KB NVRAM

6.b. Explain the working of a relay driver with a neat diagram.

A. Relay

• An electromechanical device which acts as a dynamic path selector for signals and
power
• The ‘Relay’ unit contains a relay coil made up of insulated wire on a metal core and
a metal armature with one or more contacts.
• ‘Relay’ works on the electromagnetic principle. When a voltage is applied to the relay coil,
current flows through the coil, which in turn generates a magnetic field. The magnetic
field attracts the armature core and moves the contact point. The movement of the
contact point changes the power/signal flow path

[Figure: Relay configurations: Single Pole Single Throw Normally Open, Single Pole Single
Throw Normally Closed, and Single Pole Double Throw]

Relay Driver Circuit

• The relay is normally controlled using a relay driver circuit connected to the port pin
of the processor/controller
• A transistor can be used as the relay driver. The transistor can be selected depending
on the relay driving current requirements

[Figure: Transistor-based relay driver circuit showing Vcc, the freewheeling diode, the relay
coil, the port pin, the relay unit and the load]

7.a. Discuss the I2C communication interface with neat diagram

A. On-board Communication Interface - I2C

• Inter Integrated Circuit Bus (I2C - pronounced ‘I square C’) is a synchronous
bi-directional half duplex (one-directional communication at a given point of time) two
wire serial interface bus
• The concept of the I2C bus was developed by ‘Philips Semiconductors’ in the early 1980s.
The original intention of I2C was to provide an easy way of connection between a
microprocessor/microcontroller system and the peripheral chips in Television sets
• The I2C bus comprises two bus lines, namely Serial Clock (SCL) and Serial
Data (SDA). The SCL line is responsible for generating synchronization clock pulses and
SDA is responsible for transmitting the serial data across devices.
• The I2C bus is a shared bus system to which many I2C devices can be
connected. Devices connected to the I2C bus can act as either a ‘Master’ device or a
‘Slave’ device
• The ‘Master’ device is responsible for controlling the communication by
initiating/terminating data transfer, sending data and generating the necessary
synchronization clock pulses
• ‘Slave’ devices wait for the commands from the master and respond upon receiving
the commands
• ‘Master’ and ‘Slave’ devices can act as either transmitter or receiver
• Regardless of whether a master is acting as transmitter or receiver, the synchronization
clock signal is generated by the ‘Master’ device only
• I2C supports multiple masters on the same bus
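
The notes above do not spell out how a master selects a slave. In standard 7-bit I2C addressing, the first byte the master sends after the start condition carries the 7-bit slave address followed by a read/write bit (0 for write, 1 for read). A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Hypothetical helper: build the first byte of an I2C transfer from a
 * 7-bit slave address and the transfer direction (non-zero means read). */
uint8_t i2c_address_byte(uint8_t addr7, int read)
{
    return (uint8_t)(((addr7 & 0x7Fu) << 1) | (read ? 1u : 0u));
}
```

For instance, a slave at 7-bit address 0x50 would be addressed with the byte 0xA0 for a write and 0xA1 for a read.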

[Figure: I2C bus interfacing. The SCL and SDA port pins of the master (microprocessor/
controller) connect to the SCL and SDA pins of Slave 1 (I2C device, e.g. serial EEPROM) and
Slave 2 (I2C device). Each bus line is pulled up to Vcc through a 2.2K resistor.]

7.b. Write a note on embedded firmware.


A. Embedded Firmware

• The control algorithm (program instructions) and/or the configuration settings that
an embedded system developer dumps into the code (program) memory of the
embedded system
• The embedded firmware can be developed using various methods, such as:
  - Writing the program in a high level language like Embedded C/C++ using an
  Integrated Development Environment (the IDE contains an editor,
  compiler, linker, debugger, simulator etc.; IDEs are different for different families
  of processors/controllers)
  - Writing the program in Assembly Language using the instructions supported by
  your application’s target processor/controller

8.a. Explain the operation of UART. Compare UART and USB.


A. UART:


[Figure: UART interfacing between two UART devices. TXD: Transmitter Line, RXD: Receiver Line]

USB:

• Universal Serial Bus (USB) is a wired high speed serial bus for data communication
• The USB communication system follows a star topology with a USB host at the center
and one or more USB peripheral devices/USB hosts connected to it
• A USB host can support up to 127 connections, including slave peripheral devices
and other USB hosts
• USB transmits data in packet format. Each data packet has a standard format. USB
communication is host initiated
• The USB host contains a host controller which is responsible for controlling the data
communication, including establishing connectivity with USB slave devices, and
packetizing and formatting the data packet. There are different standards for
implementing the USB Host Control interface, namely Open Host Control Interface
(OHCI) and Universal Host Control Interface (UHCI).
• The physical connection between a USB peripheral device and the master device is
established with a USB cable
• The USB cable supports a communication distance of up to 5 meters
• The USB standard uses two different types of connectors, namely ‘Type A’ and ‘Type
B’, at the ends of the USB cable for connecting the USB peripheral device and host
device
• The ‘Type A’ connector is used for upstream connection (connection with host) and the ‘Type
B’ connector is used for downstream connection (connection with slave device)

[Figure: USB device connection topology. A USB host (hub) connects Peripheral Devices 1, 2
and 3 and a second USB host (hub), which in turn connects Peripheral Devices 4 and 5.]

Pin No.   Pin Name   Description
1         VBUS       Carries power (5V)
2         D-         Differential data carrier line
3         D+         Differential data carrier line
4         GND        Ground signal line

External Communication Interface – Universal Serial Bus (USB)

• Each USB device contains a Product ID (PID) and a Vendor ID (VID)
• The PID and VID are embedded into the USB chip by the USB device manufacturer
• The VID for a device is supplied by the USB standards forum
• PID and VID are essential for loading the drivers corresponding to a USB device for
communication
• USB supports four different types of data transfers, namely Control, Bulk,
Isochronous and Interrupt
• Control transfer is used by USB system software to query, configure and issue
commands to the USB device
• Bulk transfer is used for sending a block of data to a device. Bulk transfer supports
error checking and correction. Transferring data to a printer is an example of bulk
transfer.
• Isochronous data transfer is used for real time data communication. In isochronous
transfer, data is transmitted as streams in real time. Isochronous transfer doesn’t
support error checking and re-transmission of data in case of any transmission loss
• Interrupt transfer is used for transferring small amounts of data. The interrupt transfer
mechanism makes use of a polling technique to see whether the USB device has any
data to send
• The frequency of polling is determined by the USB device and it varies from 1 to 255
milliseconds. Devices like mouse and keyboard, which transmit small amounts of
data, use interrupt transfer.

8.b. Bring out difference between serial and parallel communication.

A.

9.a. Elaborate the working of SPI bus with a neat interfacing diagram.

A. On-board Communication Interface – Serial Peripheral Interface (SPI) Bus

The Serial Peripheral Interface Bus (SPI) is a synchronous bi-directional full duplex four
wire serial interface bus. The concept of SPI was introduced by Motorola. SPI is a single master
multi-slave system. It is possible to have a system where more than one SPI device can be
master, provided that only one master device is active at any given point of time.
SPI requires four signal lines for communication. They are:

Master Out Slave In (MOSI): Signal line carrying the data from master to slave device. It
is also known as Slave Input/Slave Data In (SI/SDI)

Master In Slave Out (MISO): Signal line carrying the data from slave to master device. It is
also known as Slave Output/Slave Data Out (SO/SDO)

Serial Clock (SCLK): Signal line carrying the clock signals

Slave Select (SS): Signal line for slave device select. It is an active low
signal

[Figure: SPI bus interfacing. The MOSI, SCL and MISO lines of the master (microprocessor/
controller) are shared by Slave 1 (SPI device, e.g. serial EEPROM) and Slave 2 (SPI device,
e.g. LCD). The master drives a separate active low slave select line (SS1\, SS2\) to the SS\
pin of each slave.]

• The master device is responsible for generating the clock signal. The master device selects
the required slave device by asserting the corresponding slave device’s slave select signal
‘LOW’. The data out line (MISO) of every slave device that is not selected floats in the high
impedance state
• The serial data transmission through the SPI bus is fully configurable. SPI devices
contain a certain set of registers for holding these configurations. The Serial Peripheral
Control Register holds the various configuration parameters like master/slave
selection for the device, baud rate selection for communication, clock signal control
etc. The status register holds the status of various conditions for transmission and
reception.
• SPI works on the principle of the ‘Shift Register’. The master and slave devices contain
a special shift register for the data to transmit or receive. The size of the shift register
is device dependent. Normally it is a multiple of 8. During transmission from the
master to slave, the data in the master’s shift register is shifted out to the MOSI pin
and it enters the shift register of the slave device through the MOSI pin of the slave
device. At the same time the shifted out data bit from the slave device’s shift register
enters the shift register of the master device through the MISO pin
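
The shift-register exchange described above can be modelled on a host with two 8-bit registers. This is a minimal simulation sketch that assumes 8-bit shift registers and MSB-first shifting; the type and function names are hypothetical and no real SPI peripheral is involved.

```c
#include <stdint.h>

/* Each SPI device is modelled only by its 8-bit shift register. */
typedef struct {
    uint8_t shift_reg;
} SpiDevice;

/* Simulate 8 clock pulses. On every pulse the master shifts its most
 * significant bit out on MOSI while the slave shifts its most significant
 * bit out on MISO, and each side shifts the received bit in at the bottom. */
void spi_transfer8(SpiDevice *master, SpiDevice *slave)
{
    for (int clk = 0; clk < 8; clk++) {
        uint8_t mosi = (uint8_t)((master->shift_reg >> 7) & 1u);
        uint8_t miso = (uint8_t)((slave->shift_reg >> 7) & 1u);
        master->shift_reg = (uint8_t)((master->shift_reg << 1) | miso);
        slave->shift_reg  = (uint8_t)((slave->shift_reg << 1) | mosi);
    }
}
```

After eight clock pulses the two registers have swapped contents, which is why an SPI master always receives a byte for every byte it sends.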

9.b. Explain the role of Watchdog Timer in Embedded System with relevant example.

A.
• A watchdog timer (WDT) is a timer unit for monitoring the firmware execution
• Depending on the internal implementation, the watchdog timer increments or
decrements a free running counter with each clock pulse and generates a reset signal
to reset the processor if the count reaches zero for a down counting watchdog, or the
highest count value for an up counting watchdog
• If the watchdog counter is in the enabled state, the firmware can write a zero (for an up
counting watchdog implementation) to it before starting the execution of a piece of
code (a subroutine or portion of code which is susceptible to execution hang up) and
the watchdog will start counting. If the firmware execution doesn’t complete, due to
malfunctioning, within the time required by the watchdog to reach the maximum
count, the counter will generate a reset pulse and this will reset the processor
• If the firmware execution completes before the expiration of the watchdog timer, the
WDT can be stopped
• Most processors implement the watchdog as a built-in component and provide a
status register to control the watchdog timer (like enabling and disabling watchdog
functioning) and a watchdog timer register for writing the count value. If the
processor/controller doesn’t contain a built-in watchdog timer, the same can be
implemented using an external watchdog timer IC circuit.
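
The behaviour of an up counting watchdog can be illustrated with a small host-side simulation. This is a sketch only: the structure, the function names and the idea of measuring a task in clock ticks are assumptions made for illustration, not a real watchdog peripheral API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated up counting watchdog. */
typedef struct {
    uint32_t count;      /* free running counter */
    uint32_t max_count;  /* count value at which a reset is generated */
    bool     enabled;
} Watchdog;

/* The firmware writes zero to the counter before a guarded piece of code. */
void wdt_kick(Watchdog *w)
{
    w->count = 0u;
}

/* Advance the watchdog by one clock pulse.
 * Returns true if the watchdog generates a reset on this pulse. */
bool wdt_tick(Watchdog *w)
{
    if (!w->enabled) {
        return false;
    }
    w->count++;
    if (w->count >= w->max_count) {
        w->count = 0u;
        return true;
    }
    return false;
}

/* Run a guarded task that needs task_ticks clock pulses to complete.
 * Returns true if the watchdog reset the processor before completion. */
bool run_guarded(Watchdog *w, uint32_t task_ticks)
{
    wdt_kick(w);
    for (uint32_t t = 0u; t < task_ticks; t++) {
        if (wdt_tick(w)) {
            return true;
        }
    }
    return false;
}
```

With `max_count` set to 100, a task that finishes in 50 ticks completes normally, while a task that hangs for 150 ticks triggers the reset.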

[Figure: Watchdog timer. A free running counter driven by the system clock generates the
watchdog reset signal, which is connected to the reset pin of the microprocessor/controller.]

10.a. Explain the USB protocol with a neat diagram. Also discuss the different types of data
transfer in USB.

A.
• Universal Serial Bus (USB) is a wired high speed serial bus for data communication
• The USB communication system follows a star topology with a USB host at the center
and one or more USB peripheral devices/USB hosts connected to it
• A USB host can support up to 127 connections, including slave peripheral devices
and other USB hosts
• USB transmits data in packet format. Each data packet has a standard format. USB
communication is host initiated
• The USB host contains a host controller which is responsible for controlling the data
communication, including establishing connectivity with USB slave devices, and
packetizing and formatting the data packet. There are different standards for
implementing the USB Host Control interface, namely Open Host Control Interface
(OHCI) and Universal Host Control Interface (UHCI)

[Figure: USB device connection topology. A USB host (hub) connects Peripheral Devices 1, 2
and 3 and a second USB host (hub), which in turn connects Peripheral Devices 4 and 5.]

• The physical connection between a USB peripheral device and the master device is
established with a USB cable
• The USB cable supports a communication distance of up to 5 meters
• The USB standard uses two different types of connectors, namely ‘Type A’ and ‘Type
B’, at the ends of the USB cable for connecting the USB peripheral device and host
device
• The ‘Type A’ connector is used for upstream connection (connection with host) and the ‘Type
B’ connector is used for downstream connection (connection with slave device)

Pin No.   Pin Name   Description
1         VBUS       Carries power (5V)
2         D-         Differential data carrier line
3         D+         Differential data carrier line
4         GND        Ground signal line

External Communication Interface – Universal Serial Bus (USB)

• Each USB device contains a Product ID (PID) and a Vendor ID (VID)
• The PID and VID are embedded into the USB chip by the USB device manufacturer
• The VID for a device is supplied by the USB standards forum
• PID and VID are essential for loading the drivers corresponding to a USB device for
communication
• USB supports four different types of data transfers, namely Control, Bulk,
Isochronous and Interrupt
• Control transfer is used by USB system software to query, configure and issue
commands to the USB device
• Bulk transfer is used for sending a block of data to a device. Bulk transfer supports
error checking and correction. Transferring data to a printer is an example of bulk
transfer.
• Isochronous data transfer is used for real time data communication. In isochronous
transfer, data is transmitted as streams in real time. Isochronous transfer doesn’t
support error checking and re-transmission of data in case of any transmission loss
• Interrupt transfer is used for transferring small amounts of data. The interrupt transfer
mechanism makes use of a polling technique to see whether the USB device has any
data to send
• The frequency of polling is determined by the USB device and it varies from 1 to 255
milliseconds. Devices like mouse and keyboard, which transmit small amounts of
data, use interrupt transfer.

10. Elaborate the working of the brown-out protection circuit for an embedded system.

A.
• A brown-out protection circuit prevents the processor/controller from unexpected
program execution behavior when the supply voltage to the processor/controller falls
below a specified voltage
• The processor behavior may not be predictable if the supply voltage falls below the
recommended operating voltage. It may lead to situations like data corruption
• A brown-out protection circuit holds the processor/controller in the reset state when the
operating voltage falls below the threshold, until it rises above the threshold voltage
• Certain processors/controllers support a built-in brown-out protection circuit which
monitors the supply voltage internally
• If the processor/controller doesn’t integrate a built-in brown-out protection circuit,
the same can be implemented using external passive circuits or supervisor ICs

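[Figure: Brown-out protection circuit built from a Zener diode DZ (Vz), a transistor Q (VBE)
and resistors R1, R2 and R3 between Vcc and GND, producing an active low reset pulse]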

3rd Internals Question Paper

1.a. List all the characteristics of embedded systems.

Ans.- Each embedded system possesses a set of characteristics which are unique to it. Some
important characteristics of embedded systems are:

• Application & Domain Specific
• Reactive & Real Time
• Operates in ‘harsh’ environment
• Distributed
• Small size and Weight
• Power Concerns

1.b. Explain 3 characteristics of embedded systems.

Ans.- 1 ) Embedded systems are typically designed to meet real time constraints; a real
time system reacts to stimuli from the controlled object/ operator within the time interval
dictated by the environment. For real time systems, right answers arriving too late (or
even too early) are wrong.

2) Embedded systems often interact (sense, manipulate & communicate) with external
world through sensors and actuators and hence are typically reactive systems; a reactive
system is in continual interaction with the environment and executes at a pace
determined by that environment.

3) They generally have minimal or no user interface.

2.a. Describe the embedded system used in a washing machine.

Ans.-

• Extensively used in home automation for washing and drying clothes
• Contains user interface units (I/O) like keypads, display unit and LEDs for accepting user
inputs and providing visual indications
• Contains sensors like water level sensor, temperature sensor etc.
• Contains actuators like spin and agitation control motor units
• Contains an integrated embedded controller for controlling the washing operations
• Sensors, actuators and I/O devices are interfaced to the I/O subsystem of the embedded
control unit

3. Discuss embedded systems in the automotive market with a neat sketch.

Ans.-
Generally built around Digital Signal Processors, Application Specific Instruction Set
Processors (like Atmel Automotive AVR) and General purpose processors/Controllers,
System on Chips, Programmable Logic Devices or Application Specific Integrated/Standard
Products (ASIC/ASSP) or a combination of these.

Automotive embedded control units are generally known as Electronic Control Units (ECUs).
ECUs range from simple mirror control units to complex airbag deployment and Antilock
Braking Systems (ABS).

The number of embedded controllers in an ordinary vehicle varies from 20 to 40, whereas a
luxury vehicle may contain hundreds of embedded control units. The first embedded system used
in an automotive application was the microprocessor based fuel injection system introduced in
the Volkswagen 1600 in 1968.

4.a. With a state diagram, explain the automatic seat belt control problem.

Ans.- Requirement:

When the vehicle ignition is turned on and the seat belt is not fastened within 10 seconds
of ignition ON, the system generates an alarm signal for 5 seconds. The alarm is turned off
when the alarm time (5 seconds) expires, or if the driver/passenger fastens the belt, or if the
ignition switch is turned off, whichever happens first.
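
Since the state diagram itself is not reproduced here, the requirement can be sketched as a three-state machine. The state and event names below are assumptions chosen for illustration, and the timers are modelled as events (a 10 second wait timer and a 5 second alarm timer expiring) rather than implemented.

```c
typedef enum { ALARM_OFF, WAITING, ALARM_ON } BeltState;

typedef enum {
    EV_IGNITION_ON,
    EV_IGNITION_OFF,
    EV_BELT_FASTENED,
    EV_WAIT_EXPIRED,   /* 10 seconds elapsed since ignition ON */
    EV_ALARM_EXPIRED   /* alarm has sounded for 5 seconds */
} BeltEvent;

/* Return the next state for the given current state and event.
 * Simplification: the belt is assumed to be unfastened at ignition ON. */
BeltState seatbelt_next(BeltState state, BeltEvent event)
{
    switch (state) {
    case ALARM_OFF:
        return (event == EV_IGNITION_ON) ? WAITING : ALARM_OFF;
    case WAITING:
        if (event == EV_WAIT_EXPIRED) {
            return ALARM_ON;
        }
        if (event == EV_BELT_FASTENED || event == EV_IGNITION_OFF) {
            return ALARM_OFF;
        }
        return WAITING;
    case ALARM_ON:
        if (event == EV_ALARM_EXPIRED || event == EV_BELT_FASTENED ||
            event == EV_IGNITION_OFF) {
            return ALARM_OFF;
        }
        return ALARM_ON;
    }
    return ALARM_OFF;
}
```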

4.b. Discuss any two operational quality attributes of embedded systems.

Ans.- Response, Throughput

5.a. Bring out the differences between DFG and CDFG.

Ans.- Data Flow Graph/Diagram (DFG) Model

Translates the data processing requirements into a data flow graph.

A data-driven model in which the program execution is determined by data.

Emphasizes the data and the operations on the data which transform the input data to
output data.

A visual model in which the operation on the data (process) is represented using a block
(circle) and data flow is represented using arrows. An inward arrow to the process (circle)
represents input data and an outward arrow from the process (circle) represents output data
in DFG notation.

Best suited for modelling embedded systems which are computationally intensive (like DSP
applications).

6.a. Explain Hex file creation.

Ans.

6.b. Differentiate between simulator and emulator.

Ans.-

7.a. With a neat block diagram, explain the operating system architecture.

Ans.- The Operating System (OS) acts as a bridge between the user applications/tasks and the
underlying system resources through a set of system functionalities and services. The OS
manages the system resources and makes them available to the user applications/tasks on
a need basis. The primary functions of an operating system are to:

• Make the system convenient to use
• Organize and manage the system resources efficiently and correctly

8.b. Define cross compiler.

Ans.- A cross compiler is a compiler capable of creating executable code for a platform other
than the one on which the compiler is running. For example, a compiler that runs on a
Windows 7 PC but generates code that runs on an Android smartphone is a cross compiler.

9.a. What is meant by pre-empting a process? Discuss different techniques.

Ans.- Pre-emptive Multitasking: Pre-emptive multitasking ensures that every task/process
gets a chance to execute. When and how much time a process gets is dependent on the
implementation of the pre-emptive scheduling. As the name indicates, in pre-emptive
multitasking, the currently running task/process is pre-empted to give a chance to other
tasks/processes to execute. The pre-emption of a task may be based on time slots or
task/process priority.

Pre-emptive scheduling – Pre-emptive SJF Scheduling/Shortest Remaining Time (SRT)

The non-pre-emptive SJF scheduling algorithm sorts the ‘Ready’ queue only after the
current process completes execution or enters wait state, whereas the pre-emptive SJF
scheduling algorithm sorts the ‘Ready’ queue when a new process enters the ‘Ready’ queue
and checks whether the execution time of the new process is shorter than the remaining part
of the total estimated execution time of the currently executing process. If the execution time
of the new process is less, the currently executing process is pre-empted and the new
process is scheduled for execution. The algorithm thus always compares the execution
completion time (i.e. the remaining execution time) of a new process entering the ‘Ready’ queue
with the remaining time for completion of the currently executing process and schedules
the process with the shortest remaining time for execution.

Pre-emptive SJF Scheduling

Three processes with process IDs P1, P2, P3 with estimated completion times of 10, 5 and 7
milliseconds respectively enter the ready queue together. A new process P4 with estimated
completion time 2ms enters the ‘Ready’ queue after 2ms. Assume all the processes contain
only CPU operations and no I/O operations are involved.

At the beginning, there are only three processes (P1, P2 and P3) available in the ‘Ready’
queue and the SRT scheduler picks up the process with the shortest remaining time for
execution completion (in this example P2 with remaining time 5ms) for scheduling. Process
P4 with estimated execution completion time 2ms enters the ‘Ready’ queue after 2ms of start
of execution of P2. At that point P2 still needs 3ms, which is more than the 2ms needed by
P4, so P2 is pre-empted. The processes are re-scheduled for execution in the following order:
P2 (2ms), P4 (2ms), P2 (3ms), P3 (7ms), P1 (10ms).
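
The example can be checked with a small simulation of SRT scheduling in 1ms steps. This is a minimal sketch: the structure and function names are hypothetical, only CPU time is modelled, and ties between equal remaining times are broken in favour of the process listed first.

```c
#include <limits.h>

typedef struct {
    int arrival;     /* time the process enters the 'Ready' queue (ms) */
    int burst;       /* estimated execution completion time (ms) */
    int remaining;   /* filled in by the scheduler */
    int completion;  /* time at which the process finishes (ms) */
} Proc;

/* Simulate pre-emptive SJF (SRT) scheduling and record completion times. */
void srt_schedule(Proc p[], int n)
{
    int done = 0;

    for (int i = 0; i < n; i++) {
        p[i].remaining = p[i].burst;
        p[i].completion = -1;
        if (p[i].burst <= 0) {          /* nothing to execute */
            p[i].completion = p[i].arrival;
            done++;
        }
    }
    for (int t = 0; done < n; t++) {
        int pick = -1;
        int best = INT_MAX;

        /* Choose the arrived, unfinished process with the least remaining time. */
        for (int i = 0; i < n; i++) {
            if (p[i].arrival <= t && p[i].remaining > 0 && p[i].remaining < best) {
                best = p[i].remaining;
                pick = i;
            }
        }
        if (pick < 0) {
            continue;                   /* CPU idle during this millisecond */
        }
        p[pick].remaining--;
        if (p[pick].remaining == 0) {
            p[pick].completion = t + 1;
            done++;
        }
    }
}
```

For the four processes of the example this gives completion times of 24ms (P1), 7ms (P2), 14ms (P3) and 4ms (P4), so the waiting times are 14, 2, 7 and 0ms.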

10.a. What is task communication? Explain how shared memory is used for task
communication.

Ans.- Task Communication

In a multitasking system, multiple tasks/processes run concurrently (in pseudo parallelism)
and each process may or may not interact with the others. Based on the degree of interaction,
the processes/tasks running on an OS are classified as:

• Co-operating Processes: In the co-operating interaction model, one process requires the
inputs from other processes to complete its execution.

• Competing Processes: The competing processes do not share anything among themselves
but they share the system resources. The competing processes compete for the system
resources such as file, display device etc.

The co-operating processes exchange information and communicate through:

• Co-operation through sharing: Exchange data through some shared resources.

• Co-operation through communication: No data is shared between the processes, but they
communicate for execution synchronization.

10.b. Explain the different functions of a real-time kernel.

Ans.- 1. Resource allocation- The kernel's primary function is to manage the computer's
resources and allow other programs to run and use these resources. These resources are-
CPU, Memory and I/O devices.

2. Process Management- A process defines which memory portions the application can
access. The main task of a kernel is to allow the execution of applications and support them
with features such as hardware abstraction.
To run an application, a kernel first sets up an address space for the application, then loads
the file containing the application's code into memory, then sets up a stack for the program
and branches to a given location inside the program, thus finally starting its execution.

3. Memory Management- The kernel has full access to the system's memory. It allows
processes to safely access this memory as they require it. Virtual addressing helps the kernel
to partition memory into two disjoint areas: one is reserved for the kernel
(kernel space) and the other for the applications (user space).

4. I/O Device Management- To perform useful functions, processes need access to the
peripherals connected to the computer, which are controlled by the kernel through device
drivers. A device driver is a computer program that enables the operating system to interact
with a hardware device. It provides the operating system with information on how to control
and communicate with a certain piece of hardware.

A kernel maintains a list of available devices. A device manager first performs a scan on
different hardware buses, such as Peripheral Component Interconnect (PCI) or Universal
Serial Bus (USB), to detect installed devices, then searches for the appropriate drivers. The
kernel provides the I/O to allow drivers to physically access their devices through some port
or memory location.

5. Inter-Process Communication- The kernel provides methods for synchronization and
communication between processes, called Inter-Process Communication (IPC). There are
various approaches to IPC, such as semaphores, shared memory, message queues, pipes (or named
FIFOs), etc.

6. Scheduling- In a multitasking system, the kernel gives every program a slice of time
and switches from process to process so quickly that it appears to the user as if these
processes were being executed simultaneously. The kernel uses scheduling algorithms to
determine which process runs next and how much time it will be given. The algorithm
sets priority among the processes.

7. System Calls and Interrupt Handling- A system call is a mechanism that is used by
the application program to request a service from the operating system. System calls include
close, open, read, wait and write. To access the services provided by the kernel we need to
invoke the related kernel functions. Most kernels provide a C Library or an API, which in
turn invokes the related kernel functions.

There are a few methods by which the respective kernel function can be invoked: using a
software-simulated interrupt, using a gate call, using a special system call
instruction, or using a memory-based queue.

8. Security or Protection Management- The kernel also provides protection from faults (error
control) and from malicious behaviours (security). One approach toward this is a
language-based protection system, in which the kernel only allows code to execute
if it has been produced by a trusted language compiler.

MODEL QUESTION PAPER

ARM Microcontroller and Embedded Systems

MODULE-1:
1a- Briefly explain functions of various units with the architectural block diagram of ARM CORTEX M3?

• The Cortex-M3 processor includes an interrupt controller called the Nested Vectored Interrupt
Controller (NVIC). It is closely coupled to the processor core and provides a number of features.
• The Cortex-M3 has a predefined memory map. This allows the built-in peripherals, such as the
interrupt controller and the debug components, to be accessed by simple memory access
instructions.
• Accesses to the code memory region are carried out on the code memory buses. The private peripheral
bus provides access to a part of the system-level memory dedicated to private
peripherals, such as debugging components.
• The Cortex-M3 has an optional MPU. This unit allows access rules to be set up for privileged access
and user program access.
• The Cortex-M3 processor includes a number of debugging features, such as program execution
controls (including halting and stepping), instruction breakpoints, and data watchpoints.

1b- Explain applications of CORTEX M3?

• Low-cost microcontrollers: The Cortex-M3 processor is ideally suited for low-cost
microcontrollers, which are commonly used in consumer products.
• Automotive: Another ideal application for the Cortex-M3 processor is in the automotive industry. The
Cortex-M3 processor has very high performance efficiency and low interrupt latency, allowing it to be
used in real-time systems.
• Data communications: The processor’s low power and high efficiency, coupled with instructions in
Thumb-2 for bit-field manipulation, make the Cortex-M3 ideal for many communications applications,
such as Bluetooth.
• Industrial control: In industrial control applications, simplicity, fast response, and reliability are key
factors. Again, the Cortex-M3 processor’s interrupt feature, low interrupt latency, and enhanced
fault-handling features make it a strong candidate in this area.

1c- Describe the functions of R1 – R15, and other special registers in Cortex-M3?

• R0–R12 are 32-bit general-purpose registers for data operations. Some 16-bit Thumb instructions can
only access a subset of these registers (low registers, R0–R7).
• R13 (Stack Pointer): The Cortex-M3 contains two stack pointers (R13). They are banked so that only
one is visible at a time. The two stack pointers are as follows:
Main Stack Pointer (MSP): The default stack pointer, used by the operating system (OS) kernel and
exception handlers
Process Stack Pointer (PSP): Used by user application code
• R14 (Link Register): When a subroutine is called, the return address is stored in the link register.
• R15 (Program Counter): The program counter is the current program address. This register can be
written to control the program flow.
• Special Registers: The Cortex-M3 processor also has a number of special registers: the Program Status
registers (PSRs), the Interrupt Mask registers (PRIMASK, FAULTMASK, and BASEPRI), and the control
register (CONTROL).

Special Registers and Their Functions

• xPSR: Provide arithmetic and logic processing flags (zero flag and carry flag), execution status, and
current executing interrupt number
• PRIMASK: Disable all interrupts except the nonmaskable interrupt (NMI) and hard fault
• FAULTMASK: Disable all interrupts except the NMI
• BASEPRI: Disable all interrupts of a specific priority level or lower priority level
• CONTROL: Define privileged status and stack pointer selection.
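
The effect of the three interrupt mask registers can be sketched as a small C model. This is only an illustrative simulation and does not touch the real registers; the function name irq_is_masked and its parameters are assumptions made for this example. It assumes the Cortex-M3 convention that a smaller priority number means a higher priority, and that a BASEPRI value of 0 turns BASEPRI masking off.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model (not a real register access): returns true when a
 * configurable interrupt with the given priority level is blocked by the
 * mask registers. NMI and hard fault have fixed priorities and are not
 * modelled here. */
bool irq_is_masked(bool primask, bool faultmask, uint8_t basepri,
                   uint8_t irq_priority)
{
    if (primask || faultmask) {
        return true;    /* all configurable interrupts are disabled */
    }
    if (basepri != 0 && irq_priority >= basepri) {
        return true;    /* same or lower priority than the BASEPRI level */
    }
    return false;       /* the interrupt can be accepted */
}
```

For example, with BASEPRI set to 0x20, an interrupt at priority 0x40 is blocked while one at priority 0x10 is still accepted.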

2a- Describe the functions of exceptions with a vector table and priorities.

• The Cortex-M3 supports a number of exceptions, including a fixed number of system exceptions and
a number of interrupts, commonly called IRQ.
• The number of interrupt inputs on a Cortex-M3 microcontroller depends on the individual design.
• The typical number of interrupt inputs is 16 or 32. However, you might find some microcontroller
designs with more (or fewer) interrupt inputs.
Vector table:
• When an exception event takes place on the Cortex-M3 and is accepted by the processor core, the
corresponding exception handler is executed.
• To determine the starting address of the exception handler, a vector table mechanism is used.
• The vector table is an array of word data inside the system memory, each representing the starting
address of one exception type.
• The vector table is relocatable, and the relocation is controlled by a relocation register in the NVIC.
• After reset, this relocation control register is reset to 0; therefore, the vector table is located in address
0x0 after reset.
• For example, if the reset is exception type 1, the address of the reset vector is 1 times 4 (each word is
4 bytes), which equals 0x00000004, and NMI vector (type 2) is located in 2 × 4 = 0x00000008.
• The address 0x00000000 is used to store the starting value for the MSP.
• The LSB of each exception vector indicates whether the exception is to be executed in the Thumb
state. Because the Cortex-M3 can support only Thumb instructions, the LSB of all the exception vectors
should be set to 1.
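
The address arithmetic described above can be sketched in C as follows. The function names are hypothetical and chosen only for this illustration; table_base stands for the value of the vector table relocation register, which is 0 after reset.

```c
#include <stdint.h>

/* Address of the vector table entry for a given exception type.
 * table_base models the relocation register (0 after reset).
 * Entry 0 holds the starting value of the MSP. */
uint32_t vector_entry_address(uint32_t table_base, uint32_t exception_type)
{
    return table_base + exception_type * 4u;   /* each entry is one 4-byte word */
}

/* An exception vector must have its LSB set to 1 to indicate Thumb state. */
int vector_is_thumb(uint32_t vector_value)
{
    return (vector_value & 1u) != 0u;
}
```

With the table at address 0, vector_entry_address(0, 1) gives 0x00000004 for the reset vector and vector_entry_address(0, 2) gives 0x00000008 for the NMI vector, matching the values above.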

2b- Explain the operation modes of Cortex M3 with diagrams.

• The Cortex-M3 processor has two modes and two privilege levels.
• The operation modes - thread mode and handler mode- determine whether the processor is running
a normal program or running an exception handler like an interrupt handler or system exception
handler.
• The privilege levels (privileged level and user level) provide a mechanism for safeguarding memory
accesses to critical regions as well as providing a basic security model.
• When the processor is running a main program (thread mode), it can be either in a privileged state or
a user state, but exception handlers can only be in a privileged state.
• When the processor exits reset, it is in thread mode, with privileged access rights. In the privileged
state, a program has access to all memory ranges (except when prohibited by MPU settings) and can
use all supported instructions.

2c- Explain two stack model and reset sequence in ARM cortex M3.

• The Cortex-M3 has two SPs: the MSP and the PSP.
• The SP register to be used is controlled by bit 1 of the control register (CONTROL[1]).
• When CONTROL[1] is 0, the MSP is used for both thread mode and handler mode.
• In this arrangement, the main program and the exception handlers share the same stack memory
region. This is the default setting after power-up.
• When the CONTROL[1] is 1, the PSP is used in thread mode.
• In this arrangement, the main program and the exception handler can have separate stack memory
regions.

CONTROL[1]=0: Both thread level and handler use the main stack
CONTROL[1]=1: Thread level uses the process stack and handler uses the main stack
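
The stack selection rule can be sketched as a small C model. The type and function names below are hypothetical and exist only for this illustration; the model simply returns which stack pointer is in use for a given mode and CONTROL value.

```c
typedef enum { THREAD_MODE, HANDLER_MODE } CpuMode;
typedef enum { STACK_MSP, STACK_PSP } StackSel;

/* Hypothetical model of stack pointer selection on the Cortex-M3. */
StackSel active_stack(CpuMode mode, unsigned control)
{
    /* Exception handlers always run on the main stack. */
    if (mode == HANDLER_MODE) {
        return STACK_MSP;
    }
    /* In thread mode, CONTROL[1] selects between MSP (0) and PSP (1). */
    return (control & (1u << 1)) ? STACK_PSP : STACK_MSP;
}
```

Only bit 1 of the CONTROL value is examined, so a value with just bit 0 set still selects the MSP in thread mode.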

MODULE-2:
3a- Explain the following 16 bit instructions in Cortex M3: ADC, RSB, TST, BL, LDR, MOV, SVC, PUSH

• ADC: Add with carry
Syntax
op{S}{cond} {Rd,} Rn, Operand2
op{cond} {Rd,} Rn, #imm12
where:
S Is an optional suffix. If S is specified, the condition code flags are updated on the result of the
operation.
cond Is an optional condition code.
Rd Specifies the destination register. If Rd is omitted, the destination register is Rn.
Rn Specifies the register holding the first operand.

Operation
The ADD instruction adds the value of Operand2 or imm12 to the value in Rn. The ADC instruction adds
the values in Rn and Operand2, together with the carry flag.

Examples
ADD R2, R1, R3 ; Adds the content of R3 to R1 and stores the result in R2.

• MOV: Move (can be used for register-to-register transfers or loading immediate data)
Syntax
MOV{S}{cond} Rd, Operand2
MOV{cond} Rd, #imm16
Operation
The MOV instruction copies the value of Operand2 into Rd.

When Operand2 in a MOV instruction is a register with a shift other than LSL #0, the preferred
syntax is the corresponding shift instruction.

• RSB: Reverse subtract
Syntax
RSB.W Rd, Rn, #immed ; Rd = #immed − Rn
RSB.W Rd, Rn, Rm ; Rd = Rm − Rn

• TST: Test (use as logical AND; Z flag is updated but AND result is not stored)
Syntax
TST {cond} Rn, Operand2
Operation
The TST instruction performs a bitwise AND operation on the value in Rn and the value of Operand2. This
is the same as the ANDS instruction, except that it discards the result.

Examples
TST R0, #0x3F8 ; Perform bitwise AND of R0 value to 0x3F8

• BL: Branch with link; call a subroutine and store the return address in LR (this is actually a 32-bit
instruction, but it is also available in Thumb in traditional ARM processors)
Example: BL label ; Branch to a labeled address and save the return address in LR

• LDR: Load register with immediate offset, pre-indexed immediate offset, or post-indexed immediate
offset.
Syntax
op{type}{cond} Rt, [Rn{, #offset}] ; immediate offset
op{type}{cond} Rt, [Rn, #offset]! ; pre-indexed
op{type}{cond} Rt, [Rn], #offset ; post-indexed
Operation
LDR instructions load one or two registers with a value from memory.

Example: LDR R8, [R10] ; Loads R8 from the address in R10.

• PUSH: Push multiple registers
Syntax
PUSH{cond} reglist
Operation
PUSH stores registers on the stack, with the lowest numbered register using the lowest memory address
and the highest numbered register using the highest memory address.

Examples
PUSH {R0,R4-R7} ; Push R0,R4,R5,R6,R7 onto the stack

• SVC: Supervisor call

3b- Write an ALP to find the sum of first 10 integer numbers.

Algorithm:
1- Initialize the count to 10 and the sum to 0.
2- Add the count to the sum and decrement the count, until the count becomes 0.
3- Stop.

Program:
        AREA CODE1, CODE, READONLY
        EXPORT START1
START1
        MOV R0, #10 ; initialize the count (number of integers to add)
        MOV R1, #0 ; R1 holds the running sum
LOOP
        ADD R1, R1, R0 ; R1 = R1 + R0
        SUBS R0, R0, #1 ; decrement the count and update the flags
        BNE LOOP ; branch if R0 is not equal to 0
STOP
        B STOP ; wait here; the sum is in R1
        END

EXPECTED OUTPUT:
Result: R1 = 37H (55 in decimal)
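
As a cross-check of the expected result, the same countdown loop can be sketched in C. The function name sum_first_n is an assumption for this illustration; the comments indicate which assembly instruction each statement loosely corresponds to. Unlike the assembly loop, this version tests the count before the first addition, so it also returns 0 for n = 0.

```c
#include <stdint.h>

/* Counts down from n, accumulating the count into sum, like the ALP above. */
uint32_t sum_first_n(uint32_t n)
{
    uint32_t sum = 0;       /* plays the role of R1 */
    while (n != 0) {        /* loop until the count reaches 0 (BNE LOOP) */
        sum += n;           /* ADD: sum = sum + count */
        n -= 1;             /* SUBS: decrement the count */
    }
    return sum;
}
```

Calling sum_first_n(10) returns 55, which is 37H.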
3c- Write the memory map of Cortex M3 and explain briefly bit-band operations.

The Cortex-M3 has a predefined memory map. This allows the built-in peripherals, such as the interrupt
controller and the debug components, to be accessed by simple memory access instructions.

Thus, most system features are accessible in C program code. The predefined memory map also allows the
Cortex-M3 processor to be highly optimized for speed and ease of integration in system-on-a-chip (SoC)
designs.

Overall, the 4 GB memory space can be divided into ranges as shown in the memory map figure.

Bit-Band Operations:
Bit-band operation support allows a single load/store operation to access (read/write) a single data bit.

In the Cortex-M3, this is supported in two predefined memory regions called bit-band regions.

One of them is located in the first 1 MB of the SRAM region, and the other is located in the first 1 MB of
the peripheral region.

For example, to set bit 2 in word data in address 0x20000000, instead of using three instructions to read
the data, set the bit, and then write back the result, this task can be carried out by a single instruction
using bit-band operations.

The Cortex-M3 uses the following terms for the bit-band memory addresses:

• Bit-band region: This is a memory address region that supports bit-band operation.

• Bit-band alias: Access to the bit-band alias will cause an access (a bit-band operation) to the bit-band
region.
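
The mapping from a bit in the bit-band region to its word in the bit-band alias can be sketched in C. The base addresses below (SRAM region 0x20000000 with alias 0x22000000, peripheral region 0x40000000 with alias 0x42000000) come from the standard Cortex-M3 memory map rather than from these notes, and the function and macro names are hypothetical. Each bit in the region occupies one 32-bit word in the alias, so the alias offset is the byte offset times 32 plus the bit number times 4.

```c
#include <stdint.h>

#define SRAM_BITBAND_BASE    0x20000000u
#define SRAM_ALIAS_BASE      0x22000000u
#define PERIPH_BITBAND_BASE  0x40000000u
#define PERIPH_ALIAS_BASE    0x42000000u
#define BITBAND_REGION_SIZE  0x00100000u   /* each bit-band region is 1 MB */

/* Returns the alias address for bit `bit` (0 to 31) of the data at `addr`,
 * or 0 if `addr` is outside both bit-band regions or `bit` is out of range
 * (returning 0 on error is a choice made for this sketch). */
uint32_t bitband_alias(uint32_t addr, uint32_t bit)
{
    if (bit > 31u) {
        return 0u;
    }
    if (addr >= SRAM_BITBAND_BASE &&
        addr < SRAM_BITBAND_BASE + BITBAND_REGION_SIZE) {
        return SRAM_ALIAS_BASE + (addr - SRAM_BITBAND_BASE) * 32u + bit * 4u;
    }
    if (addr >= PERIPH_BITBAND_BASE &&
        addr < PERIPH_BITBAND_BASE + BITBAND_REGION_SIZE) {
        return PERIPH_ALIAS_BASE + (addr - PERIPH_BITBAND_BASE) * 32u + bit * 4u;
    }
    return 0u;
}
```

For the example above, bitband_alias(0x20000000, 2) gives 0x22000008, so writing 1 to that alias word sets bit 2 of the word at 0x20000000 in a single store.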

4a- Explain the following 32 bit instructions in Cortex M3: AND, CMN, MLA, SDIV, STR, MRS, POP

• AND: Logical AND
Syntax
op{S}{cond} {Rd,} Rn, Operand2
Operation
The AND instruction performs a bitwise AND operation on the values in Rn and Operand2.

Examples
Assume R2 = 0xFBCD4567, R5 = 0xFF453243

AND R9, R2, #0xFF00 ; ANDs 0xFF00 with R2 and stores the result in R9.
; After execution, R9 = 0x00004500

• CMN: Compare negative (compare one data with two’s complement of another data and update flags)
Syntax
CMN{cond} Rn, Operand2
Operation
The CMN instruction adds the value of Operand2 to the value in Rn. This is the same as an ADDS instruction,
except that the result is discarded.

• STR: Store word from register to memory
Syntax
op{type}{cond} Rt, [Rn{, #offset}] ; immediate offset
op{type}{cond} Rt, [Rn, #offset]! ; pre-indexed
op{type}{cond} Rt, [Rn], #offset ; post-indexed
Operation
STR instructions store one or two register values to memory.

Examples
STR R2, [R9,#10] ; Store the content of R2 to effective address R9+10

• POP: Pop multiple registers
Syntax
POP{cond} reglist
where:
cond Is an optional condition code.
reglist Is a non-empty list of registers, enclosed in braces. It can contain register ranges.
Operation
POP loads registers from the stack, with the lowest numbered register using the lowest memory
address and the highest numbered register using the highest memory address.

Examples
POP {R0,R6,PC} ; Pop R0, R6 and PC from the stack, then branch to the new PC.

• MLA: Multiply accumulate
Syntax
MLA{cond} Rd, Rn, Rm, Ra ; Multiply with accumulate
Operation
The MLA instruction multiplies the values from Rn and Rm, adds the value from Ra, and places the least
significant 32 bits of the result in Rd.

• SDIV: Signed divide
Syntax
SDIV{cond} {Rd,} Rn, Rm
Operation
SDIV performs a signed integer division of the value in Rn by the value in Rm.

Examples
SDIV R0, R2, R4 ; Signed divide, R0 = R2/R4.

4b- Write a C language program to toggle an LED with a small delay in Cortex M3.

Program:
#define LED *((volatile unsigned int *)(0xDFFF000C))

int main(void)
{
    int i; /* loop counter for the delay */
    volatile int j; /* dummy volatile variable to prevent the C compiler from optimizing the delay away */

    while (1)
    {
        LED = 0x00; /* toggle LED: write 0 */
        for (i = 0; i < 10; i++) { j = 0; } /* delay */
        LED = 0x01; /* toggle LED: write 1 */
        for (i = 0; i < 10; i++) { j = 0; } /* delay */
    }
    return 0; /* never reached */
}

4c- With a diagram, explain the operation of CMSIS.

The CMSIS was developed by ARM to allow users of the Cortex-M3 microcontrollers to gain the most
benefit from all these software solutions and to allow them to develop their embedded applications
quickly and reliably. The aims of CMSIS are to:

• improve software portability and reusability
• enable software solution suppliers to develop products that can work seamlessly with device
libraries from various silicon vendors
• allow embedded developers to develop software quicker with an easy-to-use and
standardized software interface
• allow embedded software to be used on multiple compiler products
• avoid device driver compatibility issues when using software solutions from multiple sources

MODULE-3:
5a- Explain the 6 purposes of Embedded systems with an example for each.

Each embedded system is designed to serve the purpose of any one or a combination of the following
tasks.
• Data Collection/Storage/Representation: example: Digital Camera for image capturing/storage/display

• Data Communication: example: Network hubs, Routers, switches, Modems etc are typical
examples for dedicated data transmission embedded systems
• Data (Signal) Processing: example: Digital hearing Aid employing Signal Processing Technique
• Monitoring: example: Measuring instruments like Digital CRO, Digital Multi meter, Logic
Analyzer etc
• Control: example: Air Conditioner for controlling room temperature
• Application Specific User Interface:
Example: Mobile handsets, Control units in industrial applications etc

5b- Differentiate between (1)General Computing Systems and Embedded Systems and (2) RISC and CISC
architecture.

1- General Purpose System


• A system which is a combination of generic hardware and General Purpose
Operating System for executing a variety of applications

• Contain a General Purpose Operating System (GPOS)


• Applications are alterable (programmable) by user (It is possible for the end user to re-install the
Operating System, and add or remove user applications)
• Performance is the key deciding factor on the selection of the system. Always ‘Faster is Better’
• Less/not at all tailored towards reduced operating power requirements, options for different levels of
power management.
2- Embedded System
• A system which is a combination of special purpose hardware and embedded
OS for executing a specific set of applications

• May or may not contain an operating system for functioning


• The firmware of the embedded system is pre-programmed and it is non-alterable by the end user (there
may be exceptions for systems supporting OS kernel image flashing through special hardware settings)

• Application specific requirements (like performance, power requirements, memory usage etc) are the
key deciding factors
• Highly tailored to take advantage of the power saving modes supported by hardware and Operating
System

3- RISC architecture:
• Lesser no. of instructions
• Instruction pipelining, which increases execution speed
• Orthogonal Instruction Set (Allows each instruction to operate on any register and use any addressing
mode)
• Operations are performed on registers only; the only memory operations are load and store
• A large number of registers are available
• With Harvard Architecture

4- CISC architecture

• Greater no. of Instructions


• Generally no instruction pipelining feature
• Non Orthogonal Instruction Set (All instructions are not allowed to operate on any register and use
any addressing mode. It is instruction specific)
• Operations are performed on registers or memory depending on the instruction
• Limited no. of general purpose registers
• Can be Harvard or Von-Neumann Architecture
5c- Explain the three classifications of Embedded systems based on complexity and performance.

Classification of Embedded Systems:


• Based on Generation
• Based on Complexity & Performance Requirements
• Based on deterministic behavior
• Based on Triggering
Classification based on Complexity & Performance

• Small Scale: The early embedded systems built around 8bit microprocessors like 8085 and Z80 and
4bit microcontrollers

• Medium Scale: Embedded Systems built around 16bit microprocessors and 8 or 16bit
microcontrollers, following the first generation embedded systems

• Large Scale/Complex: Embedded Systems built around high performance 16/32 bit
Microprocessors/controllers, Application Specific Instruction set processors like Digital Signal
Processors (DSPs), and Application Specific Integrated Circuits (ASICs)

5d- Mention the applications of Embedded systems with an example for each.

• Consumer Electronics: Camcorders, Cameras etc.


• Household Appliances: Television, DVD players, Washing machine, Fridge, Microwave Oven etc.

• Home Automation and Security Systems: Air conditioners, sprinklers, Intruder detection alarms,
Closed Circuit Television Cameras, Fire alarms etc.

• Automotive Industry: Anti-lock braking systems (ABS), Engine Control, Ignition Systems, Automatic
Navigation Systems etc.

• Telecom: Cellular Telephones, Telephone switches, Handset Multimedia Applications etc.

• Computer Peripherals: Printers, Scanners, Fax machines etc.

• Computer Networking Systems: Network Routers, Switches, Hubs, Firewalls etc.

• Health Care: Different Kinds of Scanners, EEG, ECG Machines etc.

6a- Explain the functions of Optocoupler and SPI bus with diagrams.

• Optocoupler is a solid state device to isolate two parts of a circuit. Optocoupler combines an LED and
a photo-transistor in a single housing (package)

• In electronic circuits, optocoupler is used for suppressing interference in data communication, circuit
isolation, High voltage separation, simultaneous separation and intensification signal etc

• Optocouplers can be used in either input circuits or in output circuits

Figure: Optocoupler in input and output circuits (LED, photo-transistor, I/O interface)

2- SPI:

The Serial Peripheral Interface Bus (SPI) is a synchronous bi-directional full duplex four wire serial
interface bus. The concept of SPI was introduced by Motorola. SPI is a single master multi-slave system.
It is possible to have a system where more than one SPI device can be master, provided the condition
that only one master device is active at any given point of time is satisfied. SPI requires four signal
lines for communication. They are:

Master Out Slave In (MOSI): Signal line carrying the data from master to slave device. It is also known as
Slave Input/Slave Data In (SI/SDI)

Master In Slave Out (MISO): Signal line carrying the data from slave to master device. It is also known as
Slave Output

Serial Clock (SCLK): Signal line carrying the clock signals

Slave Select (SS): Signal line for slave device select. It is an active low signal

Figure: SPI bus interfacing. An SPI master (microprocessor/controller) is connected over the MOSI, MISO
and SCL lines to Slave 1 (SPI device, e.g. serial EEPROM) and Slave 2 (SPI device, e.g. LCD), with
separate slave select lines SS1\ and SS2\
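
The full duplex exchange on MOSI and MISO can be sketched as two shift registers that swap one bit per clock cycle. This is a software model for illustration only: the struct and function names are hypothetical, the transfer is assumed to be 8 bits wide and MSB first, and the slave select line is assumed to be already asserted.

```c
#include <stdint.h>

/* Hypothetical model of one master and one selected slave. */
typedef struct {
    uint8_t master_shift;   /* master's shift register */
    uint8_t slave_shift;    /* slave's shift register */
} SpiLink;

/* Simulates eight SCLK cycles (assumption: 8-bit, MSB-first transfer). */
void spi_transfer_byte(SpiLink *link)
{
    for (int clk = 0; clk < 8; clk++) {
        uint8_t mosi = (uint8_t)((link->master_shift >> 7) & 1u); /* master drives MOSI */
        uint8_t miso = (uint8_t)((link->slave_shift >> 7) & 1u);  /* slave drives MISO */
        /* Both devices shift left and take in the other side's bit. */
        link->master_shift = (uint8_t)((link->master_shift << 1) | miso);
        link->slave_shift  = (uint8_t)((link->slave_shift << 1) | mosi);
    }
}
```

After eight clock cycles the two registers have exchanged contents: the byte the master sent is in the slave, and the byte the slave sent is in the master.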
6b- Write a note on Embedded firmware.

• Embedded firmware refers to the control algorithm (program instructions) and/or the configuration
settings that an embedded system developer dumps into the code (program) memory of the
embedded system.
• The embedded firmware can be developed in various methods like:
• Write the program in high level languages like Embedded C/C++ using an Integrated Development
Environment (the IDE will contain an editor, compiler, linker, debugger, simulator etc. IDEs are
different for different families of processors/controllers).
• Write the program in Assembly Language using the instructions supported by your
application’s target processor/controller.
6c- Explain SRAM design and features with a diagram.

• Static RAM stores data in the form of voltage. They are made up of flip-flops.
• In a typical implementation, an SRAM cell (bit) is realized using 6 transistors (or 6
MOSFETs). Four of the transistors are used for building the latch (flip-flop) part of the memory
cell and 2 for controlling the access.
• Static RAM is the fastest form of RAM available. SRAM is fast in operation due
• Throughput: Throughput deals with the efficiency of a system. In general it can be defined
as the rate of production or operation of a defined process over a stated period of time.
• Reliability: Reliability is a measure of how much (in %) you can rely upon the proper functioning
of the system, or what the % susceptibility of the system to failures is.
• Maintainability: Maintainability deals with support and maintenance to the end user or
client in case of technical issues and product failures, or on the basis of a routine system
check-up.
• Security: A good example of the security aspect in an embedded product is a Personal Digital
Assistant (PDA). The PDA can be either a shared resource or an individual one.
• Safety: Safety and security are two easily confused terms. Safety deals with the possible damage
that can happen to the operators, the public and the environment due to the breakdown of an
embedded system.

7b- Define the 6 characteristics of an embedded system.


• Application and Domain Specific: Each embedded system has certain functions to
perform and is developed in such a manner as to do the intended functions only. For
example, a microwave oven and an air conditioner are each designed to perform a
specific task.
• Reactive and real-time: Embedded systems are reactive because they produce changes in
output in response to changes in the input. Real-time system operation means that the
timing behavior of the system should be deterministic.
• Operate in harsh environment: The environment in which the embedded system is deployed
may be a dusty one, a high temperature zone, or an area subject to vibrations and shock.
• Distributed: An embedded system may be part of a larger system. Many such distributed
embedded systems form a single large embedded control unit.
• Small size and weight: A compact device is more convenient to handle than a bulky product.
• Power concerns: An embedded system should be designed in such a way as to minimize the heat
dissipation by the system.

7c- With a block diagram, mention the components used in the design of a washing machine and also
explain its working.

• Extensively used in home automation for washing and drying clothes.
• Contains user interface units (I/O) like keypads, display unit and LEDs for accepting inputs and
providing visual indications.
• Contains sensors like water level sensor, temperature sensor etc. Contains actuators like
spin and agitation control motor units.
• Contains an integrated embedded controller for controlling the washing operations.
• Sensors, actuators and I/O devices are interfaced to the I/O subsystem of the embedded
control units.
• Specifically designed for serving the application ‘Wash & Rinse’ of clothes and cannot be used
for any other applications.

8a- Explain the assembly language based embedded firmware development with a diagram and
mention its advantages and disadvantages.


• The assembly language program is saved as a .asm (assembly) file, a .src (source) file, or a
format supported by the assembler.
• Similar to ‘C’ and other high level language programming, it is possible to have multiple
source files called modules in assembly language programming. Each module is represented
by a ‘.asm’ or ‘.src’ file or the assembler supported file format, similar to the ‘.c’ files in C
programming.
• The software utility called the ‘Assembler’ performs the translation of assembly
code to machine code.
• The assemblers for different family of target machines are different. A51 Macro Assembler
from Keil software is a popular assembler for the 8051 family micro controller
• Each source file can be assembled separately to examine the syntax errors and incorrect
assembly instructions.
• Assembling of each source file generates a corresponding object file. The object file does not
contain the absolute address of where the generated code needs to be placed (a re-locatable
code) on the program memory.
• The software program called linker/locater is responsible for assigning absolute address to
object files during the linking process.
• The Absolute object file created from the object files corresponding to different source code
modules contain information about the address where each instruction needs to be placed
in code memory.
• A software utility called ‘Object to Hex file converter’ translates the absolute object file to
corresponding hex file (binary file).

Advantages:

• Efficient Code Memory & Data Memory Usage (Memory Optimization)


• High Performance
• Low level Hardware Access, Code Reverse Engineering

Disadvantages


• High Development time


• Developer dependency
• Non portable

8b- With an FSM model, explain the design and operation of an automatic tea/coffee vending machine.
- It consists of 4 states: wait for coin, wait for user input, dispense tea, dispense coffee.
- The event ‘insert coin’ transitions the state to wait for user input.
- The system stays in this state until the user gives an input.
- If the event triggered in the wait for user input state is cancel, the coin is pushed out and the
state transitions to wait for coin.
- If the button pressed is either tea or coffee, the state changes to dispense tea or dispense
coffee respectively.
- Once the dispensing is over, the state transitions back to the wait for coin state.
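
The state machine described above can be sketched as a transition function in C. The enum and function names are hypothetical and chosen only for this illustration; events that are not mentioned for a state are assumed to leave the state unchanged.

```c
typedef enum {
    WAIT_FOR_COIN,
    WAIT_FOR_USER_INPUT,
    DISPENSE_TEA,
    DISPENSE_COFFEE
} State;

typedef enum {
    EV_INSERT_COIN,
    EV_CANCEL,
    EV_TEA_BUTTON,
    EV_COFFEE_BUTTON,
    EV_DISPENSE_DONE
} Event;

/* Returns the next state of the vending machine for a given event. */
State next_state(State s, Event e)
{
    switch (s) {
    case WAIT_FOR_COIN:
        if (e == EV_INSERT_COIN)   return WAIT_FOR_USER_INPUT;
        break;
    case WAIT_FOR_USER_INPUT:
        if (e == EV_CANCEL)        return WAIT_FOR_COIN;   /* coin is pushed out */
        if (e == EV_TEA_BUTTON)    return DISPENSE_TEA;
        if (e == EV_COFFEE_BUTTON) return DISPENSE_COFFEE;
        break;
    case DISPENSE_TEA:
    case DISPENSE_COFFEE:
        if (e == EV_DISPENSE_DONE) return WAIT_FOR_COIN;
        break;
    }
    return s;   /* assumption: unhandled events do not change the state */
}
```

Keeping the transitions in one pure function makes the FSM easy to check against the state diagram: each case corresponds to one state and each if to one outgoing arrow.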

9a- Briefly explain the functions of the operating system, with a diagram.

The primary functions of an operating system is Make the system convenient to use Organize and
manage the system resources efficiently and correctly

• Process management: It deals with managing the processes. This includes setting up the memory
space for the process, loading the process code into the memory space, allocating system
resources, and scheduling and managing the execution of the process.
• Primary memory management: The term primary memory refers to the volatile memory
(RAM) where processes are loaded and the variables and shared data associated with each
process are stored. The memory management unit of the kernel is responsible for keeping
track of which part of the memory area is currently used by which process, and for allocating
and deallocating memory space on a need basis.
• File system management: A file is a collection of related information. A file could be a
program, text file, image file, document, audio or video file, etc.
• I/O system management: The kernel is responsible for routing the I/O requests coming from
different user applications to the appropriate I/O devices of the system.


• Time management: Accurate time management is essential for providing precise time
reference for all applications.

9b- Describe preemptive SJF scheduling. Determine average turn around time and average waiting
time, if processes P1 P2 and P3 with estimated completion time of 10, 5, 7 milliseconds enter ready
queue together and later P4 with a completion time of 2 msec enters ready queue after 2 msec.

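• Preemptive SJF scheduling is also known as Shortest Remaining Time (SRT) scheduling: whenever
a new process enters the ‘Ready’ queue, the scheduler compares its execution time with the
remaining time of the running process and preempts the running process if the new one is
shorter.
• At the beginning, only three processes (P1, P2 and P3) are available in the ‘Ready’ queue,
and the SRT scheduler picks the process with the shortest remaining time for execution
completion (in this example P2, with remaining time 5 ms). Process P4, with estimated
execution completion time 2 ms, enters the ‘Ready’ queue 2 ms after the start of execution of
P2. The processes are re-scheduled for execution in the following order: P2 (0 to 2 ms),
P4 (2 to 4 ms), P2 (4 to 7 ms), P3 (7 to 14 ms), P1 (14 to 24 ms).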

The waiting times for the processes are:

- Waiting Time for P2 = 0 ms + (4 - 2) ms = 2 ms (P2 starts executing first, is preempted by P4,
and has to wait till the completion of P4 to get the next CPU slot)
- Waiting Time for P4 = 0 ms (P4 starts executing by preempting P2, since the execution time of
P4 (2 ms) is less than the remaining execution time of P2 (here 3 ms))
- Waiting Time for P3 = 7 ms (P3 starts executing after P4 and P2 complete)


- Waiting Time for P1 = 14 ms (P1 starts executing after P4, P2 and P3 complete)

Average waiting time = (Waiting time for all the processes) / No. of Processes

= (Waiting time for (P4+P2+P3+P1)) / 4

= (0 + 2 + 7 + 14) / 4 = 23/4

= 5.75 milliseconds

- Turn Around Time (TAT) for P2 = 7 ms (Time spent in Ready Queue + Execution Time)
- Turn Around Time (TAT) for P4 = 2 ms
- Turn Around Time (TAT) for P3 = 14 ms (Time spent in Ready Queue + Execution Time)
- Turn Around Time (TAT) for P1 = 24 ms (Time spent in Ready Queue + Execution Time)

Average Turn Around Time = (Turn Around Time for all the processes) / No. of Processes

= (Turn Around Time for (P2+P4+P3+P1)) / 4

= (7 + 2 + 14 + 24) / 4 = 47/4

= 11.75 milliseconds
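
The figures above can be cross-checked with a small simulation. This is a minimal sketch in Python that assumes a 1 ms scheduling tick and breaks ties in favour of the earlier arrival; the function name srt_schedule and the dictionary layout are hypothetical, chosen only for illustration.

```python
def srt_schedule(processes):
    """Simulate preemptive SJF (SRT) scheduling in 1 ms ticks.

    processes maps a name to (arrival_ms, burst_ms).
    Returns {name: (waiting_ms, turnaround_ms)}.
    """
    remaining = {name: burst for name, (_, burst) in processes.items()}
    finish = {}
    clock = 0
    while remaining:
        # Processes that have arrived and still need CPU time
        ready = [n for n in remaining if processes[n][0] <= clock]
        if not ready:
            clock += 1  # CPU idles until the next arrival
            continue
        # Shortest remaining time first; ties go to the earlier arrival
        current = min(ready, key=lambda n: (remaining[n], processes[n][0]))
        remaining[current] -= 1
        clock += 1
        if remaining[current] == 0:
            del remaining[current]
            finish[current] = clock
    result = {}
    for name, (arrival, burst) in processes.items():
        turnaround = finish[name] - arrival  # time in ready queue + execution
        result[name] = (turnaround - burst, turnaround)
    return result


# Data from the question: P1, P2, P3 arrive together; P4 arrives after 2 ms
procs = {"P1": (0, 10), "P2": (0, 5), "P3": (0, 7), "P4": (2, 2)}
stats = srt_schedule(procs)
avg_wait = sum(w for w, _ in stats.values()) / len(stats)
avg_tat = sum(t for _, t in stats.values()) / len(stats)
print(stats, avg_wait, avg_tat)
```

With the data from the question, the sketch reproduces the per-process values worked out above and the averages of 5.75 ms and 11.75 ms.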

9C- With a state transition diagram, structure and memory organization of a Process, describe the
process state transitions

• The creation of a process to its termination is not a single-step operation. The process
traverses through a series of states during its transition from the newly created state to the
terminated state.
• Created State: The state at which a process is being created is referred to as the ‘Created
State’. The Operating System recognizes a process in the Created State, but no resources are
allocated to the process.
• Ready State: The state where a process is loaded into the memory and is awaiting the
processor time for execution is known as the ‘Ready State’. At this stage, the process is placed
in the ‘Ready list’ queue maintained by the OS.
• Running State: The state wherein the source code instructions corresponding to the process
are being executed is called the ‘Running State’. The Running State is the state at which the
process execution happens.
• Blocked State/Wait State: Refers to a state where a running process is temporarily suspended
from execution and does not have immediate access to resources. The blocked state might
be invoked by various conditions, such as the process entering a wait state for an event to
occur (e.g. waiting for user inputs such as keyboard input) or waiting to get access to a shared
resource like a semaphore, mutex, etc.
• Completed State: A state where the process completes its execution.
• The transition of a process from one state to another is known as ‘State transition’.


• When a process changes its state from Ready to running or from running to blocked or
terminated or from blocked to running, the CPU allocation for the process may also change.

10a- Explain out of circuit in-system programming methods for integration of hardware and
firmware.

• Integration of hardware and firmware deals with the embedding of firmware into the target
hardware board.
• It is the process of ‘embedding intelligence’ into the product.
• The embedded processors/controllers may or may not contain built-in code memory.
• If the processor/controller contains built-in code memory and the total size of the firmware
fits into the code memory area, the firmware is downloaded into the code memory of the target
controller.
• If not, an external dedicated EPROM/FLASH memory chip is used for holding the firmware.
• This chip is interfaced to the controller.
• Commonly used techniques for embedding the firmware into the target board are
out-of-circuit programming, in-system programming (ISP), etc.
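
The storage decision described above can be sketched as a simple check. This is only an illustration in Python; the function name choose_firmware_storage and its parameters are hypothetical, sizes are assumed to be in bytes, and a built-in code memory size of 0 is assumed to mean the controller has no on-chip code memory.

```python
def choose_firmware_storage(firmware_size, builtin_code_memory_size):
    """Return where the firmware image would be stored.

    firmware_size: size of the firmware image in bytes.
    builtin_code_memory_size: on-chip code memory in bytes
        (assumption: 0 means the controller has none).
    """
    has_builtin = builtin_code_memory_size > 0
    if has_builtin and firmware_size <= builtin_code_memory_size:
        return "built-in code memory"
    # No on-chip code memory, or the image does not fit into it
    return "external EPROM/FLASH"
```

For instance, a 48 KB image on a controller with 64 KB of on-chip code memory stays on-chip, while a 128 KB image on the same controller would go to an external memory chip.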
10b- Explain simulator based debugging and ICE based target debugging techniques.


Features of simulator based debugging are:

- Purely software based
- Doesn’t require a real target system
- Very primitive
- Lack of real-time behavior

Advantages of simulator based debugging:

- No need for the original target board
- Simulates I/O peripherals
- Simulates abnormal conditions

Limitations of this debugging technique:

- Deviation from real behavior
- Lack of real timeliness

ICE (In-Circuit Emulator)

- An emulator is a self-contained hardware device which emulates the target device.
- The emulator hardware contains the necessary emulation logic. It is hooked to the debugging
application running on the development PC on one end and connects to the target board
through some interface on the other end.
- In summary, the emulator emulates the target board.
- Nowadays, pure software applications which perform the function of a hardware emulator
are also called emulators.
- A hardware emulator is controlled by a debugger application running on the development PC.
- The debugger may be part of the IDE or a third-party supplied tool.
- The emulators for different families of processors/controllers are different.
