INTRODUCTION
This chapter deals with the fifth-generation TI TMS320C5X DSP processor (denoted as C5X).
The TMS320 DSP family:
• Includes two types of single-chip DSPs: 16-bit fixed-point and 32-bit floating-point.
• Has the operational flexibility of high-speed controllers and the numerical capability of
array processors.
• Offers an inexpensive alternative to custom-fabricated VLSI and multichip bit-slice processors.
The block diagram of the internal architecture of the C5X is shown in Fig. 2.1. The C5X DSPs
use an advanced Harvard architecture, which has separate memory bus structures for
program and data memory.
Fig. 2.1 Internal architecture of C5X
BUS STRUCTURE
Separate program and data buses allow simultaneous access to program instructions
and data, providing a high degree of parallelism. For example, while data is
multiplied, a previous product can be loaded into, added to or subtracted from the
accumulator and, at the same time, a new address can be generated. Such parallelism
supports a powerful set of arithmetic, logic and bit-manipulation operations that can
all be performed in a single machine cycle. The C5X architecture has four buses and
their functions are as follows:
Program bus (PB): It carries the instruction code and immediate operands from
program memory space to the CPU.
Program address bus (PAB): It provides addresses to program memory space for
both reads and writes.
Data read bus (DB): It interconnects various elements of the CPU to data memory
space.
Data read address bus (DAB): It provides the address to access the data memory
space.
The program and data buses can work together to transfer data from on-chip data
memory and internal or external program memory to the multiplier for single-cycle
multiply/accumulate operations.
This lets instructions execute faster than on conventional
microprocessors. For example, consider the following sequence of 8085
instructions:
MOV A, M
INX H
The 16-bit INDX is used by the ARAU as a step value (addition or subtraction by
more than 1) to modify the address in the ARs during indirect addressing. For
example, when the ARAU steps across a row of a matrix, the indirect address is
incremented by 1. However, when the ARAU steps down a column, the address is
incremented by the dimension of the matrix. The ARAU can add or subtract the
value stored in the INDX from the current AR as part of the indirect address
operation. INDX can also map the dimension of the address block used for bit-
reversal addressing.
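The row and column stepping described above can be sketched in Python. This is only an illustrative model, not C5X code: the base address 1000h, the 4 x 4 matrix size and the helper name step are assumptions made for the example.

```python
# Illustrative model of ARAU index stepping (not C5X assembly).
# Assumptions: a 4 x 4 matrix stored row by row, starting at 1000h.
BASE = 0x1000   # hypothetical start address of the matrix
DIM = 4         # matrix dimension, i.e. the value that INDX would hold


def step(ar, amount):
    """Return the 16-bit address reached by adding `amount` to an AR value."""
    return (ar + amount) & 0xFFFF


# Stepping across row 0: the address is incremented by 1.
row = [BASE]
for _ in range(DIM - 1):
    row.append(step(row[-1], 1))

# Stepping down column 0: the address is incremented by INDX (= DIM).
col = [BASE]
for _ in range(DIM - 1):
    col.append(step(col[-1], DIM))

print([hex(a) for a in row])  # ['0x1000', '0x1001', '0x1002', '0x1003']
print([hex(a) for a in col])  # ['0x1000', '0x1004', '0x1008', '0x100c']
```

The only difference between the two walks is the step value, which is why a single index register is enough to traverse a matrix in either direction.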
The 16-bit ARCR is used for address boundary comparison. The CMPR instruction
compares the ARCR to the selected AR and places the result of the compare in the
TC bit of ST1.
The 16-bit BMAR holds an address value to be used with block moves and
multiply/accumulate operations. This register provides the 16-bit address for an
indirect-addressed second operand.
All these registers are 16-bit wide. Repeat counter register (RPTC) holds the repeat
count in a repeat single-instruction operation and is loaded by the RPT and RPTZ
instructions. Block repeat counter register (BRCR) holds the count value for the
block repeat feature. This value is loaded before a block repeat operation is initiated.
Block repeat program address start register (PASR) indicates the 16-bit address
where the repeated block of code starts. The block repeat program address end
register (PAER) indicates the 16-bit address where the repeated block of code ends.
The PASR and PAER are loaded by the RPTB instruction.
MEMORY-MAPPED REGISTERS
The C5X has 96 registers mapped into page 0 of the data memory space. All C5X
DSPs have 28 CPU registers and 16 input/output (I/O) port registers but have
different numbers of peripheral and reserved registers. Since the memory-mapped
registers are a component of the data memory space, they can be written to and read
from in the same way as any other data memory location. The memory-mapped
registers are used for indirect data address pointers, temporary storage, CPU status
and control, or integer arithmetic processing through the ARAU.
PROGRAM CONTROLLER
The program controller contains logic circuitry that decodes the instructions,
manages the CPU pipeline, stores the status of CPU operations and decodes the
conditional operations. Parallelism of architecture lets the C5X perform three
concurrent memory operations in any given machine cycle: fetch an instruction, read
an operand and write an operand. The program controller consists of the following
elements:
16-bit program counter (PC)
16-bit status registers ST0 and ST1
Processor mode status register (PMST)
Circular buffer control register (CBCR)
(8 x 16)-bit hardware stack
Address generation logic
Instruction register
Interrupt flag register and interrupt mask register
STATUS REGISTERS (Status registers ST0 and ST1 and their description)
A status register is a 16-bit register that contains status and control bits such as the
carry (C), overflow (OV) and auxiliary register pointer (ARP) bits.
The status registers can be stored into data memory and loaded from data memory,
thereby allowing the C5X status to be saved and restored for subroutines.
The bit assignment details for ST0 are given in Fig. 2.2.
ARP (Auxiliary Register Pointer): These bits select the AR to be used in indirect
addressing. When the ARP is loaded, the previous ARP value is copied to the
auxiliary register buffer (ARB) in ST1.
OVM (Overflow Mode) bit: This bit enables/disables the accumulator overflow
saturation mode in the ALU.
INTM (Interrupt Mode) bit: This bit globally masks or enables all interrupts. The
INTM bit has no effect on the non-maskable RS and NMI interrupts.
DP (Data Memory Page Pointer) bits: These bits specify the address of the current
data memory page. The DP bits are concatenated with the 7 LSBs of an instruction
word to form a direct memory address of 16 bits.
The bit assignment details for ST1 are given in Fig. 2.3.
ARB (Auxiliary Register Buffer): This 3-bit field holds the previous value contained
in the ARP in ST0. Whenever the ARP is loaded, the previous ARP value is copied to
the ARB, except when using the LST #0 instruction. When the ARB is loaded using
the LST #1 instruction, the same value is also copied to the ARP. This is useful when
restoring context (when not using the automatic context save) in a subroutine that
modifies the current ARP.
CNF (On-chip RAM configuration control bit): This 1-bit field enables the on-chip
dual-access RAM block 0 (DARAM B0) to be addressable in data memory space or
program memory space. The CNF bit can be modified by the LST #1 instruction. If
CNF is 0, the on-chip DARAM block 0 is mapped into data memory space. The
CNF bit can be cleared by a reset or the CLRC CNF instruction. When CNF is 1, the
on-chip DARAM block 0 is mapped into program memory space. The CNF bit can
be set by the SETC CNF instruction.
TC (Test/control flag bit): This 1-bit flag stores the results of the ALU or parallel
logic unit (PLU) test bit operations. The status of the TC bit determines if the
conditional branch, call and return instructions are to be executed.
SXM (Sign-extension mode bit): This 1-bit field enables/disables sign extension of
an arithmetic operation. The SXM bit does not affect the operations of certain
arithmetic or logical instructions; the ADDC, ADDS, SUBB or SUBS instruction
suppresses sign extension, regardless of SXM.
Sign extension
Sign extension increases the number of bits of a binary number while preserving the
number's sign and value. This is done by appending digits to the most significant side
of the number, following a procedure that depends on the particular signed number
representation used.
Example 1. If six bits are used to represent the number "00 1010" (decimal positive
10) and the sign-extend operation (i.e., SXM = 1) increases the word length to 16 bits,
then the new representation is simply "0000 0000 0000 1010".
Example 2. If ten bits are used to represent the value "11 1111 0001" (decimal
negative 15) using 2's complement, and this is sign extended (i.e., SXM = 1) to 16 bits,
the new representation is "1111 1111 1111 0001". Thus, by appending ones on the left
side of the number, the negative sign and the value of the original number are
maintained.
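Examples 1 and 2 can be checked with a short Python sketch. The function name sign_extend and its parameters are assumptions made for illustration; the code models generic two's-complement sign extension, not the C5X data path itself.

```python
def sign_extend(value, from_bits, to_bits=16):
    """Sign-extend a two's-complement value of width from_bits to to_bits."""
    value &= (1 << from_bits) - 1          # keep only the original field
    sign_bit = 1 << (from_bits - 1)        # most significant bit of the field
    if value & sign_bit:
        # Negative number: fill every added high-order bit with 1.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


# Example 1: six-bit +10 extended to 16 bits.
print(format(sign_extend(0b001010, 6), "016b"))       # 0000000000001010
# Example 2: ten-bit -15 extended to 16 bits.
print(format(sign_extend(0b1111110001, 10), "016b"))  # 1111111111110001
```

A positive value gains leading zeros and a negative value gains leading ones, so the numerical value is the same in both word lengths.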
When SXM = 0, sign extension is suppressed: the sign bit of the number is not
replicated, and the added high-order bits are filled with zeros.
Example 3. Given the number 1234h, find the 32-bit representation of the number after
a left shift by 3 bits for a) SXM = 0 and b) SXM = 1.
Solution:
a) Given number: 1234h = 0001 0010 0011 0100 (binary)
After a left shift by 3 bits: 1001 0001 1010 0000 (binary)
For SXM = 0 no sign bits are appended. Hence the result is 91A0h (000091A0h as a 32-bit value).
b) For SXM = 1 the sign should be extended, i.e., ones are to be appended.
Hence the number is 1111 1111 1111 1111 1001 0001 1010 0000 (binary) = FFFF91A0h
C (Carry bit): This 1-bit field indicates an arithmetic operation carry or borrow in the
ALU. The single bit shift and rotate instructions affect the C bit.
HM (Hold mode bit): This 1-bit field determines whether the central processing unit
(CPU) stops or continues execution when acknowledging an active HOLD signal.
XF (pin status bit): This 1-bit field determines the level of the external flag (XF)
output pin.
PM (Product shift mode bits): This 2-bit field determines the product shifter
(P-SCALER) mode and shift value for the PREG output into the ALU. Table 3.2 gives
the PM bits and the function performed.
The C5X has a total address range of 224K words x 16 bits. The memory space is
divided into four individually selectable memory segments: 64K-word program
memory space, 64K-word local data memory space, 64K-word I/O ports and 32K-
word global data memory space.
Program ROM
All C5X DSPs carry a 16-bit on-chip mask-programmable ROM (see Fig. 2.1
for sizes). Some of the C5X DSPs have boot loader code resident in the on-chip
ROM, and the other C5X DSPs offer the boot loader code as an option. This
memory is used for booting program code from slower external ROM or EPROM to
fast on-chip or external RAM. Once the custom program has been booted into RAM,
the boot ROM space can be removed from program memory space by setting the
MP/MC bit in the processor mode status register (PMST). The on-chip ROM is
selected at reset by driving the MP/MC pin low. If the on-chip ROM is not selected,
the C5X devices start execution from off-chip memory.
All C5X DSPs carry a 1056-word x 16-bit on-chip dual-access RAM (DARAM).
The DARAM is divided into three individually selectable memory blocks: 512-word
data or program DARAM block B0, 512-word data DARAM block B1 and 32-word
data DARAM block B2. The DARAM is primarily intended to store data
values but, when needed, can be used to store programs as well. DARAM blocks B1
and B2 are always configured as data memory; however, DARAM block B0 can be
configured by software as data or program memory. DARAM improves the
operational speed of the C5X CPU. The CPU operates with a 4-deep pipeline.
In this pipeline, the CPU reads data on the third stage and writes data on the fourth
stage. Hence, for a given instruction sequence, the second instruction could be
reading data at the same time the first instruction is writing data. The dual data buses
(DB and DAB) allow the CPU to read from and write to DARAM in the same
machine cycle.
Almost all C5X DSPs carry a 16-bit on-chip single-access RAM (SARAM), with sizes
ranging from 1K to 9K 16-bit words. Code can be booted from an off-chip ROM and
then executed at full speed once it is loaded into the on-chip SARAM. The SARAM
can be configured by software as data memory, as program memory or as a combination
of both. The SARAM is divided into 1K- and/or 2K-word blocks that are contiguous
in address space. All C5X CPUs support
parallel accesses to these SARAM blocks. However, one SARAM block can be
accessed only once per machine cycle. In other words, the CPU can read from or
write to one SARAM block while accessing another SARAM block.
[label] [:] mnemonic [operand list] [;comment]
e.g. Loop1: ADD #20h ; add 20h to accumulator
Operands can be constants or assembly-time expressions that refer to memory, I/O ports,
register addresses, pointers, shift counts and a variety of other constants.
ADDRESSING MODES
Direct addressing
The data memory used with C5X processors is split into 512 pages, each 128 words
long. The data memory page pointer (DP) in ST0 holds the address of the
current data memory page. In the direct addressing mode of C5X, only lower-order 7
bits of the address are specified in the instruction. The upper 9 bits are taken from
the DP as shown in Fig. 2.4.
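Example 2.1 Explain the direct addressing mode of the C5X. Explain the execution of the
instruction ADDC 20h, including the address generation process, if the content of DP is 06h
and the contents of data memory locations 0320h to 032Fh are 20h. Take the content of ACC as
30h.
The 9 bits of DP (06h) form the upper part of the address and the 7-bit offset from the
instruction (20h) forms the lower part, giving the data memory address 0320h.
The content of the data memory address (dma) and the value of the carry bit C are added to
the contents of the accumulator (ACC) with sign extension suppressed. The result
is stored in the ACC, which becomes 30h + 20h + C (50h if C = 0, 51h if C = 1). The carry
bit C is set if the addition generates a carry; otherwise it is cleared.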
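The address generation and the addition in Example 2.1 can be traced with a small Python model. The helper names direct_address and addc, the dictionary used as data memory and the initial carry value of 0 are assumptions for illustration (the example does not state the initial value of C).

```python
def direct_address(dp, offset):
    """Concatenate the 9-bit DP with the 7-bit offset from the instruction."""
    return ((dp & 0x1FF) << 7) | (offset & 0x7F)


def addc(acc, operand, carry):
    """Add a 16-bit operand and the carry to a 32-bit ACC, without sign extension.

    Returns the new ACC and the new carry bit.
    """
    total = (acc & 0xFFFFFFFF) + (operand & 0xFFFF) + (carry & 1)
    return total & 0xFFFFFFFF, total >> 32


# Data memory locations 0320h to 032Fh all contain 20h, as in Example 2.1.
memory = {addr: 0x20 for addr in range(0x0320, 0x0330)}

dma = direct_address(0x06, 0x20)    # DP = 06h, offset = 20h
acc, c = addc(0x30, memory[dma], 0)  # assumed carry-in of 0
print(hex(dma), hex(acc), c)         # 0x320 0x50 0
```

With a carry-in of 1 the same call would return 51h, which matches the 30h + 20h + C rule of the ADDC instruction.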
Memory-mapped register addressing
The RAM area in page 0 is used for storing some of the registers, interrupt vector
addresses and so on. These locations can be accessed by specifying the actual
address or by the register name, (e.g., the AR0 can either be denoted by the actual
memory location (10h) used for storing its value or by the symbol AR0). Since these
memory locations can be interchangeably used with the register names, the registers
corresponding to page 0 are referred to as memory-mapped registers (MMRs). With
memory-mapped register addressing, the MMRs can be modified without affecting
the current data page pointer value. The memory-mapped register addressing mode
operates like the direct addressing mode, except that the 9 MSBs of the address are
forced to 0 instead of being loaded with the contents of the DP. This allows the
memory-mapped registers of data page 0 to be modified directly without the
overhead of changing the DP or auxiliary register. The following instructions operate
in the memory-mapped register addressing mode. These instructions do not affect
the contents of the DP:
Example 2.2. The instruction LMMR AR0, #1500h loads AR0 with the content of
the location 1500h as shown in Fig. 2.5. Let the content of AR0 and the data
memory location 1500h be 2345h and 6789h, respectively, before executing the
instruction. After executing the instruction their contents become 6789h and 6789h.
Fig. 2.5
Example 2.3 Let the content of AR0 and the data memory location 1500h be 2345h and
6789h, respectively, before executing the instruction SMMR AR0, #1500h. After
executing the instruction, their contents become 2345h and 2345h, as shown in Fig.
2.6.
Fig. 2.6
The SMMR instruction performs the reverse operation of LMMR: it stores the content of
the memory-mapped register in the addressed data memory location.
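Examples 2.2 and 2.3 can be reproduced with a minimal Python model. The function names lmmr, smmr and mmr_address and the dictionary used as data memory are assumptions for illustration; the AR0 address 10h is the one quoted in the text.

```python
AR0 = 0x10  # data memory address of AR0 in page 0, as given in the text


def mmr_address(addr):
    """Memory-mapped register addressing: the 9 MSBs are forced to 0."""
    return addr & 0x7F


def lmmr(mem, mmr, addr):
    """Load the memory-mapped register from data memory location addr."""
    mem[mmr_address(mmr)] = mem[addr]


def smmr(mem, mmr, addr):
    """Store the memory-mapped register in data memory location addr."""
    mem[addr] = mem[mmr_address(mmr)]


# Example 2.2: LMMR AR0, #1500h
mem = {AR0: 0x2345, 0x1500: 0x6789}
lmmr(mem, AR0, 0x1500)
after_lmmr = (mem[AR0], mem[0x1500])

# Example 2.3: SMMR AR0, #1500h
mem = {AR0: 0x2345, 0x1500: 0x6789}
smmr(mem, AR0, 0x1500)
after_smmr = (mem[AR0], mem[0x1500])

print([hex(v) for v in after_lmmr])  # ['0x6789', '0x6789']
print([hex(v) for v in after_smmr])  # ['0x2345', '0x2345']
```

Neither function reads or changes a page pointer, which mirrors the statement that these instructions leave the DP untouched.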
Indirect addressing
The ARs (AR0-AR7) are used for accessing data in the indirect addressing mode.
Of the eight ARs, the one currently used for accessing data is selected by
the ARP. The indirect addressing mode of the C5X permits the AR used for the
addressing to be updated automatically either after or before the operand is fetched.
Hence a separate instruction is not required to update the AR. The manner in which
the memory address is computed and the manner in which the AR is altered after the
instruction depend on the instruction. This is indicated to the assembler by the
symbols *, *+, *-, *0+, *0-, *BR0+ and *BR0-. The symbols used to indicate the
indirect addressing mode and the action taken after executing the instruction are
given in Table 2.1.
Symbol   Value of AR pointed to by ARP after instruction execution
*        AR unaltered
*+       AR incremented by 1
*-       AR decremented by 1
*0+      AR incremented by the content of INDX
*0-      AR decremented by the content of INDX
*BR0+    AR incremented by the content of INDX with reverse carry propagation
*BR0-    AR decremented by the content of INDX with reverse carry propagation
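The phrase "reverse carry propagation" in the last two rows means that the carry of the addition travels from the most significant bit toward the least significant bit. A minimal Python sketch of the incrementing case follows. It models only the low-order address bits of an 8-word buffer, and the function names, the 3-bit width and the INDX value of 4 (half the buffer size) are assumptions chosen for illustration.

```python
def reverse_bits(x, width):
    """Return x with its lowest `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (x & 1)
        x >>= 1
    return result


def reverse_carry_add(ar, indx, width):
    """Add indx to ar with the carry propagating from MSB to LSB.

    Reversing the bits, adding normally and reversing again is equivalent
    to an addition whose carry runs in the opposite direction.
    """
    mask = (1 << width) - 1
    total = (reverse_bits(ar, width) + reverse_bits(indx, width)) & mask
    return reverse_bits(total, width)


# Walk an 8-word buffer (3 address bits) with INDX = 4.
offset, visited = 0, []
for _ in range(8):
    visited.append(offset)
    offset = reverse_carry_add(offset, 4, 3)

print(visited)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The visited offsets are the bit-reversed versions of 0 to 7, which is the access order that bit-reversal addressing is designed to produce.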
Example 2.4 Let the values of ARP, AR2 and the INDX register be 2, 1250h and 2h,
respectively, and let the data memory locations 1240h-1260h be filled
with the data 2345h. Let SXM be 0. The values of ACC and AR2 after the following
sequence of LACC (load accumulator with shift) instructions is executed are shown
in Fig. 2.7.
LACC *, 0
LACC *+, 1
LACC *-, 2
LACC *0+, 4
LACC *0-, 3
Solution:
Fig. 2.7
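The sequence in Example 2.4 can be traced with a short Python model. It is a sketch under stated assumptions: the operand is fetched before AR2 is updated, the ACC is 32 bits wide, and the function name lacc_sequence is made up for the example. The values printed are what this model produces, not a transcription of Fig. 2.7.

```python
def lacc_sequence():
    """Trace ACC and AR2 for the five LACC instructions of Example 2.4."""
    arp = 2
    ar = {2: 0x1250}
    indx = 0x2
    # Data memory locations 1240h to 1260h all hold 2345h.
    mem = {addr: 0x2345 for addr in range(0x1240, 0x1261)}
    program = [("*", 0), ("*+", 1), ("*-", 2), ("*0+", 4), ("*0-", 3)]
    # Amount added to the AR after the operand fetch (see Table 2.1).
    update = {"*": 0, "*+": 1, "*-": -1, "*0+": indx, "*0-": -indx}

    trace = []
    for mode, shift in program:
        # SXM = 0, so the operand is not sign extended before the shift.
        acc = (mem[ar[arp]] << shift) & 0xFFFFFFFF
        ar[arp] = (ar[arp] + update[mode]) & 0xFFFF
        trace.append((mode, shift, acc, ar[arp]))
    return trace


trace = lacc_sequence()
for mode, shift, acc, ar2 in trace:
    print(f"LACC {mode}, {shift}: ACC = {acc:08X}h, AR2 = {ar2:04X}h")
```

In this model the final values are ACC = 00011A28h and AR2 = 1250h, because the last instruction shifts 2345h left by 3 bits and then subtracts INDX from AR2.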
Immediate addressing
In immediate addressing, the instruction word(s) contains the value of the immediate
operand. The C5X has both 1-word (8-bit, 9-bit and 13-bit constant) short immediate
instructions and 2-word (16-bit constant) long immediate instructions. This mode is
indicated by the symbol #.
Examples:
i) Short immediate instructions
e.g. ADD #56h (adds 56h to ACC)
ii) Long immediate instructions
e.g. ADD #4567h (adds 4567h to ACC)
Dedicated-register addressing
The dedicated-register addressing mode operates like the long immediate addressing
mode, except that the address comes from one of two special-purpose memory-mapped
registers in the CPU:
The block move address register (BMAR)
The dynamic bit manipulation register (DBMR)
The advantage of this addressing mode is that the address of the block of memory to be
acted upon can be changed during execution of the program.
Example 2.5 Let the contents of ARP and AR2 be 2 and 1250h, respectively, and let the
data memory locations (dma) 1249h-1250h be filled with the data 2345h. Let SXM be
0. Find the values of ACC, AR2 and ARP after the execution of the ADD *,2
instruction.
Example 2.7 Explain the operations performed when the TMS320C5X executes the
following instructions:
i) ADDC 20h (see Example 2.1)
ii) SMMR AR0, #1800 (see Example 2.6)
iii) BLDP 00h: Block move from data memory to program memory
The contents of the data memory address (dma) are copied to the program memory
address (pma) pointed to by the block move address register (BMAR). The source and
destination blocks do not have to be entirely on-chip or off-chip.
Circular Addressing Mode (Refer Ch. No. 4, Avtar Singh)
The buffer size in the first two cases is (EAR - SAR + 1), and in the last two cases it is
(SAR - EAR + 1).
Figure 2.8 Register pointer updating algorithm for circular buffer addressing
mode. SAR = start address register contents, EAR = end address register
contents, PNTR = pointer
Figure 2.9a) Case 1: SAR < EAR, and Updated PNTR > EAR (from low to high address: SAR, New PNTR, EAR, Updated PNTR)
Figure 2.9b) Case 2: SAR < EAR, and Updated PNTR < SAR (from low to high address: Updated PNTR, SAR, New PNTR, EAR)
Figure 2.9c) Case 3: SAR > EAR, and Updated PNTR > SAR (from low to high address: EAR, New PNTR, SAR, Updated PNTR)
Figure 2.9d) Case 4: SAR > EAR, and Updated PNTR < EAR (from low to high address: Updated PNTR, EAR, New PNTR, SAR)
Example 2.8 A DSP has a circular buffer with the start and the end addresses as 0200h
and 020Fh, respectively. What would be the new values of the address pointer of the
buffer if, in the course of address computation, it gets updated to (a) 0212h, (b) 01FCh?
Solution:
The buffer length is 020Fh - 0200h + 1 = 0010h.
a. The new value of the pointer is the updated value - buffer length, i.e., 0212h - 0010h
= 0202h.
b. The new value of the pointer is the updated value + buffer length, i.e., 01FCh + 0010h
= 020Ch.
Example 2.9 Repeat the problem of Example 2.8 if the start and end addresses of the
circular buffer are 0210h and 0201h, respectively.
Solution:
The buffer length is 0210h - 0201h + 1 = 0010h.
a. The new value of the pointer is the updated value - buffer length, i.e., 0212h - 0010h
= 0202h.
b. The new value of the pointer is the updated value + buffer length, i.e., 01FCh + 0010h
= 020Ch.
Note that these values are the same as those in the previous example. This shows that in
a circular buffer, the address pointer wraps around to point to an address inside the
buffer, irrespective of whether the buffer start address is higher or the end address is
higher.
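The pointer update used in Examples 2.8 and 2.9 can be written as a short Python function. The name wrap_pointer is an assumption for illustration; the logic follows the four cases of Fig. 2.9, with the buffer length taken as the distance between SAR and EAR plus one.

```python
def wrap_pointer(sar, ear, updated):
    """Return the new circular-buffer pointer for an updated pointer value.

    sar and ear are the start and end addresses; either one may be the
    higher address.
    """
    low, high = min(sar, ear), max(sar, ear)
    length = high - low + 1
    if updated > high:       # pointer ran past the high end of the buffer
        return updated - length
    if updated < low:        # pointer ran past the low end of the buffer
        return updated + length
    return updated           # still inside the buffer, no wrap needed


# Example 2.8: SAR = 0200h, EAR = 020Fh
print(hex(wrap_pointer(0x0200, 0x020F, 0x0212)))  # 0x202
print(hex(wrap_pointer(0x0200, 0x020F, 0x01FC)))  # 0x20c

# Example 2.9: SAR = 0210h, EAR = 0201h
print(hex(wrap_pointer(0x0210, 0x0201, 0x0212)))  # 0x202
print(hex(wrap_pointer(0x0210, 0x0201, 0x01FC)))  # 0x20c
```

Because only the lower and higher of the two addresses matter, the same function covers both buffer orientations, which is the point made by comparing the two examples.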