www.roeverengg.edu.in

ROEVER ENGINEERING COLLEGE


ELECTRONICS AND COMMUNICATION ENGINEERING

SUBJECT: DIGITAL SIGNAL PROCESSING


SUBJECT CODE: EC1302
DEPARTMENT OF ECE
PREPARED BY:
S.KARTHIKEYAN,AP/ECE

Digital Signal Processing


I UNIT

UNIT I DISCRETE FOURIER TRANSFORM
9
DFT and its properties, Relation between DTFT and DFT, FFT computations using Decimation in time and Decimation in frequency algorithms, Overlap-add and overlap-save methods
PART-A
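1. Differentiate DTFT and DFT.
The DTFT is a continuous function of frequency, whereas the DFT is discrete in frequency: it consists of N samples of the DTFT taken at $\omega = 2\pi k/N$, k = 0, 1, ..., N-1.
2. Differentiate between the DIT and DIF algorithms.
DIT: the time-domain sequence is decimated; the input is in bit-reversed order and the output is in natural order.
DIF: the frequency-domain sequence is decimated; the input is in natural order and the output is in bit-reversed order.
3. How many stages are there in an 8-point radix-2 FFT?
$\log_2 8 = 3$ stages.
4. How many complex multiplications are required to compute the DFT by direct evaluation and by the FFT method?
Direct evaluation: $N^2$. Radix-2 FFT: $(N/2)\log_2 N$.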
5. What is meant by autocorrelation?
The autocorrelation of a sequence is the correlation of a sequence with its shifted
version, and this indicates how fast the signal changes.
6. Define circular convolution.
Let x1(n) and x2(n) be finite-duration sequences, both of length N, with DFTs X1(k) and X2(k). If X3(k) = X1(k)X2(k), then the sequence x3(n) can be obtained by the circular convolution defined as
$$x_3(n) = \sum_{m=0}^{N-1} x_1(m)\, x_2((n-m))_N, \quad n = 0, 1, \dots, N-1$$
7. How is the output sequence of linear convolution obtained through circular convolution?
Consider two finite-duration sequences x(n) and h(n) of duration L samples and M samples. The linear convolution of these two sequences produces an output sequence of duration L+M-1 samples, whereas the circular convolution of x(n) and h(n) gives N samples, where N = max(L, M). To make the number of samples in the circular convolution equal to L+M-1, both x(n) and h(n) must be appended with the appropriate number of zero-valued samples. In other words, by increasing the length of the sequences x(n) and h(n) to L+M-1 points and then circularly convolving the resulting sequences, we obtain the same result as that of linear convolution.
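The zero-padding argument above can be checked numerically. The following is a minimal sketch in plain Python; the function names and the sample sequences are illustrative assumptions, not part of the original notes.

```python
def circular_convolve(x1, x2):
    """N-point circular convolution: x3(n) = sum_m x1(m) * x2((n - m) mod N)."""
    n_points = len(x1)
    assert len(x2) == n_points, "both sequences must have the same length N"
    return [sum(x1[m] * x2[(n - m) % n_points] for m in range(n_points))
            for n in range(n_points)]

def linear_convolve(x, h):
    """Direct linear convolution, used here only as a reference."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def linear_via_circular(x, h):
    """Zero-pad x (length L) and h (length M) to L+M-1 points, then circularly convolve."""
    n_points = len(x) + len(h) - 1
    xp = list(x) + [0] * (n_points - len(x))
    hp = list(h) + [0] * (n_points - len(h))
    return circular_convolve(xp, hp)

x = [1, 2, 3, 4]   # L = 4, illustrative values
h = [1, 1, 1]      # M = 3, illustrative values
y = linear_via_circular(x, h)   # L + M - 1 = 6 samples
```

Without the padding, a 4-point circular convolution of these sequences would wrap the tail of the result back onto its first samples, which is the time-domain aliasing the answer refers to.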
8. What is zero padding? What are its uses?
Let the sequence x(n) have length L. If we want to find the N-point DFT (N > L) of the sequence x(n), we have to append (N-L) zeros to the sequence x(n). This is known as zero padding.
The uses of zero padding are:
1) We get a better display of the frequency spectrum.
2) With zero padding, the DFT can be used in linear filtering.
9. Define sectional convolution.
If the data sequence x(n) is of long duration, it is very difficult to obtain the output sequence y(n) because of the limited memory of a digital computer. Therefore, the data sequence is divided into smaller sections. These sections are processed separately, one at a time, and combined later to get the output.
10. What are the two methods used for sectional convolution?
The two methods used for sectional convolution are
1) the overlap-add method and 2) the overlap-save method.
11. What is the overlap-add method?
In this method the size of the input data block xi(n) is L. To each data block we append M-1 zeros and perform an N-point circular convolution of xi(n) and h(n), where N = L+M-1. Since each data block is terminated with M-1 zeros, the last M-1 points of each output block must be overlapped with and added to the first M-1 points of the succeeding block. This method is called the overlap-add method.
12. What is the overlap-save method?
In this method the data sequence is divided into N-point sections xi(n). Each section contains the last M-1 data points of the previous section followed by L new data points, forming a data sequence of length N = L+M-1. In the circular convolution of xi(n) with h(n), the first M-1 points will not agree with the linear convolution of xi(n) and h(n) because of aliasing; the remaining points will agree with the linear convolution. Hence we discard the first M-1 points of each filtered section (the N-point circular convolution of xi(n) and h(n)). This process is repeated for all sections, and the filtered sections are abutted together.
13. Why is the FFT needed?
The direct evaluation of the DFT requires $N^2$ complex multiplications and $N^2 - N$ complex additions. Thus, for large values of N, direct evaluation of the DFT is computationally expensive. By using an FFT algorithm the number of complex computations can be reduced, so we use the FFT.

14. What is the FFT?
The Fast Fourier Transform is an algorithm used to compute the DFT. It makes use of the symmetry and periodicity properties of the twiddle factor to reduce the DFT computation time. It is based on the fundamental principle of decomposing the computation of the DFT of a sequence of length N into successively smaller DFTs.
15. How many multiplications and additions are required to compute an N-point DFT using the radix-2 FFT?
The numbers of complex multiplications and complex additions required to compute an N-point DFT using the radix-2 FFT are $(N/2)\log_2 N$ and $N\log_2 N$ respectively.
16. What is meant by radix-2 FFT?
The FFT algorithm is most efficient in calculating an N-point DFT when N is highly composite. If the number of output points N can be expressed as a power of 2, that is $N = 2^M$ where M is an integer, then the algorithm is known as the radix-2 FFT algorithm.
17. List any four properties of the DFT.
1. Periodicity
2. Linearity
3. Time reversal
4. Circular time shift

18. State the periodicity property of the DFT.
If X(k) is the N-point DFT of a finite-duration sequence x(n), then
x(n+N) = x(n) for all n,
X(k+N) = X(k) for all k.
19. State the linearity property of the DFT.
If X1(k) and X2(k) are the N-point DFTs of finite-duration sequences x1(n) and x2(n), then
DFT[a x1(n) + b x2(n)] = a X1(k) + b X2(k), where a and b are constants.
20. Why FFT is needed?
FFT is needed to compute DFT with reduced number of calculations. DFT is required for
spectrum analysis and filtering operations on the signals using digital computers.
21. Calculate the number of multiplications needed in the calculation of the DFT and the FFT of a 64-point sequence.
The number of complex multiplications required using direct computation is $N^2 = 64^2 = 4096$.
The number of complex multiplications required using the radix-2 FFT is $(N/2)\log_2 N = 32 \times \log_2 64 = 32 \times 6 = 192$.
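As a quick check of this arithmetic, here is a small Python sketch. The function names are illustrative assumptions, and the FFT counts assume a radix-2 algorithm with N a power of 2.

```python
import math

def direct_dft_multiplications(n):
    """Complex multiplications for direct evaluation of an N-point DFT: N^2."""
    return n * n

def radix2_fft_stages(n):
    """Number of butterfly stages in a radix-2 FFT: log2(N)."""
    return int(math.log2(n))

def radix2_fft_multiplications(n):
    """Complex multiplications for a radix-2 FFT: (N/2) * log2(N)."""
    return (n // 2) * radix2_fft_stages(n)

def radix2_fft_additions(n):
    """Complex additions for a radix-2 FFT: N * log2(N)."""
    return n * radix2_fft_stages(n)
```

For N = 64 this gives 4096 multiplications for the direct method against 192 for the FFT, a reduction by a factor of a little over 21.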

22. Distinguish between linear convolution and circular convolution of two sequences.
No. | Linear convolution | Circular convolution
1 | If x(n) is a sequence of L samples and h(n) is a sequence of M samples, then after convolution y(n) will have N = L+M-1 samples. | If x(n) is a sequence of L samples and h(n) is a sequence of M samples, then after convolution y(n) will have N = max(L, M) samples.

PART-B
1. Determine the DFT of the sequence
x(n) = 1/4 for 0 <= n <= 2
x(n) = 0 otherwise
Ans: The N-point DFT of the sequence x(n) is defined as
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi nk/N}, \quad k = 0, 1, \dots, N-1$$
With x(n) = {1/4, 1/4, 1/4} and N = 3,
$$X(k) = \frac{1}{4}\, e^{-j2\pi k/3}\left[1 + 2\cos(2\pi k/3)\right], \quad k = 0, 1, 2$$
2. Derive the DFT of the sample data sequence x(n) = {1,1,2,2,3,3} and compute the corresponding amplitude and phase spectrum.
Ans: The N-point DFT of the sequence x(n) is defined as
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi nk/N}, \quad k = 0, 1, \dots, N-1$$
With N = 6:
X(0) = 12
X(1) = -1.5 + j2.598
X(2) = -1.5 + j0.866
X(3) = 0
X(4) = -1.5 - j0.866
X(5) = -1.5 - j2.598
X(k) = {12, -1.5+j2.598, -1.5+j0.866, 0, -1.5-j0.866, -1.5-j2.598}
|X(k)| = {12, 3, 1.732, 0, 1.732, 3}
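The values above, together with the phase spectrum the question asks for, can be reproduced with a direct evaluation of the DFT sum. This is a minimal sketch; the helper name `dft` and the choice to report the phase of a zero-valued bin as 0 are assumptions made for illustration.

```python
import cmath
import math

def dft(x):
    """Direct N-point DFT: X(k) = sum_n x(n) * exp(-j*2*pi*n*k/N)."""
    n_points = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / n_points)
                for n in range(n_points))
            for k in range(n_points)]

x = [1, 1, 2, 2, 3, 3]
X = dft(x)

# Amplitude spectrum |X(k)|
amplitude = [abs(v) for v in X]

# Phase spectrum in degrees; the phase of a (numerically) zero bin is
# undefined, so it is reported as 0 here by convention.
phase_deg = [math.degrees(cmath.phase(v)) if abs(v) > 1e-9 else 0.0 for v in X]
```

Because x(n) is real, the spectrum is conjugate symmetric: the amplitudes at k and N-k are equal and the phases are equal in magnitude and opposite in sign.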
3. Given x(n) = {0,1,2,3,4,5,6,7}, find X(k) using the DIT FFT algorithm.
Ans: Given N = 8, the twiddle factors are $W_N^k = e^{-j(2\pi/N)k}$:
$W_8^0 = 1$
$W_8^1 = 0.707 - j0.707$
$W_8^2 = -j$
$W_8^3 = -0.707 - j0.707$
Using the butterfly diagram,
X(k) = {28, -4+j9.656, -4+j4, -4+j1.656, -4, -4-j1.656, -4-j4, -4-j9.656}
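The butterfly computation can be sketched as a short recursive radix-2 decimation-in-time routine. This is an illustrative implementation under the assumption that N is a power of 2; it is not the in-place, bit-reversed flow graph normally drawn for this problem, but it applies the same even/odd split and the same butterflies.

```python
import cmath
import math

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of 2)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    assert n % 2 == 0, "length must be a power of 2"
    even = fft_dit(x[0::2])   # DFT of even-indexed samples
    odd = fft_dit(x[1::2])    # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * math.pi * k / n)   # twiddle factor W_N^k
        out[k] = even[k] + w * odd[k]           # upper butterfly output
        out[k + n // 2] = even[k] - w * odd[k]  # lower butterfly output
    return out

X = fft_dit([0, 1, 2, 3, 4, 5, 6, 7])
```

Each level of recursion corresponds to one stage of the flow graph, so an 8-point transform passes through three stages of four butterflies each.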

4. Given X(k) = {28, -4+j9.656, -4+j4, -4+j1.656, -4, -4-j1.656, -4-j4, -4-j9.656}, find x(n) using the inverse DIT FFT algorithm.
Ans: The inverse transform uses the conjugate twiddle factors $W_N^{-k} = e^{j(2\pi/N)k}$ and a final scaling by 1/N:
$W_8^{-0} = 1$
$W_8^{-1} = 0.707 + j0.707$
$W_8^{-2} = j$
$W_8^{-3} = -0.707 + j0.707$
x(n) = {0,1,2,3,4,5,6,7}

5. Find the inverse DFT of X(k) = {1,2,3,4}.
Ans: The inverse DFT is defined as
$$x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\, e^{j2\pi nk/N}, \quad n = 0, 1, \dots, N-1$$
x(0) = 5/2
x(1) = -1/2 - j1/2
x(2) = -1/2
x(3) = -1/2 + j1/2
x(n) = {5/2, -1/2-j1/2, -1/2, -1/2+j1/2}
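The result can be verified by evaluating the inverse DFT sum directly. A minimal sketch, with the helper name `idft` chosen for illustration:

```python
import cmath
import math

def idft(X):
    """Direct inverse DFT: x(n) = (1/N) * sum_k X(k) * exp(+j*2*pi*n*k/N)."""
    n_points = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * n * k / n_points)
                for k in range(n_points)) / n_points
            for n in range(n_points)]

x = idft([1, 2, 3, 4])
```

Here X(k) is real but not conjugate symmetric, which is why the recovered x(n) is complex.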

6. State the properties of DFT.


1) Periodicity
2) Linearity and symmetry
3) Multiplication of two DFTs
4) Circular convolution
5) Time reversal
6) Circular time shift and frequency shift
7) Complex conjugate
8) Circular correlation
7. Derive 8 point radix 2 DIT-FFT algorithm.
Ans: Basic butterfly diagram,

Complete butterfly diagram.

8. Derive 8 point radix 2 DIF-FFT algorithm


Ans: Basic butterfly diagram,

Complete butterfly diagram.

9. Derive 16 point radix 4 DIF-FFT algorithm.


Ans: Basic butterfly diagram,

Sixteen-point radix-4 decimation-in-time algorithm with input in normal order and output in digit-reversed order
Complete butterfly diagram

Sixteen-point, radix-4 decimation-in-frequency algorithm with input in normal order and output in digit-reversed order.

10. Explain the methods of filtering long data sequences.
Ans: Overlap-add method
Overlap-save method
Overlap Add
o Use a block length of N=L+P-1, where L is the number of new input samples per block and P is the length of h[n].
o Append (L-1) zeros to h[n] and compute the N-length DFT H[k] once. It will be used for all blocks.
o Start the input block index at 0.
o Repeat the following:
  - Get the next length-L sequence of input from x[n], starting at the block index.
  - Append (P-1) zeros and compute the N-length DFT X[k].
  - Compute the N-point IDFT of H[k]X[k] (including the 1/N scale factor) to get a partial output sequence.
  - Overlap the last partial output sequence with the current one by adding the last (P-1) outputs of the last partial output sequence to the first (P-1) outputs of the current partial output sequence.
  - Output the first L outputs of the sum.
  - Save the remaining (P-1) outputs for use with the next block of L input values.

Overlap Save
o Use a block length of N=L+P-1, where L is the number of new input samples per block and P is the length of h[n].
o Append (L-1) zeros to h[n] and compute the N-length DFT H[k] once. It will be used for all blocks.
o Start the input block index at 0.
o Initialize the current N-point block buffer x[n] to all zeros.
o Repeat the following:
  - Get the next length-L sequence of input, starting at the block index.
  - Store it in the last L locations of the block buffer x[n].
  - Compute the N-length DFT X[k].
  - Compute the N-point IDFT of H[k]X[k] (including the 1/N scale factor) to get a temporary output sequence with some valid results and some invalid results.
  - Output the L valid results for linear convolution at the end of the temporary output sequence and discard the first (P-1) invalid results. (Refer to linear convolution example above.)
  - Move the last (P-1) values in the block buffer x[n] to the first (P-1) entries, to be used again in the next block computation.
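The two block-processing procedures above can be sketched in plain Python as follows. This is an illustrative implementation, not an optimized one: the function names are assumptions, the DFT is evaluated directly in O(N^2) rather than with an FFT, and the 1/N factor is applied inside the inverse transform.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) DFT, or IDFT (with the 1/N factor) when inverse=True."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * math.pi * m * k / n) for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def block_circular_convolve(block, H):
    """N-point circular convolution of one block with h, given H = DFT of padded h."""
    X = dft(block)
    y = dft([a * b for a, b in zip(X, H)], inverse=True)
    return [v.real for v in y]   # inputs are real, so the imaginary part is round-off

def overlap_add(x, h, L):
    P = len(h)
    N = L + P - 1
    H = dft(list(h) + [0] * (L - 1))        # computed once, reused for every block
    y = [0.0] * (len(x) + P - 1)
    for start in range(0, len(x), L):
        block = list(x[start:start + L])
        block += [0] * (N - len(block))     # append zeros up to N points
        partial = block_circular_convolve(block, H)
        for i, v in enumerate(partial):
            if start + i < len(y):
                y[start + i] += v           # overlapping tails are added together
    return y

def overlap_save(x, h, L):
    P = len(h)
    N = L + P - 1
    H = dft(list(h) + [0] * (L - 1))
    total = len(x) + P - 1
    buf = [0.0] * N                          # block buffer, initially all zeros
    y = []
    start = 0
    while len(y) < total:
        new = list(x[start:start + L])
        new += [0] * (L - len(new))          # zero-fill past the end of the input
        buf = buf[L:] + new                  # keep last P-1 samples, append L new ones
        temp = block_circular_convolve(buf, H)
        y.extend(temp[P - 1:])               # discard the first P-1 aliased outputs
        start += L
    return y[:total]

x = [1, 2, 3, 4, 5, 6, 7]   # illustrative input
h = [1, 2, 1]               # illustrative impulse response, P = 3
y_add = overlap_add(x, h, L=3)
y_save = overlap_save(x, h, L=3)
```

Both routines should reproduce the linear convolution of x and h. The practical difference is where the bookkeeping goes: overlap-add keeps a running sum of output tails, while overlap-save keeps a sliding window of input samples and throws away part of each output block.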

II UNIT
UNIT II INFINITE IMPULSE RESPONSE DIGITAL FILTERS:
9
Review of design of analogue Butterworth and Chebyshev Filters, Frequency transformation in
analogue domain - Design of IIR digital filters using impulse invariance technique - Design of digital
filters using bilinear transform - pre warping - Realization using direct, cascade and parallel forms.

PART-A

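1. Write the expression for the order of a Butterworth filter.
The expression is
$$N \ge \frac{\log(\lambda/\varepsilon)}{\log(1/k)}$$
where $\lambda = \sqrt{10^{0.1\alpha_s} - 1}$, $\varepsilon = \sqrt{10^{0.1\alpha_p} - 1}$ and $k = \Omega_p/\Omega_s$.
2. Write the expression for the order of a Chebyshev filter.
$$N \ge \frac{\cosh^{-1}(\lambda/\varepsilon)}{\cosh^{-1}(1/k)}$$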
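As a worked illustration of the two order formulas, the sketch below assumes the usual definitions: pass band attenuation alpha_p and stop band attenuation alpha_s in dB, lambda = sqrt(10^(0.1 alpha_s) - 1), epsilon = sqrt(10^(0.1 alpha_p) - 1) and k = Omega_p / Omega_s. The function name and the example specification are hypothetical.

```python
import math

def filter_orders(alpha_p_db, alpha_s_db, omega_p, omega_s):
    """Return (Butterworth order, Chebyshev order), each rounded up to an integer."""
    eps = math.sqrt(10 ** (0.1 * alpha_p_db) - 1)   # pass band ripple parameter
    lam = math.sqrt(10 ** (0.1 * alpha_s_db) - 1)   # stop band parameter
    k = omega_p / omega_s                            # transition ratio
    n_butterworth = math.log10(lam / eps) / math.log10(1 / k)
    n_chebyshev = math.acosh(lam / eps) / math.acosh(1 / k)
    return math.ceil(n_butterworth), math.ceil(n_chebyshev)

# Hypothetical specification: 1 dB at 200 rad/s, 30 dB at 600 rad/s
orders = filter_orders(alpha_p_db=1, alpha_s_db=30, omega_p=200, omega_s=600)
```

For this specification the unrounded values are about 3.76 (Butterworth) and 2.74 (Chebyshev), so the Chebyshev design meets the same specification with a lower order.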

3. Write the various frequency transformations in analog domain?


LPF to LPF:
LPF to HPF
LPF to BPF
LPF to BSF
4. Write the steps in designing a Chebyshev filter.
1. Find the order of the filter.
2. Find the values of the major and minor axes.
3. Calculate the poles.
4. Find the denominator function using the above poles.
5. The numerator polynomial value depends on the value of N.
If N is odd: put s=0 in the denominator polynomial.
If N is even: put s=0 and divide it by $(1+\varepsilon^2)^{1/2}$.
5. Write down the steps for designing a Butterworth filter.
1. From the given specifications find the order of the filter.
2. Find the normalized transfer function from the value of N.
3. Find the cutoff frequency $\Omega_c$.
4. Find the transfer function Ha(s) for the above value of $\Omega_c$ by substituting $s \to s/\Omega_c$.
6. State the equation for finding the poles of a Chebyshev filter.
$s_k = a\cos\varphi_k + j\,b\sin\varphi_k$, where $\varphi_k = \pi/2 + (2k-1)\pi/(2N)$, k = 1, 2, ..., N
7. State the steps to design a digital IIR filter using the bilinear method.
Substitute $s = \frac{2}{T}\,\frac{z-1}{z+1}$ in H(s) to get H(z), where $T = \frac{2}{\Omega}\tan(\omega/2)$.
8. What is the warping effect?
For small values of $\omega$ there exists a linear relationship between $\omega$ and $\Omega$, but for larger values of $\omega$ the relationship is nonlinear. This introduces distortion in the frequency axis. This effect compresses the magnitude and phase response and is called the warping effect.
9. Write a note on prewarping.
The effect of the nonlinear compression at high frequencies can be compensated. When the desired magnitude response is piecewise constant over frequency, this compression can be compensated by introducing a suitable rescaling, or prewarping, of the critical frequencies.
10. Give the bilinear transform equation between the s plane and the z plane.
$$s = \frac{2}{T}\,\frac{z-1}{z+1} = \frac{2}{T}\,\frac{1-z^{-1}}{1+z^{-1}}$$
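A short numerical sketch of the warping relation may help here. It assumes the standard frequency mapping Omega = (2/T) tan(omega/2) implied by the bilinear transform; the function names and test values are illustrative.

```python
import cmath
import math

def prewarp(omega_digital, T):
    """Analog frequency that the bilinear transform maps to omega_digital."""
    return (2.0 / T) * math.tan(omega_digital / 2.0)

def bilinear_s_to_z(s, T):
    """Invert s = (2/T)(z - 1)/(z + 1) to get z = (1 + sT/2) / (1 - sT/2)."""
    return (1 + s * T / 2) / (1 - s * T / 2)

T = 1.0                       # sampling period (illustrative)
omega = math.pi / 2           # desired digital critical frequency, rad/sample
Omega = prewarp(omega, T)     # prewarped analog frequency, rad/s
z = bilinear_s_to_z(1j * Omega, T)
```

The point s = j Omega on the imaginary axis lands on the unit circle at angle omega, which is why designing the analog prototype at the prewarped frequency puts the digital filter's critical frequency where it was specified.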
11. Why is the impulse invariant method not preferred in the design of IIR filters other than low pass filters?
In this method the mapping from the s plane to the z plane is many to one. Thus there are an infinite number of poles that map to the same location in the z plane, producing an aliasing effect. It is therefore inappropriate for designing high pass filters, and the method is not much preferred.
12. By the impulse invariant method obtain the digital filter transfer function, and the differential equation of the analog filter, for H(s) = 1/(s+1).
$H(z) = \dfrac{1}{1 - e^{-T}z^{-1}}$
$\dfrac{Y(s)}{X(s)} = \dfrac{1}{s+1}$
Cross multiplying and taking the inverse Laplace transform we get
$\dfrac{dy(t)}{dt} + y(t) = x(t)$
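The defining property of the method, that the digital impulse response equals the sampled analog impulse response, can be checked for this example. The sketch below assumes the unscaled form H(z) = 1/(1 - e^{-T} z^{-1}) and an illustrative sampling period; the function name is hypothetical.

```python
import math

def digital_impulse_response(T, n_samples):
    """h(n) of H(z) = 1/(1 - e^{-T} z^{-1}), from y(n) = e^{-T} y(n-1) + x(n)."""
    pole = math.exp(-T)
    h = []
    y_prev = 0.0
    for n in range(n_samples):
        x_n = 1.0 if n == 0 else 0.0    # unit impulse input
        y_n = pole * y_prev + x_n
        h.append(y_n)
        y_prev = y_n
    return h

T = 0.5   # sampling period in seconds (illustrative)
h_digital = digital_impulse_response(T, 5)

# Analog impulse response of 1/(s+1) is e^{-t}; sample it at t = nT.
h_analog_samples = [math.exp(-n * T) for n in range(5)]
```

The two lists agree sample for sample, which is exactly what "impulse invariant" means.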
13. What is meant by the impulse invariant method?
In this method of digitizing an analog filter, the impulse response of the resulting digital filter is a sampled version of the impulse response of the analog filter. For example, if the transfer function is of the form 1/(s-p), then
$H(z) = \dfrac{1}{1 - e^{pT}z^{-1}}$
14. What do you understand by backward difference?
One of the simplest methods of converting an analog filter to a digital filter is to approximate the differential equation by an equivalent difference equation:
$\left.\dfrac{dy(t)}{dt}\right|_{t=nT} \approx \dfrac{y(nT) - y(nT-T)}{T}$
15. What are the properties of the Chebyshev filter?
1. The magnitude response of the Chebyshev filter exhibits ripple either in the stop band or in the pass band.
2. The poles of this filter lie on an ellipse.
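16. Give the Butterworth filter transfer function and its magnitude characteristics for different orders of filter.
The transfer function of the Butterworth filter is given by
$H(j\Omega) = \dfrac{1}{1 + j(\Omega/\Omega_c)^N}$
17. Give the magnitude function of the Butterworth filter.
The magnitude function of the Butterworth filter is
$|H(j\Omega)| = \dfrac{1}{[1 + (\Omega/\Omega_c)^{2N}]^{1/2}}$, N = 1, 2, 3, 4, ...
18. Give the equations for the order N and the major and minor axes of the ellipse in the case of a Chebyshev filter.
The order is given by
$N \ge \dfrac{\cosh^{-1}\sqrt{(10^{0.1\alpha_s} - 1)/(10^{0.1\alpha_p} - 1)}}{\cosh^{-1}(\Omega_s/\Omega_p)}$
$a = \Omega_p\,\dfrac{\mu^{1/N} - \mu^{-1/N}}{2}$
$b = \Omega_p\,\dfrac{\mu^{1/N} + \mu^{-1/N}}{2}$
where $\mu = \varepsilon^{-1} + \sqrt{1 + \varepsilon^{-2}}$.
19. Give the expressions for the poles and zeros of a Chebyshev type 2 filter.
The zeros of a Chebyshev type 2 filter are $s_k = j\,\Omega_s/\sin\varphi_k$, k = 1, ..., N
The poles of this filter are $x_k + j\,y_k$, where
$x_k = \dfrac{\Omega_s\,\sigma_k}{\sigma_k^2 + \Omega_k^2}$
$y_k = \dfrac{\Omega_s\,\Omega_k}{\sigma_k^2 + \Omega_k^2}$, with $\sigma_k = a\cos\varphi_k$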
20. How can you design a digital filter from an analog filter?
A digital filter can be designed from an analog filter using the following methods:
1. Approximation of derivatives
2. Impulse invariant method
3. Bilinear transformation
21. Write down the bilinear transformation.
$s = \dfrac{2}{T}\,\dfrac{z-1}{z+1}$
22. List the Butterworth polynomials for various orders.
N | Denominator polynomial
1 | s+1
2 | s^2+1.414s+1
3 | (s+1)(s^2+s+1)
4 | (s^2+0.7654s+1)(s^2+1.8478s+1)
5 | (s+1)(s^2+0.618s+1)(s^2+1.618s+1)
6 | (s^2+1.9319s+1)(s^2+1.4142s+1)(s^2+0.5176s+1)
7 | (s+1)(s^2+1.802s+1)(s^2+1.247s+1)(s^2+0.445s+1)
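The coefficients in such a table follow from the pole positions of the normalized Butterworth filter, which lie equally spaced on the left half of the unit circle. The sketch below uses the closed form that results: each quadratic factor is s^2 + b_k s + 1 with b_k = 2 sin((2k-1)pi/(2N)), plus a factor (s+1) when N is odd. The function name is an assumption made for illustration.

```python
import math

def butterworth_factors(order):
    """Return (has_linear_factor, [b_k]) for the normalized Butterworth polynomial.

    Each b_k is the middle coefficient of a quadratic factor s^2 + b_k*s + 1.
    has_linear_factor is True when the polynomial also contains (s + 1).
    """
    quadratics = [2 * math.sin((2 * k - 1) * math.pi / (2 * order))
                  for k in range(1, order // 2 + 1)]
    return order % 2 == 1, quadratics

# Regenerate the middle coefficients for orders 1 to 7.
table = {n: butterworth_factors(n) for n in range(1, 8)}
```

For example, order 2 gives the single coefficient 1.4142 (the square root of 2) and order 6 gives 0.5176, 1.4142 and 1.9319.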
23. Differentiate between Butterworth and Chebyshev filters.
Damping factor: Butterworth 1.44, Chebyshev 1.06.
Response: Butterworth has a flat response, Chebyshev has a damped response.
24. What is a filter?
A filter is a frequency selective device which passes a particular range of frequencies and attenuates another range of frequencies.
25. What are the types of digital filter according to their impulse response?
IIR(Infinite impulse response )filter
FIR(Finite Impulse Response)filter.
26. How are phase distortion and delay distortion introduced?
Phase distortion is introduced when the phase characteristic of a filter is nonlinear within the desired frequency band.
Delay distortion is introduced when the delay is not constant within the desired frequency band.
27. Define IIR filter.
A filter designed by considering all the infinite samples of the impulse response is called an IIR filter.
28.What are the differences and similarities between DIF and DIT algorithms?
Differences:
1)The input is bit reversed while the output is in natural order for DIT, whereas for DIF the output is bit
reversed while the input is in natural order.
2)The DIF butterfly is slightly different from the DIT butterfly, the difference being that the complex
multiplication takes place after the add-subtract operation in DIF.
Similarities:
Both algorithms require same number of operations to compute the DFT.Both algorithms can
be done in place and both need to perform bit reversal at some place during the computation.
29. What are the different types of filters based on impulse response?
Based on impulse response, filters are of two types:
1. IIR filter
2. FIR filter
IIR filters are of recursive type, whereby the present output sample depends on the present input, past input samples and past output samples.
FIR filters are of non-recursive type, whereby the present output sample depends on the present input sample and previous input samples.
30. What is meant by an FIR filter?
A filter designed by selecting a finite number of samples of the impulse response (h(n) obtained from the inverse Fourier transform of the desired frequency response H(w)) is called an FIR filter.
31. Write the steps involved in FIR filter design.
Choose the desired frequency response Hd(w).
Take the inverse Fourier transform and obtain hd(n).
Convert the infinite duration sequence hd(n) to the finite duration sequence h(n).
Take the Z transform of h(n) to get H(Z).
32. What are the advantages of FIR filters?
Linear phase FIR filters can be easily designed.
Efficient realizations of FIR filters exist as both recursive and non-recursive structures.
FIR filters realized non-recursively are always stable.
The round off noise can be made small in the non-recursive realization of an FIR filter.
33. What are the disadvantages of FIR FILTER
The duration of impulse response should be large to realize sharp cutoff filters.
The non integral delay can lead to problems in some signal processing
applications.
34. What is the necessary and sufficient condition for the linear phase characteristic of an FIR filter?
The phase function should be a linear function of w, which in turn requires constant group delay and phase delay.
35. List the well known design techniques for linear phase FIR filter design.
Fourier series method and window method
Frequency sampling method
Optimal filter design method
36. For what kind of application can the antisymmetrical impulse response be used?
The antisymmetrical impulse response can be used to design Hilbert transformers and differentiators.
37. For what kind of application can the symmetrical impulse response be used?
A symmetric impulse response with an odd number of samples can be used to design all types of filters, i.e. lowpass, highpass, bandpass and band reject.
A symmetric impulse response with an even number of samples can be used to design lowpass and bandpass filters.
38.What is the reason that FIR filter is always stable?
FIR filter is always stable because all its poles are at the origin.
39. What conditions are to be imposed on the FIR sequence h(n) in order that the filter can be called a linear phase filter?
The conditions are
(i) Symmetric condition h(n) = h(N-1-n)
(ii) Antisymmetric condition h(n) = -h(N-1-n)
40. Under what conditions will a finite duration sequence h(n) yield constant group delay in its frequency response characteristics and not constant phase delay?
If the impulse response is antisymmetrical, satisfying the condition
h(n) = -h(N-1-n)
the frequency response of the FIR filter will have constant group delay and not constant phase delay.
41. State the conditions for a digital filter to be causal and stable.
A digital filter is causal if its impulse response h(n) = 0 for n < 0.
A digital filter is stable if its impulse response is absolutely summable, i.e. $\sum_{n=-\infty}^{\infty} |h(n)| < \infty$.

42. What are the properties of FIR filters?
1. An FIR filter is always stable.
2. A realizable filter can always be obtained.
3. An FIR filter can have a linear phase response.
43. When is the cascade form realization preferred in FIR filters?
The cascade form realization is preferred when the filter has complex zeros with absolute magnitude less than one.
44. What are the disadvantages of the Fourier series method?
In designing an FIR filter using the Fourier series method, the infinite duration impulse response is truncated at n = ±(N-1)/2. Direct truncation of the series leads to fixed percentage overshoots and undershoots before and after an approximated discontinuity in the frequency response.

45. What is Gibbs phenomenon? OR What are Gibbs oscillations?
One possible way of finding an FIR filter that approximates H(e^jw) would be to truncate the infinite Fourier series at n = ±(N-1)/2. Abrupt truncation of the series leads to oscillations both in the pass band and in the stop band. This phenomenon is known as Gibbs phenomenon.
46. What are the desirable characteristics of the windows?
The desirable characteristics of the window are:
1. The central lobe of the frequency response of the window should contain most of the energy and should be narrow.
2. The highest side lobe level of the frequency response should be small.
3. The side lobes of the frequency response should decrease in energy rapidly.
47. Compare the Hamming window with the Kaiser window.
No. | Hamming window | Kaiser window
1 | The main lobe width is equal to $8\pi/N$ and the peak side lobe level is -41 dB. | The main lobe width and the peak side lobe level can be varied by varying the parameter $\alpha$ and N.
2 | The low pass FIR filter designed will have a first side lobe peak of -53 dB. | The side lobe peak can be varied by varying the parameter $\alpha$.
48. What is the necessary and sufficient condition for linear phase characteristics in an FIR filter?
The necessary and sufficient condition for linear phase characteristics in an FIR filter is that the impulse response h(n) of the system should have the symmetry property, i.e.
h(n) = h(N-1-n)
where N is the duration of the sequence.
49. What are the advantages of the Kaiser window?
1. It provides flexibility for the designer to select the side lobe level and N.
2. It has the attractive property that the side lobe level can be varied continuously from the low value in the Blackman window to the high value in the rectangular window.
50. What is the principle of designing FIR filter using frequency sampling method?
In frequency sampling method the desired magnitude response is sampled and a linear
phase response is specified .The samples of desired frequency response are defined as
DFT coefficients. The filter coefficients are then determined as the IDFT of this set of
samples.
51. For what type of filters frequency sampling method is suitable?
Frequency sampling method is attractive for narrow band frequency selective
filters where only a few of the samples of the frequency response are non-zero.

52. Distinguish between IIR and FIR filters.
No. | FIR | IIR
1 | Impulse response is finite | Impulse response is infinite
2 | They have perfect linear phase | They do not have perfect linear phase
3 | Non recursive | Recursive
4 | Greater flexibility to control the shape of the magnitude response | Less flexibility

53. Distinguish between analog and digital filters.
No. | Analog | Digital
1 | Constructed using active or passive components and described by a differential equation | Consists of elements like adders, subtractors and delay units and described by a difference equation
2 | Frequency response can be changed by changing the components | Frequency response can be changed by changing the filter coefficients
3 | Processes analog input and generates analog output | Processes digital input and generates digital output
4 | Output varies due to external conditions | Not influenced by external conditions

54. What is the DIT algorithm?
The Decimation-In-Time algorithm is used to calculate the DFT of an N-point sequence. The idea is to break the N-point sequence into two sequences, the DFTs of which can be combined to give the DFT of the original N-point sequence. This algorithm is called DIT because the time sequence x(n) is repeatedly split into smaller sub-sequences.
55. What is the DIF algorithm?
It is a popular form of the FFT algorithm. In this the output sequence X(k) is divided into smaller and smaller sub-sequences, hence the name Decimation In Frequency.
56. What are the applications of the FFT algorithm?
The applications of the FFT algorithm include
1) Linear filtering
2) Correlation
3) Spectrum analysis
57. Why are the computations in the FFT algorithm said to be in place?
Once the butterfly operation is performed on a pair of complex numbers (a,b) to produce (A,B), there is no need to save the input pair. We can store the result (A,B) in the same locations as (a,b). Since the same storage locations are used throughout the computation, we say that the computations are done in place.

PART-B
1. Draw the structures of IIR filters.
Ans: Direct form I
Direct form II
Cascade structure
Lattice-ladder structure
Parallel structure

2. Derive the equation for designing an IIR filter using the backward difference method.
Ans: Mapping from the s plane to the z plane. The derivative is approximated by
$\left.\dfrac{dy(t)}{dt}\right|_{t=nT} \approx \dfrac{y(nT) - y(nT-T)}{T}$
which gives the mapping $s = \dfrac{1 - z^{-1}}{T}$.

3. Derive the equation for designing an IIR filter using the impulse invariant method.
Ans: Mapping from the s plane to the z plane.
Impulse invariant equation

4. Derive the equation for designing an IIR filter using the bilinear transformation.
Ans: Mapping from the s plane to the z plane.
Bilinear equation: $s = \dfrac{2}{T}\,\dfrac{1 - z^{-1}}{1 + z^{-1}}$
5. Explain the design procedure for a Butterworth low pass filter.
1. From the given specifications find the order of the filter N.
2. Round it up to the next higher integer.
3. Find the normalized transfer function H(s).
4. Calculate the value of the cutoff frequency.
5. Find the transfer function Ha(s).

6. Explain the design procedure for a Chebyshev low pass filter.
1. From the given specifications find the order of the filter N.
2. Round it up to the next higher integer.
3. Find the values of a and b, which are the minor and major axes of the ellipse.
4. Calculate the poles of the Chebyshev filter, which lie on an ellipse.
5. Find the denominator of the transfer function using the above poles.
6. The numerator of the transfer function depends on the value of N.

7. Give the design procedure of the impulse invariant method.
1. For the given specifications, find Ha(s), the transfer function of an analog filter.
2. Select the sampling rate of the digital filter, T seconds per sample.
3. Express the analog filter transfer function as the sum of single pole filters.
4. Compute the z-transform of the digital filter.

8. Discuss briefly the design procedure of the bilinear transform method.
1. From the given specifications, find the prewarped analog frequencies.
2. Using the analog frequencies, find H(s) of the analog filter.
3. Select the sampling rate of the digital filter, call it T seconds per sample.
4. Substitute the bilinear expression for s into the transfer function found in step 2.

9. Design an ideal low pass filter with a frequency response
Hd(e^jw) = 1 for $-\pi/2 \le \omega \le \pi/2$
Hd(e^jw) = 0 otherwise
Find the value of h(n) for N=11, find H(Z) and plot the magnitude response.
a. Find h(n) by the IDTFT.
b. Convert h(n) into a finite length sequence by truncation.
c. h(0) = 1/2
h(1) = h(-1) = 0.3183
h(2) = h(-2) = 0
h(3) = h(-3) = -0.106
h(4) = h(-4) = 0
h(5) = h(-5) = 0.06366
d. Find the transfer function H(Z), which is not realizable; convert it into a realizable one by multiplying by $z^{-(N-1)/2}$.
e. The H(Z) obtained is $0.06366 - 0.106z^{-2} + 0.3183z^{-4} + 0.5z^{-5} + 0.3183z^{-6} - 0.106z^{-8} + 0.06366z^{-10}$
f. Find H(e^jw) and plot the amplitude response curve.
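The coefficients of this design can be reproduced with a short Python sketch of the Fourier series (rectangular window) method. It assumes the ideal low pass response with cutoff wc, for which hd(n) = sin(wc n)/(pi n) and hd(0) = wc/pi; the function name is an illustrative assumption.

```python
import math

def ideal_lowpass_fir(num_taps, cutoff):
    """Truncated and delayed ideal low pass impulse response (num_taps odd).

    Index i of the returned list holds hd(i - (num_taps - 1)/2), i.e. the
    coefficient of z^{-i} in the realizable H(z).
    """
    centre = (num_taps - 1) // 2
    h = []
    for i in range(num_taps):
        m = i - centre
        if m == 0:
            h.append(cutoff / math.pi)
        else:
            h.append(math.sin(cutoff * m) / (math.pi * m))
    return h

h = ideal_lowpass_fir(11, math.pi / 2)
```

The list is symmetric about its centre tap, h(n) = h(N-1-n), so the resulting filter has linear phase with a delay of (N-1)/2 = 5 samples.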

10. Design an ideal high pass filter with a frequency response
Hd(e^jw) = 1 for $\pi/4 \le |\omega| \le \pi$
Hd(e^jw) = 0 otherwise
Find the value of h(n) for N=11, find H(Z) and plot the magnitude response.
a. Find h(n) by the IDTFT.
b. Convert h(n) into a finite length sequence by truncation.
c. h(0) = 0.75
h(1) = h(-1) = -0.22
h(2) = h(-2) = -0.159
h(3) = h(-3) = -0.075
h(4) = h(-4) = 0
h(5) = h(-5) = 0.045
d. Find the transfer function H(Z), which is not realizable; convert it into a realizable one by multiplying by $z^{-(N-1)/2}$.
e. The H(Z) obtained is $0.045 - 0.075z^{-2} - 0.159z^{-3} - 0.22z^{-4} + 0.75z^{-5} - 0.22z^{-6} - 0.159z^{-7} - 0.075z^{-8} + 0.045z^{-10}$
f. Find H(e^jw) and plot the amplitude response curve.

11. Derive the condition of FIR filter to be linear in phase.


Conditions are Group delay and Phase delay should be constant
And show the condition is satisfied
12. Explain the steps involved in the design of FIR filter using Kaiser window .
Ans: Determine hd(n)
Choose
Calculate
Determine the parameter s
Choose the parameter
Find N
Calculate the window function wk(n)
Find h(n)=wk(n) hd(n)
13. Draw the direct form realization of an FIR system. Draw the direct form realization of a linear phase FIR system for N even. Draw the direct form realization of a linear phase FIR system for N odd.

14. Draw the structures of FIR filters.
Ans: Direct form
Cascade structure
Parallel structure
Lattice structure

15. Design a band pass filter with a frequency response
Hd(e^jw) = 1 for $\pi/3 \le |\omega| \le 2\pi/3$
Hd(e^jw) = 0 otherwise
Find the value of h(n) for N=11, find H(Z) and plot the magnitude response.
a. Find h(n) by the IDTFT.
b. Convert h(n) into a finite length sequence by truncation.
c. Find the transfer function H(Z), which is not realizable; convert it into a realizable one by multiplying by $z^{-(N-1)/2}$.
d. From the H(Z) obtained, find H(e^jw) and plot the amplitude response curve.
16. Design a band reject filter with a frequency response
Hd(e^jw) = 1 for $\pi/4 \le |\omega| \le 3\pi/4$
Hd(e^jw) = 0 otherwise
Find the value of h(n) for N=11, find H(Z) and plot the magnitude response.
a. Find h(n) by the IDTFT.
b. Convert h(n) into a finite length sequence by truncation.
c. Find the transfer function H(Z), which is not realizable; convert it into a realizable one by multiplying by $z^{-(N-1)/2}$.
d. From the H(Z) obtained, find H(e^jw) and plot the amplitude response curve.

UNIT III
UNIT III FINITE IMPULSE RESPONSE DIGITAL FILTERS
9
Symmetric and Antisymmetric FIR filters - Linear phase FIR filters - Design using Hamming, Hanning and Blackman Windows - Frequency sampling method - Realization of FIR filters - Transversal, Linear phase and Polyphase structures.

PART-A
1. Define white noise?
A stationary random process is said to be white noise if its power density
spectrum is constant. Hence the white noise has flat frequency response spectrum.
2. What do you understand by a fixed-point number?
In fixed point arithmetic the position of the binary point is fixed. The bit to the right
represent the fractional part of the number & those to the left represent the integer part.
For example, the binary number 01.1100 has the value 1.75 in decimal.
3. What is meant by block floating point representation? What are its advantages?
In block floating point arithmetic the set of signals to be handled is divided into blocks. Each
block has the same value for the exponent. The arithmetic operations within the block
use fixed point arithmetic and only one exponent per block is stored, thus saving memory.
This representation of numbers is more suitable in certain FFT flow graphs and in digital
audio applications.
4. What are the advantages of floating point arithmetic?
1. Large dynamic range
2. Overflow in floating point representation is unlikely.

5. How the multiplication & addition are carried out in floating point arithmetic?
In floating point arithmetic, multiplication is carried out as follows.
Let f1 = M1*2^{c1} and f2 = M2*2^{c2}. Then f3 = f1*f2 = (M1*M2)*2^{(c1+c2)}.
That is, the mantissas are multiplied using fixed-point arithmetic and the exponents are
added.
The sum of two floating-point numbers is carried out by shifting the bits of the mantissa
of the smaller number to the right until the exponents of the two numbers are equal and
then adding the mantissas.
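A minimal sketch of these two rules, assuming Python's math.frexp is used to split a number into a mantissa M (with 1/2 <= |M| < 1) and an exponent c. The helper names fp_multiply and fp_add are hypothetical, and the mantissa arithmetic here is done in ordinary double precision rather than in a fixed-point register.

```python
import math

def fp_multiply(f1, f2):
    """Multiply the mantissas and add the exponents."""
    m1, c1 = math.frexp(f1)          # f1 = m1 * 2**c1
    m2, c2 = math.frexp(f2)          # f2 = m2 * 2**c2
    return math.ldexp(m1 * m2, c1 + c2)

def fp_add(f1, f2):
    """Align the exponents by right-shifting the smaller number, then add mantissas."""
    m1, c1 = math.frexp(f1)
    m2, c2 = math.frexp(f2)
    if c1 < c2:                      # make (m1, c1) the number with the larger exponent
        m1, c1, m2, c2 = m2, c2, m1, c1
    m2_aligned = m2 / 2 ** (c1 - c2) # right shift by the exponent difference
    return math.ldexp(m1 + m2_aligned, c1)

print(fp_multiply(1.5, 0.25))
print(fp_add(1.5, 0.25))
```

In a real b-bit mantissa the right shift in fp_add discards low-order bits, which is where the addition round-off error of floating point arithmetic comes from.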

6. What do you understand by input quantization error?


In digital signal processing, the continuous time input signals are converted into digital
form using a b-bit ADC. The representation of a continuous signal amplitude by a fixed number of digits
produces an error, which is known as input quantization error.

7. What is the relationship between truncation error e and the bits b for representing a
decimal into binary?
For a 2's complement representation, the error due to truncation for both positive and
negative values of x is 0 >= x_t - x > -2^{-b},
where b is the number of bits and x_t is the truncated value of x.
The same inequality holds for sign magnitude and 1's complement representations if x > 0.
If x < 0, then for sign magnitude and for 1's complement the truncation error satisfies
0 <= x_t - x < 2^{-b}.
8. What is meant by rounding? Discuss its effect on all types of number representation.
Rounding a number to b bits is accomplished by choosing the rounded result as the b
bit number closest to the original unrounded number.
For fixed point arithmetic, the error made by rounding a number to b bits satisfies the
inequality -2^{-b}/2 <= x_t - x <= 2^{-b}/2
for all three types of number systems, i.e., 2's complement, 1's complement and sign
magnitude.
For floating point numbers, the relative error ε = (x_t - x)/x made by rounding the mantissa to b bits satisfies the
inequality -2^{-b} <= ε <= 2^{-b}.
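The fixed point error ranges can be checked numerically. The sketch below is an illustration only: it assumes b fractional bits, models 2's complement truncation as rounding toward minus infinity, and uses the hypothetical helper names truncate and round_to_bits. The expected ranges are -2^{-b} < e <= 0 for truncation and |e| <= 2^{-b}/2 for rounding.

```python
import math

def truncate(x, b):
    """2's complement truncation to b fractional bits (rounds toward -infinity)."""
    return math.floor(x * 2 ** b) / 2 ** b

def round_to_bits(x, b):
    """Round to the nearest multiple of 2**-b."""
    return math.floor(x * 2 ** b + 0.5) / 2 ** b

b = 4
errors = []
for x in (0.3, -0.3, 0.71875, -0.9):
    et = truncate(x, b) - x          # truncation error
    er = round_to_bits(x, b) - x     # rounding error
    errors.append((x, et, er))
    print(f"x = {x:+.5f}  truncation error = {et:+.5f}  rounding error = {er:+.5f}")
```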

9. What is meant by A/D conversion noise?


A DSP system contains an A/D converter that operates on the analog input x(t) to
produce xq(n), which is a binary sequence of 0s and 1s.
At first the signal x(t) is sampled at regular intervals to produce a sequence x(n) of
infinite precision. Each sample x(n) is then expressed in terms of a finite number of bits, giving
the sequence xq(n). The difference signal e(n) = xq(n) - x(n) is called A/D conversion noise.
10. What is the effect of quantization on pole location?
Quantization of the coefficients in digital filters leads to slight changes in their values. These
changes in the filter coefficients modify the pole-zero locations. Sometimes the pole
locations are changed in such a way that the system is driven into instability.
11. Which realization is less sensitive to the process of quantization?
Cascade form.
12. What is meant by quantization step size?
Let us assume a sinusoidal signal varying between +1 and -1, having a dynamic range of
2. If the ADC used to convert the sinusoidal signal employs b+1 bits including the sign bit,
the number of levels available for quantizing x(n) is 2^{b+1}. Thus the interval between
successive levels is q = 2/2^{b+1} = 2^{-b},
where q is known as the quantization step size.
14. How would you relate the steady-state noise power due to quantization and the b bits
representing the binary sequence?
Steady state noise power σ_e^2 = 2^{-2b}/12,
where b is the number of bits excluding the sign bit.
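The relation between noise power and word length can be checked by simulation. The sketch below assumes the usual model of a rounding quantizer with step q = 2^{-b} acting on an input uniformly distributed in [-1, 1), for which the error variance is q^2/12 = 2^{-2b}/12. The function name quantization_noise_power and the sample count are hypothetical choices for illustration.

```python
import math
import random

def quantization_noise_power(b, num_samples=20000, seed=0):
    """Mean squared rounding error for a quantizer with step q = 2**-b."""
    rng = random.Random(seed)
    q = 2.0 ** -b
    total = 0.0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        xq = math.floor(x / q + 0.5) * q      # round to the nearest level
        total += (xq - x) ** 2
    return total / num_samples

b = 8
measured = quantization_noise_power(b)
predicted = 2.0 ** (-2 * b) / 12
print("measured :", measured)
print("predicted:", predicted)
```

Each extra bit halves q, so the predicted noise power drops by a factor of 4 (about 6 dB) per bit.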
15. What is overflow oscillation?
The addition of two fixed-point numbers causes overflow when the sum exceeds
the word size available to store it. This overflow caused by the adder makes the filter
output oscillate between the maximum amplitude limits. Such limit cycles are
referred to as overflow oscillations.
16. What are the methods used to prevent overflow?
There are two methods used to prevent overflow
1. Saturation arithmetic 2. Scaling
17. What are the two kinds of limit cycle behavior in DSP?
1. Zero input limit cycle oscillations
2. Overflow limit cycle oscillations
18. What is meant by "dead band" of the filter
The limit cycle occurs as a result of quantization effects in multiplication. The
amplitudes of the output during a limit cycle are confined to a range of values called the
dead band of the filter.
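A small simulation shows a zero input limit cycle and its dead band. The sketch assumes a first-order filter y(n) = Q[a*y(n-1)] with zero input, where Q rounds the product to the nearest integer (amplitudes are counted in units of the quantization step) and a = 0.9 is a hypothetical coefficient. It also uses the standard first-order bound |y| <= (1/2)/(1 - |a|) for the dead band, which these notes do not derive.

```python
from fractions import Fraction
import math

def zero_input_response(a, y_start, steps):
    """y(n) = round(a * y(n-1)) with no input; amplitudes in units of the step size."""
    y = y_start
    out = []
    for _ in range(steps):
        y = math.floor(a * y + Fraction(1, 2))   # round the product to the nearest integer
        out.append(y)
    return out

a = Fraction(9, 10)                               # exact coefficient, avoids float rounding
y = zero_input_response(a, 10, 20)
dead_band = Fraction(1, 2) / (1 - abs(a))         # standard bound for a first-order section
print(y)
print("dead band limit:", dead_band)
```

With infinite precision the output would decay to zero. With rounding it stops decaying once it reaches the dead band and stays at a constant non-zero value.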
19. Explain briefly the need for scaling in the digital filter implementation.
To prevent overflow, the signal level at certain points in the digital filter must be
scaled so that no overflow occurs in the adder.
20. What is meant by 1's complement form?
In 1's complement form the positive number is represented as in the sign magnitude form. To obtain
the negative of the positive number, complement all the bits of the positive number.
21. What is meant by 2's complement form?
In 2's complement form the positive number is represented as in the sign magnitude form. To obtain
the negative of the positive number, complement all the bits of the positive number and add 1 to the
LSB.
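The two rules can be illustrated on small integers. This is a sketch only: the helper names ones_complement and twos_complement are hypothetical, and integers are used instead of binary fractions to keep the bit patterns easy to read.

```python
def ones_complement(value, bits):
    """Bit pattern of value in 1's complement: negatives invert every bit of |value|."""
    mask = (1 << bits) - 1
    if value >= 0:
        return format(value, f"0{bits}b")
    return format(~(-value) & mask, f"0{bits}b")

def twos_complement(value, bits):
    """Bit pattern of value in 2's complement: invert every bit of |value| and add 1."""
    mask = (1 << bits) - 1
    return format(value & mask, f"0{bits}b")

print(ones_complement(5, 4), ones_complement(-5, 4))
print(twos_complement(5, 4), twos_complement(-5, 4))
```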
22. What are the two types of limit cycle behavior of DSP?.
1.Zero limit cycle behavior 2.Over flow limit cycle behavior

23. What are the methods to prevent overflow?


1. Saturation arithmetic 2. Scaling
24. State some applications of DSP?
Speech processing ,Image processing, Radar signal processing.
25. What is meant by floating point representation?
In floating point form the positive number is represented as F = 2^C * M, where M, the mantissa, is a fraction such
that 1/2 <= M < 1, and C, the exponent, can be either positive or negative.

26. What are the three-quantization errors to finite word length registers in digital filters?
1. Input quantization error 2. Coefficient quantization error 3. Product quantization error

PART-B

1. Consider the transfer function H(z) = H1(z)H2(z), where H1(z) = 1/(1 - a1 z^{-1}) and
H2(z) = 1/(1 - a2 z^{-1}).
Find the output round-off noise power. Assume a1 = 0.5 and a2 = 0.6.
Draw the round-off noise model.
By using the residue method find σ_01^2.
By using the residue method find σ_02^2.
Ans:__________________
12
2. What is meant by A/D conversion noise? Explain in detail?
A DSP system contains an A/D converter that operates on the analog input x(t) to
produce xq(n), which is a binary sequence of 0s and 1s.
At first the signal x(t) is sampled at regular intervals to produce a sequence x(n) of
infinite precision. Each sample x(n) is then expressed in terms of a finite number of bits, giving
the sequence xq(n). The difference signal e(n) = xq(n) - x(n) is called A/D conversion noise.
+ derivation.
3. Consider the transfer function H(z) = H1(z)H2(z), where H1(z) = 1/(1 - a1 z^{-1}) and
H2(z) = 1/(1 - a2 z^{-1}).
Find the output round-off noise power. Assume a1 = 0.7 and a2 = 0.8.
Draw the round-off noise model.
By using the residue method find σ_01^2.
By using the residue method find σ_02^2.

4. Derive the expression for steady state I/P Noise Variance and Steady state O/P
Noise Variance
Write the derivation.

5. Explain the different types of representations in digital systems.
Ans: Fixed point
Floating point
Block floating point

Fixed Point:
In fixed point arithmetic the position of the binary point is fixed. The bits to the right of the binary point
represent the fractional part of the number and those to the left represent the integer part. For example, the
binary number 01.1100 has the value 1.75 in decimal. Depending on the way negative numbers are
represented, there are three different forms of fixed-point arithmetic. They are 1. Sign magnitude 2. 1's
complement 3. 2's complement.
*fast operation
*relatively economical
*small dynamic range
*roundoff error occurs only for multiplication
*overflow occurs in addition
*used in small computers.

Floating point:
In floating point representation a positive number is represented as F = 2^c * M, where M is called the
mantissa and c, the exponent, can be either positive or negative. Negative floating point numbers are
generally represented by considering the mantissa as a fixed point number.
*slow operation
*more expensive because of costlier hardware
*increased dynamic range
*roundoff errors can occur with both addition and multiplication.
*overflow does not arise
*used in larger general purpose computers.
Block floating point
In block floating point arithmetic the set of signals to be handled is divided in to blocks.
Each block has the same value for the exponent. The arithmetic operations within the block use fixed
point arithmetic and only one exponent per block is stored thus saving memory. This representation of
numbers is most suitable in certain FFT flow graphs and in digital audio applications.

6. Discuss the different types of quantization errors.
Ans: Input quantization error
Product quantization error
Coefficient quantization error
The common methods of quantization are truncation and rounding. Truncation is the process of discarding
all bits less significant than the least significant bit that is retained. Rounding a number to b bits is
accomplished by choosing the rounded result as the b bit number closest to the original unrounded
number. The error due to truncation or rounding is known as quantization error.
The input quantization error is given by
e(n) = xq(n) - x(n)
xq(n) = sampled quantized value
x(n) = sampled unquantized value
For the output quantization error, the variance of a sum of independent random variables is the sum of
their variances, so if the quantization errors are assumed to be independent at different sampling instants, their contributions to the output noise variance add.
In fixed point arithmetic the product of two b bit numbers results in a number 2b bits long. In digital
signal processing applications, it is necessary to round this product to a b bit number, which produces an
error known as product quantization error.
In the design of a digital filter the coefficients are evaluated with infinite precision. But when they are
quantized, the frequency response of the actual filter deviates from that which would have been obtained
with an infinite word length representation and the filter may actually fail to meet the desired
specifications. If the poles of the desired filter are close to the unit circle, then those of the filter with
quantized coefficients may lie just outside the unit circle.

7. Explain the different types of limit cycle oscillations and also the solutions
Ans: Zero input limit cycle oscillations
Overflow limit cycle oscillations
Zero input limit cycle oscillations:
*This limit cycle oscillation has low amplitude.
*This limit cycle occurs when the input applied to the system is zero or very low.
Overflow limit cycle oscillations:
*This limit cycle occurs because of the overflow taking place in the implementation of
digital filters.
*In addition to the limit cycle oscillations caused by rounding the result of a multiplication,
there are several types of limit cycle oscillation caused by addition, which make the filter output oscillate
between maximum and minimum amplitudes.
Dead band:
It is the range of output amplitudes over which limit cycle oscillations take place.
Scaling:
To prevent overflow, the signal level at certain points in the digital filter must be scaled so
that no overflow occurs in the adder.
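The effect of overflow in a 2's complement adder, and the two remedies named in these notes (saturation arithmetic and scaling), can be sketched as follows. The function names wrap_add and saturating_add and the 8-bit word length are assumptions made for illustration.

```python
def wrap_add(a, b, bits):
    """2's complement addition: an overflowing sum wraps around and changes sign."""
    mask = (1 << bits) - 1
    s = (a + b) & mask
    return s - (1 << bits) if s >= 1 << (bits - 1) else s

def saturating_add(a, b, bits):
    """Saturation arithmetic: an overflowing sum is clipped to the largest or smallest value."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, a + b))

print(wrap_add(100, 100, 8))            # overflow: large positive sum turns negative
print(saturating_add(100, 100, 8))      # clipped at the positive limit
print(wrap_add(100 // 2, 100 // 2, 8))  # inputs scaled by 1/2: no overflow occurs
```

The sign flip produced by wrap_add is what sustains overflow oscillations in a recursive filter. Saturation limits the size of the error, while scaling avoids the overflow at the cost of reduced signal level.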

8. Explain the construction and operation of channel vocoder with block diagram
Ans: The channel vocoder is used in analysis-synthesis systems. A filter bank is used to separate the
frequency bands. There are about eight to ten filters. The amplitudes of the filter outputs are encoded by
level detectors and coders. In addition to this, pitch and voicing information are also sent along with them.
A wide band excitation signal is generated at the receiving end using the transmitted pitch and
voicing information. For a voiced signal, the excitation consists of a periodic signal with an appropriate
frequency.
For an unvoiced signal, however, the excitation is white noise. At the receiver end a matched filter
bank is available, so that the output level of each band matches the encoded value. Further, the individual outputs are
combined to produce the speech signal.


UNIT IV
UNIT IV FINITE WORD LENGTH EFFECTS
9
Fixed point and floating point number representations - Comparison - Truncation and Rounding
errors - Quantization noise - derivation for quantization noise power - coefficient quantization error - Product quantization error - Overflow error - Roundoff noise power - limit cycle oscillations due to
product roundoff and overflow errors - signal scaling

1. Define unbiased estimate and consistent estimate. (Apr 2007, Nov 2008)


Consistency is an asymptotic property: defining consistency requires considering arbitrarily large samples. An estimator is consistent if it converges to the true value of the parameter as the sample size n tends to infinity.
In real life, sample size will be limited by time or budget constraints. So it is natural to consider what
quality should be expected from an estimator based on samples of a fixed size. Then you would certainly
hope the central region of the distribution of the estimator to be close to the true value θ_0 of the parameter.
One way of expressing this idea is to consider estimators whose distribution mean is equal to θ_0, the true value of the parameter, for any
value of n. Such an estimator is said to be unbiased, and unbiasedness
translates into:
E[θ̂_n] = θ_0 for any n
2. What are the disadvantages of non-parametric methods of power spectral estimation?
(May 2007,Nov 2008)
It requires long data sequences to obtain the necessary frequency resolution.
Spectral leakage effects occur because of windowing.
The autocorrelation estimate rxx(m) is assumed to be zero for m >= N. This assumption limits
the frequency resolution and quality of the power spectrum estimate.
The data are assumed to be periodic with period N. This assumption may not be realistic.
3. What is periodogram? ( Apr/May 2008)
The periodogram is used to detect and measure hidden periodicities in the data.
It is computed as Pxx(f) = (1/N) |X(f)|^2, where X(f) is the Fourier transform of the N-point data record.
4. Define the terms autocorrelation sequence and power spectral density(Apr 2007)
If x(t) is a stationary random process, then its autocorrelation function is
given as
γxx(τ) = E[x*(t) x(t+τ)]
Here E[ ] denotes the statistical average.


5. Define power spectral density and cross spectral density. (May2007)


The power spectral density (PSD) describes how the power of a signal or time
series is distributed with frequency.
The PSD is the Fourier transform of the autocorrelation function, R(τ), of the
signal if the signal can be treated as a wide-sense stationary random process.
The power of the signal in a given frequency band can be calculated by integrating the PSD
over positive and negative frequencies.
The power spectral density of a signal exists if and only if the signal is a wide-sense stationary process.

Cross-spectral density
Just as the power spectral density (PSD) is the Fourier transform of the autocovariance function, we may define the cross spectral density (CSD) as the
Fourier transform of the cross-covariance function.
6. Explain deterministic and nondeterministic signals with examples. (Nov2006)
Deterministic signals are functions that are completely specified in time.
The nature and amplitude of such a signal at any time can be predicted.
The pattern of the signal is regular and can be characterized
mathematically.
Example: x(t) = αt. This is a ramp whose amplitude increases linearly with time and
whose slope is α. A non-deterministic signal is one whose occurrence is random in
nature and whose pattern is quite irregular. A typical example of a non-deterministic signal is thermal noise in
an electric circuit.
7. Explain the use of DFT in power spectrum estimate?
We know that the periodogram of the signal is given as
Pxx(f) = (1/N) |X(f)|^2 = (1/N) |Σ_{n=-∞}^{∞} x(n) e^{-j2πfn}|^2
The Fourier transform on the right hand side of the above equation can also be
evaluated using the DFT. The DFT contains N points. It is given as
Pxx(k/N) = (1/N) |Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N}|^2
Thus the periodogram is evaluated at the discrete frequencies
fk = k/N. The resolution of the spectrum can be increased by increasing the length of
the DFT.
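The DFT form of the periodogram can be sketched directly from the sum. The code below is a plain O(N^2) illustration using only the standard library (an FFT would normally be used). The function name periodogram and the test signal, a cosine with exactly 4 cycles in N = 32 samples, are assumptions for illustration.

```python
import cmath
import math

def periodogram(x):
    """Pxx(k/N) = (1/N) |sum_n x(n) exp(-j 2 pi k n / N)|^2 for k = 0 .. N-1."""
    N = len(x)
    P = []
    for k in range(N):
        X = sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        P.append(abs(X) ** 2 / N)
    return P

N = 32
x = [math.cos(2 * math.pi * 4 * n / N) for n in range(N)]
P = periodogram(x)
peak = max(range(N // 2 + 1), key=lambda k: P[k])
print("peak at k =", peak, "i.e. f =", peak / N)
```

The peak falls at k = 4, which is the frequency f = 4/32 of the cosine.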

8. Define autocorrelation.
If x(t) is a stationary random process, then its autocorrelation function is given
as γxx(τ) = E[x*(t) x(t+τ)]
Here E[ ] denotes the statistical average.
9. List the non-parametric methods for power spectral estimation.
Bartlett method
Welch method
Blackman and Tukey method

10. What are the steps involved in Bartlett method?


The N-point sequence is subdivided into K non-overlapping
segments. Each segment has length M, i.e.,
xi(n) = x(n + iM), i = 0, 1, ..., K-1, n = 0, 1, ..., M-1
Compute the periodogram of each segment independently, i.e.,
Pixx(f) = (1/M) |Σ_{n=0}^{M-1} xi(n) e^{-j2πfn}|^2
Take the average of the periodograms of all K segments to get the Bartlett power spectrum estimate
PBxx(f) = (1/K) Σ_{i=0}^{K-1} Pixx(f)
This equation gives the estimate of the power spectrum using the Bartlett method.
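The Bartlett steps can be sketched as below. This is an illustration under stated assumptions: the data are white Gaussian noise of unit variance (so the true spectrum is flat and equal to 1), N = 256, K = 8, and the function names periodogram, bartlett and variance are hypothetical. The DFT is computed directly to keep the sketch free of external libraries.

```python
import cmath
import math
import random

def periodogram(x):
    """(1/M) |DFT|^2 of an M-point segment, evaluated at f = k/M."""
    M = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / M) for n in range(M))) ** 2 / M
            for k in range(M)]

def bartlett(x, K):
    """Average the periodograms of K non-overlapping segments of length M = N // K."""
    M = len(x) // K
    acc = [0.0] * M
    for i in range(K):
        P = periodogram(x[i * M:(i + 1) * M])
        acc = [a + p for a, p in zip(acc, P)]
    return [a / K for a in acc]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(256)]   # white noise, true PSD = 1
raw = periodogram(x)                             # single periodogram of all N samples
avg = bartlett(x, K=8)                           # 8 segments of 32 samples
print("spread of raw periodogram :", variance(raw))
print("spread of Bartlett average:", variance(avg))
```

The averaged estimate fluctuates much less around the true level, but it has only M = 32 frequency points instead of 256, which is the loss of resolution traded for the lower variance.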
11. What are the steps involved in Welch method?
This method makes a few modifications to the Bartlett method. The three steps to calculate the periodogram are
as follows.
The N-point sequence is subdivided into L segments. These
segments overlap each other.
Each data segment is passed through a window and then its periodogram is
calculated.
The power density spectrum is then obtained by averaging the modified
periodograms.
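The Welch steps can be sketched as follows. The choices here are assumptions for illustration and are not taken from the notes: a Hamming window, segments of M = 32 samples with a hop of D = 16 (50 percent overlap), and normalisation of each modified periodogram by the window power U = (1/M) Σ w(n)^2 so that a white input of unit variance gives an estimate near 1. The function names are hypothetical.

```python
import cmath
import math
import random

def hamming(M):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (M - 1)) for n in range(M)]

def modified_periodogram(segment, window):
    """Windowed periodogram, normalised by the window power U."""
    M = len(segment)
    U = sum(w * w for w in window) / M
    xw = [s * w for s, w in zip(segment, window)]
    return [abs(sum(xw[n] * cmath.exp(-2j * math.pi * k * n / M) for n in range(M))) ** 2 / (M * U)
            for k in range(M)]

def welch(x, M, D, window):
    """Average the modified periodograms of overlapping segments (length M, hop D)."""
    starts = list(range(0, len(x) - M + 1, D))
    acc = [0.0] * M
    for s in starts:
        P = modified_periodogram(x[s:s + M], window)
        acc = [a + p for a, p in zip(acc, P)]
    return [a / len(starts) for a in acc], len(starts)

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(512)]    # white noise, true PSD = 1
psd, num_segments = welch(x, M=32, D=16, window=hamming(32))
print("segments averaged:", num_segments)
print("mean PSD level   :", sum(psd) / len(psd))
```

Overlapping gives more segments to average from the same record than the Bartlett method, and the window reduces leakage.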

12. Define Blackman and Tukey method.


Blackman and Tukey suggested a method in which less weight is
given to the end points of rxx(m), where the variance of the estimate is very high.
In this method, the autocorrelation sequence is first passed through a window w(m). This window
shapes rxx(m) in such a way that the weights of the end points are reduced.
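A minimal sketch of the Blackman and Tukey idea is given below. It assumes a biased autocorrelation estimate, a triangular (Bartlett) lag window and real data, for which the Fourier transform of the windowed autocorrelation reduces to a cosine sum. The names autocorr and blackman_tukey, the maximum lag of 16 and the test signal are hypothetical choices.

```python
import math

def autocorr(x, max_lag):
    """Biased estimate rxx(m) = (1/N) sum_n x(n) x(n+m), m = 0 .. max_lag."""
    N = len(x)
    return [sum(x[n] * x[n + m] for n in range(N - m)) / N for m in range(max_lag + 1)]

def blackman_tukey(x, max_lag, freqs):
    """Fourier transform of the windowed autocorrelation (real data, even symmetry in m)."""
    r = autocorr(x, max_lag)
    w = [1.0 - m / (max_lag + 1) for m in range(max_lag + 1)]   # triangular lag window
    return [r[0] * w[0] + 2.0 * sum(r[m] * w[m] * math.cos(2 * math.pi * f * m)
                                    for m in range(1, max_lag + 1))
            for f in freqs]

x = [math.cos(2 * math.pi * 0.125 * n) for n in range(128)]
freqs = [k / 64 for k in range(33)]                 # 0 to 0.5 cycles per sample
P = blackman_tukey(x, max_lag=16, freqs=freqs)
peak = max(range(len(P)), key=lambda k: P[k])
print("peak at f =", freqs[peak])
```

The large lags, where few products are averaged and rxx(m) is least reliable, receive the smallest weights.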
13.What is the objective of spectrum estimation?
The main objective of spectrum estimation is the determination of the power
spectral density of a random process. The estimated PSD provides information about the
structure of the random process which can be used for modeling, prediction or filtering of
the observed process.
14.Define white noise?
A stationary random process is said to be white noise if its power density
Spectrum is constant. Hence the white noise has flat frequency response spectrum.
15.Describe briefly the different methods of power spectral estimation?
1. Bartlett method
2. Welch method
3. Blackman-Tukey method
and its derivation

PART-B

1. (i) With suitable relations, describe briefly the periodogram method of power
spectral estimation. Examine the consistency and bias of periodogram.
(Nov 2008,May 2007)

Periodogram
The periodogram is an estimate of the spectral density of a signal. The term was coined by Arthur
Schuster in 1898[1] as in the following quote:[2]
THE PERIODOGRAM. It is convenient to have a word for some representation of a
variable quantity which shall correspond to the 'spectrum' of a luminous radiation. I
propose the word periodogram, and define it more particularly in the following
way:

where T may for convenience be chosen to be equal to some integer multiple of 2π/κ,
and plot a curve with 2π/κ as abscissae and r as ordinates; this curve, or, better, the space between this curve and the axis
of abscissae, represents the periodogram of f(t).
Note that the term periodogram may also be used to describe the quantity r^2,[3] which is its common
meaning in astronomy (as in "the modulus-squared of the discrete Fourier transform of the time series
(with the appropriate normalisation)"[4]). See Scargle (1982) for a detailed discussion in this context.[5]
A spectral plot refers to a smoothed version of the periodogram.[6][7] Smoothing is performed to reduce
the effect of measurement noise.
In practice, the periodogram is often computed from a finite-length digital sequence using the fast Fourier
transform (FFT). The raw periodogram is not a good spectral estimate because of spectral bias and the fact
that the variance at a given frequency does not decrease as the number of samples used in the computation
increases.
The spectral bias problem arises from a sharp truncation of the sequence, and can be reduced by first
multiplying the finite sequence by a window function which truncates the sequence gradually rather than
abruptly.
The variance problem can be reduced by smoothing the periodogram. Various techniques to reduce
spectral bias and variance are the subject of spectral estimation.
One such technique to solve the variance problem is known as the method of averaged
periodograms[8] or Bartlett's method. The idea behind it is to divide the set of N samples into L sets of
M samples, compute the DFT of each set, square its magnitude to get the power spectral density and compute the
average of all of them. This leads to a decrease in the standard deviation as 1/sqrt(L).

2. Explain the Bartlett method of averaging periodograms. (Nov 2007, Apr 2008)


Bartlett's method

The periodogram is still not a consistent estimate of the power spectrum!


Nevertheless, the periodogram is asymptotically unbiased
Hence if we can find a consistent estimate of the mean, then this estimate would also be a consistent estimate of
the power spectrum

Averaging (sample mean) a set of uncorrelated measurements of a random variable results in a consistent
estimate of its mean

In other words: Variance of the sample mean is inversely proportional to the number of measurements

Hence this should also work here, by averaging Periodograms


Averaging these Periodograms
This results in an asymptotically unbiased estimate of the power spectrum
Since we assume that the realizations are uncorrelated, it follows, that the variance is inversely proportional to
the number of measurements K
Hence this is a consistent estimate of the power spectrum, if L and K go to infinity
There is still a problem: we usually do not have uncorrelated data records!
Typically there is only one data record of length N available
Hence Bartlett proposes to partition the data record into K nonoverlapping sequences of length L, where N = K*L

The expected values of the periodograms of the subsequences are identical, hence averaging
the subsequence periodograms results in the same expected value => asymptotically unbiased
Note that the data length used for the periodograms is now L and no longer N, so the spectral resolution
becomes worse (this is the price we are paying)

Now we reap the reward: the variance goes to zero as the number of subsequences goes to infinity
If both K and L go to infinity, this will be a consistent estimate of the power spectrum
In addition, for a given N = K*L, we can trade off between good spectral resolution (large L) and reduction in
variance (large K)


[Figure: a) periodogram with N=512; b) ensemble average; c) overlay of 50 Bartlett estimates with K=4 and L=128; d) ensemble average; e) overlay of 50 Bartlett estimates with K=8 and L=64; f) ensemble average]

Refer book: Digital signal processing Proakis(pgno:910)


3. Explain about applications of the autocorrelation function of random signals. (Apr 2008)
Applications

One application of autocorrelation is the measurement of optical spectra and the measurement of very-short-duration light pulses produced by lasers, both using optical autocorrelators.

For measuring particle size distributions of very fine particles or micelles suspended in a fluid. A laser
shining into the mixture produces flicker, which correlates with the motion of the particles.
Autocorrelation of the signal gives a picture of the diffusion speeds of the particles. From this, knowing the
viscosity of the fluid, the sizes of the particles can be calculated.

In optics, normalized autocorrelations and cross-correlations give the degree of coherence of an
electromagnetic field.

In signal processing, autocorrelation can give information about repeating events like musical beats (for
example, to determine tempo) or pulsar frequencies, though it cannot tell the position in time of the beat.
It can also be used to estimate the pitch of a musical tone.

In music recording, autocorrelation is used as a pitch detection algorithm prior to vocal processing, as a
distortion effect or to eliminate undesired mistakes and inaccuracies.[6]

Autocorrelation in space rather than time, via the Patterson function, is used by X-ray diffractionists to
help recover the "Fourier phase information" on atom positions not available through diffraction alone.

In statistics, spatial autocorrelation between sample locations also helps one estimate mean value
uncertainties when sampling a heterogeneous population.

The SEQUEST algorithm for analyzing mass spectra makes use of autocorrelation in conjunction with cross-correlation to score the similarity of an observed spectrum to an idealized spectrum representing a
peptide.

In Astrophysics, auto-correlation is used to study and characterize the spatial distribution of galaxies in the
Universe and in multi-wavelength observations of Low Mass X-ray Binaries.

In panel data, spatial autocorrelation refers to correlation of a variable with itself through space.

4. Determine the performance characteristics of non-parametric power spectrum
estimators Blackman and Tukey. (Nov 2006 & May 2007)
Blackman-Tukey Correlogram and Cross-Spectrum
The correlogram constructs an estimate of the power spectrum using a windowed fast Fourier transform
(FFT) of the autocorrelation function of the time series. It was developed by Blackman and Tukey (1958)
and is based on the Wiener-Khinchin theorem, which states that if the Fourier transform of a series x(t)
is X(f), and if the autocorrelation function of the series is R, then the Fourier transform of R yields
PX(f) = |X(f)|^2, the power spectrum of x(t). The resulting power-spectrum estimate is called a correlogram.
An alternative that is not included in the Toolkit is direct or windowed FFT of the time series itself, called
a periodogram.
Both periodograms and correlograms are usually performed on weighted versions of the time series or
autocorrelation functions in order to reduce power leakage (artificially high power estimates at frequencies
away from the true peak frequencies). Press et al. (1989, pp. 423-424) note that "when we select a run of
N sampled points for periodogram spectral estimation, we are in effect multiplying an infinite run of ...
data ... by a window function in time, one which is zero except during the total sampling time [NDt], and
is unity during that time." The sharp edges of this window function contain much power at highest
frequencies, which is imparted to the windowed signal and leads to power leakage. A similar argument
can be made for correlograms. Weighting the data or correlation function by various tapered shapes (high
in center and falling off to sides) is an accepted traditional approach to reducing power leakage.
In the Blackman-Tukey approach PX(f) is estimated by

where rk is the autocorrelation estimate at lag k, M is the maximum lag considered and window length, and
wk is the windowing function. Several window shapes are available in the Toolkit: Bartlett (triangular),
Hamming (cosinusoidal), Hanning (slightly different cosinusoidal), and none.
You may find that the various windows of the same widths give similar results. The more important
choice is how wide the windows should be. The averaging associated with windowing a series reduces the
resolution of the methods, from the frequency intervals of 1/N, to a windowed frequency intervals of
about 1/M (e.g., Kay 1988, p. 81). Thus, wider windows yield higher spectral resolution, and vice versa.
However, there is a trade-off between higher resolution and increasing variance of the spectral estimate.
At the extreme, a single (M=N) direct application of FFT to an unwindowed time series results in a
periodogram with a theoretical standard deviation of the estimates equal to the estimates at each

frequency, regardless of the number of observations in the time series (Press et al. 1989, p. 423).
Averaging the results from many short data windows throughout the series (or autocorrelation) effectively
increases the number of independent samples used in estimation and thereby reduces the estimation
variance. Kay (1988, section 4.5) shows that the variance of a power spectrum obtained by a windowed
correlogram is 2M/3N of the estimated power at each frequency. Thus a narrower window should be used
to smooth the spectrum and reduce the sampling errors on the estimate. In practice, Kay (1988)
recommends that windows should be no more than one-fifth to one-tenth the total number of data
points (to obtain desired estimate-variance reductions) and not too much smaller (in order to retain the
ability to distinguish between powers at neighboring frequencies and to obtain the desired leakage
reductions).
Theoretical estimates of variance for Blackman-Tukey power spectra are available (e.g., Kay, 1988) and
the Toolkit provides error bars constructed from them. These can either be plotted about the estimates
themselves, or as a red-noise uncertainty interval. In the latter case, an AR(1) process is fitted to the data,
and the error bars are centered on the theoretical AR(1) spectrum.
As a "traditional" method, the correlogram is intended to provide a familiar benchmark against which the
other more modern methods provided in the Toolkit can be judged.
Cross-Spectra
Blackman-Tukey correlogram provides a straightforward way to compute the cross-power spectrum PXY
of the two input signals x(t) and y(t):

where X(f) and Y(f) are the correlogram estimates of the individual time series x(t) and y(t).
Cross-power spectrum can be used to estimate coherence between the two signals. However it requires
averaging of spectral estimates of independent realizations of x(t) and y(t). Multi-taper method provides a
practical way to compute Coherence by averaging the individual spectra given by each tapered version of
the data.
Multi-channel SSA is an advanced, data-adaptive method to analyze oscillatory spatio-temporal modes in
multivariate time series. In addition to identifying oscillatory peaks in the cross-spectrum, MSSA allows
reconstruction of the multivariate oscillatory modes.

5. Obtain the Estimation of autocorrelation


Autocorrelation
Autocorrelation is the cross-correlation of a signal with itself. Informally, it is the similarity between observations
as a function of the time separation between them. It is a mathematical tool for finding repeating patterns, such
as the presence of a periodic signal which has been buried under noise, or identifying the missing fundamental
frequency in a signal implied by its harmonic frequencies. It is often used in signal processing for analyzing
functions or series of values, such as time domain signals
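The use of autocorrelation to find a periodic signal buried in noise can be sketched as below. The example assumes the biased time-average estimate rxx(m) = (1/N) Σ x(n) x(n+m), a sine wave of period 8 samples, and additive Gaussian noise of standard deviation 0.3. The estimator form, the function name autocorr and all signal parameters are hypothetical choices for illustration.

```python
import math
import random

def autocorr(x, max_lag):
    """Biased estimate rxx(m) = (1/N) sum_n x(n) x(n+m), m = 0 .. max_lag."""
    N = len(x)
    return [sum(x[n] * x[n + m] for n in range(N - m)) / N for m in range(max_lag + 1)]

rng = random.Random(2)
period = 8
x = [math.sin(2 * math.pi * n / period) + rng.gauss(0.0, 0.3) for n in range(400)]

r = autocorr(x, max_lag=12)
# Search away from lag 0, where the estimate is always largest.
best_lag = max(range(4, 13), key=lambda m: r[m])
print("estimated period:", best_lag, "samples")
```

The estimate peaks again at a lag equal to the period because the sine wave lines up with its shifted copy there, while the noise contributes mainly at lag 0.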

Definitions

Different fields of study define autocorrelation differently, and not all of these definitions are equivalent.
In some fields, the term is used interchangeably with autocovariance.

Statistics
In statistics, the autocorrelation of a random process describes the correlation between values of the
process at different points in time, as a function of the two times or of the time difference. Let X be some
repeatable process, and i be some point in time after the start of that process. (i may be an integer for a
discrete-time process or a real number for a continuous-time process.) Then Xi is the value (or realization)
produced by a given run of the process at time i. Suppose that the process is further known to have defined
values for mean $\mu_i$ and variance $\sigma_i^2$ for all times i. Then the definition of the autocorrelation between any
two times s and t is

$$R(s,t) = \frac{E[(X_t - \mu_t)(X_s - \mu_s)]}{\sigma_t\,\sigma_s}$$

where "E" is the expected value operator. Note that this expression is not well-defined for all time series
or processes, because the variance may be zero (for a constant process) or infinite. If the function R is
well-defined, its value must lie in the range [-1, 1], with 1 indicating perfect correlation and -1 indicating
perfect anti-correlation.
If $X_t$ is a second-order stationary process then the mean $\mu$ and the variance $\sigma^2$ are time-independent, and
further the autocorrelation depends only on the difference between t and s: the correlation depends only on
the time-distance between the pair of values but not on their position in time. This further implies that the
autocorrelation can be expressed as a function of the time-lag, and that this would be an even function of
the lag $\tau = s - t$.
It is common practice in some disciplines, other than statistics and time series analysis, to drop the
normalization by $\sigma^2$ and use the term "autocorrelation" interchangeably with "autocovariance". However,
the normalization is important both because the interpretation of the autocorrelation as a correlation
provides a scale-free measure of the strength of statistical dependence, and because the normalization has
an effect on the statistical properties of the estimated autocorrelations.

Signal processing

In signal processing, the above definition is often used without the normalization, that is, without
subtracting the mean and dividing by the variance. When the autocorrelation function is normalized by
mean and variance, it is sometimes referred to as the autocorrelation coefficient.[1]
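Given a signal f(t), the continuous autocorrelation $R_{ff}(\tau)$ is most often defined as the continuous
cross-correlation integral of f(t) with itself, at lag $\tau$:

$$R_{ff}(\tau) = \bar{f}(-\tau) * f(\tau) = \int_{-\infty}^{\infty} f(t+\tau)\,\bar{f}(t)\,dt = \int_{-\infty}^{\infty} f(t)\,\bar{f}(t-\tau)\,dt$$

where $\bar{f}$ represents the complex conjugate and * represents convolution. For a real function, $\bar{f} = f$.
The discrete autocorrelation at lag j for a discrete signal $x_n$ is $R_{xx}(j) = \sum_n x_n\,\bar{x}_{n-j}$.
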
The above definitions work for signals that are square integrable, or square summable, that is, of finite
energy. Signals that "last forever" are treated instead as random processes, in which case different
definitions are needed, based on expected values. For wide-sense-stationary random processes, the
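autocorrelations are defined as

$$R_{ff}(\tau) = E[f(t)\,\bar{f}(t-\tau)], \qquad R_{xx}(j) = E[x_n\,\bar{x}_{n-j}]$$

For processes that are not stationary, these will also be functions of t, or n.
For processes that are also ergodic, the expectation can be replaced by the limit of a time average. The
autocorrelation of an ergodic process is sometimes defined as or equated to[1]

$$R_{ff}(\tau) = \lim_{T\to\infty} \frac{1}{T}\int_{0}^{T} f(t+\tau)\,\bar{f}(t)\,dt, \qquad R_{xx}(j) = \lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} x_n\,\bar{x}_{n-j}$$
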
These definitions have the advantage that they give sensible well-defined single-parameter results
for periodic functions, even when those functions are not the output of stationary ergodic processes.

Alternatively, signals that last forever can be treated by a short-time autocorrelation function analysis,
using finite time integrals. (See short-time Fourier transform for a related process.)
Multi-dimensional autocorrelation is defined similarly. For example, in three dimensions the
autocorrelation of a square-summable discrete signal would be

$$R(j,k,\ell) = \sum_{n,q,r} x_{n,q,r}\,x_{n-j,\,q-k,\,r-\ell}$$

When mean values are subtracted from signals before computing an autocorrelation function, the resulting
function is usually called an auto-covariance function.
6. Explain the properties and estimation of the autocorrelation function
The following properties are stated for one-dimensional autocorrelations only, since most properties are
easily transferred from the one-dimensional case to the multi-dimensional cases.

A fundamental property of the autocorrelation is symmetry, R(i) = R(-i), which is easy to prove from the
definition. In the continuous case, the autocorrelation is an even function

$$R_f(-\tau) = R_f(\tau)$$

when f is a real function, and the autocorrelation is a Hermitian function

$$R_f(-\tau) = R_f^{*}(\tau)$$

when f is a complex function.

The continuous autocorrelation function reaches its peak at the origin, where it takes a real value, i.e. for
any delay $\tau$, $|R_f(\tau)| \le R_f(0)$. This is a consequence of the Cauchy-Schwarz inequality. The same
result holds in the discrete case.

The autocorrelation of a periodic function is, itself, periodic with the same period.

The autocorrelation of the sum of two completely uncorrelated functions (the cross-correlation is zero for
all $\tau$) is the sum of the autocorrelations of each function separately.

Since autocorrelation is a specific type of cross-correlation, it maintains all the properties of cross-correlation.

The autocorrelation of a continuous-time white noise signal will have a strong peak (represented by a
Dirac delta function) at $\tau = 0$ and will be absolutely 0 for all other $\tau$.

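The Wiener-Khinchin theorem relates the autocorrelation function to the power spectral density via the
Fourier transform:

$$R(\tau) = \int_{-\infty}^{\infty} S(f)\,e^{j2\pi f\tau}\,df, \qquad S(f) = \int_{-\infty}^{\infty} R(\tau)\,e^{-j2\pi f\tau}\,d\tau$$

For real-valued functions, the symmetric autocorrelation function has a real symmetric transform, so the
Wiener-Khinchin theorem can be re-expressed in terms of real cosines only:

$$R(\tau) = \int_{-\infty}^{\infty} S(f)\cos(2\pi f\tau)\,df, \qquad S(f) = \int_{-\infty}^{\infty} R(\tau)\cos(2\pi f\tau)\,d\tau$$
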
Efficient computation

For data expressed as a discrete sequence, it is frequently necessary to compute the autocorrelation with
high computational efficiency. While the brute force algorithm is order $n^2$, several efficient algorithms
exist which can compute the autocorrelation in order $n \log(n)$. For example, the Wiener-Khinchin theorem
allows computing the autocorrelation from the raw data X(t) with two Fast Fourier transforms (FFT)[2]:

$$F_R(f) = \mathrm{FFT}[X(t)]$$
$$S(f) = F_R(f)\,F_R^{*}(f)$$
$$R(\tau) = \mathrm{IFFT}[S(f)]$$

where IFFT denotes the inverse Fast Fourier transform. The asterisk denotes complex conjugate.
Alternatively, a multiple-$\tau$ correlation can be performed by using brute force calculation for low $\tau$ values,
and then progressively binning the X(t) data with a logarithmic density to compute higher values, resulting
in the same $n \log(n)$ efficiency, but with lower memory requirements.
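The two routes described above can be compared with a minimal sketch, assuming NumPy is available. The function names `autocorr_brute` and `autocorr_fft` are illustrative only, and the sequence is zero-padded (an assumption not stated in the text) so that the circular correlation implied by the FFT matches the linear sum.

```python
import numpy as np

def autocorr_brute(x):
    """Direct O(n^2) sum: R[k] = sum_t x[t] * x[t+k] for lags k = 0..n-1."""
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) for k in range(n)])

def autocorr_fft(x):
    """O(n log n) route: FFT, multiply by the conjugate, inverse FFT."""
    n = len(x)
    # Zero-pad to at least 2n-1 samples so that the circular correlation
    # computed through the FFT equals the linear one for lags 0..n-1.
    nfft = 1 << (2 * n - 1).bit_length()
    F = np.fft.rfft(x, nfft)          # F_R(f) = FFT[X(t)]
    S = F * np.conj(F)                # S(f) = F_R(f) F_R*(f)
    return np.fft.irfft(S, nfft)[:n]  # R(tau) = IFFT[S(f)]

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
r_direct = autocorr_brute(x)
r_fast = autocorr_fft(x)
```

Both arrays hold the unnormalized autocorrelation at lags 0 to n-1 and should agree to within rounding error; lag 0 equals the signal energy.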

Estimation
For a discrete process with known mean $\mu$ and variance $\sigma^2$, observed as n samples
$\{X_1, X_2, \ldots, X_n\}$, an estimate of the autocorrelation may be obtained as

$$\hat{R}(k) = \frac{1}{(n-k)\,\sigma^2} \sum_{t=1}^{n-k} (X_t - \mu)(X_{t+k} - \mu)$$

for any positive integer k < n. When the true mean and variance are known, this estimate is unbiased.
If the true mean and variance of the process are not known there are several possibilities:

If $\mu$ and $\sigma^2$ are replaced by the standard formulae for sample mean and sample variance, then this is a
biased estimate.
A periodogram-based estimate replaces n - k in the above formula with n. This estimate is always biased;
however, it usually has a smaller mean square error.[3][4]
Other possibilities derive from treating the two portions of data $\{X_1, \ldots, X_{n-k}\}$ and
$\{X_{k+1}, \ldots, X_n\}$ separately and calculating
separate sample means and/or sample variances for use in defining the estimate.

The advantage of estimates of the last type is that the set of estimated autocorrelations, as a function of k,
then form a function which is a valid autocorrelation in the sense that it is possible to define a theoretical
process having exactly that autocorrelation. Other estimates can suffer from the problem that, if they are
used to calculate the variance of a linear combination of the X's, the variance calculated may turn out to be
negative.
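As a small illustration of the difference between dividing by n - k and dividing by n, the sketch below computes both estimates for a toy sequence whose mean and variance are taken as known. The function name `autocorr_estimates` and the test sequence are illustrative assumptions.

```python
def autocorr_estimates(x, k, mu, var):
    """Return (unbiased, periodogram-style) estimates of R(k) for known mu and var."""
    n = len(x)
    s = sum((x[t] - mu) * (x[t + k] - mu) for t in range(n - k))
    unbiased = s / ((n - k) * var)   # divides by the number of products actually summed
    periodogram_style = s / (n * var)  # divides by n, which shrinks the estimate at large lags
    return unbiased, periodogram_style

# Alternating +1/-1 sequence: zero mean, unit variance, period 2.
x = [1, -1, 1, -1, 1, -1, 1, -1]
lag2 = autocorr_estimates(x, 2, mu=0.0, var=1.0)
lag1 = autocorr_estimates(x, 1, mu=0.0, var=1.0)
```

For this sequence the first estimate recovers the exact correlations at lags 1 and 2, while the second is scaled by (n - k)/n.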
Regression analysis
In regression analysis using time series data, autocorrelation of the errors is a problem. Autocorrelation of
the errors, which themselves are unobserved, can generally be detected because it produces
autocorrelation in the observable residuals. (Errors are also known as "error terms", in econometrics.)
Autocorrelation violates the ordinary least squares (OLS) assumption that the error terms are uncorrelated.
While it does not bias the OLS coefficient estimates, the standard errors tend to be underestimated (and
the t-scores overestimated) when the autocorrelations of the errors at low lags are positive.
The traditional test for the presence of first-order autocorrelation is the Durbin-Watson statistic or, if the
explanatory variables include a lagged dependent variable, Durbin's h statistic. A more flexible test,
covering autocorrelation of higher orders and applicable whether or not the regressors include lags of the
dependent variable, is the Breusch-Godfrey test. This involves an auxiliary regression, wherein the
residuals obtained from estimating the model of interest are regressed on (a) the original regressors and (b)
k lags of the residuals, where k is the order of the test. The simplest version of the test statistic from this
auxiliary regression is $TR^2$, where T is the sample size and $R^2$ is the coefficient of determination. Under
the null hypothesis of no autocorrelation, this statistic is asymptotically distributed as $\chi^2$ with k degrees of
freedom.
Responses to nonzero autocorrelation include generalized least squares and the Newey-West HAC
estimator (Heteroskedasticity and Autocorrelation Consistent).[5]
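The Durbin-Watson statistic mentioned above is the ratio of the sum of squared successive differences of the residuals to the sum of squared residuals. A minimal sketch follows; the function name and the two toy residual series are illustrative assumptions. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum (e_t - e_{t-1})^2 / sum e_t^2."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

dw_alternating = durbin_watson([1, -1, 1, -1])        # strongly negative autocorrelation
dw_persistent = durbin_watson([1, 1, 1, -1, -1, -1])  # positive autocorrelation
```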

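7. Obtain the variance of the Welch power spectrum estimate.

Welch's method
Welch's method makes two modifications to Bartlett's method:
1) the subsequences are allowed to overlap
2) instead of periodograms, modified periodograms are averaged
Assuming that successive sequences are offset by D points and that each sequence is L points long, the ith
sequence is

$$x_i(n) = x(n + iD), \qquad n = 0, 1, \ldots, L-1$$

Thus the overlap is L - D points, and if K sequences cover the entire N data points then

$$N = L + D(K - 1)$$

For example, with no overlap (D = L) there are K = N/L subsequences of length L, as in Bartlett's method.
For a 50% overlap (D = L/2) there is a tradeoff between increasing L and increasing K:
If L stays the same, there are K = 2N/L - 1 subsequences to average, hence the variance of the estimate is reduced.
If the subsequences are doubled in length to 2L, there are K = N/L - 1 of them, hence the spectral resolution
is doubled while the variance stays about the same.
Welch's method can be written in terms of the data record as follows

$$\hat{P}_W(e^{j\omega}) = \frac{1}{KLU} \sum_{i=0}^{K-1} \left| \sum_{n=0}^{L-1} w(n)\,x(n+iD)\,e^{-jn\omega} \right|^2, \qquad U = \frac{1}{L}\sum_{n=0}^{L-1} |w(n)|^2$$

or in terms of the modified periodograms $\hat{P}_M^{(i)}$

$$\hat{P}_W(e^{j\omega}) = \frac{1}{K} \sum_{i=0}^{K-1} \hat{P}_M^{(i)}(e^{j\omega})$$

Hence the expected value of Welch's estimate is

$$E\{\hat{P}_W(e^{j\omega})\} = E\{\hat{P}_M(e^{j\omega})\} = \frac{1}{2\pi LU}\,P_x(e^{j\omega}) * |W(e^{j\omega})|^2$$

where $W(e^{j\omega})$ is the Fourier transform of the L-point data window w(n).
With a 50% overlap and a Bartlett window, the variance is approximately

$$\mathrm{Var}\{\hat{P}_W(e^{j\omega})\} \approx \frac{9}{8K}\,P_x^2(e^{j\omega})$$

For a fixed number of data N, with 50% overlap, twice as many subsequences can be averaged (K is about 2N/L),
hence expressing the variance in terms of L and N we have

$$\mathrm{Var}\{\hat{P}_W(e^{j\omega})\} \approx \frac{9}{16}\,\frac{L}{N}\,P_x^2(e^{j\omega})$$

Since N/L is the number of subsequences K used in Bartlett's method, whose variance is approximately
$(L/N)\,P_x^2(e^{j\omega})$, it follows that

$$\mathrm{Var}\{\hat{P}_W(e^{j\omega})\} \approx \frac{9}{16}\,\mathrm{Var}\{\hat{P}_B(e^{j\omega})\}$$

In other words, and not surprisingly, with 50% overlap (and a Bartlett window), the variance of Welch's method is
about half that of Bartlett's method.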

Reference: Digital Signal Processing by John G. Proakis (pp. 902-906)
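The averaging of modified periodograms over overlapping segments can be sketched as follows, assuming NumPy. The function name `welch_psd`, the segment length of 64 and the white-noise test signal are illustrative choices; the segment count uses N = L + D(K - 1), and the normalisation divides each windowed periodogram by L times the mean squared window value.

```python
import numpy as np

def welch_psd(x, L, D, window=None):
    """Average modified periodograms of L-point segments offset by D points."""
    x = np.asarray(x, dtype=float)
    w = np.bartlett(L) if window is None else np.asarray(window, dtype=float)
    U = np.mean(w ** 2)                  # window power normalisation
    K = (len(x) - L) // D + 1            # from N = L + D(K - 1)
    psd = np.zeros(L)
    for i in range(K):
        seg = x[i * D : i * D + L]       # i-th segment, x(n + iD)
        psd += np.abs(np.fft.fft(w * seg)) ** 2 / (L * U)
    return psd / K, K

rng = np.random.default_rng(1)
noise = rng.standard_normal(4096)        # unit-variance white noise, flat PSD of about 1
# Bartlett's method: rectangular window, no overlap (D = L).
p_bartlett, k_bartlett = welch_psd(noise, L=64, D=64, window=np.ones(64))
# Welch's method: Bartlett window, 50% overlap (D = L/2).
p_welch, k_welch = welch_psd(noise, L=64, D=32)
```

With N = 4096 and L = 64 the non-overlapping case averages N/L segments and the 50% overlap case averages 2N/L - 1 segments, which is where the variance reduction comes from.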

8. Explain in detail the Wiener-Khintchine theorem.

Wiener-Khintchine Theorem
Let x(n) be a WSS random process with autocorrelation sequence
$$r_{xx}(m) = E[x(n+m)\,x^{*}(n)]$$
The power spectral density is defined as the Discrete Time Fourier
Transform of the autocorrelation sequence
$$P_{xx}(f) = T \sum_{m=-\infty}^{\infty} r_{xx}(m)\,e^{-i2\pi fmT}$$
where T is the sampling interval. The factor T scales $P_{xx}(f)$ as power per unit frequency, so that the
inverse relation below holds as written.
The signal is assumed to be bandlimited in frequency to ±1/(2T), and the PSD
is periodic in frequency with period 1/T.
The inverse DTFT is
$$r_{xx}(m) = \int_{-1/2T}^{1/2T} P_{xx}(f)\,e^{i2\pi fmT}\,df$$
and $r_{xx}(0)$ is the average power
$$r_{xx}(0) = \int_{-1/2T}^{1/2T} P_{xx}(f)\,df$$
Due to the property $r_{xx}(-m) = r_{xx}^{*}(m)$, the PSD must be a strictly
real function; it is also nonnegative.
An estimator of the PSD computed from a finite sample sequence can be improved by
o Smoothing the sample sequence before computing the transform
o Segmenting the sample sequence and computing more estimators that can be averaged
o Using overlapping sample sequences
These techniques can be combined to construct the Welch periodogram
estimator.
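For a finite record, the same transform relationship can be checked numerically: the DTFT of the biased sample autocorrelation equals the periodogram. The sketch below assumes NumPy, a sampling interval T = 1, and an arbitrary 64-sample random sequence; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(64)
N = len(x)

# Biased sample autocorrelation for lags -(N-1)..(N-1); index N-1 is lag 0.
r = np.correlate(x, x, mode="full") / N
lags = np.arange(-(N - 1), N)

# DTFT of the autocorrelation sequence on the FFT frequency grid (T = 1).
freqs = np.arange(N) / N
psd_from_r = np.array([np.sum(r * np.exp(-2j * np.pi * f * lags)) for f in freqs])

# Periodogram computed directly from the data.
periodogram = np.abs(np.fft.fft(x)) ** 2 / N

# Average power: r(0) equals the PSD averaged over one period.
average_power = r[N - 1]
```

The imaginary part of `psd_from_r` vanishes to rounding error because the sample autocorrelation of a real sequence is symmetric in the lag.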

UNIT V
UNIT V MULTIRATE SIGNAL PROCESSING
9
Introduction to Multirate signal processing - Decimation - Interpolation - Polyphase implementation of
FIR filters for interpolator and decimator - Multistage implementation of sampling rate conversion -
Design of narrow band filters - Applications of Multirate signal processing.
L: 45, T: 15, TOTAL= 60 PERIODS

1. What is meant by pipelining? (Apr 2008 & Nov 2007)
A pipeline is the continuous and somewhat overlapped movement of instructions to the processor, or of the
arithmetic steps taken by the processor to perform an instruction. With pipelining, the computer
architecture allows the next instructions to be fetched while the processor is performing arithmetic
operations, holding them in a buffer close to the processor until each instruction operation can be
performed. The staging of instruction fetching is continuous. The result is an increase in the number of
instructions that can be performed during a given time period.
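The gain from overlapping instruction stages can be illustrated with an idealised cycle count. This is a sketch under stated assumptions: every stage takes one cycle, there are no stalls or branches, and the stage count of 4 is an arbitrary example rather than a figure for any particular processor.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction runs all of its stages before the next one starts."""
    return n_instructions * n_stages

def cycles_pipelined(n_instructions, n_stages):
    """The first instruction fills the pipeline; after that one completes per cycle."""
    return n_stages + (n_instructions - 1)

serial = cycles_unpipelined(100, 4)
overlapped = cycles_pipelined(100, 4)
speedup = serial / overlapped
```

For long instruction streams the speedup approaches the number of stages.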
2. What are the principal features of the Harvard architecture? (Apr 2008)
The Harvard architecture has two separate memories, one for instructions and one for data. It is capable of
simultaneously reading an instruction code and reading or writing data memory or a peripheral.
3. Differentiate between von Neumann and Harvard architecture? (May 2007)
Sl.no   Harvard architecture                               Von Neumann architecture
1       Separate memories for program and data.            The same memory is shared by program and data.
2       The speed of execution is high, since an           Instructions and data share one bus, so the speed
        instruction and data can be fetched together.      of execution is lower; it is increased by pipelining.
3       Separate internal address and data buses           A common internal address and data bus.
        for program and data.
4       Normally used for DSP processors.                  Not well suited to DSP processors.
4. Give the digital signal processing application with the TMS 320 family. (Nov 2006)
DSP processors should have circular buffers to support circular shift operations.
The DSP processor should be able to perform multiply and accumulate operations very fast.
DSP processors should have multiple pointers to support multiple operands, jumps and shifts.

5. What is the advantage of Harvard architecture of TMS 320 series? (Nov 2006)
It has separate memories for program and data.
The speed of execution is increased by pipelining.
It has separate internal address and data buses.
It is normally used for DSP processors.
6.What are the desirable features of DSP Processors? (Nov 2006)
o DSP processors should have multiple registers so that data exchange from
register to register is fast.
o DSP operations require multiple operands simultaneously. Hence DSP
processor should have multiple operand fetch capacity.
o DSP processors should have circular buffers to support circular shift
operations.
o The DSP processor should be able to perform multiply and accumulate
operations very fast.
o DSP processors should have multiple pointers to support multiple
operands, jumps and shifts.
o Multi-processing ability.
7. What are the different types of DSP Architecture?
Von Neumann Architecture
Harvard Architecture
Modified Harvard Architecture
8. Define MAC unit.
MAC stands for multiplier-accumulator, a dedicated hardware unit. It is one of the
computational units in the processor. The complete MAC operation is executed in one clock cycle. DSP
processors have a special instruction called MACD, which means multiply and accumulate with data shift.
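The multiply, accumulate and data-shift sequence is the inner loop of an FIR filter. The sketch below imitates that sequence in Python for one output sample; it is an illustration of the idea, not TMS320 assembly, and the function name `fir_step`, the coefficients and the impulse input are assumptions.

```python
def fir_step(delay_line, coeffs, new_sample):
    """Compute one FIR output sample with multiply-accumulate and data shift."""
    acc = 0
    # Walk from the oldest tap towards the newest: accumulate the product,
    # then move the sample one place down the delay line.
    for i in range(len(coeffs) - 1, 0, -1):
        acc += coeffs[i] * delay_line[i - 1]   # multiply and accumulate
        delay_line[i] = delay_line[i - 1]      # data shift
    delay_line[0] = new_sample
    acc += coeffs[0] * new_sample
    return acc

coeffs = [1, 2, 3]
delay = [0, 0, 0]
# Feeding a unit impulse returns the coefficients as the impulse response.
impulse_response = [fir_step(delay, coeffs, s) for s in [1, 0, 0, 0]]
```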
9. Mention the Addressing modes in DSP processors.
Short immediate addressing
Short direct addressing
Memory-mapped addressing
Indirect addressing
Bit-reversed addressing mode
Circular addressing
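Two of the modes listed above are specific to DSP work: bit-reversed addressing reorders FFT data, and circular addressing wraps a pointer around a fixed buffer. A minimal sketch of the index arithmetic follows; the function names and buffer bounds are illustrative, and real processors perform these updates in their address-generation hardware.

```python
def bit_reverse(index, n_bits):
    """Reverse the lowest n_bits bits of index, as used to reorder FFT data."""
    result = 0
    for _ in range(n_bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def circular_next(pointer, step, start, length):
    """Advance a pointer by step inside the buffer [start, start + length), wrapping around."""
    return start + (pointer - start + step) % length

order_8_point = [bit_reverse(i, 3) for i in range(8)]
wrapped = circular_next(7, 1, start=4, length=4)   # last slot of a 4-word buffer at address 4
```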
10. State the features of the TMS320C5x series of DSP processors.
Powerful 16-bit CPU
TDM port
16 × 16-bit multiply/add operations can be performed in a single cycle.
224K × 16-bit maximum addressable external memory space.
Full-duplex synchronous serial port for coder/decoder interface.
On-chip scan-based emulation logic.
Boundary scan
Low power dissipation
IEEE standard test access port

11. Define parallel logic unit.
The parallel logic unit (PLU) executes logic operations on the data without affecting the contents of the
accumulator (ACC). The PLU provides bit manipulation which can be used to set, clear, test or toggle bits
in data memory, control or status registers.
12. Define scaling shifter
The scaling shifter has a 16 bit input connected to the data bus and 32 bit output connected to the ALU.
The scaling shifter produces a left shift of 0 to 16 bits on the input data. The other shifters perform
numerical scaling, bit extraction, extended precision arithmetic and overflow prevention.
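The behaviour of a 16-bit-in, 32-bit-out left shifter can be sketched with integer arithmetic. The function name is illustrative, and the sign extension of negative inputs into the 32-bit result is an assumption of this sketch rather than something stated above.

```python
def scaling_shift(value16, shift):
    """Left-shift a signed 16-bit value by 0..16 bits and return the 32-bit pattern."""
    if not 0 <= shift <= 16:
        raise ValueError("shift must be between 0 and 16")
    if not -0x8000 <= value16 <= 0x7FFF:
        raise ValueError("input must fit in 16 bits")
    # Masking keeps 32 bits; negative inputs appear in two's complement form.
    return (value16 << shift) & 0xFFFFFFFF

high_word = scaling_shift(1, 16)        # value moved into the upper 16 bits
full_scale = scaling_shift(0x7FFF, 16)
minus_one = scaling_shift(-1, 0)        # sign-extended to 32 bits
```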
13. Define ARAU in the TMS320C5X processor.
ARAU stands for auxiliary register arithmetic unit. The auxiliary registers are used for
temporary data storage. The auxiliary register file is connected to the auxiliary register arithmetic unit.
The ARAU helps to speed up the operations of the CALU.
14. What are the interrupts available in TMS320C5X processors?
It has four general-purpose interrupts, INT1 to INT4, together with:
RS (Reset)
NMI (Non-maskable interrupt)
15. What are the addressing modes available in TMS320C5X processors?
Direct
Indirect
Immediate
Register
Memory-mapped
Circular Addressing
16. Write the assembly language syntax.
A source statement can contain the following four ordered fields:
[label][:] mnemonic [operand list] [; comment]
A source statement follows these guidelines:
All statements must begin with a label, a blank, an asterisk or a semicolon.
Labels may be placed before the instruction mnemonic on the same
line or on the preceding line in the first column.
Each field must be separated by blanks.
If a comment begins in column 1, it must start with a semicolon or an asterisk.
In other columns, comments must begin with a semicolon.
17. List out the addressing modes supported by C5X processors.
1. Direct addressing
2. Indirect addressing
3. Immediate addressing
4. Dedicated-register addressing
5. Memory-mapped register addressing
6. Circular addressing

18. List the on-chip peripherals in C5X.
The C5X DSP on-chip peripherals available are as follows:
1. Clock Generator
2. Hardware Timer
3. Software-Programmable Wait-State Generators
4. Parallel I/O Ports
5. Host Port Interface (HPI)
6. Serial Port
7. Buffered Serial Port (BSP)
8. Time-Division Multiplexed (TDM) Serial Port
9. User-Maskable Interrupts
19. What are the different buses of TMS320C5X and their functions?
The C5X architecture has four buses and their functions are as follows:
Program bus (PB):
It carries the instruction code and immediate operands from program memory
space to the CPU.
Program address bus (PAB):
It provides addresses to program memory space for both reads and writes.
Data read bus (DB):
It interconnects various elements of the CPU to data memory space.
Data read address bus (DAB):
It provides the address to access the data memory space.
20. Explain the various addressing modes of the TMS processor.
Immediate.
Register
Register indirect
Indexed

Part-B

1. Explain about On-Chip memory and On-chip peripherals

The on-chip peripheral bus (OPB) has the following features:

Up to a 64-bit address bus
32-bit or 64-bit data bus implementations
Fully synchronous
Provides support for 8-bit, 16-bit, 32-bit, and 64-bit slaves
Provides support for 32-bit and 64-bit masters
Dynamic bus sizing; byte, halfword, fullword, and doubleword transfers
Optional Byte Enable support
Uses a distributed multiplexer method of attachment instead of three-state drivers, to ease
manufacturing test. Address and data buses may be implemented in distributed AND-OR gates or
as a dotted bus
Byte and halfword duplication for byte and halfword transfers
Single cycle transfer of data between OPB bus master and OPB slaves
Sequential address protocol support
Devices on the OPB may be memory mapped, act as DMA peripherals, or support both transfer
methods
A 16-cycle fixed bus timeout provided by the OPB arbiter
OPB slave is capable of disabling the fixed timeout counter to suspend bus timeout error
Support for multiple OPB bus masters
Bus parking for reduced latency
OPB masters may lock the OPB bus arbitration
OPB slaves capable of requesting retry to break possible arbitration deadlock

2. Explain in detail the von Neumann, Harvard and SHARC architectures of the digital signal processor.

One of the biggest bottlenecks in executing DSP algorithms is transferring information to and
from memory. This includes data, such as samples from the input signal and the filter coefficients, as well
as program instructions, the binary codes that go into the program sequencer. For example, suppose we
need to multiply two numbers that reside somewhere in memory. To do this, we must fetch three binary
values from memory: the numbers to be multiplied, plus the program instruction describing what to do.
Figure 28-4a shows how this seemingly simple task is done in a traditional microprocessor. This is often
called a Von Neumann architecture, after the brilliant American mathematician John Von Neumann
(1903-1957). Von Neumann guided the mathematics of many important discoveries of the early twentieth
century. His many achievements include: developing the concept of a stored program computer,
formalizing the mathematics of quantum mechanics, and work on the atomic bomb. If it was new and
exciting, Von Neumann was there!
A Von Neumann architecture contains a single memory and a single bus for transferring data into
and out of the central processing unit (CPU). Multiplying two numbers requires at least three clock cycles,
one to transfer each of the three numbers over the bus from the memory to the CPU. We don't count the
time to transfer the result back to memory, because we assume that it remains in the CPU for additional
manipulation (such as the sum of products in an FIR filter). The Von Neumann design is quite satisfactory
when you are content to execute all of the required tasks in serial. In fact, most computers today are of the
Von Neumann design. We only need other architectures when very fast processing is required, and we are
willing to pay the price of increased complexity. This leads us to the Harvard architecture, shown in (b).
This is named for the work done at Harvard University in the 1940s under the leadership of
Howard Aiken (1900-1973). As shown in this illustration, Aiken insisted on separate memories for data
and program instructions, with separate buses for each. Since the buses operate independently, program
instructions and data can be fetched at the same time, improving the speed over the single bus design.
Most present day DSPs use this dual bus architecture.
The next level of sophistication is the Super Harvard Architecture. This term was coined by Analog
Devices to describe the internal
operation of their ADSP-2106x and new ADSP-211xx families of Digital Signal Processors. These are
called SHARC DSPs, a contraction of the longer term, Super Harvard ARChitecture. The idea is to build
upon the Harvard architecture by adding features to improve the throughput. While the SHARC DSPs are
optimized in dozens of ways, two areas are important: an instruction cache, and an I/O controller.
First, let's look at how the instruction cache improves the performance of the Harvard architecture.
A handicap of the basic Harvard design is that the data memory bus is busier than the program memory
bus. When two numbers are multiplied, two binary values (the numbers) must be passed over the data
memory bus, while only one binary value (the program instruction) is passed over the program memory
bus. To improve upon this situation, we start by relocating part of the "data" to program memory. For
instance, we might place the filter coefficients in program memory, while keeping the input signal in data
memory. (This relocated data is called "secondary data" in the illustration). At first glance, this doesn't
seem to help the situation; now we must transfer one value over the data memory bus (the input signal
sample), but two values over the program memory bus (the program instruction and the coefficient). In
fact, if we were executing random instructions, this situation would be no better at all. However, DSP
algorithms generally spend most of their execution time in loops, such as instructions 6-12 of Table 28-1.
This means that the same set of program instructions will continually pass from program memory to the
CPU.
The Super Harvard architecture takes advantage of this situation by including an instruction cache
in the CPU. This is a small memory that contains about 32 of the most recent program instructions. The
first time through a loop, the program instructions must be passed over the program
memory bus. This results in slower operation because of the conflict with the coefficients that must also
be fetched along this path. However, on additional executions of the loop, the program instructions can be
pulled from the instruction cache. This means that all of the memory to CPU information
transfers can be accomplished in a single cycle: the sample from the input signal comes over the data
memory bus, the coefficient comes over the program memory bus, and the program instruction comes
from the instruction cache. In the jargon of the field, this efficient transfer of data is called a high
memory-access bandwidth. A more detailed view of the SHARC architecture shows the I/O controller
connected to data memory. This is how the signals enter and exit the system. For instance, the SHARC
DSPs provide both serial and parallel communications ports. These are extremely high
speed connections. For example, at a 40 MHz clock speed, there are two serial ports that operate at 40
Mbits/second each, while six parallel ports each provide a 40 Mbytes/second data transfer. When all six
parallel ports are used together, the data transfer rate is an incredible 240 Mbytes/second.
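The bus-loading argument above can be tallied with a simple idealised model: each bus moves one value per cycle, buses work in parallel, and a multiply needs one instruction plus two operands. The dictionary names and the model itself are illustrative assumptions, not measurements of any device.

```python
def cycles_per_multiply(bus_loads):
    """Cycles needed when every bus moves one value per cycle and buses run in parallel."""
    return max(bus_loads.values())

# Values that must cross each bus for one multiply (instruction + sample + coefficient).
von_neumann = {"shared bus": 3}
harvard = {"program bus": 1, "data bus": 2}
harvard_relocated = {"program bus": 2, "data bus": 1}   # coefficient moved to program memory
sharc_cached = {"program bus": 1, "data bus": 1}        # instruction supplied by the cache

results = {
    "von_neumann": cycles_per_multiply(von_neumann),
    "harvard": cycles_per_multiply(harvard),
    "harvard_relocated": cycles_per_multiply(harvard_relocated),
    "sharc_cached": cycles_per_multiply(sharc_cached),
}

# Combined parallel-port rate quoted in the text: six ports at 40 Mbytes/second each.
parallel_rate_mbytes = 6 * 40
```

In this model, relocating the coefficients only pays off once the instruction no longer needs the program memory bus, which is the role of the instruction cache.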

3. Briefly explain fixed versus floating point.

Fixed versus Floating Point
Digital Signal Processing can be divided into two categories, fixed point and floating point. These refer
to the format used to store and manipulate numbers within the devices. Fixed point DSPs usually represent
each number with a minimum of 16 bits, although a different length can be used. For instance, Motorola
manufactures a family of fixed point DSPs that use 24 bits. There are four common ways that these $2^{16}$ =
65,536 possible bit patterns can represent a number. In unsigned integer, the stored number can take on
any integer value from 0 to 65,535. Similarly, signed integer uses two's complement to make the range
include negative numbers, from -32,768 to 32,767. With unsigned fraction notation, the 65,536 levels are
spread uniformly between 0 and 1. Lastly, the signed fraction format allows negative numbers, equally
spaced between -1 and 1. In comparison, floating point DSPs typically use a minimum of 32 bits to
store each value. This results in many more bit patterns than for fixed point, $2^{32}$ = 4,294,967,296 to be
exact. A key feature of floating point notation is that the represented numbers are not uniformly spaced. In
the most common format (ANSI/IEEE Std. 754-1985), the largest and smallest numbers are $\pm 3.4 \times 10^{38}$
and $\pm 1.2 \times 10^{-38}$, respectively. The represented values are unequally spaced between these two
extremes, such that the gap between any two numbers is about ten-million times smaller than the value of
the numbers. This is important because it places large gaps between large numbers, but small
gaps between small numbers.
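The ranges and spacings quoted above can be reproduced with a few lines of arithmetic. This sketch uses only the Python standard library; the helper `float32_gap` is an illustrative function that finds the next larger 32-bit float by incrementing the bit pattern, which is valid for positive finite values.

```python
import struct

BITS = 16
levels = 2 ** BITS                              # 65,536 bit patterns
unsigned_int = (0, levels - 1)                  # 0 .. 65,535
signed_int = (-levels // 2, levels // 2 - 1)    # -32,768 .. 32,767
unsigned_frac_step = 1 / levels                 # levels spread uniformly over 0..1
signed_frac_step = 2 / levels                   # levels spread uniformly over -1..1

def float32_gap(value):
    """Gap between a positive 32-bit float and the next larger 32-bit float."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    lower = struct.unpack("<f", struct.pack("<I", bits))[0]
    upper = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return upper - lower

gap_at_one = float32_gap(1.0)
gap_at_million = float32_gap(1e6)
relative_gap = gap_at_million / 1e6             # roughly one ten-millionth of the value
```

The fixed point step is the same everywhere in the range, while the floating point gap grows with the size of the number.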
All floating point DSPs can also handle fixed point numbers, a necessity to implement
counters, loops, and signals coming from the ADC and going to the DAC. However, this doesn't mean that
fixed point math will be carried out as quickly as the floating point operations; it depends on the internal
architecture. For instance, the SHARC DSPs are optimized for both floating point and fixed point
operations, and execute them with equal efficiency. For this reason, the SHARC devices are often referred
to as "32-bit DSPs," rather than just "Floating Point." Fixed point DSPs are
generally cheaper, while floating point devices have better precision, higher dynamic range, and a shorter
development cycle. Fixed point arithmetic is much faster than floating point in general purpose computers.
However, with DSPs the
speed is about the same, a result of the hardware being highly optimized for math operations. The internal
architecture of a floating point DSP is more complicated than for a fixed point device.

All the registers and data buses must be 32 bits wide instead of only 16; the multiplier and ALU
must be able to quickly perform floating point arithmetic, the instruction set must be larger (so that they
can handle both floating and fixed point numbers), and so on. Floating point (32 bit) has better precision
and a higher dynamic range than fixed point (16 bit) . In addition, floating point programs often have a
shorter development cycle, since the programmer doesn't generally need to worry about issues such as
overflow, underflow, and round-off error. On the other hand, fixed point DSPs have traditionally been
cheaper than floating point devices. Nothing changes more rapidly than the price of
electronics; anything you find in a book will be out-of-date before it is printed. Nevertheless, cost is a key
factor in understanding how DSPs are evolving, and we need to give you a general idea. When this book
was completed in 1999, fixed point DSPs sold for between $5 and $100, while floating point devices were
in the range of $10 to $300. This difference in cost can be viewed as a measure of the relative complexity
between the devices. If you want to find out what the prices are today, you need to look today.
Now let's turn our attention to performance; what can a 32-bit floating point system do that a
16-bit fixed point can't? The answer to this question is signal-to-noise ratio. Suppose we store a number
in a 32 bit floating point format. As previously mentioned, the gap between this number and its adjacent
neighbor is about one ten-millionth of the value of the number. To store the number, it must be rounded up
or down by a maximum of one-half the gap size. In other words, each time we store a number in floating
point notation, we add noise to the signal. The same thing happens when a number is stored as a 16-bit
fixed point value, except that the added noise is much worse. This is because the gaps between adjacent
numbers are much larger. For instance, suppose we store the number 10,000 as a signed integer (running
from -32,768 to 32,767). The gap between numbers is one ten-thousandth of the value of the number we
are storing. If we want to store the
number 1000, the gap between numbers is only one one-thousandth of the value. Noise in signals is usually
represented by its standard deviation. Here, the important fact is that the standard deviation of this
quantization noise is about one-third of the gap size. This means that the signal-to-noise ratio for storing
a floating point number is about 30 million to one, while for a fixed point number it is only about
ten-thousand to one. In other words, floating point has roughly 3,000 times less quantization noise than fixed
point. This brings up an important way that DSPs are different from traditional microprocessors.
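The "about one-third of the gap size" figure for quantization noise can be checked by simulation. The sketch below rounds random values to an integer grid and compares the standard deviation of the rounding error with that of an error spread uniformly over one gap, which is the gap divided by the square root of 12. The sample count, value range and seed are arbitrary assumptions.

```python
import math
import random

random.seed(0)
gap = 1.0                                   # spacing between adjacent stored values
values = [random.uniform(-30000, 30000) for _ in range(100_000)]

# Rounding error introduced when each value is stored on the grid.
errors = [round(v / gap) * gap - v for v in values]
mean = sum(errors) / len(errors)
std = math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))

theory = gap / math.sqrt(12)                # std of an error uniform over one gap
```

Both numbers come out close to 0.29 of the gap, which the text rounds to one-third.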
Suppose we implement an FIR filter in fixed point. To do this, we loop through each coefficient,
multiply it by the appropriate sample from the input signal, and add the product to an accumulator. Here's
the problem. In traditional microprocessors, this accumulator is just another 16 bit fixed point variable.
To avoid overflow, we need to scale the values being added, and will correspondingly add quantization
noise on each step. In the worst case, this quantization noise will simply add, greatly lowering the
signalto- noise ratio of the system. For instance, in a 500 coefficient FIR filter, the noise on each output
sample may be 500 times the noise on each input sample. The signal-to-noise ratio of ten-thousand to one
has dropped to a ghastly twenty to one. Although this is an extreme case, it illustrates the main point: when
many operations are carried out on each sample, it's bad, really bad. DSPs handle this problem by using
an extended precision accumulator. This is a special register that has 2-3 times as many bits as the other
memory locations. For example, in a 16 bit DSP it may have 32 to 40 bits, while in the SHARC DSPs it
contains 80 bits for fixed point use. This extended range virtually eliminates round-off noise while the
accumulation is in progress. The only round-off error suffered is when the accumulator is scaled and
stored in the 16 bit memory. This strategy works very well, although it does limit how some algorithms
must be carried out. In comparison, floating point has such low quantization noise that these techniques
are usually not necessary. In addition to having lower quantization noise, floating point systems are also
easier to develop algorithms for. Most DSP techniques are based on repeated multiplications and
additions.
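The accumulator argument above can be checked numerically. The following is a minimal sketch, not a model of any particular DSP: it assumes a quantization gap of 1 (integer rounding), random products, and hypothetical function names (`store_error_std`, `accumulator_noise`). It compares a narrow accumulator that is rounded after every addition with an extended accumulator that is rounded only once, when the result is stored.

```python
import random
import statistics


def store_error_std(samples=100_000, seed=1):
    """Standard deviation of the error made by rounding to a grid with gap = 1."""
    rng = random.Random(seed)
    errors = []
    for _ in range(samples):
        value = rng.uniform(-1000.0, 1000.0)
        errors.append(round(value) - value)
    return statistics.pstdev(errors)


def accumulator_noise(taps=500, trials=200, seed=2):
    """Return (narrow_std, wide_std) of the output error for a `taps`-term sum.

    narrow: the running sum is rounded after every addition.
    wide:   the running sum keeps full precision and is rounded once at the end.
    """
    rng = random.Random(seed)
    narrow_errors, wide_errors = [], []
    for _ in range(trials):
        products = [rng.uniform(-50.0, 50.0) for _ in range(taps)]
        exact = sum(products)

        acc = 0
        for p in products:
            acc = round(acc + p)          # one rounding error per step
        narrow_errors.append(acc - exact)

        wide_errors.append(round(exact) - exact)   # a single rounding error
    return statistics.pstdev(narrow_errors), statistics.pstdev(wide_errors)


if __name__ == "__main__":
    print("single store, std / gap:", store_error_std())
    narrow, wide = accumulator_noise()
    print("rounded every step:", narrow, " rounded once:", wide)
```

The single-store figure comes out close to 0.29 of the gap, which is the "about one-third" quoted in the text. With random errors the per-step noise grows roughly with the square root of the number of taps rather than linearly; the 500-fold figure in the text is the worst case, in which every error has the same sign.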

In fixed point, the possibility of an overflow or underflow needs to be considered after each
operation. The programmer needs to continually understand the amplitude of the numbers, how the
quantization errors are accumulating, and what scaling needs to take place. In comparison,
these issues do not arise in floating point; the numbers take care of themselves (except in rare cases).
Figure 28-7 (fixed versus floating point instructions) lists the multiplication instructions used in the
SHARC DSPs. While only a single command is needed for floating point, many options are needed for
fixed point. The floating point multiplication has a single form, Fn = Fx * Fy, where
Fn, Fx, and Fy are any of the 16 data registers. It could not be any simpler. In comparison, look at all the
possible commands for fixed point multiplication. These are the many options needed to efficiently
handle the problems of round-off, scaling, and format. In Fig. 28-7, Rn, Rx, and Ry refer to any of the 16
data registers, and MRF and MRB are 80 bit accumulators. The vertical lines indicate options. For
instance, the top-left entry in this table means that all the following are valid commands: Rn = Rx * Ry,
MRF = Rx * Ry, and MRB = Rx * Ry. In other words, the value of any two registers can be multiplied
and placed into another register, or into one of the extended precision accumulators. This table also
shows that the numbers may be either signed or unsigned (S or U), and may be fractional or integer
(F or I). The RND and SAT options are ways of controlling rounding and register overflow.
There are other details and options in the table, but they are not important for our present discussion. The
important idea is that the fixed point programmer must understand dozens of ways to carry out the very
basic task of multiplication. In contrast, the floating point programmer can spend his time
concentrating on the algorithm. Given these tradeoffs between fixed and floating point, how do you
choose which to use? Here are some things to consider. First, look at how many bits are used in the ADC
and DAC. In many applications, 12-14 bits per sample is the crossover for using fixed versus floating
point. For instance, television and other video signals typically use 8 bit ADC and DAC, and the precision
of fixed point is acceptable. In comparison, professional audio applications can sample with as high as 20
or 24 bits, and almost certainly need floating point to capture the large dynamic range. The next thing to
look at is the complexity of the algorithm that will be run. If it is relatively simple, think fixed point; if it
is more complicated, think floating point. For example, FIR filtering and other operations in the time
domain only require a few dozen lines of code, making them suitable for fixed point.
In contrast, frequency domain algorithms, such as spectral analysis and FFT convolution, are
very detailed and can be much more difficult to program. While they can be written in fixed point, the
development time will be greatly reduced if floating point is used. Lastly, think about the money: how
important is the cost of the product, and how important is the cost of the development? When fixed point
is chosen, the cost of the product will be reduced, but the development cost will probably be higher due to
the more difficult algorithms. In the reverse manner, floating point will generally result in a quicker and
cheaper development cycle, but a more expensive final product. Figure 28-8 shows some of the major
trends in DSPs. Figure (a) illustrates the impact that Digital Signal Processors have had on the embedded
market. These are applications that use a microprocessor to directly operate and control some larger
system, such as a cellular telephone, microwave oven, or automotive instrument display panel. The name
"microcontroller" is often used in referring to these devices, to distinguish them from the microprocessors
used in personal computers. As shown in (a), about 38% of embedded designers have already started using
DSPs, and another 49% are considering the switch.
The high throughput and computational power of DSPs often makes them an ideal choice for
embedded designs. As illustrated in (b), about twice as many engineers currently use fixed point as use
floating point DSPs. However, this depends greatly on the application. Fixed point is more popular in
competitive consumer products where the cost of the electronics must be kept very low. A good example
of this is cellular telephones. When you are in competition to sell millions of your product, a cost
difference of only a few dollars can be the difference between success and failure. In comparison, floating
point is more common when greater performance is needed and cost is not important.

[Figure 28-8. Major trends in DSPs. As illustrated in (a), about 38% of embedded designers have
already switched from conventional microprocessors to DSPs, and another 49% are considering the
change. In (b), about twice as many engineers use fixed point as use floating point DSPs. This is mainly
driven by consumer products that must have low cost electronics, such as cellular telephones. However, as
shown in (c), floating point is the fastest growing segment; over one-half of engineers currently using 16
bit devices plan to migrate to floating point DSPs.]

For instance, suppose you are designing a medical imaging
system, such as a computed tomography scanner. Only a few hundred of the model will ever be sold, at a
price of several hundred-thousand dollars each. For this application, the cost of the DSP is insignificant,
but the performance is critical. In spite of the larger number of fixed point DSPs being used, the floating
point market is the fastest growing segment. As shown in (c), over one-half of engineers using 16-bit
devices plan to migrate to floating point at some time in the near future. Before leaving this topic, we
should reemphasize that floating point and fixed point usually use 32 bits and 16 bits, respectively, but not
always. For instance, the SHARC family can represent numbers in 32-bit fixed point, a mode that is
common in digital audio applications. This makes the 2^32 quantization levels spaced uniformly over a
relatively small range, say, between -1 and 1. In comparison, floating point notation places the 2^32
quantization levels logarithmically over a huge range, typically ±3.4×10^38. This gives 32-bit fixed point
better precision, that is, the quantization error on any one sample will be lower. However, 32-bit floating
point has a higher dynamic range, meaning there is a greater difference between the largest number and
the smallest number that can be represented.
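The precision versus dynamic range trade-off can be made concrete by comparing gap sizes. This is an illustrative sketch only: it assumes IEEE 754 single precision for the 32-bit floating point format and a fixed point format with 2^32 levels spread uniformly over -1 to 1. The names `float32_gap` and `FIXED32_GAP` are hypothetical.

```python
import struct


def float32_gap(x):
    """Distance from positive x (rounded to float32) to the next larger float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    above = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return above - here


# 2^32 levels spread uniformly over the range -1 to 1 (assumed fixed point format)
FIXED32_GAP = 2.0 / 2**32

if __name__ == "__main__":
    print("fixed point gap, anywhere in range:", FIXED32_GAP)
    for value in (1.0, 0.75, 1e-3, 1e-6):
        print("float32 gap near", value, "=", float32_gap(value))
```

Near full scale the floating point gap is a few hundred times larger than the fixed point gap, which is the sense in which 32-bit fixed point has better precision. For very small values the floating point gap shrinks below the fixed point gap, which reflects its larger dynamic range.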

4. Describe the features of the TMS320C5x and TMS320LC5x processors


Features
The TMS320C54x simulators support the following features:
Included in Code Composer Studio IDE for TMS320C5000
TMS320C54x CPU full instruction set architecture execution
Support of all instructions for devices with extended program pages
Parallel instruction execution
Provision for warning in case of possible inconsistency due to pipeline latency of some combinations of instructions
Configurable memory simulation
Accurate cycle simulation
On-chip memory blocks
External memory blocks
Support for the enhanced external parallel interface (XIO2) on C5410, C5403, C5404, C5406 and C5407
Port Connect
Supports external peripherals in I/O page
Attachment of files to MMRs, on-chip and external memory locations to read and write when those locations are accessed
Pin Connect
Supports external event simulation
Supports simulator analysis events
RTDX support
Host-target and target-host communication
DSP/BIOS real-time analysis support

Considerations for Choosing a Simulator
Table 1-1. Processors Supported by the C54x Simulator
PROCESSOR        CODE COMPOSER STUDIO IDE IMPORT CONFIGURATION
TMS320C5407      C5407 Device Simulator
TMS320C5409      C5409 Device Simulator
TMS320C5410      C5410 Device Simulator
TMS320C5416      C5416 Device Simulator
TMS320C5420      C5420 Device Simulator
Although simulators for other DSP platforms and families (e.g. C6000, C55x) have different types of
configurations (functional, cycle-accurate, etc.) that provide tradeoffs between the functionality modeled,
cycle accuracy of the simulation, and performance, simulators for the C54x family are of only one type.
All C54x simulator configurations are cycle-accurate. The differences among the configurations are due to
the differences in the actual hardware of the devices (DSPs) in the TMS320C54x family. For example, the
C5402 Device Simulator configuration is used to simulate the TMS320C5402 processor. The device
simulator configurations model most of the peripherals of the devices. The peripherals modeled are
cycle-accurate, as in the silicon. These simulators can be used to get an indication of the cycle behavior
of the application.
Although there are many DSP processors in the C54x family, only those listed in Table 1-1 are currently
supported in the simulator. But, because of similarities among the processors in the C54x family, users
developing code for an unsupported processor can select the configuration most similar in terms of the
CPU and on-chip peripherals. Table 1-2 shows the configuration that should be selected for use with
some of the unsupported devices.
Table 1-2. Configurations for Use With Unsupported C54x Devices
UNSUPPORTED HARDWARE DEVICE (PROCESSOR)    NEAREST SIMULATOR CONFIGURATION
TMS320C5409A                               C5409 Device Simulator
TMS320C5410A                               C5410 Device Simulator
TMS320C5421                                C5420 Device Simulator
TMS320C5440                                C5420 Device Simulator
TMS320C5441                                C5420 Device Simulator
Supported Hardware Resources
The following sections provide a concise overview of the supported hardware resources for each of the
simulator configurations.
CPU
The C54x CPU core simulated is the same for all the supported devices. It includes simulation of the full
instruction set architecture, including FAR and IDLE instructions. Pipeline latency warning messages are
displayed if the pipeline latency feature is enabled through the Board Properties section of Code
Composer Studio Setup.
1.4.2 Memory
The simulator provides configurable memory simulation. For each device, the simulator provides all
on-chip memory blocks of that device. It also, by default, provides external memory blocks in data space,
program space for devices without extended program pages, and up to the first program page that does
not have any on-chip program memory for devices with extended program pages. By default, the simulator
does not provide external memory in I/O space; however, the rest of the external memory blocks in
program and I/O space can be simulated by adding memory blocks using the Memory Map Add feature or
by changing the appropriate configuration (.cfg) file.
The simulator provides cycle-accurate simulation for both on-chip and external memory blocks. For
example, if there are two simultaneous accesses (PB access and DB access) to the same on-chip
memory block, the simulator will take no extra cycle if the memory block is dual-access memory, but will
take one extra cycle if the memory is single-access memory. For external memory, the simulator provides
wait-state support (required for the slow external memory interface) according to the number of wait states
being set by the SWWSR, SWCR, and BSCR registers. The simulator also supports XIO2 as applicable to
some of the devices.
3.1 Detailed Capabilities of Individual Configurations
The capabilities and known limitations of the simulator configurations are described below.
3.1.1 Peripherals
Modeled peripherals vary by device. The differences are detailed in the following sections.
3.1.1.1 Timer
(Timer0, Timer1)
Timer0 is supported on all devices
Timer1 is supported only on the C5402. It is not supported on the other devices (C5403, C5404,
C5406, and C5407), which have Timer1 on real hardware.
3.1.1.2 Serial Ports
(SP0, SP1, BSP0, BSP1, TDM, McBSP0, McBSP1, McBSP2)
The C54x simulator supports simple serial port transmission and reception by reading data from, and
writing data to, the files associated (port-connected) with the Data Transmit Register and Data Receive
Register, respectively.
The simulator provides limited support for the simulation of the serial port control pins (frame
synchronization pins) with the help of external event simulation capability. Frame synchronization
signals for receive and transmit operations at various instants of time are fed through the files
associated with the pins.
There are differences between the C5401 and C5402 McBSPs in real hardware, but in the C54x
simulator, the McBSPs of both devices have the functionality of the C5402 McBSP real hardware.
The McBSPs in the C5403, C5404, C5406, C5407 and C5410 simulators are equivalent to the C5410
McBSPs in real hardware.
McBSPs in other devices are equivalent to their respective hardware.
3.1.1.3 DMA
The DMA in the C5401 and C5402 simulators are equivalent to the C5410 DMA in real hardware.
The DMA in the C5403, C5404, C5406, C5407 and C5410 simulators are equivalent to the C5410
DMA in real hardware.
In the C5416 simulator the DMA supports extended registers but does not support extended I/O and
data pages in DMA memory map.
DMAs in other devices are equivalent to respective hardware.
3.1.1.4 FIFO
(in C5420)
In the C5420 simulator, only one CPU subsystem is supported. One end of the FIFO is connected to
DMA and the other end is connected to host files Rfifo.dat and Wfifo.dat in the current working
directory. Any attempt to write from one DMA to another DMA through FIFO will write into Wfifo.dat.
Conversely, any read from another DMA through FIFO will read from Rfifo.dat. To read through
FIFO, the user should create the file Rfifo.dat just before setting up the DMA. The names of
these two files cannot be changed.
Configuration Specifics, SPRU598B, July 2002, Revised October 2003, www.ti.com
3.1.1.5 Host Port Interface
(HPI, HPI8, HPI16)
This simulation is performed using two files associated with HPI. The first file specifies the value of the
control signals and the corresponding address and data values. This file is associated (pin-connected)
with the HPI pin. (See Section 2.1.1, Pin Connect, for file format details.)
The other file is for storing the output values. The outputs generated by the HPI simulation are stored in
the output file named "hpi.out" in the current working directory. The name of this file cannot be changed.
3.1.2 IDLE Behavior in the Simulator
Under all configurations, the C54x simulator provides only one kind of idle functionality for all IDLE
instructions. When the simulator is in IDLE, the following occur:
The CPU in the simulator does not do any activity.
All peripherals inside the simulator are given clocks and, hence, can continue their activities.
Either of the following occurrences will bring the simulator out of IDLE:
As a part of the activities under the IDLE mode, any one of the peripherals generates an interrupt.
Using the Pin Connect feature, the user sends an interrupt.
3.2 Configuring the Simulator
The simulator uses configuration files to provide the user the ability to configure the simulator. There is a
default configuration file in the drivers subdirectory in the Code Composer Studio installation directory
corresponding to each configuration listed in Table 1-1. The exact path of the configuration file can be
obtained from the Board Properties window as shown in Figure 2-1.
A small part of the default configuration file used for C549 configuration is shown below:
MODULE C54X;
CHIP C549; //Processor Number
MODULE C549;
// Template for defining blocks of memory
// MEMORY BLOCK_NAME;
// START < STARTING ADDRESS >;
// LENGTH < LENGTH OF BLOCK >;
// PAGE < IO = 2, DATA = 1, PROG = 0>;
// TYPE < DARAM/SARAM/ROM/WOM/RAM/EXRAM >;
// END BLOCK_NAME;
MEMORY MEM0;
START 0x0000;
LENGTH 0x0800;
PAGE 1;
TYPE DARAM;
END MEM0;
...
...
END C549;
END C54X;
3.2.1 Creating a Memory Map
The default configuration files have memory maps for all internal memories and some of the external
memories. Changes to internal memories can cause undefined behavior of the simulator. The user can
only configure the external memory of a particular configuration. For example, the current default config
file for the C549 configuration has the external memory maps up to extended program page three. The
user can add another external program page as follows:

MEMORY MEM27;
START 0x40000;
LENGTH 0x8000;
PAGE 0;
TYPE EXRAM;
END MEM27;
MEMORY MEM28;
START 0x48000;
LENGTH 0x8000;
PAGE 0;
TYPE EXRAM;
END MEM28;
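Before adding blocks such as MEM27 and MEM28, it can help to check that the new address ranges do not collide with each other on the same page. The sketch below is not part of Code Composer Studio. It is a hypothetical helper that assumes the MEMORY ... END layout shown above, with START and LENGTH written in hexadecimal; the names `parse_blocks` and `overlaps` are illustrative.

```python
import re

# One MEMORY ... END block in the layout shown in the configuration excerpt.
BLOCK = re.compile(
    r"MEMORY\s+(\w+)\s*;\s*START\s+(\w+)\s*;\s*LENGTH\s+(\w+)\s*;"
    r"\s*PAGE\s+(\d+)\s*;\s*TYPE\s+(\w+)\s*;\s*END\s+\1\s*;"
)


def parse_blocks(text):
    """Return one dict per MEMORY block, ignoring // comments."""
    text = re.sub(r"//[^\n]*", "", text)
    blocks = []
    for name, start, length, page, kind in BLOCK.findall(text):
        first = int(start, 16)
        blocks.append({
            "name": name,
            "page": int(page),      # 0 = PROG, 1 = DATA, 2 = IO
            "type": kind,
            "first": first,
            "last": first + int(length, 16) - 1,
        })
    return blocks


def overlaps(blocks):
    """Return pairs of block names whose address ranges overlap on the same page."""
    clashes = []
    for i, a in enumerate(blocks):
        for b in blocks[i + 1:]:
            same_page = a["page"] == b["page"]
            if same_page and a["first"] <= b["last"] and b["first"] <= a["last"]:
                clashes.append((a["name"], b["name"]))
    return clashes


SAMPLE = """
MEMORY MEM27;
START 0x40000;
LENGTH 0x8000;
PAGE 0;
TYPE EXRAM;
END MEM27;
MEMORY MEM28;
START 0x48000;
LENGTH 0x8000;
PAGE 0;
TYPE EXRAM;
END MEM28;
"""

if __name__ == "__main__":
    blocks = parse_blocks(SAMPLE)
    for block in blocks:
        print(block["name"], hex(block["first"]), hex(block["last"]))
    print("overlaps:", overlaps(blocks))
```

For the two blocks above, MEM27 ends at 0x47FFF and MEM28 starts at 0x48000, so no overlap is reported.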
3.3 Performance Numbers
Table 3-1 shows the performance numbers of the simulator for different device configurations. These
numbers were gathered on a 1.7 GHz Intel Pentium 4 PC with 256 MB of RAM. The application used
for measurement in all three cases is the Reed-Solomon encoding and decoding application from a
standard benchmarking suite.
Table 3-1. Performance Numbers of the C54x Simulator
SIMULATOR CONFIGURATION    SIMULATOR SPEED (IN KILOCYCLES/SECOND)
C541-C546                  1216
C548-C549                  1012
C5410-C5420                751
1.4.3 Peripherals
Table 1-3 shows the on-chip peripherals supported under each simulator configuration. For more
information on these peripherals, see the TMS320C54x DSP CPU and Peripherals Reference Guide,
Volume 1 (literature number SPRU131) and the TMS320C54x DSP Enhanced Peripherals Reference
Guide, Volume 5 (literature number SPRU302).

Table 1-3. On-Chip Peripherals Supported by the C54x Simulator
SIMULATOR        PERIPHERALS SUPPORTED
CONFIGURATION    TIMERS           SERIAL PORTS              DMA CONTROLLER    HPI      FIFO
C541             Timer0           SP0 SP1                   -                 -        -
C542             Timer0           BSP0 TDM                  -                 HPI      -
C543             Timer0           BSP0 TDM                  -                 -        -
C545             Timer0           BSP0 SP1                  -                 HPI      -
C546             Timer0           BSP0 SP1                  -                 -        -
C548             Timer0           BSP0 TDM BSP1             -                 HPI      -
C549             Timer0           BSP0 TDM BSP1             -                 -        -
C5401            Timer0           McBSP0 McBSP1             XDMA              HPI8     -
C5402            Timer0 Timer1    McBSP0 McBSP1             XDMA              HPI8     -
C5403            Timer0           McBSP0 McBSP1 McBSP2      XDMA              HPI8     -
C5404            Timer0           McBSP0 McBSP1 McBSP2      XDMA              HPI8     -
C5406            Timer0           McBSP0 McBSP1 McBSP2      XDMA              HPI8     -
C5407            Timer0           McBSP0 McBSP1 McBSP2      XDMA              HPI8     -
C5409            Timer0           McBSP0 McBSP1 McBSP2      XDMA              HPI8     -
C5410            Timer0           McBSP0 McBSP1 McBSP2      XDMA              HPI8     -
C5416            Timer0           McBSP0 McBSP1 McBSP2      XDMA              HPI16    -
C5420            Timer0           McBSP0 McBSP1 McBSP2      XDMA              HPI16    FIFO
2.1 External Event & Data Simulation
The simulators simulate the hardware inside a particular DSP device, whereas in real hardware the DSP
interacts with many other external entities. The interactions between the simulator and these external
entities fall into the following two categories:
Control Signals - These signals trigger activities in the simulator (e.g. interrupts, serial port clocks,
serial port synchronization events, etc.)
Data Values - These are part of an interaction between the simulator and an external entity (e.g. read
and write to peripheral registers as a part of I/O memory, serial port data, etc.)
For example, in an audio device the serial port of the DSP is connected to A/D and D/A converters or to a
codec. The interaction between the DSP and the audio device happens through transfer of a
synchronization signal to start a sample, as well as the sample data itself. Here the synchronization signal
falls into the control signals category and the sample data falls into the data values category.
The simulator provides two features, namely Pin Connect and Port Connect, for the simulation of these
two types of interactions, respectively.
2.1.1 Pin Connect
The Pin Connect tool allows the user to simulate events from the external world.
Generally, control signals from the external entities to the simulator are of most interest to the user. Pin
Connect provides a generic way to simulate the control signals from the external entities to the simulator.
In these cases, only the control signals and the time at which the signal must be triggered are important.
The simulator provides the user with a list of pins corresponding to different control signals. The user
must specify all the clock values at which events are to be triggered on this pin using a special format in
a file. The list of the supported pins and the file format are discussed in the following sections. The user
must then connect this file to the pin using the command window, GEL commands, or the Pin Connect
plug-in.
The C54x simulator provides the Pin Connect feature for all processor configurations. The pins supported
for different processor configurations are shown in Table 2-1.

Table 2-1. Pins Supported by the Pin Connect Feature of the C54x Simulator
INTERRUPTS (PULSE TYPE): INT0 INT1 INT2 INT3
BIO (1): BIO
CONFIG    SERIAL PORT RELATED PINS (PULSE W/ CLOCK RATIO SPECIFIED) (2)                          HPI PIN
C541      FSR0 FSX0 FSR1 FSX1                                                                    -
C542      BFSR BFSX TFSR TFSX                                                                    HPI
C543      BFSR BFSX TFSR TFSX                                                                    -
C545      BFSR BFSX FSR FSX                                                                      HPI
C546      BFSR BFSX FSR FSX                                                                      -
C548      BFSR0 BFSX0 - TFSR TFSX - BFSR1 BFSX1 -                                                HPI
C549      BFSR0 BFSX0 - TFSR TFSX - BFSR1 BFSX1 -                                                -
C5401     McBFSR0 McBFSX0 - McBFSR1 McBFSX1 - McBFSR2 McBFSX2 -                                  HPI
C5402     McBFSR0 McBFSX0 - McBFSR1 McBFSX1 - McBFSR2 McBFSX2 -                                  HPI
C5403     McBFSR0 McBFSX0 CLKS0 McBFSR1 McBFSX1 CLKS1 McBFSR2 McBFSX2 CLKS2                      HPI
C5404     McBFSR0 McBFSX0 CLKS0 McBFSR1 McBFSX1 CLKS1 McBFSR2 McBFSX2 CLKS2                      HPI
C5406     McBFSR0 McBFSX0 CLKS0 McBFSR1 McBFSX1 CLKS1 McBFSR2 McBFSX2 CLKS2                      HPI
C5407     McBFSR0 McBFSX0 CLKS0 McBFSR1 McBFSX1 CLKS1 McBFSR2 McBFSX2 CLKS2                      HPI
C5409     McBFSR0 McBFSX0 CLKS0 McBFSR1 McBFSX1 CLKS1 McBFSR2 McBFSX2 CLKS2                      HPI
C5410     McBFSR0 McBFSX0 CLKS0 McBFSR1 McBFSX1 CLKS1 McBFSR2 McBFSX2 CLKS2                      HPI
C5416     McBFSR0 McBFSX0 - McBFSR1 McBFSX1 - McBFSR2 McBFSX2 -                                  HPI
C5420     McBFSR0 McBFSX0 CLKS0 McBFSR1 McBFSX1 CLKS1 McBFSR2 McBFSX2 CLKS2                      HPI
(1) BIO (waveform type)
(2) In the case of serial ports, the CLKX pin functionality is combined with the functionality of the
corresponding FSX pin, and only one logical pin (FSX) is provided. The same is true of the CLKR pin. In
the pin-connect file of these FSX and FSR pins, information regarding CLKX (or CLKR), such as the
clock ratio with respect to CLOCKOUT, is provided in addition to the information regarding external
frame synchronization events.
The Pin Connect file can have one or more statements of the following formats for the different types of
pins supported:
Pulse type
Waveform type
Pulse with clock ratio specified type
HPI type
2.1.1.1 Pulse Type
clock-cycle [ rpt { n | EOS } ]
The clock-cycle parameter represents the CPU clock cycle in which the interrupt should occur. The clock
value can be specified in decimal or hexadecimal format (using the 0x or 0X prefix format or the h or H
suffix format).
There can be two types of CPU clock cycles:
Absolute. The clock-cycle value must represent the actual CPU clock cycle in which the interrupt
should occur.
For example:
12 34 56
Interrupts are simulated at the 12th, 34th, and 56th CPU clock cycles.
Relative. The clock-cycle value is relative to the time at which the last event occurred.
For example:
12 +34 55

Three interrupts are simulated: at the 12th, 46th (12 + 34), and 55th CPU clock cycles. A plus sign (+)
before a clock cycle adds that value to the total clock cycles preceding it. Both relative and absolute
values can be mixed in the input file, as shown.
The rpt { n | EOS } parameter is optional and represents a repetition value.
Two forms of repetition in simulating interrupts can be used:
Repeat a fixed number of times. Repeat a particular pattern a fixed number (n) of times.
For example:
5 (+10 +20) rpt 2
The values inside the parentheses represent the portion that is repeated. Therefore, an interrupt is
simulated at the 5th, 15th (5 + 10), 35th (15 + 20), 45th (35 + 10), and 65th (45 + 20) CPU clock
cycles. The parameter n is a positive integer value.
Repeat to the end of simulation. To repeat the same pattern throughout the simulation, the string
EOS should be added to the line.
10 (+5 +20) rpt EOS
Interrupts are simulated at the 10th, 15th (10+5), 35th (15 + 20), 40th (35 + 5), 60th (40 + 20), 65th (60
+ 5), and 85th (65 + 20) CPU cycles, continuing in that pattern until the end of simulation.
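The absolute, relative, and repetition rules above can be sketched as a small expander that turns one pulse-type line into the list of CPU clock cycles at which interrupts fire. This is an illustrative reading of the format, not the simulator's parser: the function names and the `end_cycle` argument (standing in for the end of simulation) are assumptions, and only the parenthesized form of `rpt` used in the examples is handled.

```python
import re


def parse_cycle(token):
    """Parse decimal, 0x-prefixed hex, or h-suffixed hex cycle values."""
    t = token.lower()
    if t.startswith("0x"):
        return int(t, 16)
    if t.endswith("h"):
        return int(t[:-1], 16)
    return int(t)


def step(token, now):
    """Apply one entry: '+n' is relative to the last event, 'n' is absolute."""
    if token.startswith("+"):
        return now + parse_cycle(token[1:])
    return parse_cycle(token)


def expand_pulses(line, end_cycle):
    """Return the event cycles described by one pulse-type line, up to end_cycle."""
    tokens = re.findall(r"\(|\)|[+\w]+", line)
    cycles, now, i = [], 0, 0
    while i < len(tokens):
        if tokens[i] == "(":
            close = tokens.index(")", i)
            group = tokens[i + 1:close]
            i = close + 1
            repeat = 1
            if i < len(tokens) and tokens[i].lower() == "rpt":
                arg = tokens[i + 1]
                repeat = None if arg.upper() == "EOS" else int(arg)
                i += 2
            done = 0
            while repeat is None or done < repeat:
                start = now
                for token in group:
                    now = step(token, now)
                    if now > end_cycle:
                        return cycles
                    cycles.append(now)
                done += 1
                if repeat is None and now <= start:
                    break  # pattern does not advance; avoid an endless loop
        else:
            now = step(tokens[i], now)
            if now <= end_cycle:
                cycles.append(now)
            i += 1
    return cycles


if __name__ == "__main__":
    print(expand_pulses("12 +34 55", 1000))
    print(expand_pulses("5 (+10 +20) rpt 2", 1000))
    print(expand_pulses("10 (+5 +20) rpt EOS", 85))
```

Run on the three example lines from the text, the sketch reproduces the cycle lists given there: 12, 46, 55; then 5, 15, 35, 45, 65; then 10, 15, 35, 40, 60, 65, 85 when the simulation is taken to end at cycle 85.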
2.1.1.2 Waveform Type
[clock-cycle, logic-value] [ rpt { n | EOS } ]
The square brackets ([ ]) are required to encapsulate clock-cycle and its corresponding logic-value.
The logic-value is valid only for the BIO pin and this is the only difference between pulse type and
waveform type. The signal can be forced to go high or low at specified clock cycles. A value of 1 forces
the signal to go high, and a value of 0 forces the signal to go low. For example:
[12,1] [23,0] [45,1]
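As a sketch of how such a waveform line could be interpreted, the following reads the bracketed pairs and reports the BIO level at a given cycle. The initial level before the first entry, the assumption that a change takes effect in the stated cycle, and the function names are all assumptions made for illustration; the source does not specify them. Only decimal cycle values are handled.

```python
import re


def parse_waveform(line):
    """Return (clock_cycle, logic_value) pairs from a waveform-type line."""
    pairs = re.findall(r"\[\s*(\d+)\s*,\s*([01])\s*\]", line)
    return [(int(cycle), int(level)) for cycle, level in pairs]


def level_at(events, cycle, initial=0):
    """Logic level at `cycle`; `initial` is the assumed level before any event."""
    level = initial
    for event_cycle, value in sorted(events):
        if event_cycle <= cycle:
            level = value
    return level


if __name__ == "__main__":
    events = parse_waveform("[12,1] [23,0] [45,1]")
    for cycle in (11, 12, 30, 45):
        print(cycle, level_at(events, cycle))
```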
2.1.1.3 Pulse with Clock Ratio Specified Type
DIVIDE r
clock-cycle [ rpt { n | EOS } ]
The difference between this format and the pulse type format (see Section 2.1.1.1) is the addition of the
DIVIDE command to specify the divide-down ratio for the device clock. The parameter r is a real number
or integer specifying the ratio of the CPU clock rate to the serial port clock rate. The divide ratio is used
when the serial port is configured to use the external clock.
When the DIVIDE command is used, it must be the first command in the file. The following example
specifies the clock ratio of the transmit clock and the clock cycles for the occurrence of TFSX pulses (if
this file is connected to the TFSX pin):
DIVIDE 5
100 +200 +100
The DIVIDE command specifies the divide-down ratio of the clock against the CPU clock. That is, the
CLKX frequency is 1/5 of the CPU clock. The second line indicates that the TFSX should go high at the
100th, 300th (100 + 200) and 400th (300 + 100) cycles of the serial port clock. Because each serial port
clock cycle spans five CPU clock cycles, the TFSX pin goes high in the 500th, 1500th, and 2000th cycles
of the CPU clock.

2.1.1.4 HPI Type


clock-cycle hpi-function-specification [ rpt { n | EOS } ]:
A colon is required as a delimiter after every statement, as shown in the syntax.
The hpi-function-specification parameter specifies the type of Host Port Interface (HPI) operation to be
performed. Table 2-2 describes the different formats of the hpi-function-specification depending upon the
type of HPI.
Table 2-2. hpi-function-specification Parameter Values
FORMAT                             HPI TYPE           DESCRIPTION
CTL_READ LSB                       HPI, HPI8          Control register read; byte is determined by MSB or LSB, as HPI/HPI8 can handle only one byte at a time
CTL_READ MSB
CTL_READ                           HPI16              Control register read; HPI16 can handle one word at a time
CTL_WRITE data LSB                 HPI, HPI8          Control register write; data specifies the write value
CTL_WRITE data MSB
CTL_WRITE data                     HPI16              Control register write; data specifies the write value
DATA_READ LSB ++                   HPI, HPI8          Data read with HPIA increment
DATA_READ MSB ++
DATA_READ LSB                      HPI, HPI8          Data read without HPIA increment
DATA_READ MSB
DATA_READ ++                       HPI16              Data read with HPIA increment
DATA_READ                          HPI16              Data read without HPIA increment
DATA_WRITE data ++ LSB             HPI, HPI8          Data write with HPIA increment; data specifies the write value
DATA_WRITE data ++ MSB
DATA_WRITE data LSB                HPI, HPI8          Data write without HPIA increment; data specifies the write value
DATA_WRITE data MSB
DATA_WRITE data ++                 HPI16              Data write with HPIA increment; data specifies the write value
DATA_WRITE data                    HPI16              Data write without HPIA increment; data specifies the write value
LOAD file-name                     HPI, HPI8, HPI16   Initialize HPI RAM with data from file-name
HPIA_WRITE address-as-data LSB     HPI, HPI8          HPIA register write; byte is determined by MSB or LSB, as HPI/HPI8 can handle only one byte at a time; address-as-data specifies the write value
HPIA_WRITE address-as-data MSB
HPIA_WRITE address-as-data EXB     HPI8               HPIA register write; EXB specifies that this is the extended seven (16..22) bits of address
HPIA_WRITE address-as-data         HPI16              HPIA register write; HPI16 can handle one word at a time; XHPIA bit in HPIC determines whether it is lower or extended bits
See Section 2.1.1.1 for details on the clock-cycle and rpt parameters.
For more information on how to use the Pin Connect feature from the Command window, GEL files, or
the Port Connect plug-in, please see the Code Composer Studio IDE online help file.
2.1.2 Port Connect
The Port Connect feature allows the user to simulate a data transfer between the DSP and an external
entity which is present in the real hardware.
The transfer of data between the DSP and an external entity (which sits at some particular address in the
memory space of the DSP) can happen in the following two ways:
Data from an external entity to the DSP
Data from the DSP to an external entity
To simulate a data transfer from an external entity to the DSP, first all data values which will be
transferred from the external entity to the DSP are put into a file (the details of the file format are in
Section 2.1.2.1). Then an association is made in the simulator between this file and the address at which
the external entity sits. This association is called read-mode port-connect. Whenever the simulator must
read from the external entity through the associated address, it reads from the file one word at a time. To
simulate the data transfer from the DSP to an external entity, a file is port-connected in write mode against
the address of the external entity. Whenever the simulator has to write data to that address, it writes in the
file one word at a time.
Although only one bit of a sample at a time transfers through the serial line when HPI or HPI8 is specified
(see Section 2.1.1.4), samples can be encapsulated in words as the serial port reads from the Serial Port
Receive Register (DRR) or writes to the Serial Port Transmit Register (DXR). All the samples that are to
be sent to the simulator are written in a file, and this file is port-connected in read mode against the
memory address corresponding to the DRR. Whenever the serial port of the simulator has to read one
word from the DRR as part of the receive operation, it reads the word from the file. Similarly, to capture the
data that the serial port transmits through the serial line, a file is port-connected in write mode against the
memory address corresponding to the DXR. Whenever the serial port of the simulator writes one word to the DXR
as a part of the transmit operation, it also writes the word to the file.
The C54x simulator provides the Port Connect feature for all processor configurations. Although Port
Connect is primarily meant for the I/O memory and serial port registers, the same concept can be applied
to program and data memory as well. The C54x simulators support Port Connect to the following memory
locations:
All I/O memory locations
All data and program memories, except the reserved and memory mapped register (MMR) locations
Only the serial-port receive and transmit register locations among the MMR locations. (See the data
sheet corresponding to the particular configuration to get the address of the serial port receive and
transmit registers.)
Before using this feature, make sure the address being connected to is already mapped. This is important
in case the address used is in I/O or higher extended external pages, as the simulator by default does not
include those addresses in its memory map.

5. Discuss in detail about pipeline latency warning.


2.2 Pipeline Latency Warning
2.1.2.1 Port Connect File Format
The Port Connect file contains one or more lines. Each line contains fewer than 80 characters and represents
one data value. The data is specified in hex format without any 0x prefix or h suffix.
The following example is a sample Port Connect file:
6666
9999
aaaa
cccc
7f7f
Port Connect File Format Notes:
The first value is taken as 6666 hexadecimal (0x6666), not 6666 decimal.


If a single file is used in read mode for a range of addresses, one line (hence, one datum) is read from
the file and the file pointer is advanced to the next line for any address within the range. For example,
if the example file is used for a range 0x2000-0x2004 and the read access is made to the addresses
0x2000, 0x2001, 0x2000, 0x2000, 0x2003 (in that order) during a simulation session, the values will go
to the addresses in the order - 0x6666 to 0x2000, 0x9999 to 0x2001, 0xaaaa to 0x2000, 0xcccc to
0x2000 and 0x7f7f to 0x2003. Similarly, in write mode, when there is a write to any address in the
range, one line containing that data is written in the file.
If a Port Connect file is used in read mode with the no-rewind attribute, for any access made to the
address after end-of-file is reached, the value 0xFFFF is read and the file pointer is kept unchanged.
Otherwise, the file pointer is rewound to the beginning and then one datum is read. For example, if the
example file is used for address 0x2000 in read mode, at the time of the sixth read access to that
address, the value 0xFFFF is read if no-rewind attribute is set. Otherwise, 0x6666 is read.
For information on how to use Port Connect features from the Command window, GEL files, or the Port
Connect plug-in, refer to the Code Composer Studio IDE online help.
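The read-mode behavior described in the Port Connect file format notes can be illustrated with a small Python model. This is only a sketch of the rules stated above, not the simulator's own code; the class name PortConnectReader and the function demo are hypothetical names chosen for illustration.

```python
class PortConnectReader:
    """Toy model of a read-mode Port Connect file (illustrative only)."""

    def __init__(self, text, rewind=True):
        # One hex value per line, without a 0x prefix or h suffix.
        self.values = [int(line, 16) for line in text.split()]
        if not self.values:
            raise ValueError("Port Connect file must contain at least one value")
        self.rewind = rewind
        self.pos = 0

    def read(self, address):
        # The address is not used to pick a value: any read in the
        # connected range consumes the next line of the file.
        if self.pos >= len(self.values):
            if not self.rewind:
                return 0xFFFF      # end of file reached, pointer unchanged
            self.pos = 0           # rewind to the beginning of the file
        value = self.values[self.pos]
        self.pos += 1
        return value


SAMPLE = "6666\n9999\naaaa\ncccc\n7f7f\n"


def demo():
    accesses = [0x2000, 0x2001, 0x2000, 0x2000, 0x2003]
    no_rewind = PortConnectReader(SAMPLE, rewind=False)
    first_five = [no_rewind.read(a) for a in accesses]
    sixth_no_rewind = no_rewind.read(0x2000)

    rewinding = PortConnectReader(SAMPLE, rewind=True)
    for a in accesses:
        rewinding.read(a)
    sixth_rewind = rewinding.read(0x2000)
    return first_five, sixth_no_rewind, sixth_rewind
```

Calling demo() replays the access sequence from the notes above: the five values are handed out in file order regardless of the address, and the sixth read depends only on the no-rewind attribute.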
Since the C54x pipeline is unprotected, an assembly programmer must be very careful when using multiple
instructions that access CPU resources at the same time (the C compiler handles the issue for C
programmers). In particular, the following sentences in the CPU and peripherals user's guide point to
potential problems: "Some of these pipeline conflicts are resolved automatically by the C54x by delaying
accesses. Other conflicts are unprotected and must be resolved by the programmer."
The simulator provides assistance for the detection of pipeline conflicts on the C54x. It gives warning
messages whenever pipeline conflicts occur during the execution of an application's assembly code.
Although the C54x assembler provides similar features for the detection of pipeline conflicts, the
differences between this feature in the simulator and that provided in the assembler are as follows:
Since the simulator does run-time analysis, it can tell when pipeline conflicts occur - as opposed to the
assembler, which must report potential pipeline conflicts at the time of assembling the code.
The simulator can only detect pipeline conflicts in code that is actually executed in the simulator. The
assembler can report potential pipeline conflicts in all the code it processes, with the exception of code
surrounding branches. The assembler detects pipeline conflicts in straight-line code only.
The potential pipeline conflicts detected by the assembler remain the same as long as the assembly
code does not change, but the pipeline conflicts detected by the simulator may change from execution
to execution depending on the input data, interrupts, and the dynamics of the real-time execution.
The following pipeline latencies are handled in the simulator:
DAGEN registers latency
ARx latency
BK latency
Stack pointer latency
As an offset in direct addressing (CPL = 1)
Stack operation (CPL = 0)
TREG latency
PMST latency
OVLY/MPMC/DROM/IPTR latency
Status registers (ST0/ST1) latency
ARP/CMPT/CPL/DP/SXM/ASM/BRAF latency
BRC register latency
MMR access of accumulators latency


To use the pipeline latency warning feature in the simulator, the pipeline latency warning option must be
enabled during setup using the following procedure:
1. Open Code Composer Studio Setup. Import the configuration to be used.
2. Select C54x Simulator under My System in the System Configuration pane.
3. Right-click and open the properties window.
4. Go to the Board Properties tab.
5. Select Enabled from the list box corresponding to Display Pipeline Warnings. (See Figure 2-1)
6. Save this setup and start Code Composer Studio IDE.
Figure 2-2. Pipeline Latency Message
2.3 Simulation Pipeline Modes
All C54x simulators support two modes of execution, called Pipeline Flush mode and Pipelined mode.
In Pipeline Flush mode, when the simulation is halted, the execution pipeline is flushed. The watch
window shows the correct values of local variables in this mode. Profile data may be skewed because
of flushing the pipeline whenever the simulation halts.
The PC (Program Counter) corresponds to the address of the instruction that is about to enter the
Decode phase of the pipeline.
In Pipelined mode, when the simulation is halted, the execution pipeline is not flushed. Pipeline
behavior is accurate in this mode. This mode provides accurate profile data. The watch window display
may not show the correct values of local variables.
The PC corresponds to the address of the instruction that is about to enter the Execute phase of the
pipeline.
Pipeline Flush mode is the default simulation mode.
2.3.1 Switching Between the Modes
The mode can be changed using either of the following mechanisms:
From the Code Composer Studio IDE Debug Menu, by enabling/disabling Flush Pipeline on Halt.
Using the GEL command GEL_SimSetMode(Arg)
Arg value 0 sets the mode to Pipeline Flush mode.
Arg value 1 sets the mode to Pipelined mode.
2.3.2 Effects of Mode Switching
From Pipeline Flush mode to Pipelined mode:
Simulation advances the pipeline to bring the PC to the execute phase.
After mode switching, the PC points to the instruction in the execute phase of the pipeline.
From Pipelined mode to Pipeline Flush mode:
Simulation flushes the pipeline.
After mode switching, the PC points to the instruction in the decode phase of the pipeline.
Breakpoints that might have been set at instructions which are being flushed out of the pipeline are
ignored.
2.3.3 Recommendations on Simulation Mode
The table below summarizes the simulation mode recommendations for various features:
Pipelined Mode: use this mode for the following features:
Profiler
Simulator analysis
Pipeline Flush Mode: use this mode for the following features:
Watch window
Register window
Memory window
2.3.4 Related Warnings
The following warning message will be displayed from the watch window when the simulator is in
Pipelined Mode:
Watch Window: (Warning) Pipeline flush mode is OFF. Values may be
inaccurate due to pipeline stage delays. To ensure accurate reporting, turn on
"Flush Pipeline on Halt" from Debug menu.
The following warning message will be displayed from Pipeline Stall Analyzer when the simulator is in
Pipeline Flush Mode.
Simulator is currently in Pipeline Flush Mode. Pipeline Flush Mode
does not support Pipeline Stall Analyzer Tool. To use Pipeline Stall Analyzer,
turn off "Flush Pipeline on Halt" from Debug menu.
2.4 Simulator Analysis
The C54x analysis module gives a detailed look into events occurring in the hardware, expanding
debugging capabilities beyond software breakpoints. The analysis module examines C54x bus cycle
information in real time and reacts to this information through actions, such as data breakpoints. The
simulator provides monitoring of the following types of accesses during a debugging session on all
processor configurations:
Read access on program or data memory
Write access on program or data memory
Read and write access on program or data memory
Read and/or write access for a particular data value
Read and/or write access for a particular data pattern
Instruction fetch on the program bus
Data breakpoint between two instruction addresses
For more information, see the Code Composer Studio C54x Simulator Analysis plug-in online help.
2.5 RTDX
Real-Time Data Exchange (RTDX) is supported when running inside the simulator. To run RTDX inside
the simulator, the user must link applications with the RTDX Simulator Target library. All the simulator
device configurations support RTDX. RTDX support includes both host-target and target-host
communication.
2.6 DSP/BIOS
All applications using DSP/BIOS can be run on all the C54x simulators. To enable Real Time
Analysis (RTA) for these applications, the user needs to ensure that the RTDX Mode in the configuration
is set to simulator. Please refer to the Code Composer Studio online documentation on DSP/BIOS for
more details on RTA and how to configure the simulator target.
3.1 Detailed Capabilities of Individual Configurations
The capabilities and known limitations of the simulator configurations are described below.
3.1.1 Peripherals
Modeled peripherals vary by device. The differences are detailed in the following sections.
3.1.1.1 Timer
(Timer0, Timer1)
Timer0 is supported on all devices


Timer1 is supported only on the C5402. It is not supported on the other devices (C5403, C5404,
C5406, and C5407), which have Timer1 on real hardware.
3.1.1.2 Serial Ports
(SP0, SP1, BSP0, BSP1, TDM, McBSP0, McBSP1, McBSP2)
The C54x simulator supports simple serial port transmission and reception by reading data from, and
writing data to, the files associated (port-connected) with the Data Transmit Register and Data Receive
Register, respectively.
The simulator provides limited support for the simulation of the serial port control pins (frame
synchronization pins) with the help of external event simulation capability. Frame synchronization
signals for receive and transmit operations at various instants of time are fed through the files
associated with the pins. (See Section 2.1.1, Pin Connect, for file format details.)
There are differences between the C5401 and C5402 McBSPs in real hardware, but in the C54x
simulator, the McBSPs of both devices have the functionality of the C5402 McBSP real hardware.
The McBSPs in the C5403, C5404, C5406, C5407 and C5410 simulators are equivalent to the C5410
McBSPs in real hardware.
McBSPs in other devices are equivalent to their respective hardware.
3.1.1.3 DMA
The DMAs in the C5401 and C5402 simulators are equivalent to the C5410 DMA in real hardware.
The DMAs in the C5403, C5404, C5406, C5407 and C5410 simulators are equivalent to the C5410
DMA in real hardware.
In the C5416 simulator the DMA supports extended registers but does not support extended I/O and
data pages in DMA memory map.
DMAs in other devices are equivalent to respective hardware.
3.1.1.4 FIFO
(in C5420)
In the C5420 simulator, only one CPU subsystem is supported. One end of the FIFO is connected to
DMA and the other end is connected to host files Rfifo.dat and Wfifo.dat in the current working
directory. Any attempt to write from one DMA to another DMA through FIFO will write in to Wfifo.dat.
Conversely, any read from another DMA through FIFO will read from Rfifo.dat. To read through
the FIFO, the user should create the file Rfifo.dat just before setting up the DMA. The names of
these two files cannot be changed.
3.1.1.5 Host Port Interface
(HPI, HPI8, HPI16)
This simulation is performed using two files associated with the HPI. The first file specifies the values of the
control signals and the corresponding address and data values. This file is associated (pin-connected)
with the HPI pin. (See Section 2.1.1, Pin Connect, for file format details.)
The other file stores the output values. The outputs generated by the HPI simulation are stored in
the output file named "hpi.out" in the current working directory. The name of this file cannot be changed.
3.1.2 IDLE Behavior in the Simulator
Under all configurations, the C54x simulator provides only one kind of idle functionality for all IDLE
instructions. When the simulator is in IDLE, the following occur:
The CPU in the simulator does not perform any activity.
All peripherals inside the simulator are given clocks and, hence, can continue their activities.
Either of the following occurrences will bring the simulator out of IDLE:
As a part of the activities under the IDLE mode, any one of the peripherals generates an interrupt.
Using the Pin Connect feature, the user sends an interrupt.
3.2 Configuring the Simulator
The simulator uses configuration files to let the user configure the simulator. There is a
default configuration file in the drivers subdirectory in the Code Composer Studio installation directory
corresponding to each configuration listed in Table 1-1. The exact path of the configuration file can be
obtained from the Board Properties window as shown in Figure 2-1.
A small part of the default configuration file used for the C549 configuration is shown below:
MODULE C54X;
CHIP C549; //Processor Number
MODULE C549;
// Template for defining blocks of memory
// MEMORY BLOCK_NAME;
// START < STARTING ADDRESS >;
// LENGTH < LENGTH OF BLOCK >;
// PAGE < IO = 2, DATA = 1, PROG = 0>;
// TYPE < DARAM/SARAM/ROM/WOM/RAM/EXRAM >;
// END BLOCK_NAME;
MEMORY MEM0;
START 0x0000;
LENGTH 0x0800;
PAGE 1;
TYPE DARAM;
END MEM0;
...
...
END C549;
END C54X;
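The block template in the excerpt above is regular enough that memory entries can be generated rather than typed by hand. The Python sketch below simply formats one MEMORY block following that template; the function name memory_block and the PAGES table are hypothetical helpers introduced here for illustration, not part of Code Composer Studio.

```python
# Page numbers as given in the template comment: IO = 2, DATA = 1, PROG = 0.
PAGES = {"PROG": 0, "DATA": 1, "IO": 2}


def memory_block(name, start, length, page, mem_type):
    """Return one MEMORY ... END block as text, following the template."""
    lines = [
        f"MEMORY {name};",
        f"START 0x{start:04x};",    # starting address, at least 4 hex digits
        f"LENGTH 0x{length:04x};",  # length of the block
        f"PAGE {PAGES[page]};",
        f"TYPE {mem_type};",        # e.g. DARAM, SARAM, ROM, EXRAM
        f"END {name};",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Reproduces the MEM0 block from the default C549 excerpt.
    print(memory_block("MEM0", 0x0000, 0x0800, "DATA", "DARAM"))
```

Only the text layout is modeled; whether a given block is accepted still depends on the simulator's own rules for the configuration.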
3.2.1 Creating a Memory Map
The default configuration files have memory maps for all internal memories and some of the external
memories. Changes to internal memories can cause undefined behavior of the simulator. The user can
only configure the external memory of a particular configuration. For example, the current default
configuration file for the C549 configuration has the external memory maps up to extended program page three. The
user can add another external program page as follows:
MEMORY MEM27;
START 0x40000;
LENGTH 0x8000;
PAGE 0;
TYPE EXRAM;
END MEM27;
MEMORY MEM28;
START 0x48000;
LENGTH 0x8000;
PAGE 0;
TYPE EXRAM;
END MEM28;
3.3 Performance Numbers
Table 3-1 shows the performance numbers of the simulator for different device configurations. These
numbers were gathered on a 1.7-GHz Intel Pentium 4 PC with 256 MB of RAM. The application used
for measurement in all three cases is the Reed-Solomon encoding and decoding application from a
standard benchmarking suite.
6.10 Universal Asynchronous Receiver/Transmitter (UART)
C6424 has 2 UART peripherals. Each UART has the following features:
16-byte storage space for both the transmitter and receiver FIFOs
1, 4, 8, or 14 byte selectable receiver FIFO trigger level for autoflow control and DMA
DMA signaling capability for both received and transmitted data
Programmable auto-rts and auto-cts for autoflow control
Frequency pre-scale values from 1 to 65,535 to generate appropriate baud rates
Prioritized interrupts
Programmable serial data formats
5, 6, 7, or 8-bit characters
Even, odd, or no parity bit generation and detection
1, 1.5, or 2 stop bit generation
False start bit detection
Line break generation and detection
Internal diagnostic capabilities
Loopback controls for communications link fault isolation
Break, parity, overrun, and framing error simulation
Modem control functions (CTS, RTS) on UART0 only.
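The frequency pre-scale value mentioned above sets the baud rate. As a rough illustration, the Python sketch below assumes the common 16550-style relation, divisor = input clock / (16 x baud rate); the 16x oversampling factor and the 24 MHz clock used in the example are assumptions and should be checked against the device data sheet. The function name uart_divisor is hypothetical.

```python
def uart_divisor(input_clock_hz, baud, oversample=16):
    """Pick the nearest pre-scale divisor and report the resulting error.

    Assumes baud = input_clock / (oversample * divisor); the oversample
    factor of 16 is an assumption, not taken from the notes.
    """
    divisor = round(input_clock_hz / (oversample * baud))
    if not 1 <= divisor <= 65535:
        raise ValueError("divisor outside the 1 to 65,535 pre-scale range")
    actual_baud = input_clock_hz / (oversample * divisor)
    error_percent = 100.0 * (actual_baud - baud) / baud
    return divisor, actual_baud, error_percent


if __name__ == "__main__":
    # Hypothetical 24 MHz UART input clock.
    for rate in (9600, 115200):
        d, actual, err = uart_divisor(24_000_000, rate)
        print(f"{rate} baud: divisor {d}, actual {actual:.1f}, error {err:+.2f}%")
```

The range check mirrors the 1 to 65,535 pre-scale limit listed among the features.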

6. Explain the features of the TMS320C54x processor.


Features
Table 1-1 provides an overview of the TMS320C54x, TMS320LC54x, and
TMS320VC54x fixed-point, digital signal processor (DSP) families (hereafter
referred to as the 54x unless otherwise specified). The table shows significant
features of each device, including the capacity of on-chip RAM and ROM, the
available peripherals, the CPU speed, and the type of package with its total
pin count.
Features provided by the 54x DSPs include:
_ High-performance, low-power C54x CPU
_ Advanced multibus architecture with three separate 16-bit data
memory buses and one program memory bus
_ 40-bit arithmetic logic unit (ALU), including a 40-bit barrel shifter and
two independent 40-bit accumulators
_ 17 × 17-bit parallel multiplier coupled to a 40-bit dedicated adder for
nonpipelined single-cycle multiply/accumulate (MAC) operation
_ Compare, select, and store unit (CSSU) for the add/compare
selection of the Viterbi operator
_ Exponent encoder to compute an exponent value of a 40-bit
accumulator value in a single cycle


_ Two address generators with eight auxiliary registers and two
auxiliary register arithmetic units (ARAUs)
_ Data buses with a bus holder feature
_ Extended addressing mode for up to 8M × 16-bit maximum
addressable external program space
_ Single-instruction repeat and block-repeat operations for program
code
_ Block-memory-move instructions for better program and data
management
_ Instructions with a 32-bit-long word operand
_ Instructions with two- or three-operand reads
_ Arithmetic instructions with parallel store and parallel load
_ Conditional store instructions
_ Fast return from interrupt
_ On-chip peripherals
_ Software-programmable wait-state generator and programmable
bank-switching
_ Phase-locked loop (PLL) clock generator with internal crystal
oscillator or external clock source
_ Full-duplex standard serial port
_ Time-division multiplexed (TDM) serial port
_ Buffered serial port (BSP)
_ Multichannel buffered serial port (McBSP)
_ Direct memory access (DMA) controller
_ 8-bit parallel host-port interface (HPI)
_ Enhanced 8-bit parallel host-port interface (HPI8)

_ 16-bit parallel host-port interface (HPI16)


_ 16-bit timer with 4-bit prescaler
_ Interprocessor first-in first-out (FIFO) unit (on multiple CPU devices)
_ Power conservation features
_ Software power consumption control with IDLE1, IDLE2, and IDLE3
power-down modes
_ Ability to disable external address bus, data bus, and control bus
signals under software control
_ Ability to disable CLKOUT under software control
_ Low-voltage device options to reduce power consumption without
compromising performance
_ On-chip scan-based emulation capability
_ IEEE 1149.1 (JTAG) boundary scan test capability
_ 5.0-V power supply devices with speeds up to 40 million instructions per
second (MIPS) (25-ns instruction cycle time)
_ 3.3-V power supply devices with speeds up to 80 MIPS (12.5-ns
instruction cycle time)


_ 2.5-V power supply devices with speeds up to 100 MIPS (10-ns instruction
cycle time)
_ 1.8-V power supply devices with speeds up to 200 MIPS (10-ns instruction
cycle time per CPU core)
_ 1.5-V power supply devices with speeds up to 532 MIPS (7.5-ns
instruction cycle time per CPU core)
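The MIPS figures in the list above follow from the instruction cycle time, since one single-cycle instruction per cycle gives 1000 / (cycle time in ns) MIPS per core. The Python sketch below checks this arithmetic; the core counts of 2 and 4 used for the multi-core entries are inferred from the quoted numbers and are an assumption, not something the list states.

```python
def mips(cycle_ns, cores=1):
    """Peak MIPS for a device executing one instruction per cycle per core."""
    return cores * 1000.0 / cycle_ns


if __name__ == "__main__":
    # (supply voltage, cycle time in ns, assumed number of CPU cores)
    devices = [("5.0 V", 25.0, 1), ("3.3 V", 12.5, 1), ("2.5 V", 10.0, 1),
               ("1.8 V", 10.0, 2), ("1.5 V", 7.5, 4)]
    for supply, cycle, cores in devices:
        print(f"{supply}: {mips(cycle, cores):.1f} MIPS")
```

With four cores at 7.5 ns the formula gives about 533 MIPS, close to the 532 MIPS quoted, which suggests the quoted cycle time is rounded.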
7. Discuss in detail the TMS320C54x architecture.
1.2 Architecture
The 54x DSPs use an advanced, modified Harvard architecture that maximizes processing power by
maintaining one program memory bus and three data memory buses. These processors also provide an
arithmetic logic unit (ALU) that has a high degree of parallelism, application-specific hardware logic,
on-chip memory, and additional on-chip peripherals. These DSP families also provide a highly
specialized instruction set, which is the basis of the operational flexibility and speed of these DSPs.
Separate program and data spaces allow simultaneous access to program instructions and data, providing
the high degree of parallelism. Two reads and one write operation can be performed in a single cycle.
Instructions with parallel store and application-specific instructions can fully utilize this architecture. In
addition, data can be transferred between data and program spaces. Such parallelism supports a powerful
set of arithmetic, logic, and bit-manipulation operations that can all be performed in a single machine
cycle. Also included are the control mechanisms to manage interrupts, repeated operations, and function
calls. Figure 1-1 is a functional block diagram that shows the principal blocks and bus structure in the
54x devices.
1.2.1 Central Processing Unit (CPU)
The CPU of the 54x devices contains:
_ A 40-bit arithmetic logic unit (ALU)
_ Two 40-bit accumulators
_ A barrel shifter
_ A 17 × 17-bit multiplier/adder
_ A compare, select, and store unit (CSSU)

1.2.2 Arithmetic Logic Unit (ALU)


The 54x devices perform 2's-complement arithmetic using a 40-bit ALU and two 40-bit accumulators
(ACCA and ACCB). The ALU can also perform Boolean operations.
The ALU can function as two 16-bit ALUs and perform two 16-bit operations simultaneously when the
C16 bit in status register 1 (ST1) is set.
1.2.3 Accumulators
The accumulators, ACCA and ACCB, store the output from the ALU or the multiplier/adder block; the
accumulators can also provide a second input to the ALU or the multiplier/adder. The bits in each
accumulator are grouped as follows:
_ Guard bits (bits 32-39)
_ A high-order word (bits 16-31)
_ A low-order word (bits 0-15)
Instructions are provided for storing the guard bits, the high-order and the low-order accumulator words in
data memory, and for manipulating 32-bit accumulator words in or out of data memory. Also, any of the
accumulators can be used as temporary storage for the other.
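The three accumulator fields can be pictured by splitting a 40-bit value with shifts and masks. The following Python sketch is only an illustration of the bit grouping (guard bits 32-39, high word 16-31, low word 0-15); the function name split_accumulator is hypothetical.

```python
MASK40 = (1 << 40) - 1


def split_accumulator(acc):
    """Split a 40-bit accumulator value into guard, high, and low fields."""
    acc &= MASK40                       # keep only the 40 accumulator bits
    return {
        "guard": (acc >> 32) & 0xFF,    # bits 32-39
        "high": (acc >> 16) & 0xFFFF,   # bits 16-31
        "low": acc & 0xFFFF,            # bits 0-15
    }


if __name__ == "__main__":
    fields = split_accumulator(0x12_3456_789A)
    print({name: hex(value) for name, value in fields.items()})
```

A negative value such as -1 fills all three fields with ones, which is how the guard bits carry the sign beyond 32 bits.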
1.2.4 Barrel Shifter
The 54x's barrel shifter has a 40-bit input connected to the accumulator or data memory (CB, DB) and a
40-bit output connected to the ALU or data memory (EB). The barrel shifter produces a left shift of 0 to
31 bits and a right shift of 0 to 16 bits on the input data. The shift requirements are defined in the
shift-count field (ASM) of ST1 or in the temporary register (TREG), which is designated as a
shift-count register. This shifter and the exponent detector normalize the values in an accumulator in a single
cycle. The least significant bits (LSBs) of the output are filled with 0s, and the most significant
bits (MSBs) can be either zero-filled or sign-extended, depending on the state of the sign-extended mode
bit (SXM) of ST1. Additional shift capabilities enable the processor to perform numerical scaling, bit
extraction, extended arithmetic, and overflow prevention operations.
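The shift range and the effect of SXM can be sketched in Python as follows. This is a simplified model operating on a 40-bit value, assuming a positive count means a left shift (0 to 31) and a negative count a right shift (0 to 16); it is not a cycle-accurate description of the hardware, and the function name barrel_shift is hypothetical.

```python
MASK40 = (1 << 40) - 1


def barrel_shift(value, shift, sxm=True):
    """Shift a 40-bit value left (shift > 0) or right (shift < 0)."""
    if not -16 <= shift <= 31:
        raise ValueError("shift must be in the range -16 to 31")
    value &= MASK40
    if shift >= 0:
        # Left shift: vacated LSBs are filled with zeros.
        return (value << shift) & MASK40
    if sxm and value & (1 << 39):
        # SXM = 1: treat the input as negative so the MSBs are sign-extended.
        value -= 1 << 40
    # Python's >> on a negative int is an arithmetic shift.
    return (value >> -shift) & MASK40


if __name__ == "__main__":
    print(hex(barrel_shift(0xFF_0000_0000, -8, sxm=True)))
    print(hex(barrel_shift(0xFF_0000_0000, -8, sxm=False)))
```

The two calls differ only in SXM: the first fills the vacated MSBs with the sign bit, the second with zeros.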
1.2.5 Multiplier/Adder
The multiplier/adder performs 17 × 17-bit 2's-complement multiplication with a 40-bit accumulation in
a single instruction cycle. The multiplier/adder block consists of several elements: a multiplier, adder,
signed/unsigned input control, fractional control, a zero detector, a rounder (2's-complement),
overflow/saturation logic, and TREG. The multiplier has two inputs: one input is selected from the TREG,
a data-memory operand, or an accumulator; the other is selected from the program memory, the data
memory, an accumulator, or an immediate value. The fast on-chip multiplier allows the 54x to perform
operations such as convolution, correlation, and filtering efficiently. In addition, the multiplier and ALU
together execute multiply/accumulate (MAC) computations and ALU operations in parallel in a single
instruction cycle. This function is used in determining the Euclidean distance, and in implementing
symmetrical and least mean square (LMS) filters, which are required for complex DSP algorithms.
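A multiply/accumulate step with 40-bit accumulation can be modeled in a few lines of Python. The sketch below assumes signed 16-bit operands and ignores the fractional, rounding, and saturation controls mentioned above; the names to_signed, mac, and fir_output are hypothetical.

```python
MASK40 = (1 << 40) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a 2's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mac(acc, a, b):
    """One MAC step: acc + a * b, wrapped to 40 bits (no saturation)."""
    product = to_signed(a, 16) * to_signed(b, 16)
    return (to_signed(acc, 40) + product) & MASK40


def fir_output(samples, coeffs):
    """Sum of products, the core operation of convolution and filtering."""
    acc = 0
    for x, h in zip(samples, coeffs):
        acc = mac(acc, x, h)
    return to_signed(acc, 40)


if __name__ == "__main__":
    print(fir_output([1, 2, 3], [4, 5, 6]))
```

On the hardware each iteration of the loop in fir_output would correspond to one single-cycle MAC instruction.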
1.2.6 Compare, Select, and Store Unit (CSSU)
The compare, select, and store unit (CSSU) performs maximum comparisons between the accumulator's
high and low words, allows the test/control (TC) flag bit of status register 0 (ST0) and the transition
(TRN) register to keep their transition histories, and selects the larger word in the accumulator to be stored
in data memory. The CSSU also accelerates Viterbi-type butterfly computation with optimized on-chip
hardware.

1.2.7 Program Control


Program control is provided by several hardware and software mechanisms:
_ The program controller decodes instructions, manages the pipeline, stores the status of operations, and
decodes conditional operations. Some of the hardware elements included in the program controller are the
program counter, the status and control register, the stack, and the address-generation logic.
_ Some of the software mechanisms used for program control include branches, calls, conditional
instructions, a repeat instruction, reset, and interrupts.
_ The 54x supports the use of both hardware and software interrupts for program control. Interrupt
service routines are vectored through a relocatable interrupt vector table. Interrupts can be globally
enabled/disabled and can be individually masked through the interrupt mask register (IMR). Pending
interrupts are indicated in the interrupt flag register (IFR). For detailed information on the structure of the
interrupt vector table, the IMR and the IFR, see the device-specific data sheets.
1.2.8 Status Registers (ST0, ST1)
The status registers, ST0 and ST1, contain the status of the various conditions and modes for the 54x
devices. ST0 contains the flags (OV, C, and TC) produced by arithmetic operations and bit manipulations
in addition to the data page pointer (DP) and the auxiliary register pointer (ARP) fields. ST1 contains
the various modes and instructions that the processor operates on and executes.
1.2.9 Auxiliary Registers (AR0-AR7)
The eight 16-bit auxiliary registers (AR0-AR7) can be accessed by the central arithmetic logic unit
(CALU) and modified by the auxiliary register arithmetic units (ARAUs). The primary function of the
auxiliary registers is generating 16-bit addresses for data space. However, these registers also can act as
general-purpose registers or counters.
1.2.10 Temporary Register (TREG)
The TREG is used to hold one of the multiplicands for multiply and multiply/accumulate instructions. It
can hold a dynamic (execution-time programmable) shift count for instructions with a shift operation such
as ADD, LD, and SUB. It also can hold a dynamic bit address for the BITT instruction. The EXP
instruction stores the exponent value computed into the TREG, while the NORM instruction uses the
TREG value to normalize the number. For ACS operation of Viterbi decoding, TREG holds branch
metrics used by the DADST and DSADT instructions.
1.2.11 Transition Register (TRN)
The TRN is a 16-bit register that is used to hold the transition decision for the path to new metrics to
perform the Viterbi algorithm. The CMPS (compare, select, max, and store) instruction updates the
contents of the TRN based on the comparison between the accumulator high word and the accumulator
low word.
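The compare-select step and the TRN update can be sketched as below. The model assumes the usual description of CMPS: the two 16-bit halves of the accumulator are compared as signed values, the larger is selected for storage, TRN is shifted left by one, and the decision bit (0 if the high word was larger, 1 otherwise) is written to TRN bit 0 and to TC. This decision-bit convention is an assumption and should be checked against the instruction set reference; the function name cmps is hypothetical.

```python
def to_signed16(value):
    """Interpret a 16-bit pattern as a 2's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def cmps(acc, trn):
    """Model of compare-select: returns (stored word, new TRN, TC)."""
    high = to_signed16(acc >> 16)
    low = to_signed16(acc)
    trn = (trn << 1) & 0xFFFF           # make room for the new decision bit
    if high > low:
        return high & 0xFFFF, trn, 0    # high word survives, decision bit 0
    return low & 0xFFFF, trn | 1, 1     # low word survives, decision bit 1


if __name__ == "__main__":
    stored, trn, tc = cmps(0x0003_0005, 0x0000)
    print(hex(stored), bin(trn), tc)
```

After sixteen such steps TRN holds sixteen path decisions, which is what the traceback phase of a Viterbi decoder reads back.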
1.2.12 Stack-Pointer Register (SP)
The SP is a 16-bit register that contains the address at the top of the system stack. The SP always points to
the last element pushed onto the stack. The stack is manipulated by interrupts, traps, calls, returns, and the
PUSHD, PSHM, POPD, and POPM instructions. Pushes and pops of the stack predecrement and
postincrement, respectively, all 16 bits of the SP.
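The predecrement and postincrement behavior means the stack grows toward lower addresses and SP always points at the last pushed element. A minimal Python model follows, with a dictionary standing in for data memory and a starting SP value chosen arbitrarily; the class name Stack is hypothetical.

```python
class Stack:
    """Toy model of a stack that grows toward lower addresses."""

    def __init__(self, sp=0x0100):
        self.sp = sp          # arbitrary initial stack pointer
        self.memory = {}      # address -> 16-bit word

    def push(self, value):
        self.sp = (self.sp - 1) & 0xFFFF       # predecrement SP
        self.memory[self.sp] = value & 0xFFFF  # then store at the new top

    def pop(self):
        value = self.memory[self.sp]           # read the current top
        self.sp = (self.sp + 1) & 0xFFFF       # postincrement SP
        return value


if __name__ == "__main__":
    s = Stack()
    s.push(0x1234)
    s.push(0xABCD)
    print(hex(s.sp), hex(s.pop()), hex(s.pop()), hex(s.sp))
```

After matching pushes and pops SP returns to its starting value, which is why unbalanced stack use in an interrupt routine corrupts the return address.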
1.2.13 Circular-Buffer-Size Register (BK)
The 16-bit BK is used by the ARAUs in circular addressing to specify the data block size.
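How BK bounds a circular buffer can be sketched as follows. The model assumes the commonly described scheme in which the buffer starts on an address whose N least significant bits are zero, where N is the smallest integer with 2^N greater than BK, and the index in those N bits wraps modulo BK. These details are an assumption to be checked against the CPU reference, and the function name circular_next is hypothetical.

```python
def circular_next(arx, step, bk):
    """Return the next auxiliary-register value under circular addressing."""
    if bk <= 0 or abs(step) > bk:
        raise ValueError("need bk > 0 and |step| <= bk")
    n = bk.bit_length()              # smallest N such that 2**N > bk
    mask = (1 << n) - 1
    base = arx & ~mask & 0xFFFF      # buffer start: N LSBs forced to zero
    index = (arx & mask) + step      # position inside the buffer
    if index >= bk:
        index -= bk                  # wrapped past the end
    elif index < 0:
        index += bk                  # wrapped before the start
    return base | index


if __name__ == "__main__":
    ar = 0x0103
    for _ in range(4):
        ar = circular_next(ar, 1, 5)
        print(hex(ar))
```

With BK = 5 the address sequence stays within 0x0100 to 0x0104, so a delay line for a 5-tap filter never needs an explicit bounds check.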

1.2.14 Block-Repeat Registers (BRC, RSA, REA)


The block-repeat counter (BRC) is a 16-bit register used to specify the number of times a block of code is
to be repeated when performing a block repeat. The block-repeat start address (RSA) is a 16-bit register
containing the starting address of the block of program memory to be repeated when operating in the
repeat mode. The 16-bit block-repeat end address (REA) contains the ending address of the block of
program memory to be repeated when operating in the repeat mode.
1.2.15 Interrupt Registers (IMR, IFR)
The interrupt-mask register (IMR) is used to mask off specific interrupts individually at required times.
The interrupt-flag register (IFR) indicates the current status of the interrupts.
1.2.16 Processor-Mode Status Register (PMST)
The processor-mode status register (PMST) controls memory configurations of the 54x devices.
1.2.17 Power-Down Modes
There are three power-down modes, activated by the IDLE1, IDLE2, and IDLE3 instructions. In these
modes, the 54x devices enter a dormant state and dissipate considerably less power than in normal
operation. The IDLE1 instruction is used to shut down the CPU. The IDLE2 instruction is used to shut
down the CPU and on-chip peripherals. The IDLE3 instruction is used to shut down the 54x processor
completely. This instruction stops the PLL circuitry as well as the CPU and peripherals.
1.4 Memory
The minimum memory address range for the 54x devices is 192K words composed of 64K words in
program space, 64K words in data space, and 64K words in I/O space. Selected devices also provide
extended program memory space of up to 8M words. The program memory space contains the
instructions to be executed as well as tables used in execution. The data memory space stores data used by
the instructions. The I/O memory space interfaces to external memory-mapped peripherals and can also
serve as extra data storage space. The 54x DSPs provide both on-chip RAM and ROM to improve system
performance and integration.
1.4.1 On-Chip ROM
The 54x devices include on-chip maskable ROM that can be mapped into program memory or data
memory depending on the device. On-chip ROM is mapped into program space by the
microprocessor/microcontroller (MP/MC) mode control pin. On-chip ROM that can be mapped into data
space is controlled by the DROM bit in the processor mode status register (PMST). This allows an
instruction to use data stored in the ROM as an operand. Customers can arrange to have the ROM of the
54x programmed with contents unique to any particular application.
1.4.2 Bootloader
A bootloader is available in the standard 54x on-chip ROM. This bootloader can be used to transfer user
code from an external source to anywhere in the program memory at power up automatically. If the
MP/MC pin of the device is sampled low during a hardware reset, execution begins at location FF80h
of the on-chip ROM. This location contains a branch instruction to the start of the bootloader program.
The standard 54x devices provide different ways to download the code to accommodate various system
requirements:
_ Parallel from 8-bit or 16-bit-wide EPROM
_ Parallel from I/O space in 8-bit or 16-bit mode
_ Serial port boot in 8-bit or 16-bit mode through the standard serial port, the TDM serial port, the
buffered serial port (BSP), or the multichannel buffered serial port (McBSP)
_ Host port interface boot (standard HPI, HPI8, and HPI16)
_ Warm boot (restart of an application without reloading the code)
The bootloader options on each 54x device are determined by the peripheral mix available on that device.
See Table 11 for information on the peripherals available on each device. On select devices, in addition
to the bootloader, the standard on-chip ROM also contains complex FFT algorithms, μ-law/A-law
expansion tables, and a sine look-up table.
1.4.3 On-Chip Dual-Access RAM (DARAM)
Dual-access RAM blocks can be accessed twice per machine cycle. This memory is intended primarily to
store data values; however, it can be used to store program as well. At reset, the DARAM is mapped into
data memory space. DARAM can be mapped into program/data memory space by setting
the OVLY bit in the PMST register.
1.4.4 On-Chip Single-Access RAM (SARAM)
Each of the SARAM blocks is a single-access memory. This memory is intended primarily to store data
values; however, it can be used to store program as well. SARAM can be mapped into program/data
memory space by setting the OVLY bit in the PMST register.
1.4.5 On-Chip Two-Way Shared RAM
Select 54x devices with multiple CPU cores include two-way shared RAM blocks that allow simultaneous
program space access from two CPU cores. Each CPU can perform a single access with zero wait states to any
location in the two-way shared RAM during each clock cycle. This shared RAM is most efficiently used
when the two CPUs are executing identical programs. In this case, the amount of program memory
required for the application is effectively reduced by 50% since both CPUs can execute from the same
RAM.
1.4.6 On-Chip Memory Security
A security feature is included on 54x devices to prevent the on-chip memory contents from being
extracted by a user. This feature is enabled during the manufacturing process and is ONLY available to
customers that order custom ROM programming. Consequently, memory security cannot be
enabled/disabled by the user. When the memory security feature is enabled, access to on-chip memory is
protected in the following ways:
_ Emulation access: The security feature completely disables the scan-based emulation capability of the
54x to prevent the use of a debugger utility. Note that this only affects emulation, and does not prevent
the use of the JTAG boundary scan test capability.
_ HPI access: On select devices, HPI accesses are restricted when the security feature is enabled. These
restrictions are described in Table 14.
_ CPU access: The security feature prohibits the DSP CPU from accessing the on-chip memory. There are
two levels of security associated with CPU accesses.
_ ROM security option. This option is the least secure, because it only protects the on-chip ROM and
does not protect the on-chip RAM. When the ROM security option is enabled, any instruction fetched
from external memory or on-chip RAM is prohibited from accessing on-chip ROM and reads invalid data
(0FFFFh). Only instructions fetched from the on-chip ROM can be used to access the contents of
the ROM.
_ ROM/RAM security option. This option is the most secure, because it protects both the on-chip ROM
and the on-chip RAM. When the ROM/RAM security option is enabled, any instruction fetched from
external memory is prohibited from accessing on-chip ROM or RAM and reads invalid data (0FFFFh).
Only instructions fetched from on-chip ROM or on-chip RAM can access the on-chip memory. The
ROM/RAM security option also internally forces the device into microcomputer mode (MP/MC bit forced
to zero), preventing the ROM from being disabled.
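The two CPU-access security levels described above can be summarized as a small decision function. This is only an illustrative sketch in Python: the function name, the string labels, and the argument layout are assumptions made for this example and are not part of any TI tool or register interface. Only the 0FFFFh invalid-data value and the access rules come from the text.

```python
INVALID_DATA = 0xFFFF  # value returned by a blocked read, per the text


def secured_read(option, fetched_from, target, value):
    """Return what a data read sees under the given security option.

    option:       "none", "rom", or "rom_ram" (hypothetical labels)
    fetched_from: where the reading instruction was fetched from:
                  "external", "onchip_ram", or "onchip_rom"
    target:       on-chip memory being read: "onchip_rom" or "onchip_ram"
    value:        the word actually stored at the target location
    """
    if option == "rom":
        # Only code running from on-chip ROM may read the ROM;
        # on-chip RAM is not protected at this level.
        blocked = target == "onchip_rom" and fetched_from != "onchip_rom"
    elif option == "rom_ram":
        # Only code running from on-chip ROM or RAM may read on-chip memory.
        blocked = fetched_from == "external"
    else:
        blocked = False
    return INVALID_DATA if blocked else value


# Code in on-chip RAM tries to read the ROM under the ROM option.
print(hex(secured_read("rom", "onchip_ram", "onchip_rom", 0x1234)))
```

The sketch makes the difference between the two options visible: under the ROM option, code running from on-chip RAM can still read RAM but not ROM, while under the ROM/RAM option only externally fetched code is locked out.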

1.4.7 Program Memory


The standard external program memory space on the 54x devices addresses up to 64K 16-bit words.
Software can configure the on-chip memory cells to reside inside or outside of the program address map. When
the cells are mapped into program space, the device automatically accesses them when their
addresses are within bounds. When the program-address generation (PAGEN) logic generates an address
outside its bounds, the device automatically generates an external access. The advantages of operating
from on-chip memory are as follows:
_ Higher performance because no wait states are required
_ Lower cost than external memory
_ Lower power than external memory
The advantage of operating from off-chip memory is the ability to access a larger address space.
1.4.7.1 Relocatable Interrupt Vector Table
The reset, interrupt, and trap vectors are addressed in program space. These vectors are soft, meaning
that the processor, when taking the trap, loads the program counter (PC) with the trap address and
executes the code at the vector location. Four words are reserved at each vector location to accommodate a
delayed branch instruction: either two 1-word instructions or one 2-word instruction, which allows
branching to the appropriate interrupt service routine with minimal overhead. At device reset, the reset,
interrupt, and trap vectors are mapped to address FF80h in program space. However, these vectors can be
remapped to the beginning of any 128-word page in
program space after device reset. This is done by loading the interrupt vector pointer (IPTR) bits in the
PMST register with the appropriate 128-word page boundary address. After loading IPTR, any user
interrupt or trap vector is mapped to the new 128-word page. NOTE: The hardware reset (RS) vector
cannot be remapped, because the hardware reset loads the IPTR with 1s. Therefore, the reset vector is
always fetched at location FF80h in program space. In addition, for the 54x, 128 words are reserved in
the on-chip ROM for device-testing purposes. Application code written to be implemented in on-chip
ROM must reserve these 128 words at addresses FF00h-FF7Fh in program space.
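The vector remapping arithmetic can be checked with a short Python sketch. It assumes that IPTR acts as a 9-bit page index (64K words divided into 128-word pages gives 512 pages) and that vectors are numbered from 0 with four words each, so 32 vectors fit in one page. The function and constant names are made up for this example.

```python
VECTOR_WORDS = 4   # words reserved per vector, per the text
PAGE_WORDS = 128   # size of the vector page, per the text
RESET_IPTR = 0x1FF # IPTR is loaded with all ones by a hardware reset


def vector_address(iptr, vector_number):
    """Program-space address of a vector for a given IPTR page index."""
    if not 0 <= iptr < 0x10000 // PAGE_WORDS:
        raise ValueError("IPTR must select one of the 512 pages in 64K")
    if not 0 <= vector_number < PAGE_WORDS // VECTOR_WORDS:
        raise ValueError("only 32 vectors fit in a 128-word page")
    return iptr * PAGE_WORDS + vector_number * VECTOR_WORDS


# With IPTR all ones, vector 0 (reset) lands at FF80h, as stated above.
print(hex(vector_address(RESET_IPTR, 0)))
```

Loading a different value into IPTR simply moves the whole 128-word table; for instance, a page index of 1 would place vector 16 at address 00C0h under these assumptions.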
1.4.7.2 Extended Program Memory
Selected 54x devices use a page-extended memory scheme in program space to allow access of up to 8M
of program memory. This extended program memory is organized into a maximum of 128 pages (0-127),
each 64K in length. Devices which implement the extended program memory scheme
include the following additional features:
_ Maximum of seven additional address lines (for a total of 23)
_ An extra memory-mapped register [program counter extension register
(XPC)]
_ Six new instructions for addressing extended program memory space:
  _ FB[D] Far branch
  _ FBACC[D] Far branch to the location specified by the value in accumulator A or accumulator B
  _ FCALA[D] Far call to the location specified by the value in accumulator A or accumulator B
  _ FCALL[D] Far call
  _ FRET[D] Far return
  _ FRETE[D] Far return with interrupts enabled
_ Two 54x instructions are extended:
  _ READA Read program memory addressed by accumulator A and store in data memory
  _ WRITA Write data to program memory addressed by accumulator A
For more information on these six new instructions and the two extended instructions, refer to the
instruction set summary (Table 111) and to the TMS320C54x DSP Reference Set, Volume 2, Mnemonic
Instruction Set, literature number SPRU172. For more information on extended program memory, refer to
the TMS320C54x DSP Reference Set, Volume 1, CPU and Peripherals, literature number SPRU131.
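The page-extended scheme amounts to placing a 7-bit page number in front of the usual 16-bit program address. The following Python sketch shows that arithmetic only; the helper names are invented for this example, and treating the page number as the value held in XPC is an assumption based on the description above.

```python
PAGE_BITS = 7     # seven additional address lines give 128 pages
OFFSET_BITS = 16  # each page is 64K words long


def extended_address(page, offset):
    """Combine a page number (0-127) and a 16-bit offset into a 23-bit address."""
    if not 0 <= page < (1 << PAGE_BITS):
        raise ValueError("page must be in the range 0-127")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset must fit in 16 bits")
    return (page << OFFSET_BITS) | offset


def split_extended_address(address):
    """Split a 23-bit address back into (page, offset)."""
    return address >> OFFSET_BITS, address & ((1 << OFFSET_BITS) - 1)


# 128 pages of 64K words each give the 8M-word total quoted in the text.
total_words = (1 << PAGE_BITS) * (1 << OFFSET_BITS)
print(total_words, hex(extended_address(127, 0xFFFF)))
```

The last word of the last page comes out as 7FFFFFh, the largest value a 23-bit address can hold.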
1.4.8 Data Memory
The data memory space on the 54x device addresses 64K of 16-bit words. The device automatically
accesses the on-chip RAM when addressing within its bounds. When an address is generated outside the
RAM bounds, the device automatically generates an external access. The advantages of operating from
on-chip memory are as follows:
_ Higher performance because no wait states are required
_ Higher performance because of better flow within the pipeline of the CALU
_ Lower cost than external memory
_ Lower power than external memory
In addition to general-purpose data memory, the CPU maintains a set of memory-mapped registers in data
memory for processor configuration and configuration/communication with the device peripherals. For
detailed information on the implementation of the memory-mapped CPU and peripheral control registers,
see the device-specific data sheets.
1.5 On-Chip Peripherals
All the 54x devices have the same CPU structure; however, they have different on-chip peripherals
connected to their CPUs. The on-chip peripheral options provided are:
_ Software-programmable wait-state generator
_ Programmable bank-switching
_ Parallel I/O ports
_ DMA controller
_ Host-port interface (standard 8-bit, enhanced 8-bit, and 16-bit)
_ Serial ports (standard, TDM, BSP, and McBSP)
_ General-purpose I/O pins
_ 16-bit timer with 4-bit prescaler
_ Phase-locked loop (PLL) clock generator
1.5.1 Software-Programmable Wait-State Generators
The software-programmable wait-state generator can be used to extend external bus cycles to interface
with slower off-chip memory and I/O devices. The software wait-state generator is incorporated without
any external hardware. For off-chip memory access, a number of wait states can be specified for every
32K-word block of program and data memory space, and for one 64K-word block of I/O space within the
software wait-state register (SWWSR). The software wait-state generator is programmable up to 7 or
14 wait states depending on the device. For more specific information on the software wait-state
generation capability, see the device-specific data sheet.
1.5.2 Programmable Bank-Switching
Programmable bank-switching can be used to insert one cycle automatically when crossing memory-bank
boundaries inside program memory or data memory space. One cycle can also be inserted when crossing
from program-memory space to data-memory space (54x) or from one program memory page to another
program memory page on selected devices. This extra cycle allows memory devices to release the bus
before other devices start driving the bus, thereby avoiding bus contention. The size of memory bank for
the bank-switching is defined by the bank-switching control register (BSCR). For specific information on
the bank-switching capabilities of a specific device, see the device-specific data sheet.
1.5.3 Parallel I/O Ports
Each 54x device has a total of 64K I/O ports. These ports can be addressed by the PORTR instruction or
the PORTW instruction. The IS signal indicates a read/write operation through an I/O port. The devices
can interface easily with external devices through the I/O ports while requiring minimal off-chip
address-decoding circuits.

8. Briefly explain direct memory access (DMA).


1.5.4 Direct Memory Access (DMA) Controller
The 54x direct memory access (DMA) controller transfers data between points in the memory map
without intervention by the CPU. The DMA allows movements of data to and from internal program/data
memory, internal peripherals (such as the McBSPs), or external memory devices to occur in the
background of CPU operation. The DMA has six independent programmable channels, allowing six
different contexts for DMA operation.
The DMA has the following features:
_ The DMA operates independently of the CPU.
_ The DMA has six channels. The DMA can keep track of the contexts of six
independent block transfers.
_ The DMA has higher priority than the CPU for both internal and external
accesses.
_ Each channel has independently programmable priorities.
_ Each channel's source and destination address registers can have configurable indexes through memory
on each read and write transfer, respectively. The address may remain constant, postincrement,
postdecrement, or be adjusted by a programmable value.
_ Each read or write transfer may be initialized by selected events.
_ On completion of a half-block or full-block transfer, each DMA channel
may send an interrupt to the CPU.
_ On-chip-RAM-to-off-chip-memory DMA transfer requires 5 cycles while
off-chip-memory-to-on-chip-RAM DMA transfer requires 5 cycles.
_ The DMA can perform double-word transfers (a 32-bit transfer of two
16-bit words).
1.5.4.1 DMA Memory Map
The DMA memory map includes access to on-chip memory on all devices and access to external memory
on selected devices. The DMA memory map for on-chip memory is unaffected by the state of the memory
control bits: MP/MC, DROM, and OVLY. For specific information on DMA implementations and
memory maps, see the device-specific data sheets.
1.5.4.2 DMA Priority Level
Each DMA channel can be independently assigned high or low priority relative to each other. Multiple
DMA channels that are assigned to the same priority level are handled in a round-robin manner.
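One way to picture the priority scheme is a small arbiter that prefers high-priority channels and rotates among channels of equal priority. The Python sketch below is a simplified model, not the actual hardware arbitration logic: the function name, the use of sets, and the single shared "last served" pointer are all assumptions made for illustration.

```python
NUM_CHANNELS = 6  # the DMA has six channels, per the text


def next_channel(requests, high_priority, last_served):
    """Pick the next DMA channel to service.

    requests:      set of channel numbers with a pending transfer
    high_priority: set of channel numbers assigned high priority
    last_served:   channel serviced most recently, or None
    """
    pending_high = requests & high_priority
    # High-priority requests win; otherwise fall back to all requests.
    candidates = pending_high or requests
    if not candidates:
        return None
    start = 0 if last_served is None else last_served + 1
    # Round-robin: scan upward from the channel after the last one served.
    for step in range(NUM_CHANNELS):
        channel = (start + step) % NUM_CHANNELS
        if channel in candidates:
            return channel
    return None


# Channels 2 and 4 are high priority; channel 2 was just served, so 4 is next.
print(next_channel({0, 2, 4}, {2, 4}, 2))
```

Channel 0 in this example is only serviced once neither high-priority channel has a request pending.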
1.5.4.3 DMA Source/Destination Address Modification
The DMA provides flexible address-indexing modes for easy implementation of data management
schemes such as autobuffers and circular buffers. Source and destination addresses can be indexed
separately and can be postincremented, postdecremented, or postincremented with a specified index offset.
1.5.4.4 DMA in Autoinitialization Mode
The DMA can automatically reinitialize itself after completion of a block transfer. Some of the DMA
registers can be preloaded for the next block transfer through DMA global reload registers (DMGSA,
DMGDA, and DMGCR). Autoinitialization allows:
_ Continuous operation. Normally, the CPU would have to reinitialize the DMA immediately after the
completion of the current block transfer; with the global reload registers, it can reinitialize these values for
the next block transfer any time after the current block transfer begins.
_ Repetitive operation. The CPU does not preload the global reload register with new values for each
block transfer but only loads them on the first block transfer.
The DMA global reload register set is shared by all channels. However, select DMAs have been
enhanced to expand the DMA global reload register set to provide each DMA channel its own DMA
global reload register set. For example, the DMA global reload register set for channel 0 includes
DMGSA0, DMGDA0, DMGCR0, and DMGFR0, while the DMA channel 1 registers include
DMGSA1, DMGDA1, DMGCR1, and DMGFR1.
1.5.4.5 DMA Transfer Counting
The DMA channel element count register (DMCTRx) and the frame count register (DMFRCx) contain bit
fields that represent the number of frames and number of elements per frame to be transferred.
_ Frame count. This 8-bit value defines the total number of frames in the block transfer. The maximum
number of frames per block transfer is 256 (FRAME COUNT = 0ffh). The counter is decremented upon
the last read transfer in a frame transfer. Once the last frame is transferred, the selected
8-bit counter is reloaded with the DMA global frame reload register (DMGFR) if the AUTOINIT bit is set to
1. A frame count of 0 (default value) means the block transfer contains a single frame.
_ Element count. This 16-bit value defines the number of elements per frame. This counter is decremented
after the read transfer of each element. The maximum number of elements per frame is 65536
(DMCTRn = 0ffffh). In autoinitialization mode, once the last frame is transferred, the counter is
reloaded with the DMA global count reload register (DMGCR).
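The size of a block transfer follows from the two count fields. The Python sketch below assumes that both fields hold the count minus one, which is how the text describes them (a frame count of 0 means a single frame, and an element count of 0ffffh corresponds to 65536 elements). The function name is invented for this example.

```python
def block_transfer_size(frame_count, element_count):
    """Return (frames, elements_per_frame, total_elements) for a block transfer.

    frame_count:   value of the 8-bit frame count field
    element_count: value of the 16-bit element count field
    Both fields are assumed to hold the count minus one.
    """
    if not 0 <= frame_count <= 0xFF:
        raise ValueError("frame count is an 8-bit field")
    if not 0 <= element_count <= 0xFFFF:
        raise ValueError("element count is a 16-bit field")
    frames = frame_count + 1
    elements_per_frame = element_count + 1
    return frames, elements_per_frame, frames * elements_per_frame


# Four frames of ten elements each: 40 element transfers in the block.
print(block_transfer_size(3, 9))
```

With the default frame count of 0 the block is a single frame, so the total equals the number of elements per frame.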
1.5.4.6 DMA Transfer in Double-Word Mode
Double-word mode allows the DMA to transfer 32-bit words in any index mode. In double-word mode,
two consecutive 16-bit transfers are initiated and the source and destination addresses are automatically
updated following each transfer. In this mode, each 32-bit word is considered to be one element.
1.5.4.7 DMA Channel Index Registers
The particular DMA channel index register is selected by way of the SIND and DIND field in the DMA
mode control register (DMMCRx). Unlike basic address adjustment, indexing in conjunction with the frame
index registers DMFRI0 and DMFRI1 allows a different adjustment amount depending on whether or not the
element transfer is the last in the current frame. The normal adjustment value (element index) is contained
in the element index registers, DMIDX0 and DMIDX1. The adjustment value (frame index) for the end of
the frame is determined by the selected DMA frame index register, either DMFRI0 or
DMFRI1. The element index and the frame index affect address adjustment as follows:
_ Element index. For all except the last transfer in the frame, element index determines the amount to be
added to the DMA channel source/destination address register (DMSRCx/DMDSTx) as selected by
the SIND/DIND bits.
_ Frame index. If the transfer is the last in a frame, frame index is used for address adjustment as selected
by the SIND/DIND bits. This occurs in both single-frame and multiframe transfer.
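The effect of the two index values is easiest to see by listing the addresses a channel visits. The Python sketch below is a simplified model of that behaviour: the function and parameter names are made up for this example, the index values are passed in directly rather than read from DMIDXn/DMFRIn, and wrapping addresses to 16 bits is an assumption.

```python
def dma_addresses(start, frames, elements_per_frame, element_index, frame_index):
    """List the addresses one DMA channel accesses during a block transfer."""
    addresses = []
    address = start
    for _ in range(frames):
        for element in range(elements_per_frame):
            addresses.append(address & 0xFFFF)
            last_in_frame = element == elements_per_frame - 1
            # The frame index applies after the last element of a frame,
            # the element index after every other element.
            address += frame_index if last_in_frame else element_index
    return addresses


# Two frames of three elements; step by 1 inside a frame, by 6 between frames.
print([hex(a) for a in dma_addresses(0x1000, 2, 3, 1, 6)])
```

In this example the first frame covers 1000h to 1002h and the second starts at 1008h, which is the kind of pattern used to walk through rows of a larger buffer.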
1.5.4.8 DMA Interrupts


The ability of the DMA to interrupt the CPU based on the status of the data transfer is configurable and is
determined by the IMOD and DINM bits in the DMA channel mode control register (DMMCRn).
9. Write short notes on HPI.
1.5.5 Host-Port Interface (HPI)
1.5.5.1 Standard 8-Bit HPI
The host-port interface (HPI) is an 8-bit parallel port used to interface a host processor to the DSP device.
Information is exchanged between the DSP device and the host processor through on-chip memory that is
accessible by both the host and the DSP device. The DSP devices have access to the HPI control (HPIC)
register and the host can address the HPI memory through the HPI address register (HPIA). Standard 8-bit
HPI memory is a 2K-word DARAM block that resides at 1000h to 17FFh in data memory and can also be
used as general-purpose on-chip data or program DARAM. Data transfers of 16-bit words occur as two
consecutive bytes with a dedicated pin (HBIL) indicating whether the high or low byte is being
transmitted. Two control pins, HCNTL1 and HCNTL0, control host access to the HPIA, HPI data
(with an optional automatic address increment), or the HPIC. The host can interrupt the DSP device by
writing to HPIC. The DSP device can interrupt the host with a dedicated HINT pin that the host can
acknowledge and clear. The standard 8-bit HPI has two modes of operation, shared-access mode
(SAM) and host-only mode (HOM). In SAM, the normal mode of operation, both the DSP device and the
host can access HPI memory. In this mode, asynchronous host accesses are resynchronized internally and,
in case of conflict, the host has access priority and the DSP device waits one cycle. The HOM capability
allows the host to access HPI memory while the DSP device is in IDLE2 (all internal clocks stopped) or in
reset mode. The host can therefore access the HPI RAM while the DSP device is in its optimal
configuration in terms of power consumption. The HPI has two data strobes (HDS1 and
HDS2), a read /write strobe (HR/W), and an address strobe (HAS) to enable a glueless interface to a
variety of industry-standard host devices. The HPI is interfaced easily to hosts with multiplexed
address/data bus, separate address and data buses, one data strobe and a read/write strobe, or two separate
strobes for read and write. The HPI supports high-speed back-to-back accesses.
_ In SAM, the HPI can transfer one byte every five DSP device periods, that is, 8 MBps with a 40-MIPS
DSP, or 20 MBps with a 100-MIPS DSP. The HPI is designed so that the host can take advantage
of this high bandwidth and run at frequencies up to (f × n)/5, where n is the number of host cycles for
an external access and f is the DSP device frequency.
_ In HOM, the HPI supports high-speed back-to-back host accesses at one byte every 50 ns, that is, 20
MBps with a 40-MIPS or faster DSP.
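The throughput figures quoted for the two modes can be reproduced with a few lines of Python. The sketch assumes that the MIPS rating equals the DSP device frequency (one period per instruction cycle) and reads the host-frequency limit as f times n divided by 5. The function names are invented for this example.

```python
def sam_bytes_per_second(dsp_hz):
    """Shared-access mode: one byte every five DSP device periods."""
    return dsp_hz / 5


def hom_bytes_per_second(access_ns=50):
    """Host-only mode: one byte every access_ns nanoseconds."""
    return 1e9 / access_ns


def max_host_frequency(dsp_hz, host_cycles_per_access):
    """Upper bound on host clock in SAM: f times n divided by 5."""
    return dsp_hz * host_cycles_per_access / 5


print(sam_bytes_per_second(40e6) / 1e6)   # MBps for a 40-MIPS device
print(sam_bytes_per_second(100e6) / 1e6)  # MBps for a 100-MIPS device
print(hom_bytes_per_second() / 1e6)       # MBps in host-only mode
```

These reproduce the 8 MBps, 20 MBps, and 20 MBps figures given above; a host that needs, say, four of its own cycles per external access could run at up to 32 MHz against a 40-MHz DSP under the same reading of the formula.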
1.5.5.2 Enhanced 8-Bit HPI (HPI8)
The enhanced 8-bit HPI (HPI8) is similar to the standard 8-bit HPI with the additional capability that data
can be exchanged between the host processor and the DSP throughout the entire on-chip memory via the
DMA. The HPI8 can address all on-chip memory including extended memory pages. Extended memory
addresses are defined by a 23-bit address. The HPI sets the upper seven bits of the extended memory address
by writing a one to the XHPIA bit in HPIC, then writing address bits A[22:16] into HPIA. The lower
16 bits of the extended memory address are set by writing a zero to XHPIA, followed by writing bits
A[15:0] to HPIA. Similar to previous implementations of the HPI, after a write is performed to XHPIA or
HPIA, a memory prefetch is initiated. The XHPIA bit is accessible only by the host. The address and data
strobe scheme implemented on the standard 8-bit HPI is also used on the enhanced 8-bit HPI. All memory
access on the HPI8 is in shared-access mode, meaning both the DSP and the host can access memory.
Asynchronous host accesses are resynchronized internally and in the event that the CPU and the host both
request access to the same memory block, the host has access priority. The HRDY pin provides
handshaking to the host during memory access. On the 5410, the HPI8 also provides the capability to
access memory during reset and power-down states. During reset, data or application code can be
loaded via the HPI8 and the application can be initiated through the HPI option of the bootloader. During
IDLE2/3 states, the HPI and the other six DMA channels continue to operate and all pending DMA events
will complete before the DSP stops the clocks. The HPI has higher priority than the other six DMA
channels. The HPI will continue to have access to memory in IDLE2/3 even after the DSP has stopped the
internal clocks as long as X2/CLKIN is maintained. The HPI8 also remains active during emulation stop.
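The two-step loading of an extended address through HPIA can be sketched as follows. This Python fragment only models the order and content of the writes described above (XHPIA set to one for bits A[22:16], then cleared for bits A[15:0]); it does not model the HPI pins or timing, and the function name and the tuple format are assumptions made for this example.

```python
def hpi8_address_writes(address):
    """Return the (XHPIA bit, HPIA value) pairs that load a 23-bit address."""
    if not 0 <= address < (1 << 23):
        raise ValueError("extended addresses are 23 bits wide")
    upper = (address >> 16) & 0x7F  # address bits A[22:16]
    lower = address & 0xFFFF        # address bits A[15:0]
    # First select the extended part with XHPIA = 1, then the lower 16 bits.
    return [(1, upper), (0, lower)]


for xhpia, value in hpi8_address_writes(0x7F1234):
    print(xhpia, hex(value))
```

For an address on page 0 the first write simply carries zero in the upper field.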
1.5.5.3 16-Bit HPI (HPI16)
The HPI16 is an enhanced 16-bit version of the C54x 8-bit HPI. The HPI16 is designed to allow a 16-bit
host to access the DSP on-chip memory, with the host acting as the master of the interface. It should be
noted that neither the CPU nor the DMA I/O spaces can be accessed using the HPI16.
Some features of the HPI16 include:
_ 16-bit bidirectional data bus
_ Multiple data strobes and control signals to allow glueless interfacing to a variety of hosts
_ Multiplexed and nonmultiplexed address/data modes
_ 18-bit address bus used in nonmultiplexed mode to allow access to all internal memory
(including internal extended address pages)
_ 18-bit address register used in multiplexed mode. Includes address autoincrement feature for faster
accesses to sequential addresses.
_ Interface to on-chip DMA module to allow access to entire internal memory space
_ HRDY signal to hold off host accesses due to DMA latency
_ Control register available in multiplexed mode only. Accessible by either host or DSP to provide
host/DSP interrupts, extended addressing, and data prefetch capability.
_ Maximum data rate: 28 MB/s at 100-MHz DSP clock rate (assuming
multiplexed address/data with no DMA latency).
The HPI16 acts as a slave to a 16-bit host processor and allows access to the on-chip memory of the DSP.
There are two modes of operation as determined by the HMODE signal: multiplexed mode and
nonmultiplexed mode. In multiplexed mode, the HPI16 address and data buses are multiplexed onto a
single bus. This provides a simple and glueless connection to multiplexed-bus processors. In
nonmultiplexed mode, the HPI16 address and data buses are separate dedicated buses. In this mode, there
is no access to the HPI16 control register (HPIC).
1.5.5.4 Combination Enhanced 8-Bit HPI (HPI8) and 16-Bit HPI (HPI16)
Some 54x devices include both the HPI8 and the HPI16 (see Table 18). On these devices, only one HPI
version can be used at a time and the selection is made by the HPI16 input pin. When the HPI16 pin is
driven low, the HPI8 module is enabled; when the pin is driven high, the HPI16 module is enabled.
These devices do not include an HMODE signal, so when the HPI16 module is enabled, only the
nonmultiplexed mode of operation is supported.
10. Give a detailed explanation of the serial ports in the C54x processor.
1.5.6 Serial Ports
The 54x devices provide high-speed, full-duplex serial ports that allow direct interface to other 54x
devices, codecs, and other devices in a system. There is a standard serial port, a time-division-multiplexed
(TDM) serial port, a buffered serial port (BSP), and a multichannel buffered serial port (McBSP).
Table 11 shows the availability of each of the serial port types in the 54x family.
1.5.6.1 Standard Serial Port
The general-purpose serial port utilizes two memory-mapped registers for data transfer: the data-transmit
register (DXR) and the data-receive register (DRR). Both of these registers can be accessed in the same
manner as any other memory location. The transmit and receive sections of the serial port
each have associated clocks, frame-synchronization pulses, and serial-shift registers; and serial data can be
transferred either in bytes or in 16-bit words. Serial port receive and transmit operations can generate their
own maskable transmit and receive interrupts (XINT and RINT), allowing serial-port transfers to be
managed through software. The 54x serial ports are double-buffered and fully static.
1.5.6.2 TDM Serial Port
The TDM port allows the device to communicate through time-division multiplexing with up to seven
other 54x devices with TDM ports. Time-division multiplexing is the division of time intervals into a
number of subintervals with each subinterval representing a prespecified communications channel. The
TDM port serially transmits 16-bit words on a single data line (TDAT) and destination addresses on a
single address line (TADD). Each device can transmit data on a single channel and receive data from one
or more of the eight channels, providing a simple and efficient interface for multiprocessing applications.
A frame synchronization pulse occurs once every 128 clock cycles, corresponding to the transmission of
one 16-bit word on each of the eight channels. Like the general-purpose serial port, the TDM port is
double-buffered on both input and output data.
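The frame timing of the TDM port follows directly from eight channels of 16 bits each. The Python sketch below maps a clock count to a channel and bit position. It assumes that the channels are sent in order 0 to 7 and that each word occupies 16 consecutive clocks after the frame sync; the names are invented for this example.

```python
CHANNELS = 8
WORD_BITS = 16
FRAME_CLOCKS = CHANNELS * WORD_BITS  # 128 clocks between frame sync pulses


def tdm_slot(clock):
    """Map a clock count since the first frame sync to (channel, bit index)."""
    position = clock % FRAME_CLOCKS
    return position // WORD_BITS, position % WORD_BITS


print(FRAME_CLOCKS)
print(tdm_slot(17))   # second bit of channel 1
print(tdm_slot(128))  # start of the next frame, back at channel 0
```

The modulo by 128 reflects the statement that a frame synchronization pulse occurs once every 128 clock cycles.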
1.5.6.3 Buffered Serial Port (BSP)
The buffered serial port (BSP) consists of a full-duplex, double-buffered serial-port interface and an
autobuffering unit (ABU). The serial port block of the BSP is an enhanced version of the standard serial
port. The ABU allows the serial port to read/write directly to the 54x internal memory using a dedicated
bus independent of the CPU. This results in minimal overhead for serial port transactions and faster data
rates. When autobuffering capability is disabled (standard mode), serial port transfers are performed under
software control through interrupts. In this mode, the ABU is transparent and the word-based interrupts
(WXINT and WRINT) provided by the serial port are sent to the CPU as transmit interrupt (XINT) and
receive interrupt (RINT). When autobuffering is enabled, word transfers are done directly between the
serial port and the 54x internal memory using ABU-embedded address generators.
The ABU has its own set of circular-addressing registers with corresponding address-generation units.
Memory for the buffers resides in 2K words of the 54x internal memory. The length and starting
addresses of the buffers are user-programmable. A buffer-empty/buffer-full interrupt can be posted to the
CPU. Buffering is easily halted by an autodisabling capability. Autobuffering capability can be enabled
separately for transmit and receive sections. When autobuffering is disabled, operation is similar to that of
the general-purpose serial port.
The BSP allows transfer of 8-, 10-, 12-, or 16-bit data packets. In burst mode, data packets are directed by
a frame synchronization pulse for every packet. In continuous mode, the frame synchronization pulse
occurs when the data transmission is initiated and no further pulses occur. The frame and clock
strobes are frequency- and polarity-programmable. The BSP is fully static and operates at arbitrarily low
clock frequencies. The BSP maximum operating frequency for 54x devices up to 50 MIPS is CLKOUT.
For higher-speed 54x devices, the BSP maximum operating frequency is 50 Mbps at 20 ns.
Buffer Misalignment (BMINT) Interrupt (549 only)
The BMINT interrupt is generated when a frame sync occurs and the ABU transmit or receive buffer
pointer is not at the top of the buffer address. This is useful for detecting several potential error conditions
on the serial interface, including extraneous and missed clocks, and frame sync pulses. A BMINT
interrupt, therefore, indicates that one or more words may have been lost on the serial interface.
BMINT is useful for detecting buffer misalignment only when the buffer pointer(s) are initially loaded
with the top-of-buffer address, and a frame of data contains the same number of words as the buffer
length. These are the only conditions under which a frame sync occurring at a buffer address other
than the top of buffer constitutes an error condition. In cases where these conditions are met, a frame sync
always occurs when the buffer pointer is at the top of buffer address, if the interface is functioning
properly. If BMINT is enabled under conditions other than those stated above, interrupts may be generated
under circumstances other than actual buffer misalignment. In these cases, BMINT should generally be
masked in the IMR register so that the processor will ignore this interrupt. BMINT is available when
operating in autobuffering mode with continuous transfers, the FIG bit cleared to 0, and external serial clocks
or frames. The BSP0 and BSP1 BMINT bits in the IMR and IFR registers are bits 12 and 13, respectively
(bit 15 is the MSB). The corresponding interrupt vector locations are 070h and 074h, respectively.
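The misalignment check can be illustrated with a toy model of the ABU buffer pointer. The Python sketch below assumes the conditions stated above (pointer initially at the top of the buffer, and a frame normally containing as many words as the buffer length) and flags any frame sync that arrives while the pointer is elsewhere. It is a behavioural illustration only; the function name and the list-based input are assumptions, and the real pointer is a hardware register.

```python
def bmint_frames(buffer_length, words_per_frame):
    """Return the indices of frame syncs that would raise BMINT.

    buffer_length:   number of words in the ABU buffer
    words_per_frame: words actually transferred after each frame sync
    """
    pointer = 0  # offset from the top of the buffer
    flagged = []
    for frame, words in enumerate(words_per_frame):
        if pointer != 0:
            # Frame sync arrived while the pointer was not at the top.
            flagged.append(frame)
        pointer = (pointer + words) % buffer_length
    return flagged


# The third frame loses a word, so every later frame sync is misaligned.
print(bmint_frames(4, [4, 4, 3, 4, 4]))
```

In this model the pointer is never realigned, so once a word is lost every following frame sync is flagged until software reloads the pointer.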
1.5.6.4 Multichannel Buffered Serial Port (McBSP)
The 54x devices provide high-speed, full-duplex, multichannel buffered serial ports that allow direct
interface to other 54x devices, codecs, and other devices in a system. The multichannel buffered serial
ports (McBSPs) are based on the standard serial port interface found on other 54x devices. Like its
predecessors, the McBSP provides:
_ Full-duplex communication
_ Double-buffer data registers which allow a continuous data stream
_ Independent framing and clocking for receive and transmit
In addition, the McBSP has the following capabilities:
_ Direct interface to:
   _ T1/E1 framers
   _ MVIP switching-compatible and ST-BUS compliant devices
   _ IOM-2 compliant devices
   _ AC97-compliant devices
   _ IIS-compliant devices
   _ Serial peripheral interface
_ Multichannel transmit and receive of up to 128 channels
_ A wide selection of data sizes including 8, 12, 16, 20, 24, or 32 bits
_ μ-law and A-law companding
_ Programmable polarity for both frame synchronization and data clocks
_ Programmable internal clock and frame generation
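As a rough illustration of the μ-law companding listed among the capabilities above, the following Python sketch uses the continuous textbook μ-law curve with μ = 255. This is an assumption made for illustration only: companding hardware normally works with the segmented 8-bit encoding of ITU-T G.711, which this sketch does not reproduce bit for bit. The function names are hypothetical, and samples are assumed to be normalised to the range -1.0 to 1.0.

```python
import math

MU = 255.0  # value commonly used for mu-law telephony


def mu_law_compress(x, mu=MU):
    """Compress a normalised sample x in [-1, 1] with the continuous mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Invert mu_law_compress: recover the linear sample from y in [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)
```

Small amplitudes are stretched and large amplitudes are squeezed, so a sample of 0.01 maps to a compressed value well above 0.01, and `mu_law_expand(mu_law_compress(x))` returns `x` up to floating-point rounding.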
The McBSPs consist of separate transmit and receive channels that operate completely independently. The
external interface of each McBSP consists of the following pins:

_ BCLKX   Transmit reference clock
_ BDX     Transmit data
_ BFSX    Transmit frame sync
_ BCLKR   Receive reference clock
_ BDR     Receive data
_ BFSR    Receive frame sync
_ BCLKS   External clock reference for the programmable clock generator
The first six pins listed are identical to the previous serial port interfaces on the C5000 family of DSPs.
The BCLKS pin is an additional signal to provide a clock reference to the McBSP programmable clock
generator. As a compatibility option, BCLKS is not implemented on some packages. On the transmitter,
transmit frame synchronization and clocking are indicated by the BFSX and BCLKX pins, respectively.
The CPU or DMA can initiate transmission of data by writing to the data transmit register (DXR). Data
written to DXR is shifted out on the BDX pin through a transmit shift register (XSR). This structure
allows DXR to be loaded with the next word to be sent while the transmission of the current word is in
progress. On the receiver, receive frame synchronization and clocking are indicated by the BFSR and
BCLKR pins, respectively. The CPU or DMA can read received data from the data receive register
(DRR). Data received on the BDR pin is shifted into a receive shift register (RSR) and then buffered in the
receive buffer register (RBR). If DRR is empty, the RBR contents are copied into DRR.
If DRR is not empty, RBR holds the data until DRR is
available. This structure allows storage of the two previous words while the reception of the current word
is in progress.
To maintain pin compatibility with previous devices, not all 54x devices with
McBSPs implement the BCLKS pin. For this reason, select 54x devices allow either the receive clock pin
(BCLKR) or the transmit clock pin (BCLKX) to be configured as the input clock to the sample rate
generator. This enhancement is enabled through two register bits: bit 7 of the pin control register (PCR),
enhanced sample clock mode (SCLKME), and bit 13 of sample rate generator register 2 (SRGR2), McBSP
sample rate generator clock mode (CLKSM). SCLKME is an addition to the PCR contained in the
McBSPs on previous C5000 devices. The selection of the sample rate generator (SRG) clock input
source is made by the combination of the CLKSM and SCLKME bit values. When either of the
bidirectional pins, BCLKR or BCLKX, is configured as the clock input, its output buffer is automatically
disabled. For example, with SCLKME = 1 and CLKSM = 0, the BCLKR pin is configured as the SRG
input. In this case, both the transmitter and receiver circuits can be synchronized to the SRG output by
setting the PCR bits (9:8) for CLKXM = 1 and CLKRM = 1. However, the SRG output is only driven onto
the BCLKX pin because the BCLKR output is automatically disabled.
The CPU and DMA can move data
to and from the McBSPs and can synchronize transfers based on McBSP interrupts, event signals, and
status flags. The DMA is capable of handling data movement between the McBSPs and memory with no
intervention from the CPU. In addition to the standard serial port functions, the McBSP provides
programmable clock and frame sync generation. Among the programmable functions are:
_ Frame sync pulse width
_ Frame period
_ Frame sync delay
_ Clock reference (internal vs. external)
_ Clock division
_ Clock and frame sync polarity
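The selection of the sample rate generator clock source by the SCLKME and CLKSM bits, described above, can be sketched as a small Python lookup. Only one combination is stated in the text (SCLKME = 1, CLKSM = 0 selects BCLKR). The entry for SCLKME = 1, CLKSM = 1 is inferred from the statement that either BCLKR or BCLKX can be the input, and the two entries for SCLKME = 0 are assumptions about the behaviour inherited from earlier McBSPs; all three should be checked against the device data sheet. The function name is hypothetical.

```python
SCLKME_BIT = 7    # PCR bit 7, as stated in the text
CLKSM_BIT = 13    # SRGR2 bit 13, as stated in the text
CLKXM_BIT = 9     # PCR bit 9, from "PCR bits (9:8)" in the text
CLKRM_BIT = 8     # PCR bit 8, from "PCR bits (9:8)" in the text

SRG_SOURCES = {
    (0, 0): "BCLKS pin",   # assumption, not stated in the text
    (0, 1): "CPU clock",   # assumption, not stated in the text
    (1, 0): "BCLKR pin",   # stated in the text
    (1, 1): "BCLKX pin",   # inferred from the text
}


def srg_clock_source(pcr, srgr2):
    """Decode the SRG clock input from 16-bit PCR and SRGR2 register values."""
    sclkme = (pcr >> SCLKME_BIT) & 1
    clksm = (srgr2 >> CLKSM_BIT) & 1
    return SRG_SOURCES[(sclkme, clksm)]


# Example from the text: SCLKME = 1, CLKSM = 0, CLKXM = 1, CLKRM = 1.
example_pcr = (1 << SCLKME_BIT) | (1 << CLKXM_BIT) | (1 << CLKRM_BIT)
example_srgr2 = 0
```

Calling `srg_clock_source(example_pcr, example_srgr2)` returns `"BCLKR pin"`, matching the worked example in the text.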
The on-chip companding hardware allows compression and expansion of data in either μ-law or A-law
format. When companding is used, transmit data is encoded according to the specified companding law
and received data is decoded to 2s-complement format. The McBSP allows multiple channels to be
independently selected for the transmitter and receiver. When multiple channels are selected, each
frame represents a time-division multiplexed (TDM) data stream. In using TDM data
streams, the DSP CPU can be
programmed to process as many data streams as necessary for the specific application. Thus, to save
memory and bus bandwidth, multichannel selection allows independent enabling of particular
channels for transmission and reception. Up to a maximum of 32 channels in a 128-channel bit stream can
be enabled or disabled. Select devices have been enhanced to allow the enabling or disabling of up to 128
channels in a 128-channel bit stream.
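The idea of multichannel selection can be sketched in Python as follows. This is a simplified model under stated assumptions: the function names are hypothetical, a TDM frame is represented as a list with one word per channel, and any hardware restrictions on which channels may be enabled together are ignored. The limits of 128 channels per stream and 32 enabled channels come from the text.

```python
STREAM_CHANNELS = 128   # channels in the bit stream, from the text
MAX_ENABLED = 32        # limit on base devices, from the text


def enable_channels(channels, max_enabled=MAX_ENABLED):
    """Validate a set of channel numbers and return it as a set."""
    enabled = set(channels)
    if any(ch < 0 or ch >= STREAM_CHANNELS for ch in enabled):
        raise ValueError("channel number outside the 128-channel stream")
    if len(enabled) > max_enabled:
        raise ValueError("too many channels enabled")
    return enabled


def select_words(frame, enabled):
    """Keep only the words of the enabled channels from one TDM frame.

    Words of disabled channels are dropped, so they cost neither
    memory nor bus bandwidth.
    """
    return {ch: frame[ch] for ch in sorted(enabled)}
```

For a frame of 128 words, enabling channels 0, 5, and 127 leaves three words to store and process instead of 128. On the enhanced devices mentioned above, `max_enabled` would be 128.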

Serial Port Interface

Four different types of serial port interfaces are available on C54x devices. The basic standard serial
port interface is implemented on C541, C545, and C546 devices. The TDM serial port interface is
implemented on the C542, C543, C548, and C549 devices. The C542, C543, C545, C546, C548, and
C549 devices include a buffered serial port (BSP) that implements an automatic buffering feature, which
greatly reduces CPU overhead required in handling serial data transfers. The C5402, C5410, and C5420
devices include multichannel buffered serial ports (McBSPs). See Table 9-1 for information
about the features included in various C54x devices. The BSP operates in either autobuffering or
nonbuffered mode. When operated in nonbuffered (or standard) mode, the BSP functions the same as the
basic standard serial port (except where specifically indicated) and is described in this section. The TDM
serial port operates in either TDM or non-TDM mode. When operated in non-TDM (or standard) mode,
the TDM serial port also functions the same as the basic standard serial port and is described
in this section. The BSP also implements several enhanced features in standard mode. These features,
together with operation of the BSP in autobuffering mode, are described in section 9.3, Buffered Serial
Port (BSP) Interface, on page 9-33. Therefore, when using the C542, C543, C545, C546, C548, and C549
devices, you should consult section 9.3. Operation of the TDM serial port in TDM mode is described in
section 9.4, Time-Division Multiplexed (TDM) Serial Port Interface, on page 9-56. Note that the BSP and
TDM serial ports initialize to a standard serial port compatible mode upon reset. In all C54x DSP serial
ports, both receive and transmit operations are double-buffered, thus allowing a continuous
communications stream with either 8-bit or 16-bit data packets. The continuous mode provides operation
that, once initiated, requires no further frame synchronization pulses (FSR and FSX) when transmitting at
maximum packet frequency. The serial ports are fully static and thus will function at arbitrarily low
clocking frequencies. The maximum operating frequency for the standard serial port, one-fourth of
CLKOUT (10 Mbit/s at 25 ns, 12.5 Mbit/s at 20 ns), is achieved when using internal serial port clocks. The
maximum operating frequency for the BSP is CLKOUT. When the serial ports are in reset, the device may
be configured to turn off the internal serial port clocks, allowing the device to run in a lower
power mode of operation.
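The bit rates quoted above follow directly from the CLKOUT cycle time. A short Python check, assuming one bit is transferred per serial clock cycle (the function names are hypothetical):

```python
def clkout_hz(cycle_ns):
    """CLKOUT frequency in Hz for a given cycle time in nanoseconds."""
    return 1e9 / cycle_ns


def standard_port_max_bps(cycle_ns):
    """Standard serial port: one-fourth of CLKOUT, per the text."""
    return clkout_hz(cycle_ns) / 4


def bsp_max_bps(cycle_ns):
    """BSP: CLKOUT itself, per the text."""
    return clkout_hz(cycle_ns)


print(standard_port_max_bps(25) / 1e6)   # 10.0 Mbit/s
print(standard_port_max_bps(20) / 1e6)   # 12.5 Mbit/s
print(bsp_max_bps(20) / 1e6)             # 50.0 Mbit/s
```

A 25 ns cycle corresponds to a 40 MHz CLKOUT and a 20 ns cycle to 50 MHz, which gives the 10 Mbit/s and 12.5 Mbit/s figures for the standard serial port.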
