
Hardware/Software Co-Design of Kalman Filter for Radar Applications
M.Sc. Computer Engineering
Project Presentation

Fakhar Ahsan
2007-06-0016

Outline
 Introduction
 Purpose & Overview
 Adaptive filtering
 Kalman Filter
 Hardware/Software Codesign
 Co design of Kalman filter
 Accelerator Design & Implementation
– Flowchart
– FSM
 Synthesis results
 Performance evaluation
 Summary

Introduction
 Signal Processing
– Digital Signal Processing
– Digital Filtering

 Purpose

– To develop an efficient implementation of real-time adaptive filtering, using:
 Embedded Computing
 Hardware Accelerator

Digital Filters
 Fixed-coefficient Filters

 Programmable Filters

 Adaptive Filters

Adaptive Filtering
 Why do we need Adaptive filtering?

– Spectral overlap between the signal and noise

– Interfering signal's statistics change with time

– Statistics of the noise are not known

– Statistics of the noise change over time

Adaptive Filtering

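[Figure: adaptive filter. The input x(n) passes through a variable filter with weights ωn to give the output y(n); an update algorithm supplies the weight correction ∆ωn.]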
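The variable-filter-plus-update-algorithm structure can be sketched with the LMS algorithm, one of the algorithms this presentation lists. This is a minimal illustration, not the project's code: the reference signal `d`, the step size `mu`, and the tap count are assumptions that do not appear on the slide.

```python
def lms_filter(x, d, num_taps=4, mu=0.05):
    """Adapt FIR weights so that the filter output y(n) tracks d(n).

    x: input samples, d: reference (desired) samples. Both are
    hypothetical inputs chosen for illustration.
    """
    w = [0.0] * num_taps        # variable filter weights (omega_n)
    history = [0.0] * num_taps  # most recent inputs, newest first
    outputs, errors = [], []
    for x_n, d_n in zip(x, d):
        history = [x_n] + history[:-1]
        y_n = sum(wi * xi for wi, xi in zip(w, history))  # y(n)
        e_n = d_n - y_n                                   # error e(n)
        # Update algorithm: delta omega_n = mu * e(n) * x(n)
        w = [wi + mu * e_n * xi for wi, xi in zip(w, history)]
        outputs.append(y_n)
        errors.append(e_n)
    return w, outputs, errors
```

When `d` is produced by an unknown FIR system driven by `x`, the returned weights approach that system's coefficients, provided `mu` is small enough for the loop to stay stable.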

Adaptive Filtering Algorithms
 Least Mean Squares (LMS) Algorithm

 Recursive Least Squares (RLS)

 Kalman Filtering

 Wiener Filtering

Kalman Filter
 Advantages

– Gain coefficients are computed dynamically.


– Provides an accurate measure of the covariance
matrix.
– Kalman filter makes it possible to partially compensate for the effects of miss-correlation.
– Error settles down more quickly.

Kalman Filter
 Applications

– Used in radar tracking systems


 Smoothing Filter

– Autonomous or assisted navigation


 Unmanned Aerial Vehicles (UAV)

Typical Kalman Filter Application

[Figure: controls and system error sources act on the system. A measuring device, affected by measurement error sources, turns the system state (desired but not known) into the observed measurement, from which the Kalman filter produces the optimal estimate of the system state.]

Kalman Filter
 MATLAB Simulations

[Figure: MATLAB simulation model. External noise and a measurement error source act on the system and measuring device, which map the system state (desired but not known) to the observed measurement.]


Kalman Filter
 Radar Tracking
– Position, Velocity, and Acceleration
– Azimuth Angle, Azimuth Angle Rate, and Azimuth Angle Acceleration

– Elevation Angle, etc.


 Kalman Filter with state vector containing
three variables is needed in all of the above cases.

Kalman Filter
 Operation of Kalman Filter

[Figure: the filter alternates between a predict (estimate) step and a measurement update step.]

Block Diagram of Kalman Filter

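[Figure: Kalman filter block diagram with predict and update paths. Labels: observation Xo(n), gain Kn, estimate X(n), delay, previous estimate X(n-1), transition matrix Φ, observation matrix Hn.]
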
Kalman Tracker
 State Equation
– prediction

Position

Velocity

Acceleration
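The state equation itself did not survive extraction. As a sketch, the prediction step for a position/velocity/acceleration state can be written with the transition matrix Φ = [[1, T, T²/2], [0, 1, T], [0, 0, 1]] that appears later in the matrix optimization slide. The function names and the sample interval argument `T` are illustrative assumptions.

```python
def transition_matrix(T):
    """Constant-acceleration transition matrix PHI for sample interval T."""
    return [
        [1.0, T, T * T / 2.0],
        [0.0, 1.0, T],
        [0.0, 0.0, 1.0],
    ]


def predict_state(x, T):
    """Return PHI * x for the state x = [position, velocity, acceleration]."""
    phi = transition_matrix(T)
    return [sum(phi[i][j] * x[j] for j in range(3)) for i in range(3)]
```

For example, `predict_state([0.0, 10.0, 2.0], 1.0)` advances the position by velocity plus half the acceleration, advances the velocity by the acceleration, and leaves the acceleration unchanged.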


Kalman Tracker

[Figure: Kalman filter block diagram repeated for the tracker. Labels: Xo(n), Kn, estimate, delay, Hn, Φ.]

Kalman Tracker
 Computational Complexity
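[Figure: Kalman filter block diagram with the Kalman gain update marked.]

$$
\begin{bmatrix} X_{n1} \\ X_{n2} \\ X_{n3} \end{bmatrix}
=
\begin{bmatrix}
\Phi_{11} & \Phi_{12} & \Phi_{13} \\
\Phi_{21} & \Phi_{22} & \Phi_{23} \\
\Phi_{31} & \Phi_{32} & \Phi_{33}
\end{bmatrix}
\begin{bmatrix} X_{1} \\ X_{2} \\ X_{3} \end{bmatrix}
$$
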

Kalman Tracker
 Kalman gain computation
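– Error covariance extrapolation
 S = PHI * S * PHIT + Q

$$
\begin{bmatrix}
S_{11} & S_{12} & S_{13} \\
S_{21} & S_{22} & S_{23} \\
S_{31} & S_{32} & S_{33}
\end{bmatrix}
=
\begin{bmatrix}
\Phi_{11} & \Phi_{12} & \Phi_{13} \\
\Phi_{21} & \Phi_{22} & \Phi_{23} \\
\Phi_{31} & \Phi_{32} & \Phi_{33}
\end{bmatrix}
\begin{bmatrix}
S_{11} & S_{12} & S_{13} \\
S_{21} & S_{22} & S_{23} \\
S_{31} & S_{32} & S_{33}
\end{bmatrix}
\begin{bmatrix}
\Phi_{11} & \Phi_{21} & \Phi_{31} \\
\Phi_{12} & \Phi_{22} & \Phi_{32} \\
\Phi_{13} & \Phi_{23} & \Phi_{33}
\end{bmatrix}
+
\begin{bmatrix}
Q_{11} & Q_{12} & Q_{13} \\
Q_{21} & Q_{22} & Q_{23} \\
Q_{31} & Q_{32} & Q_{33}
\end{bmatrix}
$$

– Gain calculation
 k1 = S11 / (S11 + R)
 k2 = S12 / (S11 + R)
 k3 = S13 / (S11 + R)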
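The two steps of the gain computation can be sketched in plain Python as follows. This is an illustration rather than the accelerator's implementation. The function names are hypothetical, and the gain formula assumes, as the k1..k3 expressions suggest, that only the first state variable is measured, with measurement noise variance R.

```python
def matmul3(a, b):
    """Dense 3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose3(a):
    """Transpose of a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def extrapolate_covariance(S, phi, Q):
    """Error covariance extrapolation: S = PHI * S * PHIT + Q."""
    propagated = matmul3(matmul3(phi, S), transpose3(phi))
    return [[propagated[i][j] + Q[i][j] for j in range(3)] for i in range(3)]


def kalman_gain(S, R):
    """Gain calculation: k_i = S[0][i] / (S11 + R) for i = 0, 1, 2."""
    denom = S[0][0] + R
    return [S[0][0] / denom, S[0][1] / denom, S[0][2] / denom]
```

Calling `kalman_gain(extrapolate_covariance(S, phi, Q), R)` once per sample reproduces the order of operations on the slide: extrapolate the covariance first, then derive the three gains from its first row.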

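Computational complexity of Kalman Tracker

 One element of a 3x3 matrix product: 3M + 2A (M = multiplication, A = addition)
 9(3M + 2A) = 27M + 18A for a 3x3 matrix multiplication

 PHI * S * PHIT => 54M + 36A

 Compute Kalman Gain
– 54 Multipliers + 45 Adders (including the 9 additions of Q)
– + 3 Div + 3 Adders for the gain update equation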
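The 54M + 45A figure can be checked by counting operations in a straightforward dense implementation. The sketch below is only a counting aid. The `OpCounter` class and function names are hypothetical, and the matrix values are placeholders because only the operation counts matter.

```python
class OpCounter:
    """Tallies scalar multiplications and additions."""

    def __init__(self):
        self.mul = 0
        self.add = 0


def counted_matmul3(a, b, ops):
    """Dense 3x3 product that records every multiply and add it performs."""
    out = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            acc = a[i][0] * b[0][j]
            ops.mul += 1
            for k in (1, 2):
                acc += a[i][k] * b[k][j]
                ops.mul += 1
                ops.add += 1
            out[i][j] = acc
    return out


def count_covariance_update_ops():
    """Return (multiplications, additions) for S = PHI * S * PHIT + Q."""
    ops = OpCounter()
    phi = [[1.0, 1.0, 0.5], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
    s = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    q = [[0.1] * 3 for _ in range(3)]
    phi_t = [[phi[j][i] for j in range(3)] for i in range(3)]
    result = counted_matmul3(counted_matmul3(phi, s, ops), phi_t, ops)
    for i in range(3):
        for j in range(3):
            result[i][j] += q[i][j]
            ops.add += 1
    return ops.mul, ops.add
```

Each of the two products contributes 27 multiplications and 18 additions, and adding Q contributes the remaining 9 additions.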

Hardware/Software Codesign
 A mixture of
– Off-The-Shelf Processors / software
– Specialized hardware

 Execute portions of the application as software programs and the other portions as customized hardware implementations

Hw/Sw Codesign
 Motivation for Codesign
– Hardware provides greater performance
– Software adds flexibility to the system
– The availability of processor cores that can be easily embedded into an ASIC/FPGA design
– The increased efficiency of higher-level language compilers, which makes writing efficient code for embedded processors much easier and less time consuming


Co-design using FPGAs


 Embedded Processors (hard or soft)
 On-Chip Busses
 Memory
 I/Os and Interface Standards
 Hardware Accelerators
– Custom IP cores

Co-design with Xilinx EDK

[Figure: example EDK system. A host processor (e.g. MicroBlaze) connects over a peripheral bus to a UART, GPIO, debug module, Ethernet controller, memory controller with memory, and VGA.]

Co-design with Xilinx EDK


 Processors
– PowerPC
– MicroBlaze

Hardware
o Customized system
o Processor + Peripherals
o Custom IP Core + Co-processor
o Component configuration
o Bus Connections
o Simulation and Implementation
o Bit Stream generation
o Download design to FPGA

Software
o Platform specific Software Libraries
o Device drivers
o Embedded Operating System
o Compiler + Debugger
o Machine Code generation
o Boot Memory Initialization

Accelerator Design & Implementation
 Computations in FPGAs

– Floating-point implementation is an issue
 Floating-point computation requires a lot more resources than fixed-point computation
– Fixed-point implementation should be preferred


Fixed–Point Implementation Issues


 Finite Precision Effects
– Coefficient quantization error
 caused by representing the coefficients by a finite number of bits
– Overflow error
 caused by the addition of two large numbers
– Round-off error
 caused when the result of a multiplication is rounded (or truncated) to the nearest discrete value

Fixed–Point Implementation Issues
 Fixed –Point Scale Factor Adjustment

 (s · x + s · y ) = s · (x + y)

 (s · x * s · y ) / s = s · (xy)

Scale value: s = 1024 = 2^10 (resolution of about three decimal places)
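The two scale factor identities can be tried out with ordinary integers. This is a minimal sketch with hypothetical helper names. It uses Python integers, which never overflow, so the overflow issue mentioned earlier does not show up here, and Python's `//` rounds toward negative infinity, which may differ from the truncation used in hardware or C.

```python
SCALE = 1024  # s = 2**10, as on the slide


def to_fixed(value):
    """Quantize a real value to the nearest multiple of 1/SCALE."""
    return int(round(value * SCALE))


def to_float(fixed):
    """Convert a scaled integer back to a real value."""
    return fixed / SCALE


def fixed_add(a, b):
    """(s*x + s*y) = s*(x + y): sums need no rescaling."""
    return a + b


def fixed_mul(a, b):
    """(s*x * s*y) / s = s*(x*y): products are divided by s once."""
    return (a * b) // SCALE
```

For instance, `to_float(fixed_mul(to_fixed(1.5), to_fixed(2.25)))` recovers the product of the two values, while a value such as 0.001 is stored with a quantization error below 1/1024.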


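Matrix Calculation Optimization

$$
\begin{bmatrix}
S_{11} & S_{12} & S_{13} \\
S_{21} & S_{22} & S_{23} \\
S_{31} & S_{32} & S_{33}
\end{bmatrix}
=
\begin{bmatrix}
1 & T & \frac{T^2}{2} \\
0 & 1 & T \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
S_{11} & S_{12} & S_{13} \\
S_{21} & S_{22} & S_{23} \\
S_{31} & S_{32} & S_{33}
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
T & 1 & 0 \\
\frac{T^2}{2} & T & 1
\end{bmatrix}
+
\begin{bmatrix}
Q_{11} & Q_{12} & Q_{13} \\
Q_{21} & Q_{22} & Q_{23} \\
Q_{31} & Q_{32} & Q_{33}
\end{bmatrix}
$$

$$
\begin{bmatrix}
S_{11} & S_{12} & S_{13} \\
S_{21} & S_{22} & S_{23} \\
S_{31} & S_{32} & S_{33}
\end{bmatrix}
=
\begin{bmatrix}
S_{11} + S_{21} T + S_{31} \frac{T^2}{2} & S_{12} + S_{22} T + S_{32} \frac{T^2}{2} & S_{13} + S_{23} T + S_{33} \frac{T^2}{2} \\
S_{21} + S_{31} T & S_{22} + S_{32} T & S_{23} + S_{33} T \\
S_{31} & S_{32} & S_{33}
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
T & 1 & 0 \\
\frac{T^2}{2} & T & 1
\end{bmatrix}
+
\begin{bmatrix}
Q_{11} & Q_{12} & Q_{13} \\
Q_{21} & Q_{22} & Q_{23} \\
Q_{31} & Q_{32} & Q_{33}
\end{bmatrix}
$$
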
Advantage:
(9M + 9A) instead of (27M +18A)
Total: (30M + 27A) instead of (54M +45A)
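The saving on the first product comes from the ones and zeros in PHI. The sketch below, with hypothetical function names, compares a dense PHI * S with a version that only computes the non-trivial terms. With T²/2 precomputed, the structured version uses 9 multiplications and 9 additions, in line with the (9M + 9A) figure.

```python
def phi_times_s_dense(S, T):
    """Reference: full 3x3 product PHI * S."""
    phi = [[1.0, T, T * T / 2.0], [0.0, 1.0, T], [0.0, 0.0, 1.0]]
    return [[sum(phi[i][k] * S[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def phi_times_s_structured(S, T):
    """PHI * S using the known ones and zeros of PHI."""
    half_t2 = T * T / 2.0  # constant for a fixed sample interval
    row0 = [S[0][j] + S[1][j] * T + S[2][j] * half_t2 for j in range(3)]
    row1 = [S[1][j] + S[2][j] * T for j in range(3)]
    row2 = list(S[2])  # last row of PHI is [0, 0, 1], so this row is copied
    return [row0, row1, row2]
```

Both functions return the same matrix for any S and T, so the structured form can replace the dense product without changing the filter's behaviour.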
Kalman Accelerator Core Design
 Software Part
 FSM for Kalman Accelerator
 State Diagram of Kalman Accelerator
 PLB slave bus interface with Kalman Accelerator
 Mapping of Software Accessible Registers


Synthesis Results
 Kalman Accelerator:
– 64x32-bit dual-port RAMs: 3
– Multipliers: 34
– Adders/Subtractors: 140
– Device Utilization:
 Number of Slices: 3,783 out of 13,696 (27%)
 Maximum Freq.: 43 MHz

 Overall System Device Utilization
– Number of occupied Slices: 5,939 out of 13,696 (43%)
– Number of MULT18X18s: 88 out of 136 (64%)
– Maximum Freq.: 43 MHz

Performance Evaluation
 Final System Design

[Figure: final system. A MicroBlaze processor connects over the Processor Local Bus (PLB v4.6) to a UART, the Kalman accelerator, and a timer module.]

Performance Evaluation
 Experimental Results
– For a block of data (block size 64)
 Software Only
– Filtering operation takes 28305 clock cycles at 33 MHz.
 Hw/Sw Codesign
– Filtering operation inclusive of data read and coefficient write
operations, takes 7366 clock cycles at 33 MHz.
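The reported cycle counts translate into execution time and speedup as follows. The constants are taken from the results above; the helper names are illustrative.

```python
CLOCK_HZ = 33_000_000   # 33 MHz system clock
SW_CYCLES = 28305       # software-only filtering, block of 64
HWSW_CYCLES = 7366      # hw/sw codesign, including data read and coefficient write


def cycles_to_microseconds(cycles, clock_hz=CLOCK_HZ):
    """Execution time in microseconds for a given cycle count."""
    return cycles / clock_hz * 1e6


def speedup(baseline_cycles, accelerated_cycles):
    """Ratio of baseline to accelerated cycle counts."""
    return baseline_cycles / accelerated_cycles
```

With these numbers, `speedup(SW_CYCLES, HWSW_CYCLES)` is roughly 3.84, which corresponds to about 858 microseconds per block in software against about 223 microseconds with the accelerator.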

Summary
 Tools Used
– MATLAB 7.3 (R2006b)
– Xilinx ISE v10.1, XST (for Logic Synthesis)
– Xilinx ISE v10.1, ISE Simulator (for Simulation)
– Xilinx EDK v10.1 (for Software Compilation, HW/SW merger & Implementation)


Summary
 Conclusion
– A Kalman filter accelerator for a MicroBlaze processor-based system was designed and implemented with a standard PLB interface on a Xilinx Virtex-II Pro FPGA development board
– The hardware/software codesign approach is almost four times faster than the software-only approach (28305 vs. 7366 clock cycles per block of 64, about 3.8x).

Thank You !