
CS 684 PROJECT REPORT

Depth Map Acquisition from Stereo Cameras

Team Blind Gear


Aswin (120070056)
Yeshwanth (120070050)

November 8, 2014

Contents
1. Introduction
1.1 Abbreviations and Definitions
2. Problem Statement
3. Possible Methods
3.1 Raspberry Pi and Web-Cams
3.2 FPGA Based Camera Controller
4. Requirements
4.1 Hardware Requirements
4.2 Software Requirements
5. Limitations
5.1 Limitations of Raspberry-Pi
5.2 Limitations of FPGA
6. Implementation
6.1 FPGA Modules
6.2 Additional Hardware
6.3 Final Design
7. Future Work
7.1 Target Module
7.2 Compact Model
8. Conclusion
References
Gallery

1 Introduction

Stereo vision is a way to obtain information about the depth and features of objects
in a 3D environment. We sense the differences between the images obtained from two
camera sensors placed in parallel and use those differences to estimate a depth map
of the 3D environment. Our eyes work on the same principle: the brain creates a
3D map of the view in front of us by computing the differences between the images
from the two eyes. Our project aim was to achieve the same result using a pair of
cheap camera sensors available in the market. Our first task was to acquire
synchronous images from the cameras, a basic necessity if the images are to be
processed later.
1.1 Abbreviations and Definitions
Raspberry-Pi : A credit card-sized single-board computer which offers
computer-like processing power in cheap hardware.
FPGA : A Field-Programmable Gate Array is an integrated circuit which
can be configured after manufacturing, allowing us to design
application-specific circuits at a fraction of the cost of ASIC chips.
OV7670 : A cheap camera sensor available in the market which provides various
processing capabilities with a built-in DSP.

2 Problem Statement

Our project's aim is to create a depth map of the surroundings using images from a
pair of cameras. Our setup consists of two cameras placed parallel to each other.
We have to acquire synchronous images from the cameras; this is the primary task
at hand, since open-source libraries are already available which can extract depth
information from stereo images. We wish to achieve this with hardware that is as
simple and cheap as possible. Our final deliverable stereo camera setup would look
like figure 1 (shown just for graphic representation).

Figure 1.
Two cameras set up in parallel on a planar surface.

3 Possible Methods

We can achieve the final objective in many ways, thanks to the vast number of
development boards and cheap cameras available in the market. We arrived at two
possible solutions that fit the resources available to us, could be attempted in a
short period of time, and seemed good enough to meet our requirements.
3.1 Raspberry Pi and Web-Cams
The Raspberry Pi B+ has a moderate processor and 4 USB ports. This credit
card-sized mini-computer could be a portable solution to our problem. We can use
two web-cams, one Raspberry-Pi and the OpenCV image capture libraries to capture
continuous images from the cameras, ideally with a minimal time shift between the
frames. We have to measure the delay between the frames from the two cameras at
every instant; this delay should be as small as possible.

3.2 FPGA Based Camera Controller


To explain how an FPGA can be the perfect solution to our problem, we first have
to explain what exactly an FPGA is. Let us take a small detour into this topic.
What is an FPGA?
FPGA stands for Field-Programmable Gate Array. That name may not mean much by
itself, but what the device can do is remarkable. Imagine you want a special
hardware design made of hundreds of logic gates, with performance equivalent to a
specially designed chip. Then you are looking for an FPGA. This piece of hardware
allows us to design real logic circuits through code, so we can prototype real
circuits in a matter of minutes with performance almost equivalent to an
integrated circuit.
How is this possible? How can we change circuits made of hundreds of logic gates
in a matter of minutes? For this we use a Hardware Description Language (HDL). We
create objects for the circuit elements in code and describe how the connections
between these elements are to be made. This lets us apply a basic understanding of
a circuit element's working principle and create a high-performance equivalent:
we simply write code describing the circuit, and the circuit takes shape right in
our hands.
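As a small illustration of the idea (not code from our project), the following
Verilog describes a 4-bit counter; the synthesis tool turns this description into
real flip-flops and gates:

    // A minimal Verilog sketch (illustrative only): a 4-bit counter.
    module counter4 (
        input  wire       clk,
        input  wire       reset,        // synchronous reset
        output reg  [3:0] count
    );
        // On every rising clock edge, either clear or increment the counter.
        always @(posedge clk) begin
            if (reset)
                count <= 4'd0;
            else
                count <= count + 4'd1;
        end
    endmodule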
What does this allow us to do? When we want to achieve high-speed data acquisition
for a very specific device like a camera sensor, we can create a digital circuit
to do exactly that. This allows us to make hardware drivers for specific devices
at a fraction of the usual cost and time.

Since we can design precise clock-driven drivers for the devices, we can make two
drivers synchronous by feeding them the same clock.
This seems to be the perfect solution, but writing drivers for a device can be
really time-consuming and offers no guarantee of success. Keeping this in mind, we
planned to try the Raspberry-Pi first rather than the FPGA.

4 Requirements

We required different hardware for the two solutions because they are very
different in nature.
4.1 Hardware Requirements
For the first method we needed:
Raspberry-Pi, two web-cams, monitor, keyboard, mouse, HDMI-to-VGA converter.
For the second method we needed:
FPGA board (with a fast DAC, VGA port and 40 GPIO pins), two OV7670 camera
sensors, monitor.
4.2 Software Requirements
For the first method we needed:
OpenCV libraries, Python, terminal, Windows 7/8.
For the second method we needed:
Quartus Web Edition, Windows 7/8.

5 Limitations

No project comes without challenges. We faced many challenges along the way,
mostly as limitations of the hardware available to us.
5.1 Limitations of Raspberry-Pi
The Raspberry-Pi was designed to provide a complete, cheap computing solution on a
card-sized board, and in particular it was designed not to consume too much power.
This power-saving feature was a big hurdle for us, since web-cams draw a lot of
power. Even though the board has 4 USB ports, the ports have power constraints:
unable to handle the current requirements, the board shuts down one of the ports
when we try to access images from both cameras at once.
So we tried to access images one after another, so that both cameras would not be
on at the same time. But this produced a time shift between the frames of at least
0.05 seconds. We consulted our professors, and it was decided not to go ahead with
this method: we should at least have synchronous images, if not a depth map. The
steps we take should at least be in the right direction if we aim to succeed.
Since we were not able to use both cameras with a Raspberry-Pi, we had to abandon
this approach.
5.2 Limitations of FPGA
We now had no choice but to use an FPGA to solve our problem, and FPGAs have
limitations of their own. The number of logic elements on an FPGA is limited, so
we cannot have unlimited memory elements on the board. And even with the camera
restricted to the 160*120 format, we still needed a DAC fast enough to drive the
VGA port, which is an analog port. We had a DE0-Nano FPGA board, but it lacked the
necessary hardware.
Our choice of FPGA board was heavily dependent on the availability of an on-board
DAC and VGA port. Luckily we had the DE2i-150 in our lab, which had all the
required hardware on board.

6 Implementation

We have not detailed how we acquired images with the Raspberry-Pi setup, since we
did not use it beyond the initial experiments.
6.1 FPGA Modules
The OV7670 is a cheap camera sensor but has many DSP features built in. We can
control the DSP on the chip through the various register values present in the
camera. The basic state diagram for the camera sensor is given in figure 2.

Figure 2.
Basically we need:
1. I2C Controller
2. Register Bank
3. Capture Logic
4. Buffer Memory
I2C Controller
The OV7670 uses SCCB (Serial Camera Control Bus) to communicate register settings.
SCCB is similar to I2C, so we have used an I2C controller in place of an SCCB one.
The I2C controller comes with the Register Bank, which holds all the register
addresses and the values to be set at initialization time.
The final module contains one I2C controller and the Register Bank connected as
one. The controller design is given below in figure 3.

Figure 3.
The OV7670 needs a clock to send data back, so we feed it a clock through xclk.
This is very important: without xclk the OV7670 will not respond. This pin is also
part of the controller.
Some settings needed to operate the camera are:

Register Name    Register Address (hex)    Register Value (hex)    Action
COM7             12                        80                      Reset
CLKRC            11                        -                       Pre-scaler for XCLK
COM7             12                        -                       Pixel format
COM15            40                        E0                      RGB format
EDGE             3F                        92                      Edge enhancement
MIRROR           1E                        20                      Orientation
DE-NOISE         4C                        -                       Level of de-noise

(Entries marked "-" were not recorded here; see the OV7670 datasheet.)
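To give a feel for how the Register Bank is written in an HDL, here is a minimal
Verilog sketch. The (address, value) pairs are taken from the table above; the
module name and interface are illustrative, not the exact code used in our design,
and entries whose values the table does not record are omitted.

    // Illustrative Register Bank sketch: a lookup table of
    // {register address, register value} pairs for OV7670 initialization.
    module ov7670_reg_bank (
        input  wire [3:0]  index,  // entry requested by the I2C controller
        output reg  [15:0] data    // {register address, register value}
    );
        always @(*) begin
            case (index)
                4'd0:    data = 16'h12_80;   // COM7: reset
                4'd1:    data = 16'h40_E0;   // COM15: RGB format
                4'd2:    data = 16'h3F_92;   // EDGE: edge enhancement
                4'd3:    data = 16'h1E_20;   // MIRROR: orientation
                // CLKRC (11), COM7 pixel format (12) and DE-NOISE (4C)
                // values were not recorded above; see the OV7670 datasheet.
                default: data = 16'hFF_FF;   // marker: end of the list
            endcase
        end
    endmodule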

Capture Logic
The capture logic receives the data being sent by the camera and forwards it to
the buffer memory along with a calculated address.
The camera sends data in step with PCLK and HREF, as given in the OV7670
datasheet and reproduced below as figure 4.

Figure 4.
The beginning of every frame is signalled by VSYNC going from high to low, and the
beginning of every row of data by HREF going from low to high. One byte of data is
sent on every PCLK clock cycle.
The Capture Logic calculates the location of the pixel and sends the data and the
address to the memory buffer for storage.
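A much-simplified Verilog sketch of such capture logic is given below. The signal
names are ours, the 160*120 frame is assumed to arrive as two bytes per RGB565
pixel, and details such as input synchronization are omitted; this illustrates the
principle rather than reproducing our exact module.

    // Simplified capture-logic sketch (illustrative signal names).
    // Bytes are sampled on PCLK while HREF is high; VSYNC resets the frame.
    module capture_logic (
        input  wire        pclk,       // pixel clock from the camera
        input  wire        vsync,      // high between frames
        input  wire        href,       // high while a row is being sent
        input  wire [7:0]  d,          // one data byte per PCLK cycle
        output reg  [14:0] wr_addr,    // 160*120 = 19200 pixels -> 15 bits
        output reg  [15:0] wr_data,    // assembled RGB565 pixel
        output reg         wr_en       // pulse: write wr_data at wr_addr
    );
        reg byte_phase;                // 0 = first byte, 1 = second byte

        always @(posedge pclk) begin
            if (vsync) begin           // between frames: reset the counters
                wr_addr    <= 15'd0;
                byte_phase <= 1'b0;
                wr_en      <= 1'b0;
            end else begin
                if (wr_en)             // a pixel was just written:
                    wr_addr <= wr_addr + 15'd1;  // advance to next address
                wr_en <= 1'b0;
                if (href) begin        // valid row data
                    if (!byte_phase)
                        wr_data[15:8] <= d;      // high byte of the pixel
                    else begin
                        wr_data[7:0]  <= d;      // low byte completes it
                        wr_en         <= 1'b1;
                    end
                    byte_phase <= ~byte_phase;
                end
            end
        end
    endmodule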
Buffer Memory
The Buffer Memory stores the image pixel values on the FPGA. The size of the
buffer depends on the pixel format; it is set to 160*120, the minimum possible, so
as to reduce the memory requirements.
The Buffer Memory is connected to the Capture Logic. The block is implemented
through an Altera memory IP, which creates a memory block within the SDRAM
present on the board so as to save logic elements on the FPGA.
The Buffer Memory's write clock is fed with PCLK from the camera, through the
Capture Logic.
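Our design instantiates this block through the Altera memory IP in Quartus. Purely
to illustrate the interface, an equivalent dual-clock RAM can be sketched in plain
Verilog as below (Quartus would map this inferred RAM to on-chip memory rather
than SDRAM; names and sizes are ours, for one 160*120 RGB565 frame):

    // Illustrative dual-clock frame buffer: written with the camera's
    // PCLK, read with the VGA clock.
    module frame_buffer (
        input  wire        wr_clk,    // PCLK, via the Capture Logic
        input  wire        wr_en,
        input  wire [14:0] wr_addr,
        input  wire [15:0] wr_data,
        input  wire        rd_clk,    // clock of the VGA Controller
        input  wire [14:0] rd_addr,
        output reg  [15:0] rd_data
    );
        reg [15:0] mem [0:19199];     // 160*120 = 19200 pixels

        always @(posedge wr_clk)      // write port, camera clock domain
            if (wr_en)
                mem[wr_addr] <= wr_data;

        always @(posedge rd_clk)      // read port, VGA clock domain
            rd_data <= mem[rd_addr];
    endmodule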

The design of Capture Logic and Buffer Memory is given below as figure 5.

Figure 5.

Finally, we can now start the OV7670, receive its data and store it in our memory.
To check the images we need to view the results, and for that we need additional
hardware designs.

6.2 Additional Hardware


We used the VGA port to view the image stream on a monitor.
For sending the data we need:
1. DAC chip
2. VGA Controller
The FPGA development board had a DAC chip on it, so we only had to design the VGA
Controller, which reads the Buffer Memory and sends the data to the DAC chip; the
DAC converts the digital data into an analog signal and sends it to the VGA port.
The DAC chip needs a few extra signals beyond the regular VGA signals.


VGA Controller
The VGA Controller reads pixel data at a rate of 50 MPixels/s.
The VGA Controller is connected to the Buffer Memory, and its design is given
below as figure 6.

Figure 6.

The HSYNC and VSYNC signals denote row end and frame end.
The vga_sync_n signal should always be kept at 0 and vga_blank_n should always be
kept at 1; this is specific to the DAC chip on this board.
The vga_clk is the same as the system clock, i.e. the 50 MHz clock used throughout
the system. The data transmission proceeds as shown below in figure 7.


Figure 7.
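For illustration, a timing generator for a 50 MHz pixel clock can be sketched as
below. We use the VESA 800*600 at 72 Hz counts, since that mode's pixel clock is
exactly 50 MHz; the counts and names are our assumption, not necessarily the exact
ones used in the project, and the vga_sync_n/vga_blank_n levels follow the rule
stated above.

    // Illustrative VGA timing generator for a 50 MHz pixel clock.
    // Counts below are VESA 800x600@72 Hz (assumed, not project-verified).
    module vga_sync (
        input  wire        vga_clk,      // 50 MHz system/pixel clock
        output reg  [10:0] x,            // current column (0..1039)
        output reg  [9:0]  y,            // current row    (0..665)
        output wire        hsync,
        output wire        vsync,
        output wire        active,       // high inside the visible area
        output wire        vga_sync_n,   // DAC-specific, tied low
        output wire        vga_blank_n   // DAC-specific, tied high
    );
        // Horizontal: 800 visible + 56 front + 120 sync + 64 back = 1040.
        localparam H_VIS = 800, H_SYNC_ON = 856, H_SYNC_OFF = 976, H_TOT = 1040;
        // Vertical: 600 visible + 37 front + 6 sync + 23 back = 666.
        localparam V_VIS = 600, V_SYNC_ON = 637, V_SYNC_OFF = 643, V_TOT = 666;

        initial begin x = 0; y = 0; end

        always @(posedge vga_clk) begin
            if (x == H_TOT - 1) begin
                x <= 11'd0;
                y <= (y == V_TOT - 1) ? 10'd0 : y + 10'd1;
            end else
                x <= x + 11'd1;
        end

        // This VESA mode uses positive sync polarity.
        assign hsync  = (x >= H_SYNC_ON) && (x < H_SYNC_OFF);
        assign vsync  = (y >= V_SYNC_ON) && (y < V_SYNC_OFF);
        assign active = (x < H_VIS) && (y < V_VIS);

        // Levels required by this board's DAC, as noted above.
        assign vga_sync_n  = 1'b0;
        assign vga_blank_n = 1'b1;
    endmodule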

6.3 Final Design


So far we have explained how to extract images from one camera sensor. For stereo
vision we need two streams from two cameras, so we duplicate the same controller
and its components, while both cameras are fed the same clock. This gives us the
best possible synchronization between the cameras.
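As noted in section 3.2, the synchronization comes simply from clocking both
camera pipelines from one source. A sketch of the idea follows; the module and
signal names are ours, and the divide-by-two to a 25 MHz xclk is our assumption.

    // Illustrative top-level clocking: both cameras get the SAME xclk,
    // so their pixel timing runs from one reference.
    module stereo_clocking (
        input  wire clk50,        // 50 MHz board clock, shared by everything
        output wire left_xclk,    // xclk pin of the left OV7670
        output wire right_xclk    // xclk pin of the right OV7670
    );
        reg xclk_div = 1'b0;
        always @(posedge clk50)
            xclk_div <= ~xclk_div;    // assumed 25 MHz camera clock

        assign left_xclk  = xclk_div;
        assign right_xclk = xclk_div;
        // The controller/capture/buffer chain is then instantiated twice,
        // one per camera, all running from clk50.
    endmodule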
The video output is given in anaglyph format, which is obtained by applying a red
filter to the left camera image and a cyan filter to the right camera image.
This produces a video stream in which, when viewed through 3D glasses with red and
cyan filters, we can perceive depth.
The VGA module takes input from both Buffer Memories, and the controller takes
only the red value from one image and the cyan value from the other.
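Expressed over RGB565 pixels, this red/cyan selection reduces to picking bit
fields from the two buffers' outputs. A small illustrative Verilog module (names
are ours) is shown below.

    // Illustrative anaglyph mixing: red channel from the left camera's
    // pixel, green and blue (cyan) channels from the right camera's pixel.
    module anaglyph_mix (
        input  wire [15:0] left_pixel,     // RGB565 from the left buffer
        input  wire [15:0] right_pixel,    // RGB565 from the right buffer
        output wire [15:0] anaglyph_pixel  // combined pixel sent on to VGA
    );
        assign anaglyph_pixel = { left_pixel[15:11],    // R from left
                                  right_pixel[10:5],    // G from right
                                  right_pixel[4:0] };   // B from right
    endmodule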
The final circuit design, consisting of the components for both cameras, is given
below as figure 8.


Figure 8.

7 Future Work
We started with the target of creating a depth map of our surroundings, but lack
of certain knowledge about data transfers limited us to storing the data within
the FPGA itself: we were not able to transfer the data to a PC for further
processing. But we believe we have crossed a big hurdle on the path to the final
goal.
7.1 Target Module
Our target module would include a stereo camera connected to a processor which can
calculate the disparity map. Our current module can be extended to include a
disparity map calculator, and we can display the disparity map on a screen.
Even though a processor might be easier to use, thanks to the available C
libraries for computer vision, an FPGA implementation of the disparity map
calculator might be more efficient and faster. So it is necessary to look into the
possibility of an FPGA implementation of the algorithm.
7.2 Compact Model
Our aim was also to make the module compact enough to be wearable. The current
implementation is relatively heavy and has large power requirements, but this is
due to the versatile nature of the board we used, which was designed to meet the
high-end technical needs of prototyping. We can port the code to a smaller FPGA
board; one such board is the DE0-Nano, which can be powered by a small 5 V battery
and can hold the code currently written. It does not, however, have enough memory,
so if we can expand its memory the DE0-Nano offers an excellent alternative to the
DE2i-150 in terms of power and size.

8 Conclusion
Our project has demonstrated that certain embedded-systems problems can be solved
easily by a change in hardware. The advantages of an FPGA have been clearly
demonstrated in the implementation of our project.


References
1. OV7670 Datasheet: www.voti.nl/docs/OV7670.pdf
2. http://developer.mbed.org/users/ms523/notebook/ov7670-camera/
3. https://github.com/desaster/ov7670test/tree/master/src
4. http://wiki.jmoon.co/sensors/camera/ov7670/
5. http://www.siliconimaging.com/RGB%20Bayer.htm
6. http://www.rpg.fi/desaster/blog/2012/05/07/ov7670-camera-sensor-success/

Gallery
Our Final Model

