1. Abstract
Efficient and accurate object detection has been an important topic in the
advancement of computer vision systems. With the advent of deep learning
techniques, the accuracy of object detection has increased drastically. This
project incorporates well-tested object detection techniques with the goal of
achieving high accuracy and seamless performance, so that the species and
variety of the fruit presented can be identified correctly. A major challenge in
many object detection systems is their dependency on other computer vision
techniques to support the deep learning based approach, which leads to slow
and suboptimal performance. In this project, we use a fully machine-driven
deep learning approach to solve the problem of object detection in an
end-to-end fashion. The network is trained on a large and varied dataset whose
images are representative of real-life conditions. The resulting system is fast
and accurate enough for our integrated system to perform well in real-time
scenarios.
2. Introduction
Machine learning and deep learning techniques have come a long way in the
recent past thanks to the AI boom, and much of that progress has been made in
the field of image recognition. Image recognition has become one of the hottest
topics today, driving much of the progress in AI and computer vision. Attention
is now turning to hardware-based systems that can perform recognition in real
time. For such integrated systems, the biggest challenge is making hardware
with limited performance carry out the complex calculations that image
recognition programs require. Our project demonstrates how one can perform
object detection on a Raspberry Pi, despite its limited performance, with the
help of deep neural networks. Our aim through this project is to create a system
that can identify fruits present in our environment and to showcase the capacity
of deep neural networks to identify objects without human input.
In our system, at the click of a button, the fruit placed before the camera is
analyzed and a description of the fruit is generated on the LED screen.
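The button-to-display flow described above can be sketched as a single handler that wires capture, inference, and display together. This is a minimal illustration, not the project's exact code: the capture_image, classify, and show_on_display callables are hypothetical placeholders standing in for the real camera, model, and LCD drivers.

```python
# Hypothetical sketch of the button-press pipeline: capture -> classify -> display.
# The three callables are placeholders for the real hardware/model drivers.

FRUIT_INFO = {
    "apple": "Apple: a sweet, edible fruit of the tree Malus domestica.",
    "banana": "Banana: an elongated, edible fruit of the genus Musa.",
}

def on_button_press(capture_image, classify, show_on_display):
    """Run the full pipeline once and return the text that was displayed."""
    image = capture_image()                       # grab a frame from the camera
    label = classify(image)                       # run the neural network
    text = FRUIT_INFO.get(label, "Unknown fruit") # look up the description
    show_on_display(text)                         # write the result to the LCD
    return text
```

Passing the three steps in as functions keeps the pipeline testable on a desktop machine without the Raspberry Pi hardware attached.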
2.1 Objective
The objective of our project is to showcase a working model that can perform
real-time image recognition on the Raspberry Pi integrated development board.
Our project aims to recognize a fruit placed in front of the model's camera unit
and show a description of the fruit on the model's display.
Our model combines hardware and software in a single unit, with the hardware
responsible for both input and output. No auxiliary device is required; our
project model works independently on its own.
The input is an image of the fruit to be recognized. The fruit need not be
distinct from its surroundings: our project aims to detect the presence of the
fruit in its natural environment.
The output is the name and a short description of the fruit on the display of our
model.
We have trained our own neural network model using suitable AI and deep
learning techniques, and the entire model is built from scratch. This allows us
to best fit our use case and predict with relatively high accuracy.
2.2 Challenges for the project
The biggest challenge for our project is the difficulty of properly
distinguishing a fruit from the background due to environmental factors and
camera noise. An image that is not taken properly will also lead to an error in
recognition. This factor is determined by the user and is therefore very hard to
correct, so the user must take proper care when snapping the image.
The environment a fruit is in may also induce recognition errors, and this will
be a major focus area for our team. Preprocessing and filtering of the image
data will help increase output accuracy. Dark environments will be a bane for
the project's operation, as the camera input of our project is not of the best
quality.
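To make the dark-environment problem concrete, here is one simple preprocessing recipe, sketched in plain NumPy: stretch the brightness of a dark frame to the full 0-255 range, then denoise with a small mean filter. The exact filters the project ends up using may differ; this is only an illustration of the idea.

```python
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Normalize brightness of a grayscale frame, then apply a 3x3 mean filter.

    A simple stand-in for the heavier preprocessing/filtering mentioned above.
    """
    img = image.astype(np.float32)
    lo, hi = img.min(), img.max()
    if hi > lo:                       # stretch contrast: helps dark frames
        img = (img - lo) / (hi - lo) * 255.0
    # 3x3 mean filter built from shifted copies (no SciPy dependency)
    padded = np.pad(img, 1, mode="edge")
    acc = np.zeros_like(img)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            acc += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (acc / 9.0).astype(np.uint8)
```

Contrast stretching rescues frames taken in darker environments, while the mean filter suppresses the camera noise that would otherwise confuse the recognizer.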
Another important obstacle for our model is the requirement of real-time output
while remaining accurate in detection. The more complex the model, the more
time it requires for inference; the less complex the model, the lower its
accuracy. As our hardware is not the most powerful out there, compromises
have to be made to fulfill our objectives. Realistic estimates must be obtained
through in-depth testing and performance analysis.
3. Methodology of work
For our project, we will be working with a Raspberry Pi as our base, with
Python program code on top.
For the software part, we will build upon Raspbian OS, the official OS of the
Raspberry Pi. The device will boot into our custom GUI, which incorporates
the camera interface. An image of the fruit can be taken directly from this
interface, and no element of the underlying Raspbian OS will interfere with our
system.
The code on the Raspberry Pi consists of Python-based interfacing layers that
connect the GUI to the backend. These are implemented directly on the
Raspberry Pi using Thonny, the IDE built into Raspbian OS.
The backend consists of the pre-trained image recognition model loaded into
memory, together with local storage that holds the temporary data in a simple
folder structure.
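The backend can be pictured as a thin layer that keeps the model in memory and reads fruit descriptions from the local folder structure. This is a hedged sketch: the file layout (`<storage>/descriptions/<fruit>.txt`) and the model's `predict` interface are assumptions for illustration, not the project's exact code.

```python
import os
import tempfile  # used only in the usage example below

class Backend:
    """Holds the loaded model in memory and serves descriptions from disk.

    Assumed layout: <storage>/descriptions/<fruit>.txt, one file per fruit.
    `model` is any object exposing predict(image) -> label.
    """

    def __init__(self, model, storage_dir):
        self.model = model                      # pre-trained model, kept in memory
        self.desc_dir = os.path.join(storage_dir, "descriptions")

    def describe(self, image):
        label = self.model.predict(image)       # run inference
        path = os.path.join(self.desc_dir, label + ".txt")
        if os.path.exists(path):
            with open(path) as f:               # read the stored description
                return f.read().strip()
        return label + ": no description on file"
```

Keeping descriptions as plain text files in a folder means new fruits can be added without touching the code, which matches the simple folder structure described above.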
The image recognizer, which is the heart of our project, is trained on a
separate system with Python code as the base. We use a hybrid approach for
our image recognizer, combining the best of modern neural network
techniques. The base is built upon Google's TensorFlow deep learning library,
which provides all the basic operations we need to build a neural network.
Some aspects are simplified with the Keras library, which is built on
TensorFlow, and successful networks such as ResNet are drawn upon as
inspiration.
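The key ResNet idea we draw on is the residual (skip) connection: each block learns a correction F(x) that is added back to its input, y = F(x) + x, which makes deeper networks easier to train. A minimal Keras sketch of one such block follows; the layer sizes and the 10-class output are illustrative assumptions, not the project's exact architecture.

```python
from tensorflow import keras
from tensorflow.keras import layers

def residual_block(x, filters):
    """y = F(x) + x, where F is two 3x3 convolutions."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.Add()([y, shortcut])           # the skip connection
    return layers.Activation("relu")(y)

# Toy classifier: one stem convolution, one residual block, one dense head
inputs = keras.Input(shape=(64, 64, 3))
x = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
x = residual_block(x, 16)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation="softmax")(x)  # e.g. 10 fruit classes
model = keras.Model(inputs, outputs)
```

Stacking more such blocks (and downsampling between them) is how the full ResNet family is built; the real recognizer would also need compiling and training on the fruit dataset.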
4. Components and Design:
4.1 HARDWARE COMPONENTS:
Raspberry Pi:
The Raspberry Pi 3 Model B was released in February 2016 with a 1.2 GHz
64-bit quad-core processor, on-board 802.11n Wi-Fi, Bluetooth and USB boot
capabilities. On Pi Day 2018 the Raspberry Pi 3 Model B+ was launched with
a faster 1.4 GHz processor and three-times-faster gigabit Ethernet (throughput
limited to ca. 300 Mbit/s by the internal USB 2.0 connection), as well as
2.4/5 GHz dual-band 802.11ac Wi-Fi (100 Mbit/s). Other features are Power
over Ethernet (PoE), USB boot and network boot.
3.5” LED Touchscreen Display:
A 3.5-inch touchscreen that mounts on the Raspberry Pi and serves as the
display of our model, showing the GUI and the generated fruit description.
Camera Module:
The camera board ships with a flexible flat cable that plugs into the CSI
connector located between the Ethernet and HDMI ports. In Raspbian, the user
must enable the camera board by running raspi-config and selecting the camera
option.
MicroSD Card: It is a type of removable flash memory card used for storing
information. SD is an abbreviation of Secure Digital, and microSD cards are
sometimes referred to as µSD or uSD. These cards are used in mobile phones
and other mobile devices.
Power Source (USB): A USB power source can be any direct-current source
that can provide 5 V at low amperage, such as a USB wall adapter, power bank
or PC USB port.
4.2 SOFTWARE BASE:
Raspbian OS:
An operating system is the set of basic programs and utilities that make your
Raspberry Pi run. However, Raspbian provides more than a pure OS: it comes
with over 35,000 packages, pre-compiled software bundled in a nice format for
easy installation on your Raspberry Pi. Raspbian is still under active
development with an emphasis on improving the stability and performance of as
many Debian packages as possible.
Spyder:
Spyder is an open source cross-platform integrated development environment
(IDE) for scientific programming in the Python language. Spyder integrates
with a number of prominent packages in the scientific Python stack, including
NumPy, SciPy, Matplotlib, pandas, IPython, SymPy and Cython, as well as
other open source software. It is released under the MIT license.
Spyder uses Qt for its GUI, and is designed to use either of the PyQt or PySide
Python bindings.
Features include:
An editor with syntax highlighting, introspection, code completion
Support for multiple IPython consoles
The ability to explore and edit variables from a GUI
A Help pane able to retrieve and render rich text documentation on functions, classes and
methods automatically or on-demand
A debugger linked to IPdb, for step-by-step execution
Static code analysis, powered by Pylint
A run-time Profiler, to benchmark code
Project support, allowing work on multiple development efforts simultaneously
A built-in file explorer, for interacting with the filesystem and managing projects
A "Find in Files" feature, allowing full regular expression search over a specified scope
An online help browser, allowing users to search and view Python and package
documentation inside the IDE
A history log, recording every user command entered in each console
An internal console, allowing for introspection and control over Spyder's own operation
IDLE:
IDLE is an integrated development environment for Python, which has been
bundled with the default implementation of the language since version 1.5.2b1.
It is packaged as an optional part of the Python distribution in many Linux
distributions. It is written completely in Python, using the Tkinter GUI toolkit.
IDLE is intended to be a simple IDE suitable for beginners, especially in an
educational environment. To that end, it is cross-platform and avoids feature
clutter.
Its main features are:
Multi-window text editor with syntax highlighting, auto-completion,
smart indent and more.
Python shell with syntax highlighting.
Integrated debugger with stepping, persistent breakpoints, and call stack
visibility.
Author Guido van Rossum says IDLE stands for "Integrated DeveLopment
Environment", and since Van Rossum named the language Python partly to
honor the British comedy group Monty Python, the name IDLE was probably
also chosen partly to honor Eric Idle, one of Monty Python's founding
members.
Thonny:
Thonny is a Python IDE for beginners. It supports different ways of stepping
through the code, step-by-step expression evaluation, detailed visualization of
the call stack and a mode for explaining the concepts of references and heap.
It comes built in with the NOOBS package for the Raspberry Pi, with Python
3.7. The main development of Thonny took place at the Institute of Computer
Science of the University of Tartu, Estonia.
4.3 LIBRARIES USED:
OpenCV:
OpenCV is a library of programming functions mainly aimed at real-time
computer vision. Originally developed by Intel, it was later supported by
Willow Garage and then Itseez. The library is cross-platform and free for use
under the open-source BSD license.
Pillow:
The Python Imaging Library (abbreviated as PIL, in newer versions known as
Pillow) is a free library for the Python programming language that adds
support for opening, manipulating, and saving many different image file
formats. Pillow offers several standard procedures for image manipulation,
including per-pixel manipulations; masking and transparency handling; image
filtering such as blurring, contouring, smoothing, or edge finding; image
enhancement such as sharpening or adjusting brightness, contrast or color;
adding text to images; and much more.
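A few of the Pillow operations listed above, shown on a synthetically generated image rather than a photo (a minimal sketch):

```python
import os
import tempfile

from PIL import Image, ImageEnhance, ImageFilter

# Create a small solid-color RGB test image instead of loading a photo
img = Image.new("RGB", (64, 64), color=(200, 120, 40))

blurred = img.filter(ImageFilter.GaussianBlur(radius=2))  # image filtering
brighter = ImageEnhance.Brightness(img).enhance(1.5)      # image enhancement
thumb = img.resize((32, 32))                              # basic manipulation

out_path = os.path.join(tempfile.gettempdir(), "thumb.png")
thumb.save(out_path)                                      # saving to an image format
```

The same handful of calls (open, filter, enhance, resize, save) covers most of what a camera pipeline needs before handing pixels to the recognizer.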
Numpy:
NumPy is a library for the Python programming language, adding support
for large, multi-dimensional arrays and matrices, along with a large
collection of high-level mathematical functions to operate on these arrays.
The ancestor of NumPy, Numeric, was originally created by Jim Hugunin
with contributions from several other developers.
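These array operations are exactly what image data needs, since a digital image is just a multi-dimensional array, e.g.:

```python
import numpy as np

# A 4x4 grayscale "image" as a 2-D array
image = np.arange(16, dtype=np.float32).reshape(4, 4)

mean = image.mean()                  # high-level math over the whole array
normalized = (image - mean) / 16.0   # elementwise (broadcast) arithmetic
flipped = image[:, ::-1]             # slicing: mirror the image horizontally
batch = np.stack([image, flipped])   # build a (2, 4, 4) batch for a network
```

Mirroring and stacking like this is also the basis of simple data augmentation when training the recognizer.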
Keras:
Keras is an open source neural network library written in Python. It is
capable of running on top of TensorFlow, Microsoft Cognitive Toolkit,
Theano, or MXNet. Designed to enable fast experimentation with deep
neural networks, it focuses on being user-friendly, modular, and
extensible. It was developed as part of the research effort of project
ONEIROS, and its primary author and maintainer is François Chollet, a
Google engineer. It allows for easy and fast prototyping (through user
friendliness, modularity, and extensibility).
TensorFlow:
TensorFlow is an open-source software library for dataflow programming
across a range of tasks. It is a symbolic math library, and is also used for
machine learning applications such as neural networks. It is used for both
research and production at Google, often replacing its closed-source
predecessor, DistBelief. TensorFlow is an end-to-end open source
platform for machine learning. It has a comprehensive, flexible
ecosystem of tools, libraries and community resources that lets
researchers push the state-of-the-art in ML and developers easily build
and deploy ML powered applications.
Scikit-Learn:
Scikit-learn is a free machine learning library for Python. It features
various classification, regression and clustering algorithms and is
designed to interoperate with the Python numerical and scientific
libraries NumPy and SciPy.
5. Applications
Our project aims to be an educational model that shows how easily a deep
learning algorithm can be implemented to identify a real-world object. It can
also spread knowledge about fruits, help farmers in their fruit farming, help
prevent the medical issues caused by rotten fruits, and be adapted to recognize
other kinds of objects.
Apart from that, this system could be used not just for fruits but for any other
object unknown to us. These are the real-world applications of our proposed
project. Such systems can be integrated with other tasks such as pose
estimation, where the first stage in the pipeline detects the object and the
second stage estimates the pose in the detected region. A well-known
application of object detection is face detection, which is used in almost all
mobile cameras. A more generalized (multi-class) application can be used in
autonomous driving systems.
The fact is that for many applications, one doesn’t need real-time detection and
the models are small enough to run anywhere. It may take us 30 seconds to
evaluate the image, but that’s fine for most purposes.
This functionality is low-cost, secure and respectful of privacy. The fact that it
is cheap to build, relatively simple, fully customizable and able to run offline
puts it at an advantage over many more expensive cloud services.
Conclusion
An accurate and efficient fruit recognition system has been developed which
achieves metrics comparable to existing state-of-the-art systems. This project
uses well-tested techniques in the fields of computer vision and deep learning
and highlights their usage and integration with an available hardware
ecosystem. It can be used in real-time applications that require object detection
as a pre-processing step in their pipeline. It can also help generate similar
models that detect other kinds of objects for more specific purposes. We can
also build on this project to gain the ability to identify any object and provide
information about it; for this, we will have to connect it to the internet and use
a database hosted on a cloud service. For now, we plan to execute just the
objectives stated above.
The Team
Ahmed Gamal Al-Hagri – 16/ICS/001