
FRUITS RECOGNITION WITH RASPBERRY PI

A report submitted in partial fulfilment of the requirements
for the award of the degree

INTEGRATED DUAL DEGREE B.TECH + M.TECH (CSE)

Submitted By-

Ahmed Gamal Al-Hagri (16/ICS/001)


Allen Joe Mathew (16/ICS/006)
Prashant Kumar (16/ICS/039)
Paritosh Bisht (16/ICS/037)

UNDER THE SUPERVISION OF

Mr. Ankur Kumar


SCHOOL OF INFORMATION AND COMMUNICATION TECHNOLOGY

GAUTAM BUDDHA UNIVERSITY


CONTENTS
1. Abstract
2. Introduction
2.1 Objective
2.2 Challenges in the project
3. Methodology of Work: Design and Implementation
4. Hardware and Software Components
5. Working Model details and Images
6. Real-World Application and Use
7. Conclusion
8. The Team
9. Bibliography/References
1. Abstract

Efficient and accurate object detection has been an important topic in the
advancement of computer vision systems. With the advent of deep
learning techniques, the accuracy for object detection has increased
drastically. The project aims to incorporate well-tested object
detection techniques with the goal of achieving high accuracy and
smooth performance, correctly identifying the species and variety of
the fruit presented. A major challenge in many object detection
systems is their dependency on other computer vision techniques to
assist the deep learning based approach, which leads to slow and
non-optimal performance. In this project, we use a purely deep
learning approach to solve the problem of object detection in an
end-to-end fashion. The network is trained on a large and varied
dataset whose images are representative of real-world conditions.
The resulting system is fast and accurate, which allows our integrated
system to perform well enough in real-time scenarios.
2. Introduction

Machine learning and deep learning techniques have come a long way in the
recent past thanks to the AI boom, and much of that progress has been made in
image recognition, which today drives a large share of the work in AI and
computer vision. Increasing attention is now being given to hardware-based
systems that can perform recognition in real time. For such integrated
systems, the biggest challenge is making low-powered hardware perform the
complex calculations that image recognition requires. Our project
demonstrates how object detection can be done on a Raspberry Pi, despite its
limited performance, with the help of deep neural networks. Our aim through
this project is to create a system which can identify fruits present in our
environment and showcase the capacity of deep neural networks to identify
objects without human input.

In our system, at the press of a button, the fruit placed before the camera
is analyzed and a description of the fruit is shown on the display.
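The button-to-display flow can be sketched as follows. This is an illustrative outline only, not the project's actual code: the names `FRUIT_INFO` and `describe_prediction` are hypothetical, and the hardware calls (GPIO, camera) are shown as comments because they only run on the device.

```python
# Hypothetical sketch of the button-press pipeline: the model's softmax
# output is mapped to a name and a short description for the display.

import numpy as np

# Illustrative subset of class descriptions shown on the screen.
FRUIT_INFO = {
    0: ("Apple", "A sweet, pomaceous fruit of the tree Malus domestica."),
    1: ("Banana", "An elongated, curved tropical fruit rich in potassium."),
    2: ("Orange", "A citrus fruit known for its vitamin C content."),
}

def describe_prediction(probabilities):
    """Turn the recognizer's softmax output into the text shown on screen."""
    idx = int(np.argmax(probabilities))
    name, description = FRUIT_INFO[idx]
    confidence = float(probabilities[idx])
    return f"{name} ({confidence:.0%}): {description}"

# On the device, a GPIO callback would tie the button to this function:
#   GPIO.add_event_detect(BUTTON_PIN, GPIO.FALLING,
#       callback=lambda _: show(describe_prediction(
#           model.predict(capture_frame())[0])))
```

Keeping the description lookup separate from the hardware layer makes the display logic easy to test off-device.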
2.1 Objective

The objective of our project is to showcase a working model that performs
real-time image recognition on the Raspberry Pi single-board computer. The
model recognizes a fruit placed in front of its camera unit and shows a
description of the fruit on its display.
Our model combines hardware and software in a single unit, with the hardware
responsible for both input and output. No auxiliary device is required; the
model works independently on its own.
The input is an image of the fruit to be recognized. The fruit need not be
isolated from its surroundings: the project aims to detect the fruit in its
natural environment.
The output is the name and a short description of the fruit on the model's
display.
We have trained our own neural network from scratch using suitable deep
learning techniques. This allows the model to best fit our use case and
predict with relatively high accuracy.
2.2 Challenges for the project

The biggest challenge for our project is reliably separating a fruit from the
background in the presence of environmental factors and camera noise. A
poorly taken image will also lead to recognition errors; since this factor is
user-determined, it is very hard to correct for, so the user must take care
when capturing the image.

The environment a fruit is in may also induce recognition errors, and this
will be a major focus area for our team. Preprocessing and filtering of the
image data will help increase output accuracy. Dark environments are
particularly problematic, as the camera used in our project is not of the
highest quality.

Another important obstacle for our model is producing output in real time
while remaining accurate. The more complex the model, the more time it
requires for inference; the less complex the model, the lower its accuracy.
As our hardware is not the most powerful available, compromises have to be
made to fulfil our objectives, and realistic estimates must be obtained
through in-depth testing and performance analysis.
3. Methodology of work

For our project, we will be working on a Raspberry Pi as our base, with
Python code on top.

On the hardware side, we have a 3.5” touchscreen display mounted onto the
Raspberry Pi's PCB. This display is responsible for interacting with the
user: it presents the GUI for input and shows the result of our recognizer.
We also have the official Raspberry Pi compatible camera mounted onto the
back of the PCB to take the input image of the fruit. The connectors used for
interfacing are the standard display and camera headers provided on the
Raspberry Pi board.

For the software part, we build upon Raspbian OS, the official OS of the
Raspberry Pi. The device boots into our custom GUI, which incorporates the
camera interface. The image of the fruit can be taken directly from this
interface, and no element of the underlying Raspbian OS interferes with our
system.
The code on the Raspberry Pi consists of Python-based interfacing layers that
connect the GUI to the backend. This is implemented directly on the Raspberry
Pi using Thonny, the IDE bundled with Raspbian OS.
The backend consists of the pre-trained image recognizer model loaded into
memory, plus local storage that holds temporary data in a simple folder
structure.
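The backend's model handling can be sketched as below. The file path and function names are illustrative assumptions, not the project's actual layout; the point is that the trained network is loaded from local storage once at start-up and then reused for every button press.

```python
# Sketch of the backend: load the pre-trained recognizer once, reuse it.

import numpy as np
from tensorflow import keras

MODEL_PATH = "model/fruit_recognizer.h5"  # illustrative path on the SD card

def load_recognizer(path=MODEL_PATH):
    """Load the pre-trained recognizer once at start-up."""
    return keras.models.load_model(path)

def recognize(model, image):
    """Run one preprocessed image of shape (H, W, 3) through the network."""
    batch = np.expand_dims(image, axis=0)   # the model expects a batch axis
    return model.predict(batch, verbose=0)[0]
```

Loading the model once, rather than per request, is what keeps per-press latency down on the Pi's limited hardware.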

The image recognizer, which is the heart of our project, is trained on a
separate system with Python code as the base. We use a hybrid approach for
the recognizer, combining modern neural network techniques. The base is built
upon Google's TensorFlow deep learning library, which provides all the basic
operations needed to build a neural network. Some aspects are simplified with
the Keras library, which is built on TensorFlow, and successful networks such
as ResNet serve as inspiration.
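A ResNet-inspired network of the kind described can be sketched in Keras as follows. The layer sizes, depth, and number of classes here are illustrative, not the exact architecture used in the project.

```python
# Sketch of a small ResNet-inspired classifier built with Keras on TensorFlow.

from tensorflow import keras
from tensorflow.keras import layers

def residual_block(x, filters):
    """Two 3x3 convolutions with a skip connection, as popularized by ResNet."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.Activation("relu")(layers.Add()([shortcut, y]))

def build_recognizer(input_shape=(100, 100, 3), num_classes=10):
    inputs = keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = residual_block(x, 32)
    x = layers.MaxPooling2D()(x)
    x = residual_block(x, 32)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The skip connections let gradients flow past the convolutions, which is what allows such networks to be trained deeper without degradation.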
4. Components and Design:

Hardware Components:

Raspberry Pi Model 3B+; 3.5” Touchscreen Display; Raspberry Pi v2 Camera
Module; Thumb button; USB Power Supply; USB cable.

Software Base:

Raspbian OS; Python 3; IDLE; Anaconda: Spyder, Jupyter; Thonny.

Libraries:

Image Processing: OpenCV, Numpy, Pillow

Deep Learning: Keras, TensorFlow, Scikit-learn


4.1 HARDWARE COMPONENTS:

Raspberry Pi:

The Raspberry Pi is a series of small single-board computers developed in the
United Kingdom by the Raspberry Pi Foundation to promote the teaching of
basic computer science in schools and in developing countries. The original
model became far more popular than anticipated, selling outside its target
market for uses such as robotics.

Raspberry Pi 3 Model B was released in February 2016 with a 1.2 GHz 64-bit
quad core processor, on-board 802.11n Wi-Fi, Bluetooth and USB boot
capabilities. On Pi Day 2018 the Raspberry Pi 3 Model B+ was launched with
a faster 1.4 GHz processor and a three-times faster gigabit Ethernet (throughput
limited to ca. 300 Mbit/s by the internal USB 2.0 connection) or 2.4/5GHz dual-
band 802.11ac Wi-Fi (100 Mbit/s). Other features are Power over Ethernet
(PoE), USB boot and network boot.
3.5” LED Touchscreen Display:

Input and output are handled by a touchscreen LED display connected directly
to the Raspberry Pi through the display header. It supports a refresh rate of
up to 50 fps with a 125 MHz signal input, giving stable, flicker-free output.
The 3.5" panel offers 480x320 16-bit color pixels with a resistive touch
overlay, and works best with a resistive stylus.
Raspberry Pi Official Camera v2:

The camera board ships with a flexible flat cable that plugs into the CSI
connector located between the Ethernet and HDMI ports. In Raspbian, the user
must enable the camera board by running raspi-config and selecting the camera
option.

The Raspberry Pi Camera Module v2 replaced the original Camera Module in
April 2016. The v2 Camera Module has a Sony IMX219 8-megapixel sensor
(compared to the 5-megapixel OmniVision OV5647 sensor of the original
camera). It supports 1080p30, 720p60 and VGA90 video modes, as well as still
capture.
Other Components:

Button Switch: Pushbutton switches come in a variety of sizes and
configurations. They can be push-on/push-off or momentary-on/momentary-off;
range from miniature to large; may be lighted or non-lighted; and can be
surface-mounted or panel-mounted with solder, screw, or PC terminals.

MicroSD Card: It is a type of removable flash memory card used for storing
information. SD is an abbreviation of Secure Digital, and microSD cards are
sometimes referred to as µSD or uSD. These cards are used in mobile phones
and other mobile devices.

Power Source (USB): A USB power source can be any direct-current source that
provides 5 V at low amperage, such as USB wall adapters, power banks or PC
USB ports.
4.2 SOFTWARE BASE:

Raspbian OS:

Raspbian is a free operating system based on Debian Wheezy armhf with
compilation settings adjusted to produce optimized "hard float" code that
will run on the Raspberry Pi. This provides significantly faster performance
for applications that make heavy use of floating point arithmetic operations.
All other applications also gain some performance through the use of the
advanced instructions of the ARMv6 CPU in the Raspberry Pi. Since 2015 it has
been officially provided by the Raspberry Pi Foundation as the primary
operating system for the family of Raspberry Pi single-board computers.

An operating system is the set of basic programs and utilities that make your
Raspberry Pi run. However, Raspbian provides more than a pure OS: it comes
with over 35,000 packages, pre-compiled software bundled in a nice format for
easy installation on your Raspberry Pi. Raspbian is still under active
development with an emphasis on improving the stability and performance of as
many Debian packages as possible.

Although Raspbian is primarily the effort of Mike Thompson (mpthompson) and
Peter Green (plugwash), it has also benefited greatly from the enthusiastic
support of Raspberry Pi community members who wish to get the maximum
performance from their device.
Python 3:
Python is a general-purpose interpreted, interactive, object-oriented,
high-level programming language. It was created by Guido van Rossum during
1985-1990. Like Perl, Python source code is available under the GNU General
Public License (GPL). Python is named after the TV show 'Monty Python's
Flying Circus', not after the snake.
Python 3.0 was released in 2008. Although this version was intended to be
backward-incompatible, many of its important features have since been
backported to version 2.7.
Python is designed to be highly readable: it uses English keywords frequently
where other languages use punctuation, and it has fewer syntactic
constructions than many other languages.
Following are important characteristics of python −
 It supports functional and structured programming methods as well as
OOP.
 It can be used as a scripting language or can be compiled to byte-code for
building large applications.
 It provides very high-level dynamic data types and supports dynamic type
checking.
 It supports automatic garbage collection.
 It can be easily integrated with C, C++, COM, ActiveX, CORBA, and
Java.
Anaconda Distribution:
Anaconda is a free and open-source distribution of the Python and R
programming languages for scientific computing (data science, machine
learning applications, large-scale data processing, predictive analytics, etc.), that
aims to simplify package management and deployment. Package versions are
managed by the package management system conda. The Anaconda distribution
is used by over 15 million users and includes more than 1500 popular data-
science packages suitable for Windows, Linux, and MacOS.
Anaconda distribution comes with more than 1,500 packages as well as the
conda package and virtual environment manager. It also includes a GUI,
Anaconda Navigator, as a graphical alternative to the command line interface
(CLI).
The conda package manager analyses the current environment, including
everything currently installed, and, together with any version limitations
specified, works out how to install a compatible set of dependencies, warning
if this cannot be done.
Anaconda Navigator is a desktop graphical user interface (GUI) included in
Anaconda distribution that allows users to launch applications and manage
conda packages, environments and channels without using command-line
commands. Navigator can search for packages on Anaconda Cloud or in a local
Anaconda Repository, install them in an environment, run the packages and
update them.
Anaconda Cloud is a package management service by Anaconda where you can
find, access, store and share public and private notebooks, environments, and
conda and PyPI packages. Cloud hosts useful Python packages, notebooks and
environments for a wide variety of applications. You do not need to log in or to
have a Cloud account, to search for public packages, download and install them.
Jupyter:
Jupyter Notebook (formerly IPython Notebook) is a web-based interactive
computational environment for creating notebook documents. The term
"notebook" can colloquially refer to the Jupyter web application, the Jupyter
Python web server, or the Jupyter document format, depending on context. A
Jupyter notebook document is a JSON document following a versioned schema,
containing an ordered list of input/output cells which can hold code, text
(using Markdown), mathematics, plots and rich media; it usually has the
".ipynb" extension.
Jupyter Notebook can connect to many kernels to allow programming in many
languages. By default Jupyter Notebook ships with the IPython kernel.

Spyder:
Spyder is an open source cross-platform integrated development environment
(IDE) for scientific programming in the Python language. Spyder integrates
with a number of prominent packages in the scientific Python stack, including
NumPy, SciPy, Matplotlib, pandas, IPython, SymPy and Cython, as well as
other open source software. It is released under the MIT license.
Spyder uses Qt for its GUI, and is designed to use either of the PyQt or PySide
Python bindings.
Features include:
 An editor with syntax highlighting, introspection, code completion
 Support for multiple IPython consoles
 The ability to explore and edit variables from a GUI
 A Help pane able to retrieve and render rich text documentation on functions, classes and
methods automatically or on-demand
 A debugger linked to IPdb, for step-by-step execution
 Static code analysis, powered by Pylint
 A run-time Profiler, to benchmark code
 Project support, allowing work on multiple development efforts simultaneously
 A built-in file explorer, for interacting with the filesystem and managing projects
 A "Find in Files" feature, allowing full regular expression search over a specified scope
 An online help browser, allowing users to search and view Python and package
documentation inside the IDE
 A history log, recording every user command entered in each console
 An internal console, allowing for introspection and control over Spyder's own operation
IDLE:
IDLE is an integrated development environment for Python, bundled with the
default implementation of the language since version 1.5.2b1. It is packaged
as an optional part of Python in many Linux distributions. It is written
entirely in Python, using the Tkinter GUI toolkit.
IDLE is intended to be a simple IDE suitable for beginners, especially in an
educational environment. To that end, it is cross-platform and avoids feature
clutter.
Its main features are:
 Multi-window text editor with syntax highlighting, auto-completion,
smart indent and other.
 Python shell with syntax highlighting.
 Integrated debugger with stepping, persistent breakpoints, and call stack
visibility.
Author Guido van Rossum says IDLE stands for "Integrated DeveLopment
Environment", and since Van Rossum named the language Python partly to honor
the British comedy group Monty Python, the name IDLE was probably also chosen
partly to honor Eric Idle, one of Monty Python's founding members.

Thonny:
Thonny is a Python IDE for beginners. It supports different ways of stepping
through code, step-by-step expression evaluation, detailed visualization of
the call stack, and a mode for explaining the concepts of references and the
heap. It comes bundled with the NOOBS package for the Raspberry Pi alongside
Python 3.7. The main development of Thonny took place at the Institute of
Computer Science of the University of Tartu, Estonia.
4.3 LIBRARIES USED:

OpenCV:
OpenCV is a library of programming functions mainly aimed at real-time
computer vision. Originally developed by Intel, it was later supported by
Willow Garage then Itseez. The library is cross-platform and free for use
under the open-source BSD license.

Pillow:
The Python Imaging Library (abbreviated PIL, known in newer versions as
Pillow) is a free library for the Python programming language that adds
support for opening, manipulating, and saving many different image file
formats. Pillow offers several standard procedures for image manipulation,
including: per-pixel manipulation; masking and transparency handling; image
filtering such as blurring, contouring, smoothing, or edge finding; image
enhancement such as sharpening or adjusting brightness, contrast, or color;
adding text to images; and much more.
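A few of those manipulations in a short, self-contained example (the image is generated in code, so no input file is needed):

```python
# Pillow example: sharpening, brightness adjustment, and drawing text.

from PIL import Image, ImageDraw, ImageEnhance, ImageFilter

img = Image.new("RGB", (160, 120), color=(180, 60, 60))

sharpened = img.filter(ImageFilter.SHARPEN)           # edge-enhancing filter
brighter = ImageEnhance.Brightness(img).enhance(1.3)  # 30% brighter

draw = ImageDraw.Draw(brighter)
draw.text((10, 10), "Apple", fill=(255, 255, 255))    # caption on the image

print(brighter.size)  # (160, 120)
```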

Numpy:
NumPy is a library for the Python programming language, adding support
for large, multi-dimensional arrays and matrices, along with a large
collection of high-level mathematical functions to operate on these arrays.
The ancestor of NumPy, Numeric, was originally created by Jim Hugunin
with contributions from several other developers.
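Since an image is simply a 3-D NumPy array, common image operations reduce to plain array operations:

```python
# NumPy example: an image as an array, cropped and averaged per channel.

import numpy as np

image = np.zeros((4, 6, 3), dtype=np.uint8)   # tiny 4x6 RGB "image"
image[:, :, 0] = 200                          # fill the red channel

crop = image[1:3, 2:5]                        # slicing crops the image
means = image.mean(axis=(0, 1))               # per-channel averages

print(crop.shape)       # (2, 3, 3)
print(means.tolist())   # [200.0, 0.0, 0.0]
```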
Keras:
Keras is an open source neural network library written in Python. It is
capable of running on top of TensorFlow, Microsoft Cognitive Toolkit,
Theano, or MXNet. Designed to enable fast experimentation with deep
neural networks, it focuses on being user-friendly, modular, and
extensible. It was developed as part of the research effort of project
ONEIROS, and its primary author and maintainer is François Chollet, a
Google engineer. It allows for easy and fast prototyping (through user
friendliness, modularity, and extensibility).
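Keras's user-friendliness is easiest to see in code: a complete classifier can be defined and compiled in a few lines. The layer sizes and class count below are arbitrary illustrations, not the project's architecture.

```python
# Minimal Keras example: define and compile a classifier in a few lines.

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(100, 100, 3)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # e.g. 10 fruit classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```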

Tenserflow:
TensorFlow is an open-source software library for dataflow programming
across a range of tasks. It is a symbolic math library, and is also used for
machine learning applications such as neural networks. It is used for both
research and production at Google often replacing its closed-source
predecessor, DistBelief. TensorFlow is an end-to-end open source
platform for machine learning. It has a comprehensive, flexible
ecosystem of tools, libraries and community resources that lets
researchers push the state-of-the-art in ML and developers easily build
and deploy ML powered applications.
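The dataflow style described above can be shown in a tiny example: tensors flow through operations such as matrix multiplication, and gradients are computed automatically (shown here in TensorFlow 2's eager API).

```python
# TensorFlow example: dataflow through matmul, with automatic gradients.

import tensorflow as tf

x = tf.constant([[1.0, 2.0]])
w = tf.Variable([[3.0], [4.0]])

with tf.GradientTape() as tape:
    y = tf.matmul(x, w)          # (1, 2) @ (2, 1) -> (1, 1)
    loss = tf.reduce_sum(y ** 2)

grad = tape.gradient(loss, w)    # d(loss)/dw via automatic differentiation

print(y.numpy())      # [[11.]]
print(grad.numpy())   # [[22.], [44.]]
```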

Scikit-Learn:

Scikit-learn (formerly scikits.learn) is a free software machine learning
library for the Python programming language. It features various
classification, regression and clustering algorithms, including support
vector machines, random forests, gradient boosting, k-means and DBSCAN, and
is designed to interoperate with the Python numerical and scientific
libraries NumPy and SciPy.
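The classical workflow scikit-learn provides, split, fit, score, can be shown on one of its built-in toy datasets:

```python
# scikit-learn example: split a toy dataset, fit a classifier, measure accuracy.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(round(accuracy, 2))  # typically high on this easy dataset
```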

5. Working Model Details and Images


6. Real World Applications and Use

Our project aims to be an educational model that shows how to easily
implement a deep learning algorithm to identify a real-world object. It can
give users useful knowledge about fruits, assist farmers in their fruit
farming, and help flag rotten fruit that could cause health issues. The same
approach can be applied to recognizing other objects.

Beyond fruits, this system could also be applied to any other object unknown
to us. Such systems can be integrated with other tasks, such as pose
estimation, where the first stage in the pipeline detects the object and the
second stage estimates the pose within the detected region. A well-known
application of object detection is face detection, used in almost all mobile
cameras. A more generalized (multi-class) application can be used in
autonomous driving.

The fact is that for many applications, one doesn’t need real-time detection and
the models are small enough to run anywhere. It may take us 30 seconds to
evaluate the image, but that’s fine for most purposes.

This functionality is low-cost, secure and respectful of privacy. The fact
that it is cheap to build, relatively simple, fully customizable and able to
run offline puts it at an advantage over many more expensive cloud services.
7. Conclusion
An accurate and efficient fruit recognition system has been developed which
achieves metrics comparable to existing state-of-the-art systems. This
project uses well-tested techniques from computer vision and deep learning
and highlights their usage and integration with the available hardware
ecosystem. It can be used in real-time applications that require object
detection as a pre-processing stage in their pipeline. It can also serve as a
template for similar models that detect other kinds of objects for more
specific purposes. We could further build on this project to identify
arbitrary objects and provide information about them; for this we would need
to connect it to the internet and use a cloud-hosted database. For now,
however, we plan to execute just the objectives stated above.
8. The Team
Ahmed Gamal Al-Hagri – 16/ICS/001

Allen Joe Mathew – 16/ICS/006

Paritosh Bisht – 16/ICS/037

Prashant Kumar – 16/ICS/039

Under the Guidance of Ankur Kumar.


9. References
1) Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich
feature hierarchies for accurate object detection and semantic
segmentation. In The IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2014.
2) Ross Girshick. Fast R-CNN. In International Conference on Computer
Vision (ICCV), 2015.
3) Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN:
Towards real-time object detection with region proposal networks. In
Advances in Neural Information Processing Systems (NIPS), 2015.
4) Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You
only look once: Unified, real-time object detection. In The IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
5) Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott
Reed, Cheng-Yang Fu, and Alexander C. Berg. SSD: Single shot
multibox detector. In ECCV, 2016.
6) Karen Simonyan and Andrew Zisserman. Very deep convolutional
networks for large-scale image recognition. arXiv preprint
arXiv:1409.1556, 2014.
