
PROJECT TITLE

PLANT SPECIES DETECTION USING DEEP LEARNING

BY
DEBOJYOTI NASKAR
SPURTHI FELISHA BATTU
NAFISUR RAHMAN
CONTENTS
# INTRODUCTION TO PROJECT
# INTRODUCTION TO DATASET
# SAMPLES OF DATASET
# DATASET DESCRIPTION
# WORKFLOW
# RESULTS OF THE WORK DONE
INTRODUCTION TO PROJECT

The challenge is to develop an algorithm that accurately identifies whether images of forest and foliage contain the invasive hydrangea (class = 1) or not (class = 0). It is a binary classification problem.
INTRODUCTION TO DATASET

The dataset contains pictures taken in a Brazilian national forest. Some of the pictures contain hydrangea, a beautiful invasive species native to Asia. Based on the training pictures and the labels provided, the model should predict the presence of the invasive species in the testing set of pictures.
SAMPLES OF DATASET
DATASET DESCRIPTION
• File descriptions:
• train.7z – the training set (contains 2,295 images)
• train_labels.csv – the correct labels for the training set
• test.7z – the testing set (contains 1,531 images), ready to be labelled by your algorithm
• Data fields:
• name – name of the sample picture file (a number)
• invasive – probability of the picture containing an invasive species. A probability of 1 means the species is present.
INVASIVE SAMPLE
This picture contains a hydrangea flower, so it has a probability of 1.
NON-INVASIVE SAMPLE
This picture does not contain a hydrangea flower, so it has a probability of 0.
WORKFLOW
PREPROCESSING DATASET
• ARRANGING THE DATASET – the training data set was divided into four parts: class_0_train, class_1_train, val_0_train, val_1_train
• CONTRAST STRETCHING
• RESIZING
• PADDING
A sketch of the last three steps is shown below.
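A minimal sketch of these preprocessing steps, assuming scikit-image for loading and contrast stretching; the target size, percentile cut-offs, and zero-padding scheme are illustrative assumptions, not the project's exact values.

```python
import numpy as np
from skimage import io, exposure, transform

TARGET_SIZE = (224, 224)  # assumed input size; the project's value may differ

def preprocess_image(path, target_size=TARGET_SIZE):
    img = io.imread(path)

    # Contrast stretching: map the 2nd-98th percentile intensity range
    # onto the full output range.
    p2, p98 = np.percentile(img, (2, 98))
    img = exposure.rescale_intensity(img, in_range=(p2, p98))

    # Resize so the longer side fits the target while preserving
    # the aspect ratio.
    scale = min(target_size[0] / img.shape[0], target_size[1] / img.shape[1])
    new_h = int(img.shape[0] * scale)
    new_w = int(img.shape[1] * scale)
    img = transform.resize(img, (new_h, new_w), preserve_range=True)

    # Pad with zeros on the bottom/right up to the exact target size.
    pad_h = target_size[0] - new_h
    pad_w = target_size[1] - new_w
    img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode='constant')
    return img.astype(np.uint8)
```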
CNN ARCHITECTURE
WORK DONE
• Preprocessed the dataset
• Trained our CNN model using a Keras Sequential classifier (sketched below)
• Evaluated the model on the validation data set and achieved an accuracy of 96.61%
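A minimal sketch of a Keras Sequential binary classifier of the kind described above; the layer sizes and the choice of optimizer here are illustrative assumptions, not the authors' exact architecture.

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    # Two convolution/pooling blocks to extract image features.
    Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    # Single sigmoid output: probability that the image contains
    # the invasive species.
    Dense(1, activation='sigmoid'),
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```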
PLOT SHOWING ACCURACY OF
THE MODEL
TUNING PARAMETERS OF CNN
We added some Batch Normalisation layers and an extra Dense layer. We changed the image augmentation parameters to make them slightly more conservative. This model was trained for 25 epochs; a sketch follows.
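A sketch of these tuning changes: Batch Normalisation after each convolutional block, an extra Dense layer, conservative augmentation, and 25 training epochs. The layer sizes, augmentation values, and the 'train_dir' path are assumptions for illustration.

```python
from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, Flatten, Dense,
                          Dropout, BatchNormalization)
from keras.preprocessing.image import ImageDataGenerator

tuned = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    BatchNormalization(),          # added normalisation layer
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),  # the extra Dense layer
    Dropout(0.5),
    Dense(1, activation='sigmoid'),
])
tuned.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])

# Slightly more conservative augmentation: small rotations and shifts only.
datagen = ImageDataGenerator(rescale=1. / 255,
                             rotation_range=10,
                             width_shift_range=0.05,
                             height_shift_range=0.05,
                             horizontal_flip=True)

train_gen = datagen.flow_from_directory('train_dir',  # hypothetical path
                                        target_size=(224, 224),
                                        batch_size=32,
                                        class_mode='binary')

tuned.fit_generator(train_gen, epochs=25)  # run for 25 epochs
```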
FLOW OF FINE-TUNED CNN
CONCLUSION OF THE TUNED CNN MODEL

• The model was not very effective.
• The loss function flattened at about 0.50.
• The accuracy achieved was 0.77.
• The validation accuracy stopped improving at about 0.80.
VGG16 PRE-TRAINED MODEL
• In this model, we used Stochastic Gradient Descent (SGD) to update the weights, with a small learning rate of 0.0001 and a momentum of 0.9.
• The loss function decreased all the way down to less than 0.05.
• The accuracy stabilized at about 98% and the validation accuracy finished at about 95%.
• This model took approximately 30 hours to train on a CPU, which is very time-consuming. The loss was still decreasing with each subsequent epoch, so with a powerful GPU resource we would have extended training for 30 more epochs. A sketch of the set-up follows, and a plot of the training is shown on the next slides.
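A sketch of this VGG16 transfer set-up, assuming ImageNet weights and a small new classification head; the head layout is an assumption, while the SGD learning rate and momentum match the values quoted above.

```python
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Flatten, Dense
from keras.optimizers import SGD

# Pre-trained VGG16 convolutional base without its original classifier.
base = VGG16(weights='imagenet', include_top=False,
             input_shape=(224, 224, 3))

# Small new head for the binary invasive/non-invasive task.
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
out = Dense(1, activation='sigmoid')(x)
model = Model(inputs=base.input, outputs=out)

# SGD with the small learning rate and momentum quoted above.
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9),
              loss='binary_crossentropy',
              metrics=['accuracy'])
```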
FLOW OF VGG
PLOT SHOWING PERFORMANCE OF
THE MODEL
TRANSFER LEARNING
• Deep learning supports an immensely useful technique called 'transfer learning': you can take a deep learning model pre-trained on a large-scale dataset such as ImageNet and re-purpose it to handle an entirely different problem.
TRANSFER LEARNING-RESNET50
MODEL
• This model seemed to suffer from a
considerable amount of overfitting.
• It performed less effectively than the first two
models that we created.
• The training accuracy reached about 95%, but the model didn't perform well on the validation data. A sketch of the set-up follows.
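A sketch of the ResNet50 transfer-learning set-up: freeze the pre-trained base and train only a small new head. The head layout and the frozen-base choice are assumptions about the approach, consistent with the bottleneck-features source in the references.

```python
from keras.applications.resnet50 import ResNet50
from keras.models import Sequential
from keras.layers import GlobalAveragePooling2D, Dense

# Pre-trained ResNet50 base without its ImageNet classification head.
base = ResNet50(weights='imagenet', include_top=False,
                input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False  # freeze the pre-trained weights

# Only the small new head is trained on the invasive-species data.
model = Sequential([
    base,
    GlobalAveragePooling2D(),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
```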
PLOT SHOWING PERFORMANCE OF
THE MODEL
TEST PREDICTIONS
CONCLUSION
• Pre-trained models are incredibly powerful, and it would be foolish not to harness their power.
• flow_from_directory is a very useful function provided by Keras, but the data organization required to use it is rather time-consuming and tedious.
• It's expensive to run complex and effective models on a CPU.
• Image augmentation is a very valuable tool to prevent overfitting, especially when working with limited image data.
• It's integral to save weights and models after training (see the sketch below).
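A minimal sketch of the saving step mentioned above, using Keras's ModelCheckpoint callback and model.save; the file names and the train_gen/val_gen generators are hypothetical.

```python
from keras.callbacks import ModelCheckpoint

# Save the best weights seen so far whenever validation accuracy improves.
checkpoint = ModelCheckpoint('best_weights.h5',   # hypothetical file name
                             monitor='val_acc',   # 'val_accuracy' in newer Keras
                             save_best_only=True)

model.fit_generator(train_gen,                    # hypothetical generators
                    validation_data=val_gen,
                    epochs=25,
                    callbacks=[checkpoint])

# Save the full model (architecture + weights) after training.
model.save('invasive_model.h5')
```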

FUTURE SCOPE
• Try different learning rates for each model.
• Run the pre-trained VGG16 model for more epochs on a GPU service. The loss was still decreasing when the model finished training, but we were not willing to wait 30 more hours to train for 20 more epochs.
• Experiment with other pre-trained models and more advanced transfer-learning methods.
• Experiment with various data augmentation methods/parameters.
REFERENCES
Helpful Sources

[1] https://www.kaggle.com/ievgenvp/keras-flow-from-directory-on-python
Helped with setting up photos for flow_from_directory, and with re-organizing photos after predictions were made.

[2] https://www.kaggle.com/dansbecker/transfer-learning/notebook
Helped to set up the transfer-learning model.

[3] https://www.kaggle.com/fujisan/use-keras-pre-trained-vgg16-acc-98
Helped to set up the VGG16 model.

[4] https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
Helped to learn more about image augmentation.

[5] https://www.kaggle.com/dansbecker/transfer-learning/data
Downloaded the bottleneck features from the ResNet50 model from this site.

[6] https://keras.io
Constantly referenced the Keras documentation throughout the project.

[7] https://stackoverflow.com
Constantly received help from Stack Overflow when troubleshooting problems.

[8] https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
Helped to develop a basic understanding of the Adam optimizer.
THANK YOU
