Animal
Constantino Geovany O.L
Department of Informatics
Universitas Atma Jaya Yogyakarta
Yogyakarta, Indonesia
orlandolana09@gmail.com

Johan Aristo Wibowo
Department of Informatics
Universitas Atma Jaya Yogyakarta
Yogyakarta, Indonesia
aristo.yohan@yahoo.com

Lewi Junardi T.
Department of Informatics
Universitas Atma Jaya Yogyakarta
Yogyakarta, Indonesia
lewijunardi46@gmail.com
Abstract – This paper studies image classification for rare animals. We explore several machine learning algorithms, such as logistic regression, k-nearest neighbors, and random forest. After three experiments with these algorithms, the results show that random forest is suitable for this problem, with accuracy reaching 93%.

Keywords – image classification, rare animal, animal affect, machine learning classification for animal, random forest.

I. INTRODUCTION

As time passes, technology keeps growing. New discoveries and further developments of existing technology have had a positive impact on human life.

One technology that is currently developing is image classification, an area of artificial intelligence.

A single picture can yield a great deal of information. A collection of images is unprocessed raw data. Once the data is processed, we obtain information from which we can explore the knowledge it contains.

We chose to classify endangered species because we wanted to better educate the public, especially the next generation, about the types of animals on this earth. There are so many types of animals; therefore we emphasize introducing technology that can classify these animals based on their status, including those that have been driven almost to extinction. Our goal is to invite the public to help protect and care for the endangered animal population.

It is often debated whether killing a whale is illegal, whether breeding snakes and crocodiles is prohibited, or whether the birds we keep might be animals with endangered status.

II. RELATED WORK

Animal data circulating online is fairly large, and one form of that data is images. The research conducted by Slavomir Matuska, Robert Hudec, Patrik Kamencay, Miroslav Benco, and Martina Zachariasova (2014) aimed at recognizing objects based on local hybrid descriptors. The dataset comprises classes such as wolves, foxes, brown bears, deer, and wild boar, using images of large animals from Slovakia. The method used is SVM, with the aim of measuring the testing speed and the accuracy obtained [1].

A study conducted by Tibor TRNOVSZKY, Patrik KAMENCAY, Richard ORJESEK, Miroslav BENCO, and Peter SYKORA (2017) compares the methods used to recognize all types of animals, namely Convolutional Neural Network (CNN) against several other methods such as PCA (Principal
Component Analysis), Linear Discriminant Analysis (LDA), Local Binary Patterns Histograms (LBPH), and Support Vector Machine (SVM). The results show that CNN is the most appropriate method because it outperforms the other methods [2].

The research by Hung Nguyen, Sarah J. Maclagan, Tu Dinh Nguyen, Thin Nguyen, Paul Flemons, Kylie Andrews, Euan G. Ritchie, and Dinh Phung (2017) used an image dataset from the Wildlife Spotter project, obtained with camera traps by Australian scientists. The study aimed to classify images as animal or non-animal. If an image is detected as an animal image, it is then classified further within the animal data [3].

III. DATASET PREPARATION

A. Data Collection
We collected images through Google's search engine, selecting and saving them manually one by one. The images are grouped into 5 classes, and each class has 5 species of animals. The data consists of medium-quality, medium-size images in the .jpg, .jpeg, and .png file formats. The collected images are placed into folders according to their label, using file names that we specified.

B. Data Characteristics
Near Threatened (NT) describes a species that is considered likely to become endangered in the near future, although it does not currently qualify for threatened status. NT is used to label species such as jaguar, beluga, greater sage-grouse, albacore tuna, and plains bison. Least Concern (LC) describes a species that is not a focus of conservation because, based on the extinction risk of its population, it does not qualify as threatened or near threatened. LC is used to label species such as brown bear, tree kangaroo, swift fox, macaw, and pronghorn.

Critically Endangered (CE) is used to label species such as Amur leopard, Yangtze finless porpoise, black rhino, hawksbill turtle, and Sumatran orangutan. Endangered (E) describes a species that has been categorized as very likely to become extinct in the near future. E is used to label species such as African wild dog, chimpanzee, sea lion, whale shark, and red panda. Extinct (EX) describes a species that no longer exists because its last individual has died. EX is used to label species such as Canadian cougar, Galapagos tortoise, Iberian ibex, Spix's macaw, and quagga. The distribution for each label is shown in Table 1.

The dataset that we collected has several labels: Near Threatened (NT), Least Concern (LC), Critically Endangered (CE), Endangered (E), and Extinct (EX).

Table 1. Dataset Distribution For Each Label.
Label   Number of Data   Data Train   Data Test
NT      498              488          10
LC      496              486          10
CE      474              464          10
E       550              540          10
EX      500              490          10

C. Dataset Preprocessing
1. Dataset Reading
To read the whole dataset, we use the cv2 library to open every jpg and png image.

2. Normalization
This process cleans the raw data before further processing. Normalization comprises several techniques.
3. KNeighbors Classifier
Figure 1. Results of the model evaluation for the 7 algorithms, plotted with pyplot.
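A comparison of this kind could be produced roughly as follows. This is a sketch under stated assumptions: it covers only the three algorithms named in the abstract (not all 7), uses 5-fold cross-validation as an assumed evaluation protocol, and runs on synthetic features from `make_classification` as a stand-in for the image features. The function names `evaluate_models` and `plot_scores` are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # render to a file so no display is required
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def evaluate_models(X, y, folds=5, seed=0):
    """Return per-fold accuracy for each classifier (assumed protocol)."""
    models = {
        "LogisticRegression": LogisticRegression(max_iter=1000),
        "KNeighbors": KNeighborsClassifier(),
        "RandomForest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        for name, model in models.items()
    }


def plot_scores(scores, path="model_comparison.png"):
    """Save a box plot of the per-fold accuracies, one box per model."""
    fig, ax = plt.subplots()
    ax.boxplot(list(scores.values()))
    ax.set_xticks(range(1, len(scores) + 1))
    ax.set_xticklabels(list(scores.keys()))
    ax.set_ylabel("Accuracy")
    fig.savefig(path)
    plt.close(fig)


# Synthetic 5-class data standing in for the extracted image features.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=8, n_classes=5, random_state=0
)
scores = evaluate_models(X, y)
for name, fold_scores in scores.items():
    print(f"{name}: mean accuracy {fold_scores.mean():.3f}")
```

Calling `plot_scores(scores)` writes the box plot to `model_comparison.png`. The accuracies on synthetic data say nothing about the paper's results; only the evaluation pattern carries over.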
C. Parameter Tuning
In the third experiment, we set the bins parameter to 10 and the number of trees to 100. The accuracy of some algorithms increases by around 0.05%, so these parameter values do affect accuracy.
B. Augmentation
In the second experiment, the images in the dataset are augmented, which yields a larger dataset.
V. CONCLUSION