ScienceDirect
Research Paper
Special Issue: Robotic Agriculture
article info
Article history:
Published online xxx
Keywords:
Unmanned aerial vehicle
Spectral clustering
Spatial segmentation

The spectral-spatial classification of high spatial resolution RGB images obtained from unmanned aerial vehicles (UAVs) for detection of tomatoes in the image is presented. Bayesian information criterion (BIC) was used to determine the optimal number of clusters for the image. Spectral clustering was carried out using K-means, expectation maximisation (EM) and self-organising map (SOM) algorithms to categorise the pixels into two groups, i.e. tomatoes and non-tomatoes. Due to resemblance in spectral intensities, some of the non-tomato pixels were grouped into the tomato group and, in order to remove them, spatial segmentation was performed on the image. Spatial segmentation was carried out using morphological operations and by setting thresholds for geometrical properties. The number of pixels grouped in the tomato cluster is different for each clustering method. EM does not pick up the land patches as tomato pixels. As a result, the size of the tomatoes picked up differs from that of K-means and SOM. Since the threshold values chosen for spatial segmentation are shape and size dependent, different threshold values are applied to different methods of clustering. A synthetic image of 12 × 12 pixels with different labels is created to illustrate the effect of each method used for spatial segmentation on the clustered image. Two representative UAV images captured at different heights from the ground were used to demonstrate the performance of the proposed method. Results and a comparison of performance parameters of the different spectral-spatial classification methods are presented. It is observed that EM performed better than K-means and SOM.
© 2015 IAgrE. Published by Elsevier Ltd. All rights reserved.
* Corresponding author. Tel.: +91 080 229 32873; fax: +91 080 236 00134.
E-mail address: omkar@aero.iisc.ernet.in (S.N. Omkar).
http://dx.doi.org/10.1016/j.biosystemseng.2015.12.003
1537-5110/© 2015 IAgrE. Published by Elsevier Ltd. All rights reserved.
Please cite this article in press as: Senthilnath, J., et al., Detection of tomatoes using spectral-spatial methods in remotely sensed RGB
images captured by UAV, Biosystems Engineering (2015), http://dx.doi.org/10.1016/j.biosystemseng.2015.12.003
maximisation (EM) and self-organising map (SOM), applied on the image to generate the clusters and 2) Spatial segmentation: Morphological operations with thresholds for size and shape based properties such as area and shape intensity are applied to separate the tomatoes from other objects which were falsely picked up in the spectral clustering process.
Section 2 describes the UAV system used in this study and the data acquisition procedure. Section 3 describes the spectral-spatial methods applied on UAV images for tomato detection. In Section 4, a synthetic image is used to illustrate the proposed methods. Sections 5 and 6 detail the performance measures and the results with a discussion, respectively. The conclusions are discussed in Section 7.

2. Unmanned aerial vehicle for remote sensing

In this section the types of UAV suitable for remote sensing, and more specifically for agricultural applications, are discussed. UAVs typically fly at low altitudes to acquire remote sensing data, which is also known as low altitude remote sensing (LARS) (Saberioon et al., 2014). For LARS, most UAVs are fixed-wing or rotary-wing aircraft with low payload and short endurance capabilities. Payload size and weight are critical factors for UAVs in agricultural applications. The most important payload component in URS is the imaging sensor. The imaging sensor for URS therefore needs to be of small size and low mass.
Córcoles, Ortega, Hernández, and Moreno (2013) used a vertical take-off and landing (VTOL) micro-drone quad-rotor aircraft to determine LAI and canopy cover mapping of an onion crop. Berni, Zarco-Tejada, Suárez, and Fereres (2009) used a helicopter based UAV with hyper-spectral and thermal imaging for vegetation monitoring. Xiang and Tian (2011) used a 14 kg helicopter with multispectral cameras and autonomous capabilities to collect field image data with GPS navigation. Sugiura, Noguchi, and Ishii (2005) used remote sensing for vegetation monitoring using a helicopter UAV. From the literature it can be observed that LARS for agriculture is carried out using cameras mounted on small, low payload UAVs. These requirements drive the system characteristics of the UAV used in this work.

2.1. System description

In this section, the characteristics and preferred configuration of URS for agricultural applications are discussed. Further, the process of building the UAV for data acquisition is detailed.
The UAV used in this study was built using off-the-shelf components. It is a quad-copter, which belongs to the class of multi-rotor aircraft. The UAV was an electrically powered VTOL aircraft with low endurance and low cost. The main advantage of VTOL UAVs is their hovering capability, which makes them stable platforms during image acquisition. It is piloted remotely from the ground station by a trained pilot. The take-off weight was 2.1 kg including a payload of 0.1 kg. The UAV and camera specifications are listed in Tables 1 and 2 respectively.
Figures 1 and 2 show an image of the system and a block diagram, respectively. The URS comprised the following sub-systems, namely, a UAV, an imaging sensor and ground control.

2.1.1. UAV sub-system
This sub-system consisted of all the components and devices required for the UAV to be airborne. The UAV sub-system has four brushless DC motors mounted on a quad-copter frame, operated using four electronic speed controllers (ESC) that were powered by Lithium-Polymer batteries. The quad-copter was stabilised using an on-board flight stabilisation system. The flight stabilisation system was an open-source multi-rotor platform called ArduPilot (APM2.5, 3D Robotics Inc, Berkeley, California, USA). It consisted of a MEMS based gyroscope, accelerometer, magnetometer, barometer, GPS module and an ATmega2560 AVR microcontroller. As the pilot navigated using the remote control, the control commands were transmitted over the air interface to the on-board flight stabilisation system. The flight stabilisation system varied the speed of the motors to achieve the desired altitude. The pitch-roll camera stabilisation maintained the nadir view, which was independent of the orientation of the UAV.

2.1.2. Imaging sensor sub-system
The imaging system is realised using a Raspberry Pi® (Model 1 B, Raspberry Pi Foundation, Caldecote, Cambridgeshire, United Kingdom). The Raspberry Pi® is a credit card sized single board computer (SBC) that has a camera board as a standard add-on. The SBC runs a Linux operating system. Aerial video was acquired using the camera module of the Raspberry Pi® computer. The Raspberry Pi® camera module specification is shown in Table 2. The camera has a fixed focal length of 3.6 mm and a fixed horizontal and vertical field of view. The UAV was flown up to an altitude of 50 m. The images chosen for analysis were captured at altitudes below 20 m.
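To give a feel for how the flying altitude and the 3.6 mm focal length translate into ground resolution, the following is a minimal sketch of the usual pinhole relation, ground size of one pixel = altitude × pixel pitch / focal length. The pixel pitch of 1.4 µm is an assumption made only for illustration (it is not stated in the text, and the effective pitch of a video frame may differ), and the function name is hypothetical.

```python
def ground_sampling_distance(altitude_m, focal_length_mm, pixel_pitch_um):
    """Ground footprint of one pixel (in metres) for a nadir-pointing pinhole camera."""
    focal_length_m = focal_length_mm * 1e-3
    pixel_pitch_m = pixel_pitch_um * 1e-6
    return altitude_m * pixel_pitch_m / focal_length_m


# Assumption for illustration only: the sensor pixel pitch is not given in the text.
ASSUMED_PIXEL_PITCH_UM = 1.4
FOCAL_LENGTH_MM = 3.6  # fixed focal length quoted above

for altitude in (10.0, 20.0, 50.0):
    gsd_mm = 1000.0 * ground_sampling_distance(
        altitude, FOCAL_LENGTH_MM, ASSUMED_PIXEL_PITCH_UM)
    print(f"altitude {altitude:4.0f} m -> about {gsd_mm:.1f} mm per pixel")
```

Under this assumption the ground footprint of a pixel grows linearly with altitude, which is why thresholds expressed in pixels (such as object area) have to be re-tuned for images taken at different heights.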
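3.1.1. K-means
K-means aims at minimisation of the Euclidean distance

$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} \lVert x_n - \mu_k \rVert^2 \tag{3}$$

where $x_n$ is the nth pixel and $\mu_k$ is the kth cluster centre. The BIC-K-means steps are

$$\gamma_{nk} = \frac{\pi_k \, N(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, N(x_n \mid \mu_j, \Sigma_j)} \tag{5}$$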
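As an illustration of Eq. (5), the sketch below computes the responsibility of each mixture component for each RGB pixel. It is not the authors' implementation: to stay dependency-free it assumes diagonal covariance matrices (Eq. (5) allows full covariances), pixel values scaled to [0, 1], and hypothetical function and variable names.

```python
import math


def gaussian_diag(x, mean, var):
    """Gaussian density with a diagonal covariance (a simplifying assumption)."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * math.log(2.0 * math.pi * vi) - (xi - mi) ** 2 / (2.0 * vi)
    return math.exp(log_p)


def responsibilities(pixels, weights, means, variances):
    """Eq. (5): gamma[n][k] = pi_k N(x_n | mu_k, Sigma_k) / sum_j pi_j N(x_n | mu_j, Sigma_j)."""
    gamma = []
    for x in pixels:
        joint = [w * gaussian_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        gamma.append([j / total for j in joint])
    return gamma


# Two hypothetical components: a reddish one and a greenish one.
weights = [0.5, 0.5]
means = [(0.8, 0.2, 0.1), (0.2, 0.6, 0.2)]
variances = [(0.02, 0.02, 0.02), (0.02, 0.02, 0.02)]
pixels = [(0.8, 0.2, 0.1), (0.25, 0.55, 0.2), (0.5, 0.4, 0.15)]

gamma = responsibilities(pixels, weights, means, variances)
for x, g in zip(pixels, gamma):
    print(x, [round(v, 4) for v in g])
```

Each row of `gamma` sums to one, so a pixel can be assigned to the component with the largest responsibility once the iterations have converged.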
Step 6: Update the parameters of each distribution using Eqs. (6)-(8)

$$\mu_k^{new} = \frac{\sum_{n=1}^{N} \gamma_{nk} \, x_n}{N_k} \tag{6}$$

$$\Sigma_k^{new} = \frac{\sum_{n=1}^{N} \gamma_{nk} \, (x_n - \mu_k^{new})(x_n - \mu_k^{new})^{T}}{N_k} \tag{7}$$

$$\pi_k^{new} = \frac{N_k}{N}, \quad \text{where } N_k = \sum_{n=1}^{N} \gamma_{nk} \tag{8}$$

where $\mu_k^{new}$, $\Sigma_k^{new}$, $\pi_k^{new}$ are the updated parameters after each iteration, $N_k$ is the total sum of the responsibilities of a Gaussian distribution and $N$ is the total number of pixels in the image.

Step 7: Calculate the log likelihood function given by

$$L(\Theta; X) = \sum_{i=1}^{n} \log p(x_i \mid \Theta) \tag{9}$$

where the observations $X = \{x_i \mid i = 1, \ldots, n\}$ are independently drawn from the distribution $p(x)$ parameterised by $\Theta$.
The optimised likelihood is given by

$$\tilde{L}(\Theta; X) = L(\Theta; X) + \gamma P(X, y \mid \Theta) \tag{10}$$

where the regulariser $P$ is a functional of the distribution of the complete data given by $\Theta$ and the positive value $\gamma$ is the regularisation parameter that controls the compromise between the degree of regularisation of the solution and the likelihood function.
Repeat steps 4-6 till the log likelihood value converges.

Step 8: Select the cluster which is dominated by tomatoes for spatial segmentation, then apply spatial methods to remove misclassified tomato regions as discussed in Section 3.2.

3.1.3. Self-organising maps (SOM)
SOM is an unsupervised algorithm which belongs to the class of competitive learning (Kohonen, 1990). In competitive learning, output neurons compete with each other to get activated.
SOM transforms an incoming signal pattern into a one or two dimensional discrete map in a topological fashion. The neurons become selectively tuned to various input patterns or classes of input patterns during the course of competitive learning. The BIC-SOM algorithm steps are:

Step 1: Find out the number of neurons K in the mixture using BIC.
Step 2: Randomly initialise the connection weights of the neurons.
Step 3: For each input pattern, compute the value of the discriminant function for each neuron. This provides the basis for competition. The discriminant function for a neuron is given by

$$d_j = \sum_{i=1}^{D} (x_i - w_{ij})^2 \tag{11}$$

where $x$ is the input vector of dimension $D$ and $w_{ij}$ is the weight connecting the ith component of the input vector to the jth neuron. Hence, the winning neuron is the neuron that most closely matches the input, i.e. the neuron for which the discriminant function is minimum.
When one neuron is activated, its closest neighbours tend to get excited more than those farther away. There is a topological neighbourhood that decays with distance (Kohonen, 1990). A neighbourhood set $N_c$ is defined around the winning neuron $c$.
Step 4: Update the winning neuron and the neurons within the topological neighbourhood using

$$\Delta w_{ij} = \eta(t) \, (x_i - w_{ij}) \tag{12}$$

where $\Delta w_{ij}$ indicates the change in the weight and $\eta(t)$ is the time dependent learning rate which decreases with time. In this study the basic Kohonen network was implemented. The above steps were repeated until there was no further change in the topography.

Step 5: Select the cluster which is dominated by tomatoes for spatial segmentation, then apply spatial methods to remove misclassified tomato regions as discussed in Section 3.2.

3.2. Spatial segmentation

As discussed in the last step of the K-means, EM and SOM algorithms, here the spatial features of the clustered image were used to remove misclassified tomato regions.
The steps used in spatial segmentation and the rationale for the chosen sequence of steps were as follows:

3.2.1. Closing operation
Some of the single tomatoes appear bisected in the binary segmented image because of the overhead stalks. While counting, these may be counted as two instead of one. In order to join these, a closing operation is applied on the image with an appropriate structuring element.
The closing of an image f by a structuring element s is dilation followed by erosion (Haralick, Sternberg, & Zhuang, 1987). The structuring element is a small matrix of pixels, each with a value of zero or one. The dimensions of the matrix state the size of the structuring element and the pattern of ones and zeros specifies its shape.

$$f \bullet s = (f \oplus s_{rot}) \ominus s_{rot} \tag{13}$$

where $s_{rot}$ means that the dilation and erosion should be performed with a rotated structuring element. In case of
symmetrical structuring elements, the rotation does not make a difference. $\oplus$ represents the dilation of f with $s_{rot}$ and $\ominus$ represents erosion. A disk structuring element of radius 2 is used in this study.

3.2.2. Area thresholding and skeletonisation
The noise was predominantly due to patches of soil, stalks and pebbles. It is not possible to eliminate all noise in one step due to the different geometric characteristics of the noise sources. The soil patches, being the largest, were removed first by setting a threshold for area values. The threshold of area values for a given image depends on the resolution of the image. The resolution of the image is a function of the altitude above the ground at which the image was captured. The objects in the image were first labelled and then the area for each object was calculated. The area of an object was calculated by finding the total number of pixels depicting that object. However, problems arose in selecting a threshold when the tomatoes in the image overlapped.
In some cases, the area values of tomatoes that overlapped and of soil patches were similar. We overcome this difficulty by taking into account the skeleton representation of each object calculated using morphological means. The skeleton of an object provides a simple and compact representation of a shape that preserves the topology and the original size of the object. The skeletonisation S(A) (Sagar, 2013) of an object A is carried out using the following equations:

$$S_k(A) = (A \ominus kB) - (A \ominus kB) \circ B \tag{14}$$

$$S(A) = \bigcup_{k=0}^{K} S_k(A) \tag{15}$$

where B is a structuring element of appropriate size and shape. At first, k successive erosions of object A were carried out with the structuring element B. The eroded image was then opened using B and then it is subtracted from the eroded image to obtain the skeleton points as shown in Eq. (14). The operations were done iteratively for different sizes of B. The union of all such skeleton points gave the skeleton representation of the object. In this study, instead of taking the union as shown in Eq. (15), only Eq. (14) was implemented for a suitable k and B to find out the number of skeleton points for each object. Due to overlapping, the connected tomatoes had a complex shape and they need a greater number of pixels to represent their skeleton. Therefore, along with an area threshold, a threshold for skeleton points can be set to retain the tomatoes and eliminate the land patches.

3.2.3. Removal of stalks
Tomatoes are nearly circular in shape while the stalks are relatively long, thin and have branches. This distinction in the geometrical features of the stalks can be used to categorise them into the non-tomato group. The geometric parameter shape index (SI) (Senthilnath et al., 2013) is given by

$$SI = \frac{P}{4\sqrt{A}} \tag{16}$$

where P represents the perimeter of the object (i.e. the number of pixels on the boundary of the object) and A represents the area of the object. For a given area, the perimeter value of a stalk is greater than that of a tomato, which gives the stalks a higher SI value. Thus, a threshold value was set for SI to remove stalks. The morphological opening operation was applied on the image to detach the stalks attached to tomatoes.

Fig. 3 - Illustrative example (a) Clustered image (b) Structuring element disk of radius 1 (c) Image after applying closing operation (d) Effect of area thresholding (e) Effect of shape intensity thresholding (f) Removing the elliptical noise by calculating the ratio of major to minor axis.

3.2.4. Removal of elliptical noise
The unwanted objects remaining in the image were mostly pebbles which are commonly elliptical in shape. The best way to distinguish an ellipse from a circle is by finding the ratio of
the length of the major axis to the minor axis. Circular objects have a ratio of almost one while elliptical objects have a ratio greater than one. The labels having a ratio greater than the threshold are categorised into the non-tomato group while the others are retained in the tomato group.
The result obtained, after implementing all the noise removal methods, was an image consisting of the tomatoes present in the image.

4. Illustrative example

A synthetic image of size 12 × 12 pixels was assumed to illustrate the effect of each step in the proposed algorithm. Figure 3(a) shows the clustered image with tomatoes and misclassified tomato regions. The challenge here is to retain tomatoes by removing misclassified tomato regions, i.e. to obtain labels 1, 6 and 7.
Firstly, a closing operation is applied on the image with a disk structuring element of radius 1 as shown in Fig. 3(b). Closing fills the gaps in the regions while retaining the initial region size, as shown in Fig. 3(c).
The number of pixels (area) of each label was specified as:

Label 1: 13    Label 2: 02
Label 3: 11    Label 4: 08
Label 5: 08    Label 6: 03
Label 7: 04    Label 8: 01

From Fig. 3(c), it can be observed that the number of skeleton points for each label was:

Label 1: 4    Label 2: 0
Label 3: 2    Label 4: 2
Label 5: 2    Label 6: 1
Label 7: 0    Label 8: 0

shown in Fig. 3(e). The major to minor axis ratio calculated for each object is

Label 1: 1.9861    Label 5: 2.2638
Label 6: 1.4639    Label 7: 1.0000

Label 5 represented a pebble in the image. It had the highest major to minor axis ratio and was eliminated by setting a threshold of 2 for the ratio. The final result obtained after eliminating all the misclassified tomato regions is shown in Fig. 3(f).

5. Performance measures

The performance measures using ROC parameters were analysed in this study by comparing the spectral-spatial classifiers. Reference data showing the location of tomato pixels in the image were collected using a field survey. The processed binary image is overlaid on the reference data and the tomato and non-tomato pixels are identified. The numbers of true tomatoes and false tomatoes were counted separately to evaluate the performance measures.
ROC parameters calculated for UAV images to extract tomatoes are defined in terms of true positive (TP), false positive (FP) and false negative (FN) (Fawcett, 2006; Senthilnath, Shivesh, Omkar, Diwakar, & Mani, 2012). A TP means that the extracted object is a tomato and the database also indicates the object to be a tomato. If the extracted object is not a tomato but the database indicates it to be a tomato, then it is counted as an FN. For an FP the extracted object is a tomato but the database indicates it not to be a tomato.
Based on these ROC parameters, the performance measures Recall, Precision and F-Measure are evaluated.

i. Recall:
For a good spectral-spatial classifier, all the above measures will be close to one.

6. Results and discussion

In this section, the experimental results obtained using UAV images for tomato detection are presented. The spatial resolution of the images acquired in this study is dynamic since the altitude of the UAV varied. The nearer the UAV was to the tomato field, the better the features it picked up in the image. The two images examined were acquired at different altitudes and hence different spatial resolutions were studied. The detection of tomatoes was carried out using spectral-spatial classification.

6.1. UAV image 1

The size of the first UAV image was 293 × 415 pixels as shown in Fig. 4. The optimal number of clusters to be chosen for clustering depends on the data set. The BIC curve reached a maximum at 3 as shown in Fig. 5. Hence all the spectral methods generated 3 clusters.
The spectral methods, namely, K-means, EM and SOM, were applied to group tomato and non-tomato regions in the image. The resultant images are shown in Figs. 6(a), 7(a) and 8(a) respectively. The spectrally classified image contained soil patches, stalks and pebbles among other objects that were misclassified as belonging to the tomato region.
To overcome this problem, spatial classification was applied on the spectrally classified image as discussed in Section 3.2. For the K-means clustered image, an area threshold value of 25 was used where all the labels having area <25 were grouped into non-tomato pixels. To remove the stalks an SI threshold value of 1.4125 was used along with a threshold value of 26 for the number of skeleton points. Therefore, all the labels having SI value >1.4125 were classified into non-tomato pixels. However, due to the threshold value for the number of skeleton points, certain overlapped tomatoes, despite having SI values >1.4125, were retained in the tomato group. For eliminating the elliptical shaped noise, a value of 2.3033 was used as the maximum threshold point for the ratio of major to minor axis along with a threshold value of 26 for the skeleton points.
Likewise, for the EM clustered image an area threshold value of 16 and an SI threshold value of 1.424 were used along with a threshold of 8 for the number of skeleton points. For the ratio of the major axis length to minor axis, a threshold value of 2.0655 along with a threshold value of 25 for skeleton points was used. For the SOM clustered image, an area threshold value of 18 and an SI threshold value between 2.5077 and 1.4402 were used. So, all those labels having an SI value in the threshold range were categorised as non-tomato pixels. For eliminating the elliptical shaped noise, a value of 2.4079 was used as the threshold value for major to minor axis ratio along with a threshold value of 18 for skeleton points.
The spatial classification improves the tomato detection as shown in Figs. 6(b), 7(b) and 8(b) for K-means, EM and SOM respectively. Figures 6(c), 7(c) and 8(c) show the spectral-spatial classified tomato region overlaid on the original image.
The results of Figs. 6(c), 7(c) and 8(c) are represented in Table 3, which compares the ROC parameters of all the methods, namely, K-means, EM and SOM. The performance measures based on ROC parameters were calculated for each method, as shown in Table 4. For this was lower for SOM. The advantage of SOM is that it had a higher precision value than the other two methods as it selects fewer false positives. To analyse the results, the graph between the precision and recall was plotted as shown in Fig. 9. The desirable outcome was the point where both the measures, i.e. recall and precision, were one. The point closest to the coordinate (1, 1) was selected as the best result. The results of SOM and EM were equally good for both precision and recall compared to the K-means process.
In all these algorithms used for tomato detection, the number of clusters chosen for clustering played an important role. Since there are two classes of objects, i.e. tomato and non-tomato regions, the performance measures were analysed by taking an exact partitioning into 2 clusters. This was compared with the clusters generated using BIC, which is 3 clusters for this image. The ROC parameters and the performance measures using two clusters are shown in Tables 5 and
6 respectively. While comparing the performance measures, it is noticed that choosing the 3 cluster centres generated by BIC gave better recall as well as precision values than 2 clusters for K-means and SOM. For EM, the precision value was better for 2 clusters than for the 3 clusters generated by BIC. However, the number of false positives picked was high for the case of 2 clusters as compared to the 3 clusters generated by BIC. Hence, clustering using 3 clusters was a better solution than 2 clusters. Figure 9 shows the precision versus recall values using BIC and without BIC for K-means, EM and SOM. It was concluded that using the number of clusters generated by BIC gave a better result compared with 2 clusters.

Fig. 6 - (a) Spectral clustering using K-means on image 1 (b) Spatial classification on the clustered image (c) Spectral and spatial classified tomato region overlaid on the original image.

In the EM method the regularisation factor plays a key role in the classification of the pixels as tomatoes. In order to decide the regularisation factor which gives the best result, the change in the ROC parameters and performance measures with respect to different regularisation factors was examined. The complete analysis is shown in Table 7. Since F-Measure takes into account both recall and precision, it can be taken as the prime factor for deciding the best regularisation value. From Table 7, it can be seen that F-Measure reached its maximum value of 0.8911 for a regularisation factor of 0.0005. Hence, the result obtained using 0.0005 as the regularisation value was used for comparing the performance of the EM method with the other two methods.
SOM considers the learning rate as an important parameter for deciding the classification of pixels. The optimal value was observed by varying the learning rate and then comparing the performance measures obtained for each case. The complete analysis is shown in Table 8; from this table it can be observed that the value 0.99 gives the better result.
The algorithms were tested on a computer with a Core-i3 processor and 4 GB RAM with Matlab (R2013a, The MathWorks, Inc., Natick, Massachusetts, USA), and the execution time for UAV Image 1 was recorded. The execution time for UAV Image 1 was 874.41 s for the K-means-spatial method, 3453.818 s for the EM-spatial method and 2886.445 s for the SOM-spatial method.

6.2. UAV image 2

The second UAV image is of size 1080 × 1920 pixels, as shown in Fig. 10. The number of clusters at which the BIC curve reaches its maximum was analysed. From Fig. 11, it can be observed that 5 is the optimal number of clusters to be applied on this image for all the clustering methods.
The clustering of the image using K-means, EM and SOM is shown in Figs. 12(a), 13(a) and 14(a) respectively. In the clustered image, misclassification can be observed where non-tomato pixels are misclassified as tomato pixels. In order to remove the misclassified pixels, spatial classification is performed on these images.
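The first two spatial steps referred to here (closing, then labelling and area thresholding, Sections 3.2.1 and 3.2.2) can be sketched on a toy binary image as follows. This is a pure-Python illustration rather than the Matlab code used in the study: the cross-shaped 3 × 3 structuring element, the 8-connectivity used for labelling, the toy image and the area band are all assumptions chosen for the example.

```python
from collections import deque

# Cross-shaped 3x3 structuring element, used here as a stand-in for a small disk.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def dilate(img, offsets=CROSS):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if img[r][c]:
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        out[rr][cc] = 1
    return out


def erode(img, offsets=CROSS):
    # Pixels outside the frame count as background; for the closing below this
    # is exact as long as no object pixel of the input lies on the image border.
    h, w = len(img), len(img[0])
    return [[int(all(0 <= r + dr < h and 0 <= c + dc < w and img[r + dr][c + dc]
                     for dr, dc in offsets))
             for c in range(w)] for r in range(h)]


def close(img):
    """Closing: dilation followed by erosion (symmetric element, so no rotation)."""
    return erode(dilate(img))


def label_components(img):
    """Return the objects as lists of (row, col) pixels, using 8-connectivity."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    objects = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    pr, pc = queue.popleft()
                    pixels.append((pr, pc))
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, cc = pr + dr, pc + dc
                            if (0 <= rr < h and 0 <= cc < w
                                    and img[rr][cc] and not seen[rr][cc]):
                                seen[rr][cc] = True
                                queue.append((rr, cc))
                objects.append(pixels)
    return objects


# Toy image: one "tomato" bisected by a one-pixel stalk, plus a single noise pixel.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
MIN_AREA, MAX_AREA = 3, 50  # hypothetical area band, in pixels

areas_before = sorted(len(obj) for obj in label_components(image))
closed = close(image)
areas_after = sorted(len(obj) for obj in label_components(closed))
kept = [obj for obj in label_components(closed) if MIN_AREA <= len(obj) <= MAX_AREA]
print(areas_before, areas_after, len(kept))
```

In this toy case the closing joins the two halves of the bisected object, so it is counted once, and the area band then discards the isolated noise pixel.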
Fig. 7 - (a) Spectral clustering using EM on image 1 (b) Spatial classification on the clustered image (c) Spectral and spatial classified tomato region overlaid on the original image.

In the case of EM, since no soil patches were classified into the tomato group, the "salt and pepper" noise was removed first. To achieve this, a threshold value of 62 was used for areas, where all the labels having area <62 were grouped into non-tomato pixels. All those labels having an SI value in the range of 1.9486 to 0.9219 were retained as tomato pixels. To remove the pebbles, a threshold value of 2.6084 was set for the ratio of major to minor axis. For K-means and SOM, soil patches were also misclassified into the tomato group. In the case of K-means, soil patches have areas >1475 and the "salt and pepper" noise has areas <140. Therefore, an area threshold value in the range of 1475 and 140 along with 50 as the threshold value for the number of skeleton points was used to retain the tomatoes. An SI threshold value was set at 1.6137 and, for the ratio of major axis to minor axis, 1.8030 was chosen as the threshold value and 5 as the threshold for the number of skeleton points. Similarly for SOM, the area threshold value chosen was in the range of 2070 and 130 along with 25 as the threshold for the number of skeleton points. The threshold values for SI and ratio of major to minor axis were set as 1.7768 and 1.9396 respectively. The resultant images are shown in Figs. 12(b), 13(b) and 14(b) for K-means, EM and SOM respectively. Figures 12(c), 13(c) and 14(c) are the spectral-spatial classified tomato regions overlaid on the original image.
Due to the similar spectral intensities, a few tomato pixels and non-tomato pixels were misclassified with each other. The number and the type of pixels grouped in the tomato cluster were different in each method. For example, for EM, the tomato cluster did not pick up the soil patches and did not capture the tomato pixels fully. The size of the tomatoes picked up in the EM tomato cluster was smaller compared to the other two methods. In the case of SOM, the soil patches, stalks and pebbles were captured in larger proportion than in the other two methods. Since the threshold values were shape and size dependent, different threshold values were applied to different methods of clustering.
Threshold values were data dependent due to the different resolutions used in this study. Threshold values of the above mentioned parameters were chosen empirically. A sensitivity analysis for the spatial method applied on the K-means spectrally clustered image was carried out. Table 9 shows how the results varied with the change in threshold values for the spatial method. As the threshold values were increased for all three geometrical properties, individually or together (i.e. the labels having values greater than these thresholds are removed), the number of true positives as well as false positives increased. Hence, the threshold values for the spatial method were set empirically so as to give a balance between the precision and recall. This was done by calculating the F-Measure in our study. The thresholds which gave the better F-Measure value were set.
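The two shape properties whose thresholds are discussed above, the shape index of Eq. (16) and the major to minor axis ratio, can be computed per labelled object along the following lines. This is a sketch under stated assumptions, not the routine used in the study: the perimeter is counted as the number of object pixels with at least one 4-neighbour outside the object, the axis ratio is taken from the second-order central moments (with a 1/12 term for the unit extent of each pixel), and the rectangular test objects and the thresholds `SI_MAX` and `RATIO_MAX` are hypothetical.

```python
import math


def shape_index(pixels):
    """Eq. (16): SI = P / (4 * sqrt(A)), with P counted as boundary pixels."""
    obj = set(pixels)
    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    perimeter = sum(
        1 for (r, c) in obj
        if any((r + dr, c + dc) not in obj for dr, dc in neighbours))
    return perimeter / (4.0 * math.sqrt(len(obj)))


def axis_ratio(pixels):
    """Major/minor axis ratio of the ellipse with the same second moments."""
    n = len(pixels)
    mean_r = sum(r for r, _ in pixels) / n
    mean_c = sum(c for _, c in pixels) / n
    # The 1/12 term treats each pixel as a unit square (assumption).
    s_rr = sum((r - mean_r) ** 2 for r, _ in pixels) / n + 1.0 / 12.0
    s_cc = sum((c - mean_c) ** 2 for _, c in pixels) / n + 1.0 / 12.0
    s_rc = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / n
    half_trace = (s_rr + s_cc) / 2.0
    root = math.sqrt(((s_rr - s_cc) / 2.0) ** 2 + s_rc ** 2)
    return math.sqrt((half_trace + root) / (half_trace - root))


def rectangle(height, width):
    return [(r, c) for r in range(height) for c in range(width)]


# Hypothetical objects: a compact blob, a long thin stalk and an elongated pebble.
objects = {
    "tomato": rectangle(4, 4),
    "stalk": rectangle(2, 12),
    "pebble": rectangle(3, 7),
}
SI_MAX, RATIO_MAX = 1.0, 2.0  # hypothetical thresholds, chosen for this toy data

kept = []
for name, pixels in objects.items():
    si, ratio = shape_index(pixels), axis_ratio(pixels)
    print(f"{name}: SI = {si:.4f}, axis ratio = {ratio:.4f}")
    if si <= SI_MAX and ratio <= RATIO_MAX:
        kept.append(name)
print(kept)
```

With these toy shapes the stalk is rejected by its shape index and the pebble by its axis ratio, which mirrors why the two thresholds are applied as separate steps.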
Please cite this article in press as: Senthilnath, J., et al., Detection of tomatoes using spectral-spatial methods in remotely sensed RGB
images captured by UAV, Biosystems Engineering (2015), http://dx.doi.org/10.1016/j.biosystemseng.2015.12.003
Fig. 8 - (a) Spectral clustering using SOM on image 1 (b) Spatial classification on the clustered image (c) Spectral and spatial classified tomato region overlaid on the original image.
The ROC parameters and the performance measures for the second UAV image are shown in Tables 10 and 11 respectively. From Table 10, it can be seen that along with the largest number of true positives, SOM gave the largest number of false positives. Because of this, the precision value for SOM was the lowest among the three methods. For EM, the number of true positives was lower than for SOM, but the number of false positives was also lower than for SOM or K-means. The difference in the precision values of SOM and EM was greater than the difference in the recall values, which makes EM a better method than SOM. The same inference can be observed from Fig. 15 where the EM (precision and recall)
Table 7 - ROC parameters and performance measures for different regularisation factors used in EM for UAV Image 1.
Regularisation factor   True positives   False positives   False negatives   Recall   Precision   F-Measure
0                       47               8                 4                 0.9216   0.8545      0.8868
0.0005                  45               5                 6                 0.8824   0.9000      0.8911
0.0008                  18               0                 33                0.3529   1.0000      0.5217
0.0010                  17               0                 34                0.3333   1.0000      0.5000
0.0030                  39               1                 12                0.7647   0.9750      0.8571
0.0050                  41               4                 10                0.8039   0.9111      0.8541
0.0080                  42               2                 9                 0.8235   0.9545      0.8842
0.0100                  35               2                 16                0.6863   0.9459      0.7955
0.0200                  9                3                 42                0.1765   0.7500      0.2858
Table 8 - ROC parameters and performance measures for different learning rates used in SOM for UAV Image 1.
Learning rate   True positives   False positives   False negatives   Recall   Precision   F-Measure
0.10            39               4                 12                0.7647   0.9070      0.8298
0.30            39               7                 12                0.7647   0.8478      0.8041
0.50            42               5                 9                 0.8235   0.8936      0.8571
0.70            42               5                 9                 0.8235   0.8936      0.8571
0.90            42               4                 9                 0.8235   0.9130      0.8659
0.99            44               3                 7                 0.8627   0.9362      0.8979
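The recall, precision and F-Measure columns of Tables 7 and 8 follow from the true positive, false positive and false negative counts. The sketch below assumes the standard definitions (recall = TP / (TP + FN), precision = TP / (TP + FP), F-Measure as the harmonic mean of the two) and recomputes three rows taken from the tables. The function name `performance_measures` is hypothetical, and the zero-denominator handling is an added assumption rather than something stated in the text.

```python
def performance_measures(tp, fp, fn):
    """Return (recall, precision, f_measure) from detection counts.

    Assumes the standard definitions; a zero denominator yields 0.0,
    which is a convention chosen for this sketch.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall == 0.0:
        return recall, precision, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure


# (description, true positives, false positives, false negatives) from Tables 7 and 8.
rows = [
    ("EM, regularisation factor 0", 47, 8, 4),
    ("EM, regularisation factor 0.0005", 45, 5, 6),
    ("SOM, learning rate 0.10", 39, 4, 12),
]

results = {}
for name, tp, fp, fn in rows:
    recall, precision, f_measure = performance_measures(tp, fp, fn)
    results[name] = (round(recall, 4), round(precision, 4), round(f_measure, 4))
    print(f"{name}: recall={recall:.4f} precision={precision:.4f} F={f_measure:.4f}")
```

For these three rows the recomputed values agree with the tabulated ones to four decimal places. For some other rows the last digit of the F-Measure can differ by one, depending on whether it is computed from the raw counts or from the already rounded recall and precision.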
Fig. 12 - (a) Spectral clustering using K-means on image 2 (b) Spatial classification on the clustered image (c) Spectral and spatial classified tomato region overlaid on the original image.
Fig. 13 - (a) Spectral clustering using EM on image 2 (b) Spatial classification on the clustered image (c) Spectral and spatial classified tomato region overlaid on the original image.
Fig. 14 - (a) Spectral clustering using SOM on image 2 (b) Spatial classification on the clustered image (c) Spectral and spatial classified tomato region overlaid on the original image.
Hunt, E. R., Hively, W. D., Fujikawa, S. J., Linden, D. S., Selim, S. Z., & Ismail, M. A. (1984). K-means-type algorithms: a
Daughtry, C. S., & McCarty, G. W. (2010). Acquisition of NIR- generalized convergence theorem and characterization of
green-blue digital photographs from unmanned aircraft for local optimality. Pattern Analysis and Machine Intelligence, IEEE
crop monitoring. Remote Sensing, 2(1), 290e305. Transactions on, 1, 81e87.
Johnson, D. M. (2014). An assessment of pre-and within-season Seng, W. C., & Mirisaee, S. H. (2009). (2009, August). A new method
remotely sensed variables for forecasting corn and soybean for fruits recognition system. In , ICEEI'09. International
yields in the United States. Remote Sensing of Environment, 141, Conference on: Vol. 1. Electrical engineering and informatics (pp.
116e128. 130e134). IEEE. August.
Kohonen, T. (1990). The self-organizing map. Proceedings of the Senthilnath, J., Omkar, S. N., Diwakar, P. G., Mani, V., Nitin, K., &
IEEE, 78(9), 1464e1480. Shreyas, P. B. (2013). Crop stage classification of hyperspectral
Kooistra, L., Suomalainen, J., Franke, J., Bartholomeus, H., data using unsupervised techniques. IEEE Journal of Selected Topics
Mücher, S., & Becker, R. (2014, May). Monitoring agricultural in Applied Earth Observations and Remote Sensing, 6(2), 861e866.
crops using a light-weight hyperspectral mapping system for Senthilnath, J., Shivesh, B., Omkar, S. N., Diwakar, P. G., &
unmanned aerial vehicles. In EGU General Assembly Conference Mani, V. (2012). An approach to multi-temporal MODIS image
Abstracts (Vol. 16, p. 2790). analysis using image classification and segmentation.
Launay, M., & Guerif, M. (2005). Assimilating remote sensing data Advances in Space Research, 50(9), 1274e1287.
into a crop model to improve predictive performance for Shi, H., & Xingguo, M. (2011). Interpreting spatial heterogeneity of
spatial applications. Agriculture, Ecosystems & Environment, crop yield with a process model and remote sensing. Ecological
111(1), 321e339. Modelling, 222(14), 2530e2541.
Li, H., Zhang, K., & Jiang, T. (2005). The regularized EM algorithm.
Stajnko, D., & Cmelik, Z. (2005). Modelling of apple fruit growth by
In Proceedings of the 20th National Conference on Artificial application of image analysis. Agriculturae Conspectus
Intelligence (pp. 807e812). Scientificus (ACS), 70(2), 59e64.
Lobell, D. B. (2013). The use of satellite data for crop yield gap Stajnko, D., Lakota, M., & Hoc evar, M. (2004). Estimation of
analysis. Field Crops Research, 143, 56e64. number and diameter of apple fruits in an orchard during the
Mo, X., Liu, S., Lin, Z., Xu, Y., Xiang, Y., & McVicar, T. R. (2005). growing season by thermal imaging. Computers and Electronics
Prediction of crop yield, water consumption and water use in Agriculture, 42(1), 31e42.
efficiency with a SVAT-crop growth model using remotely Sugiura, R., Noguchi, N., & Ishii, K. (2005). Remote-sensing
sensed data on the North China Plain. Ecological Modelling, technology for vegetation monitoring using an unmanned
183(2), 301e322. helicopter. Biosystems Engineering, 90(4), 369e379.
Patel, H. N., Jain, R. K., & Joshi, M. V. (2011). Fruit detection using Tarabalka, Y., Benediktsson, J. A., & Chanussot, J. (2009).
improved multiple features based algorithm. International Spectralespatial classification of hyperspectral imagery based
Journal of Computer Applications, 13(2), 1e5. on partitional clustering techniques. Geoscience and Remote
Regunathan, M., & Lee, W. S. (2005, July). Citrus fruit identification Sensing, IEEE Transactions on, 47(8), 2973e2987.
and size determination using machine vision and ultrasonic Xiang, H., & Tian, L. (2011). Development of a low-cost
sensors. In ASAE Annual International Meeting. agricultural remote sensing system based on an autonomous
Saberioon, M. M., Amin, M. S. M., Anuar, A. R., Gholizadeh, A., unmanned aerial vehicle (UAV). Biosystems Engineering, 108(2),
Wayayok, A., & Khairunniza-Bejo, S. (2014). Assessment of rice 174e190.
leaf chlorophyll content using visible bands at different Yuping, M., Shili, W., Li, Z., Yingyu, H., Liwei, Z., Yanbo, H., et al.
growth stages at both the leaf and canopy scale. International (2008). Monitoring winter wheat growth in North China by
Journal of Applied Earth Observation and Geoinformation, 32, combining a crop model and remote sensing data. International
35e45. Journal of Applied Earth Observation and Geoinformation, 10(4),
Sagar, B. S. D. (2013). Mathematical morphology in geomorphology and 426e437.
GISci. CRC Press. Zhou, R., Damerow, L., Sun, Y., & Blanke, M. M. (2012). Using
Schwarz, G. (1978). Estimating the dimension of a model. The colour features of cv. ‘Gala’ apple fruits in an orchard in image
Annals of Statistics, 6(2), 461e464. processing to predict yield. Precision Agriculture, 13(5), 568e580.