https://doi.org/10.1007/s10278-019-00245-9
Abstract
In this paper, a simplified yet efficient architecture of a deep convolutional neural network is presented for lung image classification. The images used for classification are computed tomography (CT) scan images obtained from two publicly available databases widely used in research. Six external shape-based features, viz. solidity, circularity, discrete Fourier transform of the radial length (RL) function, histogram of oriented gradients (HOG), moment, and histogram of the active contour image, have also been identified and embedded into the proposed convolutional neural network. The performance is measured in terms of average recall and average precision values and compared with six similar methods for biomedical image classification. The average precision obtained for the proposed system is 95.26% and the average recall value is 69.56% on average for the two databases.
Keywords: Biomedical indexing · Image retrieval · Convolutional neural network (CNN) · Content-based image retrieval (CBIR)
max-pooling layers, two normalization layers, one fully connected and one softmax layer for classification of tumors in brain MR images. Authors in [15] presented a Caffe CNN framework for high-resolution CT (HRCT) images of lungs. They classified biomedical images using a CNN architecture with two convolutional layers followed by a max-pooling layer. The output of this max-pooling layer is fed into a feedforward neural network for final classification. Chung et al. [16] proposed a CNN by combining two Res-Net-50 CNNs. In each Res-Net-50 CNN, pre-trained weights of another CNN, named Image-Net, over the same database are used. The CNN thereby formed is referred to as a Siamese deep network and is used for classification of diabetic retinopathy images. Authors in [17] proposed a CNN with 25 convolutional layers, two fully connected layers, and one output layer for classification of datasets with medical images of brain MRI, breast MRI, and cardiac computed tomography angiogram (CTA). Here, a single CNN is proposed for three different types of medical images and it is shown that its performance is comparable to CNNs trained for a specific dataset. In [18], authors suggested a deep network with four convolutional and three pooling layers. The input consisted of 1553 axial, sagittal, and coronal views of the pulmonary segment for classification of emphysema. The results have been compared with other, more complex CNN architectures and the proposed one has been claimed to be more suitable. Another CNN-based image retrieval system for lung images was proposed by authors in [19], in which the extracted features are encoded using Fisher vector encoding and dimensionality reduction is done using principal component analysis (PCA). The result is then fed into a multivariate regression model for classification. Campo et al. [20] proffered an approach to detect emphysema from X-ray images using a CNN with four convolutional layers, two max-pooling layers, two dense layers, one dropout layer, one flatten layer, and one output layer. An accuracy of 90.33% is achieved in this case.

Moreover, there are a number of medical image classification methods which use CNNs along with externally extracted features. For example, authors in [21] used a classification model where texture-based features like local binary patterns and local ternary patterns are extracted from the images and then fed into three parallel CNNs at the same time. The output feature set is the input to an SVM for final classification of images. In [22], a dynamic threshold-based local mesh ternary pattern (LMeTerP) is computed. The features extracted using this dynamic LMeTerP are then combined with a CNN architecture to retrieve similar biomedical images. In [23], a dataset of 1500 medical images from different body parts like lung, brain, and heart is collected and downsampled. A patch-based feature set is then extracted using various filters and subsequently a parametric rectified linear unit is applied as an activation function. Finally, a deep convolutional network is applied to the feature set obtained, and reconstruction of the images is achieved by minimizing the loss between the predicted output image and the obtained high-resolution image. Authors in [24] used a stochastic Hopfield network for detecting lung nodules in CT scan images. The features extracted using an RBM are fed into a CNN with two convolutional layers, two sub-sampling layers, and one fully connected layer for final classification. The approach classified lung nodules as benign or malignant.

It has also been observed in the literature [21–24] that when a suitable CNN architecture is supported with external features, its performance increases. Therefore, in this paper, a CNN architecture for medical image classification is proposed and its performance is analyzed with respect to other CNN-based medical image classification systems to prove its worth. Moreover, six external feature sets (four existing and two suggested) have also been introduced, and when the proposed CNN-based system is combined with these six feature sets, its precision and recall are further increased by nearly 10.07% and 6.21% respectively. The paper is divided into the following sections. "Background" explains the background of the various CNNs. The proposed CNN architecture, followed by details of the extracted features, is given in "Proposed Work." Experimental results are given in "Results and Discussions" and the paper is concluded in "Conclusion."

Background

Basics of CNN

Convolutional neural networks are a specialized form of artificial neural networks with a larger number of hidden layers and advanced transfer functions (https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/residual_net.html). A simple CNN architecture can have six layers as shown in Fig. 1.

1. Input layer: It holds the raw input image to be fed into the convolutional layer.
2. Convolutional layer: It extracts features from a given image by convolving a kernel over the image. Here, hyper-parameters like the stride (the number of pixels the kernel skips while convolving) can be tuned.
3. ReLU (rectified linear unit) layer: A ReLU layer performs a threshold operation on the input: any value less than zero is set to zero. Instead of making it a separate layer, it can also be embedded in the transfer function of the convolutional layer.
4. Pooling layer: It reduces the dimensions of the feature maps extracted by the convolutional layer and thereby prevents overfitting.
J Digit Imaging
Fig. 1 Architecture of a convolutional neural network (input image → convolutional layer → ReLU layer → max-pooling layer → dense layer → classification layer)
5. Dense layer: Feature maps are flattened into a feature vector in this layer, which can then be fed into the final classification layer.
6. Classification layer: It predicts the class of the input image. Here, the number of neurons is the same as the number of classes in the final output.

Parameters like the number of convolutional layers, the size of the filters, the number of max-pooling layers, the size of the stride, the number of hidden neurons in the dense layer, and the transfer function are considered as training parameters for a given CNN.

ImageNet-VGG-f Architecture

It is one of the popular CNN architectures, explored in [21] for medical image classification with remarkable performance. Its original architecture was suggested by authors in [25, 26], in which there are five convolutional layers, three max-pooling layers, two dense layers, and one output layer. In [21], the performance of the ImageNet-VGG-f architecture is claimed to be one of the best over other ImageNet architectures for a medical image dataset.

In this paper, the architecture of the ImageNet-VGG-f CNN has been modified, as explained in the next section, by dropping the convolutional layer and incorporating externally extracted feature vectors.

Proposed Work

Proposed CNN Architecture

Many CNN architectures have been proposed by authors in the past, like in [24, 27], for classification of CT scan images of lungs. One such architecture is the ImageNet-VGG-f feature-based CNN model which has been proposed by authors in [21]. In this paper, a simplified version of the ImageNet-VGG-f CNN with a smaller number of layers has been proposed. Moreover, six external features (four existing and two proposed) as described in "Extracted Features" have also been suggested, and the combined performance of these features with the proposed CNN is evaluated. As discussed in "Basics of CNN," since the role of the convolutional layer is to extract features from the input image, this layer is dropped in the proposed CNN architecture because externally extracted features are given as its input. Thereby, this architecture comprises one ReLU layer, one max-pooling layer, one dense layer, and finally one classification layer, as shown in Fig. 2. The stride is set to 1 and the pool size is set to 1 × 1 for maximum precision and recall, as discussed in (https://in.mathworks.com/help/nnet/examples/create-simple-deep-learning-network-for-classification.html).

Further, the architecture is tested with different numbers of epochs and the maximum precision and recall values are obtained when the CNN model is trained with 200 epochs. To
Fig. 2 Proposed CNN architecture (input image with external features from groups 1 … n → ReLU layer → max-pooling layer → dense layer → classification layer → classification result)
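The forward pass of the four retained layers can be sketched as below. This is an illustrative NumPy sketch under stated assumptions, not the authors' trained model: the feature-map size, dense-layer width, class count, and random weights are all made up for the example, and the softmax classification layer is a standard stand-in. It also makes explicit that a 1 × 1 max-pooling layer with stride 1 is an identity map.

```python
import numpy as np

# Hedged sketch of the proposed pipeline: ReLU -> 1x1 max-pooling
# (stride 1) -> dense -> softmax. Sizes and weights are illustrative.

def relu(x):
    # ReLU layer: values below zero are set to zero
    return np.maximum(x, 0.0)

def max_pool(x, pool=1, stride=1):
    # Max-pooling over pool x pool windows; with pool=1 and stride=1
    # (the paper's setting) this reduces to an identity map.
    h, w = x.shape
    out_h = (h - pool) // stride + 1
    out_w = (w - pool) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * stride:i * stride + pool,
                          j * stride:j * stride + pool].max()
    return out

def softmax(z):
    # Classification layer: turns scores into class probabilities
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(feature_map, w_dense, w_out):
    x = relu(feature_map)              # ReLU layer
    x = max_pool(x, pool=1, stride=1)  # 1x1 max-pooling layer
    x = x.ravel()                      # flatten into the dense layer
    h = relu(w_dense @ x)              # dense layer
    return softmax(w_out @ h)          # classification layer

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(6, 6))     # stand-in for external features
probs = forward(feature_map,
                rng.normal(size=(8, 36)),  # hypothetical dense weights
                rng.normal(size=(4, 8)))   # hypothetical output weights
```

Because the 1 × 1, stride-1 pooling passes the map through unchanged, all downsampling in this configuration effectively happens before the network, in the external feature extraction.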
[Figure: images from the database are used for training the model; a query image is classified into the appropriate group of the database by the classification layer of the proposed CNN architecture.]
A = (1/2) Σ_{i=1}^{n−1} (x_{i+1} − x_i)(y_{i+1} − y_i)    (3)

(b) Circularity (F_circ): It calculates the extent of the circular nature of the extracted contour and can be calculated using Eq. (4).

F_circ = 1 − 4πA/P²    (4)

(c) DFT coefficients of the RL function: The RL function is defined in Eq. (6). Radial length is the distance of points in the contour from the centroid of the contour. If x_c and y_c are the coordinates of the contour centroid, then the radial length RL(i) of the ith point (x_i, y_i) is given as:
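The contour features around Eqs. (3) and (4) and the RL function can be sketched as follows. This is a hedged illustration, not the authors' implementation: the area is computed with the standard shoelace form (the printed Eq. (3) is too garbled in this excerpt to transcribe exactly), the RL equation body is missing from the excerpt so only its verbal definition is coded, and the DC-normalization of the DFT magnitudes is an assumption for scale invariance.

```python
import numpy as np

# Hedged sketch: polygon area (shoelace form), circularity per Eq. (4),
# radial-length (RL) function, and DFT coefficients of RL.

def contour_area(x, y):
    # Shoelace formula for a closed contour given as vertex arrays
    return 0.5 * abs(np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y))

def circularity(area, perimeter):
    # Eq. (4): F_circ = 1 - 4*pi*A / P^2 (0 for a perfect circle)
    return 1.0 - 4.0 * np.pi * area / perimeter**2

def radial_length(x, y):
    # RL(i): distance of each contour point from the contour centroid
    xc, yc = x.mean(), y.mean()
    return np.hypot(x - xc, y - yc)

def rl_dft_features(x, y, k=8):
    # Magnitudes of the first k DFT coefficients of the RL function,
    # normalized by the DC term (normalization is an assumption here)
    spec = np.abs(np.fft.fft(radial_length(x, y)))
    return spec[1:k + 1] / spec[0]

# Usage on a densely sampled circle of radius 3
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
x, y = 3.0 * np.cos(t), 3.0 * np.sin(t)
perim = np.sum(np.hypot(np.diff(np.r_[x, x[0]]), np.diff(np.r_[y, y[0]])))
```

For a circle the sketch behaves as the definitions predict: the area approaches πr², the circularity approaches 0, and the RL function is constant, so its DFT has essentially no energy outside the DC term.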
Fig. 5 Images obtained from ELCAP lung database for 10 different groups
Results and Discussions

Two datasets of CT scan images of lungs (http://image.diku.dk/emphysema_database/, http://www.via.cornell.edu/lungdb.html), profoundly used by the research community, have been used. These datasets comprise 124 and 100 samples from different patients, as summarized in Table 1. The results are obtained on a computational device with 8 GB RAM and a 2.0 GHz processor. MATLAB is used for simulation purposes.

The first dataset comprises CT scan images of around 39 subjects (9 never-smokers, 10 smokers, and 20 smokers with chronic obstructive pulmonary disease (COPD or emphysema)) (http://image.diku.dk/emphysema_database/).
Fig. 6 a Sample image (database 1). b Active contour extracted after Eq. (1) applied once. c Contour extracted after Eq. (1) applied 10 times
Fig. 7 a Sample image (database 2). b Active contour extracted after Eq. (1) applied once. c Contour extracted after Eq. (1) applied 10 times
There are 124 slices obtained from the top, middle, and lower views of the lungs. All images are divided into four groups for classification: normal tissue (NT), centrilobular emphysema (CLE), paraseptal emphysema (PSE), and pan-lobular emphysema (PLE).

The second dataset is obtained from the ELCAP lung database. Out of the four repositories available at (http://www.via.cornell.edu/lungdb.html), we selected the W0011-W0020 repository. It has CT scan images for 10 different groups, and image classification is done for these 10 groups. Each slice is of 1.25 mm thickness and is obtained in a single breath hold using CT scan. The dataset also provides the locations of nodules in the lungs of the patients. The details of these two datasets are given in Table 2. Images of patients from the four different groups of dataset 1 and the 10 different groups of dataset 2 are given in Figs. 4 and 5 respectively.

Figure 6a shows an original image from database 1 from which the active contour region is extracted; accordingly, Fig. 6b, c shows the contour generation when Eq. (1) is applied once and then 10 times respectively. Similarly, Fig. 7 shows the sample image and its contour generation from database 2.

From these active contours, six feature vectors have been computed as described in "Proposed Work." These feature vectors are then fed into the proposed CNNs for the classification of images into the different groups. The parameters for the different CNNs used in the experimental results are summarized in Table 2.

Thereafter, the recall and precision are calculated for each group. The results of the different algorithms are compared based on the average retrieval precision (ARP) and average retrieval rate (ARR) respectively, which can be calculated using Eqs. (15) and (17).

ARP = (Σ_{i=0}^{w} P(i)) / w, for q ≤ 10    (15)

P(i) = (number of relevant images retrieved of a group) / (total number of images retrieved for that group)    (16)

Further, w is the number of images in the database and q is the number of similar images retrieved from the database. Similarly,

ARR = (Σ_{i=0}^{w} R(i)) / w, for q ≥ 10    (17)

where R(i) denotes the recall value and is defined as

R(i) = (number of relevant images retrieved of a group) / (total number of relevant images in the database for that group)    (18)

A detailed comparison of the recall and precision values obtained for the various algorithms is summarized in Tables 3, 4, 5, and 6. Tables 3 and 5 compare the various algorithms on recall values in the case of database 1 and database 2 respectively, whereas Tables 4 and 6 compare precision for the various algorithms in the case of databases 1 and 2 respectively. Each
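The evaluation measures of Eqs. (15)–(18) can be sketched as below, read here as per-group precision P(i) and recall R(i) averaged into ARP and ARR. The group counts in the example are invented for illustration; the paper computes these over its own databases with its own w and q values.

```python
# Hedged sketch of Eqs. (15)-(18): per-group precision and recall,
# averaged into ARP and ARR. Example counts are made up.

def precision(relevant_retrieved, total_retrieved):
    # Eq. (16): fraction of retrieved images that are relevant
    return relevant_retrieved / total_retrieved

def recall(relevant_retrieved, relevant_in_db):
    # Eq. (18): fraction of a group's relevant images that were retrieved
    return relevant_retrieved / relevant_in_db

def arp(precisions):
    # Eq. (15): average retrieval precision over the groups
    return sum(precisions) / len(precisions)

def arr(recalls):
    # Eq. (17): average retrieval rate (recall) over the groups
    return sum(recalls) / len(recalls)

# Per group: (relevant retrieved, total retrieved, relevant in database)
groups = [(8, 10, 12), (9, 10, 15), (10, 10, 10)]
precisions = [precision(rr, tr) for rr, tr, rd in groups]
recalls = [recall(rr, rd) for rr, tr, rd in groups]
avg_p, avg_r = arp(precisions), arr(recalls)
```

Note the usual trade-off these definitions encode: retrieving more images per query tends to raise R(i) (more relevant items found) while lowering P(i) (more non-relevant items mixed in), which is why the paper reports both averages.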
References

1. Siegel RL, Miller KD, Jemal A: Cancer statistics, 2016. CA: A Cancer Journal for Clinicians 66(1):7–30, 2016
2. De Azevedo-Marques PM, Mencattini A, Salmeri M, Rangayyan RM, Eds.: Medical Image Analysis and Informatics: Computer-Aided Diagnosis and Therapy. CRC Press, Taylor and Francis, 2017
3. Purwar RK, Srivastava V: Recent advancements in detection of cancer using various soft computing techniques for MR images. In: Progress of Advanced Computing and Intelligent Engineering. Singapore: Springer, 2018, pp. 99–108
4. Sluimer I, Schilham A, Prokop M, Van Ginneken B: Computer analysis of computed tomography scans of the lung: A survey. IEEE Transactions on Medical Imaging 25(4):385–405, 2006
5. Coppini G, Miniati M, Paterni M, Monti S, Ferdeghini EM: Computer-aided diagnosis of emphysema in COPD patients: Neural-network-based analysis of lung shape in digital chest radiographs. Medical Engineering and Physics 29(1):76–86, 2007
6. Li X, Chen L, Chen J: A visual saliency-based method for automatic lung regions extraction in chest radiographs. In: 14th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). IEEE, 2017, pp. 162–165
7. Abbasi S, Mokhtarian F, Kittler J: Curvature scale space image in shape similarity retrieval. Multimedia Systems 7(6):467–476, 1999
8. Tsochatzidis L, Zagoris K, Arikidis N, Karahaliou A, Costaridou L, Pratikakis I: Computer-aided diagnosis of mammographic masses based on a supervised content-based image retrieval approach. Pattern Recognition 71:106–117, 2017
9. Wang XH, Park SC, Zheng B: Assessment of performance and reliability of computer-aided detection scheme using content-based image retrieval approach and limited reference database. Journal of Digital Imaging 24(2):352–359, 2011
10. Park YS, Seo JB, Kim N, Chae EJ, Oh YM, Lee SD, Lee Y, Kang S-H: Texture-based quantification of pulmonary emphysema on high-resolution computed tomography: Comparison with density-based quantification and correlation with pulmonary function test. Investigative Radiology 43(6):395–402, 2008
11. Nanni L, Lumini A, Brahnam S: Local binary patterns variants as texture descriptors for medical image analysis. Artificial Intelligence in Medicine 49(2):117–125, 2010
12. Moura DC, Guevara López MA: An evaluation of image descriptors combined with clinical data for breast cancer diagnosis. International Journal of Computer Assisted Radiology and Surgery 8(4):561–574, 2013
13. Srivastava V, Purwar R: An extension of local mesh peak valley edge based feature descriptor for image retrieval in bio-medical images. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal 7(1):77–89, 2018
14. Pang S, Yu Z, Orgun MA: A novel end-to-end classifier using domain transferred deep convolutional neural networks for biomedical images. Computer Methods and Programs in Biomedicine 140:283–293, 2017
15. Karabulut EM, Ibrikci T: Emphysema discrimination from raw HRCT images by convolutional neural networks. In: 9th International Conference on Electrical and Electronics Engineering (ELECO). IEEE, 2015, pp. 705–708
16. Chung YA, Weng WH: Learning deep representations of medical images using Siamese CNNs with application to content-based image retrieval. arXiv:1711.08490, 2017
17. Moeskops P, Wolterink JM, van der Velden BH, Gilhuijs KG, Leiner T, Viergever MA, Išgum I: Deep learning for multi-task medical image segmentation in multiple modalities. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer, 2016, pp. 478–486
18. Bermejo-Peláez D, San José Estépar R, Ledesma-Carbayo MJ: Emphysema classification using a multi-view convolutional network. In: 15th International Symposium on Biomedical Imaging (ISBI 2018). IEEE, 2018, pp. 519–522
19. Gao M, Xu Z, Lu L, Harrison AP, Summers RM, Mollura DJ: Holistic interstitial lung disease detection using deep convolutional neural networks: Multi-label learning and unordered pooling. arXiv:1701.05616, 2017
20. Campo MI, Pascau J, San José Estépar R: Emphysema quantification on simulated X-rays through deep learning techniques. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). IEEE, 2018, pp. 273–276
21. Nanni L, Ghidoni S, Brahnam S: Handcrafted vs. non-handcrafted features for computer vision classification. Pattern Recognition 71:158–172, 2017
22. Srivastava V, Purwar RK, Jain A: A dynamic threshold-based local mesh ternary pattern technique for biomedical image retrieval. International Journal of Imaging Systems and Technology 29(2):168–179, 2018
23. Liu H, Xu J, Wu Y, Guo Q, Ibragimov B, Xing L: Learning deconvolutional deep neural network for high resolution medical image reconstruction. Information Sciences 468:142–154, 2018
24. Hua K-L, Hsu C-H, Hidayati SC, Cheng W-H, Chen Y-J: Computer-aided classification of lung nodules on computed tomography images via deep learning technique. OncoTargets and Therapy 8, 2015
25. Simonyan K, Zisserman A: Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014
26. Wozniak P, Afrisal H, Esparza RG, Kwolek B: Scene recognition for indoor localization of mobile robots using deep CNN. In: International Conference on Computer Vision and Graphics. Cham: Springer, 2018, pp. 137–147
27. Hoo-Chang S, Roth HR, Gao M et al.: Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Transactions on Medical Imaging 35(5):1285–1298, 2016. https://doi.org/10.1109/TMI.2016.2528162
28. Chan TF, Sandberg BY, Vese LA: Active contours without edges for vector-valued images. Journal of Visual Communication and Image Representation 11(2):130–141, 2000
29. Nanni L, Paci M, Brahnam S, Ghidoni S: An ensemble of visual features for Gaussians of local descriptors and non-binary coding for texture descriptors. Expert Systems with Applications 82:27–39, 2017

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.