
Machine Vision and Applications (2020) 31:4

https://doi.org/10.1007/s00138-019-01055-3

ORIGINAL ARTICLE

Detection of difficult airway using deep learning


Kevin Aguilar1 · Germán H. Alférez1 · Christian Aguilar2

Received: 30 May 2019 / Revised: 24 September 2019 / Accepted: 6 December 2019


© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract
When a patient undergoes surgery that requires general anesthesia, he/she must be intubated, and an anesthesiologist must first examine the patient to evaluate his/her airway. This check anticipates problems such as a difficult airway at the time of anesthesia. In fact, the inadequate detection of a difficult airway can cause serious complications, even death. This research work proposes a mobile app that uses a convolutional neural network to detect a difficult airway. The model distinguishes two classes of the Mallampati score, namely Mallampati 1–2 (with low risk of difficult airway) and Mallampati 3–4 (with higher risk of difficult airway). A total of 240 pictures were used for training the model. The average accuracy of the predictive model for classifying pictures is 88.5%, and sensitivity and specificity were 90% on average.

Keywords Difficult airway · Deep learning · Convolutional neural networks

✉ Germán H. Alférez
harveyalferez@um.edu.mx
Kevin Aguilar
kevinaguilar@um.edu.mx
Christian Aguilar
coordmedicina@um.edu.mx, anestesiaguilar@gmail.com
1 Facultad de Ingeniería y Tecnología, Universidad de Montemorelos, Av. Libertad 1300 Poniente, Barrio Matamoros, 67530 Montemorelos, N.L., Mexico
2 Escuela de Medicina, Universidad de Montemorelos, Av. Libertad 1300 Poniente, Barrio Matamoros, 67530 Montemorelos, N.L., Mexico

1 Introduction

One of the biggest fears that anesthesiologists have is to confront a patient with a difficult airway, whether previously diagnosed or, in the worst case, unexpected. A difficult airway is defined as the need for three or more attempts to intubate the trachea or more than 10 minutes to achieve it [1]. It occurs in approximately 1.5–8% of procedures where general anesthesia is used. The “non-intubable patient” or “non-ventilatory patient” situation occurs in 1/50,000 patients. Likewise, failure of orotracheal intubation occurs in 1/2000 programmed cases, increasing to 1/200 cases in emergency rooms. In pregnant women, the incidence of difficult intubation is 7.9%, and that of very difficult intubation is 2% [2]. This is why the anesthesiologist must evaluate the patient's airway before anesthesia to develop an appropriate plan of anesthetic management [3–5].

Since 1993, the American Society of Anesthesiologists (ASA) has published its guidelines for the management of the difficult airway. These guidelines give very specific guidance to the anesthesiologist for the management of these cases. Specifically, they focus on maintaining good ventilation and oxygenation, since a failed intubation can cause an increase in morbidity and mortality [1,6].

Based on this concern for patient safety, modified protocols have been created in which, depending on the context of the country or hospital, it is proposed to add new airway management devices, checklists, or technology that may be useful at the moment of crisis. Although this whole verification system for patient safety is documented, difficult intubation is still underestimated, whether because of the specialist’s ego, the lack of supporting devices, or simply the anesthesiologist's lack of experience and/or ability [7,8]. This has made the issue one of the hot topics in any course or conference of anesthesiology worldwide. Difficult airway was, is, and will continue to be a topic to be addressed because of the subjective and objective evaluation that an anesthesiologist makes of a difficult airway.

In fact, “the ASA Closed Claims reveals that 34% of the demands to anesthetists are related to airway events, and that the difficulty of intubation has been the most common cause of damage since the 90s” [2]. This percentage is high, and so is the impact that a failed intubation can have, since the risk of hypoxic injury or death of the patient is always present. The inability to successfully manage a difficult airway is responsible for 600 annual deaths and 25–30% of deaths attributable to anesthesia [9,10].
Our contribution is to present how a convolutional neural network can be used as a tool to detect a difficult airway via a mobile app. Specifically, our predictive model is able to classify pictures according to two classes of the Mallampati score, Mallampati 1–2 (low risk) and Mallampati 3–4 (high risk). The evaluation of the classification model shows an average accuracy of 88.5%. Also, the results of cross-validation in terms of sensitivity and specificity were 90% on average.

This paper is organized as follows. Section 2 presents the panorama of mobile apps in the domain of difficult airway. Section 3 presents the underlying concepts of our approach. Section 4 presents the methodology followed in this research work. Section 5 presents the evaluation results. Section 6 presents the conclusions and future work.

2 Mobile apps in the context of difficult airway

In recent years, there has been a scientific and industrial awakening in the world focused on artificial intelligence (AI), not only in topics such as finance, transport or robotics, but also in medicine. This can be evidenced by apps that help diagnose or support the doctor in making decisions. For example, at Universidad de Montemorelos, research projects have been carried out that have shown the potential of machine learning to detect melanoma [11] and glaucoma [12]. In addition, AI has been used to study the relationship between dental caries and diabetes [13].

However, as far as we know, there are only two apps to support anesthesiologists in their work, and neither makes use of machine learning. The first is an app that provides the patient with a file with his/her medical record so that, if in the future there is any need for surgical intervention that requires intubation, the doctors can use this information to realize that this person has a difficult airway [14]. The second app focuses on the experiences of users. In this way, either a doctor or other staff who performed an intervention and had problems with a patient due to a difficult airway can share that experience so other people can learn from previous case studies [15].

3 Underpinnings of our approach

This section presents the fundamental concepts of our approach as shown in Fig. 1.

Fig. 1 Underpinnings of our approach

3.1 Difficult airway

A difficult airway is defined as the need for three or more attempts to intubate the trachea or more than 10 min to achieve it [1]. It is also defined as the clinical situation in which an experienced anesthesiologist with conventional training has difficulty with facial mask ventilation of the upper airway, with endotracheal intubation, or with both. It occurs in approximately 1.5–8% of procedures where general anesthesia is used. Intubation is often a challenge for anesthesiologists because multiple parameters are involved in predicting a difficult intubation [4].

It is important to mention that a difficult airway can even be caused by the interaction among the patient, the anesthesiologist, the available equipment, and other circumstances. For example, a previously diagnosed normal airway can become a difficult one if the anesthesiologist has no experience in intubation or if there was an error in the doses of the drugs that bring the patient to an adequate anesthetic plane for the procedure. Another example has to do with “fixation” on the airway. In this case, the anesthesiologist loses sight of the surrounding situation, such as the patient's general vital signs, while intubating. These situations generate stress in the anesthesiologist before, during, and after the procedure, especially if the institution in which they work does not provide the necessary support in terms of technology and established protocols, or simply because they lack the necessary basics for a planned or unforeseen difficult airway.

There are several tests to detect a difficult airway, for instance the Mallampati score, the thyromental distance, and the sternomental distance. It is even possible to combine these tests to make the diagnosis more robust.

3.2 Mallampati score

The Mallampati score was created as a tool for the anesthesiologist to detect a difficult airway. This score gives an early


indication of patient cooperation and gives a quick assessment of other parameters by observing the patient's open mouth [16]. The score is obtained by visualizing the anatomy of the oral cavity. It is meant to identify a large tongue that obscures the oropharyngeal structures and also evaluates how much the mouth opens.

In the Mallampati score, the anesthesiologist notes whether the base of the uvula, faucial pillars, and soft palate are visible. If the tongue is relatively large, the patient is more likely to be difficult to intubate using direct laryngoscopy before general anesthesia [5,17,18]. Depending on this evaluation, the person can be classified in one out of four classes as shown in Fig. 2.

Fig. 2 Mallampati score [17]

Although the Mallampati score has four possible classes, in clinical practice and in research it is commonly dichotomized into classes 1 and 2 (low risk) versus classes 3 and 4 (higher risk). This binary risk assessment is then used to predict a research outcome of either difficult laryngoscopy or difficult intubation [17]. Table 1 describes the definition of each one of the four possible scores of Mallampati and their clinical interpretation.

Table 1 Definition of Mallampati score and clinical interpretation [16,18,19]
Class     Visible structures                                    Predicted intubation
Class 1   Soft palate, fauces, uvula, and pillars               Easy
Class 2   Soft palate, fauces, and base of uvula                Easy
Class 3   Only soft palate                                      Difficult
Class 4   Soft palate cannot be visualized, only hard palate    Difficult

Nowadays, the Mallampati score is one of the most used methods for the detection of difficult airway worldwide [17]. The Mallampati score has a sensitivity of 80%, a specificity of 50%, and a low predictive value (< 50%) [20,21]. However, it is the starting point of the airway evaluation that many anesthesiologists rely on [4,18].

3.3 Machine learning

Machine learning is the science and art of programming computers so that they can learn from data. It can also be described as the field of study that gives computers the ability to learn without being explicitly programmed [22].

Learning is a multifaceted phenomenon. Learning processes include the acquisition of new declarative knowledge, the development of motor and cognitive skills through instruction or practice, the organization of new knowledge into general, effective representations, and the discovery of new facts and theories through observation and experimentation. Since the beginning of the computer age, researchers have endeavored to implement such capabilities in computers. However, solving this problem has been an increasingly challenging and fascinating long-range objective in the field of AI [23].

3.4 Deep learning

Deep learning is a form of machine learning that allows computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, it is not necessary for a human computer operator to formally specify all the knowledge the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them from simpler ones [24].

Artificial neural networks (ANNs) are the core of deep learning. ANNs are versatile, powerful, and scalable, which makes them ideal for addressing large and complex machine learning tasks, such as classifying billions of images (for example, Google Images). They also serve to enhance voice recognition services (for example, Apple’s Siri), recommend videos to hundreds of millions of users every day (for example, YouTube), or learn to beat the world champion in the game of Go by examining millions of previous games and then playing against itself (for example, DeepMind’s AlphaGo) [22].

3.5 Convolutional neural networks

A convolutional neural network (CNN) is a deep learning algorithm that can take an input image, assign importance (learnable weights and biases) to several aspects/objects in the image, and differentiate one from the other. The architecture of a CNN is analogous to the pattern of connectivity of neurons in the human brain and was inspired


by the organization of the visual cortex. Individual neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. A collection of these fields is superimposed to cover the entire visual area [25]. CNNs are commonly used for the analysis of images.

A convolutional neural network has the following topology [26]:

– Convolution layer: In the convolution layer, the dot product is calculated between an area of the input image(s) and a weight matrix called a filter. The filter slides across the image, repeating the same dot product operation. For example, Fig. 3 shows a 3 × 3 image and a 2 × 2 filter. The last matrix shows the result of the convolution.
– Pooling or sub-sampling layer: This layer is used to reduce the spatial dimensions, but not the depth, in a CNN. In this layer, max pooling can be used so that the highest number in each input area (an n × m window) is taken. Figure 4 shows an example with the max pooling operation.
– Nonlinearity layer: The ReLU activation function is used in the nonlinearity layer. This function returns 0 for each negative value in its input and returns the same value for each positive value. Figure 5 shows an example of the ReLU operation.
– Fully connected layer: In this layer, the output of the last convolution layer is flattened and each node of the current layer is connected to every node of the next layer.

Fig. 3 Convolution operation

Fig. 4 Max pooling

Fig. 5 ReLU activation function

3.6 TensorFlow

TensorFlow (https://www.tensorflow.org) is an open source software library for distributed numerical computation using data flow graphs. TensorFlow was created by Google and is used in many of its machine learning applications [22]. TensorFlow, as the name implies, is a framework to define and execute calculations involving tensors. A tensor is a generalization of vectors and matrices to potentially higher dimensions. Internally, TensorFlow represents tensors as n-dimensional arrays of base datatypes. Each element in the tensor has the same data type, and the data type is always known. The shape (that is, the number of dimensions it has and the size of each dimension) may be only partially known. The rank of a tensor is its number of dimensions [27].

3.7 MobileNetV2

MobileNet is a family of computer vision models for mobile technology with TensorFlow. MobileNet was designed to effectively maximize the accuracy of the models while being mindful of the restricted resources of an on-device or embedded application. These models are small, low latency, low power, and parameterized to meet the resource constraints of a variety of use cases. They can be built for classification, detection, embeddings, and segmentation in a similar way to how other popular large-scale models, such as Inception, are used [28].

MobileNet is based on an optimized architecture that uses depthwise separable convolutions to build light and deep neural networks. MobileNetV2 improves the performance of mobile models in multiple tasks and benchmarks, as well as across a spectrum of different model sizes. The topology of MobileNetV2 is based on an inverted residual structure in which the input and output of the residual block are thin bottleneck layers, contrary to traditional residual models that use expanded representations in the input. In addition, MobileNetV2 uses lightweight depthwise convolutions to filter features in the intermediate expansion layer [29].

MobileNetV2 is based on the ideas of MobileNetV1 and uses depthwise separable convolutions as efficient building blocks [30]. However, MobileNetV2 introduces two new features to the architecture: linear bottlenecks between layers and shortcut connections between bottlenecks. Bottlenecks encode the intermediate inputs and outputs of the model, while the inner layer encapsulates the model's ability to transform from lower-level concepts, such as pixels, to


higher-level descriptors, such as image categories. Finally, as with traditional residual connections, shortcuts allow faster training and better accuracy [31].

Table 2 shows how bottleneck blocks are organized in MobileNetV2. t represents the expansion rate of the channels, c represents the number of output channels, and n the number of times the block is repeated. Finally, s is the stride of the first repetition of a block, that is, it indicates whether that repetition used a step of 2 in the sampling reduction process [32].

Table 2 MobileNetV2 architecture
Input           Operator        t    c       n    s
224² × 3        conv2d          –    32      1    2
112² × 32       bottleneck      1    16      1    1
112² × 16       bottleneck      6    24      2    2
56² × 24        bottleneck      6    32      3    2
28² × 32        bottleneck      6    64      4    2
14² × 64        bottleneck      6    96      3    1
14² × 96        bottleneck      6    160     3    2
7² × 160        bottleneck      6    320     1    1
7² × 320        conv2d 1 × 1    –    1280    1    1
7² × 1280       avgpool 7 × 7   –    –       1    –
1 × 1 × 1280    conv2d 1 × 1    –    k       –

4 Methodology

The steps carried out in each of the stages of this research work are described below.

4.1 Take pictures of patients and classify the pictures for training

A total of 260 pictures were taken of the oral cavity of freshman students of the School of Medicine of Universidad de Montemorelos, one picture per student (240 for training and 20 for testing). These pictures were taken over a period of one month with an iPhone 6s by a medical student. To take the pictures, the following steps were carried out:

1. Each freshman medical student was asked to participate in this research.
2. The student who wished to participate signed a consent to accept the procedure and to allow the use of his/her picture. In order to take pictures of the Mallampati score, the student was asked to open his/her mouth as wide as possible and to stick his/her tongue out as far as possible.
3. The student was photographed. To take the picture, a short distance to the person's mouth and good illumination were required. Figure 6 shows an example of a well-taken picture.
4. Finally, the pictures were grouped by an anesthesiologist in folders corresponding to their classes, Mallampati 1–2 (with low risk of difficult intubation) and Mallampati 3–4 (with high risk of difficult intubation). In the low-risk class folder, the anesthesiologist put 66 pictures of Mallampati 1 and 59 pictures of Mallampati 2. In the high-risk folder, he put 61 pictures of Mallampati 3 and 54 pictures of Mallampati 4.

Fig. 6 A picture used for training

4.2 Train the predictive model with deep learning

In this step, the MobileNetV2 convolutional neural network was retrained to create a new classification model based on the Mallampati pictures. Although it is possible to train MobileNetV2 from scratch with the Mallampati pictures, it is computationally expensive. Therefore, we decided to retrain the original MobileNetV2 model, which is trained with images from the ImageNet dataset.

The MobileNetV2 topology was chosen because it implements a residual neural network and batch normalization for model training. These are two key aspects to achieve a good optimization in terms of accuracy and performance. Also, when comparing MobileNetV2 and InceptionV3, MobileNetV2 yields a smaller model in terms of megabytes [33]. This is an important aspect in this research so that health specialists can use this app even on low-end devices. Table 3 shows the comparison between the InceptionV3 and the MobileNetV2 models in terms of their size before and after retraining. The model generated with MobileNetV2 is around 10 times smaller than the model generated with InceptionV3.

In order to retrain the model, it was necessary to install the following tools:

1. Anaconda version 5.2.0 (https://www.anaconda.com/distribution/#download-section) was installed on Windows 10.
2. An environment with Python version 3.5.5 was created. In this environment, TensorFlow version 1.10.0 was installed.
3. The retrain.py code was downloaded from the TensorFlow GitHub project (http://bit.ly/tensorflow-retrain). This code loads the pre-trained


model with MobileNetV2, trained with images from the ImageNet dataset, and retrains the classification model with the Mallampati pictures taken at the School of Medicine of Universidad de Montemorelos. The learning rate was set to 0.01 during the retraining of the model. The command with which the model was retrained is the following:

    python -m scripts.retrain --output_graph=retrained_graph.pb --output_labels=retrained_labels.txt --image_dir=pictures --tfhub_module https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/2

This command is composed of the following parts:

– The script to retrain the model: retrain.py
– The model to be created: retrained_graph.pb
– The list with the labels for classification: retrained_labels.txt
– The folder where the pictures are located. In this case, the folder named “pictures” contains two folders, one with the Mallampati 1–2 class and another with the Mallampati 3–4 class
– The topology on which the model will be retrained. In this case, the retraining was carried out on the topology of Mobilenet_V2_1.0_224

Table 3 Size comparison between the InceptionV3 and the MobileNetV2 models
Model                   Original size (MB)   Generated size (after retraining) (MB)
Inception_V3            95.3                 83.4
Mobilenet_V2_1.0_224    14.0                 8.76

Ten images of each of the two classes, apart from those that were used during training, were set aside for testing. This corresponds to five images of Mallampati 1, five images of Mallampati 2, five images of Mallampati 3, and five images of Mallampati 4.

The retraining of the model took 6 min on a computer with the following features: Intel Core i7-7700HQ 2.8 GHz, 16 GB DDR4 1300 MHz RAM, NVIDIA GeForce GTX 1060 Max-Q 6 GB GDDR5 8008 MHz graphics card, and operating system Windows 10 version 1809.

4.3 Construction of the mobile app for the detection of difficult airway

In this step, a mobile app that uses the trained model for the detection of a difficult airway was created. The app was built on Android Studio (https://developer.android.com/studio#downloads). To use the classification model, the Classifier and TensorFlowImageClassifier classes of TFMobile were added to the Android Studio project. TFMobile is the API that makes it possible to use classification models on mobile devices. The Classifier class is a generic interface to interact with different recognition engines, while the TensorFlowImageClassifier class implements it and creates the classifier to tag the pictures using TensorFlow. These classes were taken from the official TensorFlow repository on GitHub (http://bit.ly/tensorflow-android-demo) for Android devices.

Figure 7 shows the workflow of the app. The first screen shows the instructions to take the picture (e.g., good lighting). The trained classification model is loaded as soon as the user clicks on the button to go to the second screen. In the third screen, the camera is opened and the user can take the picture. Once the picture is taken, the app shows the Mallampati class with the accuracy score of the classification in the fourth screen.

A fragment of the code to load the model and classify the image is presented in Listing 1. In lines 8–14, the variables that are needed to load the model are declared. These variables are described below:

– INPUT_SIZE is the input size of the model. In our case, the MobileNetV2 topology that was used has an input size of 224 × 224 pixels.
– IMAGE_MEAN and IMAGE_STD indicate the expected input range of the neural network that is being used. In our case, they are set for pixel values in the [0, 255] range.
– INPUT_NAME and OUTPUT_NAME are the names of the input and output nodes of the model, which come from the retrain.py training script.
– MODEL_FILE and LABEL_FILE are the files of the model and its labels, which are inside the assets folder of the app.

In line 16, the classification interface is declared. In lines 27–46, the loadModel function that loads the classifier model is specified. This function takes the previously defined variables to create the classifier. It uses the TensorFlowInferenceInterface, which is responsible for loading the model using the assetManager. This is similar to a tf.Session in TensorFlow.

In lines 48–52, the analyze function that classifies the picture is specified. Specifically, this function uses the model


to classify a picture and then returns the results of the classification in a list that contains the predicted class and its accuracy.

Fig. 7 Application workflow

Table 4 Evaluation results of pictures classified by an anesthesiologist as Mallampati I
                                 (%)      (%)
Average accuracy of the model    90.75    9.25

Table 5 Evaluation results of pictures classified by an anesthesiologist as Mallampati II
                                 (%)      (%)

 1 import android.graphics.Bitmap;
 2 import java.util.List;
 3 import java.util.concurrent.Executor;
 4 import java.util.concurrent.Executors;
 5 // Fragment: Android and classifier imports are omitted; an Activity base class is assumed.
 6 public class MallampatiScore extends Activity {
 7   // Model configuration (see the description of these variables in Sect. 4.3)
 8   public int INPUT_SIZE = 224;
 9   public int IMAGE_MEAN = 0;
10   public float IMAGE_STD = 255.0f;
11   public String INPUT_NAME = "Placeholder";
12   public String OUTPUT_NAME = "final_result";
13   private String MODEL_FILE = "file:///android_asset/graph.pb";
14   private String LABEL_FILE = "file:///android_asset/labels.txt";
15
16   public Classifier classifier;
17   public Executor executor = Executors.newSingleThreadExecutor();
18
19   @Override
20   protected void onCreate(Bundle savedInstanceState) {
21     super.onCreate(savedInstanceState);
22     setContentView(R.layout.activity);
23
24     loadModel();
25   }
26
27   private void loadModel() {
28     executor.execute(new Runnable() {  // load the model on a background thread
29       @Override
30       public void run() {
31         try {
32           classifier = TensorFlowImageClassifier.create(
33               getAssets(),
34               MODEL_FILE,
35               LABEL_FILE,
36               INPUT_SIZE,
37               IMAGE_MEAN,
38               IMAGE_STD,
39               INPUT_NAME,
40               OUTPUT_NAME);
41         } catch (final Exception e) {
42           throw new RuntimeException("Error Loading Model!", e);
43         }
44       }
45     });
46   }
47
48   public List<Classifier.Recognition> analyze(Bitmap bitmap) {
49     bitmap = Bitmap.createScaledBitmap(bitmap, INPUT_SIZE, INPUT_SIZE, false);  // resize to the model input
50     final List<Classifier.Recognition> results = classifier.recognizeImage(bitmap);
51     return results;
52   }
53 }

Listing 1 Fragment of code to load the model and classify the picture.
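
To make the role of IMAGE_MEAN and IMAGE_STD in Listing 1 more concrete, the following Python sketch shows one common way such constants are applied to a packed ARGB pixel. It assumes that each color channel is normalized as (value - IMAGE_MEAN) / IMAGE_STD before inference; this normalization is an assumption for illustration and is not stated in the paper. The function name normalize_pixel is hypothetical.

```python
# Illustrative sketch only: the normalization formula below is an assumption,
# not taken from the paper. The constants mirror the values in Listing 1.
IMAGE_MEAN = 0
IMAGE_STD = 255.0


def normalize_pixel(argb, mean=IMAGE_MEAN, std=IMAGE_STD):
    """Split a packed ARGB integer into normalized (r, g, b) floats."""
    r = (argb >> 16) & 0xFF   # red channel, 0..255
    g = (argb >> 8) & 0xFF    # green channel, 0..255
    b = argb & 0xFF           # blue channel, 0..255
    return tuple((channel - mean) / std for channel in (r, g, b))


white = normalize_pixel(0xFFFFFFFF)   # opaque white
black = normalize_pixel(0xFF000000)   # opaque black
print(white, black)
```

Under this assumption, a mean of 0 and a standard deviation of 255 map pixel values from the [0, 255] range to the [0, 1] range.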

5 Evaluation results

The evaluation of the classification model was made with 20 pictures. Tables 4, 5, 6, and 7 show the results of the evaluations with 10 pictures of Mallampati 1 and 2 (five pictures of Mallampati 1 and five pictures of Mallampati 2) and 10 pictures of Mallampati 3 and 4 (five pictures of Mallampati 3 and five pictures of Mallampati 4). For each picture, the tables show the percentage of accuracy of the model when Mallampati 1–2 or Mallampati 3–4 is detected. The average accuracy of the model generated was 88.5%.

Table 5 (average row)
                                 (%)      (%)
Average accuracy of the model    82.77    17.23

Table 6 Evaluation results of pictures classified by an anesthesiologist as Mallampati III
                                 (%)      (%)
Average accuracy of the model    7.45     92.55

Table 7 Evaluation results of pictures classified by an anesthesiologist as Mallampati IV
                                 (%)      (%)
Average accuracy of the model    12.06    87.94

In the evaluations of Mallampati 1 and Mallampati 3 in Tables 4 and 6, respectively, all images were correctly classified. In the Mallampati 2 evaluation in Table 5, four out of five images were classified correctly. The same holds for the evaluation of Mallampati 4 in Table 7.

During model testing, there were some cases where the model misclassified pictures. For instance, Fig. 8 shows two pictures that were misclassified. This is attributed to the fact that the pictures were not taken properly (e.g., without an open mouth and the tongue stuck out). Based on this, it is very important to emphasize that the instructions for taking the pictures must be followed so that optimal results are obtained.

Fig. 8 Misclassified test images

Table 8 shows the sensitivity and specificity results for the generated model. Sensitivity and specificity are defined as follows [34]:

– Sensitivity: How good the model is at detecting patients with a difficult airway. The formula to obtain the sensitivity is the following:

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

– Specificity: How good the model is at detecting patients without a difficult airway. The formula to obtain the specificity is the following:

$$\text{Specificity} = \frac{TN}{TN + FP}$$

The concepts of True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) in the context of this study are described below:

– TP: a person who was told to have a difficult airway and really has it
– TN: a person who was told not to have a difficult airway and really does not have it
– FP: a person who was told to have a difficult airway, although he/she does not have it
– FN: a person who was told not to have a difficult airway even though he/she really does

Table 8 Test results
                  Sensitivity   Specificity
Mallampati 1–2    0.9           0.9
Mallampati 3–4    0.9           0.9
Average           0.9           0.9

Table 8 shows that both sensitivity and specificity had an average of 90%. These values are 10 and 40 percentage points higher, respectively, than the sensitivity and specificity reported for the Mallampati score in medical practice [20]. There is similarity between the classification obtained by an expert anesthesiologist and by the classification model of the mobile app.

6 Conclusions and future work

This research work demonstrates that deep learning can be used as a tool for detecting a difficult airway in pre-surgical patients by means of a smartphone. An app that automates the detection of the Mallampati score can be used to support the subjective and objective decisions of the anesthesiologist.

Through TensorFlow, the MobileNetV2 model was retrained with 240 pictures of freshman students taken at the School of Medicine of Universidad de Montemorelos. In the evaluation made with 20 pictures, an average accuracy of 88.5% was obtained in the classification between the Mallampati 1–2 (low risk) and Mallampati 3–4 (high risk) classes. In addition, the results of the cross-validation in terms of sensitivity and specificity were 90% on average.

As future work, the app will be updated to TFLite, since this technology recently displaced TFMobile. Also, the pictures used for training the classification model will be extended with pictures taken in hospitals in Mexico and Peru. As a result, anesthesiologists will be able to use the app in hospitals in these countries to get additional results.

Last but not least, the combination of tests adds an incremental diagnostic value in comparison with the value of each test alone [3]. Therefore, in future work we expect to create additional models for classifying the thyromental and sternomental distances. As a result, we expect that anesthesiologists could have a supporting tool that covers the three most popular methods to assess a difficult airway.


References

1. Apfelbaum, J.L., Hagberg, C.A., Caplan, R.A., Blitt, C.D., Connis, R.T., Nickinovich, D.G., Hagberg, C.A., Caplan, R.A., Benumof, J.L., Berry, F.A., Blitt, C.D., Bode, R.H., Cheney, F.W., Connis, R.T., Guidry, O.F., Nickinovich, D.G., Ovassapian, A.: Practice guidelines for management of the difficult airway. Anesthesiology 118(2), 251–270 (2013)
2. García, B.: Valoración preoperatoria de la vía aérea difícil ¿hay algo nuevo? https://anestesiar.org/2015/valoracion-preoperatoria-de-la-via-aerea-dificil-hay-algo-nuevo/ (2015). Accessed 11 May 2018
3. Baker, P.: Assessment before airway management. Anesthesiol. Clin. 33(2), 257–278 (2015)
4. Khandekar, R., Diwan, R., Shah, A., Patel, B.: Validation of modified Mallampati test with addition of thyromental distance and sternomental distance to predict difficult endotracheal intubation in adults. Indian J. Anaesth. 58(2), 171 (2014)
5. Bair, A.E., Caravelli, R., Tyler, K., Laurin, E.G.: Feasibility of the preoperative mallampati airway assessment in emergency department patients. J. Emerg. Med. 38(5), 677–680 (2010)
6. Campos, J.: Guías, algoritmos y recomendaciones durante el manejo de la vía aérea difícil en el paciente sometido a cirugía torácica: ¿están respaldados por la evidencia científica? Rev. Esp. Anestesiol. Reanim. 65(1), 1–4 (2018)
7. Kovacs, G.: Airway Management in Emergencies. McGraw-Hill Education, New York (2011)
8. Rubio-Martínez, R., Espino-Núñez, S., Espinoza-Tadeo, A., Romero-Guillén, P., Medina-Pérez, M.E., Coronado-Ávila, S.: Sesgos cognitivos en anestesia, una causa latente de error humano. Rev. Mex. Anestesiol. 42(2), 118–121 (2019)
9. Gómez-Ríos, M., Gaitini, L., Matter, I., Somri, M.: Guías y algoritmos para el manejo de la vía aérea difícil. Rev. Esp. Anestesiol. Reanim. 65(1), 41–48 (2018)
10. Cook, T., Woodall, N., Frerk, C.: Major complications of airway management in the UK: results of the Fourth National Audit Project of the Royal College of Anaesthetists and the Difficult Airway Society. Part 1: anaesthesia. Br. J. Anaesth. 106(5), 617–631 (2011)
11. Marín, C., Alférez, G.H., Córdova, J., González, V.: Detection of melanoma through image recognition and artificial neural networks. In: IFMBE Proceedings. Springer International Publishing, Cham, pp. 832–835 (2015)
12. Espinoza, M., Alférez, G.H., Castillo, J.: Prediction of glaucoma through convolutional neural networks. In: Proceedings of the 2018 International Conference on Health Informatics and Medical Systems, pp. 90–95 (2018)
13. Alférez, G.H., Jiménez, J., Hernández-Navarro, H., González, M., Domínguez, R., Briones, A., Hernández-Villalvazo, H.: Application of data science to discover the relationship between dental caries and diabetes in dental records. In: Arabnia, H.R., Deligiannidis, L. (eds.) International Conference on Health Informatics and Medical Systems (HIMS 2016). CSREA Press, pp. 176–181 (2016)
14. Shanahan, E., Huang, J.H.-C., Chen, A., Narsimhan, A., Tang, R.: Difficultintubationapp.com - a difficult airway electronic record. Can. J. Anesth. 63(11), 1299–1300 (2016)
15. Duggan, L.V., Lockhart, S.L., Cook, T.M., O’Sullivan, E.P., Dare, T., Baker, P.A.: The airway app: exploring the role of smartphone technology to capture emergency front-of-neck airway experiences internationally. Anaesthesia 73(6), 703–710 (2018)
16. Law, J.A.: From the journal archives: Mallampati in two millennia - its impact then and implications now. Can. J. Anesth. 61(5), 480–484 (2014)
17. Green, S.M., Roback, M.G.: Is the mallampati score useful for emergency department airway management or procedural sedation? Ann. Emerg. Med. 74(2), 251–259 (2019)
18. Adamus, M., Fritscherova, S., Hrabalek, L., Gabrhelik, T., Zapletalova, J., Janout, V.: Mallampati test as a predictor of laryngoscopic view. Biomed. Pap. Med. Fac. Univ. Palacky Olomouc Czechoslov. 154, 339–343 (2010)
19. Lee, A., Fan, L.T.Y., Gin, T., Karmakar, M.K., Ngan Kee, W.D.: A systematic review (meta-analysis) of the accuracy of the Mallampati tests to predict the difficult airway. Anesth. Analg. 102, 1867–1878 (2006)
20. Rodríguez, A.M., Pascual, J.N., Ferrer, L.P., Domínguez, J.F., Chaves, J.B., González, E.M.: Validez de los predictores de vía aérea difícil en medicina extrahospitalaria. An. Sist. Sanit. Navar. 37(1), 91–98 (2014)
21. García, E.R., Cedeño, J.R.: Valor predictivo de las evaluaciones de la vía aérea difícil. Trauma 8(3), 63–70 (2005)
22. Géron, A.: Hands-On Machine Learning with Scikit-Learn and TensorFlow. O’Reilly UK Ltd., Farnham (2017)
23. Michalski, R.S., Carbonell, J.G., Mitchell, T.M. (eds.): Machine Learning. Springer, Berlin (1983)
24. Kim, K.G.: Book review: Deep learning. Healthc. Inform. Res. 22(4), 351 (2016)
25. Saha, S.: A comprehensive guide to convolutional neural networks - the ELI5 way. https://bit.ly/2JyfxpT (2018). Accessed 14 Mar 2019
26. Zerium, A.: Demystifying convolutional neural networks. https://medium.com/@eternalzer0dayx/demystifying-convolutional-neural-networks-ca17bdc75559 (2018). Accessed 16 Sept 2018
27. Google: Tensors. https://www.tensorflow.org/guide/tensors. Accessed 14 Mar 2019
28. Howard, A.G., Zhu, M.: Mobilenets: open-source models for efficient on-device vision. https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html (2017). Accessed 16 Sept 2018
29. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: inverted residuals and linear bottlenecks. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June (2018)
30. Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: Mobilenets: efficient convolutional neural networks for mobile vision applications (2017). arXiv:1704.04861
31. Sandler, M., Howard, A.: Mobilenetv2: the next generation of on-device computer vision networks. https://ai.googleblog.com/2018/04/mobilenetv2-next-generation-of-on.html (2018). Accessed 14 Mar 2019
32. Pröve, P.-L.: Mobilenetv2: inverted residuals and linear bottlenecks. https://towardsdatascience.com/mobilenetv2-inverted-residuals-and-linear-bottlenecks-8a4362f4ffd5 (2018). Accessed 14 Mar 2019
33. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June (2016)
34. Greenfield, Y.: Precision, recall, sensitivity and specificity. http://yuvalg.com/blog/2012/01/01/precision-recall-sensitivity-and-specificity/ (2012). Accessed 21 Mar 2019

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Kevin Aguilar is an MSc student in Computer Science at Universidad de Montemorelos. He is currently working on Software Engineering and Computer Vision.


Germán H. Alférez is the director of the Institute of Data Science and professor at the School of Engineering and Technology, Universidad de Montemorelos. His research interests include Data Science, Computer Vision, and Autonomic Computing. His research contributions have been recognized by the National Council of Science and Technology (CONACYT), Government of Mexico, by awarding him a distinction in the National System of Researchers (SNI). Also, he is a research fellow of Peru’s National Council for Science, Technology, and Technological Innovation (CONCYTEC). www.harveyalferez.com.

Christian Aguilar is the dean of the Medical School and professor of Universidad de Montemorelos. His work focuses on thorax anesthesia and difficult airway. He is a member of the Peruvian Society of Anesthesia, Analgesia and Resuscitation; Mexican Federation of Anesthesiology; Peruvian Society of Pediatric Endosurgery; Mexican College of Anesthesiology; and Latin American Lifestyle Medicine Association. He is also certified by the Mexican Board of Anesthesiology.
