∗
HTC Research & Healthcare
{jake.mh wu, felix chu}@htc.com
†
Dynamical Biomarkers Group
emilyjchang30@gmail.com
Table I: Database summary

Database    Dataset  #Channels  #Records  Purpose      N  SVEB   VEB    F  Q
MIT-BIH-AR  DS1              2        22  Train    45809   940  3769  414  8
MIT-BIH-AR  DS2              2        22  Test     44207  1836  3218  388  7
DeepQ       -                1        22  Test      6852   454   663    -  -

DS1: 101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122, 124, 201, 203, 205, 207, 208, 209, 215, 220, 223, 230.
DS2: 100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210, 212, 213, 214, 219, 221, 222, 228, 231, 232, 233, 234.
DeepQ: T3042, T3043, T3044, T3045, T3049, T3050, T3052, T3054, T3056, T3059, T3063, T3066, T3068, T4007, T4021, T4024, T5001, T5006, T5029, T5047, T5049, T5057.
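
The class imbalance visible in Table I can be quantified with a few lines of Python. The sketch below takes the DS1 beat counts from Table I, computes each class's share, and derives one common form of inverse-frequency class weight. Two points are assumptions rather than details from the paper: the Q class is left out (only N, S, V and F appear in the reported confusion matrices), and the normalisation `total / (num_classes * count)` is an illustrative choice, since the text only states that weights are inversely proportional to class frequency.

```python
# Beat counts per class for MIT-BIH-AR DS1 (training set), copied from Table I.
# Assumption: the Q class is excluded from the four-class problem.
DS1_COUNTS = {"N": 45809, "S": 940, "V": 3769, "F": 414}


def class_shares(counts):
    """Fraction of all beats that belongs to each class."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def inverse_frequency_weights(counts):
    """Assumed normalisation: w_c = total / (num_classes * count_c)."""
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


shares = class_shares(DS1_COUNTS)
weights = inverse_frequency_weights(DS1_COUNTS)
for label in DS1_COUNTS:
    print(f"{label}: share={shares[label]:.3f} weight={weights[label]:.2f}")
```

With these counts the N class makes up roughly 90% of DS1, so the rarest class (F) receives the largest weight and N the smallest.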
pool: stride=2. Each densely connected block consists of a fully-connected layer, a non-linear activation layer, and a dropout layer. The first dense block is structured as follows: (1) Dense: 64, (2) ReLU, (3) Dropout: 0.5. The second dense block is structured as follows: (1) Dense: 16, (2) ReLU, (3) Dropout: 0.5. A softmax regression is used as the output layer to perform the multi-class classification.

We train the network with the Adam optimizer, an initial learning rate of 0.0001, and a batch size of 128. Class weights are set inversely proportional to the class frequencies to account for the class imbalance in the input data. We save the best model based on the leave-one-out validation scheme depicted in Figure 1.

G. Classification Performance Measures

Since 90% of the heartbeats are from the N class, a dummy classifier that always predicts this dominant class would still reach 90% accuracy. For this reason, it is important to use other performance measures to evaluate the classifier. In compliance with the AAMI recommendations, two performance measures, precision (positive predictive value, +P) and recall (sensitivity, Se), are used. These two measures focus on assessing the ability of an algorithm to discriminate VEB beats from non-VEB beats and SVEB beats from non-SVEB beats. Works in [8]–[11], [15], [22] also employ this performance assessment.

H. Results

Our method tops all nine metrics in the one-lead analysis group and eight out of the nine metrics in the two-lead analysis group by a large margin. Our sensitivity for F beats is a marginal 2 percentage points lower than the state-of-the-art metric in the two-lead approach. It is also worth noting that our approach performs almost equally well for one-lead and two-lead inputs. This had not been achievable with previous methods.

The beat-by-beat confusion matrix provides insight into how the classifier performs on each beat type. Both one-lead and two-lead performances are assessed and deemed unbiased, as no records from MIT-BIH-AR DS2 were used to generate the classification models. Table III summarizes the overall beat-by-beat performance for one-lead inputs and two-lead inputs. According to the beat-by-beat results, the loss of positive predictivity for the SVEB and F classes is mainly attributed to the imbalanced dataset.

Table III: Beat-by-beat performance

(a) One-lead

                     Predicted
True         N      S      V      F   Total  Se (%)
N        41297   2358    116    436   44207      93
S           46   1779     11      0    1836      97
V           17     79   2908    214    3218      90
F           86      4      8    290     388      75
Total    41446   4220   3043    940   49649      89
+P (%)     100     42     96     31      67      93

(b) Two-lead

                     Predicted
True         N      S      V      F   Total  Se (%)
N        41599   2129     72    407   44207      94
S           55   1769     12      0    1836      96
V           17     75   2848    278    3218      89
F           21      1      5    361     388      93
Total    41692   3974   2937   1046   49649      89
+P (%)     100     45     97     35      67      94

This work showcases the extraordinary feature extraction and discrimination ability of deep neural networks and demonstrates their effectiveness over the heuristic-based and shallow architecture approaches. We include the four top-performing heartbeat classification results reported previously to serve as baselines for comparison. Table IV shows that our proposed model achieves state-of-the-art performance and outperforms all previous methods that used handcrafted features on the MIT-BIH-AR dataset.

Table IV: Methods comparison (all values in %)

(a) One-lead performance

                               N          S          V          F
Method                Acc   Se   +P    Se   +P    Se   +P    Se   +P
Proposed               93   93  100    97   42    90   96    75   31
de Chazal [11]         90   91   99    74   37    88   80    19    5

(b) Two-lead performance

                               N          S          V          F
Method                Acc   Se   +P    Se   +P    Se   +P    Se   +P
Proposed               94   94  100    96   45    89   97    93   35
Zhang et al. [10]      86   89   99    79   36    85   93    94   14
Llamedo et al. [9]     78   78   99    76   41    83   88    95    4
de Chazal et al. [8]   83   87   99    75   39    80   81    89    9

III. Model Personalization

This section details our method to achieve personalized ECG arrhythmic heartbeat detection using standard and wearable databases.

A. Approach

The advantage of deep learning is the ability to automatically learn good feature representation from the input data [23]. In this context, we utilize our deep learning model to tackle the challenging problem created by the inter-patient ECG signal differences and to achieve two main objectives: (1) a good feature representation learner, and (2) an active patient-adaptation learner to help its human counterpart in cardiac disease detection. Given a new patient’s ECG data, we first transform the data into a set of estimated class outputs using our pre-trained generic model. Through an interactive phase, we select a set of the most ambiguously classified heartbeats and ask a human expert to annotate their true classes. We then apply this newly labeled dataset to fine-tune and tailor our pre-trained model’s classifier section (dense blocks) to increase its generalization ability for the new patient.

The main consideration in designing a systematic beat selection method is to increase the model’s disease discrimination capability while minimizing the number of fine-tuning samples and the amount of human-labeling effort. To reach this objective, we apply a confidence score to each predicted heartbeat and select the least confident samples to be labeled. Concretely, we measure the entropy level of the estimated posterior probabilities associating a given heartbeat to a given class. The entropy measure is defined as follows:

\[ H = -\sum_{i=1}^{n} p_i \log(p_i) \tag{2} \]

When there are no new samples to query or the maximum number of queries is reached, we terminate our query iteration and assume the classification task has reached a conclusion. We use our two-lead generic classifiers to create adapted models for patients in MIT-BIH-AR DS2. As DeepQ is a single-channel dataset, we utilize our one-lead generic classifiers to build the patient-adaptation model with the DeepQ dataset. A two-fold cross-validation scheme is used to derive the best models. Algorithm 1 illustrates the main steps of our active learning approach.
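
The entropy score of Eq. 2 and the selection of the least confident beats per predicted class can be sketched as follows. This is an illustrative reading of the text, not the authors' code: it assumes that p_i is the estimated posterior probability of class i, that n is the number of classes, and that the logarithm is natural. The function names, the `per_class` parameter, and the sample posteriors are hypothetical.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Eq. 2: H = -sum(p_i * log(p_i)); zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(posteriors, classes, per_class):
    """Return indices of the `per_class` highest-entropy beats of each predicted class.

    posteriors: one list of class probabilities per beat, ordered like `classes`.
    """
    by_class = defaultdict(list)
    for idx, probs in enumerate(posteriors):
        predicted = classes[max(range(len(probs)), key=probs.__getitem__)]
        by_class[predicted].append((entropy(probs), idx))
    selected = []
    for label in classes:
        ranked = sorted(by_class[label], reverse=True)  # highest entropy first
        selected.extend(idx for _, idx in ranked[:per_class])
    return sorted(selected)


CLASSES = ["N", "S", "V", "F"]
posteriors = [
    [0.97, 0.01, 0.01, 0.01],  # confident N
    [0.40, 0.35, 0.15, 0.10],  # ambiguous, predicted N
    [0.05, 0.90, 0.03, 0.02],  # confident S
    [0.30, 0.45, 0.20, 0.05],  # ambiguous, predicted S
    [0.02, 0.02, 0.95, 0.01],  # confident V
]
queries = select_queries(posteriors, CLASSES, per_class=1)
print(queries)  # [1, 3, 4]
```

Ranking within each predicted class, rather than over all beats at once, keeps the dominant N class from absorbing the whole labeling budget.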
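
The query loop that Algorithm 1 formalizes, including its two stopping conditions (query budget exhausted, or no new beats to label), can be illustrated with a toy example. Everything below is a simplified stand-in under stated assumptions: `ThresholdModel` is a hypothetical one-feature, two-class model that replaces the deep network, its `fine_tune` method replaces fine-tuning of the dense blocks, the cross-validation step is omitted, and the expert is simulated by a lookup into known labels.

```python
import math


def entropy(probs):
    """Eq. 2: H = -sum(p_i * log(p_i)); zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def personalize(beats, model, ask_expert, n_classes, max_queries=5, per_class=10):
    """Run the query loop; return (predicted class indices, expert-labeled beats)."""
    labeled = {}  # beat index -> expert label; plays the role of the fine-tune set F
    for _ in range(max_queries):
        posteriors = [model.predict_proba(x) for x in beats]
        picked = set()
        for c in range(n_classes):
            members = [i for i, p in enumerate(posteriors) if argmax(p) == c]
            members.sort(key=lambda i: entropy(posteriors[i]), reverse=True)
            picked.update(members[:per_class])  # least confident beats of class c
        new = sorted(picked - set(labeled))
        if not new:  # no new samples to query: stop early
            break
        for i in new:
            labeled[i] = ask_expert(i)
        model.fine_tune([(beats[i], y) for i, y in labeled.items()])
    return [argmax(model.predict_proba(x)) for x in beats], labeled


class ThresholdModel:
    """Hypothetical toy model: one feature, classes 0 (N) and 1 (V)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict_proba(self, x):
        p_v = 1.0 / (1.0 + math.exp(-4.0 * (x - self.threshold)))
        return [1.0 - p_v, p_v]

    def fine_tune(self, labeled):
        n = [x for x, y in labeled if y == 0]
        v = [x for x, y in labeled if y == 1]
        if n and v:  # the boundary can only move when both classes were labeled
            self.threshold = (max(n) + min(v)) / 2.0


beats = [0.1, 0.4, 0.9, 1.1, 1.6, 2.0]
truth = [0, 0, 0, 1, 1, 1]
model = ThresholdModel(threshold=0.6)  # "generic" model, misplaced for this patient
before = [argmax(model.predict_proba(x)) for x in beats]
after, labeled = personalize(beats, model, lambda i: truth[i], n_classes=2, per_class=2)
print(before, after, sorted(labeled))
```

In this toy run the generic model mislabels the third beat; after two rounds of queries the threshold moves and all six beats are classified correctly, and the third round stops early because no unlabeled beat is selected.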
Algorithm 1 Framework of classifier personalization

Input:
    S = {(x_i, y_i)}_{i=1}^{n}: training set;
    T = {x_j}_{j=1}^{m}: test set;
    D: pre-trained model;
    Q: number of queries;
    k: number of beats to be labeled in each query;
    N: heartbeat classes
Output:
    Y = {(x_j, y_j)}_{j=1}^{m}: classification result;

 1: p ← estimated class probability;
 2: P ← set of estimated class probability;
 3: h ← confidence score;
 4: H ← set of confidence score;
 5: F ← fine-tune set;
 6: F_p ← previous fine-tune set;
 7: K ← query set;
 8: D_b ← best model from the cross-validation;
 9: initialization: p = ∅; F = ∅; F_p = ∅; K = ∅;
10: for each t ∈ T do
11:     compute p using D;
12:     P ← P ∪ {p};
13: end for
14: compute Y with T and P;
15: for q = 1; q ≤ Q; q++ do
16:     for each t ∈ T do
17:         compute h with Eq. 2;
18:         H ← H ∪ {h};
19:     end for
20:     sort T in descending order by H;
21:     for all n such that n ∈ N do
22:         K_n ← a set of top k/N beats;
23:         K ← K ∪ K_n;
24:     end for
25:     F ← F_p ∪ K;
26:     if F = F_p, EXIT.
27:     query an expert to label K;
28:     divide F into V folds;
29:     for each of the V folds do
30:         repeat
31:             fine-tune model’s dense blocks with F;
32:         until validation loss converges
33:     end for
34:     compute D_b from V folds;
35:     D ← D_b; F_p ← F; F ← ∅;
36:     update P using T and D;
37:     update Y by relabeling T with P;
38: end for

one beat from each class. All of the least confident samples that we queried in the 202 and 219 cases were labeled as N and VEB beats by the expert. Results from cases 202 and 219 support our initial theory that fine-tuning the classifier without the necessary beat classes (e.g., new SVEB and F beats in this case) may result in no improvement in the classification performance.

Table V: Model personalization on the MIT-BIH-AR database

Record      #Beats       | Generic model output (%)        |   #Beats | After 5 queries (%)
          N    S   V   F |    N       S       V       F    |  queried |    N       S       V       F
                         |  +P  Se  +P  Se  +P  Se  +P  Se |          |  +P  Se  +P  Se  +P  Se  +P  Se
100    2237   33   1   0 | 100 100 100 100 100 100   -   - |       28 | 100 100 100 100 100 100   -   -
103    2080    2   0   0 | 100  98  11  50   -   -   -   - |       37 | 100 100 100 100   -   -   -   -
105    2514    0  41   0 | 100  93   -   -  44  90   -   - |       81 | 100  97   -   -  47 100   -   -
111    2121    0   1   0 | 100  99   -   -   0   0   -   - |       24 | 100 100   -   - 100 100   -   -
113    1787    6   0   0 | 100  99  19 100   -   -   -   - |       32 | 100 100 100 100   -   -   -   -
117    1532    1   0   0 | 100 100  50 100   -   -   -   - |       34 | 100 100 100 100   -   -   -   -
121    1859    1   1   0 | 100 100  25 100 100 100   -   - |       19 | 100 100 100 100 100 100   -   -
123    1513    0   3   0 | 100  99   -   - 100 100   -   - |       27 | 100 100   -   - 100 100   -   -
200    1742   30 825   2 |  99  98  23  73  99  90   2  50 |      101 | 100  98  42  80  99  98  17 100
202    2059   55  19   1 | 100  67   7  87  72  95   4 100 |      114 | 100  95  32  93 100 100   0   0
210    2421   22 195  10 |  99  97  21 100 100  71   3  10 |      105 | 100 100  60 100 100  90  40  70
212    2746    0   0   0 | 100 100   -   -   -   -   -   - |       36 | 100 100   -   -   -   -   -   -
213    2639   28 220 362 | 100  99  78  89  93  29  66  97 |      117 | 100  99  82 100  96  79  82  96
214    2000    0 255   1 | 100  95   -   - 100  98   0   0 |       49 | 100 100   -   - 100 100 100 100
219    2080    7  64   1 | 100  79   1  57  98  91   0   0 |       69 | 100 100   0   0  99 100   0   0
221    2029    0 396   0 | 100  94   -   - 100  99   -   - |       65 | 100 100   -   - 100 100   -   -
222    2272  209   0   0 |  99  75  26  94   -   -   -   - |       46 |  98  85  34  83   -   -   -   -
228    1686    3 362   0 | 100  95   0   0 100 100   -   - |       56 | 100 100  60 100 100 100   -   -
231    1566    1   2   0 | 100  92   0   0 100 100   -   - |       38 | 100 100   0   0 100 100   -   -
232     397 1381   0   0 |  97  96  99  99   -   -   -   - |       55 | 100 100 100 100   -   -   -   -
233    2229    7 830  11 | 199  95  83  71 100  94   4  64 |       84 | 100 100  88 100 100 100  64  64
234    2698   50   3   0 |  99 100  97  64 100 100   -   - |       46 | 100 100  94  98 100 100   -   -

2) DeepQ Dataset: We specifically include the DeepQ dataset in the model personalization phase to test and evaluate our model's performance with data collected from the wearable ECG monitoring device.

The record-by-record evaluation presented in Table VI shows our model’s effectiveness in feature extraction and its generalization ability when applied to new patients in the DeepQ dataset. Our model achieves overall scores of 100% positive predictivity and 97% sensitivity in N class detection, 99% positive predictivity and 73% sensitivity in SVEB class detection, and 77% positive predictivity and 98% sensitivity in VEB class detection.

The errors made by the generic model in the VEB and SVEB detections come mainly from the T3050 and T3068 patient records. These errors are later remedied in our active learning process.

In the personalization process, our approach rarely queries more than 10% of the data from each patient and still obtains precise disease detection performance. Out of 22 patients, 21 patients’ results are further improved from the initial outputs. It is worth noting that our proposed configuration values sensitivity over positive predictivity for ectopic beats. This bias is reflected in patients T3045 and T5057, whose sensitivity scores are higher but whose positive predictivity scores are lower than the initial outputs after the query process. The only case of performance loss is in patient T3068, whose single VEB beat is misclassified as a non-VEB diseased beat (an F beat). This is understandable, as F beats share similar morphological features with normal and VEB beats.

In general, our model’s behavior complies with that of clinical practice, where the essential objective is to identify all the cardiac abnormalities within a patient. Similar results are obtained in the DeepQ dataset when compared to the MIT-BIH-AR DS2 set. Our model demonstrates its effectiveness in arrhythmic heartbeat detection on both non-wearable and wearable ECG data and reaches near 100% accuracy in the normal and VEB beat predictions in both cases.

IV. Conclusions and Future Works

In this work, we first presented an end-to-end generic ECG heartbeat classification model that addresses inter-patient variability and achieves state-of-the-art performance for arrhythmia detection on the MIT-BIH-AR database. The key to exceeding all previous heuristic-based and shallow architecture baselines is utilizing the extraordinary feature learning and discrimination capability of the deep neural network, which maps the input ECG data to the corresponding arrhythmia class. We then showcased an efficient active learning approach to tailor our pre-trained generic model to achieve model personalization, and significantly improved the precision of disease detection on each new patient. Our proposed active learning algorithm ranks the predictions with an entropy measure and queries the 10 least confident beats from each predicted class. Within five iterations and 5% of the total beats, we are
able to improve the performance of the generic classifier on each new patient in both MIT-BIH-AR DS2 and the DeepQ dataset, reaching nearly 100% accuracy in the normal and VEB beat predictions. It is worth noting that with our model, predicting using one-lead input data performs almost as well as using two-lead input data. Our result provides flexibility to improve a wearable device’s user experience and reduce its cost. With this input channel expansion capability, our proposed method is expected to also work well with 12-lead ECG input data.

In future work, we plan to include sitting and walking cases from the DeepQ Arrhythmia Database and apply active one-shot learning on the deep neural network model. The disease classifier performance can be further strengthened with larger and more diverse labeled datasets, which benefit approaches such as deep-learning-based models.

References

[1] O. T. Inan, L. Giovangrandi, and G. T. Kovacs, “Robust neural-network-based classification of premature ventricular contractions using wavelet transform and timing interval features,” Biomedical Engineering, IEEE Transactions on, vol. 53, no. 12, pp. 2507–2515, 2006.
[2] G. K. Prasad and J. Sahambi, “Classification of ECG arrhythmias using multi-resolution analysis and neural networks,” in TENCON 2003. Conference on Convergent Technologies for the Asia-Pacific Region, vol. 1. IEEE, 2003, pp. 227–231.
[3] S. Osowski, L. T. Hoai, and T. Markiewicz, “Support vector machine-based expert system for reliable heartbeat recognition,” Biomedical Engineering, IEEE Transactions on, vol. 51, no. 4, pp. 582–589, 2004.
[4] J. Rodriguez, A. Goni, and A. Illarramendi, “Real-time classification of ECGs on a PDA,” Information Technology in Biomedicine, IEEE Transactions on, vol. 9, no. 1, pp. 23–34, 2005.
[5] X. Jiang, L. Zhang, Q. Zhao, and S. Albayrak, “ECG arrhythmias recognition system based on independent component analysis feature extraction,” in TENCON 2006. 2006 IEEE Region 10 Conference. IEEE, 2006, pp. 1–4.
[6] L. S. De Oliveira, R. V. Andreão, and M. Sarcinelli-Filho, “Premature ventricular beat classification using a dynamic Bayesian network,” in Engineering in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the IEEE. IEEE, 2011, pp. 4984–4987.
[7] M. Lagerholm, C. Peterson, G. Braccini, L. Edenbrandt, and L. Sörnmo, “Clustering ECG complexes using Hermite functions and self-organizing maps,” Biomedical Engineering, IEEE Transactions on, vol. 47, no. 7, pp. 838–848, 2000.
[8] P. De Chazal, M. O'Dwyer, and R. B. Reilly, “Automatic classification of heartbeats using ECG morphology and heartbeat interval features,” Biomedical Engineering, IEEE Transactions on, vol. 51, no. 7, pp. 1196–1206, 2004.
[9] M. Llamedo and J. P. Martínez, “Heartbeat classification using feature selection driven by database generalization criteria,” Biomedical Engineering, IEEE Transactions on, vol. 58, no. 3, pp. 616–625, 2011.
[10] Z. Zhang, J. Dong, X. Luo, K.-S. Choi, and X. Wu, “Heartbeat classification using disease-specific feature selection,” Computers in Biology and Medicine, vol. 46, pp. 79–89, 2014.
[11] P. de Chazal, “Detection of supraventricular and ventricular ectopic beats using a single lead ECG,” in Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE. IEEE, 2013, pp. 45–48.
[12] K. Park, B. Cho, D. Lee, S. Song, J. Lee, Y. Chee, I. Kim, and S. Kim, “Hierarchical support vector machine based heartbeat classification using higher order statistics and Hermite basis function,” in Computers in Cardiology, 2008. IEEE, 2008, pp. 229–232.
[13] S. Osowski and T. H. Linh, “ECG beat recognition using fuzzy hybrid neural network,” Biomedical Engineering, IEEE Transactions on, vol. 48, no. 11, pp. 1265–1271, 2001.
[14] I. Christov, G. Gómez-Herrero, V. Krasteva, I. Jekova, A. Gotchev, and K. Egiazarian, “Comparative study of morphological and time-frequency ECG descriptors for heartbeat classification,” Medical Engineering & Physics, vol. 28, no. 9, pp. 876–887, 2006.
[15] X. Cui, E. Chang, W.-H. Yang, B. C. Jiang, A. C. Yang, and C.-K. Peng, “Automated detection of paroxysmal atrial fibrillation using an information-based similarity approach,” Entropy, vol. 19, no. 12, 2017.
[16] E. Y. Chang, “DeepQ: Advancing healthcare through artificial intelligence and virtual reality,” in Proceedings of the 2017 ACM on Multimedia Conference, ser. MM ’17. New York, NY, USA: ACM, 2017, pp. 1068–1068.
[17] G. B. Moody and R. G. Mark, “The impact of the MIT-BIH arrhythmia database,” Engineering in Medicine and Biology Magazine, IEEE, vol. 20, no. 3, pp. 45–50, 2001.
[18] S. Tong and E. Chang, “Support vector machine active learning for image retrieval,” in Proceedings of the Ninth ACM International Conference on Multimedia, ser. MULTIMEDIA ’01. ACM, 2001, pp. 107–118.
[19] M.-H. Wu and E. Y. Chang, “DeepQ arrhythmia database: A large-scale dataset for arrhythmia detector evaluation,” in Proceedings of the 2nd International Workshop on Multimedia for Personal Health and Health Care, ser. MMHealth ’17. New York, NY, USA: ACM, 2017, pp. 77–80.
[20] ANSI/AAMI EC57, “Testing and reporting performance results of cardiac rhythm and ST segment measurement algorithms,” Association for the Advancement of Medical Instrumentation, Arlington, VA, 1998.
[21] AAMI ECAR, “Recommended practice for testing and reporting performance results of ventricular arrhythmia detection algorithms,” Association for the Advancement of Medical Instrumentation, 1987.
[22] G. De Lannoy, D. François, J. Delbeke, and M. Verleysen, “Weighted conditional random fields for supervised inter-patient heartbeat classification,” Biomedical Engineering, IEEE Transactions on, vol. 59, no. 1, pp. 241–247, 2012.
[23] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A review and new perspectives,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798–1828, 2013.