
Eur Radiol (2010) 20:1545-1553

DOI 10.1007/s00330-009-1699-5 BREAST


Gisella Gennaro
Alicia Toledano
Cosimo di Maggio
Enrica Baldan
Elisabetta Bezzon
Manuela La Grassa
Luigi Pescarini
Ilaria Polico
Alessandro Proietti
Aida Toffoli
Pier Carlo Muzzio
Received: 4 August 2009
Revised: 14 October 2009
Accepted: 14 November 2009
Published online: 22 December 2009
© European Society of Radiology 2009
Digital breast tomosynthesis versus digital
mammography: a clinical performance study
Abstract Objective: To compare the
clinical performance of digital breast
tomosynthesis (DBT) with that of full-
field digital mammography (FFDM) in
a diagnostic population. Methods:
The study enrolled 200 consenting
women who had at least one breast
lesion discovered by mammography
and/or ultrasound classified as doubtful
or suspicious or probably malignant.
They underwent tomosynthesis in one
view [mediolateral oblique (MLO)] of
both breasts at a dose comparable to that
of standard screen-film mammography
in two views [craniocaudal (CC) and
MLO]. Images were rated by six breast
radiologists using the BIRADS score.
Ratings were compared with the truth
established according to the standard of
care and a multiple-reader multiple-
case (MRMC) receiver-operating
characteristic (ROC) analysis was
performed. Clinical performance of
DBT compared with that of FFDM was
evaluated in terms of the difference
between areas under ROC curves
(AUCs) for BIRADS scores. Results:
Overall clinical performance with DBT
and FFDM for malignant versus all
other cases was not significantly
different (AUCs 0.851 vs 0.836,
p=0.645). The lower limit of the 95%
CI for the difference between DBT and
FFDM AUCs was -4.9%.
Conclusion: Clinical performance of
tomosynthesis in one view at the same
total dose as standard screen-film
mammography is not inferior to digital
mammography in two views.
Keywords Digital breast tomosynthesis · Digital mammography · ROC analysis · Clinical performance · Non-inferiority
Introduction
Digital breast tomosynthesis (DBT) is expected to
overcome some inherent limitations of mammography
clinical performance caused by overlapping of normal and
pathological tissues during the standard 2D projections
[1-4]. The tomosynthesis principle has been known since the
1930s [2, 5], but has been applied in practice only in the last
decade, gaining advantage from the new digital detector
technologies employed in X-ray medical imaging [6-8]. In a
breast tomosynthesis system the X-ray tube moves along
an arc during the examination and a finite number of
two-dimensional (2D) projections are acquired within a
limited angle. The 3D volume of the compressed breast is
reconstructed from the 2D projections using algorithms
generically described as "shift and add". Afterwards, the
reconstructed breast volume can be explored by scrolling
through the slices, allowing the enhancement of the
information contained in each plane while blurring the
off-focus information. For this reason, tomosynthesis is
often reported to be a technique that is able to partially
remove the so-called structural or anatomical noise.
Early experience of tomosynthesis application to breast
imaging has shown the potential of DBT, which might
improve the specificity of mammography with improved
lesion margin visibility and might improve early breast
cancer detection, especially in women with dense
breasts [1]. Tomosynthesis was described as using one
or two breast views under the constraint that the dose
level remains comparable with doses delivered for
standard mammography in two views [9-11].

G. Gennaro (*) · E. Baldan · E. Bezzon · L. Pescarini · I. Polico · A. Proietti · A. Toffoli · P. C. Muzzio
Department of Radiology, Venetian Oncological Institute (IOV), IRCCS, via Gattamelata 64, 35128 Padua, Italy
e-mail: gisella.gennaro@ioveneto.it
e-mail: gisella.gennaro@pd.infn.it
Tel.: +39-49-8215735
Fax: +39-49-8215740

A. Toledano
Statistics Collaborative Inc., Washington, DC, USA

C. di Maggio · L. Pescarini
Department of Oncological and Surgical Sciences, Padua University, Padua, Italy

M. La Grassa
Department of Radiology, Aviano Oncological Reference Center (CRO), IRCCS, Aviano (Pordenone), Italy

P. C. Muzzio
Department of Medical Diagnostic Sciences, Padua University, Padua, Italy

A significant number of technical papers can be found in
the literature concerning simulations that estimate the
potential benefits of tomosynthesis [12, 13], the optimisation
of geometrical and technique factors in DBT [14-16], the
evaluation of the scatter contribution [17], reconstruction
algorithms [18-20], computer-aided detection (CAD) applications
for DBT images [21-24], etc., but only a few clinical
studies, very limited in size, have been published recently.
The first clinical study was published by Poplack et al. [25],
based on a population of 98 patients recalled from screening.
They performed a paired comparison between DBT and
screen-film mammography images in terms of image quality,
concluding that subjectively, DBT has comparable or
superior image quality versus full-field digital mammography
(FFDM), and has the potential to reduce screening recall
rates when used as an adjunct to digital mammography.
Good et al. [26] compared lesion detectability and probability
of malignancy for 30 patients based on FFDM, DBT
projection images, and DBT reconstructed slices, concluding
that there was no significant difference on average between
the three techniques, and attributing this result to the small
sample size and the inter-reader variability. Andersson et al.
[27] conducted a paired study based on lesion visibility
within a population of 40 cancers, concluding that cancer
visibility on DBT in one view is superior to FFDM in two
views and that this would indicate the potential of DBT to
increase sensitivity. Finally, Smith et al. [28] investigated
whether or not DBT might improve the performance of less
experienced radiologists when used as an adjunct to
FFDM and concluded that all the radiologists involved in the
study, whatever their experience, gained a benefit from
tomosynthesis.
This work reports the final results of a clinical study
involving a diagnostic population of 200 women. The
study purpose was to compare the clinical performance of
DBT in one view [mediolateral oblique (MLO)] with
FFDM in two views [craniocaudal (CC) and MLO]. A
blinded multiple-reader multiple-case (MRMC) receiver-
operating characteristic (ROC) experiment was performed
involving six breast radiologists with experience in breast
imaging ranging from 5 to 30 years. Diagnostic accuracy of
tomosynthesis versus digital mammography using areas
under ROC curves (AUCs), reader-by-reader and overall,
and lesion conspicuity were evaluated and are
discussed.
Materials and methods
Study population
The clinical investigation plan was approved by the
institutional Ethics Committee and by the national Ministry
of Health, as required by law for prototype medical
devices. Two hundred consenting women, showing at least
one breast lesion discovered by mammography and/or
ultrasound (US) and classified as doubtful or suspicious or
probably malignant, were enrolled in the study. They
underwent standard digital mammography in two views
(CC and MLO), and tomosynthesis in one view (MLO) of
both breasts. Inclusion and exclusion criteria applied for
patient accrual are listed in Table 1.
Patient accrual took 15 months, starting in April 2007
and concluding in July 2008. Tomosynthesis images were
not included in the standard of care, and the truth was
established on the basis of standard examinations: FFDM,
US, and related work-up.
Image acquisition protocol
All patients enrolled in the study underwent digital
mammography in two views (CC and MLO) of both breasts
by a GE Senographe 2000D with the AOP/STD exposure
mode. Moreover, they underwent bilateral tomosynthesis in
one view (MLO) by an investigational device developed by
GE Healthcare. The prototype equipment was based on a
standard FFDM platform (Senographe DS), modified to
acquire multiple projections over a 20°-40° arc. The
acquisition protocol was set to 15 projections over
40° (±20° around the MLO position). Anode/filter combination,
kVp and total mAs values were defined as a function of
breast thickness to match the condition that radiation dose for
DBT in one view would not be higher than the dose delivered
for standard screen-film mammography in two views [9].
Technique factors were manually selected by radiographers,
depending on the breast thickness interval (10 mm steps),
according to a specific table. The total tube load (mAs) was
equally divided by the DBT equipment among the 15
projections. Figure 1 shows the calculated average glandular
dose (AGD) versus equivalent breast thickness for DBT in
one view with the technique factors according to Wu et al.
[9]. It was compared with the AGD acceptance limits
proposed by the European Guidelines for Quality Assurance
in Breast Cancer Screening and Diagnosis, which were
obtained from wide statistics on a large amount of screen-film
mammography data [29]. Conversion factors from the
entrance dose to the AGD were derived from the tables
published by Dance et al. [30].

Table 1 Inclusion and exclusion criteria for patient accrual in the
clinical study to compare clinical performance of DBT versus
FFDM
Inclusion criteria | Exclusion criteria
Breast lesion(s) classified BIRADS 3 or 4 or 5 at mammography or US | Previous mastectomy
≥40 years | <40 years
Breast size to fit detector size | Breast size exceeding detector size
 | Breast implant
 | High genetic risk

Raw projection images were sent to a reconstruction
computer, which applied an iterative SART (simultaneous
algebraic reconstruction technique) algorithm [18, 20],
providing slices sampled at 1-mm intervals. Thick slabs
(10 mm) were also reconstructed to provide the readers
with synthetic information before going through the slices.
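As a rough illustration of how 1-mm slices relate to 10-mm slabs, the sketch below averages consecutive groups of ten slices of a reconstructed volume. The text does not state how the slabs were actually computed (averaging versus another projection rule, or whether slabs overlap), so the averaging rule, the non-overlapping grouping, the random test volume and the function name `make_slabs` are all assumptions made for illustration only.

```python
import numpy as np

def make_slabs(volume, slab_thickness=10):
    """Average consecutive slices of a (n_slices, rows, cols) volume into slabs.

    Assumes 1-mm slice spacing, so slab_thickness is also the number of
    slices per slab. A trailing group with fewer slices is averaged as-is.
    """
    volume = np.asarray(volume, dtype=float)
    starts = range(0, volume.shape[0], slab_thickness)
    return np.stack([volume[s:s + slab_thickness].mean(axis=0) for s in starts])

# Hypothetical 45-mm compressed breast reconstructed at 1-mm intervals
volume = np.random.default_rng(0).random((45, 8, 8))
slabs = make_slabs(volume)  # five slabs: four of 10 slices, one of 5 slices
```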
Image interpretation protocol
Readings were performed subsequent to clinical management
and involved six breast radiologists with experience
in breast imaging of between 5 and 30 years. The left and
right breasts of each patient were interpreted separately, to
allow investigation of DBT and FFDM performance for
normal breasts isolated from performance for breasts with
lesions. FFDM and DBT images were displayed on the
same review workstation (GE SenoAdvantage) by two
different viewers: the standard one used for FFDM in the
clinical workflow, and another one specific for DBT
images, showing a series of slabs or slices of a given breast,
which can be scrolled by using mouse, keyboard, or cine-
loop tools. Standard additional features, like magnification
glass, zooming, etc. were available with both viewers.
The cases (single breasts) were randomised in several
reading sessions, each including 50% of FFDM and 50% of
DBT images of different breasts, both with and without
lesions. The same breast could not be included twice (once as
FFDM, once as DBT) in the same reading session and a time
interval ranging from 1 to 4 weeks was secured between
two reading sessions with FFDM and DBT images of the
same breast, in order to reduce potential bias due to
readers' short-term memory. The six radiologists had a
training period to become confident with DBT images
including 25 cases read independently and discussed at the
end in a consensus meeting, in order to agree on the general
bases of DBT image evaluation. After the training period,
the six radiologists independently evaluated mammography
and tomosynthesis images previously anonymised and
blinded for any clinical information. FFDM CC and MLO
processed images were displayed full size one per each of
the two high-resolution monitors (5 MP); DBT MLO
reconstructed images were loaded as slabs on the left
monitor and as slices on the right monitor. The readers were
asked to evaluate slabs first, and afterwards slices, in an
attempt to figure out the potential role of slabs in
tomosynthesis reviewing.
Every single reader had to localise possible finding(s),
specifying depth with DBT images, define the finding type,
and assess conspicuity with a five-step scale: 1 = no visible
finding, 2 = low conspicuity, 3 = medium conspicuity,
4 = high conspicuity, 5 = very high conspicuity. Conspicuity
was defined as the combination of the confidence in the
presence of a given lesion with the confidence in decision
making based on lesion detectability. Figure 2 illustrates the
conspicuity concept: the finding was considered to have
higher conspicuity on tomosynthesis than on mammography.
The architectural distortion is much more recognisable as
such from the DBT slice than from the FFDM MLO view.
After conspicuity assessment, each finding was classified
by using the BIRADS (Breast Imaging Reporting and Data
System) score in seven steps increasing with the probability
of malignancy, according to the American College of
Radiology BIRADS score [31]: 1 = BIRADS 1 (negative),
2 = BIRADS 2 (benign finding), 3 = BIRADS 3 (probably
benign), 4 = BIRADS 4A (low suspicious abnormality),
5 = BIRADS 4B (medium suspicious abnormality),
6 = BIRADS 4C (high suspicious abnormality), 7 = BIRADS
5 (highly suggestive of malignancy). Finally, readers had to
choose their favourite view, between CC and MLO for
FFDM images, and between slabs and slices for DBT
images, i.e. the view(s) they deemed more useful in assessing
the BIRADS finding. A maximum of three findings was
considered per breast and per technique (DBT or FFDM).
Breast density was also rated by each reader as one of the
four BIRADS classes [31] on the basis of FFDM images.
[Figure 1: AGD (mGy) plotted against breast thickness (mm); series: Mammography 2-views, Tomosynthesis 1-view]
Fig. 1 AGD versus breast thickness for tomosynthesis in one view
and standard screen-film mammography in two views (screen-film
limits provided by the European Guidelines for Quality Assurance in
Breast Cancer Screening and Diagnosis [29])
Data analysis
The truth was established using histology for lesions that
had undergone breast biopsy (all of which were classified
as malignant lesions, plus a small proportion of those
considered benign) and fine-needle aspiration cytology
(FNAC), whenever available, and 1-year follow-up for
benign lesions.
ROC curves for DBT and FFDM were determined
reader-by-reader. As two different values were available on
DBT for both conspicuity and BIRADS score, one for
slabs, another for slices, the BIRADS score associated with
the highest conspicuity value was used for ROC analysis,
assuming that higher conspicuity would mean higher
confidence in assessing the presence of lesions and the
probability of malignancy. If a reader assigned
the same conspicuity value to slabs and slices but different
BIRADS scores, the preferred volume scrolling mode
(slabs or slices) was used to decide which BIRADS score
was to be selected for the analysis. Overall assessment of
DBT and FFDM clinical performance was performed using
MRMC methodology.
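The selection rule described above (highest conspicuity wins; ties are broken by the reader's preferred scrolling mode) can be sketched as a small function. The dictionary layout and the name `select_dbt_birads` are hypothetical and only serve to make the rule explicit.

```python
def select_dbt_birads(slab, slice_, preferred):
    """Pick the BIRADS score used for ROC analysis from slab and slice ratings.

    slab, slice_: dicts with 'conspicuity' (1-5) and 'birads' (1-7).
    preferred: 'slabs' or 'slices', the reader's preferred scrolling mode.
    """
    if slab["conspicuity"] > slice_["conspicuity"]:
        return slab["birads"]
    if slice_["conspicuity"] > slab["conspicuity"]:
        return slice_["birads"]
    # Equal conspicuity: fall back on the preferred scrolling mode
    return slab["birads"] if preferred == "slabs" else slice_["birads"]

# Slices are more conspicuous, so the slice score is used
score_a = select_dbt_birads({"conspicuity": 2, "birads": 3},
                            {"conspicuity": 4, "birads": 5}, "slabs")
# Tie on conspicuity, so the preferred mode (slabs) decides
score_b = select_dbt_birads({"conspicuity": 3, "birads": 4},
                            {"conspicuity": 3, "birads": 6}, "slabs")
```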
Smoothed ROC curves were obtained from observed data
points by fitting statistical models. Three model choices were
available in the current software for analysing data from
MRMC studies (DBM MRMC version 2.2, http://xray.bsd.uchicago.edu.krl/) [32-34]. We chose the contaminated
binormal model [34], based on visual evaluation of how
close the smooth ROC curves are to the observed data points.
The overall comparison of clinical performance was
derived from the difference between the mean areas under
the ROC curves (AUCs) by means of analysis of variance
(ANOVA), taking into account all variability factors:
techniques, readers, cases, and interactions [35].
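For readers unfamiliar with AUC on an ordinal scale, the sketch below computes a nonparametric (Mann-Whitney) AUC from seven-step ratings. This is not the method used in the study, which fitted a contaminated binormal model with the DBM MRMC software; the empirical estimate is swapped in here only because it is short enough to show in full, and the ratings in the example are invented.

```python
import numpy as np

def empirical_auc(scores_pos, scores_neg):
    """Nonparametric AUC: P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float(((pos > neg) + 0.5 * (pos == neg)).mean())

# Invented ratings on the 1-7 scale for four cancers and four non-cancers
auc = empirical_auc([7, 6, 4, 3], [1, 2, 3, 5])
```

In an MRMC setting one such AUC would be obtained per reader and per technique, and the differences would then be analysed with a model that accounts for reader and case variability.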
Non-inferiority analysis was applied to AUCs, sensitivity
and specificity. The non-inferiority margin was delta=0.05.
A p value<0.05 was considered statistically significant.
Finally, only for breasts with lesions (malignant or
benign), conspicuity of FFDM and DBT were compared,
under the null hypothesis that lesions with DBT would be
at least as conspicuous as on FFDM (non-inferiority). A
choice between conspicuity associated with slabs and slices
was necessary for DBT images: the highest value between
slabs and slices was taken as DBT conspicuity, while the
BIRADS score associated with the highest conspicuity was
taken as DBT clinical assessment for ROC analysis, as
previously explained. Non-inferiority analysis of conspicuity
across readers was done using an Obuchowski-type model
[36] applied to malignant and benign lesions separately, and
to all lesions combined, after adjusting the degrees of
freedom according to Hillis's formula, to increase the
accuracy of the lower 95% confidence limit estimation [37].
Results
The study population was 200 patients; three of them were
excluded because of technical issues during image acquisition.
Eighteen other single breasts were excluded a
posteriori because they had scars from previous surgical
interventions that were over-rated in terms of BIRADS
scores, the readings having been blinded. The effective
dataset included 376 breasts, 63 of them with cancers, 177
with benign lesions, and 136 with no lesions. The analysis
was performed per breast, which means that only one
finding per breast was counted, even if during readings up
to three findings per breast were permitted. The finding
with the highest BIRADS score was taken for the analysis
in all cases; in two breasts with bi-focal lesions, only one of
these lesions was included in the analysis. Multiple
findings will be used in a per-lesion analysis, which will
be the subject of a future paper.

[Figure 2: paired images, panels FFDM MLO and DBT MLO]
Fig. 2 Clinical example of finding with higher conspicuity
with DBT versus FFDM

Figure 3 represents the mean ROC curves for DBT and
FFDM calculated by averaging the curves obtained from
the six readers with the two techniques for malignant
versus all other cases.
AUCs were 0.851 for DBT and 0.836 for FFDM,
respectively, a difference of 0.014 (p value
0.645). The 95% confidence interval (CI) for the technique
difference was between -0.049 and +0.078, leading to
the conclusion that DBT in one view is non-inferior
to FFDM in two views within a 5% non-inferiority
margin.
Table 2 lists, for each reader and overall, the values of
areas under FFDM and DBT ROC curves, the difference
between the two areas, and the p values calculated for the
difference.
The same type of ROC analysis was repeated considering
breasts with any lesions (malignant and benign) versus
normal breasts. Overall AUCs became 0.841 for DBT and
0.832 for FFDM, with a difference lower than 0.01 (p value=0.704).
Once again the 95% confidence interval for the AUC
difference (-0.03764, +0.05541) confirms the non-inferiority
of DBT to FFDM using a 5% non-inferiority margin.
Figure 4 shows the plot of the mean difference between
DBT and FFDM areas under ROC curves with the
corresponding 95% confidence intervals for the two overall
analyses performed taking as positive cases only breasts
with malignant lesions (difference=0.0143; lower 95% confidence
limit=-0.0492; upper 95% confidence limit=+0.0778; circular symbol) versus
breasts with any type of lesions (difference=0.0089;
lower 95% confidence limit=-0.0376; upper 95% confidence limit=+0.0554; square symbol).
The lower limit of the 95% confidence interval is
above the non-inferiority margin (-0.05), providing a
graphical demonstration of DBT non-inferiority.
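The non-inferiority criterion amounts to a single comparison: the lower 95% confidence limit of the AUC difference (DBT minus FFDM) must lie above the negative of the margin. A minimal sketch, using the lower confidence limits reported for the two analyses (-0.0492 and -0.0376) and the margin of 0.05; the function name `is_non_inferior` is illustrative only.

```python
def is_non_inferior(ci_lower, margin=0.05):
    """True when the lower 95% confidence limit of (DBT - FFDM) exceeds -margin."""
    return ci_lower > -margin

# Lower confidence limits of the AUC difference reported for the two analyses
malignant_vs_rest = is_non_inferior(-0.0492)   # malignant versus all other breasts
lesion_vs_normal = is_non_inferior(-0.0376)    # any lesion versus normal breasts
```

Note how close the first limit is to the margin: a lower limit of -0.0492 clears -0.05 by less than 0.001.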
Table 3 reports the sensitivity and specificity values for
each reader, as synthetic diagnostic accuracy indices. The
threshold to separate between true/false-positive findings
and true/false-negative findings was set between BIRADS
3 and 4A. This means that a cancer that was classified by a
reader as a BIRADS grade equal to or greater than 4 was
counted as a true positive (TP) or as a false negative (FN) if
it was rated as BIRADS 3 or lower; conversely, a benign
lesion rated BIRADS 3 or lower was counted as true
negative (TN), or as false positive (FP) if rated higher than
BIRADS 3.
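The dichotomisation rule can be made concrete with a short sketch. On the seven-step scale used for the readings, a score of 4 corresponds to BIRADS 4A, so a threshold of 4 separates positive from negative calls. The function name and the example ratings are invented, and treating every non-cancer breast as a negative case for specificity is an assumption of this sketch.

```python
def sensitivity_specificity(ratings, is_cancer, threshold=4):
    """ratings: seven-step scores (1-7, where 4 = BIRADS 4A); is_cancer: booleans."""
    pairs = list(zip(ratings, is_cancer))
    tp = sum(1 for r, c in pairs if c and r >= threshold)      # cancer called positive
    fn = sum(1 for r, c in pairs if c and r < threshold)       # cancer called negative
    tn = sum(1 for r, c in pairs if not c and r < threshold)   # non-cancer called negative
    fp = sum(1 for r, c in pairs if not c and r >= threshold)  # non-cancer called positive
    return tp / (tp + fn), tn / (tn + fp)

# Invented example: four cancers and four non-cancers
ratings = [7, 5, 3, 6, 2, 4, 1, 3]
is_cancer = [True, True, True, True, False, False, False, False]
sens, spec = sensitivity_specificity(ratings, is_cancer)
```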
Analysis of variance (ANOVA), according to the
Obuchowski-Rockette model [36], was applied to sensi-
tivity to estimate whether the difference between the two
techniques was significant or not. Results showed that there
was a 4.5% decrease in sensitivity for DBT compared with
FFDM, which was not statistically significant (p=0.32;
95% CI: 15% decrease to 6% increase). The same ANOVA
model was used for specificity analysis, finding similarly
that the 4.0% increase for DBT was not significant
(p=0.10; 95% CI: 1% decrease to 9% increase).
Conspicuity analysis was performed for all breasts with
proven lesions, i.e. 240 breasts. Table 4 summarises the
average proportions of conspicuity ratings that were higher
with DBT than FFDM, equal with DBT and FFDM, and
lower with DBT than FFDM, within malignant lesions,
benign lesions, and a combination of the two, showing that
for both malignant and benign lesions DBT conspicuity is
at least as good as that for FFDM in most cases.
On average, lesion conspicuity with DBT was at least as
good as lesion conspicuity with FFDM in 79.4% of cases
(95% lower confidence bound from ANOVA=74.8%).
Similar results were obtained for conspicuity of malignant
lesions and for conspicuity of benign lesions separately.
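The conspicuity comparison reduces to counting, for paired ratings of the same lesions, how often DBT was rated higher than, equal to, or lower than FFDM. The sketch below does this for a single hypothetical reader; the averaging across readers and the ANOVA-based confidence bound used in the study are not reproduced, and the ratings are invented.

```python
def conspicuity_proportions(dbt, ffdm):
    """Proportions of paired conspicuity ratings (1-5) by direction of difference."""
    n = len(dbt)
    higher = sum(d > f for d, f in zip(dbt, ffdm)) / n
    equal = sum(d == f for d, f in zip(dbt, ffdm)) / n
    lower = sum(d < f for d, f in zip(dbt, ffdm)) / n
    return {"higher": higher, "equal": equal, "lower": lower,
            "at_least_as_good": higher + equal}

# Invented ratings for five lesions read with both techniques
props = conspicuity_proportions([4, 3, 5, 2, 3], [3, 3, 4, 3, 3])
```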
Discussion
Image interpretation was performed per breast, not per
patient. This choice was related to the inclusion criteria,
requiring that patients enrolled in the study would have at
least one breast lesion previously classified as BIRADS
≥3. As the readers were aware of these criteria, bilateral
interpretation of FFDM and DBT images could have
produced distorted ROC results, because the readers
would have forced a sort of lesion searching process.
Interpreting left and right breasts of the same patient
separately, approximately 50% of the cases were normal
breasts, which made the clinical evaluation from the
readers more realistic, as performed on breasts with and
without lesions. This makes it harder to recognise
asymmetries, but the increased difficulty applies to both
FFDM and DBT, such that the comparison between them is
still valid.

[Figure 3: ROC plot, sensitivity versus 1-specificity; curves: DBT, FFDM]
Fig. 3 Overall ROC curves (BIRADS scores) averaged over six
readers for DBT (solid line) and FFDM (dashed line). Breasts with
malignant lesions versus all other breasts (with benign lesions and
without lesions)

Whenever a new technology is compared with one that is
already used clinically, experience gives an advantage to
the latter. Training in the new technology helps to ensure
that the readers' learning curves reach their plateau before
interpreting study cases. Image evaluation by the six
radiologists took several months and data analysis was
conducted in subsequent steps, and no trend in the
individual clinical performance was noticed which could
be related to a potential dependence on their DBT learning
curve. However, some possible disadvantages of DBT,
like the limited experience of breast radiologists with
tomosynthesis images compared with their long-term
experience with mammography, or the use of prototype
DBT equipment which is not fully optimised in both the
acquisition setting and the display tools, may have
limited the results for DBT in this study.
ROC analysis has shown that the clinical performance of
DBT and FFDM was similar in the study population.
AUCs were slightly higher for DBT versus FFDM for four
out of six readers, and slightly lower for two of them, but
the difference between the two AUCs in favour of DBT
was significant only for Reader A, as reported in Table 2.
The overall analysis confirmed that the mean diagnostic
accuracy of DBT in one view (MLO) and that of FFDM in
two views (CC and MLO) were not significantly different.
ROC calculation using both malignant and benign
lesions as positive cases and only normal breasts (with
no lesions) as negative cases, allows the impact of false
positives on the two techniques to be enhanced. The
difference between DBT and FFDM AUCs recalculated for
all lesions versus normal breasts was still non-significant.
Nevertheless, the reduction in FFDM mean area compared
with that obtained from the analysis performed considering
malignant lesions versus all other breasts (benign and
normal) is definitely smaller than the reduction in DBT
mean area. This would suggest that DBT could better allow
radiologists to discriminate between malignant and benign
findings.
Every time radiologists face the issue of assessing
clinical performance provided with a new imaging tech-
nique, non-inferiority studies are designed in order to begin
answering the question of whether the new technique is
equivalent to the existing technique, used as a reference.
However, the application of a statistical test to state
whether the difference between the new and existing
technique is statistically significant or not, is insufficient to
conclude that the two technologies are diagnostically
equivalent [38]. Our results illustrated by Fig. 4 show DBT
non-inferiority compared with FFDM, which means that if
other similar ROC experiments were repeated with new
cases and readers, the average AUC for DBT will generally
be higher than the average AUC for FFDM, and we are
95% confident that the mean AUC for DBT will not be
inferior by more than 0.05 to the mean AUC for FFDM.
Despite non-inferiority being insufficient to propose the
replacement of mammography with DBT, the results
obtained from this clinical study are encouraging if it is
considered that the diagnostic comparison was conducted
using tomosynthesis in one view versus digital mammog-
raphy in two views. The eventual reduction in the number
of views allowed by DBT (one versus two for FFDM)
could be positive from the patient's point of view, reducing
the number of compressions per breast, and potentially
reducing the total examination dose, both probably relevant
in screening programs. In this study, the technique factors
(anode/filter combination, kVp, mAs) to be used for DBT
sequence acquisition were selected such that the total DBT
dose was equal to or less than the AGD delivered for a
standard screen-film mammography examination in two
views. It has been shown that some FFDM systems may
reduce AGD versus screen-film [39-41]. In the future,
nothing should prevent DBT examinations at dose levels
comparable with FFDM.

[Figure 4: mean AUC difference (DBT-FFDM) with lower and upper 95% confidence limits for two definitions of positive cases (Malignant; Malignant + Benign); regions labelled "DBT inferior" and "DBT non-inferior"]
Fig. 4 Non-inferiority plot showing the mean AUC difference
between overall DBT and FFDM with 95% confidence interval.
The left part represents the AUC difference and confidence interval
for ROC curves calculated assuming that positive cases were only
breasts with cancers; the right part is the difference recalculated with
the same dataset but taking as positive cases all the breasts with
lesions

Table 2 Comparison of AUCs for FFDM and DBT reader-by-reader and overall: per-breast analysis, malignant versus all other breasts
Reader ID | AUC DBT | AUC FFDM | AUC difference (DBT-FFDM) | p value (95% CI)
A | 0.872 | 0.800 | 0.076 | 0.033
B | 0.803 | 0.865 | -0.062 | 0.115
C | 0.830 | 0.802 | 0.027 | 0.645
D | 0.900 | 0.876 | 0.024 | 0.407
E | 0.871 | 0.819 | 0.052 | 0.149
F | 0.829 | 0.860 | -0.031 | 0.369
Overall | 0.851 | 0.836 | 0.014 | 0.645

In any case, non-inferiority demonstration is the first step
in accepting a new technique, but superior clinical
performance or a reduced dose would be necessary to
ground the expected benefits of breast tomosynthesis
compared with the reference, digital mammography. As
recently summarised very well by JT Dobbins III [42], the
application of tomosynthesis in breast imaging requires
answers to several questions: the most important is to
determine whether DBT would be more useful in the
screening or diagnostic environment, or both. Our experience
with DBT, limited to this clinical trial, indicates that, at
present, the main obstacle for using tomosynthesis in
screening is the review time, whilst in a diagnostic environ-
ment it would complete clinical information together with
other less expensive imaging techniques. It is clear that,
besides the optimistic expectations from physics studies and
computer simulations, some type of real diagnostic
advantage should be demonstrated by unbiased clinical trials
testing DBT superiority before replacing mammography.
Currently, tomosynthesis cannot yet be considered a mature
technology, and the fast evolution of acquisition protocols,
reconstruction algorithms, CAD applications and possibly
contrast-enhancement applications can only be imagined.
Because of the basic principle of anatomical noise
removal by tomosynthesis, an increase in both sensitivity
and specificity is expected [3, 12]. However, in this study
mean sensitivity and specificity were found to be
comparable for DBT and FFDM. This could be explained
by inter-reader variability which affects clinical decisions
with both tomosynthesis and mammography images and
produces extremely variable information. Moreover, the
study population contained 63 breasts with cancers against
313 breasts with benign lesions or with no lesion at all;
studies with larger sample sizes could perhaps better
address the sensitivity expectation.
Conspicuity analysis showed that proportions of
lesions classified by radiologists with DBT conspicuity
higher than or equal to that of FFDM were more prevalent
than proportions of lesions with DBT conspicuity rated
lower than that of FFDM. This was confirmed for both
malignant and benign lesions, and for each of the six
readers, as reported in Table 4. Despite this promising
result, it should be noticed that the categorisation of lesion
conspicuity is not defined precisely, and a certain level of
subjectivity in the interpretation of the conspicuity concept
should be accepted.
The trend towards increasing lesion conspicuity from
DBT is consistent with the work from Gong et al. [12], which
demonstrates by computer simulation that tomosynthesis
significantly increases confidence in lesion presence, and the
study from Poplack et al. [25], which concludes that
tomosynthesis has superior image quality compared with
screen/film mammography. However, although the potential
of DBT to improve image quality/lesion conspicuity
compared with 2D imaging was confirmed by this study,
such gain was not translated into better diagnostic
performance. Within the diagnostic process, conspicuity is
probably more directly related to the lesion detection phase
than to the decision making. Even if a better lesion depiction
should in principle increase the radiologist's confidence in
decision making, this might not ensure that the decision taken
is correct.

Table 3 Sensitivity and specificity for each reader with DBT and FFDM
Reader ID | Sensitivity DBT | Sensitivity FFDM | Specificity DBT | Specificity FFDM
A | 68.3% | 61.9% | 94.2% | 93.6%
B | 63.5% | 84.1% | 86.9% | 75.7%
C | 65.1% | 68.3% | 86.5% | 83.4%
D | 81.0% | 82.5% | 92.3% | 83.4%
E | 66.7% | 65.1% | 92.3% | 91.4%
F | 74.6% | 84.1% | 80.4% | 81.5%
Mean | 69.8% | 74.3% | 88.9% | 84.8%

Table 4 Average proportions of cases with DBT lesion conspicuity classified higher than, equal to or lower than FFDM (rows) calculated
for malignant lesions, benign lesions, and a combination of the two
Mean conspicuity | Malignant | Benign | Malignant + benign
DBT>FFDM | 35.4% | 38.7% | 37.8%
DBT=FFDM | 46.0% | 39.9% | 41.5%
DBT<FFDM | 18.5% | 21.4% | 20.6%

Conclusions
Lesion conspicuity increases with DBT compared with
FFDM, which represents a positive achievement for the
radiologist, as higher conspicuity may provide radiologists
with more confidence in making clinical decisions.
Nevertheless, the increase in conspicuity did not allow a
measurable improvement of diagnostic performance.
The study has demonstrated that clinical performance of
tomosynthesis in one view at the same dose as standard
screen-film mammography is non-inferior to digital mammography
in two views.
Acknowledgements The authors would like to thank Luc Katz,
Francesca Braga, Henri Souchay, Razvan Iordache, and Sylvain
Bernard from GE Healthcare for helpful discussion and scientific
debate, and Lorenzo Pesce from the University of Chicago for his
support on ROC fitting models.
A. Toledano (statistician) is a consultant for GE Healthcare.
References
1. Niklason LT, Christian BT, Niklason LE, Kopans DB, Castleberry DE, Opsahl-Ong BH, Landberg CE, Slanetz PJ, Giardino AA, Moore R, Albagli D, DeJoule MC, Fitzgerald PF, Fobare DF, Giambattista BW, Kwasnick RF, Liu J, Lubowski SJ, Possin GE, Richotte JF, Wei C-Y, Wirth RF (1997) Digital tomosynthesis in breast imaging. Radiology 205:399–406
2. Dobbins JT III, Godfrey DJ (2003) Digital x-ray tomosynthesis: current state of the art and clinical potential. Phys Med Biol 48:R65–R106
3. Park JM, Franken EA Jr, Garg M, Fajardo LL, Niklason LT (2007) Breast tomosynthesis: present considerations and future applications. Radiographics (Suppl 1):S231–S240
4. Rafferty E (2007) Digital mammography: novel applications. Radiol Clin N Am 45:831–843
5. van Tiggelen R (2002) In search for the third dimension: from radiostereoscopy to three-dimensional imaging. JBR-BTR 85:266–270
6. Mahesh M (2004) Digital mammography: an overview. Radiographics 24:1747–1760
7. Spahn M (2005) Flat detectors and their clinical applications. Eur Radiol 15:1934–1947
8. Pisano ED, Yaffe MJ (2005) Digital mammography. Radiology 234:353–362
9. Wu T, Liu B, Moore R, Kopans D (2006) Optimal acquisition techniques for digital breast tomosynthesis screening. In: Flynn MJ, Hsieh J (eds) Medical imaging 2006: physics of medical imaging. Proceedings of SPIE 2006 6142:61425-E
10. Sechopoulos I, Suryanarayanan S, Vedantham S, D'Orsi C, Karellas A (2007) Computation of the glandular radiation dose in digital tomosynthesis of the breast. Med Phys 34:221–232
11. Ma AKW, Darambara DG, Stewart A, Gunn S, Bullard E (2008) Mean glandular dose estimation using MCNPX for a digital breast tomosynthesis system with tungsten/aluminum and tungsten/aluminum + silver x-ray anode/filter combination. Med Phys 35:5278–5289
12. Gong X, Glick SJ, Liu B, Vedula AA, Thacker S (2006) A computer simulation study comparing lesion detection accuracy with digital mammography, breast tomosynthesis and cone-beam CT breast-imaging. Med Phys 33:1041–1052
13. Zhao B, Zhao W (2008) Three-dimensional linear system analysis for breast tomosynthesis. Med Phys 35:5219–5232
14. Zhou J, Zhao B, Zhao W (2007) A computer simulation platform for the optimization of a breast tomosynthesis system. Med Phys 34:1098–1109
15. Chawla AS, Samei E, Saunders RS, Lo JY, Baker JA (2008) A mathematical model platform for optimizing a multiprojection breast imaging system. Med Phys 35:1337–1345
16. Wang X, Mainprize JG, Kempston MP, Mawdsley GE, Yaffe MJ (2007) Digital breast tomosynthesis geometry calibration. In: Flynn MJ, Hsieh J (eds) Medical imaging 2007: physics of medical imaging. Proceedings of SPIE 2007 6510:65103B
17. Sechopoulos I, Suryanarayanan S, Vedantham S, D'Orsi C, Karellas A (2007) Scatter radiation in digital tomosynthesis of the breast. Med Phys 34:564–576
18. Wu T, Moore RH, Rafferty EA, Kopans DB (2004) A comparison of reconstruction algorithms for breast tomosynthesis. Med Phys 31:2636–2647
19. Wu T, Moore RH, Kopans DB (2006) Voting strategy for artifact reduction in digital breast tomosynthesis. Med Phys 33:1461–1471
20. Zhang Y, Chan H-P, Sahiner B, Wei J, Goodsitt MM, Hadjiiski LM, Ge J, Zhou C (2006) A comparative study of limited angle cone-beam reconstruction methods for breast tomosynthesis. Med Phys 33:3781–3795
21. Chan H-P, Sahiner B, Rafferty EA, Wu T, Roubidoux MA, Moore RH, Kopans DB, Hadjiiski LM, Helvie MA (2005) Computer-aided detection system for breast masses on digital tomosynthesis mammograms: preliminary experience. Radiology 237:1075–1080
22. Reiser I, Nishikawa RM, Giger ML, Wu T, Rafferty EA, Moore R, Kopans DB (2006) Computerized mass detection for digital breast tomosynthesis directly from projection images. Med Phys 33:482–491
23. Chan H-P, Wei J, Zhang Y, Helvie MA, Moore RH, Sahiner B, Hadjiiski LM, Kopans DB (2008) Computer-aided detection of masses in digital tomosynthesis mammography: comparison of three approaches. Med Phys 35:4087–4095
24. Reiser I, Nishikawa RM, Edwards AV, Kopans DB, Schmidt RA, Papaioannou J, Moore RH (2008) Automated detection of microcalcification clusters for digital breast tomosynthesis using projection data only: a preliminary study. Med Phys 35:1486–1493
25. Poplack SP, Tosteson TD, Kogel CA, Nagy HM (2007) Digital breast tomosynthesis: initial experience in 98 women with abnormal digital screening mammography. AJR Am J Roentgenol 189:616–623
26. Good WF, Abrams GS, Catullo VJ, Chough DM, Ganott MA, Hakim CM, Gur D (2008) Digital breast tomosynthesis: a pilot observer study. AJR Am J Roentgenol 190:865–869
27. Andersson I, Ikeda DM, Zackrisson S, Ruschin M, Svahn T, Timberg P, Tingberg A (2008) Breast tomosynthesis and digital mammography: a comparison of breast cancer visibility and BIRADS classification in a population of cancers with subtle mammographic findings. Eur Radiol 18:2817–2825
28. Smith AP, Rafferty EA, Niklason L (2008) Clinical performance of breast tomosynthesis as a function of radiologist experience level. LNCS 5116:61–66
29. van Engen R, van Woudenberg S, Bosmans H, Young K, Thijssen M (2006) European protocol for the quality control of the physical aspects of mammography screening: screen-film mammography. In: European Guidelines for Quality Assurance in Breast Cancer Screening and Diagnosis, 4th edn. European Commission, Luxembourg, pp 61–104
30. Dance DR, Skinner CL, Young KC, Beckett JR, Kotre CJ (2000) Additional factors for the estimation of mean glandular breast dose using the UK mammography dosimetry protocol. Phys Med Biol 45:3225–3240
31. American College of Radiology (ACR) (2003) Breast Imaging Reporting and Data System Atlas (BI-RADS Atlas). American College of Radiology, Reston
32. Metz CE, Pan X (1999) Proper binormal ROC curves: theory and maximum-likelihood estimation. J Math Psychol 43:1–33
33. Pesce LL, Metz CE (2007) Reliable and computationally efficient maximum-likelihood estimation of proper binormal ROC curves. Acad Radiol 14:814–829
34. Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: part II. A formal model. Acad Radiol 7:427–437
35. Obuchowski NA (2007) New methodological tools for multiple-reader ROC studies. Radiology 243:10–12
36. Obuchowski NA (1995) Multireader, multimodality receiver operating characteristic curve studies: hypothesis testing and sample size estimation using an analysis of variance approach with dependent observations. Acad Radiol 2:S22–S29
37. Hillis SL (2007) A comparison of denominator degrees of freedom methods for multiple observer ROC analysis. Stat Med 26:596–619
38. Obuchowski NA (1997) Testing for equivalence of diagnostic tests. AJR Am J Roentgenol 168:13–17
39. Gennaro G, di Maggio C (2006) Dose comparison between screen/film and full-field digital mammography. Eur Radiol 16:2559–2566
40. Samei E, Saunders RS, Baker JA, Delong DM (2007) Digital mammography: effects of reduced radiation dose on diagnostic performance. Radiology 243:396–404
41. Svahn T, Hemdal B, Ruschin M, Chakraborty DP, Andersson I, Tingberg A, Mattsson S (2007) Dose reduction and its influence on diagnostic accuracy and radiation risk in digital mammography: an observer performance study using an anthropomorphic breast phantom. Br J Radiol 80:557–562
42. Dobbins JT III (2009) Tomosynthesis imaging: at a translational crossroads. Med Phys 36:1956–1967