Gianluca Baio · Andrea Berardi · Anna Heath

Bayesian Cost-Effectiveness Analysis with the R package BCEA

Use R!
Series Editors: Robert Gentleman, Kurt Hornik, Giovanni Parmigiani
Gianluca Baio
Department of Statistical Science
University College London
London, UK

Anna Heath
Department of Statistical Science
University College London
London, UK

Andrea Berardi
Department of Statistics
University of Milano-Bicocca
Milan, Italy
Preface

This book originates from the work that we have done, at different times and in
different capacities, in the area of statistical modelling for health economic evaluation.
In our view, this is a very interesting and exciting area for statisticians:
despite the strong connotation derived from its name, health economic evaluation is
just as much (if not more!) about statistics as it is about healthcare or economics.
Statistical modelling is a fundamental part of any such evaluation and as models
and the data that are used to populate them become bigger, more complex and
representative of a complicated underlying reality, so do the skills required by a
modeller.
Broadly speaking, the objective of publicly funded healthcare systems (such as
the UK’s) is to maximise health gains across the general population, given finite
monetary resources and a limited budget. Bodies such as the National Institute for
Health and Care Excellence (NICE) provide guidance on decision-making on the
basis of health economic evaluation. This covers a suite of analytical approaches
(usually termed “cost-effectiveness analysis”) for combining costs and conse-
quences of intervention(s) compared to a control, the purpose of which is to aid
decision-making associated with resource allocation. To this aim, much of the
recent research has been oriented towards building the health economic evaluation
on sound and advanced statistical decision-theoretic foundations.
Historically, cost-effectiveness analysis has been based on modelling often
performed in specialised commercial packages (such as TreeAge) or, even more
frequently, spreadsheet calculators (almost invariably Microsoft Excel). The
“party-line” for why this is the case is that these are “easy to use, familiar, readily
available and easy to share with stakeholders and clients”. Possibly, in addition to
these, another crucial factor for the wide popularity of these tools is the fact that
often modellers are not statisticians by training (and thus less familiar with
general-purpose statistical packages such as SAS, Stata or R). Even more inter-
estingly, it is often the case that cost-effectiveness models are based on existing
templates (usually developed as Excel spreadsheets, for example for a specific
country or drug) and then “adapted” to the situation at hand.
Luckily, we are not alone (although perhaps not in the majority) in arguing that
many of these perceived advantages require a serious rethink. In our view, there are
several limitations to the current state of modelling in health economics: firstly, the
process often implies a separation of the different steps required for the evaluation.
This potentially increases the risk of human errors and confusion, because the
results of the intermediate steps (e.g. the statistical analysis of data collected in a
randomised trial) are usually copied and pasted in Excel to populate cells and
formulae (see for instance our discussion in Sects. 1.4 and 4.2). Secondly, in an Excel
file calculations are usually spread over several sheets that are linked by formulae or
cross references. While in the case of simple models this is actually a neat way of
structuring the work, it can become unwieldy and difficult to track modifications for
more complex models, based on a combination of different datasets and thus
analyses (which of course is increasingly the norm!).
The idea of the R package BCEA evolved naturally from the need to replicate
some types of analyses when post-processing the output of the models we were
developing in our applied work, while overcoming the limitations of the “standard”
workflow based on spreadsheets. It felt natural to make the effort of systematising
the functions we were using to do standard analyses and as we started doing so, we
realised that there was much potential and interesting work to be done. The main
objective of this book is to aid statisticians and modellers in health economics with
the “easier” part of the process—making sense of their model results and helping them
reproduce the analysis that is, more or less, ubiquitous in the relevant output (be it a
research paper, or a dossier to be submitted to a regulatory agency such as NICE).
To this aim, the book is structured as follows. First, in Chap. 1, we introduce the
main concepts underlying the Bayesian approach and the basics of health economic
evaluation, with particular reference to the relevant statistical modelling. Again,
linking the two is natural to us as we are of a very strong Bayesian persuasion. In
addition to this, however, it is interesting to note that Bayesian methods are
extremely popular in this area of research, since they are particularly useful in
modelling composite sources of information (often termed “evidence synthesis”)
and effectively underlie the important concept of Probabilistic Sensitivity Analysis
(PSA, see for instance Chap. 4).
Chapter 2 presents the two case studies we use throughout the book. In par-
ticular, we introduce the statistical modelling and notation, describe the whole
process of running the analysis and obtaining the relevant output (in the form of
posterior distributions) and then the extra modelling required to compute the
quantities of interest for the economic analysis. This process is performed under a
fully Bayesian approach and is based on a combination of R and BUGS/JAGS, the de
facto standard software to perform Markov Chain Monte Carlo analysis.
Chapter 3 introduces the R package BCEA and its basic functionalities by means
of the two running examples. The very nature of BCEA is to follow a full Bayesian
analysis of the statistical model used to estimate the economic quantities required
for the cost-effectiveness analysis, but we make here (and later in the book) the
point that it can also be used in the case where the modelling is done using
1 Bayesian Analysis in Health Economics

1.1 Introduction
Modelling for the economic evaluation of healthcare data has received much attention
in both the health economics and the statistical literature in recent years [1, 2],
increasingly often under a Bayesian statistical approach [3–6].
Generally speaking, health economic evaluation aims to compare the economic
performance of two or more alternative health interventions. In other words, the
objective is the evaluation of a multivariate outcome that jointly accounts for some
specified clinical benefits or consequences and the resulting costs. From the statis-
tical point of view, this is an interesting problem because of the generally complex
structure of relationships linking the two outcomes. In addition, simplifying assump-
tions, such as (bivariate) normality of the underlying distributions, are usually not
granted (we return to this point later).
In this context, the application of Bayesian methods in health economics is par-
ticularly helpful because of several reasons:
• Bayesian modelling is naturally embedded in the wider scheme of decision theory;
ultimately, health economic evaluations are performed to determine the optimal
course of actions in the face of uncertainty about the future outcomes of a given
intervention, both in terms of clinical benefits and the associated costs.
• Bayesian methods allow extreme flexibility in modelling, especially since the
application of revolutionary computational methods such as Markov Chain Monte
Carlo has become widespread. This is particularly relevant when the economic
evaluation is performed by combining different data into a comprehensive decision
model.
• Sensitivity analysis can be performed in a straightforward way under the Bayesian
approach and can be seen as a by-product of the modelling strategy. This is
extremely helpful in health economics, as decisions are often made on the basis of
limited evidence. For this reason, it is essential to understand the impact of model
and parameter uncertainty on the final outputs.
This chapter is broadly divided into two main parts. The first one introduces the
main aspects of Bayesian inference (in Sect. 1.2). First, in Sect. 1.2.1 we introduce the
main ideas underlying the Bayesian philosophy. We then present the most important
practical issues in modelling in Sect. 1.2.2 and computation in Sect. 1.2.3.
The second part of the chapter presents the basics of health economic evaluation
in Sect. 1.3; the practical aspects of this process are then covered in Sect. 1.4,
which introduces health economic evaluation from the Bayesian point of view.
1.2 Bayesian Inference and Computation

In this section we briefly review the main features of the Bayesian approach to
statistical inference, as well as the basics of Bayesian computation. A detailed presentation
of these subtle and important topics is outside the scope of this book and therefore
we only briefly sketch them here and refer the reader to [5, 7–14].
A Bayesian model specifies a full probability distribution to describe uncertainty.
This applies to data, which are subject to sampling variability, as well as to parameters
(or hypotheses), which are typically unobservable and thus are subject to epistemic
uncertainty (e.g. the experimenter’s imperfect knowledge about their value) and even
future, yet unobserved realisations of the observable variables (data) [14].
As a consequence, probability is used in the Bayesian framework to assess any
form of imperfect information or knowledge. Thus, before even seeing the data,
the experimenter needs to identify a suitable probability distribution to describe the
overall uncertainty about the data y and the parameters θ. We generally indicate this
as p(y, θ).
By the basic rules of probability, it is always possible to factorise a joint distrib-
ution as the product of a marginal and a conditional distribution. For instance, one
could re-write p(y, θ) as the product of the marginal distribution for the parameters
p(θ) and the conditional distribution for the data, given the parameters p(y|θ). But
in exactly the same fashion, one could also re-express the joint distribution as the
product of the marginal distribution for the data p(y) and the conditional distribution
for the parameters given the data p(θ|y).
Consequently,

p(y, θ) = p(θ)p(y|θ) = p(y)p(θ|y)

and therefore

p(θ|y) = p(θ)p(y|θ) / p(y).   (1.1)
Looking at (1.1) and (1.2) it should be clear that, in a Bayesian analysis, the
objective is to evaluate the level of uncertainty on some target quantity (be it the
unobservable parameter θ or the unobserved variable ỹ), given the inputs, i.e. the
observed data y and the set of assumptions that define the model in question.
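The mechanics of (1.1) can be seen with a tiny discrete example (an illustrative Python sketch; the prior and likelihood values are hypothetical, chosen only to make the arithmetic visible):

```python
# Discrete illustration of Bayes theorem (1.1): theta takes just two values.
prior = {"theta1": 0.5, "theta2": 0.5}        # p(theta), hypothetical
likelihood = {"theta1": 0.8, "theta2": 0.2}   # p(y | theta) for the observed y

# p(y): the marginal (denominator) in Bayes theorem
p_y = sum(prior[t] * likelihood[t] for t in prior)

# p(theta | y) = p(theta) p(y | theta) / p(y)
posterior = {t: prior[t] * likelihood[t] / p_y for t in prior}
print(posterior)   # {'theta1': 0.8, 'theta2': 0.2}
```

The same three-line computation (prior times likelihood, renormalised) is what any Bayesian machinery does, however many parameter values are involved.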
Crucially, the Bayesian approach correctly recognises that, once the data have
been observed, there is no uncertainty left on their realised value. Unlike classical
statistical methods, what values might have been observed and how likely they might
have been under the specified model are totally irrelevant concepts.1
In a Bayesian context, inference directly involves the quantities that are not
observed—and again there is no distinction in the quantification of the uncer-
tainty about their value depending on their nature (i.e. parameters or future data). A
Bayesian analysis aims at revising the level of uncertainty in light of the evidence
that has become available: if data y are observed, we move from a prior to a posterior
state of knowledge/uncertainty.
When conducting a real-life Bayesian analysis, one has to think carefully about the
model used to represent not only sampling variability for the observable data, but
also the relevant parameters.2
We note here that, in some sense, modelling the data is in general an “easier” or,
maybe, less controversial task, perhaps because data will eventually be observed,
so that model fit can be assessed, at least to some extent. On the other hand, some
people feel uncomfortable in defining a model for the parameters, which represent
quantities that we will never be in a position of actually observing.

2 …core” Bayesian approach, parameters are just convenient mathematical abstractions that simplify
the modelling for an observable variable y—see for example [6] and the references therein.
In our view, in principle this latter task is not much different or harder than the
former. It certainly requires an extra modelling effort; and it certainly has the potential
to exert notable impact on the final results. However, by virtue of the whole modelling
structure, it has also the characteristic of being extremely explicit. Prior distributions
for the parameters cannot be hidden; and if they are, the model is very easy to discard
as non-scientific.
Nevertheless, the problem of how we should specify the priors for the parameters
does play a fundamental role in the construction of a Bayesian model. Technically,
a few considerations should drive this choice.
What do parameters mean (if anything)?
First, parameters have often some natural or implied physical meaning. For example,
consider data for y, the number of patients who experience some clinical outcome
out of a sample of n individuals. A model y|θ ∼ Binomial(θ, n) can be considered,
in which the parameter θ indicates the “population probability of experiencing the
outcome”, or, in other words, the probability that an individual randomly selected
from a relevant population will experience the outcome in question.
In this case, it is relatively simple to give the parameter a clear interpretation and
derive some physical properties—for instance, because θ is a probability, it should
range between 0 and 1 and be a continuous quantity. We can use this information to
find a suitable probability model, much as we have done when choosing the Binomial
distribution for the observed data.
One possibility is the Beta distribution θ ∼ Beta(α, β)—this is a continuous prob-
ability model with range in [0; 1] and upon varying the values of its parameters (α, β)
it can take on several shapes (e.g. skewed towards either end of the range, or symmet-
rical). In general, by setting suitable values for the parameters of a prior distribution,
it is possible to encode prior knowledge (as we demonstrate in the next sections).
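For instance, a short sketch (in Python with scipy, used here purely for illustration; the (α, β) values are hypothetical) of how the parameters control the location and spread of a Beta prior:

```python
from scipy.stats import beta

# Hypothetical (alpha, beta) choices and the prior beliefs they encode:
# (1, 1) is flat; (2, 8) is skewed towards small theta; (8, 2) towards large theta.
for a, b in [(1, 1), (2, 8), (8, 2)]:
    d = beta(a, b)
    print(a, b, round(d.mean(), 3), round(d.std(), 3))
```

The prior mean is α/(α + β), so swapping α and β mirrors the distribution around 0.5, while increasing both parameters together concentrates the mass.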
Of course, the Beta is only one particular possibility and others exist. For example,
one could first construct the transformation
φ = g(θ) = logit(θ) = log[θ/(1 − θ)],
which rescales θ in the range (−∞; ∞) and stretches the distribution to get an approx-
imate symmetrical shape. Then, it will be reasonable to model φ ∼ Normal(μ, σ).
Because φ = g(θ), then the prior on φ will imply a prior on the inverse transforma-
tion g −1 (φ) = θ—although technically this can be hard to derive analytically as it
may require complex computations. We return to this point later in this section.
1.2 Bayesian Inference and Computation 5
Fig. 1.1 A graphical representation of an informative prior based on a Beta distribution (the blue
continuous line) or on a logit-Normal distribution (represented by the histogram); the x-axis shows
the probability of success. The two different models effectively encode exactly the same level of
prior information
scale, so care would be needed to identify the correct values. For example, if we used
μ = logit(0.4) = −0.41 and σ = 0.413,3 this would induce a prior distribution on
φ that effectively represents the same information on the natural scale of θ.
The histogram in Fig. 1.1 depicts the logit-Normal prior on θ (i.e. the rescaled
version of the Normal prior distribution on φ)—as it is possible to see, this is virtually
identical with the Beta prior described above.
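This equivalence is easy to check numerically (an illustrative Python sketch using scipy; the values μ = logit(0.4) and σ = 0.413 are those given above):

```python
import numpy as np
from scipy.stats import norm

def logit(p):
    return np.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

mu, sigma = logit(0.4), 0.413
# 97.5th percentile of phi ~ Normal(mu, sigma), mapped back to the theta scale
upper = float(inv_logit(norm.ppf(0.975, loc=mu, scale=sigma)))
print(round(upper, 3))   # close to 0.6, matching the construction of sigma
```

In other words, the Normal prior on the logit scale places about 2.5% of its mass above 0.6 on the natural scale of θ, exactly the information the Beta prior was built to encode.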
Model the relevant parameters
In other circumstances, it is more difficult to give the parameters a physical meaning
and thus defining a “good” prior distribution requires a bit more ingenuity. For
example, costs are usually characterised by a markedly skewed distribution and thus
a suitable model is the Gamma, which is characterised by two parameters θ = (η, λ).
These represent, respectively, the shape and the rate.
The difficulty with this distribution is that the “original-scale” parameters are quite
hard to interpret in a physical sense; in fact, they are just a mathematical construction
that defines the probability density, which happens to be reasonable for a variable
such as the costs associated with an intervention. It is more difficult in this case to
give a clear meaning to the parameters and thus eliciting a suitable prior distribution
(possibly encoding some substantive knowledge) becomes more complicated [15].
It is in general much easier to think in terms of some “natural-scale” parameters,
say ω = (μ, σ), representing for example the mean and standard deviation of the
costs on the natural scale. This is because we have a better grasp of what these
parameters mean in the real world and thus it is possible for us to figure out what
features we should include in the model that we choose. In addition to this, as we
briefly mentioned in Sect. 1.1 and will reprise in Sects. 1.3 and 3.3, decision-making is
effectively based on the evaluation of the population average values for the economic
outcomes and thus the mean of the cost distribution is in fact the parameter of direct
interest.
Typically, there is a unique deterministic relationship ω = h(θ) linking the
natural- to the original-scale parameters that define the mathematical form of the
distribution. As we hinted above, defining a prior on ω will automatically imply one
for θ. For example, by the mathematical properties of the Gamma density, the ele-
ments of ω (on which we want to set the priors) are defined in terms of the elements
of θ as
μ = η/λ   and   σ = √(η/λ²)   (1.3)
(similar relationships are in general available for the vast majority of probability
distributions).
3 We need to encode the assumption that, on the logit scale, 0.6 is the point beyond which only 2.5%
of the mass lies. Given the assumption of normality, this is easy to obtain by setting logit(0.4) +
1.96σ = logit(0.6), from which we can easily derive σ = [logit(0.6) − logit(0.4)]/1.96 = 0.413.
Whatever the choice for the priors on the natural-scale parameters, in the case of
the Gamma distribution, inverting the deterministic relationships in (1.3) it is easy
to obtain
η = μλ   and   λ = μ/σ².   (1.4)
More importantly, because μ and σ are random variables associated with a probability
distribution, so will be η and λ and thus the prior on ω automatically induces the
prior for θ.
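The mapping in (1.4) can be sketched as follows (Python with scipy, for illustration; the mean 1000 and standard deviation 200 are hypothetical cost values):

```python
from scipy.stats import gamma

def natural_to_gamma(mu, sigma):
    """Map natural-scale (mean, sd) to the Gamma (shape, rate), as in (1.4)."""
    lam = mu / sigma**2   # rate
    eta = mu * lam        # shape
    return eta, lam

# Hypothetical cost with mean 1000 and standard deviation 200
eta, lam = natural_to_gamma(1000.0, 200.0)
d = gamma(a=eta, scale=1 / lam)   # scipy parameterises Gamma by scale = 1/rate
print(eta, lam, d.mean(), d.std())   # the implied Gamma recovers mu and sigma
```

Sampling μ and σ from their priors and pushing each draw through `natural_to_gamma` is precisely how the prior on ω induces the prior on θ = (η, λ).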
How much information should be included in the prior?
Undoubtedly, the level of information contained in the prior plays a crucial role in
a Bayesian analysis, since its influence on the posterior and predictive distributions
may be large. This is particularly relevant in cases where the evidence provided by the
data is weak (for instance in the case of very small sample sizes). In these situations,
it is advisable to use as much information as possible in the prior, to complement the
limited amount of information present in the observed data and to perform substantial
sensitivity analysis to identify crucial assumptions that may bias the analysis.
On the other hand, in cases where a large amount of data is available to directly
inform a parameter, then the prior distribution becomes, to some extent, less influ-
ential. In such a situation, it is perhaps reasonable, or at least less critical, to encode
a lower level of information in the prior. This can be achieved for example by using
Uniform priors on a suitable range, or Normal priors centred on 0 and with very large
variances. These “vague” (also referred to as “minimally informative” or “flat”) pri-
ors can be used to perform an analysis in which the data drive the posterior and
predictive results.
We note, however, that the choice of vague prior depends on the nature and
physical properties of the parameters; for example, variances need to be defined on
a positive range (0; +∞) and thus it is not sensible to use a flat Normal prior (which
by definition extends from −∞ to +∞). Perhaps, a reasonable alternative would
be to model some suitable transformation of a variance (e.g. its logarithm) using a
Normal vague prior—but of course, care is needed in identifying the implications of
this choice on the natural scale (we return to this point in the next section).
There is then a sense in which, even when using minimally informative priors,
some substantial information is included in the modelling. Consequently, in any
case the general advice is that, when genuine information is available, it should be
included in the model in a clear and explicit way and that sensitivity analysis should
be thoroughly performed.
This aspect is well established in health economic evaluation, particularly under
the Bayesian approach. For instance, back to the simple cost example, one possibility
is to model the priors on ω using vague Uniform distributions:

μ ∼ Uniform(0, Hμ)   and   σ ∼ Uniform(0, Hσ),

for suitably selected values Hμ, Hσ. This would amount to assuming that values in
[0; Hμ ] are all reasonable for the population average cost of the treatment under
study—and a similar reasoning would apply to the standard deviation. Of course,
this may not be the best choice and were genuine information available, we should
include it in the model. For example, the nature of the intervention is clearly known
to the investigator and thus it is plausible that some “most-likely range” be available
at least approximately.
Assess the implications of the assumptions
In addition to performing sensitivity analysis to assess the robustness of the assump-
tions, in a Bayesian context it is important to check the consistency of what the priors
imply. For instance, in the cost model we may choose to use a vague specification
for the priors of the natural-scale parameters. Nevertheless, the implied priors for
the original-scale parameters will in general not be vague at all. In fact by assuming
a flat prior on the natural-scale parameters, we are implying some information on
the original-scale parameters of the assumed Gamma distribution.
One added advantage of modelling directly the relevant parameters is that this is
not really a problem; the resulting posterior distributions will be of course affected
by the assumptions we make in the priors; but, by definition, however informative the
implied priors for (η, λ) turn out to be, this will by necessity be consistent with the
substantive knowledge (or lack thereof) that we are assuming for the natural-scale
parameters.
On the other hand, when vague priors are used for original-scale parame-
ters (which may not be the focus of our analysis), the unintended information
may lead to severe bias in the analysis. For instance, suppose we model directly
η ∼ Uniform(0, 10000) and λ ∼ Uniform(0, 10000), with the intention of using a
vague specification. In fact, using (1.4) we can compute the implied priors for μ and
σ. Figure 1.2 shows how both these distributions place most of the mass on very small
values—possibly an unreasonable and unwanted restriction.

Fig. 1.2 Implied prior distributions for the mean μ, in panel (a), and the standard deviation σ, in
panel (b), obtained by assuming vague Uniform priors on the shape η and the rate λ of a Gamma
distribution
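A quick simulation (an illustrative Python sketch; the Uniform(0, 10000) bounds are those used above) confirms the behaviour shown in Fig. 1.2:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
eta = rng.uniform(0, 10_000, n)   # "vague" prior on the shape
lam = rng.uniform(0, 10_000, n)   # "vague" prior on the rate
mu = eta / lam                    # implied prior on the natural-scale mean
sigma = np.sqrt(eta) / lam        # implied prior on the standard deviation
# Despite costs plausibly running into the thousands, almost all the implied
# prior mass on mu sits below 100.
print(float(np.mean(mu < 100)))
```

So the seemingly innocuous flat priors on (η, λ) imply a strongly informative, and probably unintended, prior on the mean cost.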
Computational issues
While Bayesian analysis may arguably be considered conceptually straightforward,
the computational details are in general far from trivial. Nevertheless, under specific
circumstances (and for relatively simple models), it is possible to obtain analytic
solutions to the computation of posterior and predictive distributions.
One such case involves the use of conjugate priors [16]. These indicate a particular
mathematical formulation where the prior is selected in a way that, when combined
with the model chosen for the data, the resulting posterior is of the same form. For
example, if the available data y are binary and are modelled using a Bernoulli dis-
tribution y ∼ Bernoulli(θ) or equivalently p(y|θ) = θy (1 − θ)(1−y) , it can be easily
proven4 that choosing a Beta prior θ ∼ Beta(α, β) yields a posterior distribution
θ|y ∼ Beta(α∗ , β ∗ ), with α∗ = α + y and β ∗ = β + 1 − y. In other words, updat-
ing from the prior to the posterior occurs within the same probability family (in this
case the Beta distribution)—in effect, the information provided by the data is entirely
encoded in the updated parameters (α∗ , β ∗ ).
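The Beta–Bernoulli update can be sketched in a few lines (illustrative Python; the flat Beta(1, 1) starting prior and the data sequence are hypothetical):

```python
def beta_bernoulli_update(alpha, beta, y):
    """One conjugate update step: prior Beta(alpha, beta) plus a single
    Bernoulli observation y in {0, 1} gives posterior Beta(alpha + y, beta + 1 - y)."""
    return alpha + y, beta + 1 - y

# Starting from a flat Beta(1, 1) prior (a hypothetical choice), repeated
# observations simply chain the update:
a, b = 1.0, 1.0
for y in [1, 0, 1, 1]:
    a, b = beta_bernoulli_update(a, b, y)
print(a, b)   # Beta(4, 2): three successes and one failure have been absorbed
```

No integration is needed at any point: the whole inference reduces to bookkeeping on (α, β), which is exactly the appeal of conjugacy.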
The obvious implication is that no complex mathematical procedure is required
to compute the posterior (and, similarly, the predictive) distribution. This is a very
appealing way of setting up a Bayesian model, mainly because several standard
conjugate models exist—see for example [6]. In addition, it is often easy to encode prior
information using conjugate models—one simple example of a conjugate, informa-
tive prior was given above, when constructing the Beta distribution shown in Fig. 1.1.
On the other hand, conjugate priors are dictated purely by mathematical conve-
nience and they fail to fully allow the inclusion of genuine or substantial knowledge
about the nature of the model and its parameters. For this reason, in all but trivial cases,
conjugate priors become a limitation rather than an asset to a Bayesian analysis; for
example, no conjugate priors exist for the popular logistic regression model.
Thus, it is usually necessary to go beyond conjugacy and consider more complex
priors. This, however, comes at the price of increased computation complexity and
often no analytic or closed form solution exists for the posterior or the predictive
distribution. In these cases, inference by simulation is usually the preferred solution.
4 The basic idea is to investigate the form of the likelihood function L(θ), i.e. the model for the
data p(y|θ), but considered as a function of the parameters. If L(θ) can be written in terms of a
known distribution, then this represents the conjugate family. For instance, the Bernoulli likelihood
is L(θ) = p(y|θ) = θy (1 − θ)(1−y) , which is actually the core of a Beta density. When computing
the posterior distribution by applying Bayes theorem and combining the likelihood L(θ) with the
conjugate prior, effectively the two terms have the same mathematical form, which leads to a closed
form for the posterior.
Arguably, the main reason for the enormous increase in the use of the Bayesian
approach in practical applications is the development of simulation algorithms and
specific software that, coupled with the availability of cheap computational power
(which became widespread in the 1990s), allow the end-user to effectively use suit-
able analytic models, with virtually no limitation in terms of the complexity.
The main simulation method for Bayesian inference is Markov Chain Monte
Carlo (MCMC), a class of algorithms for sampling from generic probability
distributions—again, here we do not deal with technicalities, but refer the read-
ers to [11, 12, 17–22]. Robert and Casella [23] review the history and assess the
impact of MCMC. Spiegelhalter et al. [5] discuss the application of MCMC methods
to clinical trials and epidemiological analysis, while Baio [6] and Welton et al. [24]
present the main features of MCMC methods and their applications, specifically to
the problem of health economic evaluation.
In a nutshell, the idea underlying MCMC is to construct a Markov chain, a
sequence of random variables for which the distribution of the next value only
depends on the current one, rather than the entire history. Given some initial val-
ues, this process can be used to repeatedly sample and eventually converge to the
target distribution, e.g. the posterior distribution for a set of parameters of interest.
Once convergence has been reached, it is possible to use the simulated values to
compute summary statistics (e.g. mean, standard deviation or quantiles), or draw
histograms to characterise the shape of the posterior distributions of interest.
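The general idea can be sketched with a minimal random-walk Metropolis sampler (an illustrative Python toy, not how BUGS/JAGS works internally; the standard Normal target, the starting value and the tuning constants are all hypothetical):

```python
import math
import random

random.seed(42)

def metropolis(logpost, init, n_iter, step=1.0):
    """Minimal random-walk Metropolis: each draw depends only on the current
    state, forming a Markov chain whose stationary distribution is the target."""
    x, chain = init, []
    for _ in range(n_iter):
        prop = x + random.gauss(0.0, step)
        # Accept with probability min(1, target(prop) / target(x)).
        if math.log(random.random()) < logpost(prop) - logpost(x):
            x = prop
        chain.append(x)
    return chain

# Toy target: standard Normal log-density (up to an additive constant),
# with a deliberately bad starting value far from the target area.
chain = metropolis(lambda t: -0.5 * t * t, init=10.0, n_iter=20_000)
kept = chain[5_000:]             # discard the pre-convergence ("burn-in") draws
print(sum(kept) / len(kept))     # posterior mean estimate, close to 0
```

The early part of the chain drifts in from the poor starting value, mirroring the behaviour in panel (a) of Fig. 1.3, which is why those iterations are discarded before computing summaries.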
Figure 1.3 depicts the MCMC procedure for the case of two parameters (μ, σ); in
this case, we superimpose the “true” joint density (the solid dark ellipses), which is
the “target” for the simulation algorithm. Obviously, this is done for demonstration
only—in general we do not know what the target distribution is (and this is why we
use MCMC to estimate it!).
Panel (a) shows the first 10 iterations of the process: the first iteration (labelled as
“1”) is set as the initial value and it happens to be in a part of the relevant space that
is not covered by the “true” distribution. Thus, this point is really not representative
of the underlying target distribution. In fact, as is common, the first few values
of the simulation are spread all over the space and do not really cover the target
area. However, as the number of simulations increases in panel (b), more and more
simulated points actually fall within the “true” distribution, because the process is
reaching convergence. In panel (c), after 1000 iterations effectively all of the target
area has been covered by simulated values, which can be then used to characterise
the joint posterior distribution p(μ, σ).
It is interesting to notice that, in general, it is possible to construct a suitable
Markov chain for a very wide range of problems and, more importantly, given a suf-
ficient number of simulations, it can be proved that it is almost certain that the Markov
chain will converge to the target (posterior) distribution. However, this process may
require a large number of iterations before it actually starts visiting the target area,
rather than points in the parametric space that have virtually no mass under the posterior
distribution (e.g. the points labelled as “1”, “2”, “3”, “10” and “11” in Fig. 1.3).

Fig. 1.3 Markov Chain Monte Carlo in practice: panel (a) shows the first 10 iterations of an MCMC
run for two parameters μ (on the x−axis) and σ (on the y−axis). The solid ellipses represent the “true”
underlying joint distribution p(μ, σ). At the beginning of the process, the simulations do not cover
the target area, but as the number of iterations increases, as in panels (b) and (c), effectively all the
target area is fully represented. Convergence to the target distribution can be visually inspected by
running two (or more) Markov chains in parallel and checking that they mix up, as in panel (d).
Adapted from [6]
For this reason it is essential to carefully check convergence and typically to discard
the iterations before convergence.
Panel (d) shows a traceplot for two parallel Markov chains, which are initialised at
different values: this graph plots on the x−axis the number of simulations performed
and on the y−axis the simulated values. As the number of simulations increases, the
two chains converge to a common area and mix up. We can visually inspect graphs
such as this and determine the point at which convergence has occurred. All the
simulations preceding this point are then discarded, while those after convergence
are used to perform the analysis.
For example, consider the simple case where there is only one scalar parameter of
interest. After the model is successfully run, we can typically access a vector of n_sim simulations θ̃ = (θ̃_1, . . . , θ̃_{n_sim}) from the posterior. We can use this to summarise the results, for example by computing the posterior mean

    E[θ | y] = (1/n_sim) ∑_{i=1}^{n_sim} θ̃_i ,
or identifying suitable quantiles (for example the 2.5 and 97.5%), to give an approx-
imate (in this case 95%) credible interval.
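In R, these summaries are one-liners once the vector of simulations is available in the workspace. The sketch below uses random draws in place of the output of an actual MCMC run, purely for illustration:

```r
# Sketch: summarising posterior simulations in R; random draws stand in
# for the output of a real MCMC run (illustrative only)
set.seed(1)
n.sim <- 1000
theta.tilde <- rnorm(n.sim, mean = 2, sd = 0.5)   # stand-in for MCMC output

post.mean <- mean(theta.tilde)                    # Monte Carlo estimate of E[theta | y]
cred95 <- quantile(theta.tilde, c(0.025, 0.975))  # approximate 95% credible interval
```

With real output, theta.tilde would instead be extracted from the object returned by the MCMC software.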
If the MCMC algorithm has run successfully,5 then the information provided in θ̃
will be good enough to fully characterise the variability in the posterior distribution.
Of course, ideally we would like to use a large enough number of simulations; often,
n sim is set to 1000, although this is pretty much an arbitrary choice. Depending on the
underlying variability in the target distribution, this number may not be sufficient to
fully characterise the entire distribution (although it will usually estimate its core with suitable precision). Again, we refer to, for instance, [6] for more details. Examples
of the procedure described above are also presented in Sects. 2.3.1 and 2.4.1.
5 All the references cited above discuss at great length the issues of convergence and autocorrelation
and methods (both visual and formal) to assess them. In the discussion here, we assume that the
actual convergence of the MCMC procedure to the relevant posterior or predictive distributions has
been achieved and checked satisfactorily, but again, we refer the reader to the literature mentioned
above for further discussion and details. We return to this point in Chap. 2 when describing the case
studies.
1.3 Basics of Health Economic Evaluation
In a publicly funded healthcare system (such as the UK’s National Health Service, NHS), this is a fundamental problem: public resources are finite and limited, and thus it is often necessary to prioritise the allocation of public funds on health interventions.
Crucially, “optimality” can be determined by framing the problem in decision-
theoretic terms [1, 3, 5, 6], which implies the following steps.
• Characterise the variability in the economic outcome (e, c), which is typically
due to sampling, using a probability distribution p(e, c|θ), indexed by a set of
parameters θ. Within the Bayesian framework, uncertainty in the parameters is
also modelled using a probability distribution p(θ).
• Value the consequences of applying a treatment t, through the realisation of the
outcome (e, c) by means of a utility function u(e, c; t).
• Assess “optimality” by computing, for each intervention, the expectation of the utility function with respect to both “population” (parameters) and “individual” (sampling) uncertainty/variability:

    U_t = E[u(e, c; t)] .
In line with the precepts of (Bayesian) decision theory, given current evidence the
“best” intervention is the one associated with the maximum expected utility. This
is because it can be easily proved that maximising the expected utility is equivalent
to maximising the probability of obtaining the outcome associated with the highest
(subjective) value for the decision-maker [1, 6, 7, 10].
Under the Bayesian framework, U_t is dimensionless, i.e. it is a pure number, since
both sources of basic uncertainty have been marginalised out in computing the expec-
tation. Consequently, the expected utility allows a direct comparison of the alternative
options.
While the general setting is fairly straightforward, in practice, the application of
the decision-theoretic framework for health economic evaluation is characterised by
the following complications.
1. As in any Bayesian analysis, the definition of a suitable probabilistic description of the current level of knowledge of the population parameters may be difficult and potentially based on subjective judgement.
2. There is no unique specification of the method of valuation for the consequences of the interventions (i.e. of which utility function should be chosen).
3. Typically, replacing one intervention with a new alternative is associated with
some risks such as the irreversibility of investments [25]. Thus, basing a decision
on current knowledge may not be ideal, if the available evidence-base is not
particularly strong/definitive (we elaborate on this point in Chap. 4).
As for the utility function, health economic evaluations are generally based on
the (monetary) net benefit [26]
u(e, c; t) = ke − c.
Here k is a willingness-to-pay parameter, used to put costs and benefits on the same scale; it represents the budget that the decision-maker is willing to invest to increase the benefits by one unit. The main appeal of the net benefit is that it has a fixed
form, once the variables (e, c) are defined, thus providing easy guidance to valuation
of the interventions. Moreover, the net benefit is linear in (e, c), which facilitates
interpretation and calculations. Nevertheless, the use of the net benefit presupposes
that the decision-maker is risk neutral, which is by no means always appropriate in
health policy problems [27].
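To fix ideas, with simulations of (e, c) under each intervention at hand (e.g. from an MCMC run), the expected utilities based on the monetary net benefit can be approximated by simple averages. The sketch below uses made-up draws and an illustrative value of k:

```r
# Sketch: expected utilities based on the monetary net benefit u = k*e - c;
# the draws below are simulated directly and purely illustrative
set.seed(42)
n.sim <- 1000
k <- 25000                                                    # illustrative willingness-to-pay
e0 <- rnorm(n.sim, 0.50, 0.05); c0 <- rnorm(n.sim, 5000, 500) # draws under t = 0
e1 <- rnorm(n.sim, 0.55, 0.05); c1 <- rnorm(n.sim, 6000, 500) # draws under t = 1

U0 <- mean(k * e0 - c0)   # Monte Carlo estimate of E[u(e, c; 0)]
U1 <- mean(k * e1 - c1)   # Monte Carlo estimate of E[u(e, c; 1)]
```

The intervention with the larger expected utility is the “best” given current evidence.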
If we consider the simpler scenario where T = (0, 1), decision-making can be
equivalently effected by considering the expected incremental benefit (of treatment
1 over treatment 0)
    EIB = U_1 − U_0    (1.5)

Of course, if EIB > 0, then U_1 > U_0 and therefore t = 1 is the optimal treatment (being associated with the highest expected utility).
In particular, using the monetary net benefit as utility function, (1.5) can be re-expressed as

    EIB = k E[Δe] − E[Δc],

where

    Δe = E[e|θ1] − E[e|θ0] = μ1e − μ0e   and   Δc = E[c|θ1] − E[c|θ0] = μ1c − μ0c

are the increments in the population average benefits and costs. If we define the Incremental Cost-Effectiveness Ratio as

    ICER = E[Δc] / E[Δe],

then it is straightforward to see that, when the net monetary benefit is used as utility function,

    EIB > 0  if and only if  k > E[Δc]/E[Δe] = ICER,  for E[Δe] > 0
    EIB > 0  if and only if  k < E[Δc]/E[Δe] = ICER,  for E[Δe] < 0.
Notice that, in the Bayesian framework, (Δe, Δc) are random variables: while sampling variability is averaged out, they are defined as functions of the parameters θ = (θ1, θ0). The second layer of uncertainty (i.e. that on the population parameters) can be further averaged out. Consequently, E[Δe] and E[Δc] are actually pure numbers, and so is the ICER.
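The following sketch illustrates these relationships with made-up draws of (Δe, Δc) and an illustrative willingness-to-pay:

```r
# Sketch: EIB and ICER from simulated increments (values are illustrative,
# standing in for draws of Delta_e and Delta_c computed from the posterior)
set.seed(7)
n.sim <- 1000
k <- 25000
delta.e <- rnorm(n.sim, 0.05, 0.01)    # stand-in draws of Delta_e
delta.c <- rnorm(n.sim, 1000, 200)     # stand-in draws of Delta_c

ICER <- mean(delta.c) / mean(delta.e)  # a pure number (around 20000 here)
EIB  <- k * mean(delta.e) - mean(delta.c)
# With E[delta.e] > 0, EIB > 0 exactly when k > ICER
```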
The two layers of uncertainty underlying the Bayesian decision-making process as
well as the relationships between the variables defined above can be best appreciated
through the inspection of the cost-effectiveness plane, which depicts the joint distribution of the random variables (Δe, Δc) on the x- and y-axes, respectively.
Intuitively, the cost-effectiveness plane characterises the uncertainty in the parameters θ. This is represented by the dots populating the graph in Fig. 1.4a, which
can be obtained, for example, by simulation. By taking the expectations over the
marginal distributions for Δe and Δc, we then marginalise out this uncertainty and obtain a single point in the plane, which represents the “typical future consequence”. This is shown as the dot in Fig. 1.4b, where the underlying distribution has been shaded out.

Fig. 1.4 Cost-effectiveness plane, showing simulations from the joint (posterior) distribution of the random variables (Δe, Δc)
Figure 1.4c also shows the “sustainability area”, i.e. the part of the cost-
effectiveness plane which lies below the line E[Δc ] = kE[Δe ], for a given value
of the willingness-to-pay k. Given the equivalence between the EIB and the ICER,
interventions for which the ICER is in the sustainability area are more cost-effective
than the comparator. Changing the value of the threshold may modify the decision as to whether t = 1 is the most cost-effective intervention. The EIB can be plotted as a function of k to identify the “break-even point”, i.e. the value of the willingness-to-pay at which the EIB becomes positive.
Finally, Fig. 1.4d shows the sustainability area for a different choice of the parameter k. In this case, because the ICER (and, for that matter, most of the distribution of (Δe, Δc)) lies outside the sustainability area, the new intervention t = 1 is not cost-effective. We elaborate on this further in Chap. 4, when discussing
probabilistic sensitivity analysis.
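A basic version of the cost-effectiveness plane can be drawn in R with just a few commands; in the sketch below the draws of (Δe, Δc) are simulated directly, purely for illustration:

```r
# Sketch: cost-effectiveness plane with sustainability area boundary
# (illustrative simulated values; in practice these are posterior draws)
set.seed(7)
delta.e <- rnorm(1000, 0.05, 0.01)
delta.c <- rnorm(1000, 1000, 200)
k <- 25000

plot(delta.e, delta.c, pch = 20, cex = 0.4,
     xlab = "Effectiveness differential", ylab = "Cost differential")
abline(a = 0, b = k, lty = 2)                              # boundary with slope k
points(mean(delta.e), mean(delta.c), pch = 19, cex = 1.5)  # the "typical" consequence
```

BCEA produces a polished version of this plot automatically, as shown in later chapters.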
1.4 Doing Bayesian Analysis and Health Economic Evaluation in R

The general process of doing a Bayesian analysis (with a view to using the results of the model to perform an economic evaluation) can be broken down into several steps. We review these steps in the following, relating the process to its practical features and assuming R as the software of choice.
[Fig. 1.5 depicts three blocks. Statistical model: estimates relevant population parameters θ; varies with the type of available data (& statistical approach!). Economic model: combines the parameters to obtain a population average measure for costs and clinical benefits; varies with the type of available data & statistical model used. Decision analysis: summarises the economic model by computing suitable measures of “cost-effectiveness”; dictates the best course of actions, given current evidence; standardised process.]
Fig. 1.5 A graphical representation of the process of health economic evaluation based on cost-
effectiveness or cost-utility analysis
Figure 1.5 shows a graphical representation of this process. The process starts with
a statistical model that is used to estimate some relevant parameters, which are then
fed to an economic model with the objective of obtaining the relevant population
summaries indicating the incremental benefits and costs for a given intervention.
These are in turn used as the basis for the decision analysis, as described above. The
final aspect is represented by the evaluation of how the uncertainty that characterises
the model impacts the final decision-making process. We describe the individual steps (building blocks) and their relevance to the analysis in the following.
In this step, we typically create, aggregate and modify the original variables available
in the dataset(s) that we wish to analyse. In the context of economic evaluation this
may be needed because the outcomes of interest may have to be computed as functions
of other observable variables—for example, total costs could be obtained as the sum
of several cost items (e.g. service provision, acquisition of the intervention, additional
treatments and so on).
In any case, this step, typically performed directly in R, serves to generate a data
list that contains the values of all the variables that are of interest and should be
modelled formally. The complexity of this data list depends on the nature of the
original data: for example, when dealing with experimental evidence (e.g. coming from an RCT), often we model directly the quantities of interest (i.e. the variables of
costs and clinical effectiveness or utility).
For example, in the context of an RCT, we would be likely to directly observe the
variables (eit , cit ) for individuals i = 1, . . . , n t in each treatment arm t = 0, . . . , T
and could model them so that the relevant parameters are their population averages
(μet , μct )—see for instance Sect. 5.4 in [6].
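As an illustration of this pre-processing step, the sketch below builds a data list from hypothetical individual-level trial data; all variable names are made up for the example:

```r
# Sketch: pre-processing raw variables into the data list to be passed to
# the MCMC sampler (hypothetical two-arm trial; names are illustrative)
set.seed(1)
n <- 100                                      # patients per arm
qol  <- matrix(rbeta(2 * n, 20, 10), n, 2)    # stand-in effectiveness measure
drug <- matrix(rlnorm(2 * n, 6, 0.5), n, 2)   # acquisition cost items
hosp <- matrix(rlnorm(2 * n, 7, 1), n, 2)     # hospitalisation cost items

total.cost <- drug + hosp                     # total costs as the sum of cost items
data.list <- list(e = qol, c = total.cost, n = n, T = 2)
```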
In other cases, for example when using aggregated data, it is necessary to build
a more complex model that directly considers ancillary variables (which may be
observed) and these are then manipulated to derive the relevant economic outcomes.
This type of modelling is often referred to as “decision-analytic” and it typically
amounts to creating a set of relationships among a set of random quantities. A deci-
sion tree may be used to combine measures of costs and effectiveness (e.g. in terms
of reduction in the occurrence of adverse events)—examples of this strategy are in
Sect. 5.5 in [6]. We also consider models of this kind in Chap. 2 and in Sect. 4.4.3.1.
This is the most mathematical and in many ways creative part of the process; accord-
ing to the nature and availability of the data, we need to create a suitable probabilistic
model to describe uncertainty. Technically, this step is required even outside of the
Bayesian framework that we adopt. Of course, under the Bayesian paradigm, all the
principles described in Sect. 1.2.2 should be applied. Again, depending on the nature
of the data, the model may be more or less complex and encode a larger or smaller
number of assumptions/probabilistic features.
Assuming that the method of inference is some sort of simulation-based procedure such as MCMC, this step is usually performed by first “translating” the model into a text file, which contains the description of the assumptions in terms of distributional
and deterministic relationships among the variables. A “frequentist counterpart” to
this step would be the creation of a script which codes the modelling assumptions.
We provide examples of this process under a full Bayesian framework in Chap. 2,
where we also briefly discuss the issues related to convergence and model
checking.
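The sketch below shows what such a text file might contain, assuming (purely for illustration) Normal sampling distributions for the individual benefits and costs with vague priors; the model and file name are not taken from any specific example:

```r
# Sketch: writing a minimal BUGS/JAGS model file from R (illustrative
# model; recall that dnorm is parameterised in terms of the precision tau)
model.code <- "
model {
  for (t in 1:T) {
    for (i in 1:n) {
      e[i, t] ~ dnorm(mu.e[t], tau.e[t])   # individual benefits
      c[i, t] ~ dnorm(mu.c[t], tau.c[t])   # individual costs
    }
    mu.e[t] ~ dnorm(0, 0.0001)             # vague priors on the population means
    mu.c[t] ~ dnorm(0, 0.0001)
    tau.e[t] ~ dgamma(0.01, 0.01)          # vague priors on the precisions
    tau.c[t] ~ dgamma(0.01, 0.01)
  }
}"
writeLines(model.code, con = "normal_model.txt")
```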
At this point, we can “run the model”, which provides us with an estimation of
the quantities of interest. As we have repeatedly mentioned, these may be directly
the average costs and benefits under different interventions, or perhaps some other
quantities (e.g. the transition probabilities in a Markov model setting).
In our ideal Bayesian process, this step is performed by specialised software (e.g.
JAGS or BUGS) to run the MCMC procedure, which we typically interface with R.
In other words, after we have created the data list and the text file with the model
assumptions, we call the MCMC sampler directly from R. This will then take over and
run the actual analysis, at the end of which the results will be automatically exported to the R workspace, in the form of a suitable object containing, e.g. the samples from the
posterior or predictive distributions of interest. We show in Chap. 2 some examples
of this procedure.
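For instance, using the package R2jags, the call might look like the sketch below, where data.list and normal_model.txt are hypothetical names for the data list and model file prepared in the earlier steps (JAGS must be installed for this to actually run):

```r
# Sketch: calling the MCMC sampler from R via R2jags (requires JAGS);
# object and file names are illustrative, not from a specific example
library(R2jags)
fit <- jags(data = data.list,
            parameters.to.save = c("mu.e", "mu.c"),
            model.file = "normal_model.txt",
            n.chains = 2, n.iter = 10000, n.burnin = 5000)
print(fit)                               # summaries, including the Rhat diagnostic
theta.sims <- fit$BUGSoutput$sims.list   # simulations now in the R workspace
```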
Once the model has run, the next step involves checking its performance (e.g.
in terms of convergence, if the procedure is based, as it often is, on an MCMC
algorithm). There are several diagnostic tools for MCMC, most of which can be
implemented directly in R. Thus, again following our ideal process, at this point the user will have regained control of the R session in which the simulations from the
model are stored. Standard methods of analysis of convergence and autocorrelation
are described in detail in many specific texts, for instance [6, 14, 18].
The combination of the steps described so far can be thought of as the Statistical
model box of Fig. 1.5.
Perhaps even more importantly, from the health economic point of view, depending
on the type of data available, the results of the model may not directly provide
the information or variables needed to perform the cost-effectiveness analysis. For
instance, while individual level data may be used to estimate directly the average
cost and benefits, using aggregated data may mean that the model is estimating some
parameters which are not necessarily the actual measures of clinical benefit and cost
(e, c).
Thus, it will be often necessary to combine the quantities estimated from the model
using logical relationships that define (e, c). For example, the model may estimate
the posterior distribution for λt and γ, indicating respectively the treatment-specific
length of hospitalisation for a given disease and the cost associated with it. Neither
of these can be directly used as a measure of cost associated with the treatment being
considered, but we may construct a new variable ct = λt γ to represent it.
This step is described by the Economic model box in Fig. 1.5; performing it in R can be particularly effective, because once the full posterior distributions are available in the R workspace, calculations such as the one shown above are generally trivial.
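A sketch of this kind of calculation is given below, with random draws standing in for the actual posterior simulations of λt and γ:

```r
# Sketch: post-processing posterior draws into the economic outcome
# c_t = lambda_t * gamma (draws simulated directly, for illustration)
set.seed(2)
n.sim <- 1000
lambda <- cbind(rgamma(n.sim, 20, 4),            # length of stay, t = 0
                rgamma(n.sim, 16, 4))            # length of stay, t = 1
gamma  <- rlnorm(n.sim, meanlog = 5, sdlog = 0.2)  # cost per day in hospital

cost <- lambda * gamma   # n.sim x 2 matrix: one simulation of c_t per row
```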
Figure 1.6 shows a graphical representation of how the Statistical model and the
Economic model described in Fig. 1.5 are performed and combined. Basically, the
whole process begins with the creation of a model, which describes how the inter-
vention is applied to the relevant population, what the effects are (in terms of benefits
and costs) and what variables or parameters are considered. This step may be done with “pen-and-paper” and is in fact the most creative part of the whole exercise. Often,
we can rely on simple structures, such as decision trees (as in the top-left corner
of Fig. 1.6), while sometimes this requires a more complicated description of the
underlying reality. Again, notice that this step is required irrespective of the statisti-
cal approach considered.
In the full Bayesian framework that we advocate, these assumptions will be trans-
lated into code, e.g. using JAGS or BUGS, as shown in the top-right corner of Fig. 1.6;
in a non-Bayesian context, other software may be used (e.g. SAS, Stata or R), but in
a sense this procedure of scripting is also common to all statistical approaches and makes the model easily replicable. We note here (and return to this point in
Sect. 3.1) that using less sophisticated tools such as Microsoft Excel may render
this step less straightforward.
Once the model has been coded up, it can be run to obtain the relevant estimates
(bottom-right corner of Fig. 1.6). In this case, we consider an R script that defines
the data and interfaces with the MCMC software (in this case JAGS) to compute the
posterior distributions of the model parameters. Of course, this step needs careful
consideration, e.g. it is important to check model convergence and assess whether
the output is reasonable.
Finally (bottom-left corner of Fig. 1.6), we can post-process the model output to
create the relevant quantities to be passed to the Economic model. In this case, we
use again R to combine the model parameters into suitable variables of benefits and costs (e and c, respectively). This step basically amounts to performing the Economic model block of Fig. 1.5.

Fig. 1.6 Graphical representation of the process of (Bayesian) health economic evaluation in terms of the Statistical model and Economic model of Fig. 1.5. First (top-left corner), a model is created, which describes the possible clinical pathways and the effects of a given intervention in terms of costs and benefits. Then (top-right corner), the assumptions encoded in the model are translated into suitable code. The model is then run (bottom-right corner), e.g. by writing suitable R code to define the data and interface with a specific software (e.g. JAGS or BUGS). Finally (bottom-left corner), the results of the model are post-processed in R to produce the economic summaries that are then fed to the “Decision analysis” block
The rest of the decision process, represented by the Decision analysis box in Fig. 1.5,
is probably the easiest; in fact, once the relevant quantities have been estimated,
the optimal decision given the current knowledge can be derived by computing the
summaries described in Sect. 1.3; the results of this process may be depicted as in
panel (b) of Fig. 1.4.
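A sketch of this final computation, with purely illustrative values for the expected utilities:

```r
# Sketch: the optimal decision maximises the expected utility
# (the numbers below are made up, for illustration only)
U <- c("t = 0" = 7500, "t = 1" = 7750)   # expected utilities for each intervention
best <- names(which.max(U))              # intervention with maximum expected utility
EIB <- U["t = 1"] - U["t = 0"]           # positive, consistent with the choice
```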
The dashed arrows connecting the Statistical model to the Economic model
through the Uncertainty analysis box in Fig. 1.5 also describe how this process
occurs: if the uncertainty described by the posterior distributions is marginalised
(averaged) out, then the analysis is performed on the “straight line” from the sta-
tistical to the decision analysis. This represents the decision process under current
uncertainty and identifies the best course of action today.
On the other hand, if uncertainty is not marginalised, then we can analyse the
“potential futures” separately, e.g. as in panels (a), (c) and (d) of Fig. 1.4. A full Bayesian approach allows us to perform directly this form of “probabilistic sensitivity analysis”, i.e. the evaluation of the impact of parameter uncertainty on the optimal decision.
The main objective of this book is to describe how health economic evaluation can be
systematised and performed routinely and thoroughly using the R package BCEA—the
remaining chapters will present in detail the features of the package, using worked examples going through the various steps of the economic analysis.
In a sense, BCEA plays a role in the economic evaluation after the statistical model has been fitted. While to use BCEA it is not even strictly necessary to consider
a Bayesian approach (we return to this point in Sect. 3.1 and in Chap. 5), the entire
book is based on the premise that the researchers have indeed performed a Bayesian
analysis of their clinical and economic data. We stress here however that BCEA only
deals with the decision and uncertainty analysis.
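To anticipate the workflow presented in the next chapters, once matrices of simulations for benefits and costs are available (one column per intervention), the analysis is initialised with a single call to the bcea function; the sketch below is indicative only and its argument values are illustrative:

```r
# Sketch: initialising the economic analysis with BCEA; e and c are
# matrices of posterior simulations, one column per intervention
library(BCEA)
m <- bcea(e, c, ref = 2,                       # column 2 as reference intervention
          interventions = c("status quo", "new treatment"),
          Kmax = 50000)                        # maximum willingness-to-pay considered
summary(m)
```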
References
17. W. Gilks, S. Richardson, D. Spiegelhalter, Markov Chain Monte Carlo in Practice (Chapman
Hall, London, 1996)
18. D. Gamerman, Markov Chain Monte Carlo (Chapman and Hall, London, 1997)
19. C. Robert, G. Casella, Monte Carlo Statistical Methods, 2nd edn. (Springer, New York, 2004)
20. A. Gelman, J. Carlin, H. Stern, D. Rubin, Bayesian Data Analysis, 2nd edn. (Chapman Hall,
New York, 2004)
21. C. Robert, G. Casella, Introducing Monte Carlo Methods with R (Springer, New York, 2010)
22. S. Brooks, A. Gelman, G. Jones, X. Meng, Handbook of Markov Chain Monte Carlo (Chapman
Hall/CRC, Boca Raton, 2011)
23. C. Robert, G. Casella, Statistical Science 26, 102 (2011)
24. N. Welton, A. Sutton, N. Cooper, K. Abrams, Evidence Synthesis for Decision Making in
Healthcare (Wiley, Chichester, 2012)
25. K. Claxton, J. Health Econ. 18, 342 (1999)
26. A. Stinnett, J. Mullahy, Medical Decision Making 18(Suppl), S68 (1998)
27. B. Koerkamp, M. Hunink, T. Stijnen, J. Hammitt, K. Kuntz, M. Weinstein, Medical Decision
Making 27(2), 101 (2007)
Chapter 2
Case Studies
2.1 Introduction
In this chapter, we present the two case studies that are used as running examples
throughout the book.
The first example considers a decision-analytic model. This is a popular tool in
health economic evaluation, when the objective is to compare the expected costs and
consequences of decision options by synthesising information from multiple sources
[1]. Examples of decision-analytic models include decision trees or Markov models.
As these models are based on several different sources of information, they can offer
decision-makers the best available information to reach their decision, as opposed to
modelling based on a single randomised clinical trial (RCT). Additionally, RCTs may
be limited in scope (e.g. in terms of the temporal follow up) and thus not be ideal for
characterising the long-term consequences of applying an intervention. Thus, even
when RCT data are available, a health economic evaluation is often extended to a
decision-analytic approach to allow decision-makers to capture information about
the long-term effects and costs.
The second example is a multi-decision problem. This means that, in contrast to
standard economic modelling, where a new intervention t = 1 is compared to the
status quo t = 0, we consider here T = 4 potential interventions. This is naturally
linked to the wider topic of network meta-analysis [2], which is an extension of
different statistical methods that allow researchers to pool evidence coming from
different sources and include direct and indirect comparisons. This is particularly
relevant when head-to-head evidence is only available for some of the interventions
under consideration. Suitable statistical modelling can be used to make inference on the direct comparisons for which no information is available, by exploiting the indirect ones.
For both examples, we first introduce the general background, discussing the dis-
ease area and the interventions under consideration. We then describe the assumptions
underlying the statistical model (e.g. in terms of distributions for the observed/ob-
servable variables and the unobservable parameters). We also show how these can
be translated into suitable code to perform a full Bayesian analysis and obtain
samples from the relevant posterior distributions (see Sect. 1.2). Finally, we demon-
strate any post-processing required to produce the relevant inputs for the economic
model (e.g. the population average differential of costs and benefits—see Sect. 1.4).
In the rest of the book, we refer (rather interchangeably) to OpenBUGS [3] and JAGS
[4], arguably the most popular software to perform Bayesian analysis by means of
Markov Chain Monte Carlo simulations.
The examples are then used to showcase the facilities of BCEA and to explain
the process of performing an economic evaluation in R, once the statistical model
has been fitted. It is important to note (and we will expand on this point in Chap. 3)
that BCEA can be used to perform cost-effectiveness analysis when the full statistical
model has not been fitted within the Bayesian framework. Nevertheless, we strongly
advocate the use of a Bayesian framework and thus we have included these examples
to demonstrate a full Bayesian analysis.
Starting from this chapter on, the text will include frequent code blocks to show
how to execute commands and use the BCEA package. R code is presented in code
blocks in the text, with each new line starting with the symbol >. Indentation indicates
lines continuing from the previous statement. Hash symbols (i.e. #) in code blocks
indicate comments. In-line words formatted in mono-spaced font (such as this)
indicate code, for example short commands or function parameters.
2.2 Preliminaries: Computer Configuration

In this section, we briefly review the ideal computer configuration we are assuming to
run the examples later in this chapter and in the rest of the book. It is difficult to guar-
antee that these instructions will be valid for every future release of the programmes
we consider here, although they have been tested under the current releases.
We assume that the user’s computer has the following software installed:
• R and the package BCEA. Other optional packages (e.g. R2OpenBUGS, R2jags,
reshape, plyr or INLA) may need to be installed;
• OpenBUGS or JAGS. These are necessary to perform the full Bayesian analyses we
discuss in the rest of the book. It is not necessary to install both;
• The R front-end, for example Rstudio (available for download at the webpage
https://www.rstudio.com/). This is also optional and all the work can be done
using the standard R terminal;
• A spreadsheet calculator, e.g. MS Excel or the freely available LibreOffice,
which is a decent surrogate and can be downloaded at https://www.libreoffice.
org/.
In the following, we provide some general instructions, for MS Windows, Linux
or Mac OS operating systems.
For MS Windows users, the set-up should be fairly easy and amounts to the following
steps:
1. Install OpenBUGS
• Download the latest release (currently it is version 3.2.3, stored in the file
OpenBUGS323setup.exe) from http://openbugs.net/w/Downloads and run it
by double-clicking on it.
2. Install R
a. Download R from the Comprehensive R Archive Network (CRAN): http://
cran.r-project.org/bin/windows/ (click on the link “install R for the first time”).
b. When the process is finished, open R and type in the terminal the following
command.
> install.packages("BCEA")
> install.packages("R2OpenBUGS")
These commands will download and install the packages BCEA and
R2OpenBUGS. The latter is needed to interface OpenBUGS with R. Follow the
on-screen instructions (you will be asked to select a mirror from which to
obtain the necessary files). Notice that the command install.packages("Name_of_the_package") can be used to install any other R package.
3. (Optional): Install JAGS
a. Download the installer from the webpage http://sourceforge.net/projects/
mcmc-jags/files/JAGS/4.x/Windows/ by clicking on the latest available exe-
cutable file (currently, JAGS-4.2.0.exe). Executing this file will install JAGS
on the user’s machine.
b. In the R terminal type the command
> install.packages("R2jags")
This will install the package R2jags, which allows R to interface with JAGS.
Linux or Mac OS users should follow slightly different approaches. The installation
of R is pretty much the same as for MS Windows users. From the webpage http://
cran.r-project.org/ select the relevant operating system (Linux or Mac OS) and then
the relevant version (e.g. debian, redhat, suse or ubuntu, for Linux). Follow the
instructions to install the software. Once this is done, open R and install the package
BCEA following the process described above.
OpenBUGS runs natively in Linux and so it can be installed following the instructions
given at http://openbugs.net/w/Downloads. First, download the most recent ver-
sion of the source file, currently OpenBUGS-3.2.3.tar.gz. Then open a Linux ter-
minal and follow these steps:
1. Unpack the file and move to the newly created directory OpenBUGS-3.2.3 by typing the following commands.

tar zxvf OpenBUGS-3.2.3.tar.gz
cd OpenBUGS-3.2.3
Notice that if the user does not have administrative access, the installation may fail. A possible workaround is to specify a location to which OpenBUGS should be installed that is owned by the user, for example

./configure --prefix=/home/user/myfolder
make
sudo make install

The ownership of the relevant folders can be checked by listing the contents of the parent directory (e.g. ls -ls /home/user), which returns a list of the folders and files contained in the folder /home/user.
This will look something like
  4 drwxr-xr-x  9 user user   4096 Jul  5 17:02 Desktop
  4 drwx------ 25 user user   4096 Jul 14 09:23 myfolder
196 -rw-rw-r--  1 root root 197534 Jul 11 16:08 some_file.png
...
and, in this case, the folder myfolder does belong to the user user and thus the
installation of OpenBUGS in that folder would be completed successfully.
It is also possible to install JAGS, following these steps:
1. Download the latest tar.gz file (currently, JAGS-4.2.0.tar.gz) from the web-
page http://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Source/.
2. Open a Linux terminal window, extract the content of the archive file and move
to the newly created folder JAGS-4.2.0
tar xzvf JAGS-4.2.0.tar.gz
cd JAGS-4.2.0
While OpenBUGS does not run natively under Mac OS, a possible workaround is to install a hardware virtualisation software such as Parallels Desktop for Mac OS (http://www.parallels.com/uk/products/desktop/), or a “compatibility layer”, such as wine (https://www.winehq.org/download/), which allow the user to run Windows applications on a Mac.
Conversely, JAGS does run natively under Mac OS and can be installed using the following steps:
1. Download the latest .dmg file (currently, JAGS-4.2.0.dmg) from https://sourceforge.
net/projects/mcmc-jags/files/JAGS/4.x/Mac%20OS%20X/
2. Double click the .dmg file to make its content available (the name will show up
in the Finder sidebar), usually a window opens showing the content as well;
3. Drag the application from the .dmg window into /Applications to install (you
may need an administrator password);
4. Wait for the copy process to finish;
5. Eject the .dmg (by clicking the eject button in the Sidebar);
6. Delete the .dmg from Downloads.
Several tutorials are available online to guide the user through the process of
installation and use of both OpenBUGS and JAGS.
2.3 Vaccine
Consider an infectious disease, for instance influenza, for which a new vaccine has
been produced. Under the current management of the disease some individuals treat
the infection by taking over-the-counter (OTC) medications. Some subjects visit their
doctor and, depending on the gravity of the infection, may receive treatment with
antiviral drugs, which usually cure the infection. However, in some cases complica-
tions may occur. Minor complications will need a second doctor’s visit after which
the patients become more likely to receive antiviral treatment. Major complications
are represented by pneumonia and can result in hospitalisation and possibly death. In
this scenario, the costs generated by the management of the disease are represented by
OTC medications, doctor visits, the prescription of antiviral drugs, hospital episodes
and indirect costs such as time off work.
The focus is on the clinical and economic evaluation of the policy that makes the
vaccine available to those who wish to use it (t = 1) against the null option (t = 0)
under which the vaccine will remain unavailable. More details of this example can
be found in [5] and references therein.
2.3.1.1 Assumptions
at risk, we can then derive the expected number of individuals experiencing each of
these events.
As for the costs, we consider the relevant resources as h = 1: doctor visits; h = 2:
hospital episodes; h = 3: vaccination; h = 4: time to receive vaccination; h = 5: days
off work; h = 6: antiviral drugs; h = 7: OTC medications; h = 8: travel to receive
vaccination. For each, we define ψh to represent the associated unit cost for which we
assume informative lognormal distributions, a convenient choice to model positive,
continuous variables such as costs.
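The lognormal hyper-parameters can be pinned down by matching interval information on the natural scale: the log-scale mean follows from the median, and the log-scale standard deviation from an upper quantile. The following Python sketch is purely illustrative (the helper lognormal_pars is ours, not part of the book's material); with the median and 97.5% limit of the first unit cost it approximately recovers the Lognormal(3.00, 0.0606) quoted in Table 2.1, where the second parameter is the log-scale variance.

```python
import math

def lognormal_pars(median, upper, z=1.959964):
    # If X ~ Lognormal(mu, sigma^2), then log(X) ~ Normal(mu, sigma^2), so
    #   median(X) = exp(mu)  and  97.5th percentile(X) = exp(mu + z * sigma)
    mu = math.log(median)
    sigma = (math.log(upper) - mu) / z
    return mu, sigma ** 2

# Example: a unit cost with median ~19.77 and 97.5% limit ~32.07
mu, var = lognormal_pars(19.77, 32.07)
print(round(mu, 2), round(var, 3))
```

The printed values should be close to 3.00 and 0.061, consistent with the table.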
Finally, we include in the model suitable parameters to represent the loss in quality
of life generated by the occurrence of the clinical outcomes. Let ω j represent the
QALYs lost when an individual experiences the j-th outcome. We assume that doctor
visits do not generate loss in QALYs and therefore set ω2 = ω3 := 0; the remaining
ω j ’s are modelled using informative lognormal distributions.
The assumptions encoded by this model are that we consider a population parameter θ = (θ0, θ1), with the two components being defined as θ0 = (βj, γ1, γ2, δ, ξ, η, λ, ψh, ωj) and θ1 = (φ, βj, ρv, γ1, γ2, δ, ξ, η, λ, ψh, ωj). We assume that the components of θ have the distributions specified in Table 2.1, which are derived by using
suitable “hyper-parameters” that have been set to encode knowledge D available
from previous studies and expert opinion. For example, the parameter φ identifies
a probability (the vaccine coverage) and we may have information about past sea-
sons to suggest that this has been estimated to be between 25 and 63%; this can be
translated to a Beta distribution whose parameters can be determined so that roughly
95% of the probability mass lie between these two values. See [5] for a more detailed
discussion of this point.
It is easy to check that the assumptions in terms of the interval estimates for the
parameters are consistent with the choice of distributions in R, for example using
something like the following code:
> phi <- rbeta(100000, 11.31, 14.44)
> c(quantile(phi, .025), quantile(phi, .5), quantile(phi, .975))
2.5% 50% 97.5%
0.2581039 0.4368227 0.6292335
The assumptions and the model structure defined above can be translated into suitable
code to perform a MCMC analysis and obtain estimates from the posterior distribu-
tions of all the relevant parameters. For example, we could write the following code
to be used with OpenBUGS or JAGS:
model {
# 1. Define the number of people in each group n[v,t], where t=1,2 is status
#    quo vs vaccination and v=1,2 is non-vaccinated vs vaccinated
# t=1: If the vaccine is not available, no one will use it
# number of vaccinated in the population
V[1] <- 0
# number of individuals in the two groups
# t=2: When the vaccine is available, some people will use it but some people won't
# number of vaccinated in the population
V[2] ~ dbin(phi, N)
# number of individuals in the two groups
n[1,2] <- N - V[2]   # non vaccinated
n[2,2] <- V[2]       # vaccinated
# 2. Vaccination coverage
phi ~ dbeta(a.phi, b.phi)
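The first two modules can be explored outside BUGS/JAGS with a quick forward simulation. The following Python sketch is illustrative only: it uses the Beta(11.31, 14.44) coverage prior quoted in the text and approximates the Binomial draw V by its conditional mean N·φ, since at this population size the uncertainty about φ dominates.

```python
import random

random.seed(1)
N = 100_000        # population size, as in the chapter
n_sims = 100_000

# phi ~ Beta(11.31, 14.44): the vaccine coverage
phis = [random.betavariate(11.31, 14.44) for _ in range(n_sims)]

# Under t = 2, V ~ Binomial(N, phi); here approximated by its mean N * phi
n_vacc = sorted(N * p for p in phis)
lo, med, hi = n_vacc[int(.025 * n_sims)], n_vacc[n_sims // 2], n_vacc[int(.975 * n_sims)]
print(round(lo), round(med), round(hi))
```

The three printed numbers should be close to 25,800, 43,700 and 62,900, matching the interval quoted above for φ.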
Table 2.1 Distributional assumptions for the model. For each parameter, the distributions are
chosen to model the available prior knowledge, represented by existing data or expert opinion.
The mathematical form of the distributions is chosen according to the nature of the parameter (e.g.
parameters describing the probability of occurrence of an event are usually given a Beta distribution),
while the values of the hyper-parameters are chosen so that the distribution is consistent with the
prior information derived from the clinical literature or expert opinion
Parameter  Mean      2.5%      Median    97.5%     Distribution
φ          0.435     0.245     0.436     0.625     Beta(11.31, 14.44)
β1         0.0701    0.0387    0.0680    0.1116    Beta(13.01, 172.38)
β2         0.295     0.124     0.288     0.497     Beta(5.80, 13.80)
β3         0.401     0.388     0.401     0.415     Beta(1909.50, 2851.86)
β4         0.01339   0.00852   0.01322   0.01938   Beta(20.94, 1538.71)
β5         0.000378  0.000223  0.000364  0.000616  Lognormal(−7.91, 14.93)
β6         0.000748  0.000366  0.000702  0.001331  Lognormal(−7.26, 7.66)
β7         0.1021    0.0255    0.0954    0.2265    Beta(3.50, 31.50)
ρ1         0.688     0.593     0.686     0.794     Lognormal(−0.374, 0.00524)
γ1         0.420     0.417     0.420     0.423     Beta(45471.58, 62794.09)
γ2         0.814     0.806     0.814     0.822     Beta(7701.86, 1759.89)
δ          6.97      2.00      7.00      12.00     Poisson(7.00)
ξ          0.950     0.940     0.950     0.959     Beta(1804.05, 94.95)
η          0.900     0.890     0.900     0.909     Beta(3239.10, 359.90)
λ          2.90      1.22      2.69      5.97      Lognormal(0.98, 0.17)
ψ1         20.55     12.36     19.77     32.07     Lognormal(3.00, 0.0606)
ψ2         2661.92   1554.18   2575.67   4106.98   Lognormal(7.85, 0.0606)
ψ3         7.21      4.22      6.95      11.42     Lognormal(1.95, 0.0606)
ψ4         10.26     6.16      9.92      15.90     Lognormal(2.29, 0.0606)
ψ5         46.31     27.20     44.96     70.69     Lognormal(3.80, 0.0606)
ψ6         3.86      2.39      3.73      5.95      Lognormal(1.31, 0.0606)
ψ7         1.592     0.949     1.562     2.452     Lognormal(0.44, 0.0606)
ψ8         0.807     0.484     0.776     1.311     Lognormal(−0.241, 0.0606)
ω1         4.26      2.14      4.05      7.59      Lognormal(1.40, 0.0993)
ω4         6.39      3.81      6.23      9.82      Lognormal(1.82, 0.0606)
ω5         6.34      3.83      6.15      9.94      Lognormal(1.82, 0.0606)
ω6         15.20     9.09      14.88     23.34     Lognormal(2.70, 0.054)
ω7         0.556     0.316     0.541     0.932     Lognormal(−0.634, 0.0717)
The model consists of nine modules as annotated in the code above. Notice that
the values of the parameters for each distribution are kept as variables (rather than
hard-coded as a fixed number). This is in general a good idea, since changes in
the assumed values can be reflected directly using the same code. Of course, this
means that the numerical value must be passed to the computer code somewhere else
in the scripting process. This, however, helps clarify the whole process and makes
debugging easier.
As can be seen, most of the commands in the BUGS/JAGS language are effectively
typed in a way that strongly resembles standard statistical notation, with the
tilde symbol ∼ indicating a stochastic relationship (i.e. a probability distribution),
while the assignment symbol <- indicates a logical (or deterministic) relationship.
Typically, this code is saved to a text file, say vaccine.txt. It is good practice to
store the files in a well-structured set of directories or at least to provide pointers for
R so that it can search for the relevant files efficiently. Examples include the directory
from which R is launched or alternatively in the directory that is currently in use by
R (also termed the “working directory”). The R command
> setwd("PATH_TO_RELEVANT_FOLDER")
can be used to set the working directory to any folder, while the command
> getwd()
[1] "/home/user/MyStuff"
returns the current (working) directory. Note that R uses Unix-like notation and
forward slashes / to separate folders in a text string. Conversely, MS Windows uses
backward slashes \ to accomplish the same task. This means that on a MS Windows
computer, the working directory will be defined by R as something like
> # On a Windows machine:
> getwd()
[1] "C:/user/MyStuff"
while the MS Windows notation (e.g. by copying and pasting the address of the folder
from the file explorer) would actually be "C:\user\MyStuff". It is thus important
to be careful when copying and pasting folder locations from MS Windows into R and
the user has two options, both based on Unix-like notation: the first one is to just
convert any backward slash to a forward slash. The second option is to escape the
backward slashes using a double backward slash (\\), for example as in the following
R code.
> # On a Windows machine, these two commands are the same:
> # 1. using forward slashes
> setwd("C:/user/MyStuff")
> # 2. escaping the backward slashes
> setwd("C:\\user\\MyStuff")
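The same slash conversion can be checked programmatically; for instance, Python's pathlib (shown here only as a language-neutral illustration of the conversion rule) maps a Windows-style path to the forward-slash notation that R expects:

```python
from pathlib import PureWindowsPath

# A Windows path as copied from the file explorer (backward slashes escaped)
win_path = "C:\\user\\MyStuff"

# ...converted to the forward-slash notation used by R
posix_path = PureWindowsPath(win_path).as_posix()
print(posix_path)   # C:/user/MyStuff
```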
The following R code is used to pre-process and load the data in the R workspace
before the model and the health economic analysis can be run.
> ## Launches the file Utils.R which contains useful functions used throughout
  this script
> source("http://www.statistica.it/gianluca/BCEABook/WebMaterial/Utils.R")
> ## Loads the values of the hyper-parameters (needed to run the Bayesian model
  using JAGS)
> # Number of people in the populations
> N <- 100000
(notice that the + at the beginning of a line inside the for loops is just R standard
notation to indicate commands that span over more than one line).
The very first line of the script executes the file Utils.R from its remote location
(http://www.statistica.it/gianluca/BCEABook/WebMaterial/Utils.R); this
file contains a set of functions and commands that are used throughout the script, and
so it is fundamental to launch this file before the rest of the script can be executed.
Although the number of files necessary to run the entire analysis may increase (thus,
at face value, increasing the complexity of the process), it is actually good programming
practice to use a combination of many smaller, focussed scripts, rather than include
all the commands and functions required in one single, massive file. This, again, makes
the process transparent and easier to debug or critically appraise.
The rest of the script defines the values for the parameters used in the distributions
associated to the quantities modelled and described above. For example, the function
betaPar2 (which is defined in the file Utils.R) can be used to determine the values
of the parameters of a Beta distribution so that its average is around 0.436 and 95%
of the mass is below the value of 0.6. In particular, running this command in an R
terminal gives the following output:
> betaPar2(.434, .6, .95)
$res1
[1] 11.30643
$res2
[1] 14.4411
$theta.mode
[1] 0.434
$theta.mean
[1] 0.4391267
$theta.median
[1] 0.437
$theta.sd
[1] 0.09595895
betaPar2 creates a list of results: the first two elements of the list, res1 and res2
are the estimated values of the parameters to be used with a Beta distribution so that
roughly 95% of the probability mass is below 0.6. This is in line with the assumptions
presented in Table 2.1 for the parameter φ. Again, we can check the appropriateness
of this choice by simply typing the following commands to the R terminal.
> phi <- rbeta(100000, 11.30643, 14.4411)
> c(quantile(phi, .025), quantile(phi, .5), quantile(phi, .975))
2.5% 50% 97.5%
0.2566397 0.4371251 0.6296423
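The logic behind a function like betaPar2 can be sketched in a few lines: fix the mode, then search over the concentration a + b until the required quantile constraint is met. The following Python sketch is our own illustrative re-implementation, not the code in Utils.R; it uses a stdlib-only numerical Beta CDF and bisection.

```python
import math

def beta_cdf(x, a, b, n=4000):
    # Numerical CDF of a Beta(a, b) evaluated at x (trapezoidal rule; adequate
    # for smooth densities with a, b > 1)
    lbeta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    h = x / n
    total = 0.0
    for i in range(1, n):
        t = i * h
        total += math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - lbeta)
    return total * h

def beta_params(mode, upper, prob):
    # Find (a, b) with the given mode such that P(X < upper) = prob, bisecting
    # on the concentration k = a + b (the CDF at 'upper' grows with k whenever
    # mode < upper, so bisection applies)
    lo, hi = 2.0001, 1000.0
    for _ in range(60):
        k = (lo + hi) / 2
        a, b = 1 + mode * (k - 2), 1 + (1 - mode) * (k - 2)
        if beta_cdf(upper, a, b) < prob:
            lo = k
        else:
            hi = k
    k = (lo + hi) / 2
    return 1 + mode * (k - 2), 1 + (1 - mode) * (k - 2)

a, b = beta_params(0.434, 0.6, 0.95)
print(round(a, 2), round(b, 2))   # close to betaPar2's 11.31 and 14.44
```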
The other elements of the list are theta.mode, theta.mean, theta.median and
theta.sd, which store the values for the mode, mean, median and standard deviation
of the resulting Beta distribution. Notice the R “dollar” notation, which can be used
to access elements of an object — in other words, if the object x is stored in the R
workspace and contains the elements y, z and w, then these can be accessed by using
the notation x$y, x$z or x$w.
Another thing to notice is that it is fairly easy to annotate the R code in an informative
way. This again increases transparency and facilitates the work of reviewers or
modellers called upon to perform a critical evaluation of the analysis process. In line with
the point we made above about using many simpler and specific files to execute the
several steps of the analysis, rather than one large (and potentially messy) file, it
is a good idea to save this code to a script file, say LoadData.R, again assumed to
be stored in the working directory. From within the R terminal, the script can be
launched and executed by typing the command
> source("LoadData.R")
At this point, the user is ready to run the model—in a full Bayesian context, this
typically means performing an MCMC analysis (cf. Sect. 1.2.3) to obtain a sample
from the posterior distribution of the random quantities of interest. We reiterate here
that these may be unobservable parameters as well as unobserved variables.
R is particularly effective at interfacing with the main software for Bayesian
analysis—here we refer to the most popular, OpenBUGS [3] and JAGS [4], but there is
also an R package to interface with a more recent addition, Stan [6]. This means that it is
possible to produce a set of scripts that can be run in R to pre-process the data, call
the MCMC sampler in the background and run the model (written in a .txt file, as
shown above) and then post-process the results, e.g. to obtain the suitable measures
of population average costs and effectiveness.
For example, the following commands can be used to run the Bayesian model
defined above:
> # Loads the package to run OpenBUGS or JAGS from R
> library(R2OpenBUGS)
> library(R2jags)
> # Defines the current folder as the working directory
> working.dir <- paste(getwd(), "/", sep = "")
> # Launches the file Utils.R which contains useful functions used
  throughout this script
> source("http://www.statistica.it/gianluca/BCEABook/WebMaterial/Utils.R")
> # Loads the data into R (assumes the file is stored in the
  working directory - if not the full path can be provided)
> source("LoadData.R")
psi","N.outcomes","N.resources","mu.lambda","tau.lambda","a.xi","b.xi","a.eta","b.eta","a.delta")
> # Prints the summary stats and attaches the results to the R
  workspace
> print(vaccine, digits = 3, intervals = c(0.025, 0.975))
> # In OpenBUGS:
> attach.bugs(vaccine)
> # In JAGS:
> attach.jags(vaccine)
For convenience, we can save them in a file, say RunMCMC.R, which can then be
run from within the R terminal using the source command.
> source("RunMCMC.R")
This script proceeds by first loading the relevant packages (which allow R to
interface with either OpenBUGS or JAGS); this can be done using the command
library(R2OpenBUGS) or library(R2jags), depending on the Bayesian software
of choice. Of course, for these to work, either or both OpenBUGS and JAGS need to be
installed on the user’s machine (we refer interested readers to Sect. 2.2 or the relevant
websites, where information is provided on installation and use under different oper-
ating systems). In the first part of the script, we also execute the files Utils.R and
LoadData.R, presented above, which prepare the data for either OpenBUGS or JAGS
to use. Finally, the current folder is set up as the working directory (but of course,
the user can choose any folder for this).
The next step amounts to storing all the relevant input data for the model code
into a list. In this case, we need to include all the values for the parameters of the
distributions used in the file vaccine.txt, which encodes the model assumptions.
Then, we instruct R to read the model assumptions from the file vaccine.txt and
finally we define the “parameters” to be monitored. Again, we note that with this
terminology we refer to any unobserved or unobservable quantity for which we
require inference in the form of a sample from the posterior distribution.
Before we run OpenBUGS or JAGS we need to define the list of “initial values”,
which are used to start the Markov chain(s). Notice that both BUGS and JAGS can
randomly generate initial values. However, it is generally better to closely control
this process [7]. This can be done by creating a suitable R function that stores in a list
random values for all the quantities that need initialisation. These are obtained by
specifying the underlying distribution—for instance, in this case we are generating
the initial value for φ from a Uniform(0, 1) distribution (this is reasonable as φ is a
probability and so it needs to have a continuous value between 0 and 1). In principle,
any quantity that is modelled using a probability distribution and is not observed
needs to be initialised. With reference to the model code presented above, it would
not be possible to initialise the node n[1,2], because it is defined as a deterministic
function of other quantities (in this case N and V[2]).
Finally, we define the total number of iterations, the number of iterations to be
discarded in the estimate of the posterior distributions (burn-in) and the possible value
of the “thinning”. This refers to the operation of only saving one every l iterations from
the Markov Chains. This can help reduce the level of autocorrelation in the resulting
chains. For example, we could decide to store 1,000 iterations and obtain this either
by saving the last 1,000 runs from the overall process (i.e. by discarding the first 9,000
of the 10,000 iterations produced), or by running the process for 100,000 iterations,
discarding the first 9,500 and then saving one every 181 iterations. Of course, the
latter alternative involves a longer process just to end up with the same number
of samples on which to base the estimation of the posteriors. But the advantage is
that it is likely that it will show a lower level of autocorrelation, which means a
larger amount of information and thus better precision in characterising the target
distributions.
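The bookkeeping in the example above is easy to verify: the number of saved draws is the post-burn-in run length divided by the thinning interval, summed over chains (two chains, as reported in the print output later in this section). A trivial Python check (the helper name n_saved is ours):

```python
def n_saved(n_iter, n_burnin, n_thin, n_chains=1):
    # Draws retained per chain: one every n_thin of the post-burn-in iterations
    return n_chains * ((n_iter - n_burnin) // n_thin)

# Save the last 1,000 of 10,000 iterations (no thinning), single chain:
print(n_saved(10_000, 9_000, 1))        # 1000
# Run 100,000 iterations, discard 9,500, keep one draw every 181, two chains:
print(n_saved(100_000, 9_500, 181, 2))  # 1000
```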
Once these steps have been executed, we can use the commands bugs or jags
to run the model. Both would call the relevant MCMC sampler in the back-
ground and produce the MCMC estimates. When the process is finished, the user
regains control of the R session. A new object, in this case named vaccine, is
created in the current workspace. This object can be manipulated to check model
convergence, visualise the summary results (using the print method available for
both R2OpenBUGS and R2jags) and save the results (i.e. the simulated values from
the posterior distributions) to the R workspace.
For example, a summary table can be obtained as follows (here, we only present
the first few rows, for simplicity):
> print(vaccine, interval = c(.025, .975), digits = 3)
Inference for Bugs model at "vaccine.txt", fit using jags,
 2 chains, each with 10000 iterations (first 9500 discarded), n.thin = 181
 n.sims = 1000 iterations saved
                mu.vect  sd.vect    2.5%     97.5%  Rhat n.eff
Adverse.events 4384.479 2518.102 969.425 10740.800 1.005   310
Death[1,1]        1.573    1.539   0.000     5.000 1.000  1000
Death[2,1]        0.850    1.084   0.000     4.000 1.001  1000
Death[1,2]        0.000    0.000   0.000     0.000 1.000     1
Death[2,2]        0.248    0.545   0.000     2.000 1.000  1000
GP[1,1]        2045.987  896.964 654.925  4092.150 1.000  1000
GP[2,1]        1148.308  543.198 340.925  2435.475 1.000  1000
GP[1,2]           0.000    0.000   0.000     0.000 1.000     1
GP[2,2]         279.658  151.580  78.000   658.325 1.000  1000
Hospital[1,1]     0.764    0.959   0.000     3.000 1.001  1000
Hospital[2,1]     0.438    0.698   0.000     2.000 1.002   620
...
For each parameter included in the list of quantities to be monitored, this table shows
the mean and standard deviation (the columns labelled as mu.vect and sd.vect),
together with the 2.5 and 97.5% quantiles of the posterior distributions (which give
a rough approximation of a 95% credible interval).
The final columns of the table (indexed by the labels Rhat and n.eff, respectively)
present some important convergence statistics. The first one is the potential scale
reduction R̂, often termed the Gelman–Rubin statistic. This quantity can be computed
when the MCMC process is based on running at least two parallel chains; it basically
compares the within-chain to the between-chain variability. The rationale is that when this
ratio is close to 1, there is some evidence of “convergence”, because all the
chains present similar variability and do not differ substantially from one another, thus
indicating that they are all visiting a common area of the parameter space. Typically,
values below the arbitrary threshold of 1.1 are considered to suggest convergence to
the relevant posterior distributions.
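The potential scale reduction can be computed directly from the saved chains; a minimal Python sketch of the classic formula (within-chain variance W, between-chain variance B, pooled variance estimate; the helper rhat and the simulated chains are illustrative only) is:

```python
import random

def rhat(chains):
    # Gelman-Rubin potential scale reduction for m chains of equal length n
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)      # between-chain
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m                  # within-chain
    var_hat = (n - 1) / n * W + B / n   # pooled estimate of the posterior variance
    return (var_hat / W) ** 0.5

random.seed(2)
# Two well-mixed chains targeting the same distribution vs two "stuck" chains
mixed = [[random.gauss(0, 1) for _ in range(500)] for _ in range(2)]
stuck = [[random.gauss(mu, 1) for _ in range(500)] for mu in (0, 3)]
print(round(rhat(mixed), 3), round(rhat(stuck), 3))
```

The first value falls below the conventional 1.1 threshold, while the second is well above it, flagging non-convergence.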
The second one is the “effective sample size” n.eff. The idea behind this quantity is
that the MCMC analysis is based on a sample of n.sims iterations (in this case, this is
1,000). Thus, if these were obtained as a sample of independent observations from
the posterior distributions, they would be worth exactly 1,000 data points. However,
because MCMC is a process in which future observations depend on the current
one, there is some intrinsic “autocorrelation”, which means that often a sample of
S iterations has a value in terms of information that is actually lower. This value is
quantified by the effective sample size. When n.eff is close to n.sims, this indicates
that the level of autocorrelation is low and that in effect the n.sims points used to
obtain the summary statistics are worth more or less their nominal value. On the
other hand, when the two are very different this indicates that the MCMC sample
contains less information about the posterior.
For example, because of the autocorrelation, the 1,000 simulations used to characterise
the posterior distribution of the node Adverse.events are actually equivalent
to a sample of around 310 independent observations from that posterior. In
cases such as this, when R̂ < 1.1 but n.eff is much smaller than n.sims we could
conclude that the sample obtained has indeed converged to the posterior distribu-
tion but does not contain enough information to fully characterise it. For example,
the mean and the central part of the distribution may be estimated with good preci-
sion, but the tails may not. One easy (albeit potentially computationally intensive)
workaround is to run the MCMC for a (much) longer run and possibly increase the
thinning.
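The effective sample size can likewise be estimated from the empirical autocorrelations, e.g. as n / (1 + 2 Σ ρk), truncating the sum at the first non-positive autocorrelation. The Python sketch below implements this common estimator (not necessarily the exact rule used by R2jags; helper name and simulated chains are ours):

```python
import random

def ess(x):
    # Effective sample size: n / (1 + 2 * sum of positive-lag autocorrelations)
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    acc = 0.0
    for k in range(1, n):
        rho = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / (n * var)
        if rho <= 0:        # truncate at the first non-positive autocorrelation
            break
        acc += rho
    return n / (1 + 2 * acc)

random.seed(3)
iid = [random.gauss(0, 1) for _ in range(1000)]     # independent draws
ar1, prev = [], 0.0
for _ in range(1000):                               # AR(1) chain, phi = 0.9
    prev = 0.9 * prev + random.gauss(0, 1)
    ar1.append(prev)
print(round(ess(iid)), round(ess(ar1)))
```

For the independent sample the estimate stays close to the nominal 1,000 draws, while the strongly autocorrelated chain is worth far fewer, mirroring the Adverse.events example above.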
Additional analyses to check convergence may be performed, for example by
inspecting traceplots of the chains (e.g. as in Fig. 1.3(d)), using the
following command
> traceplot(vaccine)
which produces an interactive traceplot for each of the monitored nodes. More
advanced graphing and analysis can be done by subsetting the object vaccine and
accessing the elements stored therein. Details on how to do this are shown, for exam-
ple, in [7].
In order to perform the economic analysis, we need to define suitable summary measures
of cost and effectiveness. The total cost associated with each clinical resource
can be computed by multiplying the unit cost ψh by the number of patients consuming
it. For instance, the overall cost of doctor visits is $(GP^{(1)}_{tv} + GP^{(2)}_{tv}) \times \psi_1$. If, for
convenience of terminology, we indicate with Ntvh the total number of individuals
consuming the h-th resource under intervention t and in group v, we can then extend
this reasoning and compute the average population cost under intervention t as

$$c_t := \frac{1}{N} \sum_{v=0}^{1} \sum_{h=1}^{8} N_{tvh}\, \psi_h. \qquad (2.1)$$
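Equation (2.1) is just a double sum over groups and resources, which can be sanity-checked on toy numbers; in this Python sketch (toy inputs, not data from the chapter) counts[v][h] plays the role of Ntvh and unit_costs[h] of ψh:

```python
def average_cost(counts, unit_costs, N):
    # c_t = (1/N) * sum over groups v and resources h of N_tvh * psi_h
    return sum(n_vh * psi
               for row in counts
               for n_vh, psi in zip(row, unit_costs)) / N

# Toy example: 2 groups, 3 resources, population of 1,000
counts = [[100, 10, 5],    # non-vaccinated (v = 0)
          [50, 2, 1]]      # vaccinated (v = 1)
unit_costs = [20.0, 2500.0, 7.0]
print(average_cost(counts, unit_costs, 1000))   # 33.042
```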
Similarly, the total QALYs lost due to the occurrence of the relevant outcomes
can be obtained by multiplying the number of individuals experiencing them by ωj.
For example, the total number of QALYs lost to influenza infection can be computed
as Itv × ω1. If we let Mtvj indicate the number of subjects with the j-th outcome
under intervention t and in group v, the average population measure of effectiveness
can be defined, summing over the relevant outcomes, as

$$e_t := -\frac{1}{N} \sum_{v=0}^{1} \sum_{j} M_{tvj}\, \omega_j. \qquad (2.2)$$
The results of the MCMC procedure used to run the model described above can
be obtained by simply running the scripts discussed in Sect. 2.3.1. However, they are
also available in the file vaccine.RData, which can be downloaded directly from
http://www.statistica.it/gianluca/BCEABook/vaccine.RData. For example, this can
be loaded into the R session by typing the following command:
> load(url("http://www.statistica.it/gianluca/BCEABook/vaccine.RData"))
> ls()
 [1] "Adverse.events" "Death"          "GP"             "Hospital"
 [5] "Infected"       "N"              "Pneumonia"      "Repeat.GP"
 [9] "delta"          "eta"            "gamma"          "lambda"
[13] "n"              "n.sims"         "omega"          "psi"
[17] "xi"
Each of these R objects contains n sims = 1000 simulations from the relevant pos-
terior distributions. Before the economic analysis can be run, it is necessary to define
the measures of overall cost and effectiveness given in Eqs. (2.1) and (2.2), respec-
tively. This can be done using the results produced by the MCMC procedure with
the following R code. Notice that since the utilities are originally defined in
quality-adjusted life days, it is necessary to rescale them to obtain QALYs.
> ## Compute effectiveness in QALYs lost for both strategies
> QALYs.inf <- QALYs.pne <- QALYs.hosp <- QALYs.adv <- QALYs.death <- matrix(0, n.sims, 2)
> for (t in 1:2) {
    QALYs.inf[,t] <- ((Infected[,t,1] + Infected[,t,2]) * omega[,1]/365)/N
    QALYs.pne[,t] <- ((Pneumonia[,t,1] + Pneumonia[,t,2]) * omega[,4]/365)/N
    QALYs.hosp[,t] <- ((Hospital[,t,1] + Hospital[,t,2]) * omega[,5]/365)/N
    QALYs.death[,t] <- ((Death[,t,1] + Death[,t,2]) * omega[,6])/N
  }
> QALYs.adv[,2] <- (Adverse.events * omega[,7]/365)/N
> e <- -(QALYs.inf + QALYs.pne + QALYs.adv + QALYs.hosp + QALYs.death)
The notation Infected[,t,1] indicates all the simulations (the first dimension
of the array) for the t-th intervention (which the for loop sets sequentially to 1
and 2 to indicate t = 0, 1, respectively) and for the first vaccination group. Sim-
ilarly, Infected[,t,2] indicates all the simulations for the t-th intervention and
for the second vaccination group. Thus, each of these two elements effectively produces
the value Mtv1 (where j = 1 indicates the first outcome); consequently, the
code ((Infected[,t,1] + Infected[,t,2])*omega[,1]/365)/N identifies
the quantity $\frac{1}{N}\sum_{v=0}^{1} M_{tv1}\,\omega_1$. Following a similar reasoning for all the other outcomes
and summing them all up, we obtain the measure of effectiveness, which
is stored in a matrix e with n sims rows and 2 columns (one for each intervention
considered).
We can follow a similar strategy to identify the costs associated with each intervention.
First we define the number of “users” (which we indicated earlier as Ntvh
and which, depending on the resource, is given by the number of doctor (general
practitioner) visits, hospitalisations, infections, repeated GP visits, or individuals at
risk); then we multiply these by the associated unit cost (contained in the variable
psi). Finally, we sum all the components to derive the overall average cost for each
treatment strategy.
> ## Compute costs for both strategies
> cost.GP <- cost.hosp <- cost.vac <- cost.time.vac <- cost.time.off <- cost.trt1 <- cost.trt2 <- cost.otc <- cost.travel <- matrix(0, n.sims, 2)
> for (t in 1:2) {
    cost.GP[,t] <- (GP[,t,1] + GP[,t,2] + Repeat.GP[,t,1] + Repeat.GP[,t,2]) * psi[,1]/N
    cost.hosp[,t] <- (Hospital[,t,1] + Hospital[,t,2]) * psi[,2]/N
    cost.vac[,t] <- n[,2,t] * psi[,3]/N
    cost.time.vac[,t] <- n[,2,t] * psi[,4]/N
    cost.time.off[,t] <- (Infected[,t,1] + Infected[,t,2]) * psi[,5] * eta * lambda/N
    cost.trt1[,t] <- (GP[,t,1] + GP[,t,2]) * gamma[,1] * psi[,6] * delta/N
    cost.trt2[,t] <- (Repeat.GP[,t,1] + Repeat.GP[,t,2]) * gamma[,2] * psi[,6] * delta/N
    cost.otc[,t] <- (Infected[,t,1] + Infected[,t,2]) * psi[,7] * xi/N
    cost.travel[,t] <- n[,2,t] * psi[,8]/N
  }
> c <- cost.GP + cost.hosp + cost.vac + cost.time.vac + cost.time.off + cost.trt1 + cost.trt2 + cost.travel + cost.otc
At this point we are ready to run the Decision Analysis and the Uncertainty
Analysis, which BCEA can take care of. We present these parts in Chaps. 3 and 4.
Assumptions
The dataset includes N = 50 data points nested within S = 24 studies. For each
study arm i = 1, . . . , N we observe a variable ri indicating the number of patients
quitting smoking out of a total sample size of n i individuals. In addition, we also
record a variable ti taking on the possible values 1, 2, 3, 4, indicating the treatment
associated with the i-th data point. The nesting within the trial is accounted for by a
variable si taking values in 1, . . . , S.
Most studies are simple head-to-head comparisons (i.e. comparing only two
interventions), while two of them are multi-arm trials (the first one involving
t = 1, 3, 4, and the second one comparing t = 2, 3, 4). Most trials compare one of
the active treatments t = 2, 3, 4 against the control treatment “No intervention”. Five
of the studies consider comparisons between two or more active treatments. The full
dataset is presented in Table 2.2.
Figure 2.1 shows the description of the “network” of data available—the process
of combining this information into a consistent framework is often referred to as
“Network Meta-Analysis” (NMA).
For each study arm we model the number of observed quitters as the realisation
of a Binomial random variable:
ri ∼ Binomial(pi, ni)
where pi is the arm-specific probability of smoking cessation. The main objective of the
model is to use the available data to derive a pooled estimate of πt, the intervention-
specific probability of smoking cessation.
2.4 Smoking Cessation 45
Table 2.2 The dataset containing information on the S = 24 trials on smoking cessation. The data
were originally reported in [11]
Study (si ) Intervention (ti ) Quitters (ri ) Participants (n i ) Comparator (ci )
1 1 9 140 1
1 3 23 140 1
1 4 10 138 1
2 2 11 78 2
2 3 12 85 2
2 4 29 170 2
3 1 75 731 1
3 3 363 714 1
4 1 2 106 1
4 3 9 205 1
5 1 58 549 1
5 3 237 1561 1
6 1 0 33 1
6 3 9 48 1
7 1 3 100 1
7 3 31 98 1
8 1 1 31 1
8 3 26 95 1
9 1 6 39 1
9 3 17 77 1
10 1 79 702 1
10 2 77 694 1
11 1 18 671 1
11 2 21 535 1
12 1 64 642 1
12 3 107 761 1
13 1 5 62 1
13 3 8 90 1
14 1 20 234 1
14 3 34 237 1
15 1 0 20 1
15 4 9 20 1
16 1 8 116 1
16 2 19 149 1
17 1 95 1107 1
17 3 143 1031 1
18 1 15 187 1
(continued)
We use the following strategy. First we model the probabilities pi using a structured
formulation

$$\text{logit}(p_i) = \mu_{s_i} + \delta_{s_i,t_i}\left[1 - \mathbb{I}\{t_i = b_{s_i}\}\right].$$
The parameter δsi,ti represents the incremental effect of treatment ti with respect to
the reference intervention being considered in study si. Specifically, we assume, by
common convention, that the intervention associated with the minimum label value
found in each study is the reference intervention for that study. This formulation
allows for a clear specification of study-specific effects and can be easily extended to
include study-treatment interaction. The reference (or baseline) intervention for each
study is indicated by bsi; thus δsi,bsi = 0, with the effect of the baseline intervention
for each study s represented by μs. Consequently, in each study s we assume that the
comparator’s effect is the study baseline and that the incremental effect of treatment
t is represented by δs,t if t ≠ bs.
The parameters in μ are given independent minimally informative Normal distributions
$\mu_s \stackrel{iid}{\sim} \text{Normal}(0, v)$, where v is a large fixed value identifying the initial value
of the variance of the distributions. On the contrary, we assume that the parameters
δsi,ti represent “structured” effects

$$\delta_{s_i,t_i} \sim \text{Normal}(md_i, \sigma^2), \qquad \text{with} \qquad md_i = d_{t_i} - d_{b_{s_i}},$$

where dt is the pooled effect of treatment t relative to the overall reference t = 1.
The pooled, intervention-specific probabilities of smoking cessation are then obtained
on the logit scale as

$$\pi_0 = \frac{1}{\sum_{s=1}^{S} \mathbb{I}\{b_s = 1\}} \sum_{s:\, b_s = 1} \mu_s, \qquad \text{logit}(\pi_t) = \pi_0 + d_t, \quad t = 1, \ldots, T.$$
We run the MTC model in JAGS (although the code presented below will also work
in OpenBUGS with only minor modifications—cfr. Sect. 2.3.1.4). To run the Bayesian
evidence synthesis model, it is necessary to store the model specification in a text
file that will then be interpreted by JAGS. This file contains the description of the
Bayesian model in terms of the stochastic and deterministic relationships between
the variables building the model network or graph (more precisely, a directed acyclic
graph, or DAG).
The model is an adaptation from the specifications reported by Welton et al.
(2012) and the NICE Decision Support Unit (2013) [9, 10]. The JAGS code used for
the analysis of the smoking cessation data is shown below:
### JAGS model ###
model {
  for (i in 1:nobs) {
    r[i] ~ dbin(p[i], n[i])
    p[i] <- ilogit(mu[s[i]] + delta[s[i], t[i]])
    delta[s[i], t[i]] ~ dnorm(md[i], tau)
    md[i] <- d[t[i]] - d[b[s[i]]]
  }
  for (i in 1:ns) {
    mu[i] ~ dnorm(0, .0001)
    AbsTrEf[i] <- ifelse(b[i] == 1, mu[i], 0)
  }
  pi0 <- sum(AbsTrEf[]) / incb
  tau <- pow(sd, -2)
  sd ~ dunif(0.00001, 2)
  d[1] <- 0
  for (k in 2:nt) {
    d[k] ~ dnorm(0, .0001)
  }
  for (j in 1:nt) {
    logit(pi[j]) <- pi0 + d[j]
    for (k in 1:nt) {
      lor[j, k] <- d[j] - d[k]
      log(or[j, k]) <- lor[j, k]
      rr[j, k] <- pi[j] / pi[k]
    }
  }
}
To run the analysis it is necessary to save the model in a plain text file. No specific
extensions are required and in this example we will save the file with the name
smoking_model_RE.R in the directory from which we run R. We will assume that
the csv file containing the data inputs (i.e. smoking_data.csv) is in the same folder.
The directory R is using can be displayed using the command getwd() and can be
modified by specifying the desired address as the argument of the function setwd,
i.e. setwd("PATH_TO_NEW_DIRECTORY").
It is necessary to import the data into R and to pre-process the inputs prior to
running the Bayesian model. This can be done by running the following code:
> # load the R2jags package and the data file
> library(R2jags)
> smoking = read.csv("smoking_data.csv", header = TRUE)
The package R2jags is necessary to connect R and JAGS, and is loaded with the
command library(R2jags). The command read.csv is used to read into R the
data inputs contained in the csv file smoking_data.csv, which will be saved as a
data.frame object. Since the quantities need to be available in the R workspace, they
are saved as new R variables. The baseline treatment is incremented by one when
saving it with the command b=b_i+1, so that the comparator t = 0 (no intervention)
is associated with the index 1, the intervention t = 1 (self-help) with the index 2, and
so on. This is because both R and JAGS index arrays with the first element starting from
1 (as opposed to 0). The total number of studies, the arm index for each observation
in the respective trial, the number of comparators and observations and the number of
trials including the baseline reference treatment, in this case t = 0 (no intervention),
are also calculated from the data.
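The pre-processing steps just described are not reproduced in full in this extract; the following is a minimal sketch of the kind of code involved, using toy data in place of the real smoking_data.csv (the column names s_i, t_i, r_i, n_i and b_i are assumptions, not necessarily those of the actual file):

```r
# Toy stand-in for the data read from smoking_data.csv (assumed column names)
smoking <- data.frame(s_i = c(1, 1, 2, 2),       # study indicator
                      t_i = c(0, 2, 0, 3),       # treatment arm (0-indexed)
                      r_i = c(9, 23, 11, 12),    # quitters
                      n_i = c(140, 140, 78, 85), # participants
                      b_i = c(0, 0, 0, 0))       # study baseline treatment
s <- smoking$s_i
t <- smoking$t_i + 1                   # re-index treatments to start from 1
r <- smoking$r_i
n <- smoking$n_i
b <- as.numeric(tapply(smoking$b_i + 1, s, unique))  # study-level baseline
ns <- length(unique(s))                # number of studies
nt <- length(unique(t))                # number of treatments observed
nobs <- nrow(smoking)                  # number of observations (study arms)
incb <- sum(b == 1)                    # studies with "no intervention" baseline
```

In the actual analysis nt would be the total number of treatments in the network (4) rather than the number appearing in this toy subset.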
The jags function used to run the Bayesian evidence synthesis model requires
several inputs:
• data: a named list including all the inputs needed by the model;
• inits: a list of initial values or a function generating the initial values for (a
subset of) the stochastic parameters in the model. In this example, we set inits
to NULL, which means that JAGS will choose at random the initial values for all
the parameters in the model. The initial values of the parameters will be randomly
drawn from the space of values they can assume, determined by their stochastic
definition;
• parameters.to.save: a vector of variables to monitor, i.e. the parameters of
interest. JAGS will save the output of the simulations from the associated posterior
distributions only of the monitored parameters;
• model.file: the name or address of the file containing the model. Since we pre-
viously saved the model in the R as the file smoking_model_RE.R, the name of
this file will be the value passed to this argument;
• n.chains: the number of parallel Markov chains to run. It is highly recommended
that these are at least 2, to allow for checking the convergence and the mixing of
the chains;
• n.iter: the number of iterations to perform for each chain from initialisation;
• n.thin: the thinning rate, i.e. after how many iterations a single value from the
posterior distribution is saved, discarding the others;
• n.burnin: the length of the burn-in, i.e. the number of simulations to discard after
the initialisation of the chains before saving any value. If not specified, as in this
case, it is set by default to n.iter/2.
More details on how to run a JAGS model and then post-process its results for the
purposes of health economic analysis are given in [7].
At this point, all the necessary data inputs have been pre-processed and it is
possible to run the MTC analysis model:
> # define data and parameters to monitor
> inputs = list("s", "n", "r", "t", "ns", "nt", "b", "nobs", "incb", "na")
> pars = c("rr", "pi", "p", "d", "sd", "T")
> smoking_output <- jags(data = inputs, inits = NULL, parameters.to.save = pars,
    model.file = model.file, n.chains = 2, n.iter = 20000, n.thin = 10)
The jags function will save the output of the model in the rjags object which
we called smoking_output. A summary of the model results can be printed out by
executing the following line of code:
> print ( smoking_output )
Inference for Bugs model at "smoking_model_RE.txt", fit using jags,
 2 chains, each with 20000 iterations (first 10000 discarded), n.thin = 10
 n.sims = 2000 iterations saved
mu . vect sd . vect 2.5% 25% 50% 75% 97.5% Rhat n. eff
d [1] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
d [2] 0.499 0.395 -0.278 0.245 0.501 0.757 1.306 1.001 2000
d [3] 0.843 0.239 0.383 0.684 0.833 0.995 1.338 1.000 2000
d [4] 1.107 0.446 0.248 0.817 1.094 1.391 2.011 1.001 2000
pi [1] 0.062 0.012 0.041 0.054 0.061 0.069 0.086 1.002 1000
pi [2] 0.100 0.031 0.053 0.078 0.096 0.117 0.172 1.001 2000
pi [3] 0.132 0.021 0.096 0.118 0.131 0.145 0.174 1.003 730
pi [4] 0.169 0.051 0.087 0.134 0.164 0.199 0.287 1.001 2000
rr [1 ,1] 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1
rr [2 ,1] 1.685 0.635 0.774 1.257 1.585 2.003 3.232 1.001 2000
rr [3 ,1] 2.200 0.497 1.416 1.858 2.129 2.469 3.346 1.000 2000
rr [4 ,1] 2.878 1.181 1.254 2.090 2.657 3.432 5.677 1.001 2000
rr [1 ,2] 0.676 0.252 0.309 0.499 0.631 0.795 1.292 1.001 2000
rr [2 ,2] 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1
rr [3 ,2] 1.454 0.551 0.672 1.054 1.366 1.726 2.814 1.001 2000
rr [4 ,2] 1.849 0.814 0.727 1.291 1.691 2.253 3.865 1.001 2000
rr [1 ,3] 0.476 0.103 0.299 0.405 0.470 0.538 0.706 1.000 2000
rr [2 ,3] 0.785 0.295 0.355 0.579 0.732 0.949 1.488 1.001 2000
rr [3 ,3] 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1
rr [4 ,3] 1.324 0.489 0.596 0.989 1.249 1.559 2.464 1.001 1700
rr [1 ,4] 0.403 0.159 0.176 0.291 0.376 0.478 0.798 1.001 2000
rr [2 ,4] 0.646 0.288 0.259 0.444 0.591 0.774 1.375 1.001 2000
rr [3 ,4] 0.856 0.316 0.406 0.641 0.801 1.011 1.677 1.001 1700
rr [4 ,4] 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1
sd 0.601 0.128 0.395 0.510 0.589 0.676 0.896 1.007 2000
deviance 281.227 9.952 263.929 274.289 280.731 287.548 302.935 1.001 2000
The table above reports, for each monitored variable, the estimated mean, standard
deviation, several percentiles of the posterior distribution (2.5, 25, 50, 75 and 97.5%),
the Gelman–Rubin convergence diagnostic (Rhat) and the effective sample size n.eff.
The latter gives an indication of the presence of autocorrelation within the chains by
quantifying the information contained in the vector of simulations used to estimate
each parameter. The percentiles can be used to approximate a credible interval
(CrI), an interval of values containing a given posterior probability mass; assuming
unimodality of the distributions (as is the case here), the posterior 95% CrI can be
approximated by taking the 2.5 and 97.5% percentiles as the lower and upper bounds,
respectively.1
The diagnostic measures indicate good convergence of the model, with the
Gelman–Rubin statistic below 1.1 for all quantities. In addition, the effective
sample size does not signal the presence of autocorrelation within the chains.
From this output we can observe that the most effective treatment is t = 4 (Group
counselling), with an associated probability of patients quitting smoking equal to
π_4 = 0.17 (95% CrI [0.09; 0.29]). It is followed by: t = 3 (Individual counselling),
associated with a probability of quitting of π_3 = 0.13 (95% CrI [0.10; 0.17]); t = 2
(Self-help), with a probability of π_2 = 0.10 (95% CrI [0.05; 0.17]); and lastly t = 1
(No intervention), with an estimated probability of quitting equal to 0.06 (95% CrI
[0.04; 0.09]). The results are represented graphically in Fig. 2.2. It can be observed
that the uncertainty associated with the effect size of group counselling is high, with
a 95% credible interval wider than those of the other interventions.
The plot in Fig. 2.2 can be reproduced using the following code:
> attach.jags(smoking_output)
> tr.eff = data.frame(t(apply(pi, 2, quantile, c(0.025, 0.975))))
> names(tr.eff) = c("low", "high")
> treats = c("No intervention", "Self-help", "Individual counselling",
    "Group counselling")
> tr.eff = cbind(tr.eff, mean = smoking_output$BUGSoutput$mean$pi,
    interventions = factor(treats, levels = treats))
> detach.bugs()
1 It should be noted that the estimation of the "effective number of parameters" p_D is controversial.
The definition reported in [12] and [3], which is also the one adopted in BUGS, should be preferred
to the one reported by R2jags [13]. The statistic is calculated by R2jags as p_D = Var(D)/2, i.e.
half the posterior variance of the deviance, while both [12] and [3] report that the preferred
definition is

p_D = D̄_model − D(θ̂),

where D̄_model is the posterior mean of the model deviance and D(θ̂) is the deviance evaluated at
the posterior mean of the vector of parameters θ. It should be noted that the definition
of p_D has a direct impact on the Deviance Information Criterion (DIC), an index commonly
used for model comparison, defined as DIC = D̄_model + p_D = D(θ̂) + 2p_D.
Fig. 2.2 Mean and 95% credible intervals (CrI) of the re-estimated treatment effects for each
treatment. Group counselling was estimated to have the best efficacy, followed by individual
counselling, self-help and no intervention. The credible interval associated with the group
counselling estimate was substantially wider than those of the other comparators
The simulations from the posterior distributions of the parameters that will be used
in the economic model are stored in the BUGSoutput element of the rjags output
object. The vectors of simulations can be attached to the current workspace by using
the command attach.jags, which makes the values available in the workspace.2
As the economic model will be based on the 2,000 values obtained in JAGS, it is
necessary to extract these values from the output object. In the following code, we
attach the JAGS output to the workspace and copy the values simulated from the
posterior distributions of the estimated probability of cessation for each treatment
π = (π1 , π2 , π3 , π4 ) in a 2000 × 4 matrix pi. The latter will be used as inputs for
the economic model.
> attach.jags(smoking_output)
> pi <- pi
2 When using OpenBUGS with R2OpenBUGS or JAGS with R2jags, the object can be attached to the
R workspace using the command attach.bugs(object) or attach.jags(object), respectively.
Similarly to the Vaccine example, we now need to include other variables and, generally,
post-process the output of the Bayesian model to obtain the quantities necessary
to perform the Decision and Uncertainty Analyses.
For example, in this case no data on the costs were provided in addition to the
effectiveness reported in [11]. Thus, for the purposes of this example, we extracted
information published in [14], which reported costs for different classes of interventions
for smoking cessation. The costs in British pounds were taken from [15]. Although
the interventions reported in [11] were not described in detail, a comparison of the
meta-analysis results with the comparative efficacy measures given in [14] showed
consistent results, indicating substantial similarity between the interventions in the
two studies.
The costs for the comparators included in the analysis are composed as follows:
No intervention:
• No costs: £0;
Self-help:
• Nicotine replacement therapy (NRT) for five weeks (35 patches at £1.30 each);
Individual counselling:
• NRT for five weeks (35 patches at £1.30 each);
• Five clinic visits (£10.00 each);
Group counselling:
• NRT for five weeks (35 patches at £1.30 each);
• Five group visits (£19.46 each).
The total average costs per intervention were: £0 for t = 0; £45.50 for t = 1;
£95.50 for t = 2; and £142.80 for t = 3. Due to the variability expected in
compliance with the interventions in general practice, and to the potential
need of additional counselling and pharmacological treatment for some patients, it is
reasonable to describe the uncertainty associated with the costs with a probability
distribution.
For simplicity, a triangular distribution is associated with all treatment costs
(excluding the reference "No intervention" comparator), with limits defined by the
average intervention cost ±30%. The triangular distribution is a triangle-shaped
density, equal to zero outside the specified lower and upper bounds; it increases
linearly from the lower bound to its mode, and decreases linearly from the mode to
the upper limit. A graphical representation is given in Fig. 2.3. A real-world
analysis could be based on more appropriate assumptions for the cost distributions.
The distributions of the costs need to be simulated to be input in the cost-
effectiveness model. The reference comparator t = 0 is assumed not to have an
associated cost, i.e. its cost is always null. In formal terms, a degenerate proba-
bility distribution which assumes the value zero with probability equal to one is
Fig. 2.3 The distribution represents the uncertainty associated with the costs for the self-help
intervention. The curve is shaped as a triangle, hence its name. In this case the mode is equidistant
from the lower and upper bounds, thus corresponding to the mean (and median) of the distribution
assigned to this parameter. The costs for the other interventions are simulated from
the intervention-specific triangular distributions described above. Functions to sample
from a triangular distribution are not included in the default libraries of R, thus
the triangle package needs to be installed to use the following code. The package
is available on CRAN, and can be installed as usual from a GUI or by inputting
the following command:
> install.packages("triangle")
The code to obtain the simulated values from the probability distributions of the
costs, stored in the cost matrix, is presented below. Since we populated the matrix
with zeroes when creating the object, the costs for t = 0 are automatically assigned.
The function rtriangle accepts as arguments the number of simulations needed, the
lower bound of the distribution a and the upper bound b. If not specified, the mode
of the distribution c is set by default to the average of the two extremes; since
we are using symmetric distributions, for which the mode corresponds to the mean,
there is no need to specify this parameter.
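A sketch of code along the lines described (the exact listing may differ from the original; the average costs are those reported above):

```r
# Simulate the treatment costs from symmetric triangular distributions with
# limits at the average intervention cost +/- 30% (sketch, not necessarily
# the original code)
library(triangle)
n.sims <- 2000
unit.cost <- c(0, 45.50, 95.50, 142.80)    # average costs for t = 0,...,3
cost <- matrix(0, nrow = n.sims, ncol = 4) # t = 0 remains at zero cost
for (t in 2:4) {
  cost[, t] <- rtriangle(n.sims,
                         a = 0.7 * unit.cost[t],  # lower bound: mean - 30%
                         b = 1.3 * unit.cost[t])  # upper bound: mean + 30%
}
```

The mode defaults to (a + b)/2, so it does not need to be specified for these symmetric distributions.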
Table 2.3 Life expectancy increments gained by smoking cessation, by gender and age at quitting.
Source: [16]
Life years gained relative to continuing smokers
Age at quitting Men Women
35 8.5 7.7
45 7.1 7.2
55 4.8 5.6
65 4.6 5.1
Table 2.4 Proportion of smokers per age group. The data on smoking statistics have been published
by the charity Action on Smoking and Health in October 2013, reporting the prevalence of cigarette
smoking in the UK. Source: [17]
Age group Proportion of smokers (%)
16–19 15
20–24 29
25–34 27
35–49 23
50–59 21
60+ 13
3 At the address http://www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-270247.
Table 2.5 Data inputs for the simulation of life years gained by smoking cessation. The dataset is
contained in the file smoking_cessation_simulation.csv
Age     Population   Proportion    Smokers    Proportion      Male life   Female life
                     of smokers               of age group    years       years
15–19 3,997,000 0.15 599,550 0.08 8.50 7.70
20–24 4,297,000 0.29 1,246,130 0.09 8.50 7.70
25–29 4,307,000 0.27 1,162,890 0.09 8.50 7.70
30–34 4,126,000 0.27 1,114,020 0.08 8.50 7.70
35–39 4,194,000 0.23 964,620 0.09 8.50 7.70
40–44 4,626,000 0.23 1,063,980 0.09 7.10 7.20
45–49 4,643,000 0.23 1,067,890 0.09 7.10 7.20
50–54 4,095,000 0.21 859,950 0.08 4.80 5.60
55–59 3,614,000 0.21 758,940 0.07 4.80 5.60
60–64 3,807,000 0.13 494,910 0.08 4.60 5.10
65–69 3,017,000 0.13 392,210 0.06 4.60 5.10
70–74 2,463,000 0.13 320,190 0.05 4.60 5.10
75–79 2,006,000 0.13 260,780 0.04 4.60 5.10
Gender was simulated from a Binomial model based on the split reported in the ASH
smoking statistics, and the life years reported in [16] were assigned accordingly.
To obtain the 2,000 simulations from the posterior distribution of the average
life years gained by quitters, the code below has been used. Notice that, in order to
use the code, the file smoking_cessation_simulation.csv needs to be available
in the same directory from which R is run, or the correct path to the file needs
to be specified. Each of the 1,000 individuals in the cohorts is associated with
a simulated age, drawn from a multinomial distribution with a vector of
probabilities equal to the observed frequency for each age group. The gained life
years are calculated for each group based on the gender split. The results are then
averaged over the sample, to obtain a vector of 2,000 elements. To repeat
the process four times, obtaining 2,000 simulations for each treatment, 8,000 samples
from the multinomial distribution are taken. These are successively arranged in a
matrix with 2,000 rows and 4 columns.
> data = read.csv(file = "smoking_cessation_simulation.csv")
> life.years = with(data, rmultinom(2000 * 4, 1000, pr.age) *
    (.52 * Male.ly + .48 * Female.ly))
> life.years = matrix(apply(life.years, 2, sum) / 1000,
    nrow = 2000, ncol = 4)
At this point it is possible to obtain the life years gained for each intervention. It is
only necessary to multiply the probability of smoking cessation π for each treatment
by the average number of life years gained by quitting. This can be obtained by a
multiplication of the two quantities:
> e = pi * life.years
Again, this process is completed by running BCEA and performing the Decision
and Uncertainty Analysis (as described in detail in Chaps. 3 and 4).
3.1 Introduction
The function bcea returns as output an object of class bcea. The generic functions
are represented by the orange boxes. Finally, the red boxes identify the functions
that are specific to BCEA. In the rest of the book, we present each of these elements
and explain how they can be used in the process of statistical analysis of a health
economic evaluation problem.
BCEA accepts, as inputs, the outcomes of a health economic evaluation compar-
ing different interventions or strategies, ideally but not necessarily produced using
MCMC (Markov Chain Monte Carlo) methods. It is not a tool to perform the evaluation
itself, but rather one to produce readable and reproducible outputs of the evaluation. It
also provides many useful, technically advanced measures and graphical summaries
to aid researchers in interpreting their results.
In general, BCEA requires multiple simulations from an economic model that com-
pares at least two different interventions based on their cost and effectiveness measures
to produce this standardised output. The cost measure quantifies the overall costs
associated with the interventions for every simulation. The effectiveness, or efficacy,
measure can be given in any form, be it a "hard" outcome (e.g. number of avoided
cases) or a "soft" one (e.g. QALYs, Quality-Adjusted Life Years).
Thus the minimum input which must be given to BCEA is composed of two n_sim ×
n_int matrices, where n_sim is the number of simulations used to perform the analysis
(at least 2) and n_int is the number of interventions being compared (again, at least 2
are required). These two matrices contain all the basic information needed by BCEA
to perform a health economic comparison of the alternative interventions.
We assume, in general, that the statistical model underlying the economic analysis
is performed in a fully Bayesian framework. This implies that the simulations for
the economic multivariate outcome (e, c) are in fact from the relevant posterior
distributions. We discuss in Chap. 5 how BCEA can be used alongside a non-Bayesian
model.
To illustrate the capabilities of BCEA, the two examples introduced in Sects. 2.3
and 2.4 are developed as full health economic evaluations throughout the book.
Both examples are included in the BCEA package, and therefore all the results
throughout the book can be replicated using these datasets. Each of the following
sections details a different function in BCEA, demonstrating its functionality for both
single- and multi-comparison examples.
If a health economic model has been run in a similar manner to the two examples
discussed in Chap. 2 then, in general, the modeller will have access to two matrices,
which we denote by e and c. These matrices contain the simulated values of the
effectiveness and costs associated with the interventions t = 0, . . . , T , where T + 1
is the total number of treatments, equal to 2 for the Vaccine example and 4 for the
Smoking cessation example. The generic element of position [s, t] in each matrix is
the measurement of the outcome observed in the s-th simulation, with s = 1, . . . , S,
where S is the number of samples, under intervention t, with t = 0, . . . , T . For the
Vaccine example (Sect. 2.3), S = 1,000 and for the Smoking example (Sect. 2.4)
S = 2,000.
To begin any analysis using the package BCEA, the bcea function must be called.
This function processes the matrices of costs and effectiveness so that the model
output is in the correct form for the other functions in the package. Additionally,
the resulting object can be used to produce basic summaries and plots. Therefore, when
this function is called, its result should be assigned to a named object, creating an
object of class bcea, whose elements are then used as inputs to the other functions
in the BCEA package. A bcea object contains the following elements:
• n.sim: the number of model simulations, i.e. the number of rows of the e and c
matrices given as arguments to the bcea function;
• n.comparators: the total number of interventions included in the model, i.e. 4
for the smoking cessation example;
64 3 BCEA—A R Package for Bayesian Cost-Effectiveness Analysis
• vi: a matrix with n.sim rows and columns equal to the length of k including the
value of information for each simulation from the Bayesian model and for each
value of the grid approximation of the willingness to pay (cfr. Sect. 4.2.1);
• Ustar: a matrix with n.sim rows and columns equal to the length of k, indi-
cating the maximum simulation-specific utilities for every value included in the
willingness to pay grid;
• ol: the opportunity loss value for each simulation and willingness to pay value,
reported as a matrix with n.sim rows and a number of columns equal to the length
of the vector k (cfr. Sect. 4.2.1);
• evi: a vector with the same number of elements as k, with the expected value of
(perfect) information for every considered willingness to pay threshold as values
(cfr. Sect. 4.3);
• interventions: a vector of length n.comparators with the labels given to each
comparator;
• ref: the numeric index associated with the reference intervention;
• comp: the numeric index(es) associated with the non-reference intervention(s);
• step: the step used to form the grid approximation of the willingness to pay,
such that a 501-element grid of values is produced with 0 and Kmax as the extreme
values. Ignored if wtp is passed as an argument to the bcea function;
• e: the matrix including the simulation-specific clinical benefits for each comparator
used to generate the object;
• c: the matrix including the simulation-specific costs for each comparator used to
generate the object.
These items in the bcea object are sub-settable as in lists, e.g. the command
object$n.sim will extract the element n.sim of the bcea object.
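The access pattern can be illustrated with a toy named list standing in for a real bcea object (the values below are made up for illustration):

```r
# Toy stand-in for a bcea object: a named list accessed with `$`
object <- list(n.sim = 1000,
               n.comparators = 4,
               interventions = c("No intervention", "Self-help",
                                 "Individual counselling",
                                 "Group counselling"))
object$n.sim             # 1000
object$interventions[4]  # "Group counselling"
```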
To use the Vaccine dataset included in the package BCEA, it is sufficient to load
BCEA using the function library, and to attach the dataset with the command
data(Vaccine). Doing so will import into the current workspace all the variables
required to run the analysis.1
> library(BCEA)
> data(Vaccine)
> ls()
 [1] "N"             "N.outcomes"    "N.resources"   "QALYs.adv"
 [5] "QALYs.death"   "QALYs.hosp"    "QALYs.inf"     "QALYs.pne"
 [9] "c"             "cost.GP"       "cost.hosp"     "cost.otc"
[13] "cost.time.off" "cost.time.vac" "cost.travel"   "cost.trt1"
[17] "cost.trt2"     "cost.vac"      "e"             "treats"
1 Processing the data as demonstrated in Sect. 2.3.2 will yield slightly different results than those
presented in this section as the parameters were produced in two different simulations. The following
analyses are based on the Vaccine dataset included in the BCEA package.
The definition of each object included in the Vaccine dataset is discussed in the
package documentation by executing the command ?Vaccine in the R console. At
this point we can begin to run the health economic analysis by using the bcea function
to format the model output. A suitable vector of labels for the two interventions can
be defined (this step is optional) and the function bcea is then called where the
matrices e and c are the effectiveness, in terms of QALYs, and the costs for the
Vaccine example.
> library(BCEA)
> treats <- c("Status quo", "Vaccination")
> m <- bcea(e, c, ref = 2, interventions = treats)
Note that plot is an S3 method for the bcea class of object and therefore help for
this function is accessed using ?plot.bcea.
As mentioned earlier, the input data for the population average measures of cost
and effectiveness are ideally obtained from a full Bayesian model, as in the example
just described. Nevertheless, BCEA can also accommodate data on (e, c) obtained
under a frequentist approach, e.g. using bootstrap. These may, for instance, be available
in a spreadsheet (we return to this point repeatedly in the rest of the book, e.g.
in Sects. 4.3.2, 5.2.4.1 and 5.2.5), say the file Bootstrap.csv, in which the first two
columns are the simulations for the measure of effects for the two treatments con-
sidered, while the data in columns three and four are the corresponding values for
the measure of cost. In such a situation, we could import these into R as follows.
3.2 Economic Analysis: The bcea Function 67
Fig. 3.3 The graph shows the main results of the analysis. It can be produced by the call to BCEA
by setting the option plot=TRUE or by calling the function plot with a valid BCEA object as its
argument. From the top-left corner, clockwise, it includes: the cost-effectiveness plane (Sect. 3.4),
the expected incremental benefit (Sect. 3.5), the cost-effectiveness acceptability curve (Sect. 4.2.2)
and the expected value of perfect information (Sect. 4.3.1)
> # Imports the spreadsheet (assuming the file is in the working directory -
> # if not, need to change the path!)
> inputs <- read.csv("Bootstrap.csv")
Since read.csv creates data.frame objects while bcea requires matrices, we need to
change the class of the simulated values, so that e and c are matrices, for example
using the following code.
> # Check the class of the objects
> class(inputs)
[1] "data.frame"
> class(e)
[1] "data.frame"
> class(c)
[1] "data.frame"
> # Re-create the objects e and c as matrices and check the class
> e <- as.matrix(inputs[, c(1:2)])
> c <- as.matrix(inputs[, c(3:4)])
> class(e)
[1] "matrix"
> class(c)
[1] "matrix"
To use the Smoking dataset (Sect. 2.4) included in the package BCEA, the dataset must
be loaded in a similar fashion to the Vaccine dataset, using data(Smoking). This
will import the Smoking dataset into the current workspace. The effectiveness
and cost matrices are called e and c in both examples; therefore, if the two examples
are run in the same workspace or R session, the objects e and c will be overwritten.
To run the economic analysis using BCEA, the “Group counselling” intervention
is chosen as the reference intervention. Since this intervention is associated with
the fourth column of the effectiveness and cost matrices, it is selected by specifying
ref=4. To read the outputs more easily, the intervention labels are passed to the
function as well.
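The call described here is not shown in this extract; a sketch of what it would look like (Kmax = 500 is an assumed choice of the maximum willingness to pay, given that effectiveness is measured in life years):

```r
# Sketch (assumed call, mirroring the Vaccine example): Smoking analysis with
# "Group counselling" (4th column of e and c) as the reference intervention
library(BCEA)
data(Smoking)
treats <- c("No intervention", "Self-help",
            "Individual counselling", "Group counselling")
m <- bcea(e, c, ref = 4, interventions = treats, Kmax = 500)  # Kmax assumed
```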
Additionally, notice that no restrictions are applied to the values passed as the wtp
argument. The vector needs to contain at least one element, and if any of them is
negative the values are re-scaled by incrementing all values by the absolute value
of the lowest element (i.e. passing the argument wtp=c(-5,5) will produce an
analysis over the values 0 and 10).
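The rescaling rule just described can be sketched in plain R (this is an illustration of the behaviour, not BCEA's actual internal code):

```r
# Shift the willingness-to-pay grid so that no element is negative,
# as described in the text (illustrative sketch only)
rescale_wtp <- function(wtp) {
  if (any(wtp < 0)) wtp <- wtp + abs(min(wtp))
  wtp
}
rescale_wtp(c(-5, 5))  # 0 10, matching the example in the text
```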
The standard BCEA plot, obtained using plot(m) and demonstrated for the Vaccine
example, can be cluttered when multiple comparators are included in the analysis. For
this reason, all plots in the BCEA package can be produced either using base graphics
(the default) or ggplot2, an advanced plotting system based on the grammar of
graphics [4], which is preferred for multiple comparators as ggplot2 allows for finer
control over the graphs.
To select the ggplot version of the plot, the option graph="ggplot2" must be
added to the plot or plot.bcea function call. The string is partial-matched to either
ggplot2 or base, hence selecting graph="g" or "b" is sufficient to indicate which
graphical engine should be used. The two versions of the graphs share the same
function calls, and are selected by the use of the graph option only. The graphical
results have been kept as consistent as possible between the two versions.
Adding the option graph="ggplot2" to the plot function will produce a ggplot
object which, if not assigned to a named object, will be printed by default. It is possible
to store the plot in an object, modify it and produce the graph using the functions
70 3 BCEA—A R Package for Bayesian Cost-Effectiveness Analysis
print or plot, via the S3 methods for the class ggplot (i.e. print.ggplot and
plot.ggplot).
In general, the ggplot versions of the plots extend the options available
with graph="base", and the modularity of ggplot objects also allows for post-hoc
modifications. In Fig. 3.4, the position of the legends is set to the bottom of
the graphs, outside the plot area. The size of the text labels is reduced by setting
the argument size=rel(2). The argument ICER.size is set to 2 and passed to the
ceplane.plot function (cf. Sect. 3.4) to include the ICERs in the cost-effectiveness
plane. A call combining these options along the following lines produces Fig. 3.4:

> plot(m, wtp=250, graph="ggplot2", pos="bottom", size=rel(2), ICER.size=2)
[Figure 3.4: four-panel ggplot2 summary: cost-effectiveness plane (k = 250), EIB with break-even points k* = 177 and k* = 210, CEAC and EVPI.]
Fig. 3.4 The summary of the health economic analysis produced by the ggplot version of
plot.bcea. The different colours and line types indicate the three pairwise comparisons versus
the status quo (No intervention). The two willingness to pay values at which the
decision changes are represented in the expected incremental benefit (EIB) and expected value of
perfect information (EVPI) plots. An arbitrary willingness to pay, equal to £250 per life year saved,
has been chosen for the cost-effectiveness plane graph
The interpretation of these four graphs, and their manipulation in both base graphics
and ggplot2, will be dealt with in the following sections: the cost-effectiveness
plane in Sect. 3.4, the expected incremental benefit in Sect. 3.5, the
cost-effectiveness acceptability curve in Sect. 4.2.2 and the expected value of perfect
information in Sect. 4.3.1.
A summary table reporting the basic results of the health economic analysis can be
obtained from the BCEA object using the summary function. This is an S3 method for
objects of class bcea, similar to the plot function applied to produce the graphical
summary. It produces the following output for the Vaccine example:
> summary(m)

Optimal decision: choose Status quo for k < 20100 and Vaccination for k >= 20100

            Expected utility
Status quo           -36.054
Vaccination          -34.826

EVPI 2.4145
²This choice is due to the fact that the average threshold of cost-effectiveness commonly used by
NICE (National Institute for Health and Care Excellence) varies between £20 000 and £30 000 per
QALY gained.
If the willingness to pay value selected for the summary is not included in the grid
(which can be accessed by typing m$k), an error message will be produced. For example,
using the command summary(m,1234) will result in the following output:
> summary(m, 1234)
Error in summary.bcea(m, 1234) :
  The willingness to pay parameter is defined in the interval [0-50000], with increments of 100
Calls: summary -> summary.bcea
Execution halted
The summary table displays the results of the health economic analysis, including
the optimal decision over a range of willingness to pay thresholds, identified by the
maximisation of the expected utilities. In this case the break-even point, the threshold
value k where the decision changes, is 20 100 monetary units. The summary also
reports the values for the EIB (Sect. 3.5), CEAC (Sect. 4.2.2) and EVPI (Sect. 4.3.1)
for the selected willingness to pay threshold, along with the ICER.
In the Vaccine example, the ICER is below the threshold of 25 000 and thus the
vaccination policy is cost-effective in comparison to the status quo. A more in-depth
explanation of the probabilistic sensitivity analysis and the tools provided by BCEA
to interpret and report it (e.g. the CEAC and the EVPI) is deferred to Chap. 4.
Running the analysis for a different willingness to pay, for example k = 10 000,
may result in a different optimal decision, depending on whether the ICER is above
or below the selected willingness to pay. If this new threshold were selected for the
Vaccine case, the ICER would now be above it and thus the decision taken in this
scenario would be associated with less uncertainty. In fact, the ICER is estimated at
20 098 monetary units, twice the value of the willingness to pay threshold selected
in this case (but notice that the summary reports the grid estimate of 20 100).
> summary(m, wtp=10000)

Optimal decision: choose Status quo for k < 20100 and Vaccination for k >= 20100

            Expected utility
Status quo           -20.215
Vaccination          -22.745

EVPI 0.6944
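The arithmetic behind these summaries is easy to reproduce: with the monetary net benefit, the ICER is the ratio of the means of the cost and effectiveness differentials, and for a two-arm comparison the EIB changes sign exactly at k = ICER. A base-R sketch with hypothetical draws (not the Vaccine simulations):

```r
# Hypothetical posterior draws of the differentials (illustrative values)
delta_e <- c(0.75, 1.25, 1.00, 1.00)   # effectiveness differentials
delta_c <- c(18, 22, 19, 21)           # cost differentials

ICER <- mean(delta_c) / mean(delta_e)  # ratio of the means: 20

# With net benefit u = k*e - c, EIB(k) = k*mean(delta_e) - mean(delta_c):
# negative below the ICER, zero at it, positive above it.
EIB <- function(k) k * mean(delta_e) - mean(delta_c)
c(EIB(10), EIB(30))   # -10 10
```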
For the Smoking example, the default summary is given for Kmax. However, as
above, the willingness to pay value can be changed from this default, using the wtp
argument:
                       Expected utility
No intervention                  103.86
Self-help                        123.00
Individual counselling           126.27
Group counselling                141.73

EVPI 42.984
The summary table shows that, based on the expected incremental benefit, the
optimal decision changes twice over the chosen grid of willingness to pay values.
Below a willingness to pay threshold of £177 per life year gained, the optimal
decision is No treatment. For values of the willingness to pay between £177 and
£210 per life year gained, the most cost-effective decision is the Self-help
intervention. For thresholds greater than £210, the optimal strategy is Group
counselling. Notice that Individual counselling is dominated by the other comparators
at all the willingness to pay values considered. The break-even points are relatively low
in value, indicating that the introduction of smoking cessation interventions would
be cost-effective compared to the null option (No treatment).
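The logic of the multi-comparator summary (evaluate the expected utilities over the grid of k and report the values at which the optimal arm changes) can be sketched in base R; the arm names and numbers below are hypothetical, not the Smoking results:

```r
# Toy linear expected utilities under the net monetary benefit:
# U_t(k) = k * eff_t - cost_t, with hypothetical per-arm means.
eff  <- c(NoTreatment = 0.00, SelfHelp = 0.25, GroupCouns = 0.50)
cost <- c(NoTreatment = 0,    SelfHelp = 30,   GroupCouns = 100)

k <- seq(0, 500, by = 1)                   # grid of willingness to pay values
U <- sweep(outer(k, eff), 2, cost)         # U[i, t] = k[i] * eff[t] - cost[t]
best <- names(eff)[max.col(U, ties.method = "first")]

# Break-even points: values of k at which the optimal decision changes
breakeven <- k[which(best[-1] != best[-length(best)]) + 1]
breakeven   # 121 281
```

With these toy inputs the optimal arm moves from No treatment to Self-help and then to Group counselling as k grows, just as in the Smoking summary.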
Due to the multiple treatment options, this summary table is more complex than
the one for the Vaccine example. The ICER is given for the three comparison
treatments, each compared with Group counselling. The EIB and CEAC are also given
for these pairwise comparisons. Finally, note that there is only one value given for
the EVPI (see Sect. 4.3.1). This is because the EVPI relates to the uncertainty underlying
the whole model rather than to each pairwise comparison individually; we return to this
idea in Sect. 4.3.1.2.
The summary table produced by the summary function already provides relevant
information about the results of the economic analysis; however, a graphical
representation of the results can reveal behaviours or particular characteristics that
may be missed when analysing the summary indices alone. The first, and probably most
important, representation of the data is the cost-effectiveness plane (which incidentally
is recommended by most health technology assessment agencies as a necessary
tool for economic evaluation). The cost-effectiveness plane, discussed in Sect. 1.3
and shown in Fig. 1.4, is a visual description of the incremental costs and effects of an
option compared to some standard intervention [1].
In Fig. 3.5, the cost-effectiveness plane for the Vaccine example, the dots represent
the differential outcomes (effectiveness and costs) observed in each simulation.
If a dot falls in the sustainability area shaded in grey, the incremental benefit for
the comparison is positive for the given simulation and the chosen willingness to pay
threshold. The red dot represents the ICER on the cost-effectiveness plane and is
obtained from the averages of the two marginal distributions (for Δe and Δc). The
numerical value of the ICER is printed in the top-right corner by default, while the
willingness to pay threshold (k) is displayed in the bottom-left corner. The threshold
gives the gradient of the line partitioning the plane and defining the
cost-effectiveness acceptability (sustainability) region.
Several options are available for this function. The willingness to pay can be
adjusted by setting the option wtp to a different value, e.g. wtp=10000, which will
change the slope of the line, varying the value assigned to k in the equation Δc = kΔe
defining the acceptability region. The willingness to pay is defined in the interval
[0, ∞), and any value in this range, 0 included, can be assigned to this argument;
assigning a negative value to wtp will generate an error. Note that, in this case, the
selected value for wtp does not have to be in the grid defined by the element m$k.
The position of the ICER label can be adjusted using the pos argument, placing
the legend in any chosen corner of the graph. This can be done by setting the
parameter pos to one of the values topright (the default), topleft, bottomright or
bottomleft. It is also possible to assign a two-dimensional numerical vector to this
argument. A value equal to 0 in the first element positions the label on the bottom,
[Figure 3.5: cost-effectiveness plane for the Vaccine example, with ICER = 20097.59 and k = 25000; effectiveness differential on the x-axis, cost differential on the y-axis.]
Fig. 3.5 The cost-effectiveness plane for the Vaccine example. The red dot indicates the average
of the distribution of the outcomes, i.e. the ICER. The grey-shaded surface represents the
sustainability area for the fixed willingness to pay threshold, in this case set at
25,000 monetary units (the default)
while a 0 as the second element of the vector indicates the left side of the graph. If
the first and/or second elements are not equal to zero, the label is positioned on the
top and/or on the right, respectively.
In cases with more than two interventions, as in the Smoking example, the
argument comparison can be used to select which pairwise comparisons to visualise
on the plot. For example, the following code will plot the cost-effectiveness plane
for Group counselling (the reference treatment) against Self-help (treatment 2):
> ceplane.plot(m, comparison=2, wtp=250)
It is also possible to add more than one treatment comparison by setting comparison
as a vector, e.g.
> ceplane.plot(m, comparison=c(1,3), wtp=250)
The values in this vector must be valid indexes, i.e. they need to be positive integers
between 1 and the number of non-reference comparators, 3 for the Smoking
example. If this number is not known, it can be retrieved from the bcea object, which
stores it in the element n.comparisons (i.e. m$n.comparisons).
Note that if more than one pairwise comparison is plotted, then the ICER and
sustainability area cannot be plotted using the default graphics package.
To add these elements to the graphic, the ggplot2 graphics package must be used.
This package also allows the user more control over the plot, which is useful
in situations where the cost-effectiveness plane must conform to certain publication
standards.
In multi-decision problems, the acceptability area is included by default in the
ggplot cost-effectiveness plane. The ICERs, on the other hand, are not included by
default but can be added using the argument ICER.size, which sets the size of the
plotted ICER points in millimetres. The following code produces Fig. 3.6, which
includes the 3 pairwise ICERs, displayed with a size of 2 mm, and a non-default
acceptability area with a willingness to pay of 250 monetary units.
> ceplane.plot(m, wtp=250, graph="ggplot2", ICER.size=2)
The results from Fig. 3.6 indicate that the three pairwise comparisons, represented
by the three different clouds of points, are similar in terms of variability. The
distance from one to the next seems similar in terms of increments of the differential
costs and effectiveness. Clearly, the most effective and costly intervention on average
is Group counselling, indicated by all three ICERs residing in the top-right quadrant.
This is followed by Individual counselling and Self-help. The No intervention option
is obviously the least expensive strategy, with no costs to be borne, but it also has the
smallest probability of success. Note that the costs are directly proportional
to the efficacy for all comparators, making Self-help the least expensive active
intervention but also the least effective among the three active interventions.
Individual counselling lies between Self-help and Group counselling for both outcomes.
All options presented in Sect. 3.4.1 are compatible with the ggplot cost-effectiveness
plane, but the opposite is not always true. In addition to the base plot manipulations,
it is possible to use the size option to set the size (in millimetres) of the
willingness to pay label. A null size (i.e. size=0) can be set to prevent
the label from being displayed. Depending on the distribution of the cloud of points
and the chosen willingness to pay threshold, the default positioning algorithm for
the willingness to pay label can give a sub-optimal placement, in particular when the
acceptability region limit crosses the left margin of the plot. An alternative positioning
can be selected by setting the argument label.pos=FALSE in the ceplane.plot
function call. This option will place the label at the bottom of the plot area.
[Figure 3.6: ggplot2 cost-effectiveness plane for the Smoking example (k = 250), showing the clouds of points and ICERs for Group counselling vs No treatment, vs Self-help and vs Individual counselling.]
Fig. 3.6 The cost-effectiveness plane for the Smoking example produced by the ceplane.plot
function by setting the argument graph to "ggplot2". The theme applied in the graph is a modified
version of theme_bw to keep consistency between this version and the one using base graphics. The
output of the function is a ggplot object
For the Vaccine example, the ICER value is printed on the cost-effectiveness
plane. For a model with only two interventions, the ICER is also shown in the ggplot
version of the cost-effectiveness plane. The ICER legend positioning works slightly
differently in the ggplot version and is in general less constrained than in the base
graphics plot. It is possible to place the legend outside the plot limits by assigning the
values "bottom", "top", "right" or "left" (with quotes) to the pos argument.
Alternatively, it can be drawn inside the plot limits using a two-dimensional vector
indicating the relative positions, ranging from 0 to 1 on the x- and y-axes respectively:
for example, the option pos=c(0,1) will put the label in the top-left corner of the
plot area, and pos=c(0.5,0.5) will place it at the centre of the plot. The default
value is FALSE, indicating that the label will appear in the top-right corner of
the plot area, in a slightly more optimised position than that given by pos=c(1,1).
Setting the option to TRUE will place the label at the bottom of the plot.
In the case of multiple comparisons, the legend detailing which intervention is
represented by which cloud of points (seen on the same plot for the Smoking example)
can be manipulated in a similar fashion to the ICER label for the single-comparison
model. However, for multiple decisions the two commands pos="bottom" and
pos=TRUE differ: the former aligns the elements of the legend horizontally, while
the latter stacks them vertically.
The layers of the plot can be removed, added or modified by accessing the layers
element of the ggplot object. For a pairwise comparison, the object is composed
of 7 layers: the line and the area defining the acceptability region, two layers
containing the two axes, the points representing the simulations, the point representing
the ICER and finally the willingness to pay label. These elements can be accessed with
the command:
> ce.plot$layers
[[1]]
mapping: x = x, y = y
geom_line: colour = black
stat_identity:
position_identity: (width = NULL, height = NULL)

[[2]]
mapping: x = x, y = y
geom_polygon: fill = light grey, alpha = 0.3
stat_identity:
position_identity: (width = NULL, height = NULL)

[[3]]
mapping: yintercept = 0
geom_hline: colour = grey
stat_hline: yintercept = NULL
position_identity: (width = NULL, height = NULL)

[[4]]
mapping: xintercept = 0
geom_vline: colour = grey
stat_vline: xintercept = NULL
position_identity: (width = NULL, height = NULL)

[[5]]
geom_point: na.rm = FALSE, size = 1
stat_identity:
position_identity: (width = NULL, height = NULL)

[[6]]
mapping: x = lambda.e, y = lambda.c
geom_point: na.rm = FALSE, colour = red, size = 2
stat_identity:
position_identity: (width = NULL, height = NULL)

[[7]]
mapping: x = x, y = y
geom_text: label = k = 250, hjust = 0.15, size = 3.5
stat_identity:
position_identity: (width = NULL, height = NULL)
[Figure 3.7: the modified cost-effectiveness plane for the Smoking example, with the three Group counselling comparisons and no acceptability region or willingness to pay label.]
Fig. 3.7 A version of the cost-effectiveness plane modified by changing the ggplot object
properties. The cost-effectiveness acceptability region and the willingness to pay label have been removed,
and a panel grid has been included. The modularity of this class of objects allows for a high degree
of personalisation of the final appearance
Thus it is possible to modify the plot post hoc, by removing, adding or modifying
layer elements. For example, to produce a plot of the plane excluding the cost-
effectiveness acceptability area it is sufficient to execute the following code, which
will produce the plot in Fig. 3.7.
> # remove layers 1, 2 and 7 from the ggplot object
> ce.plot$layers <- ce.plot$layers[-c(1,2,7)]
> # print the plot
> plot(ce.plot)
The function ceplane.plot also contains some advanced options which allow the
user to customise the appearance of the resulting graph even further. In particular,
it is possible to pass the optional input xlab="string", where "string" is a text
string containing the label that the user wants to put on the x-axis, instead of the
default "Effectiveness differential". Similarly, it is possible to pass an argument
ylab="string", which replaces the default string "Cost differential" for
the y-axis. Finally, the option title="string" instructs BCEA to print a customised
title for the graph. An example of advanced use for this function is the following:
> ceplane.plot(m, xlab="Difference in QALYs",
               ylab="Difference in costs (Pounds)",
               title="C/E plane")
In addition, it is possible to modify the x- and y-axis limits using the options
xl=c(lower,upper) and yl=c(lower,upper), where lower and upper are suitable
values (these can of course differ between the two axes).
The expected incremental benefit (EIB) is a summary measure useful to assess the
potential changes in the decision under different scenarios (see Sect. 1.3). When
considering a pairwise comparison (e.g. in the simple case of a reference intervention
t = 1 and a comparator, such as the status quo, t = 0), it is defined as the difference
between the expected utilities of the two alternatives:

EIB = U¹ − U⁰. (3.1)

In (3.1), U¹ and U⁰ are synthetic measures of the benefits which the intervention t is
expected to produce. Since the aim of the cost-effectiveness analysis is to maximise
the benefits, the treatment with the highest expected utility will be selected as the
“best” treatment option. Thus, if EIB > 0, then t = 1 is more cost-effective than
t = 0.
Of course, the expected utility is defined depending on the utility function selected
by the decision-maker; when the common monetary net benefit is used, the EIB can
be expressed as a function of the effectiveness and cost differentials (Δe , Δc ), as
in (1.6).
In practical terms, BCEA estimates the EIB using the S = n.sim simulated values
passed as inputs for the relevant quantities (e, c) as

EIB = (1/S) Σ_{s=1}^{S} [u(e_s, c_s; 1) − u(e_s, c_s; 0)],

where (e_s, c_s) are the s-th simulated values for the population average measures of
effectiveness and costs.
Assuming that the monetary net benefit is used as the utility function, this effectively
means that BCEA computes a full distribution of incremental benefits
IB(θ ) = kΔe − Δc
—recall that (Δe , Δc ) are random variables, whose variations are determined by the
posterior distribution of θ . For each simulation s = 1, . . . , S, BCEA computes the
resulting value for IB(θ ) and then the EIB can be estimated as the average of this
distribution

EIB = (1/S) Σ_{s=1}^{S} IB(θ_s),

where θ_s is the realised configuration of the parameters θ for the s-th simulation.
This procedure clarifies the existence of the two layers of uncertainty in the analysis,
which is also evident in the cost-effectiveness plane: uncertainty in the parameters
is characterised by considering the full (posterior) distribution of the relevant
quantities (Δe, Δc). This already averages out the individual variability, but can be
further summarised by taking the expectation over the distribution of the parameters,
to provide summaries such as the ICER and the EIB.
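Assuming the monetary net benefit, this two-stage computation can be reproduced in a few lines of base R; the draws below are hypothetical, purely for illustration:

```r
# Hypothetical posterior draws of the differentials (not real BCEA output)
delta_e <- c(0.0009, 0.0011, 0.0010, 0.0010)
delta_c <- c(18, 22, 19, 21)
k <- 20000                       # willingness to pay

IB  <- k * delta_e - delta_c     # one incremental benefit per simulation
EIB <- mean(IB)                  # then average over the simulations
# By linearity, EIB = k * mean(delta_e) - mean(delta_c) up to rounding.
```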
The value of the IB is accessible from the BCEA object via the function sim.table,
detailed in Sect. 4.2.1. A graphical summary of the distribution of the incremental
benefits for a pairwise comparison can be produced using the BCEA command
ib.plot, which produces the graph in Fig. 3.8.
> ib.plot(m)
Fig. 3.9 Expected incremental benefit as a function of the willingness to pay for the Vaccine
example. The break-even value corresponds to k∗ = 20100, indicating that above that threshold
the alternative treatment is more cost-effective than the status quo, since for k > k∗ it follows that
U¹ − U⁰ > 0. The use of the net benefit as the utility function makes the EIB linear with
respect to the willingness to pay k
[Figure 3.10: EIB as a function of the willingness to pay for the Smoking example, with break-even points k* = 159 and k* = 225.]
Fig. 3.10 Expected incremental benefit as a function of the willingness to pay for the Smoking
example. There are two break-even points in this example corresponding to k ∗ = 159 and k ∗ = 225.
Note that the EIBs are with respect to Group counselling; this means that while the second break-even
point coincides with the Group counselling versus Self-help line crossing 0, the first break-even
point is given by the point where the No treatment and Self-help lines intersect, as these are the most
cost-effective treatments for low willingness to pay values
The function eib.plot plots all the available comparisons by default. Optionally,
a specific subset of comparisons can be selected by assigning to the argument
comparison a numeric vector indexing the comparisons to be included. For example,
if the BCEA object contains multiple interventions, the option comparison=2 will
produce the EIB plot for the second non-reference comparator versus the reference
one, sorted by the order of appearance in the matrices e and c given to the BCEA
object. The break-even points (if any) can be excluded from the plot by setting the
argument size=NA. However, controlling the label size via the size argument is
possible only in the ggplot version of the plot (see below).
The pos option is used only when multiple comparisons are available. In this
case a legend allowing the user to identify the different comparisons is added to the
plot, and can be positioned as in the ceplane.plot function. The values "top",
"bottom", "right" or "left" or a combination of two of them (e.g. "topright")
will position the label in the respective position inside the plot area. For example the
code,
> eib.plot(m, pos="topleft")
where m is the BCEA object for the Smoking example, produces Fig. 3.10. The
parameter pos can also be specified in the form of a two-element numeric vector. The
value 0 in the first position indicates the left of the plot, while 0 in the second position
will place the label at the bottom of the plot. A numeric value different from 0 (e.g.
equal to 1) will refer to the right or the top, respectively if in the first or second
position of the vector.
Notice that in Fig. 3.9 two additional lines give the credible intervals for the
EIB, whereas for the multiple comparisons the credible intervals are not given. The
argument plot.cri controls these credible intervals and is set to NULL by default.
This means that if a single comparison is available or selected, the eib.plot function
also draws the 95% credible interval of the distribution of the incremental benefit. The
intervals are not drawn by default if multiple comparisons are selected; however, they
can be included in the graph by setting the parameter plot.cri=TRUE. In addition,
the interval level can be set using the alpha argument, whose default value of 0.05
implies that the 95% credible interval is drawn.
If plot.cri=TRUE, the function will calculate the credible intervals at level 1 −
alpha. By default, the credible intervals are estimated by calculating the 2.5-th and
97.5-th percentiles of the IB distribution. Alternatively, they can be estimated using
a normal approximation by setting cri.quantile=FALSE. This alternative method
assumes normality of the distribution of the incremental benefit and thus, for each
value k of the grid approximation of the willingness to pay, the credible intervals are
estimated as

EIB ± z_{α/2} √Var(IB(θ)),

where z_{α/2} denotes the relevant quantile of the standard normal distribution.
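The two estimation methods can be mimicked directly on a vector of simulated incremental benefits; this is a base-R sketch with artificial draws, not the package's code:

```r
set.seed(42)
IB <- rnorm(10000, mean = 5, sd = 10)   # artificial draws of IB(theta)
alpha <- 0.05

# Default: empirical 2.5th and 97.5th percentiles of the IB distribution
ci_quantile <- quantile(IB, probs = c(alpha/2, 1 - alpha/2))

# Alternative (cri.quantile=FALSE): normal approximation around the mean
ci_normal <- mean(IB) + c(-1, 1) * qnorm(1 - alpha/2) * sd(IB)
```

For approximately normal IB distributions the two intervals are very close; for skewed distributions the quantile-based version is the safer default.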
Contour plots extend the amount of graphical information contained in the
representation of the simulated outcomes on the cost-effectiveness plane. The
BCEA package implements two different tools to represent the joint distribution of
the outcomes: the functions contour⁴ and contour2. The contour function is an
alternative representation of the cost-effectiveness plane given by ceplane.plot.
While the latter focuses on the distributional average (i.e. the ICER), contour gives
a graphical overview of the dispersion of the cloud of points, displaying information
about the uncertainty associated with the outcomes.

⁴The contour.bcea function is an S3 method for BCEA objects; thus it can be invoked by calling
the function contour with a valid bcea object as input.
[Figure 3.11: contour plot of the joint distribution of (Δe, Δc) for the Vaccine example.]
Fig. 3.11 The contour plot for the Vaccine example of the bivariate distribution of the differential
effectiveness and costs produced by the function contour.bcea. The contour lines give a
representation of the variability of the distribution and of the relationship between the two outcomes. The
four labels at the corners of the plot indicate the proportion of simulations falling in each quadrant
of the Cartesian plane
The contour function can be invoked with the following code, which will produce
the plot in Fig. 3.11:
> contour(m)
The function plots the outcomes of the simulation on the cost-effectiveness plane,
including a contour indicating the different density levels of the joint distribution
of the differentials of costs and effectiveness. The contour lines divide the observed
bivariate distribution of the outcomes (Δe, Δc) into a prespecified number of areas. Each
contour line is a curve along which the estimated probability density function
has a constant value. For example, if the chosen number of contour lines is four,
the distribution will be divided into five areas, each containing 20% of all simulated
outcomes. A larger number of simulations will give a more precise estimate of
the variance and therefore of the contours of the distribution. By default, the function
partitions the Cartesian plane into 5 regions, each associated with an equal estimated
probability density with respect to the bivariate distribution of the differential
outcomes.
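The proportions shown at the corners of the contour plot (see the caption of Fig. 3.11) are simple empirical frequencies; with a set of simulated differentials (hypothetical values below) they can be computed as:

```r
# Hypothetical draws of the differentials (illustrative values only)
delta_e <- c(0.5, 1.0, -0.2, 0.8)
delta_c <- c(10, -5, 4, 12)

# Proportion of simulations in each quadrant of the cost-effectiveness plane
p_NE <- mean(delta_e > 0 & delta_c > 0)    # more effective, more costly
p_SE <- mean(delta_e > 0 & delta_c <= 0)   # more effective, cheaper (dominance)
p_NW <- mean(delta_e <= 0 & delta_c > 0)   # less effective, more costly
p_SW <- mean(delta_e <= 0 & delta_c <= 0)  # less effective, cheaper
c(p_NE, p_SE, p_NW, p_SW)   # 0.50 0.25 0.25 0.00
```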
[Figure 3.12: contour plot with acceptability region (k = 25000) for the Vaccine example.]
Fig. 3.12 The contour plot of the bivariate distribution of the differential effectiveness and
costs for the Vaccine case study produced by the function contour2. This function differs from
contour.bcea since it includes decisional elements such as the cost-effectiveness acceptability
region. In this example it can be seen that the mean of the distribution is not exactly centred, since
the mean is driven by the simulations resulting in a high effectiveness differential between the two
strategies. This results in a difference between the mean and median of the distribution
The contour2 function includes the contour of the bivariate distribution as well
as decision-making elements (i.e. the ICER and the cost-effectiveness acceptability
region). The plot in Fig. 3.12 is produced by the following code:
> contour2(m)
[Figure 3.13: ggplot2 contour2 plot for the Smoking example (k = 250), with contours for the three Group counselling comparisons.]
Fig. 3.13 The representation of the cost-effectiveness plane for the smoking cessation example.
The contours highlight the similarity in the uncertainty between the three bivariate distributions of
the differential outcomes and costs. The differential distributions are all contained in the positive
part of the y axis on the plane, meaning that the “Group counselling” intervention is more expensive
than all three others with low uncertainty. The group counselling intervention is cost-effective on
average with respect to all three other comparators for a willingness to pay of £250 per life year
saved
For the Smoking example, the ggplot2 version of contour2 can display all three
pairwise comparisons at once, together with the bivariate contours of the differential
distributions and the cost-effectiveness acceptability region. The plot in Fig. 3.13
can be produced by the following command:
> contour2(m, graph="ggplot2", comparison=NULL, wtp=250, pos=TRUE, ICER.size=2)
Clearly, this figure is similar to Fig. 3.6, with the addition of the bivariate contours
allowing the user to identify the variation in the different comparisons more clearly.
Evidently, this version of the contour2 function is able to pass additional arguments
to the ceplane.plot function (e.g. pos, ICER.size, label.pos).
In addition, both contour and contour2 can be further customised using the same
optional arguments described in Sect. 3.4.3, irrespective of which graphical engine
is used to produce the graph.
There are several ways of looking at the relative cost-effectiveness of the comparators
in an analysis with multiple comparisons. The most common graphical tool
for this evaluation is the cost-effectiveness plane. However, the comparative
evaluation can also be performed using another graphical instrument, the cost-effectiveness
efficiency frontier. The efficiency frontier is an extension of the standard approach
based on incremental cost-effectiveness ratios and provides information for the health
economic evaluation when a universal willingness to pay threshold is not employed (e.g.
in Germany); it is particularly informative for assessing maximum reimbursement
prices [5].
The efficiency frontier compares the net costs and benefits of different interventions
in a therapeutic area. It differs from the common differential approach
(e.g. the cost-effectiveness plane) in that the net measures are used: the predicted costs
and effectiveness for the interventions under consideration are compared directly to
the costs and effectiveness of the treatments that are currently available. The
frontier itself defines the set of interventions for which the cost is at an acceptable level
for the benefits given by the treatment. A new treatment would be deemed efficient,
i.e. it would lie on the efficiency frontier, if either its average effectiveness is greater
than that of any of the currently available treatments, or its cost is lower than that of
currently available treatments with the same effectiveness. This area of efficiency lies
to the right of the curve in Fig. 3.14.
Practically, efficiency is determined sequentially. This means that we start from an arbitrary inefficient point (e.g. the origin of the axes) and then determine the intervention with the smallest average effectiveness. In general, this intervention will also have a higher cost than the starting point, and it will have the lowest Incremental Cost-Effectiveness Ratio (ICER) value amongst the comparators. If two ICERs are equal, then the treatment with the lowest cost is deemed to be efficient, i.e. to lie on the efficiency frontier. The next intervention included on the frontier then has the next lowest effectiveness and cost measures, i.e. the lowest ICER value computed with respect to the current efficient intervention. This method proceeds until all efficient technologies have been identified.
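The sequential construction described above can be condensed into a few lines. The following Python function is an illustrative sketch (with hypothetical data, not the BCEA implementation): starting from an inefficient point, it repeatedly selects the lowest-ICER candidate among the interventions that are more effective than the current frontier point.

```python
def efficiency_frontier(means, start=(0.0, 0.0)):
    """Sequentially build the cost-effectiveness efficiency frontier.

    means: dict mapping treatment name -> (mean effectiveness, mean cost).
    start: the starting point, by default the origin of the axes.
    Returns the ordered list of (treatment, ICER) pairs on the frontier.
    Illustrative sketch only, not the BCEA source code.
    """
    frontier = []
    e0, c0 = start
    candidates = dict(means)
    while candidates:
        # Only treatments that improve effectiveness can extend the frontier
        feasible = {t: (e, c) for t, (e, c) in candidates.items() if e > e0}
        if not feasible:
            break
        # Pick the lowest ICER versus the current point; break ties on lower cost
        best = min(feasible,
                   key=lambda t: ((feasible[t][1] - c0) / (feasible[t][0] - e0),
                                  feasible[t][1]))
        e_b, c_b = feasible[best]
        icer = (c_b - c0) / (e_b - e0)
        frontier.append((best, icer))
        e0, c0 = e_b, c_b
        candidates = {t: v for t, v in feasible.items() if t != best}
    return frontier
```

With four hypothetical treatments, the interventions that are never selected are (absolutely or extendedly) dominated, and the returned ICERs are exactly the slopes of the frontier segments.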
The BCEA function ceef.plot produces a graphical and, optionally, a tabular output of the efficiency frontier, for both single and multiple comparisons. Given a bcea object m, the frontier can be produced simply with the command ceef.plot(m). In the plot, the circles indicate the means of the cost and effectiveness distributions for each treatment option. The number in each circle corresponds to the order of the treatments in the legend. If the number is black then the intervention is on the efficiency frontier; grey numbers indicate dominated treatments. By default, the function produces the efficiency frontier plot in Fig. 3.14 and a summary, as displayed below:
90 3 BCEA—A R Package for Bayesian Cost-Effectiveness Analysis
(Figure: the efficiency frontier plot, with Effectiveness on the x-axis and Cost on the y-axis; legend: 1: No intervention, 2: Self-help, 3: Individual counselling, 4: Group counselling)
Fig. 3.14 The cost-effectiveness efficiency frontier for the smoking cessation example produced
by the ceef.plot function. The colours of the numbers in the circles indicate if a comparator is
included on the efficiency frontier or not. In this case, the interventions No treatment, Self-help and
Group counselling are on the frontier. Individual counselling is extendedly dominated by Self-help
and Group counselling
> ceef.plot(m, pos="right", start.from.origins=FALSE)
In the tabular summary, the slopes of the segments composing the efficiency frontier are also reported. In particular, a slope can be interpreted as the increase in costs for an additional unit in effectiveness, i.e. the ICER for the comparison against the previous treatment on the frontier. For example, the ICER for the comparison between Self-help and No treatment is £176.01 per life year gained.
The dominance type for comparators not on the efficiency frontier is reported
in the output table. This can be of two types: absolute or extended dominance. An
intervention is absolutely dominated if another comparator has both lower costs and
greater health benefits, i.e. the ICER for at least one pairwise comparison is negative.
Comparators in a situation of extended dominance are not wholly inefficient, but
are dominated because a combination of two other interventions will provide more
benefits for lower costs. For example, in the Smoking example, a combination of
Group Counselling and Self-Help would give more benefits for the same cost as
Individual Counselling.
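The extended-dominance condition can be made concrete. The sketch below (Python, with hypothetical effectiveness and cost means, not the values from the smoking cessation analysis) checks whether some convex combination of two treatments achieves strictly more effectiveness than a third treatment at no greater cost.

```python
def extendedly_dominated(target, a, b):
    """Check whether `target` is extendedly dominated by a mix of `a` and `b`.

    Each argument is a (mean effectiveness, mean cost) pair. We look for a
    convex combination of a and b that costs no more than the target while
    being strictly more effective. Hypothetical illustration, not BCEA code.
    """
    (et, ct), (ea, ca), (eb, cb) = target, a, b
    if ca == cb:
        return max(ea, eb) > et and ca <= ct
    # Weight w on `a` such that the mix has exactly the target's cost
    w = (ct - cb) / (ca - cb)
    if not 0.0 <= w <= 1.0:
        return False
    e_mix = w * ea + (1 - w) * eb
    return e_mix > et
```

For instance, with a cheap treatment at (1.0, 5.0) and an expensive one at (3.0, 25.0), an equal mix costs 15.0 and yields effectiveness 2.0, so a comparator at (1.5, 15.0) is extendedly dominated.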
The plot produced by the ceef.plot function, displayed in Fig. 3.14, is composed
of different elements:
• the outcomes of the simulations, i.e. the matrices e and c provided to the bcea function, represented by the scatter points, with different colours for the comparators;
• the average cost and effectiveness point for each comparator considered in the
analysis. These are represented by the circles including numbers indexing the
interventions by their order of appearance in the bcea object. The legend provides
information on the labels of the comparators;
• the efficiency frontier line, connecting the interventions on the frontier;
• the dominance regions, shaded in grey. A lighter shade indicates that interventions in that area would be (absolutely) dominated by a single intervention, while multiple interventions would dominate comparators in the area with the darker shade.
Comparators in the non-shaded areas between the dominance regions and the
efficiency frontier are extendedly dominated. The graphical representation of the
dominance areas can be suppressed by setting dominance=FALSE in the function
call.
The start.from.origins option is used to choose the starting point of the frontier. By default its value is set to TRUE, meaning that the efficiency frontier will start from the origin of the axes, i.e. the point (0, 0). If it is set to FALSE, the starting point will be the average outcomes of the least effective and least costly option among the compared interventions. If any of the comparators result in negative costs or benefits, the argument start.from.origins will be set to FALSE with a warning message. The starting point is not included in the summary unless it corresponds to the average outcomes of one of the included interventions.
As German guidelines recommend representing the costs on the x-axis and benefits on the y-axis [5], an option to invert the axes has been included. This can be done by specifying flip=TRUE in the function call. It is worth noting that the graphical angle of the frontier segments will reflect this inversion of the axes; however, the segment slopes reported in the summary will not change, to retain consistency with the definition of the ICER (additional cost per unit gain in benefit).
The function allows any subset of the comparators to be included in the estimation
of the efficiency frontier. The interventions to be included in the analysis can be
selected by assigning a numeric vector of at least two elements to the argument
comparators, with the indexes of the comparators as elements. For example, to
include only the first and third comparator in the efficiency frontier for the smoking
cessation analysis, it is sufficient to add comparators=c(1,3) to the efficiency
frontier function call. Additionally, the positioning of the legend can be modified from the default (i.e. the top-right corner of the graph) by modifying the value assigned to the argument pos. The values that can be assigned to this argument are consistent with the other plotting functions in BCEA, e.g. ceplane.plot and eib.plot.
The ggplot2 version of the graph shares its design with the base graphics version. In addition to the higher flexibility in the legend positioning provided by the argument pos, a named theme element can be included in the function call, which will be added to the ggplot2 object. The dominance regions are also rendered in a slightly different way, with levels of transparency stacking up when multiple comparators define a common dominance area. As such, the darkness of the grey-shaded areas depends on the number of comparators sharing absolute dominance areas.
References
1. G. Baio, Bayesian Methods in Health Economics (Chapman & Hall/CRC Press, Boca Raton, FL, 2012)
2. C. Williams, J. Lewsey, A. Briggs, D. Mackay (2016). doi:10.1177/0272989X16651869
3. G. Baio, A. Berardi, BCEA: A Package for Bayesian Cost-Effectiveness Analysis. http://cran.r-project.org/web/packages/BCEA/index.html
4. H. Wickham, ggplot2: Elegant Graphics for Data Analysis (Use R!) (Springer, Berlin, 2009)
5. General Methods for the Assessment of the Relation of Benefits to Costs. Technical report, Institute for Quality and Efficiency in Health Care (IQWiG) (2009). https://www.iqwig.de/download/General_Methods_for_the_Assessment_of_the_Relation_of_Benefits_to_Costs.pdf
Chapter 4
Probabilistic Sensitivity Analysis Using
BCEA
4.1 Introduction
In a nutshell, PSA for parameter uncertainty is a procedure in which the input parameters are considered as random quantities. This randomness is associated with a probability distribution that describes the state of the science (i.e. the background knowledge of the decision-maker) [2]. As such, PSA is fundamentally a Bayesian exercise, in which the individual variability in the population is marginalised out but the impact of parameter uncertainty on the decision is considered explicitly. Calculating the ICER and the EIB averages over both these sources of uncertainty, as the expected value of the utility function is found with respect to the joint distribution of parameters and data. The parameter uncertainty is propagated through the economic model to produce a distribution of decisions, where the randomness is induced by the uncertainty about the parameters.
From the frequentist point of view, PSA is unintuitive, as parameters are not considered as random quantities and are therefore not subject to epistemic uncertainty. Consequently, PSA is performed using a two-stage approach. First, the statistical model is used to estimate the parameters, e.g. using the Maximum Likelihood Estimates (MLEs) θ̂ as a function of the observed data, say y. These estimates are then used to define suitable probability distributions, which are in turn fed through the economic model (Fig. 4.1).
4.2 Probabilistic Sensitivity Analysis for Parameter Uncertainty
Fig. 4.1 Frequentist, two-stage versus Bayesian, one-stage health economic evaluation. The top
panel presents the frequentist process, where the parameters of the model are first estimated, e.g.
using MLEs. These are used to define some probability distributions, which are in turn fed through
the economic model. In a full Bayesian approach (bottom panel), this process happens at once:
the uncertainty in the parameters is updated from the prior to the posterior distribution, which is
directly passed to the economic model
Figure 4.2 gives a visual indication of the PSA process, where the parameter distributions seen on the left-hand side of the figure can be determined in a Bayesian or frequentist setting. PSA begins by simulating a set of parameter values, represented by red crosses in Fig. 4.2. These parameter values are then fed through the economic model to give a value for the population summaries (Δe, Δc), shown in the middle column of the figure and recorded in the final column, “Decision Analysis”. These measures are then combined to perform the cost-effectiveness analysis and calculate a suitable summary, e.g. IB(θ), for the specific simulation. This process is replicated for another set of the simulated values to create a table of (Δe, Δc) for different simulations, along with a table of summary measures.
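The bookkeeping just described amounts to evaluating the incremental benefit IB(θ) = k·Δe − Δc row by row in the PSA table. A minimal, language-agnostic sketch (here in Python, with made-up simulation draws rather than output from a real model):

```python
def psa_summaries(delta_e, delta_c, k):
    """Compute the simulation-specific incremental benefit IB = k*De - Dc.

    delta_e, delta_c: lists of simulated effectiveness and cost differentials
    (one entry per PSA simulation). k: willingness to pay threshold.
    Sketch of the PSA bookkeeping only, not BCEA code.
    """
    return [k * de - dc for de, dc in zip(delta_e, delta_c)]

# Hypothetical PSA draws for the comparison of two interventions
delta_e = [0.25, -0.125, 0.5, 0.0625]
delta_c = [2000.0, 500.0, 15000.0, -100.0]
ib = psa_summaries(delta_e, delta_c, k=25000)
```

Each entry of ib is the decision summary for one simulated parameter configuration; a positive value means the reference intervention is cost-effective in that "possible future".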
In this way, PSA does not differ from the analysis framework used in BCEA, which is based on a simulation approach from the distribution of (Δe, Δc). The full economic analysis involves summarising this table using a suitable synthesis such as the ICER. PSA, however, involves considering the distribution of the final summary measure, such as the row-wise IB(θ). This gives a decision for each row of the PSA table, conditional on the parameter values in that specific simulation.
This implies that one of the main tools to evaluate the parameter uncertainty is the cost-effectiveness plane, because it provides helpful information about how the model output varies due to uncertainty in the parameters.
Fig. 4.2 A schematic representation of the process of “probabilistic sensitivity analysis”. Uncertainty about the parameters is formalised in terms of suitable probability distributions and propagated through the economic model (which can be seen as a “black box”) to produce a distribution of decision processes. These can be summarised (to determine the best option, given current evidence) or analysed separately, to assess the impact of uncertainty in the parameters
However, to further extend
the analysis of this decision uncertainty, the impact of parameter uncertainty on the
decision-making process can be assessed for different willingness to pay values.
BCEA is able to provide several types of output that can be used to assess the health
economic evaluation. As seen in Chap. 3, a summary can be produced as follows:
> summary(m)
where m is a BCEA object. This function provides the output reported in Sect. 3.3.
In addition to the basic health economic measures, e.g. the EIB and the ICER, BCEA provides some summary measures for the PSA, allowing a more in-depth analysis of the variation observed in the results, specifically the CEAC (Sect. 4.2.2) and the EVPI (Sect. 4.3.1). The full output of the PSA is stored in the BCEA object and can be easily accessed using the function sim.table, e.g. with the following code:
> table=sim.table(m, wtp=25000)
The willingness to pay value (indicated by the argument wtp) must be selected from the values of the grid generated when the BCEA object was created, as in the summary function (see Sect. 3.3); by default it is set to 25 000 monetary units, or to Kmax when that argument has been used.
The output of the sim.table function is a list, composed of the following elements:
• Table: the table in which the output is stored as a matrix;
• names.cols: the column names of the Table matrix;
• wtp: the chosen willingness to pay threshold. All measures depend on it, since it is a parameter in the utility function;
• ind.table: the index associated with the selected wtp value in the grid used to run
the analysis. It is the position the wtp occupies in the m$k vector, where m is the
original bcea object.
The matrix Table contains the health economic outputs for each simulation and can be accessed by subsetting the object created with the sim.table function. The first lines of the table can be printed in the R console as follows:
> head(table$Table)
U1 U2 U* IB2_1 OL VI
1 -36.57582 -38.71760 -36.57582 -2.1417866 2.141787 -1.750121
2 -27.92514 -27.67448 -27.67448 0.2506573 0.000000 7.151217
3 -28.03024 -33.37394 -28.03024 -5.3436963 5.343696 6.795451
4 -53.28408 -47.13734 -47.13734 6.1467384 0.000000 -12.311646
5 -43.58389 -40.40469 -40.40469 3.1791976 0.000000 -5.578996
6 -42.37456 -33.08547 -33.08547 9.2890987 0.000000 1.740230
(incidentally, this particular excerpt refers to the Vaccine example).
The table is easily readable and reports for every simulation, indexed by the
leftmost column, the following quantities:
• U1 and U2: the utility values for the first and second interventions. When multiple
comparators are included, additional columns will be produced, one for every
considered comparator;
• U*: the maximum utility value among the comparators, indicating which intervention produced the most benefits at each simulation;
• IB2_1: the incremental benefit IB for the comparison between intervention 2 and
intervention 1. Additional columns are included when multiple comparators are
considered (e.g. IB3_1);
• OL: the opportunity loss, obtained as the difference between the maximum utility computed for the current parameter configuration (i.e. at the current simulation), U*, and the current utility of the intervention associated with the maximum utility overall. In the current example and for the selected threshold of willingness to pay, the mean of the vector U1 (where the vaccine is not available)¹ is lower than the mean of the vector U2 (where the vaccine is available), as vaccination is the most cost-effective intervention, given current evidence. Thus, for each row of the simulations table, the OL is computed as the difference between the current value of U* and the value of U2. For this reason, in all simulations where vaccination is indeed more cost-effective (i.e. when IB2_1 is positive), OL(θ) = 0, as there would be no opportunity loss if the parameter configuration were the one obtained in the current simulation;
¹ Notice that this is in fact U 0, in our notation. The slight confusion is due to the fact that it is not advisable (or indeed even possible, in many instances) to use a 0 index in R. Similarly, U2 indicates the utility for treatment t = 1, i.e. U 1.
• VI: the value of information, which is computed as the difference between the
maximum utility computed for the current parameter configuration U* and the
utility of the intervention which is associated with the maximum utility overall. In
the Vaccine example and for the selected threshold of willingness to pay, vaccina-
tion (U2) is the most cost-effective intervention, given current evidence. Thus, for
each row of the simulations table, the VI is computed as the difference between
the current value of U* and the mean of the entire vector U2. Negative values of
the VI imply that for those simulation-specific parameter values both treatment
options are less valuable than the current optimal decision, in this case vaccination.
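The OL and VI columns can be reproduced from first principles. The sketch below (Python, with a hypothetical utility table; not the sim.table source code) computes, for each simulated row, the best achievable utility U*, the loss incurred by sticking with the treatment that is optimal on average, and the value of information as U* minus the expected utility of that treatment.

```python
def ol_vi(u):
    """Opportunity loss (OL) and value of information (VI) per simulation.

    u: list of rows, each a list of utilities (one per intervention) for one
    simulated parameter configuration. Sketch of the logic behind the OL and
    VI columns of sim.table, not BCEA code.
    """
    n_t = len(u[0])
    # Expected utility per intervention, and the optimal-on-average treatment tau
    means = [sum(row[t] for row in u) / len(u) for t in range(n_t)]
    tau = max(range(n_t), key=lambda t: means[t])
    u_star = [max(row) for row in u]                    # best utility per simulation
    ol = [us - row[tau] for us, row in zip(u_star, u)]  # loss from sticking with tau
    vi = [us - means[tau] for us in u_star]             # U* minus overall expected max
    return ol, vi
```

With the hypothetical table [[1, 2], [3, 1], [2, 4]], treatment 2 is optimal on average; the OL is zero in the simulations where it also wins, positive otherwise, and the VI can be negative when U* falls below the expected utility of the optimal treatment.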
BCEA includes a set of functions that can depict in graphical form the results
of PSA, in terms of the most commonly used indicators, which we describe in the
following.
The CEAC is defined as the probability of cost-effectiveness, i.e. the probability of a positive incremental benefit, Pr(IB(θ) > 0) = Pr(kΔe − Δc > 0), which depends on the willingness to pay value k. This means that the CEAC can be used to determine how the probability that treatment 1 is optimal changes as the willingness to pay threshold increases. In addition, this shows the clear link between the analysis of the cost-effectiveness plane and the CEAC. Figure 4.3 shows in panels (a)–(c) the cost-effectiveness plane for three different choices of the willingness to pay parameter k. In each, the CEAC is exactly the proportion of points in the sustainability area. Panel (d) shows the CEAC for a range of values of k in the interval [0; 50 000].
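Since the CEAC at a given threshold is just the proportion of simulated (Δe, Δc) points falling in the sustainability area (positive incremental benefit), it can be computed directly from the PSA draws. A language-agnostic sketch (Python, with hypothetical draws, not the BCEA implementation):

```python
def ceac(delta_e, delta_c, k_grid):
    """Cost-effectiveness acceptability curve.

    For each willingness to pay k in k_grid, returns the proportion of PSA
    simulations with positive incremental benefit IB = k*De - Dc, i.e. the
    proportion of points in the sustainability area. Illustrative sketch.
    """
    n = len(delta_e)
    return [sum(1 for de, dc in zip(delta_e, delta_c) if k * de - dc > 0) / n
            for k in k_grid]

# Hypothetical draws: the curve rises with k because De tends to be positive
curve = ceac([0.25, 0.5, -0.125, 0.375],
             [5000.0, 20000.0, 500.0, 8000.0],
             k_grid=[0, 10000, 25000, 50000])
```

As in panels (a)–(c) of Fig. 4.3, increasing k rotates the acceptability region and sweeps more simulated points into it, which is why the curve is typically increasing when the reference intervention is more effective and more costly.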
Fig. 4.3 A graphical representation of the links between the cost-effectiveness plane and the cost-
effectiveness acceptability curve
In general, the CEAC can also be directly compared to the EIB. The intervention with the highest associated probability of cost-effectiveness (CEAC) will also present higher expected utilities with respect to the other comparators. For example, if two alternative interventions are considered and one of them has an associated CEAC value equal to 0.51, it will be considered cost-effective on average, producing a positive differential in the utilities. The CEAC, however, also quantifies the uncertainty around this conclusion. A probability of cost-effectiveness of 0.51 and one of 1.00 will result in the same choice when analysing the ICER or the EIB, while describing two very different situations. In the first case the difference between the interventions is only slightly in favour of one intervention. However, in the
second situation (CEAC equal to one) the decision to implement the cost-effective
intervention has very little associated uncertainty. Notice that this property will not
hold when the underlying joint distribution of cost and effectiveness differentials is
extremely skewed, in which case an intervention that is deemed to be cost-effective,
given current evidence, may be associated with a relatively low CEAC.
The CEAC value for the chosen willingness to pay threshold is included by default
in the summary.bcea function output, described in Sects. 3.3 and 4.2.1. Functions are
included in BCEA to produce a graphical output for pairwise and multiple compar-
isons.
For the Vaccine example, the summary table in Sect. 3.3 reports that the CEAC is equal to 0.529 for a willingness to pay threshold of 25 000 monetary units. This indicates a relatively low probability of cost-effectiveness for the vaccination policy over the status quo, as the CEAC is close to 0.5. When only two comparators are under
investigation, a CEAC value equal to 0.5 means that the two interventions have the
same probability of cost-effectiveness. This is the maximum possible uncertainty
associated with the decision-making process. Therefore, a CEAC value of 0.529
means that, even though the vaccination is cost-effective on average, the difference
with the reference comparator is modest.
An alternative way of thinking of the CEAC value is to consider all “potential
futures” determined by the current uncertainty in the parameters: under this inter-
pretation, in nearly 53% of these cases t = 1 will turn out to be cost-effective (and
thus the “correct” decision, given the available knowledge). This also states that,
for a willingness to pay of 25 000 monetary units, nearly 53% of the points in the
cost-effectiveness plane lie in the sustainability area.
From the decision-maker’s perspective it is very informative to consider the CEAC for different willingness to pay values, as it allows them to understand the level of confidence they can have in the decision. It also demonstrates how the willingness to pay influences this level of confidence. In fact, regulatory agencies such as NICE in the UK do not use a single threshold value but rather evaluate the comparative cost-effectiveness over a set interval. As the CEAC depends strongly on the chosen threshold, it can be sensitive to small increments or decrements in the value of the willingness to pay and can vary substantially.
To plot the cost-effectiveness acceptability curve for a bcea object m the function
ceac.plot is used, producing the output depicted in Fig. 4.4.
> ceac.plot(m)
The CEAC curve in Fig. 4.4 increases together with the willingness to pay. As vaccination is more expensive and more effective than the status quo, if the willingness to pay increases, a higher number of simulations yield a positive incremental benefit. In other terms, the slope of the line defining the cost-effectiveness acceptability region on the cost-effectiveness plane increases, so more points are included in the sustainability region.
Fig. 4.4 The cost-effectiveness acceptability curve for the Vaccine example. The plot is an assessment the decision-maker can make of the impact of the variation of the willingness to pay on the probability of cost-effectiveness. This enables the analysis of the uncertainty in different scenarios, for different values of the maximum cost per unit increase in effectiveness the decision-maker is willing to pay
The values of the CEAC for a given threshold value can be extracted directly from
the bcea object by extracting the ceac element. For example, the CEAC value for
willingness to pay values of 20 000 and 30 000 monetary units will be displayed by
running the following code:
> with(m, ceac[which(k==20000)])
[1] 0.457
> with(m, ceac[which(k==30000)])
[1] 0.586
or equivalently
> m$ceac[which(m$k==20000)]
[1] 0.457
> m$ceac[which(m$k==30000)]
[1] 0.586
The lines above will return an error if the specified value for k is not present in the
grid of willingness to pay values included in the m$k element of the bcea object m.
The CEAC plot can be made more informative by adding horizontal lines at given probability values, so that the probabilities can be read off more easily. To add these lines in the base version of the plot, simply call the function lines after the ceac.plot function, for example to include horizontal lines at the levels 0.25, 0.5 and 0.75.
Fig. 4.5 The inclusion of the panel background grid allows for an easier graphical assessment of
the values of the cost-effectiveness acceptability curve. This can be easily done by re-enabling the
panel.grid in the theme options in ggplot
For lower thresholds the decision is clearly in favour of not implementing the
vaccination, with a CEAC below 0.25 for a willingness to pay threshold less than
10 000 monetary units per QALY. For higher values of willingness to pay, however,
the decision uncertainty is still high as the probability of cost-effectiveness does not
reach 0.75 in the considered range of willingness to pay values.
When more than two comparators are considered in an economic analysis, the pairwise CEACs with respect to a single reference intervention may not give sufficient information. For example, Fig. 4.6 shows the probability of cost-effectiveness for Group counselling compared to the other interventions in a pairwise fashion. This analysis gives no information about the comparisons that do not include Group counselling. This can potentially lead to misinterpretations, since these probabilities do not take into account the whole set of comparators. The same issue is present in the EIB analysis, as seen in the interpretation of Fig. 3.10, which is not straightforward in the multiple treatment comparison setting.
BCEA provides a tool to overcome this problem: the multi.ce function, which computes the probability of cost-effectiveness for each treatment based on the utilities of all the comparators. This allows the user to analyse the overall probability of cost-effectiveness for each comparator, taking into account all possible interventions. To produce a CEAC plot that includes the intervention-specific cost-effectiveness curves, the following code is used. First, the multiple treatment analysis is performed using the bcea object m and the results are stored in the object mce. Then the plot in Fig. 4.7 is produced by calling the mce.plot function. The argument pos can be used to change the legend position; in this case it is set to top-right to avoid the legend and the curves overlapping.
> mce=multi.ce(m)
> mce.plot(mce, pos="topright")
The function multi.ce requires a bcea object as argument. It will output a list composed of the following elements:
• m.ce, a matrix containing the values of the cost-effectiveness acceptability curves for each intervention over the willingness to pay grid. The matrix is composed of one row for every willingness to pay value included in the bcea object m$k (i.e. 501 if the argument wtp is not specified in the bcea function call) and one column for each included comparator;
• ceaf, the cost-effectiveness acceptability frontier. This vector is determined by the
maximum value of the CEACs for each value in the analysed willingness to pay
grid;
• n.comparators, the number of included treatment strategies;
Fig. 4.6 The figure depicts the three pairwise CEACs, representing the comparisons between the
“Group counselling” intervention versus all other comparators taken one by one. This plot does
not give information on the probability of cost-effectiveness of each strategy when considering all
other treatment options at the same time. This issue can be overcome by analysing all comparators
at the same time
• k, the willingness to pay grid. This is equal to the grid in the original bcea object,
in this example m$k;
• interventions, a vector including the names of the compared interventions, and
equal to the interventions vector of the original bcea object.
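The logic behind m.ce and ceaf can be summarised compactly: for every willingness to pay value, count how often each treatment attains the maximum utility across the simulations; the frontier is then the maximum of those probabilities. A sketch with a hypothetical utility table (Python, not the multi.ce source):

```python
def multi_ceac(u_by_k):
    """Multi-comparator CEACs and the acceptability frontier.

    u_by_k: one entry per willingness to pay value; each entry is a list of
    simulation rows, each row holding the utilities of every treatment for
    that simulation. In case of ties, every tied treatment counts as optimal.
    Illustrative sketch of the multi.ce logic only.
    """
    m_ce, ceaf = [], []
    for sims in u_by_k:
        n_t, n_s = len(sims[0]), len(sims)
        # Probability that each treatment attains the maximum utility
        probs = [sum(1 for row in sims if row[t] == max(row)) / n_s
                 for t in range(n_t)]
        m_ce.append(probs)
        ceaf.append(max(probs))  # frontier = highest probability at this k
    return m_ce, ceaf
```

Unlike the pairwise CEACs, each probability here is computed against all comparators simultaneously, so the curves for a given willingness to pay value sum to one (up to ties).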
To produce a graph of the CEACs for all comparators considered together, the command mce.plot is used. This command accepts as input an object produced by the multi.ce function, with the pos argument indicating the legend position. As usual, an option to select whether the plot should be produced using base graphics or ggplot can be included. The legend position can be changed in the same way as for the base ceac.plot function (Sect. 3.4.1). Again, if selecting ggplot, finer control of the legend position is possible. The legend can be placed outside the plot limits using a string providing the position (e.g. "bottom") or, alternatively, a two-element vector specifying the relative position on the two axes; the two elements of the vector can assume any value, so that for example c(0.5,0.5) indicates the centre of the plot and c(1,0) the bottom-right corner inside the plot.
The results displayed in Fig. 4.7 confirm and extend the conclusions previously drawn for the Smoking example: No intervention is the optimal choice for low willingness to pay thresholds. However, it can also be seen that its probability of cost-effectiveness decreases steeply as the threshold increases. Between the values £177 and £210 the curve with the highest value is Self-help but, again, the associated probability of cost-effectiveness is modest, lower than 0.40. In addition, it is not substantially higher than the probability of the other interventions being cost-effective, as the CEAC values are similar.
Fig. 4.7 This figure is a graphical representation of the probability of cost-effectiveness of each treatment when all other comparators are considered. The information given is substantially different from the pairwise CEACs, since it allows for the evaluation of the “best” treatment option over the considered grid of willingness to pay values. The uncertainty associated with the decision can be inferred from the distance between the treatment-specific curves
The values of the CEAC curves can be extracted from the mce object for a given
willingness to pay threshold, for example 194, using the following code:
> mce$m.ce[which(mce$k==194),]
[1] 0.1830 0.3085 0.1985 0.3100
The code above will return an empty vector if the threshold value 194 is not present in the grid vector mce$k. A slightly more robust approach is to extract the value for the grid point minimising the distance from the chosen threshold, which in this case yields the same output, since the chosen willingness to pay is included in the vector.
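This nearest-grid-point workaround can be sketched as follows; in R it would correspond to something like mce$m.ce[which.min(abs(mce$k - 194)), ], and the Python version below (illustrative only) implements the same idea.

```python
def nearest_index(grid, k):
    """Index of the grid value closest to the requested threshold k.

    Safer than an exact match, which returns nothing when k is off-grid.
    Sketch of the workaround described in the text.
    """
    return min(range(len(grid)), key=lambda i: abs(grid[i] - k))

# Hypothetical willingness to pay grid
k_grid = [0, 100, 200, 300, 400, 500]
```

Looking up 194 on this grid returns the index of 200, the closest grid value, so the extraction never comes back empty.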
4.2 Probabilistic Sensitivity Analysis for Parameter Uncertainty 107
Probability of most cost effectiveness Cost−effectiveness acceptability frontier Cost−effectiveness acceptability frontier
1.0
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0.0
0.0
0 100 200 300 400 500 0 500 1000 1500 2000 2500
500 2, 500
Fig. 4.8 The cost-effectiveness acceptability frontier (CEAF) for the smoking cessation example.
The CEAF indicates the overall value of uncertainty in the comparative cost-effectiveness consider-
ing all interventions at the same time. The low value of the curve highlights high uncertainty in the
interval £150–200. This is because the average utilities for all four interventions are much closer in
this interval than for smaller and bigger values of the willingness to pay
The frontier then stabilises around a value equal to 0.60 for higher thresholds, as shown in Fig. 4.8b, which plots the frontier for a higher limit of the willingness to pay threshold. To produce this graph, the economic analysis must be re-run as follows:
> m2=bcea(e, c, ref=4, interventions=treats, Kmax=2500)
> mce2=multi.ce(m2)
> ceaf.plot(mce2)
The cost-effectiveness acceptability probability remains stable around a value of about 0.60, with Group counselling being the optimal choice for willingness to pay values greater than £225 per life year gained.
Fig. 4.9 The multiple-comparison cost-effectiveness acceptability curves with the overlaid accept-
ability frontier curve for the smoking cessation example. The transparency argument alpha available
in the geom_line layer makes it easy to understand graphically which comparator has the highest
probability of cost-effectiveness
The code includes some control over the appearance of the frontier in the graph. An additional geom_line layer is added to overlay the cost-effectiveness acceptability frontier, using the aesthetics provided by the mce.plot function. The values of the curve over the willingness to pay grid are calculated by means of the stat_summary function, using the maximum value of the curves at each point. Figure 4.9 shows the resulting graph.
One measure to quantify the value of additional information is known as the Expected
Value of Perfect Information (EVPI). This measure translates the uncertainty associ-
ated with the cost-effectiveness evaluation in the model into an economic quantity.
This quantification is based on the Opportunity Loss (OL), a measure of the potential losses caused by choosing the intervention which is most cost-effective on average when it is not the intervention with the highest utility in a “possible future”. A future can be thought of as obtaining enough data to know the exact value of the utilities for the different interventions. This would allow the decision-makers to know the optimal treatment with certainty. In a Bayesian setting, the “possible
futures” are simply represented by the samples obtained from the posterior distrib-
110 4 Probabilistic Sensitivity Analysis Using BCEA
ution for the utilities. Thus, the opportunity loss occurs when the optimal treatment
on average is non-optimal for a specific point in the distribution for the utilities.
To calculate the EVPI practically, possible futures for the different utilities are
represented by the simulations. The utility values in each simulation are assumed to
be known, corresponding to a possible future, which could happen with a probability
based on the current available knowledge included in and represented by the model.
The opportunity loss is the difference between the maximum value of the simulation-
specific (known-distribution) utility U*(θ) and the utility for the intervention result-
ing in the overall maximum expected utility U(θτ), where τ = arg maxt Ut (cf. the
discussion in Sect. 4.2.1):

OL(θ) = U*(θ) − U(θτ).   (4.1)
Usually, for a large number of simulations the OL will be 0 in most simulations, as
the treatment that is optimal on average will also be the optimal treatment for the
majority of simulations. Note also that the opportunity loss can never be negative:
in each simulation we either choose the current optimal treatment (giving an OL of 0)
or a treatment with a higher utility value for that specific simulation.
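To make this definition concrete, here is a minimal, self-contained sketch of the OL calculation on a toy matrix of simulated utilities (the matrix U and its values are purely illustrative, not taken from the Vaccine or Smoking models):

```r
# Toy example: rows are PSA simulations, columns are treatments.
set.seed(42)
U <- cbind(t1 = rnorm(5000, mean = 10), t2 = rnorm(5000, mean = 10.5))
tau <- which.max(colMeans(U))        # treatment optimal on average
OL  <- apply(U, 1, max) - U[, tau]   # simulation-specific opportunity loss
EVPI <- mean(OL)                     # average opportunity loss
```

Note that OL is 0 in every simulation where the on-average optimal treatment also wins that simulation, and strictly positive otherwise.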
The EVPI is then defined as the average of the opportunity loss. This measures the
average potential losses in utility caused by the simulation-specific optimal decision
being non-optimal in reality.
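In symbols, this definition can be written as follows (a restatement of the text, using the same notation as the EVPPI formulas later in the chapter, where U* denotes the overall maximum expected utility):

```latex
\mathrm{EVPI}
  = \mathbb{E}_{\theta}\!\left[\mathrm{OL}(\theta)\right]
  = \mathbb{E}_{\theta}\!\left[U^{*}(\theta)\right] - U^{*},
\qquad \text{where } U^{*} = \max_{t} \mathbb{E}_{\theta}\!\left[U_{t}(\theta)\right].
```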
If the probability of cost-effectiveness is low, then more simulations will give a non-
zero opportunity loss and consequently the EVPI will be higher. This means that if
the probability of cost-effectiveness is very high, it is unlikely that more information
would be worthwhile, as the most cost-effective treatment is already evident. However,
the EVPI gives additional information over the CEAC, as it takes into account the size
of the opportunity loss as well as the probability of cost-effectiveness.
For example, there may be a setting where the probability of cost-effectiveness
is low, so the decision-maker believes that decision uncertainty is important. How-
ever, this is simply because the two treatments are very similar in both costs and
effectiveness. In this case the OL will be low as the utilities will be similar for both
treatments for all simulations. Therefore, the cost of making the incorrect decision is
very low. This will be reflected in the EVPI but not in the CEAC and implies that the
optimal treatment can be chosen with little financial risk, even with a low probability
of cost-effectiveness.
To give an indication of the cost of uncertainty for a chosen value of k, the EVPI is
presented in the summary table. For the Vaccine example, this measure is estimated to be
equal to 2.41 monetary units for a threshold of 25 000 monetary units per QALY.²
² In general, the EVPI value given in BCEA is the per-person EVPI; to compare with the cost of
future research, this EVPI value should be multiplied by the number of people who will receive the
treatment. This is because the cost of an incorrect decision is higher if a greater number of patients
use the treatment.
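For instance, a back-of-the-envelope population scaling might look like this (the population size N is a hypothetical figure, not part of the Vaccine model; m is the bcea object, whose evi element stores the per-person EVPI over the grid of thresholds in m$k):

```r
# Hypothetical sketch: scale the per-person EVPI to the population
# expected to face the decision. N is an assumed figure.
N <- 100000
pop_evpi <- N * m$evi[m$k == 25000]  # population EVPI at k = 25 000
```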
Fig. 4.10 The expected value of perfect information for the vaccination example for willingness
to pay thresholds between 0 and 50 000
The analysis of the expected value of information can be carried out for multiple
comparators in the same way. Once the bcea object m is obtained, the EVPI can be
plotted using
> evi.plot(m)
As already presented in Sect. 4.2.2.2, the two break-even points can be seen in
Fig. 4.11: the first one when the decision changes from implementing “No treatment”
to “Self-help” (k1∗ = £177) and the second when the optimal decision changes to
“Group counselling” (k2∗ = £210). Notice that the EVPI is a single value even in the
multi-comparison setting. This is in contrast to all the other measures that must be
extended in the multi-comparison setting. This is because the EVPI is based on the
average opportunity loss across all the possible treatment options. For example, while
Group Counselling dominates on average for larger willingness to pay values, the
dominating treatment in a specific simulation can be either No treatment, Individual
Counselling or Self-Help. This induces one value for the opportunity loss for each
simulation, rather than three different values.
Fig. 4.11 The expected value of perfect information for the smoking cessation example. The two
break-even points can be seen in the plot

The cost-effectiveness acceptability frontier in Fig. 4.9 showed that the probability of
cost-effectiveness rapidly decreases as the willingness to pay increases from 0, before
stabilising below 0.60 for willingness to pay values greater than £150. Therefore,
there is a moderate amount of uncertainty about which treatment is cost-effective.
However, Fig. 4.11 shows that the EVPI is relatively low. It is possible, therefore,
to proceed with Group Counselling without performing additional research. Again,
this highlights the importance of the EVPI as a tool for PSA as it takes into account
both the opportunity loss and the probability of making an incorrect decision.
Note that the EVPI must be compared with the costs of future research: in many
examples the value of resolving the uncertainty in the health economic model will
seem high, but as the cost of future research is also typically high it can still
exceed the EVPI. The interpretation of the EVPI must therefore be made in the context
of the modelling scenario, not against a general-purpose threshold or comparator.
The EVPI is a useful tool for performing PSA, especially when used in conjunction
with the CEAC and can be easily calculated as part of a BCEA standard procedure.
This allows both the CEAC and the EVPI to be provided as part of the general plot
and summary functions.
However, in the case where the EVPI is high compared to the cost of additional
research it is useful to know where to target that research to reduce the decision uncer-
tainty sufficiently. That is to say, when the opportunity loss under current information
is high compared to the cost of obtaining additional information, it is important to
know how to reduce this opportunity loss as efficiently and as cheaply as possi-
ble. Additionally, in some settings, decision-makers are interested in understanding
which parameters are driving the decision uncertainty.
This is very important in health economic modelling as some of the underlying
parameters are known with relative certainty. For example, there may be a large amount
of research on the prevalence of a given disease; similarly, the cost of the treatment
may be known with reasonable precision. Evidently, investigating these parameters
further to reduce the decision uncertainty would waste valuable resources and delay
getting a potentially cost-effective intervention to market. Ideally, therefore, it would
be advisable to calculate the value of resolving uncertainty for certain parameters or
subsets of parameters in order to target research efforts.
This subset analysis would also be important in deciding whether a specific pro-
posed trial is cost-effective. In this setting, the proposed study would target some
model parameters and the expected value of learning these specific parameters would
need to exceed the cost of the proposed trial. Again, note that it is important to compare
this value with the value of the proposed trial to ascertain whether the uncertainty is
high.
In general, the value of a subset of parameters is known as the Expected Value
of Perfect Partial Information (EVPPI); this indicator can be used to quantify the
value of resolving uncertainty in a specific parameter (or subset of parameters), while
leaving uncertainty in the remaining model parameters unchanged.
While intuitively this appears a simple extension to the general framework in
which the EVPI is computed, the quantification of the EVPPI does pose serious
computational challenges. Traditionally, the EVPPI was calculated using computa-
tionally intensive nested simulations. However, recent results [5–8] have allowed
users to approximate the EVPPI efficiently.
Crucially, these approximations are based solely on the PSA values for the parameters
(e.g. in a Bayesian setting the posterior distribution obtained using MCMC
methods), allowing these methods to be included in general-purpose software like
BCEA. To begin, the PSA simulations for the parameters themselves need to be
stored and loaded in R. These PSA samples must be the parameter simulations used
to calculate the measures of costs and effectiveness. In a Bayesian setting these will
typically be available in a BUGS/JAGS object in R as in Sect. 4.3.3.4. However, if the
health economic model has been built in Excel then the parameter simulations can
be loaded into R from a .csv file to calculate the EVPPI (see Fig. 5.5 and the discus-
sion in Sects. 5.2.4.1 and 5.2.5). For example, if a spreadsheet containing simulated
values for a set of relevant parameters was saved as the file PSAsimulations.csv, then
these could be imported in R using the commands
> psa_samples <- read.csv("PSAsimulations.csv", header=TRUE)
The resulting object psa_samples could then be passed to BCEA to execute the
analysis of the EVPPI.
In line with (4.1), the opportunity loss for a specific value φ is the difference between
the value U*(φ) and the utility for the intervention resulting in the overall maximum
utility:

OL(φ) = U*(φ) − U(φτ),

where U(φτ) is again the utility with ψ marginalised out. Finally, the EVPPI is the
expectation of this (conditional) opportunity loss:

EVPPI = Eφ[OL(φ)] = Eφ[U*(φ)] − U*.
As mentioned above, this formulation adds very little theoretical complexity with
respect to the EVPI. However, the calculation of U*(φ) involves a maximisation over
the conditional expectations Eψ|φ[Ut(θ)], computed for each treatment option t. This
means that regression is being used to marginalise out the uncertainty due to the
nuisance parameters ψ: for each treatment option, the simulated utilities are modelled as

Ut(θs) = gt(φs) + εs,   (4.4)

assuming that εs ∼ Normal(0, σε²). Once the function gt(φ) has been estimated for all
treatment options, the fitted values can be plugged into the expression for the EVPPI
given above. In (4.4), the computed values of Ut(θ), which are
available from the PSA process described in Sect. 4.2.1, are used as input data, in
conjunction with the simulated values of the parameters of interest. Thus, in terms
of regression, the relevant “response” is given by the values of Ut (θ), while the
“covariates”, or independent variables, are φ. Both these quantities are obtained
from the PSA process and are therefore available at no extra computational cost,
once the PSA procedure is in place.
Notice that in (4.4), the target to be estimated is the unobserved maximum con-
ditional expectation Eψ|φ[Ut(θ)], which is effectively a function of φ only and which
we indicate as gt(φ). This is because the uncertainty due to the other parameters has
been marginalised out, leaving a function conditional on the values of φ alone.
In general, this function gt (φ) can have a complicated form implying that a method
such as linear regression would be unlikely to capture the relationship between φ and
Eψ|φ[Ut(θ)]. Therefore, a flexible regression method is needed to estimate the function
gt(φ). A possible way of doing this is to use “non-parametric” regression methods, in
which the predictor does not take a predetermined form but is constructed according
to information derived from the data. Once the function gt (φ) has been estimated,
it is possible to use the resulting estimates to find the average opportunity loss as all
the other terms can be simply calculated directly from the health economic model
output.
By default, the BCEA function evppi uses Generalised Additive Models (GAM)
[9] when φ contains only one parameter; we briefly describe this in Sect. 4.3.3.1. For
the general case in which φ is multi-dimensional, BCEA resorts to a fast Gaussian
Process (GP) [10] approximation method developed for EVPPI calculation in [8],
which we present in Sect. 4.3.3.3. BCEA also implements a GP regression method
developed by [7] (discussed in Sect. 4.3.3.2 below). While this is in some cases
slightly more flexible (because of its underlying formulation), it is in general less
efficient from the computational point of view.
In addition, the two methods from [5, 6] can be used to approximate the EVPPI
for a single parameter. These are also implemented in the function evppi and are
described in Sect. 4.3.6. However, they can be considered as “deprecated”, since for
a small number of parameters in the subset φ, the GAM method is superior both in
terms of accuracy and computational speed.
The following sections give a short explanation of all these methods, so that the
user may have a fuller understanding of the technical, regression-method-specific
aspects that can be manipulated by the user in BCEA. We note that the complexity
of the mathematical formulation is beyond the scope of this book and thus we only
sketch the basic features of the advanced methods. The reader is referred to the
relevant literature for more in-depth information.
GAMs model the observed utility values as a sum of smooth functions of the impor-
tant parameters φ. In BCEA, these smooth functions are splines—technically these
are piecewise polynomial functions where the degree of the polynomial defines the
smoothness of the function. The polynomial degree is selected automatically by the
evppi function. This effectively amounts to modelling

gt(φs) = Σq=1…Qφ ht(φsq),

where Qφ is the number of important parameters (i.e. the size of the subset φ) and
ht(·) are the smooth functions.
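The single-parameter GAM idea can be sketched with mgcv (the package underlying BCEA's GAM fits) on simulated toy data; the data-generating model below is purely illustrative and not taken from any of the book's examples:

```r
# Toy illustration of regression-based EVPPI with a GAM:
# estimate g_t(phi) = E[U_t | phi] for each treatment by smoothing the
# simulated utilities on phi, then average the opportunity loss.
library(mgcv)
set.seed(1)
S   <- 2000
phi <- rnorm(S)                              # parameter of interest
U   <- cbind(sin(phi) + rnorm(S, sd = 1),    # simulated utilities,
             0.2 + rnorm(S, sd = 1))         # one column per treatment
ghat <- apply(U, 2, function(u) fitted(gam(u ~ s(phi))))
evppi_hat <- mean(apply(ghat, 1, max)) - max(colMeans(ghat))
```

By Jensen's inequality the estimate is always non-negative, mirroring the fact that the EVPPI cannot be negative.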
In standard GAM regression methods, each smooth function or spline is a function
of one parameter in the set φ. This can occasionally be too restrictive in health
economic models, especially when the values in φ are correlated, and thus the EVPPI
estimate found using this method is unreliable. It is for this reason that for multi-
dimensional problems a different regression method is used by default in BCEA.
If GAM methods are used for multi-parameter sets φ, BCEA uses a GAM with
interactions between all the parameters by default. This means that the polynomials
include cross terms and therefore the number of potential terms is greatly increased,
especially for higher numbers of parameters. Just to give an example, assuming that
cubic splines are fitted (the default choice in BCEA), considering Qφ = 3 implies
that the full GAM model has 125 parameters.
While the estimation procedure is still extremely fast, a large dataset (e.g.
S = 50 000) is required to calculate the EVPPI when φ contains more than four
parameters. Additionally, even a large dataset cannot prevent over-parametrisation
in some settings meaning that the GAM cannot be fitted accurately. It is possible to
remove these interaction terms in BCEA, as shown in Sect. 4.3.4 but, again, this can
negatively affect the accuracy of the estimate.
In this approach, the simulated utilities for each treatment t are modelled jointly as
multivariate normal:

(Ut(θ1), …, Ut(θS))ᵀ ∼ Normal(Hβ, Σ + σε² I),

where θs and φs are the s-th simulated values for θ and φ, respectively; σε² is the
residual variance from the regression construction; H is a design matrix

H = ⎛ 1  φ11  ⋯  φ1Qφ ⎞
    ⎜ 1  φ21  ⋯  φ2Qφ ⎟
    ⎜ ⋮           ⋮   ⎟
    ⎝ 1  φS1  ⋯  φSQφ ⎠ ;
β is the vector of regression coefficients describing the linear relationship between the
important parameters φ and the conditional expectation of the utilities (net benefits).
This means that the mean utility value is based on a linear regression of φ. However,
a GP is more flexible than linear regression due to the covariance matrix Σ, which
is determined by a covariance function C, a matrix operator whose elements C(r, s)
describe the covariance between any two points Ut (θr ) and Ut (θ s ).
Strong and Oakley’s original formulation uses a squared exponential covariance
function CExp, defined by

CExp(r, s) = σ² exp[ −Σq=1…Qφ (φrq − φsq)² / δq ],
where φrq and φsq are the r-th and the s-th simulated values of the q-th parameter in
φ, respectively. For this covariance function, σ² is the GP marginal variance and δq
defines the “smoothness” of the relationship between two utility values with “similar”
values for φq . For high values of δq the correlation between the two conditional
expectations with similar values for φq is small. Therefore, the function gt (φ) is a
very rough (variable) function of φq . The δq values are treated as hyperparameters
to be estimated from the data.
This model includes 2Q φ + 3 hyperparameters: the Q φ + 1 regression coeffi-
cients β, the Q φ “smoothness” parameters δ = (δ1 , . . . , δ Q φ ), the marginal standard
deviation of the GP σ and the residual error σε of (4.4). The multivariate normal
structure allows the use of numerical optimisation to find the maximum a posteriori
estimates for these hyperparameters. This, however, implies a large computational
cost [11]—technically, the reason is the necessity to invert a large, dense matrix
(that is, a matrix full of non-zero entries). This means that, while extremely flexible
because there is a specific parameter δq for each of the elements in φ, the resulting
estimation can be very computationally intensive.
Since the computational time is linked to the number of grid points, using fewer grid
points will decrease the computational time. The evppi function creates the grid
automatically, but this grid can be manipulated to reduce the computational time or
improve accuracy, as we show in Sect. 4.3.5.
This grid is only computationally efficient in two dimensions, so dimension reduc-
tion is used to obtain a suitable two-dimensional transformation of the parameters
to estimate the covariance matrix. This dimension reduction, known as Principal Fitted
Components, tests whether these two dimensions contain all the relevant information
about the utility values. Therefore, if this test indicates that more than two dimen-
sions are needed to contain all the information then this method can struggle and full
residual checking is required to check the estimation procedure, see Figs. 4.14 and
4.15.
Before showing how BCEA can be used to compute the EVPPI using INLA and
Stochastic Partial Differential Equations (SPDEs), we mention that in order to do
so, it is necessary to install the R packages INLA and splancs. As usual, this can be
done by typing the following commands into the R terminal:
> install.packages("splancs")
> install.packages("INLA", repos="https://www.math.ntnu.no/inla/R/stable")
Notice that since INLA is not stored on CRAN, it is necessary to instruct R about the
repository at which it is available (i.e. the argument repos in the call to install.packages).
If these two packages are not installed on the user’s machine, BCEA will produce a
message requesting their installation.
To explore the use of the evppi function, we revisit the Vaccine example. In order
to use the evppi function, the user must have access to the PSA samples (posterior
draws) for the parameter vector θ, as well as a BCEA object m. For this example, the PSA
samples of the original parametrisation of the model have been used, extracted
directly from the BUGS object, vaccine, in the Vaccine workspace provided with
BCEA. If the parameter simulations are available in Microsoft Excel or a similar
spreadsheet calculator, then these simulations can simply be imported into R, e.g.
using a .csv file.
If the user is working directly from the BUGS/JAGS output, a BCEA function
called CreateInputs is available to convert this output into input for the evppi function.
This function takes as argument the BUGS/JAGS object containing the MCMC
simulations and returns a matrix, with rows equal to the number of simulations and
columns equal to the number of parameters in θ, and a vector of parameter names.
The call to CreateInputs is presented below.
> inp <- CreateInputs(vaccine)
> names(inp)
[1] "mat" "parameters"
Fig. 4.12 Plot of the EVPI and the EVPPI for β1 and β2 for different willingness to pay values
The matrix of PSA simulations is saved as the object inp$mat, which is then used
directly as an input to the evppi function. If the PSA simulations are saved in a
.csv file, then this can be used directly in the function. The other object in the inp list,
inp$parameters, is a vector of the names of the parameters in this matrix. This vector,
or more usually sub-vectors, can also be used as an input to the evppi function. If
a .csv file is used, then the column headings from that file can be given as inputs
instead.
The basic syntax for calculating the EVPPI is as follows:
> evppi(parameter,input,he)
where parameter is a vector of values or column names for which the EVPPI is being
calculated, input is the matrix or data frame containing the parameter simulations
and he is a bcea object. For example, to calculate the EVPPI for the parameters β1
and β2 , using the default settings, in the Vaccine example the following command
would be used:
> EVPPI <- evppi(c("beta.1.","beta.2."),inp$mat,m)
As the evppi function can take some time to calculate the EVPPI, a read out of the
progress is printed. The most time-consuming part of this process is described as
Calculating fitted values for the GP regression. Depending on the complexity of the
problem and the number of simulations used, this can take minutes.
It is possible to plot the evppi object using an S3 method for objects of the evppi
class. This gives a graphic showing the EVPI and the EVPPI for all willingness to
pay values included in the m$k vector (Fig. 4.12), obtained by invoking the command
plot(EVPPI) in the R terminal.
In addition to this functionality, an evppi object contains objects that can be used
for further analysis; these can be explored by typing
> names(EVPPI)
which returns the following output:
[1] "evppi" "index" "k"
[4] "evi" "parameters" "time"
[7] "method" "fitted.costs" "fitted.effects"
These elements can be extracted for individual analysis and relate to the following
quantities.
• evppi: a vector giving the EVPPI value for each value of the willingness to pay
in m$k.
• index: a vector detailing the column numbers for the parameters of interest.
• k: a vector of the willingness to pay values for which the EVPPI and the EVPI
have been calculated.
• evi: a vector of the values of the EVPI for the different values of the willingness to
pay.
• parameters: a single character string detailing the parameters for which the EVPPI
has been calculated. This character string is used to create the legend for the
graphical output of the plot function.
• time: a list giving the computational time taken to calculate the different stages of
the EVPPI estimation procedure.
• method: the calculation method used for the EVPPI. This will be a list object (cf.
Sect. 4.3.5.1).
• fitted.costs: the estimated gt(φ) function for the costs, for the non-parametric
regression methods presented in Sect. 4.3.3.
• fitted.effects: as for the costs, the estimated gt(φ) function for the effects.
The most important element of this object is the evppi vector giving the EVPPI
values for the different willingness to pay values. In a standard analysis (using BCEA
default settings) this whole vector can be quite unwieldy. However, the following code
can be used to extract the EVPPI value for different willingness to pay thresholds,
specifically in this case 20 000 monetary units.
> EVPPI$evppi[which(EVPPI$k==20000)]
[1] 1.02189
Note that the costs and effects are fitted separately in the evppi function. Pre-
viously, the utility function Ut (θ) was used as the response for the non-parametric
regression. However, as the utility functions depend directly on the willingness to pay,
this would imply calculating the EVPPI 501 times in a standard BCEA procedure.
Therefore, to speed up computation the costs and effects are fitted separately and
then combined for each willingness to pay threshold. Another computational saving
is made by using incremental costs and effects as the response. This means that in
a health economic model with T decision options, BCEA fits 2(T − 1) regression
curves. In the process, BCEA provides a readout of the progress and presents the
computational time for each curve.
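The recombination step can be sketched as follows, for the simplest two-intervention case (the internal column layout of fitted.effects and fitted.costs, one column per incremental comparison, is an assumption here, not something documented in the text above):

```r
# Hedged sketch (two interventions): recombine the separately fitted
# incremental effects and costs from an evppi object into an EVPPI
# estimate for a single willingness to pay k.
k <- 20000
g <- k * EVPPI$fitted.effects[, 1] - EVPPI$fitted.costs[, 1]  # fitted incremental net benefit
evppi_k <- mean(pmax(g, 0)) - max(mean(g), 0)
```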
To demonstrate the evppi function in the multi-decision setting, we use the smoking
cessation example. Again, the simulations for each of the model parameters must be
loaded into the R workspace. These simulations are contained in the rjags object
smoking_output
in the Smoking dataset—this was created by running the MTC model of Sect. 2.4.1
in JAGS. The following code is used to extract the parameter simulations from the
rjags object:
> data(Smoking)
> inp<-CreateInputs(smoking_output)
In this example, there are 25 parameters in the smoking_output dataset. However,
some of these are simply constants used to simplify the calculation of the costs and
effectiveness measures. If these constant variables are used in the evppi function, then
the following errors will occur, for single and multi-parameter EVPPI respectively:
> # Single Parameter EVPPI
> EVPPI<-evppi(c(1),inp$mat,m)
Error in smooth.construct.cr.smooth.spec
(object$margin[[i]], dat, knt) :
d.1. has insufficient unique values to support 5 knots: reduce k.
> EVPPI<-evppi(c(2:3),inp$mat,m,h.value=0.0000005)
Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE
Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE
Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE
Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE
Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE
Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE
Calculating EVPPI
> plot(EVPPI)
For this example, we demonstrate the readout that should be expected from the
evppi function, which allows the user to track the progress of the calculation. Notice
that there are six separate readouts, indicating that the fitted values for the GP
regression using INLA/SPDE are found six separate times. This corresponds to the
costs and effects for the three different comparisons in the Smoking example, as
discussed in Sect. 4.3.3.4.
Figure 4.13 shows the EVPPI plot generated using the above code. Again, notice
that the two break-even points can be clearly seen for the EVPPI curve. Note that the
EVPPI is always dominated by the EVPI, but the difference between the two curves
does not remain constant across the different willingness to pay values.
The evppi function can take extra arguments to allow users to manipulate the underly-
ing estimation procedures. In general, these extra arguments fine tune the procedure
Fig. 4.13 Plot of the EVPI and the EVPPI for columns 2 and 3 in the smoking_output dataset for
different willingness to pay values
Fig. 4.14 The plot of residuals against fitted values for the Vaccine example for both costs and
effects
This can be done using the diag.evppi function with the additional argument
int=k, where k is the column of the m$delta.e matrix for which we wish to assess
the fit. For example, the call
> diag.evppi(EVPPI,m,diag="residuals",int=2)
would instruct BCEA to plot the regression fit for the second incremental costs and
effects.
This section relates to the more technical aspects of controlling the non-parametric
estimation methods. It is more advanced and assumes that the reader has read
Sect. 4.3.3 and specifically Sect. 4.3.3.3. It is not necessary to use these additional
fine-tuning procedures to produce EVPPI estimates, but in some more complicated
health economic models it may be necessary to improve the accuracy of the EVPPI
estimation using these more advanced manipulations.
Fig. 4.15 A QQ plot for the residuals in the Vaccine example for both costs and effects
To investigate the grid properties, the option plot=T should be set in the call to
evppi. This plots the grid for each function gt (φ) and also allows the user to save
the grid during the estimation procedure. Figure 4.16 gives a good example of a grid
used to approximate the EVPPI. The blue dots represent the PSA data whereas the
vertices of the triangles are the points where the surface is estimated.
In principle, the grids should be broadly circular with no visual outliers, i.e. blue
dots isolated from the other points by a blue boundary line. The inner section should
have dense triangles, with a boundary relatively tight to the blue data points. The
outer section should be spaced away from the blue data points and can have larger
triangles. Both of these sections are encased by blue boundary lines.
The closer these boundaries sit to the data, the faster the computation as there
are fewer grid points. However, the INLA method fixes the value of the surface at
the outer boundary meaning that boundaries close to the data can skew the results.
The inner (outer) boundary can be controlled using the argument convex.inner=a
(convex.outer=a’), where a (a’) is a negative value, typically between −0.2 and
−0.6 for the inner and −0.5 and −1 for the outer boundary. Notice that the value a’
should be more negative than a, as more negative values indicate that the boundary
will be further from the points of interest.
Technically, these negative values define the minimum radius of the curvature
of the boundary, defined as a fraction of the radius of the points. This means that,
if convex.inner=-0.1 then the minimum curvature of the boundary has radius equal
to one-tenth of the radius of the points, giving a boundary that hugs the points of
interest quite tightly. As this is decreased to more negative values, the boundary
is constrained to be further from the points of interest. Incidentally, Fig. 4.16 uses
values of −0.3 and −0.7 respectively for the two boundaries.
The density of the points can also be controlled with the argument cutoff=b and
max.edge=c, where b can typically be between 0.1 and 0.5 and c between 0.5 and
1.2. These values simply define the minimum and maximum (absolute) difference
between the points in this grid. Small values increase the density and larger values
decrease it, with the computation time varying inversely to the density of the grid
points.
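Putting the boundary and density arguments just described together, a call might look like the following (the argument names are those given in the text; the numerical values are purely illustrative):

```r
# Sketch: fine-tuning the INLA/SPDE mesh used by evppi.
EVPPI <- evppi(c(2, 3), inp$mat, m,
               convex.inner = -0.3, convex.outer = -0.7,  # boundaries
               cutoff = 0.3, max.edge = 0.8,              # point density
               plot = TRUE)                               # show the grid
```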
130 4 Probabilistic Sensitivity Analysis Using BCEA
Clearly, this increases the number of regression parameters from Qφ + 1 to
Qφ + 1 + Qφ(Qφ − 1)/2, where the extra Qφ(Qφ − 1)/2 terms are the pairwise
interactions, giving an increase in computational time.
In a multi-decision case, it may be advisable to only use interactions for the curve
where the issues occurred, in this case the first incremental effects, to avoid a large
increase in computational time. This is achieved using the list environment in R. The
4.3 Value of Information Analysis 131
3000
2
2000
1000
Residuals
Residuals
0
0
−2
−4
−2000
−1000 1000 −4 0 2 4 6
Fitted values Fitted values
Fig. 4.17 The fitted against residuals for an EVPPI estimation procedure with a high-dimensional
reduction needed to capture all the information in φ
first element in the list is the interaction levels for the effects and the second is the
interaction levels for the costs. So in this example, to use second-order interactions
only for the first incremental effects, the following code would be used:
> interactions <- list(c(2,1,1),c(1,1,1))
> EVPPI <- evppi(1:3,inp$mat,m,int.ord=interactions)
Once the EVPPI has been fitted with interactions, we must reassess whether
the EVPPI has been correctly estimated by inspecting the residuals. If this has not
improved the fit significantly then either the interaction levels can be increased further
or the non-default GP method should be used. However, both these strategies have a
greater computational cost.
EVPPI Using Non-default GP Regression
To use non-default GP regression for all regression fits, e.g. the costs and effects for
all incremental decisions, the argument method="GP" must be included in the call
to the evppi function.
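For example, with inp and m as before:

```r
> EVPPI <- evppi(1:3, inp$mat, m, method="GP")
```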
As discussed in Sect. 4.3.3.2, the hyperparameters for the non-default regression
are estimated using numerical methods. This involves inverting a square matrix with
the same number of rows and columns as the PSA simulations which is compu-
tationally expensive. This means the estimation of the hyperparameters can take a
substantial time.
Therefore, the default for this method is to use 500 PSA simulations to calculate
the hyperparameters. This can be increased, to improve accuracy, using the argument
n.sims=S, where S is larger than 500. The computational speed will also be increased
by reducing this number, although clearly this affects accuracy. It is important to note
that matrix inversion has an S³ computational cost, implying that doubling the number of
PSA simulations used roughly increases the estimation time eightfold.

Throughout this section it is clear that a trade-off between computational speed and
accuracy must be made. For simpler examples, the default fast GP method will be
suitable but as the examples become more complex, as seen by the residuals, it may
be necessary to increase computation time to get more accurate results.
As the evppi procedure fits several regression curves to model all incremental
costs and effects, it may be that some of the curves can be fitted with the faster
procedures, while the more complex curves may require more computational time.
To allow for this, the method argument can be given as a list object in R, similar to
the int.ord argument for including interactions.
Again, the first list element contains the methods that should be used for effects
and the second list element contains the methods for the costs. The fast GP regression
method is called "INLA". Therefore, using the same example as before, where the
first incremental effects were poorly fitted, the following code would be used:
> methods <- list(c("GP","INLA","INLA"), c("INLA","INLA","INLA"))
> EVPPI <- evppi(1:3,inp$mat,m,method=methods)
4.3 Value of Information Analysis
This strategy for EVPPI estimation demonstrates that while the default options for
the evppi function have been chosen as a trade-off between computational time and
accuracy, it is important to perform residual checking as part of an EVPPI procedure.
It is not recommended to use the evppi function to perform entirely “black-box”
estimation for the EVPPI as this can lead to misleading EVPPI estimates.
In this section we review two approximation methods that can be used to estimate
a single-parameter EVPPI, i.e. the case in which the set of important parameters
only contains one element and is thus indicated as a scalar, φ. As mentioned earlier,
these methods can be considered as deprecated since for such a simple setting, it is
possible to apply the GAM estimation which is accurate and efficient. Nevertheless,
we present these two older methods for completeness. For the same reason, they are
still included in the evppi function in BCEA.
The basic idea underlying the methods proposed almost at the same time and
independently by Sadatsafavi et al. [6] and Strong and Oakley [5] is that gt (φ) can
be estimated by noting that the optimal decision only changes a small number of
times, normally fewer than 3, when traversing all the possible values of φ. This means
that for almost all values of φ the optimal decision is the same for φ + ε and φ − ε,
for an arbitrarily small value ε. As an example, the optimal decision may only change
once, i.e. t = 1 for φ < m and t = 0 for φ ≥ m, for some value m. Since the interest
is in a single parameter φ, these change points are simply located on a line, rather
than on more complicated shapes in higher dimensions.
While the optimal decision remains constant, gt (φ) can be approximated by the
average observed utility value in that set. In other words, if the decision remains
constant in an interval Iφ = [φl ; φu ], then gt (φ) is well approximated by the average
observed utility values calculated using all the φ ∈ Iφ . If this process is performed
for all changes of decision that occur in the parameter (unidimensional) space, then
gt (φ) can be estimated by a small number of values, which can then be used to
compute the opportunity loss.
The problem with this theory is that we typically have no substantial information
about the values of φ which change the optimal decision. The two single-parameter
EVPPI estimation methods use different algorithms to determine these cut-off points:
Sadatsafavi et al. search for points where the optimal decision is most likely to
change. As the algorithm searches for these points, the number of expected decision
changes must be specified. Conversely, Strong and Oakley split the full PSA sample
into “small” sub-samples and thus assume that the decision remains constant within
these sub-samples. Therefore, gt (φ) is approximated simply by finding the maximum
observed utility within each of these samples. For this method, the user must specify
the number of sub-samples to be used.
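The logic of the Strong and Oakley approach can be sketched in a few lines of plain R. This is only an illustrative toy implementation (not the code used by BCEA, and the function name is ours): phi is a vector of S PSA simulations for the parameter of interest and U is an S × T matrix of the corresponding utility values.

```r
# Toy single-parameter EVPPI in the spirit of Strong and Oakley:
# sort the simulations by phi, split them into blocks, and assume the
# optimal decision is constant within each block
so.evppi.sketch <- function(phi, U, n.blocks) {
  S <- length(phi)
  blocks <- split(order(phi), cut(seq_len(S), n.blocks))
  # best average utility within each block (partial perfect information)
  perfect <- mean(sapply(blocks, function(b) max(colMeans(U[b, , drop=FALSE]))))
  # best average utility overall (current information)
  current <- max(colMeans(U))
  perfect - current
}
```

For instance, with phi spread around zero and utilities U = (0, phi), the sketch returns a positive EVPPI, since learning phi changes the optimal decision.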
It is possible to use these methods by simply adding the option method, set to
either "so" or "sad", to the call to evppi. This instructs BCEA to use the method of Strong
and Oakley or that of Sadatsafavi et al., respectively. In addition, the user must also
specify either the number of subsets, e.g. n.blocks=50 for Strong and Oakley, or the
number of decision changes, e.g. n.seps=1 for Sadatsafavi et al.
Notice that it is possible to specify more than one parameter for these methods;
in this case, BCEA will calculate the single-parameter EVPPI separately for all the
parameters considered. For example, if we specify that the relevant parameters are
β1 and β2 , the two single-parameter EVPPIs can be obtained using the code:
> EVPPI.so <- evppi(c("beta.1.","beta.2."),inp$mat,m,method="so",n.blocks=50)
> EVPPI.sad <- evppi(c("beta.1.","beta.2."),inp$mat,m,method="sad",n.seps=1)
In this particular case, we are specifying that the full range of PSA samples should
be split into 50 “blocks”, when using the option so. Similarly, we are also instructing
evppi that we are only expecting one change in the optimal decision (indicated by
the option n.seps=1, for method="sad").
Again, it is possible to extract individual EVPPI values and plot the EVPPI when
these methods have been used. The most important distinction from the default
methods is that these methods calculate the EVPPI for different parameters and thus
more than one vector of EVPPI values is stored in the object m$evppi. These can be
extracted using the $ notation in R, which allows the user to access the elements of
an object. For example, we could extract the value of the EVPPI for the parameter
β1 and willingness to pay of 20 000 monetary units obtained using the Strong and
Oakley algorithm by typing
> EVPPI.so$evppi$`beta.1.`[which(m$k==20000)]
Similarly, the value of the EVPPI for β2 obtained using the method of Sadatsafavi
et al. is accessed using the code
> EVPPI.sad$evppi$`beta.2.`[which(m$k==20000)]
Despite their slightly different nature, because the resulting objects are still in the
class EVPPI, the plot method is still available and thus we can simply visualise the
results by typing
> plot(EVPPI.so)
> plot(EVPPI.sad)
Figures 4.18 and 4.19 show a visual indication of which parameter has a higher
impact on decision uncertainty. This ordering should stay constant for all willingness to
pay values.
Looking at these plots, we can observe some problems with the estimation under the
Sadatsafavi et al. method: rather than a smooth function reaching a sharp peak at the
change points, the EVPPI jumps around rather more frequently. It is also possible
to note that the results of the two single-parameter methods differ, suggesting that the
chosen method-specific inputs are incorrect. Information about how to choose these
inputs can be found in the original papers or in the review [11]. These methods for
choosing the inputs are relatively complex and are not implemented in BCEA, which makes
these single-parameter methods challenging to use accurately.

Fig. 4.18 EVPPI calculated using the Strong and Oakley method with 50 blocks for two parameters
of the Vaccine example: (1) EVPPI for beta.1.; (2) EVPPI for beta.2.
Another way in which the EVPPI can be used is to provide an “overall” assessment of
the impact of each single parameter on the decision uncertainty. To this aim, BCEA
has a specialised function info.rank which produces a plot of the univariate EVPPI
for all the parameters of interest (as specified by the user). While this is not ideal,
since correlation among parameters and model structure does have an impact on
the joint value of the EVPPI (which is not a linear combination of the individual
EVPPIs!), the Info-rank plot with all the model parameters ranked can be used as a
sort of Tornado diagram, a tool often used in deterministic sensitivity analysis [14].

Fig. 4.20 The Info-rank plot for the Vaccine example. Each bar quantifies the proportion of the
total EVPI associated with each of the parameters used as input
For example, in the case of the Vaccine case study, we can obtain the Info-rank
plot for all the underlying model parameters using the following commands.
> # Creates the object with the relevant inputs
> inp <- CreateInputs(vaccine)
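A sketch of the corresponding call, assuming that the vector inp$parameters contains the names of all the model parameters (this exact call is an assumption, not shown in the original text):

```r
> # Produces the Info-rank plot for all parameters in the Vaccine model
> info.rank(parameters=inp$parameters, input=inp$mat, m)
```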
However, it can be shown that the EVPPI of a set of parameters must be at least
as big as the individual EVPPI values. Therefore, parameters with high individual
EVPPI will always result in a joint parameter subset with high value. However, nothing can
be said about parameters with small individual EVPPI values, especially in decision
tree models, which are typically multiplicative in the parameters. This means that
learning the value of one of these parameters has little value as the other elements
remain uncertain. However, learning all the parameters can greatly decrease decision
uncertainty and therefore has large value to the decision-maker. Nonetheless, the
Info-rank plot gives an overview, which is perhaps useful (in conjunction with expert
knowledge about the model parameters) to drive the selection of the subset φ to be
included in the full analysis of the EVPPI.
In addition to the graph, the function info.rank automatically prints the complete
ranking for the parameters selected. This is also stored in an element $rank. The
user can control the vector of parameters to be ranked, given as either a vector of
strings giving the parameter names or a numeric vector, corresponding to the column
numbers of important parameters. For example, the code
> info.rank(parameters=c("beta.6.","gamma.2.","eta"),input=inp$mat,m)
> info.rank(parameters=c(43,48,46),input=inp$mat,m)
produce the same output, since the values 43, 48 and 46 are the indexes associ-
ated with the positions of the parameters beta.6., gamma.2. and eta in the vector
inp$parameters. The simulations for all the parameters in the model need to be stored
in a matrix, in this example inp$mat. As mentioned earlier, this may be created using
the utility function CreateInputs, or imported from a spreadsheet.
The user can also select a specific value for the willingness to pay for which the
Info-rank is computed and plotted by setting the argument wtp=value. The element
value must be one of the elements of the willingness to pay grid, i.e. the values
stored in m$k. The default value is determined as the break-even point from the
BCEA object containing the economic evaluation. It is important to note that the
ranking will often change substantially as a function of the willingness to pay.
Additional options include graphical parameters that the user can specify. For
example, the argument xlim specifies the limits of the x-axis; the argument ca
determines the font size for the axis label (with default value set at 0.7 of the full
size); cn indicates the font size for the vector of parameter names (again with default at
0.7 of the full size); mai specifies the margins of the graph (default = c(1.36,1.5,1,1));
and finally rel is a logical argument that specifies whether the ratio of EVPPI to EVPI
(which can be obtained by default as rel=TRUE) or the absolute value of the EVPPI
(rel=FALSE) should be used for the analysis.
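For instance, these options might be combined as follows (a sketch; the chosen values are arbitrary, and the parameter indexes are those used in the example above):

```r
> info.rank(parameters=c(43,48,46), input=inp$mat, m, wtp=20000,
+           rel=FALSE, ca=0.8, cn=0.8)
```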
This section is concerned with testing the sensitivity of the cost-effectiveness analysis
to the underlying model and health economic modelling assumptions.
Ū = Σ_{t=0}^{T} q_t U^t = q_0 U^0 + q_1 U^1 + · · · + q_T U^T

with q_t ≥ 0 ∀ t ∈ {0, . . . , T} and Σ_t q_t = 1. For each intervention t, the quantity q_t
represents its market share and U^t its expected utility. The resulting quantity Ū can
be easily compared with the “optimal” expected utility U ∗ to evaluate the potential
losses induced by the different market composition. In other terms, the expected
utility for the chosen market scenario is the weighted average of the expected utility
of all treatment options t with the respective market share qt as weights.
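As a small numerical illustration of this weighted average (the utility values below are made up for the example):

```r
# Hypothetical expected utilities for four interventions t = 0,...,3
U <- c(-110, -105, -100, -95)
q <- c(0.4, 0.3, 0.2, 0.1)   # market shares, with sum(q) = 1
U.bar <- sum(q * U)           # mixed-market expected utility: -105
U.star <- max(U)              # "optimal" expected utility: -95
U.star - U.bar                # potential loss induced by the market mix: 10
```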
Let us assume that the market shares are driven by the willingness of the indi-
viduals to start therapies. In a possible scenario, the majority of the patients prefer
quitting smoking without any help, and the other patients are less willing to undergo
therapies as they become more costly. We can therefore imagine in this scenario that
40% of the individuals are in the no treatment group, 30% in the self-help group,
and only 20 and 10% seek individual and group counselling respectively. The market
shares vector is then defined as
> mkt.shares = c(0.4, 0.3, 0.2, 0.1)
To produce the results for the mixed analysis, it is necessary to execute the
mixedAn function, which creates an object ma of class mixedAn.
> ma = mixedAn(m, mkt.shares)
The resulting ma object can be used in two ways:
• To extract the point estimate of the loss in EVPI for a given willingness to pay value,
by comparing the “optimal” situation, in which the most cost-effective comparator
t is adopted (such that its associated market share q_t is equal to one), with the mixed strategy;
• To produce a graphical analysis of the behaviour of the EVPI over a range of
willingness to pay values for the mixed strategy and “optimal” choice.
The first item can be obtained using the summary function. Since this is an S3
function for the objects of class mixedAn, the relative help page can be found by
executing the ?summary.mixedAn function from the R console. For example, the
point estimate for the loss in EVPI for a willingness to pay of 250 per QALY gained
can be obtained with:
> summary(ma, wtp=250)
The two EVPI curves, for the optimal and mixed-treatment scenarios together,
can be plotted using the plot function. Both the base graphics and ggplot2 versions of
the plot are available, and can be selected by specifying the argument graph="base"
(or "b" for short, the default) or graph="ggplot2" (or "g" for short). The function
accepts other arguments, analogously to the previously presented functions, to tweak
the appearance of the graph in addition to the choice of the plotting library to be
used (the help page can be opened by running ?plot.mixedAn in the R console). The
y-axis limits can be selected by passing a two-element vector to the argument y.limits
(defaults to NULL), while the legend position can be chosen using pos, consistently
with other functions such as ceplane.plot. The output of the function for the code
shown below is presented in Fig. 4.21.

Fig. 4.21 The figure represents the values of the expected value of perfect information under the
“optimal strategy” and “mixed strategy” scenarios for varying willingness to pay thresholds. It is
clearly shown that in this case the EVPI for the mixed strategy is always greater than for the optimal
strategy, due to the sub-optimality of the market shares leading to higher values of the opportunity
loss
> plot(ma, graph="g", pos="b")
Figure 4.21 shows the behaviour of the EVPI over willingness to pay values
ranging from £0 to 500. The EVPI for the mixed strategy is greater than the opti-
mal strategy scenario, in which the whole of the market share is allocated to the
intervention with the highest utility at any given willingness to pay value. The gap
between the two curves is wider for the two extreme points of the chosen interval of
WTP values, while they are relatively close in the interval between £150 and £250 per
life year gained. This is because in this interval the decision uncertainty is highest,
which decreases the opportunity loss differential between the two scenarios. Clearly, when
the optimal strategy is uncertain all strategies are similar, and therefore using a
mixed strategy is close to the optimal strategy.
4.4 PSA Applied to Model Assumptions and Structural Uncertainty
In all the analyses presented so far, the utility for the decision-maker has always been
assumed to be described by the monetary net benefit (see Sect. 1.3). This assumption
imposes a form of risk neutrality on the decision-maker, which might not be always
reasonable [2]. A scenario considering risk aversion explicitly, with different risk
aversion profiles, can be implemented by extending the form of the utility function
[3]. One of the possible ways to include the risk aversion in the decision problem is
to re-define the utility function (1.5) as:
u(b, r) = (1/r) [1 − exp(−r b)]
where the parameter r > 0 represents the risk aversion attributed to the decision-
maker. The higher the value of r , the more risk-averse the decision-maker is consid-
ered to be, where b := ke − c is the monetary net benefit.3
It is not usually possible to make the degree of risk aversion explicit, as it is
unlikely to be known. Therefore, an analysis using different risk aversion scenarios
can be carried out to analyse the decision-making process as a function of r .
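The risk-averse utility above is easily written as an R function; a minimal sketch, which also shows that risk neutrality is recovered as r approaches 0 (since u(b, r) → b):

```r
# Risk-adjusted utility of a net benefit b for risk aversion parameter r > 0
u <- function(b, r) (1 - exp(-r * b)) / r
u(10, 1e-10)   # approximately 10: essentially risk-neutral
u(10, 0.035)   # less than 10: a risk-averse decision-maker discounts the same benefit
```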
The CEriskav function provided in BCEA can be used to perform health economic
analysis in the presence of risk aversion. In a similar manner to the mixed strategy
analysis presented in Sect. 4.4.1, a new object is created after the user specifies the
risk aversion parameter r . This can be a single positive number or a vector of possible
values denoting the risk aversion profile of the decision-maker. A vector r can be
defined and fed into the function together with the bcea object.
To perform the analysis with risk aversion we will use the Vaccine example. We will
assume that the object m of class bcea, produced by the initial call to the function
bcea and containing the results of the base case analysis, is available in the current
workspace. To assess the robustness of the results to variations in risk aversion, we
define a vector r containing the risk aversion parameters to be used in the different
scenarios:
> r <- c(0.0000000001,0.005,0.020,0.035)
³ It should be noted that, as a result, the known-distribution utility function assumes a
complex form. For a more complete discussion please see [2].
Once the r vector is defined, the risk aversion analysis can be run using the function
CEriskav. This will create an object of class CEriskav assigned to the object cr:
> cr <- CEriskav(m,r=r,comparison=1)
The objects of class CEriskav can be used to produce the expected incremental
benefit and expected value of perfect information plots in BCEA. To do so it is
sufficient to call the plot function with an object of class CEriskav as argument. This
will produce the output displayed in Fig. 4.22:
> plot(cr)
Calling the plot function with a CEriskav object (i.e. the plot.CEriskav function)
will produce the two graphs in two separate graphical windows by default. This
creates a new graphical device for the second plot (the EVPI). This behaviour can be
modified by passing additional arguments to the plotting function. If the argument
plot="ask" is added, the "ask" option in the graphical parameters list par will be
temporarily set to TRUE. In this way the second plot will be displayed in the
active device after the user presses the Return key in an interactive
R session, or after a call to readline() in non-interactive sessions.
Objects of class CEriskav such as cr contain several subsettable named elements:
• Ur: a four-dimensional numeric matrix containing the utility value for each simu-
lation, willingness to pay value (WTP) contained in the approximation grid vector
k, intervention and risk aversion parameter included in the vector r;
• Urstar: a three-dimensional numeric matrix containing the highest utility value
among the comparators for each simulation, willingness to pay value in the approx-
imation grid and risk aversion parameter;
• IBr: the incremental benefit matrices for all risk aversion scenarios, built as three-
dimensional matrices over the simulation, WTP values and values defined in r;
• eibr: the EIB for each pair of WTP and risk aversion values, i.e. the average of IBr
over the simulations;
• vir: the value of information, in a multi-dimensional matrix with the same structure
as IBr;
• evir: the expected value of information obtained averaging vir over the simulations;
• R: the number of risk aversion scenarios assessed;
• r: the input vector r , containing the risk aversion parameters passed to the CEriskav
function;
• k: the grid approximation of the interval of willingness to pay values taken from
the bcea object given as argument to CEriskav, in this case m.
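As an illustration, individual values can be extracted from these elements with standard R indexing. The exact indexing below is a sketch and assumes that eibr is arranged with willingness to pay values on the rows and risk aversion scenarios on the columns:

```r
> # EIB at a willingness to pay of 20000 under the second risk aversion
> # scenario (r = 0.005) -- hypothetical indexing
> cr$eibr[which(cr$k == 20000), 2]
```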
Fig. 4.22 The output of the plot function for the risk aversion analysis. The figures show the effect
of different risk aversion scenarios on the expected incremental benefit (EIB) and the expected value
of perfect information (EVPI), respectively at the top and bottom of the figure. It can be easily
noticed that the EIB departs from linearity and the decision uncertainty represented in the EVPI
grows with increasing aversion to risk

All the sensitivity analyses presented so far are based on the premise that the only way
in which uncertainty affects the results of the economic evaluation is either through
sampling variability for the observed data or epistemic uncertainty in the parameters
that populate the model. In other words, the economic assessment is performed
conditional on the model selected to describe the underlying phenomenon.
However, almost invariably the model structure, i.e. the set of assumptions and
alleged relationships among the quantities involved, is an approximation to a complex
reality. As Box and Draper state: “all models are wrong, but some are useful” [15].
From a hard-core Bayesian perspective, the ideal solution would be to extend the
modelling to include a prior distribution over a set of H finite, mutually exclusive and
exhaustive potential model structures M = (M1 , M2 , . . . , M H ). In principle, these
may be characterised by some common features (e.g. a Markov model to represent
the natural history of a disease) and each could have slightly different assumptions;
for example, M1 may assume a logNormal distribution for some relevant cost, while
M2 may assume a Gamma.
In this case, the data would update all the prior distributions (over the parameters
within each model and over the distribution of models) so as to determine a poste-
rior probability that each of the Mh is the “best” model (i.e. the one that is most
supported by the data). It would be possible then to either discard the options with
too small posterior probabilities or even build some form of model average, where
the weights associated with each structure are these posterior probabilities. Leaving
aside the technical difficulties with this strategy, the main problem is the underlying
assumption that we are able to fully specify all the possible alternative models. This
is rarely possible in practical scenarios.
One possible way to overcome this issue is to aim for a less ambitious objective—
the basic idea is to formalise a relatively small set of possible models and compare
them in terms of their out-of-sample predictions. This quantifies how well the pre-
dictive distribution for a given model would fit a replicated dataset based on the
observed data. Notice that, especially in health economic evaluations, the possible
models considered are merely a (rough) approximation to the complex phenomenon
under study, so there is unlikely to be a guarantee that any of these models should
be the “true” one.
Under this strategy, it is therefore necessary to determine a measure of good-
ness of fit that can be associated with each of the models being compared. Within
the Bayesian framework, one convenient choice is to use the Deviance Information
Criterion (DIC) [16]. A technical discussion of this quantity and its limitations are
beyond the objectives of this book—for a brief discussion see [2]. Nevertheless, the
intuition behind it is that it represents a measure of model fit based on (a function
of) the likelihood D(θ) = −2 log p(y | θ) and a term p D , which is used to penalise
model complexity. The reason for the inclusion of the penalty is that models contain-
ing too many parameters will tend to overfit the observed data, i.e. do particularly
well in terms of explaining the realised dataset, but are likely to perform poorly on
other occurrences of similar data.
From the technical point of view, it is fairly easy to compute the DIC as a by-
product of the MCMC procedure. Structural PSA can then be performed using the
following steps.
w_h = exp(−0.5 ΔDIC_h) / Σ_{h=1}^{H} exp(−0.5 ΔDIC_h),    (4.6)
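The weights in (4.6) are straightforward to compute once the DICs are available; a sketch with made-up DIC values for H = 2 models:

```r
# Hypothetical DICs for two competing model structures
DIC <- c(25.3, 27.1)
dDIC <- DIC - min(DIC)                          # Delta-DIC for each model
w <- exp(-0.5 * dDIC) / sum(exp(-0.5 * dDIC))
w   # weights sum to 1; the better-fitting model gets the larger weight
```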
Fig. 4.23 Graphical representation of the Chemotherapy model, in terms of a decision tree
We also assume that some information, e.g. from registries, is present to inform
the prior distribution for the cost of ambulatory and hospital care and that this
can be encoded in the following form: camb ∼ logNormal(4.77, 0.17) and chosp ∼
logNormal(8.60, 0.17). These imply that we are expecting them to realistically vary
in the intervals (£85; £165) and (£3813; £7738), respectively. Moreover, we know
that the drugs have fixed costs c0^drug = 110 and c1^drug = 520.
Arguably, the crucial parameter in this model is the reduction in the probability
of side effects ρ. We assume that only limited evidence is available on the actual
effectiveness for t = 1 and thus consider two somewhat contrasting scenarios. In
the first one, we model ρ ∼ Normal(0.8, 0.2), to express the belief that the new
chemotherapy is on average 20% more effective than the standard of care, with some
relatively limited variability around this estimate. The second scenario assumes a
“sceptical” view and models ρ ∼ Normal(1, 0.2). This implies that on average the
new chemotherapy is no better than the status quo, while allowing for uncertainty
in this assessment. Figure 4.24 shows a graphical representation of these two prior
distributions—panel (a) shows the “enthusiastic” prior, while panel (b) depicts the
“sceptical” prior.
The model assumptions can be encoded in the following BUGS/JAGS code (notice
that again we define the two treatment arms as t = 1, 2 for the standard of care and
the new drug, respectively).
model {
# Observed data
for (t in 1:2) {
SE[t] ~ dbin(pi[t],n[t])
A[t] ~ dbin(gamma,SE[t])
}
}
}

Fig. 4.24 Prior assumptions on the reduction factor for the chance of side effects, ρ. Panel a assumes
an “enthusiastic” prior, where the new chemotherapy is assumed to be on average better than the
standard of care, while allowing for the possibility that this is actually not the case; panel b
shows a “sceptical” prior, under which the new chemotherapy is assumed to be on average just as
effective as the standard of care
We allow for the two different formulations of the model by passing two sets of
values for the parameter m.rho (set to 0.8 and 1 in the two cases). In this particular
case, we are assuming that the standard deviation sigma.rho does not vary in the two
scenarios, but of course this assumption could (and perhaps should) be relaxed.
The two models can be run using the R2jags package and the results may be stored
in R objects, say chemo_enth and chemo_scep, both of class rjags and containing
the results of the MCMC simulations.⁴
> # Loads the model results from the BCEA website
> library(BCEA)
> load(url("http://www.statistica.it/gianluca/BCEA/chemo_PSA.Rdata"))
format as the standard call to the function bcea and allow the user to specify the
reference intervention and a vector of labels.
The resulting object m_avg is a list, whose element can be explored by typing the
following commands
> # Lists the elements of the object m_avg
> names(m_avg)
[1] "he"  "w"   "DIC"
Fig. 4.25 Comparison of the three models. Panel a shows the cost-effectiveness plane for the
“enthusiastic” model, while panel b considers the “sceptical” model and panel c depicts the model
average result. In this case, the contour in panel c shows lower variability as it is more tightly centred
around the mean (i.e. the estimated ICER)
The figures can be obtained by typing the following commands to the R terminal.
> # Plots the CEACs
> plot(m$he$k,m$he$ceac,t="l",lwd=2,xlab="Willingness to pay",
+   ylab="Probability of cost effectiveness")
> points(m_enth$k,m_enth$ceac,t="l",lty=2)
> points(m_scep$k,m_scep$ceac,t="l",lty=3)
> legend("bottomright",c("Model average","Enthusiastic model","Sceptical model"),
+   bty="n",lty=c(1,2,3),cex=.8)
Fig. 4.26 Comparison of the three models. Panel a shows the CEACs computed for the “enthu-
siastic” (dashed line), “sceptical” (dotted line) and the model average (solid line), while panel b
shows the analysis of the expected value of information in the three cases. In this case, the model
average is also associated with lower impact of uncertainty in the final decision, as indicated by the
higher value of the CEAC (for all k considered) as well as the lower value of the EVPI
Chapter 5
BCEAweb: A User-Friendly Web-App
to Use BCEA
5.1 Introduction
In this chapter, we introduce BCEAweb, a web interface for BCEA. BCEAweb is a web
application aimed at everyone who does not use R to develop economic models
and wants a user-friendly way to analyse both the assumptions and the results of
a health economic evaluation. The results of any probabilistic model can be very
easily imported into the web-app, and the outcomes are analysed using a wide array
of standardised functions. The chapter introduces the main functions of BCEAweb
and shows how its capabilities can be used to produce summaries of the results,
tables and graphs.
The interface allows the user to produce a huge array of analysis outputs, both
in tabular and graphical form, in a familiar environment such as a web page. The
only inputs needed are the outputs of a probabilistic health economic model, be
it frequentist (based, for example, on bootstrapping techniques) or fully Bayesian.
It also includes functionalities to produce a full report and analyse the inputs of a
probabilistic analysis to test the distributional assumptions.
Throughout this chapter, we will make use of the two examples used so far to show
the functionalities of BCEA. The Vaccine and Smoking Cessation models introduced
in the previous chapters will be used to demonstrate how BCEAweb works in practice.
The vast majority of health economics models are built in MS Excel. This is because
the users of these models are familiar with the software, and it is accepted by virtually
all health authorities and decision makers across the world, including the National
Institute for Health and Care Excellence (NICE), the Pharmaceutical Benefits Advisory
Committee (PBAC) and the Canadian Agency for Drugs and Technologies in
Health (CADTH). While models programmed in Excel are presented in a familiar,
user-friendly fashion, the software itself can prove to be a limitation when building
complex models. The intricate wiring and referencing style of these models is often the
cause of programming errors, and very often models rely on Visual Basic for
Applications (VBA) for Excel for complex procedures.

(This chapter was written by Gianluca Baio, Andrea Berardi, Anna Heath and Polina Hadjipanayiotou.)
We acknowledge that Excel models are usually sufficient to demonstrate the
value for money of new (or existing) technologies. However, often the presentation
of the results is lacking, and the calculation of more complex quantities is inefficient
(e.g. the CEAC for multiple comparators) or not feasible at all (e.g. the EVPPI), mostly
because VBA lacks the mathematical or statistical capabilities and flexibility of a
language such as R. BCEAweb is aimed at researchers and health economists who would
like to expand the scope of their analyses without re-building models from scratch,
programming additional analyses of the outputs and, perhaps most importantly, doing
so without using R.
The main objective of BCEAweb is to make all the functionalities included in BCEA
available and easy to use for everyone, without writing a single line of code. The
programme was inspired by the Sheffield Accelerated Value of Information (SAVI)
web-app [1], which can be accessed at the webpage http://savi.shef.ac.uk/SAVI/.
The focus of SAVI is the research on the methods of calculation of the expected
value of (partial) information developed by Strong and colleagues (and presented
in Sect. 4.3.3.2), and it also offers facilities to calculate cost-effectiveness summary
measures. On the other hand, the purpose of BCEAweb is to offer an easy and stan-
dardised way to produce outputs from a health economic evaluation, with the EVPPI
among them. Notably, BCEAweb also includes an EVPPI calculation method which
is faster than the one implemented in SAVI for multidimensional problems, based
on the work on the EVPPI presented in [2].
The strength of BCEAweb is that it allows many different input formats, includ-
ing csv files obtained from spreadsheet calculators (e.g. MS Excel), OpenBUGS/
JAGS and R itself. The outputs can be saved either individually from the web-app, or
by exporting a modular summary in Word or pdf format, which also includes brief
and flexible interpretations of the results. The report is modular, allowing users to
choose the sections to be included.
BCEAweb is built using the R package shiny, and is composed of two main scripts:
server.R and ui.R. The first includes all the functions and R commands to be run,
while the latter contains the code building the user interface, controlling the layout
and appearance of the web application. These are managed server-side, producing a
web page relying on HTML, CSS and Javascript without the programmer having to
write in languages other than R. The web-app also relies on further files, used in
functions such as report generation, which are stored and accessed only on the server.
On the client side, a modern web browser supporting Javascript is capable of
displaying the web-app. When accessing it through the internet, all the calculations
are performed by the server, so that even the more demanding operations do not rely
on the user's device. The application can also be run locally, when a connection
is not available or when potentially sensitive data cannot be shared on the internet,
but obviously in this case the execution is performed on the user's machine.
Any modern browser is able to display BCEAweb: both remotely, accessing it on
the internet, and locally, by running it through R. The R package is needed only
if running the web-app locally, while an internet connection is required to access
BCEAweb remotely.
As already mentioned, the output of an economic model can be imported easily, and
in different formats, into BCEAweb. The functionalities of the web-app require two
different sets of inputs: the simulated (or sampled) values of the parameters, used to
test the distributional assumptions of the PSA, and the PSA results, i.e. the costs and
health effects resulting from each set of simulated (or sampled) parameter values,
used to summarise and produce the results of the probabilistic analysis.
The latter set of values is generally saved by all health economic models, as it is
needed to produce tools such as the cost-effectiveness plane. However, the sampled
or simulated parameter values are not always saved in Excel models, as they are
usually discarded. Therefore,
to make use of the tools to check the parametric distributional assumptions and to
calculate the EVPPI and info-rank values, the analysts will need to make sure that
the economic model saves the values of both the simulated parameter values and
the PSA results. Details on how to format the inputs for the web-app are reported in
Sects. 5.2.4.1 and 5.2.5.
At a first look, BCEAweb seems to be a regular web page, and it actually is. The
welcome page, first displayed when accessing the web-app, is shown in Fig. 5.1.
This page provides details about what BCEAweb is, how to use it and how it fits
in the general process of a health economic evaluation. Many hyperlinks, coloured
Fig. 5.1 The landing page of BCEAweb provides information on the web-app and how to use it. The
buttons at the top of the page are used to navigate through the pages. The web-app is run locally
in the examples pictured throughout this chapter, and thus the address bar refers to the IP address
localhost resolves to
in orange, are included throughout the text. The tabs at the top of the page can be
clicked to navigate through the different sections of BCEAweb.
As shown in Fig. 5.1, BCEAweb is fundamentally divided into 6 main pages. Each
of them will be described in a section of this chapter. These can be easily accessed
from any page by clicking on the respective label in the navigation bar:
• Welcome: the landing page shown in Fig. 5.1. The welcome page includes expla-
nations about what BCEAweb is and does, and provides a basic usage guide;
• Check assumptions: this page allows the users to check the distributional assump-
tions underpinning the probabilistic sensitivity analysis;
• Economic analysis: where the model outcomes are uploaded and the main set-
tings regulated. It provides several tools for the analysis of the economic results.
It shows the results on a cost-effectiveness plane and includes other analyses such
as the EIB and the efficiency frontier;
• Probabilistic Sensitivity Analysis: calculates and shows results for tools
commonly used in the probabilistic sensitivity analysis, i.e. the pairwise and
multiple-comparison CEAC and CEAF;
• Value of information: allows the calculation of the EVPI, EVPPI and the info-
rank summary, based on the value of partial perfect information;
• Report: creates a standardised report with modular sections, which can be down-
loaded from this page.
The first tab of the web-app is called “Check Assumptions”. It is sufficient to click
on its name in the navigation bar to access it. On this page, the user can upload the
simulations of the parameters used in the iterations of the PSA from any probabilistic
form of an economic model. The functionalities are particularly useful to test for
violations of distributional assumptions, which might arise, for example, from
miscalculations of the distribution parameters in models with a very large number
of inputs. Rather than checking the values one by one, the empirical (i.e. observed)
distributions of the sampled values can be analysed to ensure they are correct. This
functionality extends BCEA and is not included in the
R package. The web-app presents the page shown in Fig. 5.2.
The drop-down menu at the top of the grey box on the left of the page allows the
selection of the preferred input data format. The data can be fed into the web-app
in three different formats:
• Spreadsheet, by saving the simulations in a csv (comma-separated values) file.
This is particularly useful if the simulations are produced in a spreadsheet pro-
gramme, such as MS Excel, LibreOffice or Google Sheets;
Fig. 5.2 The “Check assumptions” tab before importing any data. The left section is reserved for
inputs and parameters, while outputs are displayed on the right side of the page
• BUGS, if the values of the parameters were obtained in a software such as OpenBUGS
or JAGS. The user will be required to upload the index file and a number of CODA
files equal to the number of simulated Markov chains;
• R, by providing an RDS file containing the values of the simulations, obtained by
saving the output object of a BUGS programme run from R. The file needs to be
saved using the saveRDS function.
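As a minimal sketch of this route, the list returned by a BUGS run can be mocked and saved with saveRDS; the sims.matrix element and the parameter names below are illustrative assumptions rather than requirements stated in this section:

```r
# Sketch (assumptions flagged): mock a BUGS-style output list and save it
# as an RDS file of the kind BCEAweb reads. The 'sims.matrix' element
# mirrors the structure of R2OpenBUGS output; parameter names are
# hypothetical.
set.seed(1)
mock_bugs <- list(
  sims.matrix = cbind(rho   = rbeta(500, 2, 8),
                      c.amb = rlnorm(500, 5, 0.3))
)
out_file <- file.path(tempdir(), "parameters.rds")
saveRDS(mock_bugs, out_file)
```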
The data need to be saved on a file (or files, in the case of a CODA input from a BUGS
programme) which will be imported into BCEAweb. The content of the files needs to
be formatted in a standardised way so that the web-app can successfully import it.
The data formats for the files are described below.
The spreadsheet input form is the easiest way to provide inputs if the economic model
is programmed in a software such as MS Excel, as the parameters can be easily
exported in a csv file. The data need to be arranged in a matrix with parameters by
column and iterations by row, as shown in Fig. 5.3. The first row is reserved for the
parameter names.
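As an illustration, the layout in Fig. 5.3 can be produced directly from R; the parameter names and distributions below are purely hypothetical:

```r
# Sketch: export simulated parameter values in the layout expected by the
# "Check assumptions" tab -- parameters by column, iterations by row,
# with the parameter names in the first row. Names and distributions are
# hypothetical.
set.seed(2)
n_sim <- 1000
sims <- data.frame(rho   = rbeta(n_sim, 2, 8),    # e.g. a probability
                   c.amb = rlnorm(n_sim, 5, 0.3)) # e.g. a cost
csv_file <- file.path(tempdir(), "parameters.csv")
write.csv(sims, csv_file, row.names = FALSE)
```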
To import the values of simulations performed in R using a BUGS programme such
as OpenBUGS or JAGS, it is necessary to save the output object into an RDS file. The
output object (e.g. obtained from the jags function) needs to be of list mode, with
Fig. 5.3 The data format for csv files to be imported in the “Check assumptions” tab of BCEAweb.
The first row is dedicated to the names of the parameters, which will be used by the web-app to
populate the parameter menus
The page will display a loading bar showing the transfer status of the data. As soon
as the input files are imported, the web-app will immediately show a histogram of
the distribution of the first parameter in the dataset, together with a table reporting
summary statistics of the parameter distribution at the bottom of the chart, as depicted
in Fig. 5.4. The variables to be displayed can be picked from the menu on the left
side of the page, either by clicking on the menu and selecting one in the list or by
typing the parameter label in the box, which will activate the search function. The
width of the histogram bars can also be set by varying the number of bins using the
slider at the bottom of the left side of the page. A greater number of bins will increase
the number of bars (i.e. decrease the number of observations per bar), while a smaller
number of bins will widen the bars, reducing their number.
Trace plots can be shown for each parameter distribution by clicking on the respec-
tive button in the navigation bar. This functionality is not very useful when import-
ing data from a spreadsheet as it can only show whether there are any unexpected
sampling behaviours dependent on the iteration number. On the other hand if the
simulations are passed to BCEAweb in a BUGS or R format and the values are sampled
from multiple, parallel chains, these can be easily compared variable by variable
using the trace plots.
Additional tools are available if the input data are from multiple parallel chains and
imported in the web-app in the BUGS or R format. By choosing one of these two options
in the input format menu bar, additional navigation buttons will appear compared to
the spreadsheet input option. By clicking on them, additional diagnostic tools will
appear, useful for checking the convergence of Markov chains and the presence of
correlation: the Gelman-Rubin plots and an analysis of the effective sample size and
of the autocorrelation. These are not included when importing data from a spreadsheet
as generally electronic models programmed in MS Excel or similar programmes do
not make use of parallel sampling from multiple chains. These tools are not discussed
in detail here; interested readers can find a description of these and other diagnostics
in [4].
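As a rough illustration of what a trace plot shows, two mock parallel chains for a single (hypothetical) parameter can be overlaid in base R; drifting or non-overlapping lines would signal convergence problems:

```r
# Sketch: overlay two mock parallel chains for one hypothetical parameter,
# as a trace plot does. In BCEAweb the real sampled values are used.
set.seed(3)
chain1 <- cumsum(rnorm(500, sd = 0.1))  # mock sampled values, chain 1
chain2 <- cumsum(rnorm(500, sd = 0.1))  # mock sampled values, chain 2
matplot(cbind(chain1, chain2), type = "l", lty = 1, col = c(1, 2),
        xlab = "Iteration", ylab = "Sampled value")
```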
Fig. 5.4 The “Check assumptions” tab after data have been imported in a spreadsheet form.
Additional analysis tools are available when importing simulations in the R or BUGS format (not
shown here)
The basic analysis of the cost-effectiveness results is carried out in the “Economic
analysis” tab. On this page, the cost-effectiveness results are uploaded and the eco-
nomic analysis begins. Analogously to the “Check assumptions” tab, the user can
upload the results of a probabilistic model in three different formats: spreadsheet,
BUGS and R. The data need to be arranged as already explained in the previous section
for each of the available formats, with parameters by columns and iterations by row.
The model outputs need to be ordered so that the health outcomes and costs for each
intervention are alternated, as shown in Fig. 5.5.
The same data arrangement is required if supplying the values in the R format:
health outcomes and costs need to be provided, alternated for each of the included
comparisons, in a matrix or data.frame object. This object needs to be saved in
RDS format using the saveRDS function for the web-app to be able to read it. Health
outcomes and costs must be provided in the same alternated arrangement also if
using a BUGS programme; one file containing the values for each of the simulated
Markov chains and the CODA index file need to be uploaded onto the web-app.
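For illustration, the alternated arrangement in Fig. 5.5 can be built and exported from R as follows; the intervention names and distributions are hypothetical, and only the column ordering (effects then costs, per intervention) matters:

```r
# Sketch: PSA results arranged for the "Economic analysis" tab, with
# health effects and costs alternated for each intervention. Names and
# values are hypothetical.
set.seed(4)
n_sim <- 1000
psa <- data.frame(e.status.quo = rnorm(n_sim, 10.0, 1.0),
                  c.status.quo = rlnorm(n_sim, 6.0, 0.2),
                  e.vaccine    = rnorm(n_sim, 10.5, 1.0),
                  c.vaccine    = rlnorm(n_sim, 6.2, 0.2))
write.csv(psa, file.path(tempdir(), "psa.csv"), row.names = FALSE)  # spreadsheet route
saveRDS(as.matrix(psa), file.path(tempdir(), "psa.rds"))            # R route
```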
The “Economic analysis” page, displayed in Fig. 5.6, allows the user to choose
the parameters to be used by the underlying BCEA engine to analyse the results. On
the left side of the page the user can define the grid of willingness to pay (WTP)
thresholds used in the analysis by specifying the threshold range and the value of the
“step”, i.e. the increment from one grid value to the next. The three parameters are
Fig. 5.5 The data format for spreadsheet files to be uploaded in the “Economic analysis” tab.
The first row is reserved for variable names which are not used but included for consistency with
the input format of the “Check assumptions” tab
required to have values such that the difference between the maximum and minimum
threshold is divisible by the step. The default values of BCEA produce a 501-element
grid, which is generally fine enough to capture differences in the economic results
conditional on the WTP. Increasing the grid density by decreasing the step value has
detrimental effects on computational speed; therefore, users are advised to change
the grid density carefully. The cost-effectiveness threshold can also be set on this
page. This value acts as the cut-off for the decision analysis, determining whether
interventions are cost-effective by comparing the chosen threshold with the ICER
estimated in the economic analysis. This parameter is used in the functions
included in BCEAweb, such as the cost-effectiveness plane and the cost-effectiveness
summary.
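As a sketch of the grid defined by these three parameters, assuming the default range 0–50,000 with a step of 100 (which yields the 501-element grid mentioned above):

```r
# Sketch: the willingness-to-pay grid implied by (min, max, step). The
# difference between the maximum and minimum must be divisible by the step.
wtp.min <- 0
wtp.max <- 50000
step    <- 100
stopifnot((wtp.max - wtp.min) %% step == 0)  # the divisibility constraint
k <- seq(wtp.min, wtp.max, by = step)
length(k)  # 501 grid points
```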
Once the PSA data are uploaded to BCEAweb, additional options become available
on the left of the page. These are the names of the compared interventions, which
can be changed by the user, and the intervention chosen as reference. These options
match the ones used in the bcea function presented in Sect. 3.2. When the data are
uploaded and these additional options are set, the analysis is performed by clicking
on the “Run the analysis” grey button at the bottom of the page. The page will
display the cost-effectiveness summary, as shown in Fig. 5.6.
As soon as the analysis is run, BCEAweb produces the cost-effectiveness analysis
summary, the cost-effectiveness plane, the expected incremental benefit plot and the
cost-effectiveness efficiency frontier. These can be accessed by clicking on the sub-
navigation bar; each of them will display the respective standardised output produced
by BCEA:
Fig. 5.6 The “Economic Analysis” tab once the data have been uploaded and the model run. The
vaccine example was used to produce the results in the Figure
If the importing procedure has already been carried out, the results are available to
all tabs of the web-app, and there is no need to re-upload them. In this case the
“Probabilistic sensitivity analysis” page does not require any additional
input, and the graphs will be automatically displayed.
The “Probabilistic sensitivity analysis” page is structured in three sub-
sections. Each of them contains a different tool to analyse the probability of cost-
effectiveness of the compared interventions: the pairwise cost-effectiveness accept-
ability curve (CEAC), the multiple-comparison CEAC and the cost-effectiveness
acceptability frontier (CEAF). These tools are described in more detail in Sect. 4.2.2.
The plots are shown automatically if the data have already been imported into the
web-app (if not, these will have to be re-uploaded from the “Economic analysis”
tab). The page will appear as shown in Fig. 5.7. The three curves can be accessed
by clicking on the respective buttons in the navigation sub-menu on the top-left
corner of the page. The pairwise CEAC and the CEAF values can be downloaded
by clicking on the “Download CEAC table” (or “Download CEAF table” for the
frontier). This will let the user download a csv file including the values of the CEAC
(or CEAF). Additionally, the CEAC or CEAF estimates can be queried
for any given threshold value included in the grid approximation (specified in the
“Economic analysis” tab) directly in the web-app. Changing the wtp value in the
drop-down menu will return the respective probability of cost-effectiveness. However,
if multiple interventions are compared, as in Fig. 5.7, the CEAC value shown will
refer to the first curve plotted. In the case shown, the CEAC value displayed (i.e.
0.6600) indicates the probability of “Group counselling” being cost-effective when
compared to “No contact”.
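As a sketch of the quantity being queried, the pairwise CEAC at a given threshold is simply the proportion of simulations with a positive incremental net benefit; the incremental values below are mocked:

```r
# Sketch: the pairwise CEAC value for a chosen threshold k is the
# proportion of simulations in which the incremental net benefit,
# k * delta.e - delta.c, is positive. Increments here are hypothetical.
set.seed(5)
delta.e <- rnorm(1000, 0.01, 0.02)   # mock incremental effects
delta.c <- rnorm(1000, 150, 100)     # mock incremental costs
ceac.at <- function(k) mean(k * delta.e - delta.c > 0)
ceac.at(25000)  # probability of cost-effectiveness at this threshold
```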
Fig. 5.7 The “Probabilistic sensitivity analysis” tab, showing the cost-effectiveness acceptability
curves for the smoking cessation example
The “Value of information” tab is focused on three tools for the value of infor-
mation analysis:
• Expected value of perfect information, or EVPI: the monetary value (based on
the monetary net benefit utility function) attributable to removing all uncertainty
from the economic model (and thus from the decision process);
• Expected value of partial perfect information, or EVPPI: the value attributable
to a reduction in the uncertainty associated with a single parameter or a specific
set of parameters in the economic model;
• Info-rank plot: an extension of the tornado plot. This is useful to assess how
parameters contribute to the uncertainty in the model, by ordering them in
decreasing order of the ratio of the single-parameter EVPPI to the EVPI for a
given WTP threshold (shown in Fig. 5.8).
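As a sketch of how the ranking is constructed, given hypothetical single-parameter EVPPI values and an overall EVPI at a fixed threshold (all numbers below are made up for illustration):

```r
# Sketch: the Info-rank ordering -- parameters ranked by the share of the
# overall EVPI accounted for by their single-parameter EVPPI. All values
# are hypothetical.
evpi  <- 2.0                                            # overall EVPI
evppi <- c(psi.5 = 0.018, rho = 0.007, beta.1 = 0.002)  # per-parameter EVPPI
info.rank <- sort(evppi / evpi, decreasing = TRUE)
round(100 * info.rank, 2)  # percentage of EVPI attributable to each
```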
The EVPI tab works in a similar fashion to the CEAC, as it does not require
any other data than the PSA data imported into the web-app on the “Economic
analysis” tab. Analogously to the CEAC, the values can be explored by using the
drop-down menu in the page, as well as downloaded in a csv file by clicking on the
“Download EVPI table” button.
The “Info-rank” and “EVPPI” tabs require that the parameter simulations, uploaded
in the “Check assumptions” tab, and the probabilistic economic results, uploaded
in the “Economic analysis” tab, are available. If these are not available in BCEAweb,
it will not be possible to calculate the EVPPI and the info-rank table. Once both
the parameter values and economic results are correctly imported, the parameter
selection menus in the “Info-rank” and “EVPPI” tabs (labelled “Select relevant
parameters” and “Select parameters to compute the EVPPI”, respectively)
will display a list of the model parameters. These can be either selected from the
drop-down menu or searched by typing the parameter labels into the menu field. Any
number of parameters can be selected; choosing “All parameters” selects all of
them at the same time. The analyses are run by selecting the parameters and clicking
on the grey “Run” button at the bottom of the page. In addition the EVPI, EVPPI
and Info-rank tables of values can be downloaded in csv format by clicking on the
grey “Download” button.
The EVPPI tab in BCEAweb includes alternative methods for the estimation of
the EVPPI, and allows for fine-tuning the parameters of the procedure based on the
chosen methodology. In addition to a plot comparing the full-model EVPI and the
EVPPI for the selected parameter (or set of parameters), BCEAweb also includes a
“Diagnostic” tab, which can be accessed from the EVPPI page. It includes residual
plots for both the costs and effectiveness, so the user can check for unexpected
behaviours in the model fit which might make the EVPPI estimates unreliable.
BCEAweb allows users to check the reliability of the estimation and gives them the
instruments to intervene on simple issues: for example, non-linearities in the
distribution of the residuals can be addressed by increasing the interaction order in
the INLA-SPDE estimation method.
Fig. 5.8 The Info-rank plot for the vaccine example, including all parameters, for a threshold
of 25,000 monetary units. While the psi.5 parameter stands out compared to the other variables,
the ratio of its single-parameter EVPPI to the total EVPI is below 1%
5.2.8 Report
The “Report” tab allows users to download the outputs produced by BCEAweb in a
standardised report in either a pdf or Microsoft Word format. The list of sections
is displayed on the page, and checking the respective box will include the output
section in the report produced by the web-app. It is worth noting that the relevant data
need to be available for each selected section to be generated correctly. The outputs produced
in the report will depend on the parameters specified in the BCEAweb tabs for each of
the selected sections, e.g. the comparator names, the variables included in the EVPPI
analysis, etc.
To download the report from BCEAweb it is sufficient to select the required sections,
choose the preferred output format and click on the grey “Download”
button at the bottom of the page. The report will be generated and downloaded; please
note that in some internet browsers, depending on the user settings, the report might
be displayed as a new web page. An example of a report generated by the web-app
for the vaccine example is shown in Fig. 5.9. The cost-effectiveness plane and cost-
effectiveness acceptability curve sections were selected to produce the report for the
vaccine example.
Fig. 5.9 The standardised report produced by BCEAweb for the vaccine example. This report was
obtained by selecting the cost-effectiveness plane and CEAC sections only