
Use R!

Gianluca Baio
Andrea Berardi
Anna Heath

Bayesian Cost-
Effectiveness
Analysis with
the R package
BCEA
Use R!

Series Editors
Robert Gentleman Kurt Hornik Giovanni Parmigiani

More information about this series at http://www.springer.com/series/6991


Use R!

Moore: Applied Survival Analysis Using R


Luke: A User’s Guide to Network Analysis in R
Monogan: Political Analysis Using R
Cano/M. Moguerza/Prieto Corcoba: Quality Control with R
Schwarzer/Carpenter/Rücker: Meta-Analysis with R
Gondro: Primer to Analysis of Genomic Data Using R
Chapman/Feit: R for Marketing Research and Analytics
Willekens: Multistate Analysis of Life Histories with R
Cortez: Modern Optimization with R
Kolaczyk/Csárdi: Statistical Analysis of Network Data with R
Swenson/Nathan: Functional and Phylogenetic Ecology in R
Nolan/Temple Lang: XML and Web Technologies for Data Sciences with R
Nagarajan/Scutari/Lèbre: Bayesian Networks in R
van den Boogaart/Tolosana-Delgado: Analyzing Compositional Data with R
Bivand/Pebesma/Gómez-Rubio: Applied Spatial Data Analysis with R
(2nd ed. 2013)
Eddelbuettel: Seamless R and C++ Integration with Rcpp
Knoblauch/Maloney: Modeling Psychophysical Data in R
Lin/Shkedy/Yekutieli/Amaratunga/Bijnens: Modeling Dose-Response Microarray
Data in Early Drug Development
Experiments Using R
Cano/M. Moguerza/Redchuk: Six Sigma with R
Soetaert/Cash/Mazzia: Solving Differential Equations in R
Gianluca Baio
Andrea Berardi
Anna Heath

Bayesian Cost-Effectiveness
Analysis with the R package
BCEA

Gianluca Baio
Department of Statistical Science
University College London
London, UK

Anna Heath
Department of Statistical Science
University College London
London, UK

Andrea Berardi
Department of Statistics
University of Milano-Bicocca
Milan
Italy

ISSN 2197-5736 ISSN 2197-5744 (electronic)


Use R!
ISBN 978-3-319-55716-8 ISBN 978-3-319-55718-2 (eBook)
DOI 10.1007/978-3-319-55718-2
Library of Congress Control Number: 2017937734

© Springer International Publishing AG 2017


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To Kobi—who I will keep tickling even when
he is a grown man and has moved out to live
on his own (upstairs from us. Because Marta
does not want him to be too far away…)
To Valerio—“Shirak!”
To my Grandpa David
Preface

This book originates from the work that we have done, at different times and in
different capacities, in the area of statistical modelling for health economic evalu-
ation. In our view, this is a very interesting and exciting area for statisticians:
despite the strong connotation derived from its name, health economic evaluation is
just as much (if not more!) about statistics as it is about healthcare or economics.
Statistical modelling is a fundamental part of any such evaluation and as models
and the data that are used to populate them become bigger, more complex and
representative of a complicated underlying reality, so do the skills required by a
modeller.
Broadly speaking, the objective of publicly funded healthcare systems (such as
the UK’s) is to maximise health gains across the general population, given finite
monetary resources and a limited budget. Bodies such as the National Institute for
Health and Care Excellence (NICE) provide guidance on decision-making on the
basis of health economic evaluation. This covers a suite of analytical approaches
(usually termed “cost-effectiveness analysis”) for combining costs and conse-
quences of intervention(s) compared to a control, the purpose of which is to aid
decision-making associated with resource allocation. To this aim, much of the
recent research has been oriented towards building the health economic evaluation
on sound and advanced statistical decision-theoretic foundations.
Historically, cost-effectiveness analysis has been based on modelling often
performed in specialised commercial packages (such as TreeAge) or even more
frequently spreadsheet calculators (almost invariably Microsoft Excel). The
“party-line” for why this is the case is that these are “easy to use, familiar, readily
available and easy to share with stakeholders and clients”. Possibly, in addition to
these, another crucial factor for the wide popularity of these tools is the fact that
often modellers are not statisticians by training (and thus less familiar with
general-purpose statistical packages such as SAS, Stata or R). Even more inter-
estingly, it is often the case that cost-effectiveness models are based on existing
templates (usually developed as Excel spreadsheets, for example for a specific
country or drug) and then “adapted” to the situation at hand.


Luckily, we are not alone (although perhaps not in the majority) in arguing that
many of these perceived advantages require a serious rethink. In our view, there are
several limitations to the current state of modelling in health economics: firstly, the
process often implies a separation of the different steps required for the evaluation.
This potentially increases the risk of human errors and confusion, because the
results of the intermediate steps (e.g. the statistical analysis of data collected in a
randomised trial) are usually copied and pasted in Excel to populate cells and
formulae (see for instance our discussion in Sects. 1.4 and 4.2). Secondly, in an Excel
file calculations are usually spread over several sheets that are linked by formulae or
cross references. While in the case of simple models this is actually a neat way of
structuring the work, it can become unwieldy and difficult to track modifications for
more complex models, based on a combination of different datasets and thus
analyses (which of course is increasingly the norm!).
The idea of the R package BCEA evolved naturally from the need to replicate
some types of analyses when post-processing the output of the models we were
developing in our applied work, while overcoming the limitations of the “standard”
work flow based on spreadsheets. It felt natural to make the effort of systematising
the functions we were using to do standard analyses and as we started doing so, we
realised that there was much potential and interesting work to be done. The main
objective of this book is to aid statisticians and modellers in health economics with
the “easier” part of the process—making sense of their model results and helping them
reproduce the analysis that is, more or less, ubiquitous in the relevant output (be it a
research paper, or a dossier to be submitted to a regulatory agency such as NICE).
To this aim, the book is structured as follows. First, in Chap. 1, we introduce the
main concepts underlying the Bayesian approach and the basics of health economic
evaluation, with particular reference to the relevant statistical modelling. Again,
linking the two is natural to us as we are of a very strong Bayesian persuasion. In
addition to this, however, it is interesting to note that Bayesian methods are
extremely popular in this area of research, since they are particularly useful in
modelling composite sources of information (often termed “evidence synthesis”)
and effectively underlie the important concept of Probabilistic Sensitivity Analysis
(PSA, see for instance Chap. 4).
Chapter 2 presents the two case studies we use throughout the book. In par-
ticular, we introduce the statistical modelling and notation, describe the whole
process of running the analysis and obtaining the relevant output (in the form of
posterior distributions) and then the extra modelling required to compute the
quantities of interest for the economic analysis. This process is performed under a
fully Bayesian approach and is based on a combination of R and BUGS/JAGS, the de
facto standard software to perform Markov Chain Monte Carlo analysis.
Chapter 3 introduces the R package BCEA and its basic functionalities by means
of the two running examples. The very nature of BCEA is to follow a full Bayesian
analysis of the statistical model used to estimate the economic quantities required
for the cost-effectiveness analysis, but we make here (and later in the book) the
point that it can also be used in the case where the modelling is done using
frequentist arguments—provided suitable inputs are available (e.g. in the form of
simulations for the effects and costs of the interventions under investigation).
Chapter 4 is perhaps the most technical and, as mentioned above, it introduces
the concept of PSA and its application. It is also there that we make the point that
increasingly popular and important analyses (e.g. based on the evaluation of the
value of information) simply cannot be performed in spreadsheet or other
sub-optimal software.
Finally, Chap. 5 presents an extension of BCEA, which can be turned into a
web-based application using R Shiny. To an R user, this is perhaps not very useful—
all that BCEAweb can do, BCEA in R can do even better (while the reverse is not
totally true). However, we do feel that using web-interfaces is indeed very
important to disseminate the message and convince practitioners of the supremacy
of R over Excel or other specialised software. The main argument is that BCEAweb
allows the user an intermediate step between the “standard” Excel based modelling
and the “ideal” (at least to our mind) situation in which all the analysis is performed
in R: BCEAweb can also be used to produce a graphical interface to help the
“translation” of the model in simpler, maybe graphical terms. This will probably
overcome the complaints that clients (e.g. pharmaceutical companies commis-
sioning cost-effectiveness analysis for their products) or stakeholders (e.g.
reviewers and committee members in regulatory agencies) have: they want to be
able to use menu-bars and sliders to modify the models in an easy and intuitive
way. Tools such as BCEAweb will allow this.
The final version of the book has benefitted from comments, suggestions and
discussion with many friends and colleagues. Among them, we would like to
mention Mark Strong, Nicky Welton, Chris Jackson, Matthew Bending, James
Jarrett, Andreas Karabis, Petros Pechlivanoglou and Katrin Haeussler. Polina
Hadjipanayiotou has worked extensively on BCEAweb as part of her M.Sc. disser-
tation at University College London and has contributed to the write-up of Chap. 5.
One particular thought goes to Richard Nixon, a brilliant researcher and a lovely
person—it has been a pleasure to have met him and picked his brains on how to
make BCEA better. Eva Hiripi at Springer has been extremely patient and helpful in
her support, while we were preparing the book (we often joked that we would
publish it in 2036, so we are pretty pleased that in fact we are 20 years earlier than
predicted!). Similarly, the comments of two anonymous reviewers have been very
useful and have contributed to making the final version of this book a better and,
hopefully, clearer one.

London, UK
December 2016

Gianluca Baio
Andrea Berardi
Anna Heath
Contents

1 Bayesian Analysis in Health Economics . . . . . . . . . . . . . . . . . . . . . . . 1


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Bayesian Inference and Computation . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 Bayesian Ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.2 Specifying a Bayesian Model . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.3 Bayesian Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 Basics of Health Economic Evaluation . . . . . . . . . . . . . . . . . . . . . 12
1.4 Doing Bayesian Analysis and Health Economic Evaluation
in R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.4.1 Pre-processing the Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4.2 Building and Coding the Analysis Model . . . . . . . . . . . . . . 17
1.4.3 Running the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.4.4 Post-processing the Results. . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4.5 Performing the Decision Analysis . . . . . . . . . . . . . . . . . . . 20
1.4.6 Using BCEA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2 Case Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 Preliminaries: Computer Configuration . . . . . . . . . . . . . . . . . . . . . 24
2.2.1 MS Windows Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2.2 Linux or Mac OS Users . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3 Vaccine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.1 (Bayesian) Statistical Model . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.2 Economic Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.4 Smoking Cessation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4.1 (Bayesian) Statistical Model . . . . . . . . . . . . . . . . . . . . . . . . 44
2.4.2 Economic Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57


3 BCEA—A R Package for Bayesian Cost-Effectiveness Analysis . . . . . 59


3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.2 Economic Analysis: The bcea Function . . . . . . . . . . . . . . . . . . . . 63
3.2.1 Example: Vaccine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.2.2 Example: Smoking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.3 Basic Health Economic Evaluation: The summary Command . . . 71
3.4 Cost-Effectiveness Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.4.1 The ceplane.plot Function . . . . . . . . . . . . . . . . . . . . . 74
3.4.2 ggplot Version of the Cost-Effectiveness Plane . . . . . . . 76
3.4.3 Advanced Options for ceplane.plot . . . . . . . . . . . . . . 79
3.5 Expected Incremental Benefit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.6 Contour Plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.7 Health Economic Evaluation for Multiple Comparators
and the Efficiency Frontier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 89
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 92
4 Probabilistic Sensitivity Analysis Using BCEA . . . . . . . . . . . . . . . . . . 93
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.2 Probabilistic Sensitivity Analysis for Parameter Uncertainty . . . . . 94
4.2.1 Summary Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2.2 Cost-Effectiveness Acceptability Curve . . . . . . . . . . . . . . . 99
4.3 Value of Information Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.3.1 Expected Value of Perfect Information . . . . . . . . . . . . . . . . 109
4.3.2 Expected Value of Perfect Partial Information . . . . . . . . . . 113
4.3.3 Approximation Methods for the Computation
of the EVPPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 115
4.3.4 Advanced Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 124
4.3.5 Technical Options for Controlling the EVPPI
Estimation Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 127
4.3.6 Deprecated (Single-Parameter) Methods . . . . . . . . . . . . . .. 133
4.3.7 The Info-Rank Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 135
4.4 PSA Applied to Model Assumptions and Structural
Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 138
4.4.1 Mixed Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 138
4.4.2 Including Risk Aversion in the Utility Function . . . . . . . .. 141
4.4.3 Probabilistic Sensitivity Analysis to Structural
Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 142
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 151
5 BCEAweb: A User-Friendly Web-App to Use BCEA . . . . . . . . . . . . . . 153
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.2 BCEAweb: A User-Friendly Web-App to Use BCEA . . . . . . . . . . . 153
5.2.1 A Brief Technical Overview of BCEAweb . . . . . . . . . . . . . 154
5.2.2 Note on Data Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

5.2.3 Introduction to the Interface . . . . . . . . . . . . . . . . . . . 155
5.2.4 Check Assumptions . . . . . . . . . . . . . . . . . . . . . . . . 157
5.2.5 Economic Analysis . . . . . . . . . . . . . . . . . . . . . . . . 160
5.2.6 Probabilistic Sensitivity Analysis . . . . . . . . . . . . . . . . 162
5.2.7 Value of Information . . . . . . . . . . . . . . . . . . . . . . . 164
5.2.8 Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
Acronyms

ASH Action on Smoking and Health


CADTH Canadian Agency for Drugs and Technologies in Health
CEAC Cost-Effectiveness Acceptability Curve
CEAF Cost-Effectiveness Acceptability Frontier
CRAN Comprehensive R Archive Network
CrI Credible Interval
DAG Directed Acyclic Graph
EIB Expected Incremental Benefit
EVPI Expected Value of Perfect Information
EVPPI Expected Value of Perfect Partial Information
GAM Generalised Additive Model
GP Gaussian Process
IB Incremental Benefit
ICER Incremental Cost-Effectiveness Ratio
INLA Integrated Nested Laplace Approximation
MCMC Markov Chain Monte Carlo
MLE Maximum Likelihood Estimate
MTC Mixed Treatment Comparison
NHS National Health Service
NICE National Institute for Health and Care Excellence
NMA Network Meta-Analysis
NRT Nicotine Replacement Therapy
OL Opportunity Loss
ONS Office for National Statistics
OTC Over The Counter
PBAC Pharmaceutical Benefits Advisory Committee
PSA Probabilistic Sensitivity Analysis
QALY Quality-Adjusted Life Year
RCT Randomised Controlled Trial
SAVI Sheffield Accelerated Value of Information


SPDE Stochastic Partial Differential Equation


VBA Visual Basic for Applications
VI Value of Information
WTP Willingness To Pay
Chapter 1
Bayesian Analysis in Health Economics

1.1 Introduction

Modelling for the economic evaluation of healthcare data has received much attention
in both the health economics and the statistical literature in recent years [1, 2],
increasingly often under a Bayesian statistical approach [3–6].
Generally speaking, health economic evaluation aims to compare the economic
performance of two or more alternative health interventions. In other words, the
objective is the evaluation of a multivariate outcome that jointly accounts for some
specified clinical benefits or consequences and the resulting costs. From the statis-
tical point of view, this is an interesting problem because of the generally complex
structure of relationships linking the two outcomes. In addition, simplifying assump-
tions, such as (bivariate) normality of the underlying distributions, are usually not
warranted (we return to this point later).
In this context, the application of Bayesian methods in health economics is
particularly helpful for several reasons:
• Bayesian modelling is naturally embedded in the wider scheme of decision theory;
ultimately, health economic evaluations are performed to determine the optimal
course of actions in the face of uncertainty about the future outcomes of a given
intervention, both in terms of clinical benefits and the associated costs.
• Bayesian methods allow extreme flexibility in modelling, especially since the
application of revolutionary computational methods such as Markov Chain Monte
Carlo has become widespread. This is particularly relevant when the economic
evaluation is performed by combining different data into a comprehensive decision
model.
• Sensitivity analysis can be performed in a straightforward way under the Bayesian
approach and can be seen as a by-product of the modelling strategy. This is
extremely helpful in health economics, as decisions are often made on the basis of
limited evidence. For this reason, it is essential to understand the impact of model
and parameter uncertainty on the final outputs.


This chapter is broadly divided into two main parts. The first one introduces the
main aspects of Bayesian inference (in Sect. 1.2). First, in Sect. 1.2.1 we introduce the
main ideas underlying the Bayesian philosophy. We then present the most important
practical issues in modelling in Sect. 1.2.2 and computation in Sect. 1.2.3.
The second part of the chapter is focussed on presenting the basics of health
economic evaluation in Sect. 1.3 and then the practical aspects of this process are
presented in Sect. 1.4 which introduces the process of health economic evaluation
from the Bayesian point of view.

1.2 Bayesian Inference and Computation

1.2.1 Bayesian Ideas

In this section we briefly review the main features of the Bayesian approach to statis-
tical inference as well as the basics of Bayesian computation. A detailed presentation
of these subtle and important topics is outside the scope of this book and therefore
we only briefly sketch them here and refer the reader to [5, 7–14].
A Bayesian model specifies a full probability distribution to describe uncertainty.
This applies to data, which are subject to sampling variability, as well as to parameters
(or hypotheses), which are typically unobservable and thus are subject to epistemic
uncertainty (e.g. the experimenter’s imperfect knowledge about their value) and even
future, yet unobserved realisations of the observable variables (data) [14].
As a consequence, probability is used in the Bayesian framework to assess any
form of imperfect information or knowledge. Thus, before even seeing the data,
the experimenter needs to identify a suitable probability distribution to describe the
overall uncertainty about the data y and the parameters θ. We generally indicate this
as p(y, θ).
By the basic rules of probability, it is always possible to factorise a joint distrib-
ution as the product of a marginal and a conditional distribution. For instance, one
could re-write p(y, θ) as the product of the marginal distribution for the parameters
p(θ) and the conditional distribution for the data, given the parameters p(y|θ). But
in exactly the same fashion, one could also re-express the joint distribution as the
product of the marginal distribution for the data p(y) and the conditional distribution
for the parameters given the data p(θ|y).
Consequently,
p(y, θ) = p(θ)p(y|θ) = p(y)p(θ|y)

from which Bayes’ Theorem follows in a straightforward way:

p(θ|y) = p(θ)p(y|θ) / p(y).    (1.1)
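To make (1.1) concrete, consider the conjugate case used later in this chapter: a Beta prior for θ combined with a Binomial likelihood yields another Beta distribution as the posterior. A minimal sketch in R; the data values (12 events out of 30 patients) are purely illustrative assumptions:

```r
# Conjugate Beta-Binomial updating: if theta ~ Beta(a, b) and
# y | theta ~ Binomial(theta, n), then theta | y ~ Beta(a + y, b + n - y)
a <- 9.2; b <- 13.8    # prior parameters (see Sect. 1.2.2)
y <- 12; n <- 30       # illustrative data: 12 events out of 30 patients

a.post <- a + y        # posterior parameters, available in closed form
b.post <- b + n - y

post.mean <- a.post / (a.post + b.post)           # posterior mean of theta
post.ci <- qbeta(c(0.025, 0.975), a.post, b.post) # 95% credible interval
```

In non-conjugate models no such closed form exists, which is exactly where the computational methods of Sect. 1.2.3 come in.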

While mathematically incontrovertible, Bayes’ Theorem has deeper philosophical
implications, which have led to heated debates, within and without the field of Sta-
tistics. In fact, the qualitative implications of this construction are that, if we are
willing to describe our uncertainty on the parameters before seeing the current data
through a probability distribution, then we can update this uncertainty by means of
the evidence provided by the data into a posterior probability, the left-hand side of
(1.1). This allows us to make inference in terms of direct probabilistic statements.
In fact, a truly “hard-core” Bayesian analysis would not even worry about parame-
ters but would rather model directly the observable variables (e.g. produce a marginal
model for the data only). In any case, it is possible to evaluate probabilistically unob-
served data ỹ, assumed to be of the same nature as those already observed by means
of the (posterior) predictive distribution

p(ỹ|y) = ∫ p(ỹ|θ)p(θ|y) dθ.    (1.2)
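In practice the integral in (1.2) is rarely computed analytically: given a sample from the posterior, the predictive distribution can be approximated by Monte Carlo simulation. A sketch in R, assuming (purely for illustration) that the posterior for θ is a Beta(21.2, 31.8) and that we wish to predict the number of events among 10 new individuals:

```r
# Monte Carlo approximation of the posterior predictive distribution (1.2):
# 1) draw theta from the posterior; 2) draw y.tilde from the sampling model
set.seed(1)
S <- 10000
theta <- rbeta(S, 21.2, 31.8)                  # posterior draws (illustrative values)
y.tilde <- rbinom(S, size = 10, prob = theta)  # predicted events in 10 new subjects

mean(y.tilde)       # approximates the predictive mean, 10 * E[theta | y]
table(y.tilde) / S  # approximates the predictive probabilities p(y.tilde | y)
```

This two-step simulation automatically propagates both the uncertainty about θ and the sampling variability of the new data into the prediction.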

Looking at (1.1) and (1.2) it should be clear that, in a Bayesian analysis, the
objective is to evaluate the level of uncertainty on some target quantity (be it the
unobservable parameter θ or the unobserved variable ỹ), given the inputs, i.e. the
observed data y and the set of assumptions that define the model in question.
Crucially, the Bayesian approach correctly recognises that, once the data have
been observed, there is no uncertainty left on their realised value. Unlike classical
statistical methods, what values might have been observed and how likely they might
have been under the specified model are totally irrelevant concepts.1
In a Bayesian context, inference directly involves the quantities that are not
observed—and again there is no distinction in the quantification of the uncer-
tainty about their value depending on their nature (i.e. parameters or future data). A
Bayesian analysis aims at revising the level of uncertainty in light of the evidence
that has become available: if data y are observed, we move from a prior to a posterior
state of knowledge/uncertainty.

1.2.2 Specifying a Bayesian Model

When conducting a real-life Bayesian analysis, one has to think carefully about the
model used to represent not only sampling variability for the observable data, but
also the relevant parameters.2
We note here that, in some sense, modelling the data is in general an “easier” or,
maybe, less controversial task, perhaps because data will eventually be observed,

1 For instance, recall that p-values are defined in terms of the chance of obtaining data that are
even more extreme than the ones that have actually been observed.
2 While this discussion is beyond the objectives of this book, we re-iterate here that in a
“hard-core” Bayesian approach, parameters are just convenient mathematical abstractions that
simplify the modelling for an observable variable y—see for example [6] and the references therein.

so that model fit can be assessed, at least to some extent. On the other hand, some
people feel uncomfortable in defining a model for the parameters, which represent
quantities that we will never be in a position of actually observing.
In our view, in principle this latter task is not much different or harder than the
former. It certainly requires an extra modelling effort; and it certainly has the potential
to exert notable impact on the final results. However, by virtue of the whole modelling
structure, it has also the characteristic of being extremely explicit. Prior distributions
for the parameters cannot be hidden; and if they are, the model is very easy to discard
as non-scientific.
Nevertheless, the problem of how we should specify the priors for the parameters
does play a fundamental role in the construction of a Bayesian model. Technically,
a few considerations should drive this choice.
What do parameters mean (if anything)?
First, parameters have often some natural or implied physical meaning. For example,
consider data for y, the number of patients who experience some clinical outcome
out of a sample of n individuals. A model y|θ ∼ Binomial(θ, n) can be considered,
in which the parameter θ indicates the “population probability of experiencing the
outcome”, or, in other words, the probability that an individual randomly selected
from a relevant population will experience the outcome in question.
In this case, it is relatively simple to give the parameter a clear interpretation and
derive some physical properties—for instance, because θ is a probability, it should
range between 0 and 1 and be a continuous quantity. We can use this information to
find a suitable probability model, much as we have done when choosing the Binomial
distribution for the observed data.
One possibility is the Beta distribution θ ∼ Beta(α, β)—this is a continuous prob-
ability model with range in [0; 1] and upon varying the values of its parameters (α, β)
it can take on several shapes (e.g. skewed towards either end of the range, or symmet-
rical). In general, by setting suitable values for the parameters of a prior distribution,
it is possible to encode prior knowledge (as we demonstrate in the next sections).
Of course, the Beta is only one particular possibility and others exist. For example,
one could first construct the transformation
 
φ = g(θ) = logit(θ) = log(θ / (1 − θ)),

which rescales θ in the range (−∞; ∞) and stretches the distribution to get an approx-
imate symmetrical shape. Then, it will be reasonable to model φ ∼ Normal(μ, σ).
Because φ = g(θ), then the prior on φ will imply a prior on the inverse transforma-
tion g −1 (φ) = θ—although technically this can be hard to derive analytically as it
may require complex computations. We return to this point later in this section.

Include substantial information in the model


The second consideration is that, often, there is some substantial knowledge about
the likely values of the parameter. For instance, we may have access to previously
conducted, similar studies in which the probability of the outcome has been estimated,
typically by means of intervals (e.g. point estimates together with a measure of
variability or confidence intervals). Alternatively, it may be possible that experts are
able to provide some indication as to what the most likely values or range for the
parameter should be. This information should be included in the specification of the
model to complement the evidence provided by the observed data. This is particularly
relevant when the sample size is small.
Suppose for instance that through either of these routes, it is reasonable to assume
that, before observing any new data, the most likely range for θ is [0.2; 0.6]—this
means that, for instance, the probability of the outcome for a random individual in
the target population is between 20 and 60%.
Under the Beta model, it is possible to tweak the values of α and β so that, say,
95% of the prior distribution for θ lies between the two extremes of 0.2 and 0.6. As
it turns out, choosing α = 9.2 and β = 13.8 accomplishes this goal. The continuous
line in Fig. 1.1 shows the resulting Beta prior distribution for θ. Most of the mass (in
fact, exactly 95% of the entire density) lies in the interval [0.2; 0.6], as required.
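As a quick sanity check, the quoted values can be verified directly in base R, using the Beta cumulative distribution function:

```r
# Check that a Beta(9.2, 13.8) prior places roughly 95% of its mass
# in the interval [0.2, 0.6], with prior mean 0.4
a <- 9.2
b <- 13.8
prior_mass <- pbeta(0.6, a, b) - pbeta(0.2, a, b)   # approximately 0.95
prior_mean <- a / (a + b)                            # = 0.4
```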
Similarly, we could select values for μ and σ to build a Normal distribution that encodes the same information—but notice that μ and σ are defined on a different scale, so care would be needed to identify the correct values. For example, if we used μ = logit(0.4) = −0.41 and σ = 0.413 (see footnote 3), this would induce a prior distribution on φ that effectively represents the same information on the natural scale of θ.

Fig. 1.1 A graphical representation of an informative prior based on a Beta(9.2, 13.8) distribution (the blue continuous line) or on a logit-Normal distribution (represented by the histogram). The two different models effectively encode exactly the same level of prior information
The histogram in Fig. 1.1 depicts the logit-Normal prior on θ (i.e. the rescaled
version of the Normal prior distribution on φ)—as it is possible to see, this is virtually
identical with the Beta prior described above.
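The equivalence between the two specifications can be checked with a small simulation in base R (the value of σ is the one derived in the footnote; the 10,000 draws are an arbitrary choice):

```r
# Simulate from the Normal prior on the logit scale and rescale back to
# [0, 1]: the implied logit-Normal prior is close to the Beta(9.2, 13.8)
set.seed(1)
mu <- qlogis(0.4)                              # logit(0.4) = -0.41
sigma <- (qlogis(0.6) - qlogis(0.4)) / 1.96    # = 0.413
phi <- rnorm(10000, mean = mu, sd = sigma)     # prior on the logit scale
theta <- plogis(phi)                           # rescaled to [0, 1]
# hist(theta) reproduces the histogram in Fig. 1.1
central_95 <- quantile(theta, c(0.025, 0.975)) # roughly [0.2, 0.6]
```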
Model the relevant parameters
In other circumstances, it is more difficult to give the parameters a physical meaning
and thus defining a “good” prior distribution requires a bit more ingenuity. For
example, costs are usually characterised by a markedly skewed distribution and thus
a suitable model is the Gamma, which is characterised by two parameters θ = (η, λ).
These represent, respectively, the shape and the rate.
The difficulty with this distribution is that the “original-scale” parameters are quite
hard to interpret in a physical sense; in fact, they are just a mathematical construction
that defines the probability density, which happens to be reasonable for a variable
such as the costs associated with an intervention. It is more difficult in this case to
give a clear meaning to the parameters and thus eliciting a suitable prior distribution
(possibly encoding some substantive knowledge) becomes more complicated [15].
It is in general much easier to think in terms of some “natural-scale” parameters,
say ω = (μ, σ), representing for example the mean and standard deviation of the
costs on the natural scale. This is because we have a better grasp of what these
parameters mean in the real world and thus it is possible for us to figure out what
features we should include in the model that we choose. In addition to this, as we
briefly mentioned in Sect. 1.1 and will reprise in Sects. 1.3 and 3.3, decision-making is
effectively based on the evaluation of the population average values for the economic
outcomes and thus the mean of the cost distribution is in fact the parameter of direct
interest.
Typically, there is a unique deterministic relationship ω = h(θ) linking the
natural- to the original-scale parameters that define the mathematical form of the
distribution. As we hinted above, defining a prior on ω will automatically imply one
for θ. For example, by the mathematical properties of the Gamma density, the ele-
ments of ω (on which we want to set the priors) are defined in terms of the elements
of θ as

μ = η/λ   and   σ = √(η/λ²)    (1.3)

(similar relationships are in general available for the vast majority of probability
distributions).

3 We need to encode the assumption that, on the logit scale, 0.6 is the point beyond which only 2.5% of the mass lies. Given the assumption of normality, this is easy to obtain by setting logit(0.4) + 1.96σ = logit(0.6), from which we can easily derive σ = [logit(0.6) − logit(0.4)]/1.96 = 0.413.

Whatever the choice for the priors on the natural-scale parameters, in the case of
the Gamma distribution, inverting the deterministic relationships in (1.3) it is easy
to obtain
η = μλ   and   λ = μ/σ².    (1.4)
More importantly, because μ and σ are random variables associated with a probability
distribution, so will be η and λ and thus the prior on ω automatically induces the
prior for θ.
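As a sketch of this mapping, the following base-R snippet converts a hypothetical mean and standard deviation for a cost variable (the values 100 and 40 are made up for illustration) into the Gamma shape and rate via (1.4), and checks the result by simulation:

```r
# Map natural-scale parameters (mean, sd) to the Gamma shape (eta) and
# rate (lambda), then verify the implied moments with a large simulation
mu <- 100                     # hypothetical mean cost
sigma <- 40                   # hypothetical sd of the cost
lambda <- mu / sigma^2        # rate, as in (1.4)
eta <- mu * lambda            # shape, as in (1.4)
set.seed(2)
x <- rgamma(1e5, shape = eta, rate = lambda)
c(mean(x), sd(x))             # approximately (100, 40)
```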
How much information should be included in the prior?
Undoubtedly, the level of information contained in the prior plays a crucial role in
a Bayesian analysis, since its influence on the posterior and predictive distributions
may be large. This is particularly relevant in cases where the evidence provided by the
data is weak (for instance in the case of very small sample sizes). In these situations,
it is advisable to use as much information as possible in the prior, to complement the
limited amount of information present in the observed data and to perform substantial
sensitivity analysis to identify crucial assumptions that may bias the analysis.
On the other hand, in cases where a large amount of data is available to directly
inform a parameter, then the prior distribution becomes, to some extent, less influ-
ential. In such a situation, it is perhaps reasonable, or at least less critical, to encode
a lower level of information in the prior. This can be achieved for example by using
Uniform priors on a suitable range, or Normal priors centred on 0 and with very large
variances. These “vague” (also referred to as “minimally informative” or “flat”) pri-
ors can be used to perform an analysis in which the data drive the posterior and
predictive results.
We note, however, that the choice of vague prior depends on the nature and
physical properties of the parameters; for example, variances need to be defined on
a positive range (0; +∞) and thus it is not sensible to use a flat Normal prior (which
by definition extends from −∞ to +∞). Perhaps, a reasonable alternative would
be to model some suitable transformation of a variance (e.g. its logarithm) using a
Normal vague prior—but of course, care is needed in identifying the implications of
this choice on the natural scale (we return to this point in the next section).
There is then a sense in which, even when using minimally informative priors,
some substantial information is included in the modelling. Consequently, in any
case the general advice is that, when genuine information is available, it should be
included in the model in a clear and explicit way and that sensitivity analysis should
be thoroughly performed.
This aspect is well established in health economic evaluation, particularly under
the Bayesian approach. For instance, returning to the simple cost example, one possibility
is to model the priors on ω using vague Uniform distributions:

μ ∼ Uniform(0, Hμ ) and σ ∼ Uniform(0, Hσ ),



for suitably selected values Hμ , Hσ . This would amount to assuming that values in
[0; Hμ ] are all reasonable for the population average cost of the treatment under
study—and a similar reasoning would apply to the standard deviation. Of course,
this may not be the best choice and were genuine information available, we should
include it in the model. For example, the nature of the intervention is clearly known
to the investigator and thus it is plausible that some “most-likely range” be available
at least approximately.
Assess the implications of the assumptions
In addition to performing sensitivity analysis to assess the robustness of the assump-
tions, in a Bayesian context it is important to check the consistency of what the priors
imply. For instance, in the cost model we may choose to use a vague specification
for the priors of the natural-scale parameters. Nevertheless, the implied priors for
the original-scale parameters will in general not be vague at all. In fact by assuming
a flat prior on the natural-scale parameters, we are implying some information on
the original-scale parameters of the assumed Gamma distribution.
One added advantage of modelling directly the relevant parameters is that this is
not really a problem; the resulting posterior distributions will be of course affected
by the assumptions we make in the priors; but, by definition, however informative the
implied priors for (η, λ) turn out to be, this will by necessity be consistent with the
substantive knowledge (or lack thereof) that we are assuming for the natural-scale
parameters.
On the other hand, when vague priors are used for original-scale parame-
ters (which may not be the focus of our analysis), the unintended information
may lead to severe bias in the analysis. For instance, suppose we model directly
η ∼ Uniform(0, 10000) and λ ∼ Uniform(0, 10000), with the intention of using a
vague specification. In fact, using (1.4) we can compute the implied priors for μ and σ. Figure 1.2 shows how both these distributions place most of their mass on very small values—possibly an unreasonable and unwanted restriction.

Fig. 1.2 Implied prior distributions for the mean μ, in panel (a), and the standard deviation σ, in panel (b), obtained by assuming vague Uniform priors on the shape η and the rate λ of a Gamma distribution
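The following sketch reproduces the gist of Fig. 1.2 by simulation (the Uniform(0, 10000) bounds are those used in the text):

```r
# Vague Uniform(0, 10000) priors on the Gamma shape and rate imply priors
# for the mean and sd that concentrate on very small values
set.seed(3)
eta <- runif(10000, 0, 10000)      # shape
lambda <- runif(10000, 0, 10000)   # rate
mu <- eta / lambda                 # implied prior for the mean, as in (1.3)
sigma <- sqrt(eta) / lambda        # implied prior for the sd
median(mu)                         # about 1: hardly vague for an average cost
mean(sigma < 1)                    # the bulk of the implied prior for sigma
```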
Computational issues
While in principle Bayesian analysis may arguably be considered conceptually straightforward, the computational details are in general not trivial and can make it complex in practice. Nevertheless, under specific circumstances (and for relatively simple
models), it is possible to obtain analytic solutions to the computation of posterior
and predictive distributions.
One such case involves the use of conjugate priors [16]. These indicate a particular
mathematical formulation where the prior is selected in a way that, when combined
with the model chosen for the data, the resulting posterior is of the same form. For
example, if the available data y are binary and are modelled using a Bernoulli distribution y ∼ Bernoulli(θ), or equivalently p(y|θ) = θ^y (1 − θ)^(1−y), it can be easily
proven4 that choosing a Beta prior θ ∼ Beta(α, β) yields a posterior distribution
θ|y ∼ Beta(α∗ , β ∗ ), with α∗ = α + y and β ∗ = β + 1 − y. In other words, updat-
ing from the prior to the posterior occurs within the same probability family (in this
case the Beta distribution)—in effect, the information provided by the data is entirely
encoded in the updated parameters (α∗ , β ∗ ).
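A minimal numerical illustration of the updating rule, reusing the Beta(9.2, 13.8) prior from Fig. 1.1 (the single observation y = 1 is made up for illustration):

```r
# Conjugate Beta-Bernoulli updating: a Beta(alpha, beta) prior combined with
# one binary observation y gives a Beta(alpha + y, beta + 1 - y) posterior
alpha <- 9.2
beta <- 13.8
y <- 1                                              # one observed "success"
alpha_star <- alpha + y
beta_star <- beta + 1 - y
prior_mean <- alpha / (alpha + beta)                # 0.4
post_mean <- alpha_star / (alpha_star + beta_star)  # 0.425, nudged towards y
```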
The obvious implication is that no complex mathematical procedure is required
to compute the posterior (and, similarly, the predictive) distribution. This is a very
appealing way of setting up a Bayesian model, mainly because several standard con-
jugated models exist—see for example [6]. In addition, it is often easy to encode prior
information using conjugate models—one simple example of a conjugate, informa-
tive prior was given above, when constructing the Beta distribution shown in Fig. 1.1.
On the other hand, conjugate priors are dictated purely by mathematical conve-
nience and they fail to fully allow the inclusion of genuine or substantial knowledge
about the nature of the model and its parameter. For this reason, in all but trivial cases,
conjugate priors become a limitation rather than an asset to a Bayesian analysis; for
example, no conjugate priors exist for the popular logistic regression model.
Thus, it is usually necessary to go beyond conjugacy and consider more complex
priors. This, however, comes at the price of increased computation complexity and
often no analytic or closed form solution exists for the posterior or the predictive
distribution. In these cases, inference by simulation is usually the preferred solution.

4 The basic idea is to investigate the form of the likelihood function L(θ), i.e. the model for the
data p(y|θ), but considered as a function of the parameters. If L(θ) can be written in terms of a
known distribution, then this represents the conjugate family. For instance, the Bernoulli likelihood
is L(θ) = p(y|θ) = θ^y (1 − θ)^(1−y), which is actually the core of a Beta density. When computing
the posterior distribution by applying Bayes theorem and combining the likelihood L(θ) with the
conjugate prior, effectively the two terms have the same mathematical form, which leads to a closed
form for the posterior.

1.2.3 Bayesian Computation

Arguably, the main reason for the enormous increase in the use of the Bayesian
approach in practical applications is the development of simulation algorithms and
specific software that, coupled with the availability of cheap computational power
(which became widespread in the 1990s), allow the end-user to effectively use suit-
able analytic models, with virtually no limitation in terms of the complexity.
The main simulation method for Bayesian inference is Markov Chain Monte
Carlo (MCMC), a class of algorithms for sampling from generic probability
distributions—again, here we do not deal with technicalities, but refer the read-
ers to [11, 12, 17–22]. Robert and Casella [23] review the history and assess the
impact of MCMC. Spiegelhalter et al. [5] discuss the application of MCMC methods
to clinical trials and epidemiological analysis, while Baio [6] and Welton et al. [24]
present the main features of MCMC methods and their applications, specifically to
the problem of health economic evaluation.
In a nutshell, the idea underlying MCMC is to construct a Markov chain, a
sequence of random variables for which the distribution of the next value only
depends on the current one, rather than the entire history. Given some initial val-
ues, this process can be used to repeatedly sample and eventually converge to the
target distribution, e.g. the posterior distribution for a set of parameters of interest.
Once convergence has been reached, it is possible to use the simulated values to
compute summary statistics (e.g. mean, standard deviation or quantiles), or draw
histograms to characterise the shape of the posterior distributions of interest.
Figure 1.3 depicts the MCMC procedure for the case of two parameters (μ, σ); in
this case, we superimpose the “true” joint density (the solid dark ellipses), which is
the “target” for the simulation algorithm. Obviously, this is done for demonstration
only—in general we do not know what the target distribution is (and this is why we
use MCMC to estimate it!).
Panel (a) shows the first 10 iterations of the process: the first iteration (labelled as
“1”) is set as the initial value and it happens to be in a part of the relevant space that
is not covered by the “true” distribution. Thus, this point is really not representative
of the underlying target distribution. In fact, as is common, the first few values
of the simulation are spread all over the space and do not really cover the target
area. However, as the number of simulations increases in panel (b), more and more
simulated points actually fall within the “true” distribution, because the process is
reaching convergence. In panel (c), after 1000 iterations effectively all of the target
area has been covered by simulated values, which can be then used to characterise
the joint posterior distribution p(μ, σ).
It is interesting to notice that, in general, it is possible to construct a suitable
Markov chain for a very wide range of problems and, more importantly, given a suf-
ficient number of simulations, it can be proved that it is almost certain that the Markov
chain will converge to the target (posterior) distribution. However, this process may
require a large number of iterations before it actually starts visiting the target area,
rather than points in the parametric space that have virtually no mass under the posterior distribution (e.g. the points labelled as “1”, “2”, “3”, “10” and “11” in Fig. 1.3).

[Fig. 1.3, panels: (a) after 10 iterations; (b) after 30 iterations; (c) after 1000 iterations; (d) traceplot of two chains, showing the burn-in and the sample after convergence]

Fig. 1.3 Markov Chain Monte Carlo in practice: panel (a) shows the first 10 iterations of an MCMC run for two parameters μ (on the x−axis) and σ (on the y−axis). The solid ellipses represent the “true” underlying joint distribution p(μ, σ). At the beginning of the process, the simulations do not cover the target area, but as the number of iterations increases, as in panels (b) and (c), effectively all the target area is fully represented. Convergence to the target distribution can be visually inspected by running two (or more) Markov chains in parallel and checking that they mix up, as in panel (d). Adapted from: [6]
For this reason it is essential to carefully check convergence and typically to discard
the iterations before convergence.
Panel (d) shows a traceplot for two parallel Markov chains, which are initialised at
different values: this graph plots on the x−axis the number of simulations performed
and on the y−axis the simulated values. As the number of simulations increases, the
two chains converge to a common area and mix up. We can visually inspect graphs
such as this and determine the point at which convergence has occurred. All the

simulations preceding this point are then discarded, while those after convergence
are used to perform the analysis.
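As an illustration of these ideas, the following toy random-walk Metropolis sampler (a minimal sketch, not the algorithm used by any specific software) targets a standard Normal "posterior", started from a deliberately poor initial value to show the role of the burn-in:

```r
# Toy random-walk Metropolis sampler targeting a N(0, 1) distribution
set.seed(7)
n_iter <- 5000
theta <- numeric(n_iter)
theta[1] <- 10                                   # far from the target
for (i in 2:n_iter) {
  proposal <- rnorm(1, mean = theta[i - 1], sd = 1)
  # log acceptance ratio for a symmetric proposal: ratio of target densities
  log_acc <- dnorm(proposal, log = TRUE) - dnorm(theta[i - 1], log = TRUE)
  theta[i] <- if (log(runif(1)) < log_acc) proposal else theta[i - 1]
}
kept <- theta[-(1:1000)]   # discard the iterations before convergence
c(mean(kept), sd(kept))    # approximately (0, 1)
```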
For example, consider the simple case where there is only one scalar parameter of interest. After the model is successfully run, we can typically access a vector of n_sim simulations θ̃ = (θ̃_1, . . . , θ̃_{n_sim}) from the posterior. We can use this to summarise the results, for example by computing the posterior mean

E[θ|y] = (1/n_sim) Σ_{i=1}^{n_sim} θ̃_i,

or identifying suitable quantiles (for example the 2.5 and 97.5%), to give an approx-
imate (in this case 95%) credible interval.
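In R, these summaries amount to simple operations on the vector of simulations; here, draws from a known Beta distribution stand in for actual MCMC output:

```r
# Posterior mean and an approximate 95% credible interval from a vector of
# simulations (rbeta draws used as a stand-in for MCMC output)
set.seed(4)
n_sim <- 1000
theta_tilde <- rbeta(n_sim, 10.2, 13.8)
post_mean <- mean(theta_tilde)                     # Monte Carlo estimate
cred_95 <- quantile(theta_tilde, c(0.025, 0.975))  # 2.5% and 97.5% quantiles
```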
If the MCMC algorithm has run successfully,5 then the information provided in θ̃
will be good enough to fully characterise the variability in the posterior distribution.
Of course, ideally we would like to use a large enough number of simulations; often,
n_sim is set to 1000, although this is pretty much an arbitrary choice. Depending on the
underlying variability in the target distribution, this number may not be sufficient to
fully characterise the entire distribution (although it will usually estimate with suit-
able precision its core). Again, we refer to, for instance, [6] for more details. Examples
of the procedure described above are also presented in Sects. 2.3.1 and 2.4.1.

1.3 Basics of Health Economic Evaluation

Health economics is typically concerned with evaluating a set of interventions


t ∈ T = (0, 1, . . . , T ) that are available to treat a given condition. These may be
drugs, life-style modification or complex interventions—the general concepts of
economic evaluations apply regardless. We only consider here the most important
issues; comprehensive references include [1, 2, 6].
As mentioned above, the economic outcome is a multivariate response y = (e, c),
represented by a suitable clinical outcome (e.g. blood pressure or occurrence of
myocardial infarction), together with a measure of the costs associated with the given
intervention. On the basis of the available evidence (e.g. coming from a randomised
controlled trial or a combination of different sources, including observational data),
the problem is to decide which option is “optimal” and should then be applied to
the whole homogeneous population. In the context of publicly funded healthcare
systems (such as those in many European countries, including the UK National

5 All the references cited above discuss to great length the issues of convergence and autocorrelation

and methods (both visual and formal) to assess them. In the discussion here, we assume that the
actual convergence of the MCMC procedure to the relevant posterior or predictive distributions has
been achieved and checked satisfactorily, but again, we refer the reader to the literature mentioned
above for further discussion and details. We return to this point in Chap. 2 when describing the case
studies.

Health Service, NHS), this is a fundamental problem as public resources are finite
and limited and thus it is often necessary to prioritise the allocation of public funds
on health interventions.
Crucially, “optimality” can be determined by framing the problem in decision-
theoretic terms [1, 3, 5, 6], which implies the following steps.
• Characterise the variability in the economic outcome (e, c), which is typically
due to sampling, using a probability distribution p(e, c|θ), indexed by a set of
parameters θ. Within the Bayesian framework, uncertainty in the parameters is
also modelled using a probability distribution p(θ).
• Value the consequences of applying a treatment t, through the realisation of the
outcome (e, c) by means of a utility function u(e, c; t).
• Assess “optimality” by computing for each intervention the expectation of the
utility function, with respect to both “population” (parameters) and “individual”
(sampling) uncertainty/variability

U_t = E[u(e, c; t)].

In line with the precepts of (Bayesian) decision theory, given current evidence the
“best” intervention is the one associated with the maximum expected utility. This
is because it can be easily proved that maximising the expected utility is equivalent
to maximising the probability of obtaining the outcome associated with the highest
(subjective) value for the decision-maker [1, 6, 7, 10].
Under the Bayesian framework, U_t is dimensionless, i.e. it is a pure number, since
both sources of basic uncertainty have been marginalised out in computing the expec-
tation. Consequently, the expected utility allows a direct comparison of the alternative
options.
While the general setting is fairly straightforward, in practice, the application of
the decision-theoretic framework for health economic evaluation is characterised by
the following complications.
1. As in any Bayesian analysis, the definition of a suitable probabilistic description of
the current level of knowledge in the population parameters may be difficult and
potentially based on subjective judgement.
2. There is no unique specification of the method of valuation for the consequences of the interventions (i.e. what utility function should be chosen).
3. Typically, replacing one intervention with a new alternative is associated with
some risks such as the irreversibility of investments [25]. Thus, basing a decision
on current knowledge may not be ideal, if the available evidence-base is not
particularly strong/definitive (we elaborate on this point in Chap. 4).
As for the utility function, health economic evaluations are generally based on
the (monetary) net benefit [26]

u(e, c; t) = ke − c.

Here k is a willingness-to-pay parameter, used to put cost and benefits on the same
scale and represents the budget that the decision-maker is willing to invest to increase
the benefits by one unit. The main appeal of the net benefit is that it has a fixed
form, once the variables (e, c) are defined, thus providing easy guidance to valuation
of the interventions. Moreover, the net benefit is linear in (e, c), which facilitates
interpretation and calculations. Nevertheless, the use of the net benefit presupposes
that the decision-maker is risk neutral, which is by no means always appropriate in
health policy problems [27].
If we consider the simpler scenario where T = (0, 1), decision-making can be
equivalently effected by considering the expected incremental benefit (of treatment
1 over treatment 0)

EIB = U_1 − U_0    (1.5)

—of course, if EIB > 0, then U_1 > U_0 and therefore t = 1 is the optimal treatment (being associated with the highest expected utility).
In particular, using the monetary net benefit as utility function, (1.5) can be re-
expressed as

EIB = E[kΔe − Δc ] = kE[Δe ] − E[Δc ] (1.6)

where

Δe = E[e|θ_1] − E[e|θ_0] = μ_e^1 − μ_e^0

is the average increment in the benefits (from using t = 1 instead of t = 0) and, similarly,

Δc = E[c|θ_1] − E[c|θ_0] = μ_c^1 − μ_c^0

is the average increment in costs deriving from selecting t = 1.


If we define the Incremental Cost-Effectiveness Ratio as

ICER = E[Δc] / E[Δe]

then it is straightforward to see that when the net monetary benefit is used as utility
function, then

EIB > 0 if and only if

k > E[Δc]/E[Δe] = ICER, for E[Δe] > 0, or
k < E[Δc]/E[Δe] = ICER, for E[Δe] < 0,

and thus decision-making can be equivalently effected by comparing the ICER to the willingness-to-pay threshold.
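The rule can be illustrated numerically; the simulated values for (Δe, Δc) and the value of k below are entirely hypothetical:

```r
# EIB and ICER for simulated incremental benefits and costs
set.seed(5)
Delta_e <- rnorm(1000, mean = 0.5, sd = 0.1)    # hypothetical benefits
Delta_c <- rnorm(1000, mean = 6000, sd = 1000)  # hypothetical costs
k <- 25000                                      # willingness-to-pay
EIB <- k * mean(Delta_e) - mean(Delta_c)        # as in (1.6)
ICER <- mean(Delta_c) / mean(Delta_e)
# since E[Delta_e] > 0 here, EIB > 0 exactly when k > ICER
(EIB > 0) == (k > ICER)
```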

Notice that, in the Bayesian framework, (Δe , Δc ) are random variables, because
while sampling variability is being averaged out, these are defined as functions of
the parameters θ = (θ_1, θ_0). The second layer of uncertainty (i.e. on the population parameters) can be further averaged out. Consequently, E[Δe] and E[Δc] are
actually pure numbers and so is the ICER.
The two layers of uncertainty underlying the Bayesian decision-making process as
well as the relationships between the variables defined above can be best appreciated
through the inspection of the cost-effectiveness plane, depicting the joint distribution
of the random variables (Δe , Δc ) on the x− and y−axes, respectively.
Intuitively, the cost-effectiveness plane characterises the uncertainty in the parameters θ. This is represented by the dots populating the graph in Fig. 1.4a, which can be obtained, for example, by simulation. By taking the expectations over the marginal distributions for Δe and Δc, we then marginalise out this uncertainty and obtain a single point in the plane, which represents the “typical future consequence”. This is shown as the dot in Fig. 1.4b, where the underlying distribution has been shaded out.

Fig. 1.4 Cost-effectiveness plane, showing simulations from the joint (posterior) distribution of the random variables (Δe, Δc): (a) joint distribution for (Δe, Δc); (b) ICER overlain on the joint distribution of (Δe, Δc); (c) ICER lying inside the sustainability area; (d) ICER lying outside the sustainability area
Figure 1.4c also shows the “sustainability area”, i.e. the part of the cost-
effectiveness plane which lies below the line E[Δc ] = kE[Δe ], for a given value
of the willingness-to-pay k. Given the equivalence between the EIB and the ICER,
interventions for which the ICER is in the sustainability area are more cost-effective
than the comparator. Changing the value of the threshold may modify the decision as to whether t = 1 is the most cost-effective intervention. The EIB can be plotted as a function of k to identify the “break-even point”, i.e. the value of the willingness-to-pay at which the EIB becomes positive.
Finally, Fig. 1.4d shows the sustainability area for a different choice of the parameter k. In this case, because the ICER—and, for that matter, most of the entire distribution of (Δe , Δc )—lies outside the sustainability area, the new intervention
t = 1 is not cost-effective. We elaborate this further in Chap. 4, when discussing
probabilistic sensitivity analysis.
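As a sketch, the probability that the new intervention is cost-effective can be estimated as the proportion of simulations falling in the sustainability area (all values below are hypothetical):

```r
# Share of simulated (Delta_e, Delta_c) points in the sustainability area,
# i.e. the points below the line Delta_c = k * Delta_e
set.seed(6)
Delta_e <- rnorm(1000, mean = 0.5, sd = 0.3)    # hypothetical benefits
Delta_c <- rnorm(1000, mean = 6000, sd = 3000)  # hypothetical costs
k <- 25000                                      # willingness-to-pay
p_ce <- mean(Delta_c < k * Delta_e)             # probability of cost-effectiveness
```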

1.4 Doing Bayesian Analysis and Health Economic Evaluation in R

The general process of doing a Bayesian analysis (with a view to using the results of
the model to perform an economic evaluation) can be broken down into several
steps. We review these steps in the following, relating the process to its practical
features and assuming R as the software of choice.

[Figure: four connected blocks. "Statistical model": estimates relevant population parameters θ; varies with the type of available data (and statistical approach!). "Economic model": combines the parameters to obtain a population average measure for costs and clinical benefits; varies with the type of available data and statistical model used. "Decision analysis": summarises the economic model by computing suitable measures of "cost-effectiveness"; dictates the best course of action, given current evidence; standardised process. "Uncertainty analysis": assesses the impact of uncertainty (e.g. in parameters or model structure) on the economic results; fundamentally Bayesian!]

Fig. 1.5 A graphical representation of the process of health economic evaluation based on cost-effectiveness or cost-utility analysis

Figure 1.5 shows a graphical representation of this process. The process starts with
a statistical model that is used to estimate some relevant parameters, which are then
fed to an economic model with the objective of obtaining the relevant population
summaries indicating the incremental benefits and costs for a given intervention.
These are in turn used as the basis for the decision analysis, as described above. The
final aspect is represented by the evaluation of how the uncertainty that characterises
the model impacts the final decision-making process. We describe each of these
building blocks and their relevance to the analysis in the following.

1.4.1 Pre-processing the Data

In this step, we typically create, aggregate and modify the original variables available
in the dataset(s) that we wish to analyse. In the context of economic evaluation this
may be needed because the outcomes of interest may have to be computed as functions
of other observable variables—for example, total costs could be obtained as the sum
of several cost items (e.g. service provision, acquisition of the intervention, additional
treatments and so on).
In any case, this step, typically performed directly in R, serves to generate a data
list that contains the values of all the variables that are of interest and should be
modelled formally. The complexity of this data list depends on the nature of the
original data: for example, when dealing with experimental evidence (e.g. coming
from an RCT), we often model the quantities of interest directly (i.e. the variables of
costs and clinical effectiveness or utility).
For example, in the context of an RCT, we would be likely to directly observe the
variables (eit, cit) for individuals i = 1, ..., nt in each treatment arm t = 0, ..., T
and could model them so that the relevant parameters are their population averages
(μet, μct)—see for instance Sect. 5.4 in [6].
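As a minimal sketch of this pre-processing step (the data frame and variable names below are purely illustrative and not taken from the book), total costs can be computed from individual cost items and the result packaged into a data list:

```r
# Hypothetical two-arm trial data: 'eff' is a generic effectiveness measure;
# total costs are the sum of two illustrative cost items
trial <- data.frame(
  arm    = rep(c(0, 1), each = 5),
  eff    = c(0.71, 0.82, 0.66, 0.79, 0.74, 0.81, 0.85, 0.77, 0.88, 0.80),
  c.drug = c(rep(110, 5), rep(520, 5)),
  c.hosp = c(300, 0, 450, 0, 120, 0, 0, 200, 0, 90)
)
trial$cost <- trial$c.drug + trial$c.hosp   # total cost per individual

# Data list containing all the variables to be modelled formally
data.list <- list(
  e   = trial$eff,
  c   = trial$cost,
  arm = trial$arm + 1,   # 1 = status quo, 2 = new intervention
  n   = nrow(trial)
)
```

A list in this format is what would then be passed to the MCMC sampler, e.g. via R2jags or R2OpenBUGS.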
In other cases, for example when using aggregated data, it is necessary to build
a more complex model that directly considers ancillary variables (which may be
observed); these are then manipulated to derive the relevant economic outcomes.
This type of modelling is often referred to as "decision-analytic" and typically
amounts to creating a set of relationships among a set of random quantities. A
decision tree may be used to combine measures of costs and effectiveness (e.g. in terms
of reduction in the occurrence of adverse events)—examples of this strategy are in
Sect. 5.5 in [6]. We also consider models of this kind in Chap. 2 and in Sect. 4.4.3.1.

1.4.2 Building and Coding the Analysis Model

This is the most mathematical and, in many ways, creative part of the process; according
to the nature and availability of the data, we need to create a suitable probabilistic

model to describe uncertainty. Technically, this step is required even outside of the
Bayesian framework that we adopt. Of course, under the Bayesian paradigm, all the
principles described in Sect. 1.2.2 should be applied. Again, depending on the nature
of the data, the model may be more or less complex and encode a larger or smaller
number of assumptions/probabilistic features.
Assuming that the method of inference is some sort of simulation-based procedure
such as MCMC, this step is usually performed by first "translating" the model into a
text file, which contains the description of the assumptions in terms of distributional
and deterministic relationships among the variables. A "frequentist counterpart" to
this step would be the creation of a script which codes the modelling assumptions.
We provide examples of this process under a full Bayesian framework in Chap. 2,
where we also briefly discuss the issues related with convergence and model
checking.
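As an illustration (the model below is a toy example and not one of the models used in this book), such a text file can be created directly from the R session:

```r
# A toy JAGS/BUGS model for individual-level costs and effects;
# structure and priors are illustrative only
model.string <- "
model {
  for (i in 1:n) {
    e[i] ~ dnorm(mu.e[arm[i]], tau.e)   # effectiveness
    c[i] ~ dnorm(mu.c[arm[i]], tau.c)   # costs
  }
  for (t in 1:2) {
    mu.e[t] ~ dnorm(0, 0.0001)          # vague priors on the means
    mu.c[t] ~ dnorm(0, 0.0000001)
  }
  tau.e ~ dgamma(0.01, 0.01)
  tau.c ~ dgamma(0.01, 0.01)
}"
writeLines(model.string, con = "model.txt")   # save the model to a text file
```

The resulting file is then passed to the MCMC sampler (e.g. as the model.file argument of the R2jags function jags).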

1.4.3 Running the Model

At this point, we can “run the model”, which provides us with an estimation of
the quantities of interest. As we have repeatedly mentioned, these may be directly
the average costs and benefits under different interventions, or perhaps some other
quantities (e.g. the transition probabilities in a Markov model setting).
In our ideal Bayesian process, this step is performed by specialised software (e.g.
JAGS or BUGS) to run the MCMC procedure, which we typically interface with R.
In other words, after we have created the data list and the text file with the model
assumptions, we call the MCMC sampler directly from R. This will then take over and
run the actual analysis, at the end of which the results will be automatically exported
to the R workspace in the form of a suitable object containing, e.g., the samples from
the posterior or predictive distributions of interest. We show some examples of this
procedure in Chap. 2.
Once the model has run, the next step involves checking its performance (e.g.
in terms of convergence, if the procedure is based, as it often is, on an MCMC
algorithm). There are several diagnostic tools for MCMC, most of which can be
implemented directly in R. Thus, again following our ideal process, at this point the
user will have regained control of the R session in which the simulations from the
model are stored. Standard methods of analysis of convergence and autocorrelation
are described in detail in many specific texts, for instance [6, 14, 18].
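For instance, the coda package implements most of the standard diagnostics; the sketch below applies them to two synthetic chains standing in for genuine MCMC output (a real R2jags or R2OpenBUGS object can be converted with as.mcmc):

```r
library(coda)

set.seed(1)
# Two synthetic "chains" used here only to demonstrate the diagnostic functions
chain1 <- mcmc(matrix(rnorm(2000), ncol = 2,
                      dimnames = list(NULL, c("mu.e", "mu.c"))))
chain2 <- mcmc(matrix(rnorm(2000), ncol = 2,
                      dimnames = list(NULL, c("mu.e", "mu.c"))))
sims <- mcmc.list(chain1, chain2)

gelman.diag(sims)     # potential scale reduction factor (values near 1 are good)
effectiveSize(sims)   # effective sample size, accounting for autocorrelation
autocorr.diag(sims)   # autocorrelation at increasing lags
```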
The combination of the steps described so far can be thought of as the Statistical
model box of Fig. 1.5.

1.4.4 Post-processing the Results

Perhaps even more importantly, from the health economic point of view, depending
on the type of data available, the results of the model may not directly provide
the information or variables needed to perform the cost-effectiveness analysis. For
instance, while individual level data may be used to estimate directly the average
cost and benefits, using aggregated data may mean that the model is estimating some
parameters which are not necessarily the actual measures of clinical benefit and cost
(e, c).
Thus, it will be often necessary to combine the quantities estimated from the model
using logical relationships that define (e, c). For example, the model may estimate
the posterior distribution for λt and γ, indicating respectively the treatment-specific
length of hospitalisation for a given disease and the cost associated with it. Neither
of these can be directly used as a measure of cost associated with the treatment being
considered, but we may construct a new variable ct = λt γ to represent it.
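With the posterior draws available in the R workspace, this construction is immediate; in the sketch below the draws are simulated for illustration (the distributions are not taken from a real model):

```r
set.seed(8)
n.sims <- 1000
# Simulated draws standing in for posterior samples of lambda[t] and gamma:
# length of hospitalisation (days) under t = 0, 1 and daily unit cost
lambda <- cbind(rgamma(n.sims, shape = 20, rate = 2),
                rgamma(n.sims, shape = 16, rate = 2))
gamma  <- rlnorm(n.sims, meanlog = log(500), sdlog = 0.2)

# Treatment-specific cost: each column of lambda multiplied by gamma
cost <- lambda * gamma   # n.sims x 2 matrix of simulated costs
```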
This step is described by the Economic model box in Fig. 1.5 and performing it
in R can be particularly effective—once the full posterior distributions are available
to the R workspace, calculations such as the one showed above are generally trivial.
Figure 1.6 shows a graphical representation of how the Statistical model and the
Economic model described in Fig. 1.5 are performed and combined. Basically, the
whole process begins with the creation of a model, which describes how the intervention
is applied to the relevant population, what the effects are (in terms of benefits
and costs) and what variables or parameters are considered. This step may be done
with "pen-and-paper" and is in fact the most creative part of the whole exercise. Often,
we can rely on simple structures, such as decision trees (as in the top-left corner
of Fig. 1.6), while sometimes a more complicated description of the underlying
reality is required. Again, notice that this step is required irrespective of the statistical
approach considered.
In the full Bayesian framework that we advocate, these assumptions will be translated
into code, e.g. using JAGS or BUGS, as shown in the top-right corner of Fig. 1.6;
in a non-Bayesian context, other software may be used (e.g. SAS, Stata or R), but in
a sense this procedure of scripting is common to all statistical approaches and
makes the model easily replicable. We note here (and return to this point in
Sect. 3.1) that using less sophisticated tools such as Microsoft Excel may render
this step less straightforward.
Once the model has been coded up, it can be run to obtain the relevant estimates
(bottom-right corner of Fig. 1.6). In this case, we consider an R script that defines
the data and interfaces with the MCMC software (in this case JAGS) to compute the
posterior distributions of the model parameters. Of course, this step needs careful
consideration, e.g. it is important to check model convergence and assess whether
the output is reasonable.
Finally (bottom-left corner of Fig. 1.6), we can post-process the model output to
create the relevant quantities to be passed to the Economic model. In this case, we
use again R to combine the model parameters into suitable variables of benefits and

[Figure 1.6: four panels, connected by arrows.

Top-left (Economic outcomes): a decision tree for a chemotherapy example. For each treatment t (standard vs new), out of N patients, SEt experience blood-related side effects (probability πt) and N − SEt do not (probability 1 − πt; outcomes et = 1, ct = cdrugt). Among those with side effects, At require ambulatory care (probability γ; outcomes et = 0, ct = cdrugt + camb) and the remaining SEt − At require hospital admission (probability 1 − γ; outcomes et = 0, ct = cdrugt + chosp).

Top-right: the corresponding JAGS model (saved to 'modelChemo.txt')

model {
  pi[1] ~ dbeta(a.pi,b.pi)
  pi[2] <- pi[1]*rho
  rho ~ dnorm(m.rho,tau.rho)
  gamma ~ dbeta(a.gamma,b.gamma)
  c.amb ~ dlnorm(m.amb,tau.amb)
  c.hosp ~ dlnorm(m.hosp,tau.hosp)
  for (t in 1:2) {
    SE[t] ~ dbin(pi[t],N)
    A[t] ~ dbin(gamma,SE[t])
    H[t] <- SE[t] - A[t]
  }
}

Bottom-right: the R script that calls JAGS in the background to run the model

library(R2jags)
data <- list("a.pi","b.pi","a.gamma","b.gamma",
             "m.amb","tau.amb","m.hosp","tau.hosp",
             "m.rho","tau.rho","N")
filein <- "modelChemo.txt"
params <- c("pi","gamma","c.amb","c.hosp",
            "rho","SE","A","H")
inits <- function()
  list(pi=c(runif(1),NA),gamma=runif(1),
       c.amb=rlnorm(1),c.hosp=rlnorm(1),
       rho=runif(1))
chemo <- jags(data,inits,params,n.iter=20000,
              model.file=filein,n.chains=2,n.burnin=9500,
              n.thin=42,DIC=FALSE)
print(chemo,digits=3,intervals=c(0.025,0.975))
attach.jags(chemo)

Bottom-left: the R code that creates the variables of cost and effectiveness

# Creates the variables of cost & effectiveness
e <- c <- matrix(NA,1000,2)
e <- N - SE
for (t in 1:2) {
  c[,t] <- c.drug[t]*(N-SE[,t]) +
           (c.amb+c.drug[t])*A[,t] +
           (c.hosp+c.drug[t])*H[,t]
}]
Fig. 1.6 Graphical representation of the process of (Bayesian) health economic evaluation in terms
of the Statistical model and Economic model of Fig. 1.5. First (top-left corner), a model is created,
which describes the possible clinical pathways and the effects of a given intervention in terms of
costs and benefits. Then (top-right corner), the assumptions encoded in the model are translated
into suitable code. The model is then run (bottom-right corner), e.g. by writing suitable R code
to define the data and interface with a specific software (e.g. JAGS or BUGS). Finally (bottom-left
corner), the results of the model are post-processed in R to produce the economic summaries that
are then fed to the “Decision analysis” block

costs (e and c, respectively). This step basically amounts to performing the Economic
model block of Fig. 1.5.

1.4.5 Performing the Decision Analysis

The rest of the decision process, represented by the Decision analysis box in Fig. 1.5,
is probably the easiest; in fact, once the relevant quantities have been estimated,
the optimal decision given the current knowledge can be derived by computing the
summaries described in Sect. 1.3; the results of this process may be depicted as in
panel (b) of Fig. 1.4.
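As a brief sketch (with simulated draws standing in for the actual posterior simulations of (Δe, Δc)), these summaries can be computed as follows:

```r
set.seed(2)
n.sims  <- 1000
Delta.e <- rnorm(n.sims, 0.3, 0.1)     # simulated effectiveness differentials
Delta.c <- rnorm(n.sims, 5000, 1500)   # simulated cost differentials

k    <- 25000                          # willingness-to-pay threshold
ib   <- k * Delta.e - Delta.c          # incremental benefit, per simulation
EIB  <- mean(ib)                       # expected incremental benefit
ICER <- mean(Delta.c) / mean(Delta.e)  # incremental cost-effectiveness ratio
prob <- mean(ib > 0)                   # probability of cost-effectiveness at k
```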
The dashed arrows connecting the Statistical model to the Economic model
through the Uncertainty analysis box in Fig. 1.5 also describe how this process
occurs: if the uncertainty described by the posterior distributions is marginalised
(averaged) out, then the analysis is performed on the "straight line" from the
statistical to the decision analysis. This represents the decision process under current
uncertainty and identifies the best course of action today.

On the other hand, if uncertainty is not marginalised, then we can analyse the
"potential futures" separately, e.g. as in panels (a), (c) and (d) of Fig. 1.4. A full
Bayesian approach allows us to perform directly this form of "probabilistic sensitivity
analysis", which evaluates the impact of parameter uncertainty on the optimal
decision.

1.4.6 Using BCEA

The main objective of this book is to describe how health economic evaluation can be
systematised and performed routinely and thoroughly using the R package BCEA—the
remaining chapters will present the features of the package in detail, using worked
examples that go through the several steps of the economic analysis.
In a sense, BCEA plays its role in the economic evaluation after the statistical model
has been fitted. While it is not strictly necessary to adopt a Bayesian approach in
order to use BCEA (we return to this point in Sect. 3.1 and in Chap. 5), the entire
book is based on the premise that the researchers have indeed performed a Bayesian
analysis of their clinical and economic data. We stress, however, that BCEA only
deals with the decision and uncertainty analysis.
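To anticipate what follows, a minimal BCEA analysis looks like the code below, where e and c are assumed to be matrices of simulations (one row per simulation, one column per intervention) produced by the statistical and economic model:

```r
library(BCEA)

# e, c: matrices of simulated effectiveness and cost values, assumed
# to have been produced by the statistical + economic model
m <- bcea(e = e, c = c, ref = 2,
          interventions = c("Status quo", "New intervention"))
summary(m, wtp = 25000)   # cost-effectiveness summary at a given threshold
```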

References

1. A. Briggs, M. Sculpher, K. Claxton, Decision Modelling for Health Economic Evaluation


(Oxford University Press, Oxford, 2006)
2. A. Willan, A. Briggs, The Statistical Analysis of Cost-effectiveness Data (Wiley, Chichester,
2006)
3. A. O’Hagan, J. Stevens, Health Economics 10, 303 (2001)
4. A. O’Hagan, J. Stevens, J. Montmartin, Statistics in Medicine 20, 733 (2001)
5. D. Spiegelhalter, K. Abrams, J. Myles, Bayesian Approaches to Clinical Trials and Health-Care Evaluation (Wiley, Chichester, 2004)
6. G. Baio, Bayesian Methods in Health Economics (Chapman Hall/CRC Press, Boca Raton,
2012)
7. J. Bernardo, A. Smith, Bayesian Theory (Wiley, New York, 1999)
8. D. Lindley, The Statistician 49, 293 (2000)
9. C. Robert, The Bayesian Choice, 2nd edn. (Springer, New York, 2001)
10. D. Lindley, Understanding Uncertainty (Wiley, New York, 2006)
11. B. Carlin, T. Louis, Bayesian Methods for Data Analysis, 3rd edn. (Chapman Hall/CRC, Boca
Raton, 2009)
12. S. Jackman, Bayesian Analysis for the Social Sciences (Wiley, New York, 2009)
13. R. Christensen, W. Johnson, A. Branscum, T. Hanson, Bayesian Ideas and Data Analysis
(Chapman Hall/CRC, Boca Raton, 2011)
14. A. Gelman, J. Carlin, H. Stern, D. Dunson, A. Vehtari, D. Rubin, Bayesian Data Analysis, 3rd
edn. (Chapman Hall/CRC, New York, 2013)
15. G. Baio, Statistics in Medicine 33(11), 1900 (2013)
16. H. Raiffa, H. Schlaifer, Applied Statistical Decision Theory (Harvard University Press, Boston,
1961)

17. W. Gilks, S. Richardson, D. Spiegelhalter, Markov Chain Monte Carlo in Practice (Chapman
Hall, London, 1996)
18. D. Gamerman, Markov Chain Monte Carlo (Chapman and Hall, London, 1997)
19. C. Robert, G. Casella, Monte Carlo Statistical Methods, 2nd edn. (Springer, New York, 2004)
20. A. Gelman, J. Carlin, H. Stern, D. Rubin, Bayesian Data Analysis, 2nd edn. (Chapman Hall,
New York, 2004)
21. C. Robert, G. Casella, Introducing Monte Carlo Methods with R (Springer, New York, 2010)
22. S. Brooks, A. Gelman, G. Jones, X. Meng, Handbook of Markov Chain Monte Carlo (Chapman
Hall/CRC, Boca Raton, 2011)
23. C. Robert, G. Casella, Statistical Science 26, 102 (2011)
24. N. Welton, A. Sutton, N. Cooper, K. Abrams, Evidence Synthesis for Decision Making in
Healthcare (Wiley, Chichester, 2012)
25. K. Claxton, J. Health Econ. 18, 342 (1999)
26. A. Stinnett, J. Mullahy, Medical Decision Making 18(Suppl), S68 (1998)
27. B. Koerkamp, M. Hunink, T. Stijnen, J. Hammitt, K. Kuntz, M. Weinstein, Medical Decision
Making 27(2), 101 (2007)
Chapter 2
Case Studies

2.1 Introduction

In this chapter, we present the two case studies that are used as running examples
throughout the book.
The first example considers a decision-analytic model. This is a popular tool in
health economic evaluation, when the objective is to compare the expected costs and
consequences of decision options by synthesising information from multiple sources
[1]. Examples of decision-analytic models include decision trees or Markov models.
As these models are based on several different sources of information, they can offer
decision-makers the best available information to reach their decision, as opposed to
modelling based on a single randomised clinical trial (RCT). Additionally, RCTs may
be limited in scope (e.g. in terms of the temporal follow-up) and thus may not be ideal
for characterising the long-term consequences of applying an intervention. Thus, even
when RCT data are available, a health economic evaluation is often extended to a
decision-analytic approach to allow decision-makers to capture information about
the long-term effects and costs.
The second example is a multi-decision problem. This means that, in contrast to
standard economic modelling, where a new intervention t = 1 is compared to the
status quo t = 0, we consider here T = 4 potential interventions. This is naturally
linked to the wider topic of network meta-analysis [2], an extension of standard
meta-analytic methods that allows researchers to pool evidence coming from
different sources and include both direct and indirect comparisons. This is particularly
relevant when head-to-head evidence is only available for some of the interventions
under consideration. Suitable statistical modelling can be used to make inference
about the direct comparisons for which no evidence is available, by using the indirect ones.
For both examples, we first introduce the general background, discussing the disease
area and the interventions under consideration. We then describe the assumptions
underlying the statistical model (e.g. in terms of distributions for the observed/observable
variables and the unobservable parameters). We also show how these can
be translated into suitable code to perform a full Bayesian analysis and obtain

© Springer International Publishing AG 2017
G. Baio et al., Bayesian Cost-Effectiveness Analysis with the R package BCEA, Use R!, DOI 10.1007/978-3-319-55718-2_2

samples from the relevant posterior distributions (see Sect. 1.2). Finally, we demonstrate
any post-processing required to produce the relevant inputs for the economic
model (e.g. the population average differential of costs and benefits—see Sect. 1.4).
In the rest of the book, we refer (rather interchangeably) to OpenBUGS [3] and JAGS
[4], arguably the most popular software to perform Bayesian analysis by means of
Markov Chain Monte Carlo simulations.
The examples are then used to showcase the facilities of BCEA and to explain
the process of performing an economic evaluation in R, once the statistical model
has been fitted. It is important to note (and we will expand on this point in Chap. 3)
that BCEA can be used to perform cost-effectiveness analysis when the full statistical
model has not been fitted within the Bayesian framework. Nevertheless, we strongly
advocate the use of a Bayesian framework and thus we have included these examples
to demonstrate a full Bayesian analysis.
From this chapter onwards, the text includes frequent code blocks showing
how to execute commands and use the BCEA package. R code is presented in code
blocks in the text, with each new line starting with the symbol >; indentation indicates
lines continuing from the previous statement. Hash symbols (#) in code blocks
indicate comments. In-line words formatted in mono-spaced font (such as this)
indicate code, for example short commands or function parameters.

2.2 Preliminaries: Computer Configuration

In this section, we briefly review the ideal computer configuration we assume in order to
run the examples later in this chapter and in the rest of the book. It is difficult to guarantee
that these instructions will remain valid for every future release of the programmes
we consider here, although they have been tested under the current releases.
We assume that the user’s computer has the following software installed:
• R and the package BCEA. Other optional packages (e.g. R2OpenBUGS, R2jags,
reshape, plyr or INLA) may need to be installed;
• OpenBUGS or JAGS. These are necessary to perform the full Bayesian analyses we
discuss in the rest of the book. It is not necessary to install both;
• The R front-end, for example RStudio (available for download at
https://www.rstudio.com/). This is also optional and all the work can be done
using the standard R terminal;
• A spreadsheet calculator, e.g. MS Excel or the freely available LibreOffice,
which is a decent surrogate and can be downloaded at https://www.libreoffice.org/.
In the following, we provide some general instructions, for MS Windows, Linux
or Mac OS operating systems.

2.2.1 MS Windows Users

For MS Windows users, the set-up should be fairly easy and amounts to the following
steps:
1. Install OpenBUGS
• Download the latest release (currently it is version 3.2.3, stored in the file
OpenBUGS323setup.exe) from http://openbugs.net/w/Downloads and run it
by double-clicking on it.
2. Install R
a. Download R from the Comprehensive R Archive Network (CRAN): http://cran.r-project.org/bin/windows/ (click on the link "install R for the first time").
b. When the process is finished, open R and type in the terminal the following
command.
> install.packages("BCEA")
> install.packages("R2OpenBUGS")

These commands will download and install the packages BCEA and
R2OpenBUGS. The latter is needed to interface OpenBUGS with R. Follow the
on-screen instructions (you will be asked to select a mirror from which to
obtain the necessary files). Notice that the command
install.packages("Name_of_the_package") can be used to install any other R package.
3. (Optional): Install JAGS
a. Download the installer from the webpage http://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Windows/
by clicking on the latest available executable file (currently, JAGS-4.2.0.exe). Executing this file will install JAGS
on the user's machine.
b. In the R terminal type the command
> install.packages("R2jags")

This will install the package R2jags, which allows R to interface with JAGS.

2.2.2 Linux or Mac OS Users

2.2.2.1 Installing R and BCEA

Linux or Mac OS users should follow slightly different approaches. The installation
of R is pretty much the same as for MS Windows users. From the webpage
http://cran.r-project.org/ select the relevant operating system (Linux or Mac OS) and then
the relevant version (e.g. debian, redhat, suse or ubuntu, for Linux). Follow the
instructions to install the software. Once this is done, open R and install the package
BCEA following the process described above.

2.2.2.2 Installing OpenBUGS and JAGS in Linux

OpenBUGS runs natively in Linux and so it can be installed following the instructions
given at http://openbugs.net/w/Downloads. First, download the most recent version
of the source file, currently OpenBUGS-3.2.3.tar.gz. Then open a Linux terminal
and follow these steps:
1. Unpack the file and move to the newly created directory OpenBUGS-3.2.3 by
typing the following commands.
tar zxvf OpenBUGS-3.2.3.tar.gz
cd OpenBUGS-3.2.3

2. Compile and install the software

./configure
make
sudo make install

Notice that if the user does not have administrative access, this command will
fail. A possible workaround is to specify a location to which OpenBUGS should be
installed that is owned by the user, for example
./configure --prefix=/home/user/myfolder
make
make install
(in this case sudo is no longer needed, since the target folder is owned by the user)

— it is possible to check permissions by using the Unix command

ls -ls /home/user

which returns a list of the folders and files contained in the folder /home/user.
This will look something like
  4 drwxr-xr-x  9 user user   4096 Jul  5 17:02 Desktop
  4 drwx------ 25 user user   4096 Jul 14 09:23 myfolder
196 -rw-rw-r--  1 root root 197534 Jul 11 16:08 some_file.png
...
and, in this case, the folder myfolder does belong to the user user and thus the
installation of OpenBUGS in that folder would be completed successfully.
It is also possible to install JAGS, following these steps:
1. Download the latest tar.gz file (currently, JAGS-4.2.0.tar.gz) from the webpage http://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Source/.
2. Open a Linux terminal window, extract the content of the archive file and move
to the newly created folder JAGS-4.2.0
tar xzvf JAGS-4.2.0.tar.gz
cd JAGS-4.2.0

3. Run the configuration and install (administrative rights are only needed to install under /usr)

./configure --prefix=/usr
make
sudo make install

4. Clean up the unnecessary files and folders

cd ..
sudo rm -fr JAGS-4.2.0
rm JAGS-4.2.0.tar.gz

5. Install R2jags from the R terminal, as discussed in Sect. 2.2.1.

2.2.2.3 Installing OpenBUGS and JAGS in Mac OS

While OpenBUGS does not run natively under Mac OS, a possible workaround is to
install a hardware virtualisation software such as Parallels Desktop for Mac OS
(http://www.parallels.com/uk/products/desktop/), or a "compatibility layer", such as
wine (https://www.winehq.org/download/), which allow Windows applications to be
run from a Mac.
Conversely, JAGS does run natively under Mac OS and can be installed using
the following steps:
1. Download the latest .dmg file (currently, JAGS-4.2.0.dmg) from https://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Mac%20OS%20X/
2. Double click the .dmg file to make its content available (the name will show up
in the Finder sidebar), usually a window opens showing the content as well;
3. Drag the application from the .dmg window into /Applications to install (you
may need an administrator password);
4. Wait for the copy process to finish;
5. Eject the .dmg (by clicking the eject button in the Sidebar);
6. Delete the .dmg from Downloads.
Several tutorials are available online to guide the user through the process of
installing and using both OpenBUGS and JAGS.

2.3 Vaccine

Consider an infectious disease, for instance influenza, for which a new vaccine has
been produced. Under the current management of the disease, some individuals treat
the infection by taking over-the-counter (OTC) medications. Some subjects visit their
doctor and, depending on the gravity of the infection, may receive treatment with
antiviral drugs, which usually cure the infection. However, in some cases complications
may occur. Minor complications will need a second doctor's visit, after which
the patients become more likely to receive antiviral treatment. Major complications

are represented by pneumonia and can result in hospitalisation and possibly death. In
this scenario, the costs generated by the management of the disease are represented by
OTC medications, doctor visits, the prescription of antiviral drugs, hospital episodes
and indirect costs such as time off work.
The focus is on the clinical and economic evaluation of the policy that makes the
vaccine available to those who wish to use it (t = 1) against the null option (t = 0)
under which the vaccine will remain unavailable. More details of this example can
be found in [5] and references therein.

2.3.1 (Bayesian) Statistical Model

2.3.1.1 Assumptions

In a population made of N individuals, consider the number of patients taking up the
vaccine when available, V1 ∼ Binomial(φ, N), where φ is the vaccine coverage rate.
Obviously, V0 = 0, as the vaccine is not available under the status quo. For convenience,
we denote the total number of patients in the two groups, vaccinated (v = 1)
and non-vaccinated (v = 0), by ntv with nt1 := Vt and nt0 := N − Vt, respectively.
The relevant clinical outcomes are: j = 1 influenza infection; j = 2 doctor visit;
j = 3 minor complications; j = 4 major complications; j = 5 hospitalisation; j = 6
death; and j = 7 adverse events of influenza vaccination. For each clinical outcome
j, βj is its baseline rate of occurrence and ρv is the proportional reduction in the
chance of infection due to the vaccine. Vaccinated patients (v = 1) will experience a
reduction in the chance of infection by a factor ρ1; conversely, for v = 0, individuals
are not vaccinated and so the chance of infection is just the attack rate β1. This is
equivalent to setting ρ0 := 0.
Under these assumptions, the number of individuals becoming infected in each
group is Itv ∼ Binomial(πv, ntv), where πv := β1(1 − ρv) is the probability of infection.
Among the infected subjects, the number visiting a doctor for the first time is
GPtv(1) ∼ Binomial(β2, Itv). Using a similar reasoning, among those who have had
a doctor visit, we can define: the number of individuals with minor complications
GPtv(2) ∼ Binomial(β3, GPtv(1)); the number of those with major complications Ptv ∼
Binomial(β4, GPtv(1)); the number of hospitalisations Htv ∼ Binomial(β5, GPtv(1));
and the deaths Dtv ∼ Binomial(β6, GPtv(1)). The number of individuals experiencing
adverse events due to vaccination is computed as AEtv ∼ Binomial(β7, ntv)—
obviously, this will be identically 0 under the status quo (t = 0) and among those
individuals who choose not to take the vaccine up in the vaccination scenario
(t = 1, v = 0).
The model also includes other parameters, such as the chance of receiving a
prescription after the first doctor visit (γ1 ) or following minor complications (γ2 ) for
a number of antiviral drugs (δ); of taking OTC medications (ξ); and of remaining
off-work (η) for a number of days (λ). Combining these with the relevant populations

at risk, we can then derive the expected number of individuals experiencing each of
these events.
As for the costs, we consider the relevant resources as h = 1: doctor visits; h = 2:
hospital episodes; h = 3: vaccination; h = 4: time to receive vaccination; h = 5: days
off work; h = 6: antiviral drugs; h = 7: OTC medications; h = 8: travel to receive
vaccination. For each, we define ψh to represent the associated unit cost for which we
assume informative lognormal distributions, a convenient choice to model positive,
continuous variables such as costs.
Finally, we include in the model suitable parameters to represent the loss in quality
of life generated by the occurrence of the clinical outcomes. Let ωj represent the
QALYs lost when an individual experiences the j-th outcome. We assume that doctor
visits do not generate loss in QALYs and therefore set ω2 = ω3 := 0; the remaining
ωj's are modelled using informative lognormal distributions.
The assumptions encoded by this model are as follows. We consider a population parameter θ = (θ0, θ1), with the two components defined as θ0 = (βj, γ1, γ2, δ, ξ, η,
λ, ψh, ωj) and θ1 = (φ, βj, ρv, γ1, γ2, δ, ξ, η, λ, ψh, ωj). We assume that the components of θ have the distributions specified in Table 2.1, which are derived using
suitable “hyper-parameters” that have been set to encode the knowledge D available
from previous studies and expert opinion. For example, the parameter φ identifies
a probability (the vaccine coverage) and we may have information from past seasons suggesting that this has been estimated to be between 25 and 63%; this can be
translated into a Beta distribution whose parameters can be determined so that roughly
95% of the probability mass lies between these two values. See [5] for a more detailed
discussion of this point.
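One way to obtain such Beta hyper-parameters is to solve numerically for the pair (a, b) whose 2.5% and 97.5% quantiles match the stated interval. The helper below is a hypothetical sketch of that idea (it is not part of BCEA or of this book's utilities, which work from the mode and an upper quantile instead):

```r
# Hypothetical helper (not the book's betaPar2): find the Beta(a, b) whose
# 2.5% and 97.5% quantiles match a given interval, by minimising the squared
# distance between target and actual quantiles. Optimising on the log-scale
# keeps a and b positive throughout the search.
beta.from.interval <- function(low, upp) {
  loss <- function(logpar) {
    a <- exp(logpar[1]); b <- exp(logpar[2])
    sum((qbeta(c(.025, .975), a, b) - c(low, upp))^2)
  }
  est <- optim(c(0, 0), loss, control = list(maxit = 5000, reltol = 1e-12))$par
  list(a = exp(est[1]), b = exp(est[2]))
}
beta.from.interval(.25, .63)   # values close to the Beta(11.31, 14.44) above
```

The quadratic loss is zero exactly when both quantiles match, so any general-purpose optimiser will do here.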
It is easy to check that the assumptions in terms of the interval estimates for the
parameters are consistent with the choice of distributions in R, for example using
something like the following code:
> phi <- rbeta(100000, 11.31, 14.44)
> c(quantile(phi, .025), quantile(phi, .5), quantile(phi, .975))
     2.5%       50%     97.5%
0.2581039 0.4368227 0.6292335

2.3.1.2 Coding the Assumptions into BUGS/JAGS Language

The assumptions and the model structure defined above can be translated into suitable
code to perform a MCMC analysis and obtain estimates from the posterior distribu-
tions of all the relevant parameters. For example, we could write the following code
to be used with OpenBUGS or JAGS:
model {
# 1. Define the number of people in each group n[v,t], where t=1,2 is
#    status quo vs vaccination and v=1,2 is non-vaccinated vs vaccinated

# t=1: If the vaccine is not available, no one will use it
# number of vaccinated in the population
V[1] <- 0
# number of individuals in the two groups
n[1,1] <- N - V[1]       # non vaccinated
n[2,1] <- V[1]           # vaccinated

# t=2: When the vaccine is available, some people will use it but some won't
# number of vaccinated in the population
V[2] ~ dbin(phi, N)
# number of individuals in the two groups
n[1,2] <- N - V[2]       # non vaccinated
n[2,2] <- V[2]           # vaccinated

# 2. Vaccination coverage
phi ~ dbeta(a.phi, b.phi)

# 3. Probability of experiencing the clinical outcomes (in total, N.outcomes = 7)
#    1. Influenza infection
#    2. GP visits
#    3. Minor complications (repeat visit)
#    4. Major complications (pneumonia)
#    5. Hospitalisations
#    6. Death
#    7. Adverse events due to vaccination
for (r in 1:4) {
   beta[r] ~ dbeta(a.beta[r], b.beta[r])
}
for (r in 5:6) {
   beta[r] ~ dlnorm(a.beta[r], b.beta[r])
}
beta[N.outcomes] ~ dbeta(a.beta[N.outcomes], b.beta[N.outcomes])

# 4. Vaccine effectiveness in reducing influenza (for v=1, it is obviously 0)
rho[1] <- 0
rho[2] ~ dlnorm(mu.rho, tau.rho)

# 5. Probability of influenza infection
for (t in 1:2) {
   for (v in 1:2) {
      pi[t,v] <- beta[1]*(1 - rho[v])
   }
}

# 6. Number of patients experiencing the events for both interventions
#    & compliance groups
for (t in 1:2) {
   for (v in 1:2) {
      Infected[t,v] ~ dbin(pi[t,v], n[v,t])
      GP[t,v] ~ dbin(beta[2], Infected[t,v])
      Repeat.GP[t,v] ~ dbin(beta[3], GP[t,v])
      Pneumonia[t,v] ~ dbin(beta[4], GP[t,v])
      Hospital[t,v] ~ dbin(beta[5], GP[t,v])
      Death[t,v] ~ dbin(beta[6], GP[t,v])
      Trt[1,t,v] ~ dbin(gamma[1], GP[t,v])
      Trt[2,t,v] ~ dbin(gamma[2], Mild.Compl[t,v])
      Mild.Compl[t,v] <- Repeat.GP[t,v] + Pneumonia[t,v]
   }
}
Adverse.events ~ dbin(beta[N.outcomes], n[2,2])

# 7. Probability of experiencing other events (impacts on costs and QALYs/QALDs)
for (i in 1:2) {
   # Treatment with antivirals after GP visit
   gamma[i] ~ dbeta(a.gamma[i], b.gamma[i])
}
# Number of prescriptions of antivirals
delta ~ dpois(a.delta)
# Taking OTC
xi ~ dbeta(a.xi, b.xi)
# Being off work
eta ~ dbeta(a.eta, b.eta)
# Length of absence from work for influenza
lambda ~ dlnorm(mu.lambda, tau.lambda)

# 8. Costs of clinical resources (N.resources = 8)
#    1. Cost of GP visit
#    2. Cost of hospital episode
#    3. Cost of vaccination
#    4. Cost of time to receive vaccination
#    5. Cost of days work absence due to influenza
#    6. Cost of antiviral drugs
#    7. Cost of OTC treatments
#    8. Cost of travel to receive vaccination
for (r in 1:N.resources) {
   psi[r] ~ dlnorm(mu.psi[r], tau.psi[r])
}

# 9. Quality of life adjusted days/years loss
#    1. Influenza infection
#    2. GP visits (no QALD/Y loss)
#    3. Minor complications (repeat visit, no QALD/Y loss)
#    4. Major complications (pneumonia)
#    5. Hospitalisations (same QALD/Y loss as pneumonia)
#    6. Death
#    7. Adverse events due to vaccination
omega[1] ~ dlnorm(mu.omega[1], tau.omega[1])
omega[2] <- 0; omega[3] <- 0;
for (r in 4:N.outcomes) {
   omega[r] ~ dlnorm(mu.omega[r], tau.omega[r])
}
}
Table 2.1 Distributional assumptions for the model. For each parameter, the distribution is
chosen to model the available prior knowledge, represented by existing data or expert opinion.
The mathematical form of each distribution is chosen according to the nature of the parameter (e.g.
parameters describing the probability of occurrence of an event are usually given a Beta distribution),
while the values of the hyper-parameters are chosen so that the distribution is consistent with the
prior information derived from the clinical literature or expert opinion
Parameter   Mean       2.5%       Median     97.5%      Distribution
φ           0.435      0.245      0.436      0.625      Beta(11.31, 14.44)
β1          0.0701     0.0387     0.0680     0.1116     Beta(13.01, 172.38)
β2          0.295      0.124      0.288      0.497      Beta(5.80, 13.80)
β3          0.401      0.388      0.401      0.415      Beta(1909.50, 2851.86)
β4          0.01339    0.00852    0.01322    0.01938    Beta(20.94, 1538.71)
β5          0.000378   0.000223   0.000364   0.000616   Lognormal(−7.91, 14.93)
β6          0.000748   0.000366   0.000702   0.001331   Lognormal(−7.26, 7.66)
β7          0.1021     0.0255     0.0954     0.2265     Beta(3.50, 31.50)
ρ1          0.688      0.593      0.686      0.794      Lognormal(−0.374, 0.00524)
γ1          0.420      0.417      0.420      0.423      Beta(45471.58, 62794.09)
γ2          0.814      0.806      0.814      0.822      Beta(7701.86, 1759.89)
δ           6.97       2.00       7.00       12.00      Poisson(7.00)
ξ           0.950      0.940      0.950      0.959      Beta(1804.05, 94.95)
η           0.900      0.890      0.900      0.909      Beta(3239.10, 359.90)
λ           2.90       1.22       2.69       5.97       Lognormal(0.98, 0.17)
ψ1          20.55      12.36      19.77      32.07      Lognormal(3.00, 0.0606)
ψ2          2661.92    1554.18    2575.67    4106.98    Lognormal(7.85, 0.0606)
ψ3          7.21       4.22       6.95       11.42      Lognormal(1.95, 0.0606)
ψ4          10.26      6.16       9.92       15.90      Lognormal(2.29, 0.0606)
ψ5          46.31      27.20      44.96      70.69      Lognormal(3.80, 0.0606)
ψ6          3.86       2.39       3.73       5.95       Lognormal(1.31, 0.0606)
ψ7          1.592      0.949      1.562      2.452      Lognormal(0.44, 0.0606)
ψ8          0.807      0.484      0.776      1.311      Lognormal(−0.241, 0.0606)
ω1          4.26       2.14       4.05       7.59       Lognormal(1.40, 0.0993)
ω4          6.39       3.81       6.23       9.82       Lognormal(1.82, 0.0606)
ω5          6.34       3.83       6.15       9.94       Lognormal(1.82, 0.0606)
ω6          15.20      9.09       14.88      23.34      Lognormal(2.70, 0.054)
ω7          0.556      0.316      0.541      0.932      Lognormal(−0.634, 0.0717)
The model consists of nine modules, as annotated in the code above. Notice that
the values of the parameters for each distribution are kept as variables (rather than
hard-coded as fixed numbers). This is in general a good idea, since changes in
the assumed values can be reflected directly using the same code. Of course, this
means that the numerical values must be passed to the computer code somewhere else
in the scripting process. This, however, helps clarify the whole process and makes
debugging easier.
As can be seen, most of the commands in the BUGS/JAGS language are effectively typed in a way that strongly resembles standard statistical notation, with the
tilde (or “twiddle”) symbol ∼ indicating a stochastic relationship (i.e. a probability distribution),
while the assignment symbol <- indicates a logical (or deterministic) relationship.
Typically, this code is saved to a text file, say vaccine.txt. It is good practice to
store the files in a well-structured set of directories, or at least in a location where
R can find them easily, such as the directory from which R is launched or the directory
currently in use by R (also termed the “working directory”). The R command
> setwd("PATH_TO_RELEVANT_FOLDER")

can be used to set the working directory to any folder, while the command

> getwd()
[1] "/home/user/MyStuff"

returns the current (working) directory. Note that R uses Unix-like notation, with
forward slashes / separating the folders in a text string. Conversely, MS Windows uses
backward slashes \ to accomplish the same task. This means that on a MS Windows
computer, the working directory will be reported by R as something like

> # On a Windows machine:
> getwd()
[1] "C:/user/MyStuff"

while the MS Windows notation (e.g. obtained by copying and pasting the address of the folder
from the file explorer) would actually be "C:\user\MyStuff". It is thus important
to be careful when copying and pasting folder locations from MS Windows into R.
The user has two options, both based on Unix-like notation: the first is to
convert every backward slash to a forward slash; the second is to escape the
backward slashes using a double backward slash (\\), for example as in the following
R code.
> # On a Windows machine, these two commands are the same:
> # 1. using forward slashes
> setwd("C:/user/MyStuff")
> # 2. using double backward slashes
> setwd("C:\\user\\MyStuff")
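A third option, which avoids manual slash handling altogether, is R's built-in file.path(), which joins path components with forward slashes on any operating system:

```r
# file.path() builds a platform-independent path string using "/"
p <- file.path("C:", "user", "MyStuff")
p                # "C:/user/MyStuff" - can be passed straight to setwd()
```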
2.3.1.3 R Code to Pre-process and Load the Data

The following R code is used to pre-process and load the data in the R workspace
before the model and the health economic analysis can be run.
> ## Launches the file Utils.R which contains useful functions used throughout this script
> source("http://www.statistica.it/gianluca/BCEABook/WebMaterial/Utils.R")

> ## Loads the values of the hyper-parameters (needed to run the Bayesian model using JAGS)
> # Number of people in the population
> N <- 100000

> # Vaccine coverage
> a.phi <- betaPar2(.434, .6, .95)$res1
> b.phi <- betaPar2(.434, .6, .95)$res2

> # Baseline probabilities of clinical outcomes
> # 1. Influenza infection
> # 2. GP visits
> # 3. Minor complications (repeat visit)
> # 4. Major complications (pneumonia)
> # 5. Hospitalisations
> # 6. Death
> # 7. Adverse events due to vaccination
> N.outcomes <- 7
> mu.beta <- c(.0655, .273, .401, .0128, .00038, .00075)
> upp.beta <- c(.111, .51, .415, .0197, .00067, .00132)
> sd.beta <- c(NA, NA, NA, NA, .0001, .00028)
> a.beta <- b.beta <- numeric()
> for (i in 1:4) {
+   a.beta[i] <- betaPar2(mu.beta[i], upp.beta[i], .975)$res1
+   b.beta[i] <- betaPar2(mu.beta[i], upp.beta[i], .975)$res2
+ }
> for (i in 5:6) {
+   a.beta[i] <- lognPar(mu.beta[i], sd.beta[i])$mulog
+   b.beta[i] <- 1/lognPar(mu.beta[i], sd.beta[i])$sigmalog^2
+ }
> a.beta[N.outcomes] <- betaPar(.1, .05)$a
> b.beta[N.outcomes] <- betaPar(.1, .05)$b

> # Decrease in risk of infection due to vaccination
> mu.rho <- lognPar(.69, .05)$mulog; tau.rho <- 1/lognPar(.69, .05)$sigmalog^2

> # Treatment with antivirals after GP visit
> mu.gamma <- c(.42, .814); sd.gamma <- c(.0015, .004)
> a.gamma <- b.gamma <- numeric()
> for (i in 1:2) {
+   a.gamma[i] <- betaPar(mu.gamma[i], sd.gamma[i])$a
+   b.gamma[i] <- betaPar(mu.gamma[i], sd.gamma[i])$b
+ }

> # Number of prescriptions of antivirals
> a.delta <- 7

> # Taking OTC
> a.xi <- betaPar(.95, .005)$a
> b.xi <- betaPar(.95, .005)$b

> # Being off work
> a.eta <- betaPar(.9, .005)$a
> b.eta <- betaPar(.9, .005)$b

> # Length of absence from work for influenza
> mu.lambda <- lognPar(2.9, 1.25)$mulog
> tau.lambda <- 1/lognPar(2.9, 1.25)$sigmalog^2

> # Costs (N.resources = 8)
> # 1. Cost of GP visit
> # 2. Cost of hospital episode
> # 3. Cost of vaccination
> # 4. Cost of time off for individuals to receive vaccination
> # 5. Cost of days work absence due to influenza
> # 6. Cost of antiviral drugs
> # 7. Cost of OTC treatments
> # 8. Cost of travel to receive vaccination
> N.resources <- 8
> m.psi <- c(20.66, 2656, 7.24, 10.16, 46.27, 3.81, 1.6, .81)
> sd.psi <- c(5.015, 440.75, 1.81, 2.54, 11.57, .955, .4, .2)
> sd.psi <- .25*m.psi   # overrides the values above with a common 25% coefficient of variation
> mu.psi <- tau.psi <- rep(0, N.resources)
> for (i in 1:N.resources) {
+   mu.psi[i] <- lognPar(m.psi[i], sd.psi[i])$mulog
+   tau.psi[i] <- 1/lognPar(m.psi[i], sd.psi[i])$sigmalog^2
+ }

> # Quality of life weights (N.outcomes = 7)
> # 1. Influenza infection
> # 2. GP visits (no QoL loss)
> # 3. Minor complications (repeat visit, no QoL loss)
> # 4. Major complications (pneumonia)
> # 5. Hospitalisations
> # 6. Death
> # 7. Adverse events due to vaccination
> m.omega <- c(4.27, 0, 0, 6.35, 6.35, 15.29, .55)
> sd.omega <- c(1.38, 0, 0, 1.5875, 1.5875, 3.6, .15)
> mu.omega <- tau.omega <- rep(0, N.outcomes)
> for (i in c(1, 4, 5, 6, 7)) {
+   mu.omega[i] <- lognPar(m.omega[i], sd.omega[i])$mulog
+   tau.omega[i] <- 1/lognPar(m.omega[i], sd.omega[i])$sigmalog^2
+ }

(notice that the + at the beginning of a line inside the for loops is just standard
R console notation indicating that a command spans more than one line).
The very first line of the script executes the file Utils.R from its remote location
(http://www.statistica.it/gianluca/BCEABook/WebMaterial/Utils.R); this
file contains a set of functions and commands that are used throughout the script,
so it must be run before the rest of the script can be executed. Although
this increases the number of files necessary to run the entire analysis (and thus, at face
value, the complexity of the process), it is good programming
practice to use a combination of many smaller, focussed scripts, rather than to include
every command or function required in one single, massive file. This, again, makes
the process transparent and easier to debug or critically appraise.
The rest of the script defines the values of the parameters for the distributions
associated with the quantities modelled above. For example, the function
betaPar2 (which is defined in the file Utils.R) can be used to determine the values
of the parameters of a Beta distribution so that its average is around 0.436 and 95%
of the probability mass is below the value of 0.6. In particular, running this command in an R
terminal gives the following output:
> betaPar2(.434, .6, .95)
$res1
[1] 11.30643

$res2
[1] 14.4411

$theta.mode
[1] 0.434

$theta.mean
[1] 0.4391267

$theta.median
[1] 0.437

$theta.sd
[1] 0.09595895

betaPar2 creates a list of results: the first two elements of the list, res1 and res2,
are the estimated values of the parameters to be used with a Beta distribution so that
roughly 95% of the probability mass is below 0.6. This is in line with the assumptions
presented in Table 2.1 for the parameter φ. Again, we can check the appropriateness
of this choice by simply typing the following commands in the R terminal.
> phi <- rbeta(100000, 11.30643, 14.4411)
> c(quantile(phi, .025), quantile(phi, .5), quantile(phi, .975))
     2.5%       50%     97.5%
0.2566397 0.4371251 0.6296423

The other elements of the list are theta.mode, theta.mean, theta.median and
theta.sd, which store the values for the mode, mean, median and standard deviation
of the resulting Beta distribution. Notice the R “dollar” notation, which can be used
to access elements of an object — in other words, if the object x is stored in the R
workspace and contains the elements y, z and w, then these can be accessed by using
the notation x$y, x$z or x$w.
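As a quick illustration of this notation:

```r
# A toy list with three named elements, accessed via the "dollar" notation
x <- list(y = 1, z = "a", w = c(2, 3))
x$y   # 1
x$w   # 2 3
```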
Another thing to notice is that it is fairly easy to annotate the R code in an informative way. This again increases transparency and facilitates the work of reviewers
or modellers called upon to critically evaluate the analysis process. In line with
the point made above about preferring several simple, specific files to one large
(and potentially messy) file, it is a good idea to save this code to a script file,
say LoadData.R, again assumed to be stored in the working directory. From within
the R terminal, the script can be launched and executed by typing the command

> source("LoadData.R")

which runs all the instructions in the script sequentially.
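The other helpers used in LoadData.R, betaPar and lognPar, map a mean and standard deviation to the natural parameters of a Beta or Lognormal distribution. Their actual code lives in Utils.R; the moment-matching logic behind them can be sketched as follows (our reconstruction, so the details may differ from the file):

```r
# Method-of-moments sketches (reconstructions of the Utils.R helpers;
# the file's actual implementations may differ)
betaPar.sketch <- function(m, s) {
  k <- m*(1 - m)/s^2 - 1        # from mean m and sd s of a Beta variable
  list(a = m*k, b = (1 - m)*k)
}
lognPar.sketch <- function(m, s) {
  sigma2 <- log(1 + s^2/m^2)    # from mean m and sd s on the natural scale
  list(mulog = log(m) - sigma2/2, sigmalog = sqrt(sigma2))
}
betaPar.sketch(.1, .05)   # a = 3.5, b = 31.5, matching beta[7] in Table 2.1
```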

2.3.1.4 R Code to Remotely Run BUGS/JAGS and the Bayesian Model

At this point, the user is ready to run the model—in a full Bayesian context, this
typically means performing a MCMC analysis (cf. Sect. 1.2.3) to obtain a sample
from the posterior distribution of the random quantities of interest. We reiterate here
that these may be unobservable parameters as well as unobserved variables.
R is particularly effective at interfacing with the main software for Bayesian
analysis—here we refer to the most popular OpenBUGS [3] and JAGS [4], but there is
also an R package to interface with a more recent addition, Stan [6]. This means that it is
possible to produce a set of scripts that can be run in R to pre-process the data, call
the MCMC sampler in the background and run the model (written in a .txt file, as
shown above) and then post-process the results, e.g. to obtain the suitable measures
of population average costs and effectiveness.
For example, the following commands can be used to run the Bayesian model
defined above:
> # Loads the packages to run OpenBUGS or JAGS from R
> library(R2OpenBUGS)
> library(R2jags)
> # Defines the current directory as the working directory
> working.dir <- paste(getwd(), "/", sep="")
> # Launches the file Utils.R which contains useful functions used throughout this script
> source("http://www.statistica.it/gianluca/BCEABook/WebMaterial/Utils.R")
> # Loads the data into R (assumes the file is stored in the working
> # directory - if not, the full path can be provided)
> source("LoadData.R")

> # Defines the data list to be passed to BUGS/JAGS
> data <- list("N","a.phi","b.phi","mu.rho","tau.rho","a.beta","b.beta",
+   "a.gamma","b.gamma","mu.omega","tau.omega","mu.psi","tau.psi",
+   "N.outcomes","N.resources","mu.lambda","tau.lambda","a.xi","b.xi",
+   "a.eta","b.eta","a.delta")

> # Defines the file with the model code
> filein <- "vaccine.txt"

> # Defines the quantities to be monitored (stored)
> params <- c("beta","phi","omega","rho","Infected","GP","Repeat.GP",
+   "Pneumonia","Hospital","Death","Mild.Compl","Trt","Adverse.events",
+   "n","gamma","delta","psi","lambda","pi","xi","eta")

> # Generates the initial values
> inits <- function() {
+   list(phi=runif(1), beta=runif(N.outcomes,0,1), rho=c(NA,runif(1)),
+     gamma=runif(2,0,1), delta=rpois(1,2),
+     omega=c(runif(1),NA,NA,runif(1),NA,runif(2,0,1)),
+     psi=runif(N.resources,0,10), lambda=runif(1), eta=runif(1), xi=runif(1))
+ }

> # Defines the number of iterations, burn-in and thinning, and
> # runs BUGS or JAGS
> n.iter <- 100000
> n.burnin <- 9500
> n.thin <- floor((n.iter - n.burnin)/500)

> # 1. This runs OpenBUGS
> vaccine <- bugs(data, inits, params, model.file=filein, n.chains=2,
+   n.iter, n.burnin, n.thin, DIC=FALSE, working.directory=working.dir)

> # 2. This runs JAGS
> vaccine <- jags(data, inits, params, model.file=filein, n.chains=2,
+   n.iter, n.burnin, n.thin, DIC=FALSE, working.directory=working.dir,
+   progress.bar="text")

> # Prints the summary stats and attaches the results to the R workspace
> print(vaccine, digits=3, intervals=c(0.025, 0.975))

> # In OpenBUGS:
> attach.bugs(vaccine)
> # In JAGS:
> attach.jags(vaccine)

For convenience, we can save them in a file, say RunMCMC.R, which can then be
run from within the R terminal using the source command.
> source("RunMCMC.R")

This script proceeds by first loading the relevant packages (which allow R to
interface with either OpenBUGS or JAGS); this can be done using the command
library(R2OpenBUGS) or library(R2jags), depending on the Bayesian software
of choice. Of course, for these to work, either or both OpenBUGS and JAGS need to be
installed on the user’s machine (we refer interested readers to Sect. 2.2 or the relevant
websites, where information is provided on installation and use under different operating systems). In the first part of the script, we also execute the files Utils.R and
LoadData.R, presented above, which prepare the data for either OpenBUGS or JAGS
to use. Finally, the current folder is set up as the working directory (but of course,
the user can choose any folder for this).
The next step amounts to storing all the relevant input data for the model code
into a list. In this case, we need to include all the values for the parameters of the
distributions used in the file vaccine.txt, which encodes the model assumptions.
Then, we instruct R to read the model assumptions from the file vaccine.txt and
finally we define the “parameters” to be monitored. Again, we note that with this
terminology we refer to any unobserved or unobservable quantity for which we
require inference in the form of a sample from the posterior distribution.
Before we run OpenBUGS or JAGS we need to define the list of “initial values”,
which are used to start the Markov chain(s). Notice that both BUGS and JAGS can
randomly generate initial values. However, it is generally better to closely control
this process [7]. This can be done by creating a suitable R function that stores in a list
random values for all the quantities that need initialisation. These are obtained by
specifying the underlying distribution—for instance, in this case we generate
the initial value for φ from a Uniform(0, 1) distribution (this is reasonable as φ is a
probability and so it needs to be a continuous value between 0 and 1). In principle,
any quantity that is modelled using a probability distribution and is not observed
needs to be initialised. With reference to the model code presented above, it would
not be possible to initialise the node n[1,2], because it is defined as a deterministic
function of other quantities (in this case N and V[2]).
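The reason inits is defined as a function rather than as a fixed list is that it is called once per chain, so each chain receives its own random starting point. A stripped-down version of the pattern, with only two illustrative nodes:

```r
# Minimal sketch of the inits pattern (node names are illustrative):
# every call draws fresh starting values, one set per chain
inits.sketch <- function() {
  list(phi = runif(1), lambda = runif(1))
}
set.seed(42)
chain1 <- inits.sketch()
chain2 <- inits.sketch()
chain1$phi != chain2$phi   # TRUE: the chains start from different values
```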
Finally, we define the total number of iterations, the number of iterations to be
discarded in the estimate of the posterior distributions (burn-in) and the possible value
of the “thinning”. This refers to the operation of only saving one every l iterations from
the Markov Chains. This can help reduce the level of autocorrelation in the resulting
chains. For example, we could decide to store 1,000 iterations and obtain this either
by saving the last 1,000 runs from the overall process (i.e. by discarding the first 9,000
of the 10,000 iterations produced), or by running the process for 100,000 iterations,
discarding the first 9,500 and then saving one every 181 iterations. Of course, the
latter alternative involves a longer process just to end up with the same number
of samples on which to base the estimation of the posteriors. But the advantage is
that the resulting sample is likely to show a lower level of autocorrelation, which means a
larger amount of information and thus better precision in characterising the target
distributions.
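The arithmetic behind the settings used in RunMCMC.R can be checked directly: with 100,000 iterations, a burn-in of 9,500 and a target of 500 saved draws per chain, the thinning interval is 181, and two chains then yield the 1,000 simulations used throughout:

```r
# Reproducing the thinning arithmetic used in RunMCMC.R
n.iter <- 100000; n.burnin <- 9500; n.chains <- 2
n.save <- 500                                  # draws to keep per chain
n.thin <- floor((n.iter - n.burnin)/n.save)
n.thin                                         # 181
n.chains*n.save                                # 1000 = n.sims
```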
Once these steps have been executed, we can use the commands bugs or jags
to run the model. Both would call the relevant MCMC sampler in the back-
ground and produce the MCMC estimates. When the process is finished, the user
regains control of the R session. A new object, in this case named vaccine, is
created in the current workspace. This object can be manipulated to check model
convergence, visualise the summary results (using the print method available for
both R2OpenBUGS and R2jags) and save the results (i.e. the simulated values from
the posterior distributions) to the R workspace.
For example, a summary table can be obtained as follows (here, we only present
the first few rows, for simplicity):
> print(vaccine, interval=c(.025,.975), digits=3)
Inference for Bugs model at "vaccine.txt", fit using jags,
 2 chains, each with 10000 iterations (first 9500 discarded), n.thin = 181
 n.sims = 1000 iterations saved
mu . vect sd . vect 2.5% 97.5% Rhat n . eff
Adverse . events 4384.479 2518.102 969.425 10740.800 1.005 310
Death [1 ,1] 1.573 1.539 0.000 5.000 1.000 1000
Death [2 ,1] 0.850 1.084 0.000 4.000 1.001 1000
Death [1 ,2] 0.000 0.000 0.000 0.000 1.000 1
Death [2 ,2] 0.248 0.545 0.000 2.000 1.000 1000
GP [1 ,1] 2045.987 896.964 654.925 4092.150 1.000 1000
GP [2 ,1] 1148.308 543.198 340.925 2435.475 1.000 1000
GP [1 ,2] 0.000 0.000 0.000 0.000 1.000 1
GP [2 ,2] 279.658 151.580 78.000 658.325 1.000 1000
Hospital [1 ,1] 0.764 0.959 0.000 3.000 1.001 1000
Hospital [2 ,1] 0.438 0.698 0.000 2.000 1.002 620
...

For each parameter included in the list of quantities to be monitored, this table shows
the mean and standard deviation (the columns labelled as mu.vect and sd.vect),
together with the 2.5 and 97.5% quantiles of the posterior distributions (which give
a rough approximation of a 95% credible interval).
The final columns of the table (indexed by the labels Rhat and n.eff, respectively)
present some important convergence statistics. The first one is the potential scale
reduction R̂, often termed the Gelman–Rubin statistic. This quantity can be computed
when the MCMC process is based on running at least two parallel chains and basically
compares the within to the between chain variability. The rationale is that when this
ratio is close to 1, then there is some evidence of “convergence” because all the
chains present similar variability and do not vary substantially among each other, thus
indicating that they are all visiting a common area in the parameter’s space. Typically,
values below the arbitrary threshold of 1.1 are considered to suggest convergence to
the relevant posterior distributions.
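In essence, R̂ can be reproduced from first principles; the following sketch (our illustration, not the code used internally by R2jags) computes it for two idealised chains:

```r
# Illustration of the potential scale reduction for two chains of length n:
# compare the within-chain variance W with the between-chain variance B
set.seed(2)
chains <- matrix(rnorm(2*500), ncol = 2)   # two well-mixed (here: iid) chains
n <- nrow(chains)
W <- mean(apply(chains, 2, var))           # within-chain variability
B <- n*var(colMeans(chains))               # between-chain variability
Rhat <- sqrt(((n - 1)/n*W + B/n)/W)
Rhat                                       # close to 1: no sign of non-convergence
```

For chains stuck in different regions of the parameter space, B would dominate W and push R̂ well above the 1.1 threshold mentioned above.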
The second one is the “effective sample size” n.eff. The idea behind this quantity is
that the MCMC analysis is based on a sample of n.sims iterations (in this case,
1,000). Thus, if these were obtained as a sample of independent observations from
the posterior distributions, they would be worth exactly 1,000 data points. However,
because MCMC is a process in which future observations depend on the current
one, there is some intrinsic “autocorrelation”, which means that often a sample of
S iterations has a value, in terms of information, that is actually lower. This value is
quantified by the effective sample size. When n.eff is close to n.sims, this indicates
that the level of autocorrelation is low and that in effect the n.sims points used to
obtain the summary statistics are worth more or less their nominal value. On the
other hand, when the two are very different this indicates that the MCMC sample
contains less information about the posterior.
For example, because of the autocorrelation, the 1,000 simulations used to characterise the posterior distribution of the node Adverse.events are actually equivalent
to a sample made up of around 310 independent observations from that posterior. In
cases such as this, when R̂ < 1.1 but n.eff is much smaller than n.sims, we could
conclude that the sample obtained has indeed converged to the posterior distribution but does not contain enough information to fully characterise it. For example,
the mean and the central part of the distribution may be estimated with good precision, but the tails may not. One easy (albeit potentially computationally intensive)
workaround is to run the MCMC for a (much) longer run and possibly increase the
thinning.
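The link between autocorrelation and n.eff can be illustrated with a simulated chain: a common estimator divides the nominal sample size by 1 + 2·Σρk, the sum running over the positive autocorrelations (this is an illustration of the idea, not the exact estimator used by R2jags):

```r
# Effective sample size of an autocorrelated chain (illustrative estimator)
set.seed(1)
x <- as.numeric(arima.sim(list(ar = .8), n = 1000))  # AR(1) chain, strong autocorrelation
rho <- acf(x, lag.max = 50, plot = FALSE)$acf[-1]    # autocorrelations at lags 1..50
ess <- length(x)/(1 + 2*sum(rho[rho > 0]))
ess   # far fewer than the nominal 1000 draws
```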
Additional checks on convergence may be performed, for example by inspecting
traceplots of the chains (e.g. as in Fig. 1.3d), using the following command

> traceplot(vaccine)

which produces an interactive traceplot for each of the monitored nodes. More
advanced graphing and analysis can be done by subsetting the object vaccine and
accessing the elements stored therein. Details on how to do this are shown, for example, in [7].

2.3.2 Economic Model

In order to perform the economic analysis, we need to define suitable summary measures of cost and effectiveness. The total cost associated with each clinical resource
can be computed by multiplying the unit cost ψh by the number of patients consuming it. For instance, the overall cost of doctor visits is (GPtv(1) + GPtv(2)) × ψ1,
the superscripts (1) and (2) indicating first and repeat visits. If, for
convenience of terminology, we indicate with Ntvh the total number of individuals
consuming the h-th resource under intervention t and in group v, we can then extend
this reasoning and compute the average population cost under intervention t as

    ct := (1/N) Σ_{v=0}^{1} Σ_{h=1}^{8} Ntvh ψh .        (2.1)

Similarly, the total QALYs lost due to the occurrence of the relevant outcomes
can be obtained by multiplying the number of individuals experiencing them by ω j .
For example, the total number of QALYs lost to influenza infection can be computed
as Itv × ω1. If we let Mtvj indicate the number of subjects with the j-th outcome
in intervention t and group v, we can define the population average measure of
effectiveness for intervention t as

    et := (1/N) Σ_{v=0}^{1} Σ_{j=1}^{7} Mtvj ωj .        (2.2)

The results of the MCMC procedure used to run the model described above can
be obtained by simply running the scripts discussed in Sect. 2.3.1. However, they are
also available in the R object vaccine.RData, which can be downloaded directly from
http://www.statistica.it/gianluca/BCEABook/vaccine.RData. For example, this can
be loaded into the R session by typing the following commands:

> load(url("http://www.statistica.it/gianluca/BCEABook/vaccine.RData"))
> ls()
 [1] "Adverse.events" "Death"          "GP"             "Hospital"
 [5] "Infected"       "N"              "Pneumonia"      "Repeat.GP"
 [9] "delta"          "eta"            "gamma"          "lambda"
[13] "n"              "n.sims"         "omega"          "psi"
[17] "xi"

Each of these R objects contains n.sims = 1000 simulations from the relevant posterior distributions. Before the economic analysis can be run, it is necessary to define
the measures of overall cost and effectiveness given in Eqs. (2.1) and (2.2), respectively. This can be done using the results produced by the MCMC procedure with
the following R code. Notice that since the utilities are originally defined as quality-adjusted life days, it is necessary to rescale them to obtain QALYs.
> ## Compute effectiveness in QALYs lost for both strategies
> QALYs.inf <- QALYs.pne <- QALYs.hosp <- QALYs.adv <- QALYs.death <-
+   matrix(0, n.sims, 2)
> for (t in 1:2) {
+   QALYs.inf[,t] <- ((Infected[,t,1] + Infected[,t,2])*omega[,1]/365)/N
+   QALYs.pne[,t] <- ((Pneumonia[,t,1] + Pneumonia[,t,2])*omega[,4]/365)/N
+   QALYs.hosp[,t] <- ((Hospital[,t,1] + Hospital[,t,2])*omega[,5]/365)/N
+   QALYs.death[,t] <- ((Death[,t,1] + Death[,t,2])*omega[,6])/N
+ }
> QALYs.adv[,2] <- (Adverse.events*omega[,7]/365)/N
> e <- -(QALYs.inf + QALYs.pne + QALYs.adv + QALYs.hosp + QALYs.death)

The notation Infected[,t,1] indicates all the simulations (the first dimension
of the array) for the t-th intervention (which the for loop sets sequentially to 1
and 2 to indicate t = 0, 1, respectively) and for the first vaccination group. Sim-
ilarly, Infected[,t,2] indicates all the simulations for the t-th intervention and
for the second vaccination group. Thus, each of these two elements effectively produces
the value Mtv1 (where j = 1 indicates the first outcome) and, consequently, the
code ((Infected[,t,1] + Infected[,t,2])*omega[,1]/365)/N identifies
the quantity (1/N) Σ_{v=0}^{1} Mtv1 ω1. Following a similar reasoning for all the other out-
comes and summing them all up, we obtain the measure of effectiveness, which
is stored in a matrix e with n.sims rows and 2 columns (one for each intervention
considered).
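As a quick sanity check (not part of the original listing, and assuming e has been created as above), the dimensions and posterior averages of the effectiveness matrix can be inspected:

```r
# Quick check of the effectiveness matrix (assumes 'e' built as above)
dim(e)              # should report n.sims rows and 2 columns
apply(e, 2, mean)   # posterior average effectiveness for each intervention
```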
We can follow a similar strategy to identify the costs associated with each inter-
vention. First we define the number of “users” (which we indicated earlier as Ntvh;
depending on the resource, this is the number of doctor (general practitioner) visits,
hospitalisations, infections, repeated hospitalisations, or individuals at risk);
then we multiply these by the associated unit cost (contained in the variable psi). Finally,
we sum all the components to derive the overall average cost for each treatment
strategy.
> ## Compute costs for both strategies
> cost.GP <- cost.hosp <- cost.vac <- cost.time.vac <- cost.time.off <- cost.trt1 <-
    cost.trt2 <- cost.otc <- cost.travel <- matrix(0, n.sims, 2)
> for (t in 1:2) {
    cost.GP[,t]       <- (GP[,t,1] + GP[,t,2] + Repeat.GP[,t,1] + Repeat.GP[,t,2]) * psi[,1]/N
    cost.hosp[,t]     <- (Hospital[,t,1] + Hospital[,t,2]) * psi[,2]/N
    cost.vac[,t]      <- n[,2,t] * psi[,3]/N
    cost.time.vac[,t] <- n[,2,t] * psi[,4]/N
    cost.time.off[,t] <- (Infected[,t,1] + Infected[,t,2]) * psi[,5] * eta * lambda/N
    cost.trt1[,t]     <- (GP[,t,1] + GP[,t,2]) * gamma[,1] * psi[,6] * delta/N
    cost.trt2[,t]     <- (Repeat.GP[,t,1] + Repeat.GP[,t,2]) * gamma[,2] * psi[,6] * delta/N
    cost.otc[,t]      <- (Infected[,t,1] + Infected[,t,2]) * psi[,7] * xi/N
    cost.travel[,t]   <- n[,2,t] * psi[,8]/N
  }
> c <- cost.GP + cost.hosp + cost.vac + cost.time.vac + cost.time.off +
    cost.trt1 + cost.trt2 + cost.travel + cost.otc

At this point we are ready to run the Decision Analysis and the Uncertainty
Analysis, which BCEA can take care of. We present these parts in Chaps. 3 and 4.
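As a preview (a sketch only; the bcea function and its arguments are described in detail in Chap. 3), the two matrices just created could be passed to BCEA along the following lines:

```r
# Sketch: feeding the simulations to BCEA (full details in Chaps. 3 and 4)
library(BCEA)
m <- bcea(e, c, ref = 2, interventions = c("Status quo", "Vaccination"))
summary(m)
```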

2.4 Smoking Cessation

In this example, we will consider a set T = {t = 0, 1, 2, 3} of T = 4 potential inter-
ventions to help smoking cessation; in particular, t = 0 represents no contact (status
quo); t = 1 is a self-help intervention; t = 2 is individual counselling; and t = 3
indicates group counselling. The interest is in the joint evaluation of the T interven-
tions. The analysis will be conducted in the form of a cost-consequence analysis,
and as such costs and health effects will not be correlated, as usually happens in
the wider framework of a cost-utility analysis. However, it should be noted that there
is no substantive difference between a cost-consequence and a cost-utility analysis, as
argued in [8]. Therefore, the costs and health effects will be analysed separately and
then jointly to produce a summary of the comparative cost-effectiveness profiles of
the interventions considered.
The available clinical evidence consists of a set of trials in which some com-
binations of the available interventions have been considered. However, not all the
possible pairwise comparisons are observed. The use of a Bayesian model embedding
some suitable exchangeability assumptions allows the estimation of a suitable mea-
sure of effectiveness for all the interventions (using all the available evidence for each
t ∈ T ) and for all the possible pairwise comparisons. This is a meta-analysis tech-
nique generally referred to as Mixed Treatment Comparison (MTC), which expands
the concepts of Bayesian evidence synthesis to a network of evidence that
can be used to produce the required estimates. More detailed discussion is pre-
sented in [9] and [10]. The data used in this example were originally reported in
[11].

2.4.1 (Bayesian) Statistical Model

Assumptions
The dataset includes N = 50 data points nested within S = 24 studies. For each
study arm i = 1, . . . , N we observe a variable ri indicating the number of patients
quitting smoking out of a total sample size of n i individuals. In addition, we also
record a variable ti taking on the possible values 1, 2, 3, 4, indicating the treatment
associated with the i-th data point. The nesting within the trial is accounted for by a
variable si taking values in 1, . . . , S.
Most studies are simple head-to-head comparisons (i.e. comparing only two
interventions), while two of them are multi-arm trials (the first one involving
t = 1, 3, 4, and the second one comparing t = 2, 3, 4). Most trials compare one of
the active treatments t = 2, 3, 4 against the control treatment “No intervention”. Five
of the studies consider comparisons between two or more active treatments. The full
dataset is presented in Table 2.2.
Figure 2.1 shows the description of the “network” of data available—the process
of combining this information into a consistent framework is often referred to as
“Network Meta-Analysis” (NMA).
For each study arm we model the number of observed quitters as the realisation
of a Binomial random variable:

ri ∼ Binomial ( pi , n i )

where pi is the arm-specific probability of smoking cessation. The main objective of the
model is to use the available data to derive a pooled estimate of πt, the intervention-
specific probability of smoking cessation.

Table 2.2 The dataset containing information on the S = 24 trials on smoking cessation. The data
were originally reported in [11]
Study (si ) Intervention (ti ) Quitters (ri ) Participants (n i ) Comparator (ci )
1 1 9 140 1
1 3 23 140 1
1 4 10 138 1
2 2 11 78 2
2 3 12 85 2
2 4 29 170 2
3 1 75 731 1
3 3 363 714 1
4 1 2 106 1
4 3 9 205 1
5 1 58 549 1
5 3 237 1561 1
6 1 0 33 1
6 3 9 48 1
7 1 3 100 1
7 3 31 98 1
8 1 1 31 1
8 3 26 95 1
9 1 6 39 1
9 3 17 77 1
10 1 79 702 1
10 2 77 694 1
11 1 18 671 1
11 2 21 535 1
12 1 64 642 1
12 3 107 761 1
13 1 5 62 1
13 3 8 90 1
14 1 20 234 1
14 3 34 237 1
15 1 0 20 1
15 4 9 20 1
16 1 8 116 1
16 2 19 149 1
17 1 95 1107 1
17 3 143 1031 1
18 1 15 187 1
18 3 36 504 1
19 1 78 584 1
19 3 73 675 1
20 1 69 1177 1
20 3 54 888 1
21 2 20 49 2
21 3 16 43 2
22 2 7 66 2
22 4 32 127 2
23 3 12 76 3
23 4 20 74 3
24 3 9 55 3
24 4 3 26 3

Fig. 2.1 A graphical representation of the network of evidence for the smoking cessation studies

We use the following strategy. First we model the probabilities pi using a struc-
tured formulation

logit(pi) = μsi + δsi,ti (1 − I{ti = bsi}).

The parameter μsi represents a study-specific baseline value, which is common to
all interventions being compared in study si. Notice that for each i = 1, . . . , N, si
takes on integer values in [1; S]. Thus the vector μ comprises S elements.

The parameter δsi,ti represents the incremental effect of treatment ti with respect to
the reference intervention considered in study si. Specifically, we follow the
common convention of taking the intervention associated with the minimum label value
found in each study as the reference intervention for that study. This formulation
allows for a clear specification of study-specific effects and can be easily extended to
include study-treatment interactions. The reference (or baseline) intervention for each
study is indicated by bsi; thus δsi,bsi = 0, with the effect of the baseline intervention
for each study s represented by μs. Consequently, in each study s we assume that the
comparator’s effect is the study baseline and that the incremental effect of treatment
t is represented by δs,t if t ≠ bs.
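As a toy numerical illustration of this parameterisation (made-up values, not model output; plogis is R's inverse-logit function):

```r
# Toy example: one study with baseline b_s = 1 and an active treatment t = 3
mu_s  <- -2.2              # study-specific baseline on the logit scale
delta <- c(0, NA, 0.85)    # delta[b_s] = 0 by construction; delta[3] is the increment
plogis(mu_s)               # cessation probability in the baseline arm
plogis(mu_s + delta[3])    # cessation probability in the t = 3 arm
```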
The parameters in μ are given independent minimally informative Normal distri-
butions, μs ∼ Normal(0, v) i.i.d., where v is a large fixed value for the variance
of the distributions. On the contrary, we assume that the parameters
δsi,ti represent “structured” effects

δsi,ti ∼ Normal(mdi, σ²)

with

mdi = dti − dbsi.

The parameters d = (d1, . . . , dT) represent some pooled intervention-specific effects,
and the mean mdi is computed as the difference between the effect for the
intervention in row i and the effect for the reference intervention bsi in study si. We
assume that d1 = 0, i.e. that the reference intervention has no effect other than the
baseline level, while we model di ∼ Normal(0, v) i.i.d. for i = 2, . . . , T.
The parameters d are defined on the logit scale, and thus in order to compute the
estimated probability of smoking cessation on the natural scale for each treatment we
need to rescale them. We proceed by estimating the effect for the baseline intervention
t = 1. Since d1 was set to 0, the treatment effect π0 on the logit scale is given by
the average of the baseline effects in the trials including the intervention t = 1. The
treatment effects πt , t = 1, . . . , T are calculated as:

π0 = ( 1 / Σ_{s=1}^{S} I{bs = 1} ) Σ_{s: bs = 1} μs

logit(πt) = π0 + dt,   t = 1, . . . , T

where s : bs = 1 indicates the subset of studies including the reference intervention
arm t = 1 as a comparator. The expression Σ_{s=1}^{S} I{bs = 1} indicates the number of
studies including treatment t = 1, since it is the sum of the indicator function over
the trials including that intervention as the baseline comparator.
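In R terms, the rescaling above could be sketched as follows (assuming hypothetical vectors mu, d and b holding, respectively, the posterior means of the study baselines, the pooled effects and the study baseline labels):

```r
# Sketch of the rescaling to the natural scale (hypothetical extracted values)
pi0  <- mean(mu[b == 1])   # average baseline effect over studies with b_s = 1
pi.t <- plogis(pi0 + d)    # estimated cessation probabilities for t = 1, ..., T
```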

2.4.1.1 Running the MTC Model in JAGS

We run the MTC model in JAGS (although the code presented below will also work
in OpenBUGS with only minor modifications—cfr. Sect. 2.3.1.4). To run the Bayesian
evidence synthesis model, it is necessary to store the model specification in a text
file that will be then interpreted by JAGS. This file contains the description of the
Bayesian model in terms of the stochastic and deterministic relationships between
the variables building the model network or graph (more precisely, a direct acyclic
graph, or DAG).
The model is an adaptation from the specifications reported by Welton et al.
(2012) and the NICE Decision Support Unit (2013) [9, 10]. The JAGS code used for
the analysis of the smoking cessation data is shown below:
### JAGS model ###
model {
  for (i in 1:nobs) {
    r[i] ~ dbin(p[i], n[i])
    p[i] <- ilogit(mu[s[i]] + delta[s[i], t[i]])
    delta[s[i], t[i]] ~ dnorm(md[i], tau)
    md[i] <- d[t[i]] - d[b[s[i]]]
  }
  for (i in 1:ns) {
    mu[i] ~ dnorm(0, .0001)
    AbsTrEf[i] <- ifelse(b[i] == 1, mu[i], 0)
  }
  pi0 <- sum(AbsTrEf[]) / incb
  tau <- pow(sd, -2)
  sd ~ dunif(0.00001, 2)
  d[1] <- 0
  for (k in 2:nt) {
    d[k] ~ dnorm(0, .0001)
  }
  for (j in 1:nt) {
    logit(pi[j]) <- pi0 + d[j]
    for (k in 1:nt) {
      lor[j,k] <- d[j] - d[k]
      log(or[j,k]) <- lor[j,k]
      rr[j,k] <- pi[j] / pi[k]
    }
  }
}
To run the analysis it is necessary to save the model in a plain text file. No specific
extensions are required and in this example we will save the file with the name
smoking_model_RE.R in the directory from which we run R. We will assume that
the csv file containing the data inputs (i.e. smoking_data.csv) is in the same folder.
2.4 Smoking Cessation 49

The directory R is using can be displayed using the command getwd() and can be
modified by specifying the desired address as the argument of the function setwd,
i.e. setwd("PATH_TO_NEW_DIRECTORY").
It is necessary to import the data into R and to pre-process the inputs prior to
running the Bayesian model. This can be done by running the following code:
> # load the R2jags package and the data file
> library(R2jags)
> smoking = read.csv("smoking_data.csv", header = TRUE)

> # specify the name of the model file
> model.file = "smoking_model_RE.R"

> # copy smoking data.frame columns to local variables
> attach(smoking)
> nobs = nobs; s = s; t = i; r = r_i; n = n_i; b = b_i + 1
> detach(smoking)

> # number of trials
> ns = length(unique(s))
> # number of comparators
> nt = length(unique(t))
> # number of observations
> nobs = dim(smoking)[1]
> # how many studies include baseline
> incb = sum(table(s, b)[, 1] > 0)

The package R2jags is necessary to connect R and JAGS, and is loaded with the
command library(R2jags). The command read.csv is used to read into R the
data inputs contained in the csv file smoking_data.csv, which will be saved as a
data.frame object. Since the quantities need to be available in the R workspace, they
are saved as new R variables. The baseline treatment is incremented by one when
saving it with the command b=b_i+1, so that the comparator t = 0 (no intervention)
is associated with the index 1, the intervention t = 1 (self-help) with the index 2, and
so on. This is because both R and JAGS index arrays with the first element starting from
1 (as opposed to 0). The total number of studies, the arm index for each observation
in the respective trial, the number of comparators and observations and the number of
trials including the baseline reference treatment, in this case t = 0 (no intervention),
are also calculated from the data.
The jags function used to run the Bayesian evidence synthesis model requires
several inputs:
• data: a named list including all the inputs needed by the model;
• inits: a list of initial values or a function generating the initial values for (a
subset of) the stochastic parameters in the model. In this example, we set inits
to NULL, which means that JAGS will choose at random the initial values for all
the parameters in the model. The initial values of the parameters will be randomly
drawn from the space of values they can assume, determined by their stochastic
definition;
• parameters.to.save: a vector of variables to monitor, i.e. the parameters of
interest. JAGS will save the output of the simulations from the associated posterior
distributions only of the monitored parameters;

• model.file: the name or address of the file containing the model. Since we pre-
viously saved the model as the file smoking_model_RE.R in the working directory,
the name of this file will be the value passed to this argument;
• n.chains: the number of parallel Markov chains to run. It is highly recommended
that these are at least 2, to allow for checking the convergence and the mixing of
the chains;
• n.iter: the number of iterations to perform for each chain from initialisation;
• n.thin: the thinning rate, i.e. after how many iterations a single value from the
posterior distribution is saved, discarding the others;
• n.burnin: the length of the burn-in, i.e. the number of simulations to discard after
the initialisation of the chains before saving any value. If not specified as in this
case, by default it is set to n.iter/2.
More details on how to run a JAGS model and then post-process its results for the
purposes of health economic analysis are given in [7].
At this point, all the necessary data inputs have been pre-processed and it is
possible to run the MTC analysis model:
> # define data and parameters to monitor
> inputs = list("s", "n", "r", "t", "ns", "nt", "b", "nobs", "incb", "na")
> pars = c("rr", "pi", "p", "d", "sd", "T")

> smoking_output <- jags(data = inputs, inits = NULL, parameters.to.save = pars,
    model.file = model.file, n.chains = 2, n.iter = 20000, n.thin = 10)

The jags function will save the output of the model in the rjags object which
we called smoking_output. A summary of the model results can be printed out by
executing the following line of code:
> print(smoking_output)
Inference for Bugs model at "smoking_model_RE.R", fit using jags,
 2 chains, each with 20000 iterations (first 10000 discarded), n.thin = 10
 n.sims = 2000 iterations saved
mu . vect sd . vect 2.5% 25% 50% 75% 97.5% Rhat n. eff
d [1] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
d [2] 0.499 0.395 -0.278 0.245 0.501 0.757 1.306 1.001 2000
d [3] 0.843 0.239 0.383 0.684 0.833 0.995 1.338 1.000 2000
d [4] 1.107 0.446 0.248 0.817 1.094 1.391 2.011 1.001 2000
pi [1] 0.062 0.012 0.041 0.054 0.061 0.069 0.086 1.002 1000
pi [2] 0.100 0.031 0.053 0.078 0.096 0.117 0.172 1.001 2000
pi [3] 0.132 0.021 0.096 0.118 0.131 0.145 0.174 1.003 730
pi [4] 0.169 0.051 0.087 0.134 0.164 0.199 0.287 1.001 2000
rr [1 ,1] 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1
rr [2 ,1] 1.685 0.635 0.774 1.257 1.585 2.003 3.232 1.001 2000
rr [3 ,1] 2.200 0.497 1.416 1.858 2.129 2.469 3.346 1.000 2000
rr [4 ,1] 2.878 1.181 1.254 2.090 2.657 3.432 5.677 1.001 2000
rr [1 ,2] 0.676 0.252 0.309 0.499 0.631 0.795 1.292 1.001 2000
rr [2 ,2] 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1
rr [3 ,2] 1.454 0.551 0.672 1.054 1.366 1.726 2.814 1.001 2000
rr [4 ,2] 1.849 0.814 0.727 1.291 1.691 2.253 3.865 1.001 2000
rr [1 ,3] 0.476 0.103 0.299 0.405 0.470 0.538 0.706 1.000 2000
rr [2 ,3] 0.785 0.295 0.355 0.579 0.732 0.949 1.488 1.001 2000
rr [3 ,3] 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1
rr [4 ,3] 1.324 0.489 0.596 0.989 1.249 1.559 2.464 1.001 1700
rr [1 ,4] 0.403 0.159 0.176 0.291 0.376 0.478 0.798 1.001 2000
rr [2 ,4] 0.646 0.288 0.259 0.444 0.591 0.774 1.375 1.001 2000
rr [3 ,4] 0.856 0.316 0.406 0.641 0.801 1.011 1.677 1.001 1700
rr [4 ,4] 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1
sd 0.601 0.128 0.395 0.510 0.589 0.676 0.896 1.007 2000
deviance 281.227 9.952 263.929 274.289 280.731 287.548 302.935 1.001 2000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat = 1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 49.5 and DIC = 330.8
DIC is an estimate of expected predictive error (lower deviance is better).

The table above reports, for each monitored variable, the estimated mean, standard
deviation, several percentiles of the posterior distribution (2.5, 25, 50, 75 and 97.5%),
the Gelman–Rubin convergence diagnostic R̂ and the effective sample size n.eff.
The latter gives an indication of the presence of autocorrelation within the chains by
quantifying the information contained in the vector of simulations used to estimate
every parameter. The percentiles can be used to approximate a credible interval
(CrI), which is an interval of values containing a posterior probability mass equal to
0.95; assuming unimodality of the distributions (as is the case here), the posterior
95% CrI can be approximated by taking the 2.5 and 97.5% percentiles as the lower
and upper bound, respectively.1
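For instance, given a vector of posterior simulations, the approximate 95% CrI is obtained with the quantile function (an illustrative simulated vector here, not the model output):

```r
# Approximate 95% CrI from posterior draws (illustrative simulated vector)
draws <- rnorm(2000, mean = 0.13, sd = 0.02)
quantile(draws, probs = c(0.025, 0.975))
```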
The diagnostic measures indicate good convergence of the model, with the
Gelman–Rubin R̂ statistic below 1.1 for all the quantities. In addition, the
effective sample size does not signal the presence of autocorrelation within the chains.
From this output we can observe that the most effective treatment is t = 4 (Group
counselling), with an associated probability of patients quitting smoking equal to
π4 = 0.17 (95% CrI [0.09; 0.29]). It is followed by: t = 3 (Individual counselling),
associated with a probability of quitting of π3 = 0.13 (95% CrI [0.10; 0.17]); t = 2
(Self-help), with a probability of π2 = 0.10 (95% CrI [0.05; 0.17]); and lastly t = 1
(No intervention), with an estimated probability of quitting equal to 0.06 (95% CrI
[0.04; 0.09]). The results are represented graphically in Fig. 2.2. It can be observed
that the uncertainty associated with the effect size of group counselling is high, with
a 95% credible interval wider than the ones for the other interventions.
The plot in Fig. 2.2 can be reproduced using the following code:
> attach.jags(smoking_output)
> tr.eff = data.frame(t(apply(pi, 2, quantile, c(0.025, 0.975))))
> names(tr.eff) = c("low", "high")
> treats = c("No intervention", "Self-help", "Individual counselling",
    "Group counselling")
> tr.eff = cbind(tr.eff, mean = smoking_output$BUGSoutput$mean$pi,
    interventions = factor(treats, levels = treats))
> detach.jags()

1 It should be noted that the estimation of the “effective number of parameters” pD is controversial.
The definition reported in [12] and in [3], which is also the one adopted in BUGS, should be preferred
to the one reported by R2jags [13]. This statistic is calculated by R2jags as:

pD = Var[Dmodel]/2

while both [12] and [3] report that the preferred definition is:

pD = D̄model − D(θ̄)

where Dmodel is the posterior deviance of the model, D̄model its posterior mean, and D(θ̄) is the
deviance evaluated at the estimated posterior mean of the vector of parameters θ. It should be noted
that the definition of pD has a direct impact on the deviance information criterion (DIC), an index
commonly used for model comparison, defined as DIC = D̄model + pD = D(θ̄) + 2pD.
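The R2jags version of pD can be reproduced by hand from the saved deviance simulations (a sketch, assuming the fitted object smoking_output as above and that the deviance is among the saved nodes):

```r
# Recomputing pD and DIC as R2jags does (assumes 'smoking_output' as above)
dev <- smoking_output$BUGSoutput$sims.list$deviance
pD  <- var(as.vector(dev)) / 2   # the var(deviance)/2 rule
DIC <- mean(dev) + pD            # posterior mean deviance plus pD
```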


Fig. 2.2 Mean and 95% credible intervals (CrI) of the estimated treatment effects for each treat-
ment. Group counselling had the best estimated efficacy, followed by individual
counselling, self-help and no intervention. The credible interval associated with the group coun-
selling estimate was substantially wider than the ones for the other comparators

> ggplot(tr.eff) + geom_point(aes(y = mean, x = interventions)) +
    geom_errorbar(aes(x = interventions, ymin = low, ymax = high),
      width = 0.15) + coord_flip() + theme_bw() +
    labs(x = "", y = "Probability of smoking cessation",
      title = "Meta-analysis results")

The simulations from the posterior distributions of the parameters that will be used
in the economic model are stored in the BUGSoutput element of the rjags output
object. The vectors of simulations can be attached to the current workspace by using
the command attach.jags, which makes the values available in the workspace.2
As the economic model will be based on the 2,000 values obtained in JAGS, it is
necessary to extract these values from the output object. In the following code, we
attach the JAGS output to the workspace and copy the values simulated from the
posterior distributions of the estimated probability of cessation for each treatment
π = (π1 , π2 , π3 , π4 ) in a 2000 × 4 matrix pi. The latter will be used as inputs for
the economic model.
> attach.jags(smoking_output)
> pi <- pi

2 When using OpenBUGS and R2OpenBUGS, the object can be attached to the R workspace using
the command attach.bugs(object) instead of attach.jags(object).

2.4.2 Economic Model

Similarly to the Vaccine example, we now need to include other variables and, gener-
ally, post-process the output of the Bayesian model to obtain the quantities necessary
to perform the Decision and Uncertainty Analyses.
For example, in this case no data on the costs were provided in addition to the
effectiveness reported in [11]. Thus, for the purposes of this example, we extracted
information published in [14], which reported costs for different classes of interventions
for smoking cessation. The costs in British pounds were taken from [15]. Although
the interventions reported in [11] were not described in detail, a comparison of the
meta-analysis results with the comparative efficacy measures given in [14] showed
consistent results, indicating substantial similarity between the interventions in the
two studies.
The costs for the comparators included in the analysis are composed as follows:
No intervention:
• No costs: £0;
Self-help:
• Nicotine replacement therapy (NRT) for five weeks (35 patches at £1.30 each);
Individual counselling:
• NRT for five weeks (35 patches at £1.30 each);
• Five clinic visits (£10.00 each);
Group counselling:
• NRT for five weeks (35 patches at £1.30 each);
• Five group visits (£19.46 each).
The total average costs per intervention were: £0 for t = 0; £45.50 for t = 1;
£95.50 for t = 2; and £142.80 for t = 3. Due to the expected variability associ-
ated with compliance with the interventions in general practice and to the potential
need of additional counselling and pharmacological treatment for some patients, it is
reasonable to describe the uncertainty associated with the costs with a probability
distribution.
For simplicity, a triangular distribution is associated with all treatment costs
(excluding the reference “No intervention” comparator), with limits defined by the
average intervention cost ±20%. The triangular distribution is a triangle-shaped curve
with zero probability density outside the specified lower and upper
bounds. It increases linearly from the lower bound to its mode, and decreases linearly
up to the upper limit. A graphical representation is given in Fig. 2.3. A real-world
analysis could be based on more appropriate assumptions for the cost distributions.
The distributions of the costs need to be simulated to be input into the cost-
effectiveness model. The reference comparator t = 0 is assumed not to have an
associated cost, i.e. its cost is always null. In formal terms, a degenerate proba-
bility distribution which assumes the value zero with probability equal to one is


Fig. 2.3 The distribution represents the uncertainty associated with the costs for the self-help
intervention. The curve is shaped as a triangle, hence its name. In this case the mode is equidistant
from the lower and upper bounds, and thus corresponds to the mean (and median) of the distribution

assigned to this parameter. The costs for the other interventions are simulated from
the intervention-specific triangular distributions described above. Functions to sam-
ple from a triangular distribution are not included in the default libraries of R, thus
the triangle package needs to be installed to use the following code. The package
is available on CRAN, and can be installed as usual from a GUI or by inputting
the following command:
> install.packages("triangle")

The code to obtain the simulated values from the probability distributions of the
costs, stored in the matrix c, is presented below. Since we populated the matrix
with zeroes when creating the object, the costs for t = 0 are automatically assigned.
The function rtriangle accepts as arguments the number of simulations needed, the
lower bound of the distribution a and the upper bound b. If not specified, the mode
of the distribution c is calculated by default as the average of the two extremes; since
we are using symmetric distributions, for which the mode corresponds to the mean,
there is no need to specify this parameter.
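The default mode can be checked empirically (a sketch, assuming the triangle package is installed; the limits used are those of the self-help cost distribution in the code further below):

```r
# Checking that the default mode of rtriangle is (a + b)/2 (sketch)
library(triangle)
x <- rtriangle(10000, a = 45.5*.8, b = 45.5*1.2)
mean(x)   # close to 45.5, the midpoint of this symmetric triangular distribution
```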

Table 2.3 Life expectancy increments gained by smoking cessations per gender and age at quitting.
Source: [16]
Life years gained relative to continuing smokers
Age at quitting Men Women
35 8.5 7.7
45 7.1 7.2
55 4.8 5.6
65 4.6 5.1

Table 2.4 Proportion of smokers per age group. The data on smoking statistics have been published
by the charity Action on Smoking and Health in October 2013, reporting the prevalence of cigarette
smoking in the UK. Source [17]
Age group Proportion of smokers (%)
16–19 15
20–24 29
25–34 27
35–49 23
50–59 21
60+ 13

> library(triangle)
> cost.t1 = 45.5
> cost.t2 = 95.5
> cost.t3 = 142.8
> c = matrix(data = 0, nrow = 2000, ncol = 4)
> c[,2] = rtriangle(2000, a = cost.t1*.8, b = cost.t1*1.2)
> c[,3] = rtriangle(2000, a = cost.t2*.8, b = cost.t2*1.2)
> c[,4] = rtriangle(2000, a = cost.t3*.8, b = cost.t3*1.2)
As for the measure of effectiveness, we can use data from [16] on the increments
in life expectancy gained by quitting, as observed in a US survey and reported in
Table 2.3.
Data on the prevalence of smoking in the British setting were obtained from the
2013 report of the charity Action on Smoking and Health (ASH) [17]. The data have
been summarised in Table 2.4. A split by both gender and age was not included, but
the overall proportions of men and women smoking were reported as 22 and 20% in
2012, respectively. This means that the proportion of men among smokers was 52%.
It has been assumed that the proportion was not different among the age groups due
to lack of data.
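The 52% figure follows from the reported prevalences, under the assumption of a roughly even gender split in the adult population:

```r
# Proportion of men among smokers (assumes a 50/50 gender split; approximation)
0.22 / (0.22 + 0.20)   # roughly 0.52
```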
The distribution by age of the population in the UK was taken from the 2011
census by the Office of National Statistics (ONS). The data tables are available from
the ONS website.3
The data for the life years gained reported in [16] did not include individuals
younger than 35 or older than 65 years. For simplicity, we assume here that the gain
for quitters younger than 35 years was the same as that observed for quitters at 35
years, and that individuals aged 65–80 years had the same gain as 65-year-old
quitters (Table 2.5).
The average life expectancy was calculated using a simulation-based approach.
For each of the 2,000 simulations, 1,000 smoking individuals were drawn from the
distribution of smokers per age, calculated on the basis of the age distribution in the
UK and the proportion of smokers per age group. The gender was simulated from

3 At the address http://www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-270247.

Table 2.5 Data inputs for the simulation of life years gained by smoking cessation. The dataset is
contained in the file smoking_cessation_simulation.csv
Age Population Proportion Smokers Proportion Male life Female life
of smokers of age group years years
15–19 3,997,000 0.15 599,550 0.08 8.50 7.70
20–24 4,297,000 0.29 1,246,130 0.09 8.50 7.70
25–29 4,307,000 0.27 1,162,890 0.09 8.50 7.70
30–34 4,126,000 0.27 1,114,020 0.08 8.50 7.70
35–39 4,194,000 0.23 964,620 0.09 8.50 7.70
40–44 4,626,000 0.23 1,063,980 0.09 7.10 7.20
45–49 4,643,000 0.23 1,067,890 0.09 7.10 7.20
50–54 4,095,000 0.21 859,950 0.08 4.80 5.60
55–59 3,614,000 0.21 758,940 0.07 4.80 5.60
60–64 3,807,000 0.13 494,910 0.08 4.60 5.10
65–69 3,017,000 0.13 392,210 0.06 4.60 5.10
70–74 2,463,000 0.13 320,190 0.05 4.60 5.10
75–79 2,006,000 0.13 260,780 0.04 4.60 5.10

a Binomial model based on the split reported in the ASH smoking statistics and the
life years reported in [16] were assigned.
To obtain the 2,000 simulations from the posterior distribution of the average
life years gained by quitters, the code below has been used. Notice that, in order to
use the code, the file smoking_cessation_simulation.csv needs to be available
in the same directory from which R is run, or the correct address to the file needs
to be specified. Each of the 1,000 individuals in the cohorts is associated with
a simulated age, drawn from a multinomial distribution with a vector of
probabilities equal to the observed frequency for each age group. The gained life
years are calculated for each group based on the gender split. The results are then
averaged over the sample, to obtain a vector composed of 2,000 elements. To repeat
the process 4 times, obtaining 2,000 simulations for each treatment, 8,000 samples
from the multinomial distribution are taken. These are successively arranged in a
matrix with 2,000 rows and 4 columns.
> data = read.csv(file = "smoking_cessation_simulation.csv")
> life.years = with(data, rmultinom(2000*4, 1000, pr.age) *
    (.52*Male.ly + .48*Female.ly))
> life.years = matrix(apply(life.years, 2, sum)/1000,
    nrow = 2000, ncol = 4)

At this point it is possible to obtain the life years gained for each intervention. It is
only necessary to multiply the probability of smoking cessation π for each treatment
by the average number of life years gained by quitting. This can be obtained by an
element-wise multiplication of the two quantities:
> e = pi * life.years

Again, this process is completed by running BCEA and performing the Decision
and Uncertainty Analysis (as described in detail in Chaps. 3 and 4).

References

1. S. Petrou, A. Gray, Brit. Med. J. 342 (2011), http://dx.doi.org/10.1136/bmj.d1766
2. G. Lu, A. Ades, Stat. Med. 23, 3105 (2004)
3. D. Lunn, C. Jackson, N. Best, A. Thomas, D. Spiegelhalter, The BUGS Book—A Practical
Introduction to Bayesian Analysis (Chapman Hall/CRC, New York, NY, 2013)
4. M. Plummer, (2015), https://sourceforge.net/projects/mcmc-jags/files/Manuals/4.x/jags_
installation_manual.pdf/download
5. G. Baio, A.P. Dawid, Stat. Methods Med. Res. (2011). doi:10.1177/0962280211419832
6. Stan: A C++ Library for Probability and Sampling, Version 2.8.0 (2015), http://mc-stan.org/.
Accessed 22 Sept 2015
7. G. Baio, Bayesian Methods in Health Economics (Chapman Hall/CRC Press, Boca Raton, FL,
2012)
8. D. Wilkinson, Perform. Improv. Q. (1999)
9. N.J. Welton, A.J. Sutton, N.J. Cooper, K.R. Abrams, A.E. Ades, Evidence Synthesis for Deci-
sion Making in Healthcare (John Wiley & Sons, Ltd, 2012)
10. S. Dias, N. Welton, A. Sutton, A. Ades, Technical support documents: Evidence synthesis
series. Tech. rep., National Institute for Health and Care Excellence (NICE), Decision Support
Unit (2013), http://www.nicedsu.org.uk/Evidence-Synthesis-TSD-series(2391675).htm
11. V. Hasselblad, Med. Decis. Making 18, 37 (1998)
12. A. Gelman, J. Carlin, H. Stern, D. Dunson, A. Vehtari, D. Rubin, Bayesian Data Analysis, 3rd
edn. (Chapman Hall/CRC, New York, NY, 2013)
13. Y.S. Su, M. Yajima, R2jags—a package for running jags from R. (2012), http://cran.r-project.
org/web/packages/R2jags/index.html
14. W.D. McGhan, M. Smith, Am. J. Health-Syst. Pharm. 53, 45 (1996)
15. S. Flack, M. Taylor, P. Trueman, Cost-effectiveness of interventions for smoking cessation.
Tech. rep. York Health Economics Consortium (2007)
16. D.H. Taylor Jr., V. Hasselblad, S.J. Henley, M.J. Thun, F.A. Sloan, Am. J. Public Health 92(6)
(2002)
17. ASH: Action on Smoking and Health. ASH fact sheet on smoking statistics. (2013), http://ash.
org.uk/files/documents/ASH_106.pdf
Chapter 3
BCEA—An R Package for Bayesian
Cost-Effectiveness Analysis

3.1 Introduction

Cost-effectiveness analysis is usually performed using specialised software such as
TreeAge or spreadsheet calculators (e.g. Microsoft Excel). Part of the narrative that
accompanies this choice as the de facto standard is that these tools are “transparent,
easy to use and to share with clients and stakeholders”.
These statements may hold true for simple models, which can be easily arranged in
a small number of spreadsheets, sometimes even just one. In these cases, it is indeed
useful to give the user the possibility of modifying a small number of parameters by
simply changing the value of a cell or selecting a different option from a drop-down
menu.
Figure 3.1 shows an example of an Excel model. The file is structured over 18
different spreadsheets using Visual Basic for Applications (VBA) macros. Typically,
it is possible to create shortcuts such as buttons, e.g. as in the left-hand side of the
screen in Fig. 3.1a, that allow users to navigate through the spreadsheets. Clearly,
however, these complex models are not necessarily “easy to use”.
Figure 3.1a shows the spreadsheet in which the user can modify the value of some
of the parameters in the model. This can be done either by typing in the cells or
by selecting one of a set of allowed options using the drop-down menus. These
menus can be again programmed using macros. Typically, this spreadsheet is merely
a graphical interface and is not used for calculations of the actual economic model.
In fact, when a value is changed it is simply overwritten in one of the cells in one of
the other spreadsheets, for example the spreadsheet presented in Fig. 3.1b.
The interesting thing to note is the size of this spreadsheet: the screenshot shows
cells in the range A1731–K1776 (using the standard Excel notation, in which numbers
indicate rows and letters indicate the columns of the spreadsheet). This is just an
excerpt of the whole spreadsheet and indicates the complexity of this fundamental
component of the overall model. Values in the cells of this spreadsheet are linked to



Fig. 3.1 An example of a cost-effectiveness model implemented in Microsoft Excel. Panel a
shows the spreadsheet in which the user can modify the value of (some of the very many!) parameters,
either typing in the cells or using drop-down menus. Panel b shows an excerpt of the spreadsheet
in which the values for the relevant variables populating the underlying model are actually stored
for computation

other cells in other spreadsheets to actually perform the necessary computations—for
example, the highlighted cell H1750 includes the formula
=IF(AND(Deterministic_switch=TRUE,P1750=1,R1750<>""),R1750,I1750)

instructing Excel to first check whether:


(a) the named variable Deterministic_switch (which, incidentally, is defined in
cell P8 in the spreadsheet named Variables) is set to the value TRUE or
FALSE;
(b) the value in cell P1750 in the current spreadsheet is equal to 1;
(c) cell R1750 is not empty.
If these three conditions hold, then Excel will copy to cell H1750 the value originally
recorded in cell R1750, while if they do not, it will copy the value originally included
in cell I1750.
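The conditional logic of that Excel formula can be expressed compactly in R. The sketch below is purely illustrative: the variable names and cell values are hypothetical stand-ins for the spreadsheet cells, and an empty cell is represented here by NA.

```r
# R translation of the Excel formula in cell H1750 (hypothetical values;
# an empty cell is represented here by NA)
deterministic_switch <- TRUE   # the named variable Deterministic_switch
p1750 <- 1                     # value of cell P1750
r1750 <- 42                    # cell R1750 (non-empty in this example)
i1750 <- 99                    # fall-back value, cell I1750

# If all three conditions hold, use the value of R1750; otherwise, I1750
h1750 <- if (isTRUE(deterministic_switch) && p1750 == 1 && !is.na(r1750)) {
  r1750
} else {
  i1750
}
h1750                          # 42
```

Written this way, the three conditions and the two possible outcomes are explicit and can be unit-tested, which is precisely what is hard to do with cross-referenced spreadsheet cells.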
Human error in cross-referencing one of the cells may have dire consequences
and produce results that are incorrect. Of course, this is a problem for any modelling
procedure, irrespective of the software used to perform the calculations. The addi-
tional issue with spreadsheet-based models is that it may be very difficult to debug
and search for potential errors, because cross-linking may be difficult to follow when
there are many spreadsheets and active cells in each.
In line with [1, 2], we argue that many of these shortcomings can be overcome by
doing the whole process using proper statistical software—our choice is of course
R—and that the main advantages are
1. Scripting facility: the whole analysis can (and should!) be performed by writing
scripts instructing the software about the steps necessary for the analysis. This
will improve replicability and will provide transparency;
2. Graphical facility: R has very good graphical engines (including the default base
and the more advanced ggplot2). This guarantees, at virtually no cost, high
quality output that can be included in research papers or reimbursement dossiers
to be submitted to the regulators;
3. Statistical facility: models are increasingly complex and involve subtle issues that
require careful statistical modelling. As an example, consider survival analysis
(incidentally, the vast majority of NICE appraisals is in the cancer area, where
survival analysis plays a fundamental role): fitting even the simplest of survival
models goes beyond Excel’s ability;
4. Computational facility: related to the previous point, some of the most advanced
analyses (for example involving “microsimulations” or the analysis of the value of
information—see Chap. 4) require a computational engine that, again, is beyond
the capability of Excel.
In particular, we consider BCEA, an R package to post-process the results of a Bayesian
health economic model and produce standardised output for the analysis of the results
[3]. Figure 3.2 shows a schematic representation of the package.

Fig. 3.2 A schematic representation of the BCEA package

In the figure, the purple boxes indicate functions that define specific classes. These
are objects in R that allow generic functions (such as print or plot) to adapt their
behaviour: for example, the main function bcea returns as output an object of class
bcea. These generic functions
are represented by the orange boxes. Finally, the red boxes identify the functions
that are specific to BCEA. In the rest of the book, we present each of these elements
and explain how they can be used in the process of statistical analysis of a health
economic evaluation problem.
BCEA accepts, as inputs, the outcomes of a health economic evaluation compar-
ing different interventions or strategies, ideally but not necessarily produced using
MCMC (Markov Chain Monte Carlo) methods. It is not a tool to perform the evalua-
tion itself, but rather to produce readable and reproducible outputs of the evaluation. It
also provides many useful, technically advanced measures and graphical summaries
to aid researchers interpret their results.
In general, BCEA requires multiple simulations from an economic model that com-
pares at least two different interventions based on their cost and effectiveness measures
to produce this standardised output. The cost measure quantifies the overall costs
associated with the interventions for every simulation. The effectiveness, or efficacy,
measure can be given in any form, be it a “hard” outcome (e.g. number of avoided
cases) or a “soft” one (e.g. QALYs, Quality-Adjusted Life Years).

Thus the minimum input which must be given to BCEA is composed of two
n_sim × n_int matrices, where n_sim is the number of simulations used to perform the
analysis (at least 2) and n_int is the number of interventions being compared (again,
at least 2 are required). These two matrices contain all the basic information needed
by BCEA to perform a health economic comparison of the alternative interventions.
We assume, in general, that the statistical model underlying the economic analysis
is performed in a fully Bayesian framework. This implies that the simulations for
the economic multivariate outcome (e, c) are in fact from the relevant posterior
distributions. We discuss in Chap. 5 how BCEA can be used alongside a non-Bayesian
model.
To illustrate the capabilities of BCEA, the two examples introduced in Sects. 2.3
and 2.4 are developed as full health economic evaluations throughout the book.
Both these examples are included in the BCEA package and therefore all the results
throughout the book can be replicated using these datasets. Each of the following
sections details a different function in BCEA, demonstrating its functionality for both
single- and multi-comparison examples.

3.2 Economic Analysis: The bcea Function

If a health economic model has been run in a similar manner to the two examples
discussed in Chap. 2 then, in general, the modeller will have access to two matrices,
which we denote by e and c. These matrices contain the simulated values of the
effectiveness and costs, associated with the interventions t = 0, . . . , T , where T + 1
is the total number of treatments, equal to 2 for the Vaccine example and 4 for the
Smoking cessation example. The generic element of position [s, t] in each matrix is
the measurement of the outcome observed in the s-th simulation, with s = 1, . . . , S,
where S is the number of samples, under intervention t, with t = 0, . . . , T . For the
Vaccine example (Sect. 2.3), S = 1 000 and for the Smoking example (Sect. 2.4)
S = 2 000.
To begin any analysis using the package BCEA, the bcea function must be called.
This function processes the matrices for the costs and effectiveness so that the model
output is in the correct form for other functions in the package BCEA. Additionally,
the bcea object can be used to give basic summaries and plots. Therefore, when this
function is called it should be assigned to an object, to create an object of class bcea.
This object contains a number of elements which are then used as inputs to the other
functions in the BCEA package. Specifically, a bcea object contains the following elements:
• n.sim: the number of model simulations, i.e. the number of rows of the e and c
matrices given as arguments to the bcea function;
• n.comparators: the total number of interventions included in the model, i.e. 4
for the smoking cession example;

• n.comparisons: the total number of possible pairwise comparisons versus the
reference intervention. It is equal to n.comparators − 1;
• delta.e: a matrix with n.sim rows and n.comparisons columns, including as
elements for each row the simulation-specific differences in the clinical benefits
between the reference comparator and the other comparators;
• delta.c: a matrix with n.sim rows and n.comparisons columns, including as
elements for each row the simulation-specific differences in the costs between the
reference comparator and the other comparators;
• ICER: a vector of length n.comparisons including the ICER for the comparison(s)
between the reference intervention and the other comparators (cf. Sect. 1.3);
• Kmax: the value of the Kmax argument given to the function bcea, equal to the
maximum willingness to pay (cf. Sect. 3.2.2). This is ignored if the option wtp is
passed as an argument to the bcea function.
• k: a vector specifying the grid approximation of the willingness to pay to be used
in the calculations. The k parameter can be passed to the function bcea specifying
the argument wtp as a numeric vector of willingness to pay thresholds of interest.
bcea will also accept a scalar for the input wtp, in which case the analysis is
performed assuming a single value for the willingness to pay threshold. As a
default behaviour, bcea builds a 501-element grid from 0 to Kmax;
• ceac: a matrix with a number of rows equal to the length of k and
n.comparisons columns. The elements are the value of the pairwise cost-
effectiveness acceptability curve as a function of the willingness to pay grid k
for each comparison with the reference intervention (cf. Sect. 4.2.2);
• ib: a three-dimensional array with size for the first dimension equal to the length
of the vector k, second dimension equal to n.sim and third dimension given by
n.comparisons. The elements of the array ib are the values of the incremental
benefit for each willingness to pay value, each simulation and each comparison.
If only two comparators are included in the model (as in the Vaccine example), it
is a matrix with a number of rows equal to the length of k and n.sim columns;
• eib: a matrix with rows given by the length of k and n.comparisons columns,
reporting the values of the expected incremental benefit as a function of the
willingness to pay thresholds (cf. Sect. 3.5). If only one comparison is included
in the analysis, it will be a vector, with length equal to the length of k;
• kstar: a vector including the grid approximation of the break-even point(s), if
any. Since kstar is calculated on the vector k, its precision depends on the density
of the grid approximation of the willingness to pay values;
• best: a vector with the same length as k, indicating the “best” intervention for
every willingness to pay threshold included in the k grid. The “best” intervention
is the one maximising the expected utilities for each threshold value;
• U: an array of dimension n.sim × the length of k × n.comparators including the
value of the expected utility for each simulation from the Bayesian model, for each
value of the grid approximation of the willingness to pay and for each intervention
being considered;

• vi: a matrix with n.sim rows and columns equal to the length of k including the
value of information for each simulation from the Bayesian model and for each
value of the grid approximation of the willingness to pay (cf. Sect. 4.2.1);
• Ustar: a matrix with n.sim rows and columns equal to the length of k, indi-
cating the maximum simulation-specific utilities for every value included in the
willingness to pay grid;
• ol: the opportunity loss value for each simulation and willingness to pay value,
reported as a matrix with n.sim rows and a number of columns equal to the length
of the vector k (cf. Sect. 4.2.1);
• evi: a vector with the same number of elements as k, with the expected value of
(perfect) information for every considered willingness to pay threshold as values
(cf. Sect. 4.3);
• interventions: a vector of length n.comparators with the labels given to each
comparator;
• ref: the numeric index associated with the reference intervention;
• comp: the numeric index(es) associated with the non-reference intervention(s);
• step: the step used to form the grid approximation of the willingness to pay,
such that a 501-element grid of values is produced with 0 and Kmax as the extreme
values. Ignored if wtp is passed as an argument to the bcea function;
• e: the matrix including the simulation-specific clinical benefits for each comparator
used to generate the object;
• c: the matrix including the simulation-specific costs for each comparator used to
generate the object.
These items in the bcea object are sub-settable as the elements of a list, e.g. the
command object$n.sim will extract the element n.sim from the bcea object.
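The list-like access, and the way delta.e, delta.c and the ICER are derived from the input matrices, can be sketched in base R with simulated data. The mock object m below only mimics the structure of a real bcea object, and the numbers are purely illustrative; they do not come from either example.

```r
set.seed(1)
S <- 1000                                         # number of simulations
e <- cbind(rnorm(S, 10, 1), rnorm(S, 10.5, 1))    # simulated effects
c <- cbind(rnorm(S, 100, 5), rnorm(S, 150, 5))    # simulated costs
ref <- 2; comp <- 1                               # reference = column 2

# Simulation-specific increments (reference minus comparator), mirroring
# the delta.e and delta.c elements, and the resulting ICER (cf. Sect. 1.3)
delta.e <- e[, ref] - e[, comp]
delta.c <- c[, ref] - c[, comp]
ICER <- mean(delta.c) / mean(delta.e)

# A bcea object behaves like a list: elements are extracted with $
m <- list(n.sim = S, delta.e = delta.e, delta.c = delta.c, ICER = ICER)
m$n.sim                                           # 1000
```

The same $ extraction works on an actual bcea object, so for instance m$k and m$ICER return the willingness to pay grid and the estimated ICER(s) after a real run of bcea.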

3.2.1 Example: Vaccine

To use the Vaccine dataset included in the package BCEA, it is sufficient to load
BCEA using the function library, and attach the dataset with the command
data(Vaccine). Doing so will import into the current workspace all the variables
required to run the analysis.1
> library(BCEA)
> data(Vaccine)
> ls()
 [1] "N"             "N.outcomes"    "N.resources"   "QALYs.adv"
 [5] "QALYs.death"   "QALYs.hosp"    "QALYs.inf"     "QALYs.pne"
 [9] "c"             "cost.GP"       "cost.hosp"     "cost.otc"
[13] "cost.time.off" "cost.time.vac" "cost.travel"   "cost.trt1"
[17] "cost.trt2"     "cost.vac"      "e"             "treats"

1 Processing the data as demonstrated in Sect. 2.3.2 will yield slightly different results than those
presented in this section as the parameters were produced in two different simulations. The following
analyses are based on the Vaccine dataset included in the BCEA package.

The definition of each object included in the Vaccine dataset is given in the
package documentation, which can be accessed by executing the command ?Vaccine
in the R console. At this point we can begin the health economic analysis by using the
bcea function to format the model output. A suitable vector of labels for the two
interventions can be defined (this step is optional) and the function bcea is then
called, where the matrices e and c contain the effectiveness, in terms of QALYs, and
the costs for the Vaccine example.
> library(BCEA)
> treats <- c("Status quo", "Vaccination")
> m <- bcea(e, c, ref=2, interventions=treats)

If no labels are passed and the option interventions is not explicitly
specified, BCEA will create labels of the type “Intervention 1”, . . .,
“Intervention T”, with T = 2 in this case, indicating the number of compared
interventions.
The option ref=2 instructs R to consider the second intervention (i.e., the one for
which the values of the cost and effectiveness measures feature in the second column
of the matrices c and e) as the “reference” intervention. Thus, t = 1 is considered
as the intervention being assessed against the standard t = 0. If no explicit option is
specified, BCEA assumes that the first intervention is the reference instead and treats
all the others as comparators.
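The default labelling and the role of ref can be mimicked in base R. This is a sketch of the behaviour just described, not BCEA's internal code; n.comparators, ref and comp are illustrative names chosen to match the elements of the bcea object.

```r
# Default labels when the interventions argument is not supplied
n.comparators <- 2
interventions <- paste("Intervention", 1:n.comparators)
interventions                         # "Intervention 1" "Intervention 2"

# With ref=2, the remaining column indices identify the comparator(s)
ref <- 2
comp <- setdiff(1:n.comparators, ref)
comp                                  # 1
```

The same logic extends to the Smoking example: with four columns and ref=4, the vector of comparator indices would be 1, 2 and 3.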
This command produces a bcea object assigned to m in the code above. If the
argument plot is set to TRUE, the function bcea creates a graph showing the main
results (see Fig. 3.3). By default, the graph is not created, which is equivalent to
setting the option to FALSE. It is possible to produce the summary graph when the
economic evaluation is run or anytime after that, using the command plot(m), where
m is a bcea object:
> ### Run the HE analysis and produce summary graph
> m <- bcea(e=e, c=c, ref=2, plot=TRUE)
> ### Produce the summary graph from the bcea object
> plot(m)

Note that plot is an S3 method for the bcea class of object and therefore help for
this function is accessed using ?plot.bcea.
As mentioned earlier, the input data for the population average measure of cost
and effectiveness are ideally obtained from a full Bayesian model, as in the example
just described. Nevertheless, BCEA can also accommodate data on (e, c) obtained
under a frequentist approach, e.g. using bootstrap. Perhaps, these are available
in a spreadsheet (we return to this point repeatedly in the rest of the book, e.g.
in Sects. 4.3.2, 5.2.4.1 and 5.2.5), say the file Bootstrap.csv in which the first two
columns are the simulations for the measure of effects for the two treatments con-
sidered, while the data in columns three and four are the corresponding values for
the measure of cost. In such a situation, we could import these to R as follows.

Fig. 3.3 The graph shows the main results of the analysis. It can be produced by the call to BCEA
by setting the option plot=TRUE or by calling the function plot with a valid BCEA object as its
argument. From the top-left corner, clockwise, it includes: the cost-effectiveness plane Sect. 3.4,
the expected incremental benefit Sect. 3.5, the cost-effectiveness acceptability curve Sect. 4.2.2 and
the expected value of perfect information Sect. 4.3.1

> # Imports the spreadsheet (assuming the file is in the working
> # directory - if not, need to change the path!)
> inputs <- read.csv("Bootstrap.csv")

> # Take a look at the resulting object
> head(inputs)
  QALYs.for.t.0 QALYs.for.t.1 Costs.for.t.0 Costs.for.t.1
1 -0.0013140230 -0.0014362150     12.560492     18.163893
2 -0.0013798217 -0.0010113578     11.371500     16.024182
3 -0.0007782520 -0.0004565305      7.611361     13.012624
4 -0.0009781274 -0.0005831021      7.422760     13.025927
5 -0.0005302833 -0.0005907977      4.313992      9.971295
6 -0.0011509266 -0.0007041771     11.839145     14.955694

Crucially, the object inputs is automatically created in the R workspace as a
data.frame—not a matrix. This means that if we try to feed the relevant columns
to BCEA as the values for the arguments e and c, we get an error message.
> # Creates objects e, c with the relevant columns of inputs
> e <- inputs[, c(1:2)]
> c <- inputs[, c(3:4)]

> # And then use these to launch BCEA
> library(BCEA)
> m <- bcea(e, c)
Error in rep(e, K) * rep(k, each = n.sim * n.comparators) :
  non-numeric argument to binary operator

To avoid this problem, we need to change the class of the simulated values, so that
e and c are matrices, for example using the following code.
> # Check the class of the objects
> class(inputs)
[1] "data.frame"
> class(e)
[1] "data.frame"
> class(c)
[1] "data.frame"

> # Re-create the objects e and c as matrices and check the class
> e <- as.matrix(inputs[, c(1:2)])
> c <- as.matrix(inputs[, c(3:4)])
> class(e)
[1] "matrix"
> class(c)
[1] "matrix"

> # Now re-run BCEA
> m <- bcea(e, c)
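The coercion step can be checked on any all-numeric data.frame. The object below is a small stand-in for the imported spreadsheet, with illustrative random values rather than the actual Bootstrap.csv data.

```r
# A small all-numeric data.frame standing in for the imported spreadsheet
# (illustrative values, not the actual Bootstrap.csv data)
set.seed(42)
inputs <- data.frame(QALYs.for.t.0 = rnorm(5), QALYs.for.t.1 = rnorm(5),
                     Costs.for.t.0 = rnorm(5, 10), Costs.for.t.1 = rnorm(5, 15))
class(inputs)                  # "data.frame"

# Coerce the relevant columns to numeric matrices, as required by bcea
e <- as.matrix(inputs[, c(1:2)])
c <- as.matrix(inputs[, c(3:4)])
is.matrix(e) && is.numeric(e)  # TRUE: safe for arithmetic such as rep(e, K)
```

Since all columns are numeric, as.matrix returns a numeric matrix; if any column were a character or factor, the result would be a character matrix and the same error would reappear, so it is worth checking the classes of the imported columns as well.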

3.2.2 Example: Smoking

To use the Smoking dataset (Sect. 2.4) included in the package BCEA, the dataset
must be loaded in a similar fashion to the Vaccine dataset, using data(Smoking).
This will import the Smoking dataset into the current workspace. Note that the
effectiveness and cost matrices are called e and c in both examples: if the two
examples are run in the same workspace or R session, e and c will be overwritten.
To run the economic analysis using BCEA, the “Group counselling” intervention
is chosen as the reference intervention. Since this intervention is associated with
the fourth column of the effectiveness and cost matrices, it is selected by specifying
ref=4. To read the outputs more easily, the intervention labels are passed to the
function as well.

> library(BCEA)
> data(Smoking)
> treats = c("No intervention", "Self-help",
    "Individual counselling", "Group counselling")
> m = bcea(e, c, ref=4, interventions=treats, Kmax=500)

By default, bcea produces the analysis on a discrete grid of willingness to pay
thresholds, creating a vector of 501 equally spaced values in the interval between
0 and the value given to the argument Kmax in the bcea function (the default value
is set at 50 000 monetary units). This implies that the vector of thresholds produced
will be {0, 100, . . . , 49 900, 50 000}. For the Smoking example, this upper extreme
is modified by setting Kmax to 500. This limit has been reduced because the interest is
not focused on QALYs but rather on life years gained, and the ratios of costs to
effectiveness are much lower than in a standard health economic analysis, since the
interventions are relatively inexpensive. Selecting an appropriate maximum value of
the willingness to pay allows for a finer analysis of the variations by threshold, since
the values are computed over a fixed-length grid: in other words, a narrower interval
of the willingness to pay corresponds to an analysis based on smaller increments of the
cost-per-outcome threshold.
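The effect of Kmax on the grid resolution can be sketched in base R. This mimics the documented behaviour of the default 501-point grid; it is not the package's internal code.

```r
# Default: Kmax = 50000 gives 501 values spaced 100 apart
k.default <- seq(0, 50000, length.out = 501)
diff(k.default)[1]      # 100

# Smoking example: Kmax = 500 gives the same 501 values, now 1 unit apart
k <- seq(0, 500, length.out = 501)
diff(k)[1]              # 1
```

In other words, with a fixed number of grid points the step size is Kmax/500, so reducing Kmax from 50 000 to 500 refines the increments from 100 monetary units to 1.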
In general, the grid approximation of the willingness to pay values can be cus-
tomised still further by the user, by passing a numeric vector as the wtp argument.
For example to produce an evaluation of the measures only at the threshold values
20 000, 25 000 and 30 000 monetary units per unit increase in outcome, the function
can be called as in the code below.
> m <- bcea(e=e, c=c, ref=4, wtp=c(20000,25000,30000), plot=FALSE)

Additionally, notice that few restrictions are applied to the values passed as the wtp
argument: the vector needs to include at least one element and, if any of the values
are negative, they are re-scaled by incrementing all values by the absolute value of
the lowest element (i.e. passing the argument wtp=c(-5,5) will produce an analysis
over the values 0 and 10).
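The re-scaling of negative thresholds can be sketched as follows; this is an illustration of the behaviour described above, not the package's internal code.

```r
# Thresholds passed via wtp; negative values are shifted so that the
# lowest value becomes zero (mimicking the behaviour described in the text)
wtp <- c(-5, 5)
if (any(wtp < 0)) {
  wtp <- wtp + abs(min(wtp))
}
wtp                     # 0 10
```

Note that the shift preserves the spacing between the thresholds, but not their absolute values, so negative inputs should generally be avoided unless this behaviour is intended.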
The standard BCEA plot, obtained using plot(m) and demonstrated for the Vaccine
example, can be cluttered when multiple comparators are included in the analysis. For
this reason, all plots in the BCEA package can be produced either using base graphics
(the default) or ggplot2, an advanced plotting system based on the grammar of
graphics [4], which is preferred for multiple comparators as ggplot2 allows for finer
control over the graphs.
To select the ggplot version of the plot, the option graph="ggplot2" must be
added to the plot or plot.bcea function call. The string is partial-matched to either
ggplot2 or base, hence selecting graph="g" or "b" is sufficient to indicate which
graphical engine should be used. The two versions of the graphs share the same
function calls, and are selected by the use of the graph option only. The graphical
results have been kept as consistent as possible between the two versions.
Adding the option graph="ggplot2" to the plot function will produce a ggplot
object which, if not assigned to a named object, will be printed by default. It is possible
to store the plot in an object, modify it and produce the graph using the functions

print or plot, via the S3 methods for the class ggplot (i.e. print.ggplot and
plot.ggplot).
In general, the ggplot versions of the plots extend the set of options available
with graph="base", and the modularity of ggplot objects allows for post-hoc
modifications as well. In Fig. 3.4, the position of the legends is set to the bottom of
the graphs, outside the plot area. The size of the text labels is reduced by setting
the argument size=rel(2). The argument ICER.size is set to 2 and passed to the
ceplane.plot function (cf. Sect. 3.4) to include the ICERs in the cost-effectiveness
plane. The code used to produce Fig. 3.4 is reported below.

Fig. 3.4 The summary of the health economic analysis produced by the ggplot version of
plot.bcea. The different colours and line types indicate the three pairwise comparisons versus
the status quo (No intervention). The two willingness to pay values at which the decision changes
are represented in the expected incremental benefit (EIB) and expected value of perfect information
(EVPI) plots. An arbitrary willingness to pay, equal to £250 per life year saved, has been chosen
for the cost-effectiveness plane graph

> library(ggplot2)
> plot(m, graph="ggplot2", wtp=250, pos=TRUE, size=rel(2), ICER.size=2)

The interpretation of these four graphs and their manipulation in both base graphics
and ggplot2 will be dealt with in the following sections: the cost-effectiveness plane
in Sect. 3.4, the expected incremental benefit in Sect. 3.5, the cost-effectiveness
acceptability curve in Sect. 4.2.2 and the expected value of perfect information in
Sect. 4.3.1.

3.3 Basic Health Economic Evaluation: The summary Command

A summary table reporting the basic results of the health economic analysis can be
obtained from the BCEA object using the summary function. This is an S3 method for
objects of class bcea, similar to the plot function applied to produce the graphical
summary. It produces the following output for the Vaccine example:
> summary(m)

Cost-effectiveness analysis summary

Reference intervention:  Vaccination
Comparator intervention: Status quo

Optimal decision: choose Status quo for k < 20100 and Vaccination for k >= 20100

Analysis for willingness to pay parameter k = 25000

            Expected utility
Status quo           -36.054
Vaccination          -34.826

                              EIB  CEAC  ICER
Vaccination vs Status quo  1.2284 0.529 20098

Optimal intervention (max expected utility) for k=25000: Vaccination

EVPI 2.4145

By default, bcea performs the analysis for a willingness to pay of k = 25 000
monetary units,2 say £. The threshold can be easily modified by using the command
summary(m, wtp=value), where value is the willingness to pay specified by the
user. If the willingness to pay specified using wtp=value in the call to the function

2 This choice is due to the fact that the average threshold of cost-effectiveness commonly used by
NICE (National Institute for Health and Care Excellence) varies between £20 000 and £30 000 per
QALY gained.

summary is not included in the grid (which can be accessed by typing m$k), an error
message will be produced. For example, using the command summary(m,1234) will
result in the following output:
> summary(m, 1234)
Error in summary.bcea(m, 1234) :
  The willingness to pay parameter is defined in the interval [0-50000],
  with increments of 100
Calls: summary -> summary.bcea
Execution halted

The summary table displays the results of the health economic analysis, including
the optimal decision over a range of willingness to pay thresholds, identified by the
maximisation of the expected utilities. In this case the break-even point, the threshold
value k where the decision changes, is 20 100 monetary units. The summary also
reports the values for the EIB (Sect. 3.5), CEAC (Sect. 4.2.2) and EVPI (Sect. 4.3.1)
for the selected willingness to pay threshold, along with the ICER.
In the Vaccine example, the ICER is below the threshold of 25 000 and thus the
vaccination policy is cost-effective in comparison to the status quo. A more in-depth
explanation of the probabilistic sensitivity analysis and the tools provided by BCEA
to interpret and report it (e.g. the CEAC and the EVPI) is deferred to Chap. 4.
Running the analysis for a different willingness to pay, for example k = 10 000,
may result in a different optimal decision, depending on whether the ICER is above
or below the selected willingness to pay. If this new threshold were selected for the
Vaccine case, the ICER would now be above it, and thus the decision taken in this
scenario (choosing the status quo) would be associated with less uncertainty. In fact,
the ICER is estimated at 20 098 monetary units, about twice the value of the willingness
to pay threshold selected in this case (but notice that the summary reports the grid
estimate of 20 100).
> summary(m, wtp=10000)

Cost-effectiveness analysis summary

Reference intervention:  Vaccination
Comparator intervention: Status quo

Optimal decision: choose Status quo for k<20100 and Vaccination for k>=20100

Analysis for willingness to pay parameter k = 10000

            Expected utility
Status quo           -20.215
Vaccination          -22.745

                              EIB CEAC  ICER
Vaccination vs Status quo -2.5302 0.22 20098

Optimal intervention (max expected utility) for k=10000: Status quo

EVPI 0.6944

For the Smoking example, the default summary is given for Kmax. However, as
above, the willingness to pay value can be changed from this default, using the wtp
argument:

> summary(m, wtp=250)

Cost-effectiveness analysis summary

Reference intervention:  Group counselling
Comparator intervention(s): No intervention
                          : Self-help
                          : Individual counselling

Optimal decision: choose No intervention for k < 177
                         Self-help for 177 <= k < 210
                         Group counselling for k >= 210

Analysis for willingness to pay parameter k = 250

                       Expected utility
No intervention                  103.86
Self-help                        123.00
Individual counselling           126.27
Group counselling                141.73

                                                EIB   CEAC   ICER
Group counselling vs No intervention         37.867 0.6355 197.65
Group counselling vs Self-help               18.725 0.5580 209.70
Group counselling vs Individual counselling  15.451 0.5330 188.61

Optimal intervention (max expected utility) for k=250: Group counselling

EVPI 42.984

The summary table shows that, based on the expected incremental benefit, the
optimal decision changes twice over the chosen grid of willingness to pay values.
Below a willingness to pay threshold of £177 per life year gained, the optimal
decision is No intervention. For values of the willingness to pay between £177 and
£210 per life year gained, the most cost-effective decision is the Self-help
intervention. For thresholds greater than £210 the optimal strategy is Group coun-
selling. Notice that Individual counselling is dominated by the other comparators
at the considered willingness to pay values. The break-even points are relatively low
in value, indicating that the introduction of smoking cessation interventions would
be cost-effective compared to the null option (No intervention).
Due to the multiple treatment options, this summary table is more complex than
the one for the Vaccine example. The ICER is given for each of the three comparators,
compared with Group counselling. The EIB and CEAC are also given
for these pairwise comparisons. Finally, note that there is only one value given for
the EVPI (see Sect. 4.3.1). This is because the EVPI relates to the uncertainty underlying
the whole model rather than the pairwise comparisons individually; we return to this
idea in Sect. 4.3.1.2.

3.4 Cost-Effectiveness Plane

The summary table produced using the summary function already provides relevant
information about the results of the economic analysis; however, a graphical
representation of the results can indicate behaviours or particular characteristics that
may be missed in the analysis of the summary indexes. The first and probably most
important representation of the data is the cost-effectiveness plane (which incidentally
is recommended by most health technology assessment agencies as a necessary
tool for economic evaluation). The cost-effectiveness plane, discussed in Sect. 1.3
and Fig. 1.4, is a visual description of the incremental costs and effects of an option
compared to some standard intervention [1].

3.4.1 The ceplane.plot Function

The ceplane.plot function is used to produce the cost-effectiveness plane from a
BCEA object, and it is plotted by default in the graphical summary accessible via the
S3 function plot.bcea. To produce the cost-effectiveness plane from a BCEA object
m, it is sufficient to execute the following command:
> ceplane.plot(m)

In Fig. 3.5, the cost-effectiveness plane for the Vaccine example, the dots are a
representation of the differential outcomes (effectiveness and costs) observed in each
simulation. If a dot falls in the sustainability area shaded in grey, it indicates that the
incremental benefit for the comparison is positive in that simulation, for the chosen
willingness to pay threshold. The red dot is a representation of the ICER
on the cost-effectiveness plane and is obtained from the averages of the two marginal
distributions (for Δe and Δc ). The numerical value of the ICER is printed in the
top-right corner by default, while the willingness to pay threshold (k) is displayed in
the bottom-left corner. This threshold gives the gradient of the line partitioning the plane
and defines the cost-effectiveness acceptability (sustainability) region.
Several options are available for this function. The willingness to pay can be
adjusted by setting the option wtp to a different value, e.g. wtp=10000, which will
change the slope of the line, varying the value assigned to k in the equation Δc = kΔe
defining the acceptability region. The willingness to pay is defined in the interval
[0, ∞), and any value in this range can be assigned to this argument, 0 included.
Assigning a negative value to wtp will generate an error. Note that, in this case, the
selected value for wtp does not have to be in the grid defined by the element m$k.
The position of the ICER label can be adjusted by using the pos argument, placing
the legend in any chosen corner of the graph. This can be done by setting the
parameter pos to one of the values topright (the default), topleft, bottomright or
bottomleft. It is also possible to assign a two-dimensional numerical vector to this
argument. A value equal to 0 in the first element positions the label on the bottom,

[Fig. 3.5 about here: cost-effectiveness plane, Vaccination vs Status Quo; x-axis: Effectiveness differential, y-axis: Cost differential; ICER = 20097.59; k = 25000]

Fig. 3.5 The cost-effectiveness plane for the Vaccine example. The red dot indicates the average
of the distribution of the outcomes, i.e. the ICER. The grey-shaded surface is a representation of the
sustainability area for the chosen willingness to pay threshold, in this case fixed at
25 000 monetary units (the default)

while a 0 as the second element of the vector indicates the left side of the graph. If
the first and/or second elements are not equal to zero, the label is positioned on the
top and/or on the right, respectively.
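As a quick sketch (assuming the BCEA package is loaded and m is an existing bcea object, e.g. built from the Vaccine example), the keyword and numeric forms of pos can be used interchangeably to move the ICER label:

```r
# Assumes BCEA is loaded and `m` is an existing bcea object (e.g. Vaccine);
# the two calls below are equivalent ways to move the ICER label
library(BCEA)
ceplane.plot(m, wtp = 25000, pos = "bottomleft")  # keyword form
ceplane.plot(m, wtp = 25000, pos = c(0, 0))       # numeric form: first 0 = bottom, second 0 = left
```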
In cases with more than two interventions, like for the Smoking example, the
argument comparison can be used to select which pairwise comparisons to visualise
on the plot. For example, the following code will plot the cost-effectiveness plane
for Group Counselling (reference treatment) against Self-Help (treatment 2):
> ceplane.plot(m, comparison=2, wtp=250)

It is also possible to add more than one treatment comparison by setting comparison
as a vector, e.g.
> ceplane.plot(m, comparison=c(1,3), wtp=250)

The values in this vector must be valid indexes, i.e. they need to be integer positive
numbers between 1 and the number of non-reference comparators, 3 for the Smoking
example. If this number is not known, the number of non-reference comparators

included in a BCEA object m is stored in the element n.comparisons of the object,
accessible using the command:
> m$n.comparisons
[1] 3

Note that if more than one pairwise comparison is plotted then the ICER and sus-
tainability area cannot be plotted using the default graphics package.

3.4.2 ggplot Version of the Cost-Effectiveness Plane

To add these elements to the graphic, the ggplot graphics package must be used.
This package also allows the user to have more control over the plot, which is useful
in situations where the cost-effectiveness plane must conform to certain publication
standards.
In multi-decision problems, the acceptability area is included by default in the
ggplot cost-effectiveness plane. The ICER, on the other hand, is not included by
default, but it can be added using the argument ICER.size=2, which displays
the ICERs with a size equal to 2 millimetres. The following code produces Fig. 3.6,
which includes the 3 pairwise ICERs of size 2 mm and a non-default acceptability
area with a willingness to pay of 250 monetary units.
> ceplane.plot(m, wtp=250, graph="ggplot2", ICER.size=2)

The results from Fig. 3.6 indicate that the three pairwise comparisons—represented
by the three different clouds of points—are similar in terms of variability. The
distance from one cloud to the next seems similar in terms of increments of the differential
costs and effectiveness. Clearly, the most effective and costly intervention on average
is Group counselling, indicated by all three ICERs residing in the top-right quadrant.
This is followed by Individual counselling and Self-help. The No intervention option
is obviously the least expensive strategy, with no costs to be borne, but it also has the
smallest probability of success. Note also that the costs are directly proportional
to the efficacy across the comparators, making Self-help the least expensive
but also the least effective among the three active interventions. Individual
counselling sits between Self-help and Group counselling for both outcomes.
All options presented in Sect. 3.4.1 are compatible with the ggplot cost-effectiveness
plane, but the opposite is not always true. In addition to the base plot manipulations,
it is possible to use the size option to set the value (in millimetres) of the
size of the willingness to pay label. A null size (i.e. size=0) can be set to prevent
the label from being displayed. Depending on the distribution of the cloud of points
and the chosen willingness to pay threshold, the default positioning algorithm for
the willingness to pay label can yield a sub-optimal placement, in particular when the
acceptability region limit crosses the left margin of the plot. An alternative positioning
can be selected by setting the argument label.pos=FALSE in the ceplane.plot
function call. This option will place the label at the bottom of the plot area.
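For instance (a sketch, assuming m is the Smoking bcea object), the label can be suppressed entirely or forced to the bottom:

```r
# Assumes BCEA is loaded and `m` is the Smoking bcea object
ceplane.plot(m, wtp = 250, graph = "ggplot2", size = 0)           # hide the willingness to pay label
ceplane.plot(m, wtp = 250, graph = "ggplot2", label.pos = FALSE)  # place the label at the bottom
```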

[Fig. 3.6 about here: ggplot cost-effectiveness plane for the Smoking example; x-axis: Effectiveness differential, y-axis: Cost differential; k = 250; legend: Group counselling vs No treatment, Group counselling vs Self-help, Group counselling vs Individual counselling]

Fig. 3.6 The cost-effectiveness plane for the Smoking example produced by the ceplane.plot
function by setting the argument graph to "ggplot2". The theme applied in the graph is a modified
version of theme_bw to keep consistency between this version and the one using base graphics. The
output of the function is a ggplot object

For the Vaccine example, the ICER value is printed on the cost-effectiveness
plane. For a model with only two decisions, the ICER is also displayed in the ggplot
version of the cost-effectiveness plane. The ICER legend positioning works slightly
differently for the ggplot version and is in general less constrained than in the base
graphics plot. It is possible to place it outside the plot limits by assigning the
values "bottom", "top", "right" or "left" (with quotes) to the pos argument.
Alternatively, it can be drawn inside the plot limits using a two-dimensional vector
indicating the relative positions ranging from 0 to 1 on the x- and y-axis respectively;
for example, the option pos=c(0,1) will put the label in the top-left corner of the
plot area, and pos=c(0.5,0.5) will place it at the centre of the plot. The default
value is set to FALSE, indicating that the label will appear in the top-right corner of
the plot area, in a slightly more optimised position than setting pos=c(1,1). Setting
the option value to TRUE will place the label at the bottom of the plot.
In the case of multiple comparisons, the legend detailing which intervention is
represented by which cloud of points—seen on the same plot for the Smoking example—
can be manipulated in a similar fashion to the ICER label for the single comparison
model. However, for multiple decisions the two commands pos="bottom" and
pos=TRUE differ: the first uses a horizontal alignment for the elements in the legend,
while the latter will stack them vertically.
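The difference between the two layouts can be sketched directly (assuming m is the Smoking bcea object):

```r
# Assumes BCEA is loaded and `m` is the Smoking bcea object
ceplane.plot(m, wtp = 250, graph = "ggplot2", pos = "bottom")  # legend elements aligned horizontally
ceplane.plot(m, wtp = 250, graph = "ggplot2", pos = TRUE)      # legend elements stacked vertically
```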

In addition to the options presented above, it is possible to modify the graph
further thanks to the flexibility given by the ggplot class. As an example, the default
ggplot plot grid can be displayed by adding the associated theme option to the
graph object.3 The modified graph can be produced by the following line:
> ce.plot = ceplane.plot(m, graph="ggplot2", label.pos=TRUE, pos="bottom") +
+   theme(panel.grid = element_line())

The layers of the plot can be removed, added or modified by accessing the layers
element of the ggplot object. For a pairwise comparison, the object will be composed
of 7 layers: the line and the area defining the acceptability region, two layers
containing the two axes, the points representing the simulations, the point representing
the ICER and finally the willingness to pay label. These elements can be accessed
with the command:
> ce.plot$layers
[[1]]
mapping: x = x, y = y
geom_line: colour = black
stat_identity:
position_identity: (width = NULL, height = NULL)

[[2]]
mapping: x = x, y = y
geom_polygon: fill = light grey, alpha = 0.3
stat_identity:
position_identity: (width = NULL, height = NULL)

[[3]]
mapping: yintercept = 0
geom_hline: colour = grey
stat_hline: yintercept = NULL
position_identity: (width = NULL, height = NULL)

[[4]]
mapping: xintercept = 0
geom_vline: colour = grey
stat_vline: xintercept = NULL
position_identity: (width = NULL, height = NULL)

[[5]]
geom_point: na.rm = FALSE, size = 1
stat_identity:
position_identity: (width = NULL, height = NULL)

[[6]]
mapping: x = lambda.e, y = lambda.c
geom_point: na.rm = FALSE, colour = red, size = 2
stat_identity:
position_identity: (width = NULL, height = NULL)

[[7]]
mapping: x = x, y = y
geom_text: label = k = 250, hjust = 0.15, size = 3.5
stat_identity:
position_identity: (width = NULL, height = NULL)

3 The default theme in the BCEA plots is an adaptation of theme_bw().



[Fig. 3.7 about here: modified ggplot cost-effectiveness plane for the Smoking example, without acceptability region or willingness to pay label; x-axis: Effectiveness differential, y-axis: Cost differential; legend: Group counselling vs No treatment, Group counselling vs Self-help, Group counselling vs Individual counselling]

Fig. 3.7 A version of the cost-effectiveness plane modified by changing the ggplot object proper-
ties. The cost-effectiveness acceptability region and the willingness to pay label have been removed,
and a panel grid has been included. The modularity of this class of objects allows for a high degree
of personalisation of the final appearance

Thus it is possible to modify the plot post hoc, by removing, adding or modifying
layer elements. For example, to produce a plot of the plane excluding the cost-
effectiveness acceptability area it is sufficient to execute the following code, which
will produce the plot in Fig. 3.7.
> # remove layers 1, 2 and 7 from the ggplot object
> ce.plot$layers <- ce.plot$layers[-c(1,2,7)]
> # print the plot
> plot(ce.plot)

3.4.3 Advanced Options for ceplane.plot

The function ceplane.plot also contains some advanced options which allow the
user to customise the appearance of the resulting graph even further. In particular,
it is possible to pass the optional input xlab="string", where "string" is a text
string containing the label that the user wants to put on the x-axis, instead of the
default "Effectiveness differential". Similarly, it is possible to pass an argument
ylab="string", which replaces the default string "Cost differential" for

the y-axis. Finally, the option title="string" instructs BCEA to print a customised
title for the graph. An example of advanced use for this function is the following:
> ceplane.plot(m, xlab="Difference in QALYs",
+     ylab="Difference in costs (Pounds)",
+     title="C/E plane")

In addition, it is possible to modify the x- and y-axis limits using the options
xl=c(lower,upper) and yl=c(lower,upper), where lower and upper are suitable
values (these can of course be different for the two axes).
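A sketch combining these options (the numeric limits below are arbitrary illustrative values, not taken from the book):

```r
# Assumes BCEA is loaded and `m` is the Vaccine bcea object;
# the axis limits are arbitrary illustrative values
ceplane.plot(m,
             xlab = "Difference in QALYs",
             ylab = "Difference in costs (Pounds)",
             xl = c(-0.001, 0.002),  # x-axis limits
             yl = c(-10, 15))        # y-axis limits
```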

3.5 Expected Incremental Benefit

The expected incremental benefit (EIB) is a summary measure useful to assess the
potential changes in the decision under different scenarios (see Sect. 1.3). When
considering a pairwise comparison (e.g. in the simple case of a reference intervention
t = 1 and a comparator, such as the status quo, t = 0), it is defined as the difference
between the expected utilities of the two alternatives:

EIB = E[u(e, c; 1)] − E[u(e, c; 0)] = U¹ − U⁰.    (3.1)

In (3.1), U¹ and U⁰ are synthetic measures of the benefits which each intervention t is
expected to produce. Since the aim of the cost-effectiveness analysis is to maximise
the benefits, the treatment with the highest expected utility will be selected as the
"best" treatment option. Thus, if EIB > 0, then t = 1 is more cost-effective than
t = 0.
Of course, the expected utility is defined depending on the utility function selected
by the decision-maker; when the common monetary net benefit is used, the EIB can
be expressed as a function of the effectiveness and cost differentials (Δe , Δc ), as
in (1.6).
In practical terms, BCEA estimates the EIB using the S = n.sim simulated values
passed as inputs for the relevant quantities (e, c) as

EIB = (1/S) Σ_{s=1}^S [u(e_s, c_s; 1) − u(e_s, c_s; 0)],

where (e_s, c_s) are the s-th simulated values of the population average measures of
effectiveness and costs.
Assuming that the monetary net benefit is used as utility function, this effectively
means that BCEA computes a full distribution of incremental benefits

IB(θ ) = kΔe − Δc

[Fig. 3.8 about here: density plot "Incremental Benefit distribution, Vaccination vs Status Quo"; x-axis: IB(θ), y-axis: Density; shaded area labelled p(IB(θ) > 0, k = 25000)]

Fig. 3.8 A Gaussian kernel density estimate of the incremental benefit distribution observed over
the simulations, produced using the ib.plot function. The shaded area indicates the observed
frequency of the incremental benefit being positive for the chosen willingness to pay threshold

—recall that (Δe, Δc) are random variables, whose variations are determined by the
posterior distribution of θ. For each simulation s = 1, …, S, BCEA computes the
resulting value of IB(θ) and then the EIB can be estimated as the average of this
distribution

EIB = (1/S) Σ_{s=1}^S IB(θ_s),

where θ_s is the realised configuration of the parameters θ for the s-th simulation.
This procedure clarifies the existence of the two layers of uncertainty in the analysis,
which is also evident in the cost-effectiveness plane: uncertainty in the parameters
is characterised by considering the full (posterior) distribution of the relevant quantities
(Δe, Δc). This already averages out the individual variability, but can be further
summarised by taking the expectation over the distribution of the parameters, to
provide summaries such as the ICER and the EIB.
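The two-stage computation above can be sketched in plain R, using hypothetical simulated differentials in place of the posterior draws that a real model would supply (here the monetary net benefit is used, so IB(θ) = kΔe − Δc):

```r
# Hypothetical stand-ins for S posterior simulations of (Delta_e, Delta_c)
set.seed(1)
S <- 1000
delta_e <- rnorm(S, mean = 0.0005, sd = 0.0004)  # effectiveness differentials
delta_c <- rnorm(S, mean = 10, sd = 5)           # cost differentials
k <- 25000                                       # willingness to pay

ib  <- k * delta_e - delta_c  # full distribution of the incremental benefit
eib <- mean(ib)               # EIB: average of the IB distribution
```

Averaging the per-simulation incremental benefits integrates out the parameter uncertainty, which is exactly the second layer of uncertainty described above.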
The value of the IB is accessible from the BCEA object by the function sim.table,
detailed in Sect. 4.2.1. A graphical summary of the distribution of the incremen-
tal benefits for pairwise comparisons can be produced using the BCEA command
ib.plot, which produces the graph in Fig. 3.8.
> ib.plot(m)

The represented distribution is a Gaussian kernel density approximation of the
incremental benefit calculated for each simulation. The kernel estimate can be
adjusted using the parameters bw, identifying the kernel smoothing bandwidth, and
n, the number of equally spaced points at which the density is to be estimated. These
two parameters are passed to the density function available in the stats package.
The willingness to pay can be modified by giving a different value to the wtp parameter,
with k = 25 000 as the default. The limits of the x-axis can be adjusted using the
xlim argument, by supplying the lower and upper bounds of the axis
as the elements of a two-dimensional vector.
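A sketch of these arguments used together (the bandwidth and axis values are arbitrary illustrative choices; assumes m is the Vaccine bcea object):

```r
# Assumes BCEA is loaded and `m` is the Vaccine bcea object;
# bw, n and xlim values are arbitrary illustrative choices
ib.plot(m, wtp = 20000, bw = 1.5, n = 1024, xlim = c(-60, 80))
```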
As mentioned in Sect. 1.3, the EIB can be directly linked with the decision rule
applied to the ICER. If a willingness to pay value k* exists at which
EIB = 0, this value of k is called the break-even point. It corresponds to the point of maximum
uncertainty associated with the decision between the two comparators, with equal
expected utilities for the two interventions. In other terms, for two willingness to pay
values, one greater and one less than k*, there will be two different optimal decisions.
The expected utilities and the EIB values for the chosen willingness to pay are
included in the summary table produced by the summary function (see Sect. 3.3). It
is possible to explore how the decision changes over different willingness to pay
scenarios by analysing the behaviour of the EIB as a function of k. A graphical
summary of the variation of the expected incremental benefit for the Vaccine example is
depicted in Fig. 3.9, which can be produced by the following command:

[Fig. 3.9 about here: "Expected Incremental Benefit and 95% credible intervals"; x-axis: Willingness to pay (0–50000), y-axis: EIB; break-even point k* = 20100]

Fig. 3.9 Expected incremental benefit as a function of the willingness to pay for the Vaccine
example. The break-even value corresponds to k* = 20100, indicating that above that threshold
the alternative treatment is more cost-effective than the status quo, since for k > k* it follows that
U¹ − U⁰ > 0. The use of the net benefit as a utility function makes the EIB function linear with
respect to the willingness to pay k

[Fig. 3.10 about here: "Expected Incremental Benefit" for the Smoking example; x-axis: Willingness to pay (0–500), y-axis: EIB; three lines: Group counselling vs No treatment, vs Self-help, vs Individual counselling; break-even points k* = 159 and k* = 225]

Fig. 3.10 Expected incremental benefit as a function of the willingness to pay for the Smoking
example. There are two break-even points in this example, corresponding to k* = 159 and k* = 225.
Note that the EIBs are computed with respect to Group counselling; this means that, while the second
break-even point coincides with the Group counselling versus Self-help line crossing 0, the first
break-even point is given at the point where the No treatment and Self-help lines intersect, as these
are the most cost-effective treatments for low willingness to pay values

> eib.plot(m)

The function eib.plot plots all the available comparisons by default. Optionally,
a specific subset of the comparisons can be represented, by assigning to the argument
comparison a vector of numeric values indexing the comparisons to be included.
For example, if the BCEA object contains
multiple interventions, the option comparison=2 will produce the EIB plot for the
second non-reference comparator versus the reference one, sorted by the order of
appearance in the matrices e and c given to the BCEA object. The break-even points
(if any) can be excluded from the plot by setting the argument size=NA. However,
controlling the label size via the size argument is possible only in the ggplot version
of the plot (see below).
The pos option is used only when multiple comparisons are available. In this
case a legend allowing the user to identify the different comparisons is added to the
plot, and it can be positioned as in the ceplane.plot function. The values "top",
"bottom", "right" or "left", or a combination of two of them (e.g. "topright"),
will place the legend in the respective position inside the plot area. For example, the
code
> eib.plot(m, pos="topleft")

where m is the BCEA object for the Smoking example, produces Fig. 3.10. The
parameter pos can also be specified in the form of a two-element numeric vector. The
value 0 in the first position indicates the left of the plot, while 0 in the second position

will place the label on the bottom of the plot. A numeric value different from 0 (e.g.
equal to 1) will refer to the right or the top, respectively, if in the first or second position
of the vector.
Notice that in Fig. 3.9 two additional lines give the credible intervals for the
EIB, whereas for the multiple comparisons the credible intervals are not given. The
argument plot.cri controls these credible intervals and is set to NULL by default.
This means that if a single comparison is available or selected, the eib.plot function
also draws the 95% credible interval of the distribution of the incremental benefit. The
intervals are not drawn by default if multiple comparisons are selected. However, they
can be included in the graph by setting the parameter plot.cri=TRUE. In addition,
the interval level can be set using the alpha argument (default value 0.05, implying
that the 95% credible interval is drawn).
If plot.cri=TRUE, the function will calculate the credible intervals at level 1 −
alpha. The credible intervals are estimated by default by calculating the 2.5-th and
97.5-th percentiles of the IB distribution. Alternatively they can be estimated using
a normal approximation by setting cri.quantile=FALSE. This alternative method
assumes normality in the distribution of the incremental benefit and thus, for each
value k of the grid approximation of the willingness to pay, the credible intervals are
estimated as

EIB ± z_{α/2} √Var(IB(θ)),

where z_{α/2} indicates the appropriate quantile of the standard normal distribution.
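Both interval methods can be sketched on a hypothetical sample standing in for the IB simulations:

```r
# Hypothetical stand-in for the simulated IB values at a given k
set.seed(42)
ib <- rnorm(10000, mean = 5, sd = 20)
alpha <- 0.05

# Default method (cri.quantile=TRUE): empirical percentiles of the IB sample
cri_quantile <- quantile(ib, probs = c(alpha / 2, 1 - alpha / 2))

# Normal approximation (cri.quantile=FALSE): EIB +/- z * sqrt(Var(IB))
eib <- mean(ib)
cri_normal <- eib + qnorm(c(alpha / 2, 1 - alpha / 2)) * sd(ib)
```

With a roughly symmetric IB distribution the two intervals will be close; the quantile method is more robust to skewness.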


The available options for the ggplot versions of the IB and EIB plots are largely
the same as for the base functions. The only difference is that in the eib.plot function the
size of the break-even point labels can be changed using the argument size, giving
the size of the text in millimetres. The ggplot versions can be produced by setting
the parameter graph="ggplot2" (or graph="g", using partial matching) in both
ib.plot and eib.plot function calls.

3.6 Contour Plots

Contour plots are used to extend the amount of graphical information contained in
the representation of the simulated outcomes on the cost-effectiveness plane. The
BCEA package implements two different tools to compare the joint distribution of
the outcomes: the functions contour⁴ and contour2. The contour function is an
alternative representation of the cost-effectiveness plane given by ceplane.plot.
While the latter focuses on the distributional average (i.e. the ICER), contour gives
a graphical overview of the dispersion of the cloud of points, displaying information
about the uncertainty associated with the outcomes.

4 The contour.bcea function is an S3 method for BCEA objects, thus it can be invoked by calling
the function contour and giving a valid bcea object as input.

[Fig. 3.11 about here: contour plot, Vaccination vs Status Quo; x-axis: Effectiveness differential, y-axis: Cost differential; quadrant labels: Pr(Δe ≤ 0, Δc > 0) = 0.169, Pr(Δe > 0, Δc > 0) = 0.811, Pr(Δe ≤ 0, Δc ≤ 0) = 0.001, Pr(Δe > 0, Δc ≤ 0) = 0.019]

Fig. 3.11 The contour plot for the Vaccine example of the bivariate distribution of the differential
effectiveness and costs, produced by the function contour.bcea. The contour lines give a
representation of the variability of the distribution and of the relationship between the two outcomes.
The four labels at the corners of the plot indicate the proportion of simulations falling in each quadrant
of the Cartesian plane

The contour function can be invoked with the following code, which will produce
the plot in Fig. 3.11:
> contour(m)

The function plots the outcomes of the simulations on the cost-effectiveness plane,
including contours indicating the different density levels of the joint distribution
of the cost and effectiveness differentials. The contour lines divide the observed
bivariate distribution of the outcomes (Δe, Δc) into a prespecified number of areas. Each
contour line is a curve along which the estimated probability density function
has a constant value. For example, if the chosen number of contour lines is four,
the distribution will be divided into five areas, each containing 20% of all simulated
outcomes. A larger number of simulations will determine a more precise estimation of
the variance and therefore of the contours of the distribution. By default, the function
partitions the Cartesian plane into 5 regions, each associated with an equal estimated
probability density with respect to the bivariate distribution of the differential
outcomes.

If a single comparison is available or selected (i.e. the comparison argument is a
single-element vector), the graph indicates the probability of a point (i.e. the outcome
of a future simulation) falling in each of the four quadrants of the cost-effectiveness
plane. The probability that the status quo dominates the alternative treatment is
equal to the probability of a future outcome landing in the north-western quadrant.
Conversely, the probability of the alternative treatment dominating is the probability
of residing in the south-eastern quadrant. For example, Fig. 3.11 shows the basic
results for the Vaccine example; the estimated probability of the vaccination strategy
dominating is 0.019, while the probability of the status quo dominating is 0.169.
Several options are available for the contour function. Again, the parameter
comparison determines which comparison should be plotted in a multi-decision
setting. The base graphics version produces only a single comparison per plot,
therefore for multi-comparison models this argument must be used. If the comparison
argument is not passed, then the first comparison is plotted by default. Multiple
comparisons can be plotted at the same time by choosing the ggplot version of the
plot.
The scale argument can be used to change the density of the contours. Internally,
BCEA calculates the contours using the bivariate kernel density estimation function
kde2d, which is based on an argument h giving the bandwidth in each dimension.
This is passed as h = (σ̂e/scale, σ̂c/scale), where σ̂e and σ̂c are the standard
deviations of the effectiveness and costs, respectively. Therefore, increasing the
scale gives tighter contours.
The option nlevels indicates the number of areas to be included on the graph, e.g.
Fig. 3.11 has 5 levels. However, as this parameter is highly dependent on the value
of h, sometimes the actual number of levels can differ from nlevels. The argument
levels, on the other hand, allows the manual specification of a vector of values at
which to draw the contour lines. The number of levels will then be equal to the length
of the vector.
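A sketch of the two alternatives (the numeric values are arbitrary illustrative choices; assumes m is a bcea object):

```r
# Assumes BCEA is loaded and `m` is a bcea object;
# scale and levels values are arbitrary illustrative choices
contour(m, scale = 1)                  # change the kernel bandwidth via scale
contour(m, levels = c(0.2, 0.5, 0.8))  # contour lines at manually chosen density values
```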
Due to the S3 nature of the function contour, the help page for the usage within
BCEA can be accessed by the command help(contour.bcea) or alternatively by
?contour.bcea.
The ggplot version of the contour.bcea function is capable of plotting multiple
comparisons at the same time. It is possible to represent different pairwise compar-
isons in the same plot by setting the parameter comparison to a vector of the chosen
comparisons.
> contour(m, graph="ggplot2", comparison=c(1,3))

For multiple comparisons, a legend to identify the different comparisons will be included and its position can be adjusted using the pos argument. Again, it can be set outside the plot with the four options "bottom", "top", "left" or "right".
Alternatively, it is possible to put the legend inside the plot using a two-element
vector as in the ceplane.plot function (see Sect. 3.4). Note that, the argument pos
is only used in the ggplot version of the graph, as it is only used for multiple
comparisons.
Fig. 3.12 The contour plot of the bivariate distribution of the differential effectiveness and
costs for the Vaccine case study produced by the function contour2. This function differs from
contour.bcea since it includes decisional elements such as the cost-effectiveness acceptability
region. In this example it can be seen that the mean of the distribution is not exactly centred, since
the mean is driven by the simulations resulting in a high effectiveness differential between the two
strategies. This results in a difference between the mean and median of the distribution

The contour2 function includes the contour of the bivariate distribution as well
as decision-making elements (i.e. the ICER and the cost-effectiveness acceptability
region). The plot in Fig. 3.12 is produced by the following code:
> contour2(m)

The parameters which can be set in the contour2 function are:


• wtp, the value of the willingness to pay threshold involved in the decision;
• comparison, indicating which comparison should be included in the plot (or com-
parisons, if ggplot is used);
• xl and yl, which can be used to set the limits on the x- and y-axis respectively by
assigning them a two-dimensional vector.
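For instance, the options above might be combined as in the following sketch (the axis limits are arbitrary values, chosen here only for illustration):

```r
# contour2 with an explicit willingness to pay and manual axis limits
contour2(m, wtp=25000, xl=c(-0.0005, 0.0015), yl=c(-10, 15))
```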
A clear representation of the cost-effectiveness plane for multiple comparisons can
be produced by using the ggplot2 version of the contour2 function. This represents
the outcomes of the simulations on the plane, together with the contours of the
Fig. 3.13 The representation of the cost-effectiveness plane for the smoking cessation example.
The contours highlight the similarity in the uncertainty between the three bivariate distributions of
the differential outcomes and costs. The differential distributions are all contained in the positive
part of the y axis on the plane, meaning that the “Group counselling” intervention is more expensive
than all three others with low uncertainty. The group counselling intervention is cost-effective on
average with respect to all three other comparators for a willingness to pay of £250 per life year
saved

distributions and the cost-effectiveness acceptability region. The plot in Fig. 3.13
can be produced by the following command:
> contour2(m, graph="ggplot2", comparison=NULL, wtp=250, pos=TRUE, ICER.size=2)

This Figure is similar to Fig. 3.6, with the addition of the bivariate contours, which allow the user to identify the variation in the different comparisons more clearly.
Evidently, this version of the contour2 function is able to pass additional values
to the ceplane.plot function (e.g. the arguments pos, ICER.size, label.pos).
In addition, both contour and contour2 can be further customised using the same
optional arguments that have been described in Sect. 3.4.3, irrespective of which graphical engine is used to produce the graph.

3.7 Health Economic Evaluation for Multiple Comparators and the Efficiency Frontier

There are several ways of looking at the respective cost-effectiveness between com-
parators in an analysis of multiple comparisons. The most common graphical tool
for this evaluation is the cost-effectiveness plane. However, the comparative eval-
uation can be executed using another graphical instrument, the cost-effectiveness
efficiency frontier. The efficiency frontier is an extension of the standard approach
of incremental cost-effectiveness ratios and provides information for the health eco-
nomic evaluation when a universal willingness to pay threshold is not employed (e.g.
Germany) and it is particularly informative for assessing maximum reimbursement
prices [5].
The efficiency frontier compares the net costs and benefits of different interven-
tions in a therapeutic area. It is different from the common differential approach
(e.g. the cost-effectiveness plane) as the net measures are used. The predicted costs
and effectiveness for the interventions under consideration are compared directly to
the costs and effectiveness measure for treatments that are currently available. The
frontier itself defines the set of interventions for which cost is at an acceptable level
for the benefits given by the treatment. A new treatment would be deemed efficient—
i.e. it would then lie on the efficiency frontier—if, either, the average effectiveness
for the new treatment is greater than any of the currently available treatments or,
the cost of the treatment is lower than currently available treatments with the same
effectiveness. This area of efficiency lies to the right of the curve in Fig. 3.14.
Practically, efficiency is determined sequentially. This means that we start from
an arbitrary inefficient point (i.e. the origins of the axes) and then determine the
intervention with the smallest average effectiveness. In general, this intervention
will also have a higher cost than the starting point, and it will have the lowest ICER value amongst the comparators. If two ICERs are equal then the treatment with the lowest cost is deemed
to be efficient, i.e. lie on the efficiency frontier. The next intervention included on the
frontier then has the next lowest effectiveness and cost measures—i.e. has the lowest
ICER value compared to the current efficient intervention. This method proceeds
until all efficient technologies have been identified.
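The sequential procedure just described can be sketched in a few lines of plain R, independently of the internal implementation in BCEA; the function frontier_ids below is a simplified illustration (it ignores, for instance, the tie-breaking rule for equal ICERs):

```r
# Sketch of the sequential construction of the efficiency frontier.
# eff, cost: vectors of average effectiveness and cost, one per comparator.
# Returns the indices of the interventions joining the frontier after the
# starting point (which is on the frontier by construction).
frontier_ids <- function(eff, cost, start=c(0, 0)) {
  ids <- integer(0)
  current <- start                     # current frontier point
  candidates <- seq_along(eff)
  repeat {
    # only interventions more effective than the current frontier point
    candidates <- candidates[eff[candidates] > current[1]]
    if (length(candidates) == 0) break
    # ICER of each candidate against the current frontier point
    icer <- (cost[candidates] - current[2]) / (eff[candidates] - current[1])
    best <- candidates[which.min(icer)]  # lowest ICER joins the frontier
    ids <- c(ids, best)
    current <- c(eff[best], cost[best])
  }
  ids
}

# Smoking cessation averages, as reported in the ceef.plot summary
eff  <- c(0.41543, 0.67414, 0.88713, 1.13883)
cost <- c(0, 45.533, 95.507, 142.981)
frontier_ids(eff, cost, start=c(eff[1], cost[1]))
# 2 4: Self-help and Group counselling join No intervention on the frontier
```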
The BCEA function ceef.plot produces a graphical and, optionally, a tabular output of the efficiency frontier, for both single and multiple comparisons. Given a bcea object
m, the frontier can be produced simply by the ceef.plot(m) command. In the plot,
the circles indicate the mean for the cost and effectiveness distributions for each
treatment option. The number in each circle corresponds to the order of the treatments
in the legend. If the number is black then the intervention is on the efficiency frontier.
Grey numbers indicate dominated treatments. By default, the function presents the
efficiency frontier plot in Fig. 3.14 and a summary, as displayed below:

Fig. 3.14 The cost-effectiveness efficiency frontier for the smoking cessation example produced
by the ceef.plot function. The colours of the numbers in the circles indicate if a comparator is
included on the efficiency frontier or not. In this case, the interventions No treatment, Self-help and
Group counselling are on the frontier. Individual counselling is extendedly dominated by Self-help
and Group counselling

> ceef.plot(m, pos="right", start.from.origins=FALSE)

Cost-effectiveness efficiency frontier summary

Interventions on the efficiency frontier:

                   Effectiveness   Costs  Increase slope  Increase angle
No intervention          0.41543   0.000              NA              NA
Self-help                0.67414  45.533          176.01          1.5651
Group counselling        1.13883 142.981          209.70          1.5660

Interventions not on the efficiency frontier:

                        Effectiveness  Costs      Dominance type
Individual counselling        0.88713 95.507  Extended dominance

The text summary is produced by setting the argument print.summary to TRUE (the default) and can be suppressed by giving it the value FALSE. The summary is
composed of two tables, reporting information for the comparators included on the
frontier. It also details the average health effects and costs for the comparators not
on the frontier, if any. For the interventions included on the frontier, the slope of the
frontier segment connecting the intervention to the previous efficient one and the
angle of inclination of the segment (with respect to the x-axis), measured in radians,
are also reported. In particular, the slope can be interpreted as the increase in costs
for an additional unit in effectiveness, i.e. the ICER for the comparison against the
previous treatment. For example, the ICER for the comparison between Self-help
and No treatment is £176.01 per life year gained.
The dominance type for comparators not on the efficiency frontier is reported
in the output table. This can be of two types: absolute or extended dominance. An
intervention is absolutely dominated if another comparator has both lower costs and
greater health benefits, i.e. the ICER for at least one pairwise comparison is negative.
Comparators in a situation of extended dominance are not wholly inefficient, but
are dominated because a combination of two other interventions will provide more
benefits for lower costs. For example, in the Smoking example, a combination of
Group Counselling and Self-Help would give more benefits for the same cost as
Individual Counselling.
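Using the average values reported in the summary above, this can be checked with a quick back-of-the-envelope calculation:

```r
# Average (effectiveness, cost) pairs from the efficiency frontier summary
self_help  <- c(0.67414,  45.533)
group      <- c(1.13883, 142.981)
individual <- c(0.88713,  95.507)

# Weight on Self-help so that the mixture matches the effectiveness of
# Individual counselling
w <- (group[1] - individual[1]) / (group[1] - self_help[1])  # about 0.54

mix_cost <- w * self_help[2] + (1 - w) * group[2]
mix_cost        # about 90.2, lower than...
individual[2]   # ...95.507: Individual counselling is extendedly dominated
```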
The plot produced by the ceef.plot function, displayed in Fig. 3.14, is composed
of different elements:
• the outcomes of the simulations, i.e. the matrices e and c provided to the bcea
function, represented by the scatter points, with different colours for the compara-
tors;
• the average cost and effectiveness point for each comparator considered in the
analysis. These are represented by the circles including numbers indexing the
interventions by their order of appearance in the bcea object. The legend provides
information on the labels of the comparators;
• the efficiency frontier line, connecting the interventions on the frontier;
• the dominance regions, shaded in grey. A lighter shade indicates that interventions
in that area would be (absolutely) dominated by a single intervention, while mul-
tiple interventions would dominate comparators in the area with the darker shade.
Comparators in the non-shaded areas between the dominance regions and the
efficiency frontier are extendedly dominated. The graphical representation of the
dominance areas can be suppressed by setting dominance=FALSE in the function
call.
The start.from.origins option is used to choose the starting point of the fron-
tier. By default its value is set to TRUE, meaning that the efficiency frontier will
have the origins of the axes, i.e. the point (0, 0) as starting point. If this is set to
FALSE, the starting point will be the average outcomes of the least effective and
costly option among the compared interventions. If any of the comparators result in
negative costs or benefits, the argument start.from.origins will be set to FALSE
with a warning message. The starting point will not be included in the summary unless it corresponds to the average outcomes of one of the included interventions.
As German guidelines recommend representing the costs on the x-axis and ben-
efits on the y-axis [5], an option to invert the axes has been included. This can be
done by specifying flip=TRUE in the function call. It is worth noting that, in the
efficiency frontier summary, the angle of increase of the segments in the frontier will
reflect this axes inversion. However, the segment slopes will not change, to retain
consistency with the definition of ICER (additional cost per gain in benefit).
The function allows any subset of the comparators to be included in the estimation
of the efficiency frontier. The interventions to be included in the analysis can be
selected by assigning a numeric vector of at least two elements to the argument
comparators, with the indexes of the comparators as elements. For example, to
include only the first and third comparator in the efficiency frontier for the smoking
cessation analysis, it is sufficient to add comparators=c(1,3) to the efficiency
frontier function call. Additionally, the positioning of the legend can be modified
from the default (i.e. in the top-right corner of the graph) by modifying the value
assigned to the argument pos. The values that can be assigned to this argument are
consistent to the other plotting functions in BCEA , e.g. ceplane.plot and eib.plot.
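For instance (a sketch; the chosen options are arbitrary):

```r
# Frontier restricted to the first and third comparators, with the
# dominance regions suppressed and the legend moved to the bottom
ceef.plot(m, comparators=c(1, 3), dominance=FALSE, pos="bottom")
```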
The ggplot2 version of the graph shares the design with the base graphics version.
In addition to the higher flexibility in the legend positioning provided by the argument
pos, a named theme element can be included in the function call, which will be
added to the ggplot2 object. The dominance regions are also rendered in a slightly
different way, with levels of transparency stacking up when multiple comparators
define a common dominance area. As such, the darkness of the grey-shaded areas
depends on the number of comparators sharing absolute dominance areas.
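A sketch of a possible call combining these options:

```r
# ggplot2 rendering of the frontier, with flipped axes as per the German
# (IQWiG) convention
ceef.plot(m, graph="ggplot2", flip=TRUE)
```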

References

1. G. Baio, Bayesian Methods in Health Economics (Chapman Hall/CRC Press, Boca Raton, FL,
2012)
2. C. Williams, J. Lewsey, A. Briggs, D. Mackay (2016). doi:10.1177/0272989X16651869
3. G. Baio, A. Berardi, BCEA: A Package for Bayesian Cost-Effectiveness Analysis. http://cran.r-project.org/web/packages/BCEA/index.html
4. H. Wickham, ggplot2: Elegant Graphics for Data Analysis (Use R!) (Springer, Berlin, 2009)
5. General methods for the assessment of the relation of benefits to costs. Technical report, Institute
for Quality and Efficiency in Health Care (IQWiG) (2009). https://www.iqwig.de/download/
General_Methods_for_the_Assessment_of_the_Relation_of_Benefits_to_Costs.pdf
Chapter 4
Probabilistic Sensitivity Analysis Using BCEA

4.1 Introduction

Theoretically, as mentioned in Sect. 1.3, the maximisation of the expected utility is all that is required to determine the best course of action in the face of uncertainty and
given current evidence [1–3]. This means that if we completely trust all the assump-
tions made in the current modelling and the data used to inform the unobserved and
unobservable quantities, then the computation of ICERs and EIBs would be suffi-
cient to determine which treatment is the most cost-effective. The decision-making
process would therefore be completely automated under these circumstances. This
implies that, as shown throughout Chap. 3, the vaccination strategy and the Group
Counselling interventions would be implemented for willingness to pay values of
25 000 and 250 monetary units respectively in the two examples of Chap. 2.
Of course, in reality this is hardly ever a realistic situation: the evidence upon
which the statistical model is based is often limited (either in size, or time follow-up,
or in terms of generalisability to a reference population). In addition, any statistical
model represents an idealisation of a complex reality and inevitably fails to account
for all the relevant aspects that may determine future outcomes. Therefore, while
theoretically the ICER contains all the relevant information for decision-making, it
is often important to consider the uncertainty associated with the underlying process.
This is particularly true in economic evaluations, where the objective of the analy-
sis is not simply inference about unknown quantities, but to drive policy decisions
which involve investing significant resources on the “optimal” intervention. More
importantly, investment in one intervention involves removing investment from other
alternatives. Additionally, some interventions may require significant upfront invest-
ment that would be wasted if, in the light of new evidence, the intervention was non-
optimal. Therefore, in general, decision-makers must understand the uncertainty
surrounding their decision. If this uncertainty is high they will, typically, recom-
mend future research is carried out before resources are invested in new treatments.
For this reason, health technology assessment agencies such as NICE in the UK


recommend the application of Probabilistic Sensitivity Analysis (PSA) to the results of an economic evaluation.
This chapter will introduce PSA, which is used to test the impact of the model
uncertainty and model assumptions on the optimal decision, as well as several tech-
niques that are used to summarise, quantify and analyse how uncertainty influences
the decision-making process. This analysis can be divided into two parts, depending
on the source of the uncertainty. The first two sections of this chapter will focus on
parameter uncertainty. This explores the impact of the uncertainty associated with
the model parameters on the decision-making process. In a Bayesian setting this
parametric uncertainty is captured by the posterior distributions for the parameters
and can be thought of as the uncertainty present conditional on the assumptions of
the health economic model. Both the examples introduced in Chap. 3 are used to
illustrate the different tools available in BCEA, built to understand and report the
impact of the parametric uncertainty.
The final section of this chapter (Sect. 4.4) will then introduce the concept of PSA
applied to structural uncertainty. This is concerned with analysing the influence of the
model assumptions themselves. Clearly, this type of analysis has a very broad scope
but within BCEA we focus on two key elements of structural uncertainty. These are
the impact of the implicit assumptions coming from the health economic modelling
framework, such as risk aversion, and the impact of alternative model specifications,
such as changing the prior distribution for certain parameters. The capabilities of BCEA for structural PSA are illustrated, again, with practical examples.

4.2 Probabilistic Sensitivity Analysis for Parameter Uncertainty

In a nutshell, PSA for parameter uncertainty is a procedure in which the input para-
meters are considered as random quantities. This randomness is associated with a
probability distribution that describes the state of the science (i.e. the background
knowledge of the decision-maker) [2]. As such, PSA is fundamentally a Bayesian
exercise where the individual variability in the population is marginalised out but
the impact of parameter uncertainty on the decision is considered explicitly. Cal-
culating the ICER and EIB averages over both these sources of uncertainty as the
expected value for the utility function is found with respect to the joint distribution
of parameters and data. This parametric uncertainty is propagated through the eco-
nomic model to produce a distribution of decisions where randomness is induced by
parameter uncertainty.
From the frequentist point of view, PSA is unintuitive as parameters are not con-
sidered as random quantities and therefore are not subject to epistemic uncertainty.
Consequently, PSA is performed using a two-stage approach. First, the statistical
model is used to estimate the parameters, e.g. using the Maximum Likelihood Esti-
mates (MLEs) θ̂ as a function of the observed data, say y. These estimates are then
used to define a probability distribution describing the uncertainty in the parameters, based on a function g(θ̂). For example, the method of moments could be used to deter-
mine a suitable form for a given parametric family (e.g. Normal, Gamma, Binomial,
Beta, etc.) to match the observed mean and, perhaps, standard deviation or quantiles.
Random samples from these “distributions” are obtained (e.g. using random number
generators, which for simple distributions are available even in spreadsheet calcu-
lators such as MS Excel) and finally fed through the economic model (cf. Fig. 1.5) to find a distribution for the decisions, in much the same way as in the Bayesian setting.
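The two-stage process can be mimicked in a few lines of R; the following is a toy sketch with made-up data and a deliberately trivial "economic model" (all names and values here are hypothetical, not part of BCEA):

```r
set.seed(1)

# Stage 1: estimate the parameter from observed data (here, the MLE of a
# probability of clinical success from made-up binary outcomes)
y <- rbinom(100, size=1, prob=0.3)
theta.hat <- mean(y)                 # MLE

# Stage 2: "pretend" the parameter is random, using a distribution defined
# in terms of the estimate; here a method-of-moments Beta, matching the
# observed mean and a guessed standard deviation
sd.guess <- 0.05
a <- theta.hat * (theta.hat * (1 - theta.hat) / sd.guess^2 - 1)
b <- (1 - theta.hat) * (theta.hat * (1 - theta.hat) / sd.guess^2 - 1)
theta.sim <- rbeta(1000, a, b)

# Feed the simulations through a (toy) economic model to obtain a
# distribution for the incremental benefit at a given willingness to pay
k <- 25000
delta.e <- theta.sim * 0.01          # toy effectiveness differential
delta.c <- 100 - theta.sim * 150     # toy cost differential
ib <- k * delta.e - delta.c          # distribution of decisions
```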
Figure 4.1 illustrates the fundamental difference between the two approaches. The
top panel shows the frequentist, two-stage process, while the bottom panel depicts
the one-step Bayesian analysis. In the former, the parameter θ is a fixed quantity
(represented in a square, in the top-left panel). Sampling variability is modelled
as a function of the parameter p(y | θ); this is used to obtain the estimate of the
“true” value of the parameter, θ̂. In a separate step (the top-right panel), the analysis
“pretends” that the parameters are actually random (and hence depicted in a circle, in
the top-right panel) and associated with a probability distribution defined in terms of
θ̂ (or a function thereof). In the Bayesian analysis in the bottom panel, the parameter
is considered as a random variable in the first place (and hence it is depicted in a
circle in the bottom panel); as soon as data y become available, the uncertainty on θ
is updated and automatically propagated to the economic model.
The most relevant implication of the distinction between the two approaches
is that a frequentist analysis typically discards the potential correlation among the
many parameters that characterise the health economic model. This is because the
uncertainty in θ is typically defined by a set of univariate distributions p(θq ), for each
of the q = 1, . . . , Q model parameters. We do note, however, that while it is possible
to use a multivariate distribution to account for correlation between the parameters,
most models developed under a frequentist approach (for which computations are
often performed in Excel) are not based on these more complex distributions.
Conversely, a full Bayesian model can account for this correlation automatically
as the full joint posterior distribution p(θ) = p(θ1 , . . . , θ Q ) is typically generated,
usually using MCMC, and propagated to the economic model. Interestingly, even if
some form of independence is assumed in the prior distributions, the model struc-
ture and the observed data may induce some correlation in the posterior with no
extra modelling complications as this correlation is automatically picked up in the
computation of the posterior distributions.
This may help demystify the saying that “Bayesian models are more complex”:
when comparing a simple, possibly univariate, frequentist model with its basic
Bayesian counterpart (e.g. based on “minimally informative” priors), it is perhaps
fair to say that the latter involves more complicated computation. However, when
the model is complex to start with (e.g. based on a large number of variables, or
characterised by highly correlated structures), then the perceived simplicity of the
frequentist approach is rather just a myth and, in fact, a Bayesian approach usually
turns out as more effective, even on a computational level.
Fig. 4.1 Frequentist, two-stage versus Bayesian, one-stage health economic evaluation. The top
panel presents the frequentist process, where the parameters of the model are first estimated, e.g.
using MLEs. These are used to define some probability distributions, which are in turn fed through
the economic model. In a full Bayesian approach (bottom panel), this process happens at once:
the uncertainty in the parameters is updated from the prior to the posterior distribution, which is
directly passed to the economic model

Figure 4.2 gives a visual indication of the PSA process where the parameter
distributions seen on the left-hand side of the figure can be determined in a Bayesian
or frequentist setting. PSA begins by simulating a set of parameter values, represented
by red crosses in Fig. 4.2. These parameter values are then fed through the economic
model to give a value for the population summaries (Δe , Δc ), shown in the middle
column in the Figure and recorded in the final column “Decision Analysis”. These
measures are then combined to perform the cost-effectiveness analysis and calculate
a suitable summary, e.g. IB(θ), for the specific simulation. This process is replicated
for another set of the simulated values to create a table of (Δe , Δc ) for different
simulations, along with a table of summary measures.
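In terms of the quantities stored in a bcea object m, this replication amounts to a simple row-wise computation (a sketch for a single pairwise comparison, assuming the simulated differentials are available as m$delta.e and m$delta.c):

```r
k <- 25000
ib <- k * m$delta.e - m$delta.c   # one incremental benefit per simulation
decision <- ib > 0                # simulation-specific optimal decision
```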
In this way, PSA does not differ from the analysis framework used in BCEA
which is based on a simulation approach from the distribution of (Δe , Δc ). Doing
the full economic analysis involves summarising this table using a suitable summary
such as the ICER. However, PSA involves considering the distribution of the final
summary measure such as the row-wise IB(θ). This gives a decision for each row
of the PSA table, conditional on the parameter values in that specific simulation.
This actually implies that one main tool to evaluate the parameter uncertainty is
the cost-effectiveness plane, because it provides helpful information about how the
Fig. 4.2 A schematic representation of the process of “Probabilistic sensitivity analysis”. Uncer-
tainty about the parameters is formalised in terms of suitable probability distributions and propagated
through the economic model (which can be seen as a “black box”) to produce a distribution of deci-
sion processes. These can be summarised (to determine the best option, given current evidence) or
analysed separately, to assess the impact of uncertainty in the parameters

model output varies due to uncertainty in the parameters. However, to further extend
the analysis of this decision uncertainty, the impact of parameter uncertainty on the
decision-making process can be assessed for different willingness to pay values.

4.2.1 Summary Tables

BCEA is able to provide several types of output that can be used to assess the health
economic evaluation. As seen in Chap. 3, a summary can be produced as follows:
> summary(m)
where m is a BCEA object. This function provides the output reported in Sect. 3.3.
In addition to the basic health economic measures, e.g. the EIB and the ICER,
BCEA provides some summary measures for the PSA, allowing a more in-depth
analysis of the variation observed in the results, specifically the CEAC (Sect. 4.2.2)
and the EVPI (Sect. 4.3.1). The full output of the PSA is stored in the BCEA object,
and can be easily accessed using the function sim.table, e.g. with the following code:
> table=sim.table(m, wtp=25000)
The willingness to pay value (indicated by the argument wtp) must be selected from the values of the grid generated when the BCEA object was created, as in the summary function (see Sect. 3.3). By default it is set to 25 000 monetary units, or to the value of Kmax when that argument has been used.
The output of the sim.table function is a list, composed of the following elements:
• Table: the table in which the output is stored as a matrix;
• names.cols: the column names of the Table matrix;
• wtp: the chosen willingness to pay value threshold. All measures depend on it
since it is a parameter in the utility function;
• ind.table: the index associated with the selected wtp value in the grid used to run
the analysis. It is the position the wtp occupies in the m$k vector, where m is the
original bcea object.
The matrix Table contains the health economics outputs for each simulation and can be accessed by subsetting the object created with the sim.table function. The first lines of the table can be printed in the R console as follows:
> head(table$Table)
U1 U2 U* IB2_1 OL VI
1 -36.57582 -38.71760 -36.57582 -2.1417866 2.141787 -1.750121
2 -27.92514 -27.67448 -27.67448 0.2506573 0.000000 7.151217
3 -28.03024 -33.37394 -28.03024 -5.3436963 5.343696 6.795451
4 -53.28408 -47.13734 -47.13734 6.1467384 0.000000 -12.311646
5 -43.58389 -40.40469 -40.40469 3.1791976 0.000000 -5.578996
6 -42.37456 -33.08547 -33.08547 9.2890987 0.000000 1.740230
(incidentally, this particular excerpt refers to the Vaccine example).
The table is easily readable and reports for every simulation, indexed by the
leftmost column, the following quantities:
• U1 and U2: the utility values for the first and second interventions. When multiple
comparators are included, additional columns will be produced, one for every
considered comparator;
• U*: the maximum utility value among the comparators, indicating which inter-
vention produced the most benefits at each simulation;
• IB2_1: the incremental benefit IB for the comparison between intervention 2 and
intervention 1. Additional columns are included when multiple comparators are
considered (e.g. IB3_1);
• OL: the opportunity loss, obtained as the difference between the maximum utility
computed for the current parameter configuration (e.g. at the current simulation)
U* and the current utility of the intervention associated with the maximum utility
overall. In the current example and for the selected threshold of willingness to pay,
the mean of the vector U1,1 where the vaccine is not available, is lower than the
mean of the vector U2, vaccine available, as vaccination is the most cost-effective
intervention, given current evidence. Thus, for each row of the simulations table,

¹Notice that this is in fact U0, in our notation. The slight confusion is due to the fact that it is not advisable (or indeed even possible in many instances) to use a 0 index in R. Similarly, the value U2 indicates the utility for treatment t = 1, U1.
the OL is computed as the difference between the current value of U* and the value
of U2. For this reason, in all simulations where vaccination is indeed more cost-
effective (i.e. when IB2_1 is positive), OL(θ) = 0 as there would be no opportunity
loss, if the parameter configuration were the one obtained in the current simulation;
• VI: the value of information, which is computed as the difference between the maximum utility computed for the current parameter configuration U* and the average utility of the intervention which is associated with the maximum utility overall. In
the Vaccine example and for the selected threshold of willingness to pay, vaccina-
tion (U2) is the most cost-effective intervention, given current evidence. Thus, for
each row of the simulations table, the VI is computed as the difference between
the current value of U* and the mean of the entire vector U2. Negative values of
the VI imply that for those simulation-specific parameter values both treatment
options are less valuable than the current optimal decision, in this case vaccination.
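These quantities can be reproduced directly from a matrix of simulated utilities; the following sketch uses made-up numbers in place of the actual model output:

```r
set.seed(2)
# Toy matrix of simulated utilities: rows = simulations, columns = treatments
U <- cbind(U1 = rnorm(1000, -36, 5), U2 = rnorm(1000, -35, 5))

U.star <- apply(U, 1, max)          # row-wise maximum utility
t.star <- which.max(colMeans(U))    # intervention that is optimal on average
OL <- U.star - U[, t.star]          # 0 whenever the average-optimal choice wins
VI <- U.star - mean(U[, t.star])    # can be negative in "unlucky" simulations
```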
BCEA includes a set of functions that can depict in graphical form the results
of PSA, in terms of the most commonly used indicators, which we describe in the
following.

4.2.2 Cost-Effectiveness Acceptability Curve

The Cost-Effectiveness Acceptability Curve (CEAC), originally proposed in [4], estimates the probability of cost-effectiveness for different willingness to pay thresholds.
The CEAC is used to evaluate the uncertainty associated with the decision-making
process, since it quantifies the degree to which a treatment is preferred. This is mea-
sured in terms of the difference in utilities, normally the incremental benefit IB(θ).
Formally, the CEAC is defined as

CEAC = Pr(IB(θ) > 0).

This effectively represents the proportion of simulations in which t = 1 is associated with a higher utility than t = 0. If the net benefit function is used, the definition can be rewritten as
be rewritten as
CEAC = Pr(kΔe − Δc > 0),

which depends on the willingness to pay value k. This means that the CEAC can
be used to determine how the probability that treatment 1 is optimal changes as the
willingness to pay threshold increases. In addition, this shows the clear links between
the analysis of the cost-effectiveness plane and the CEAC. Figure 4.3 shows in pan-
els (a)–(c) the cost-effectiveness plane for three different choices of the willingness
to pay parameter, k. In each, the CEAC is exactly the proportion of points in the
sustainability area. Panel (d) shows the CEAC for a range of values of k in the
interval [0, 50 000].
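The definition above translates directly into a computation over the PSA simulations. The following sketch uses base R on purely illustrative values of Δe and Δc (simulated here, not the Vaccine model output):

```r
# Illustrative PSA simulations of incremental effectiveness (QALYs) and cost
set.seed(1)
n.sims <- 1000
delta.e <- rnorm(n.sims, 0.001, 0.0005)   # hypothetical incremental effectiveness
delta.c <- rnorm(n.sims, 5, 2)            # hypothetical incremental cost

# CEAC: for each willingness to pay k, the proportion of simulations with
# positive incremental benefit IB = k * delta.e - delta.c
k.grid <- seq(0, 50000, by = 100)
ceac <- sapply(k.grid, function(k) mean(k * delta.e - delta.c > 0))
```

For a given k, each element of `ceac` is exactly the proportion of points of the cost-effectiveness plane lying in the sustainability area.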
100 4 Probabilistic Sensitivity Analysis Using BCEA

Fig. 4.3 A graphical representation of the links between the cost-effectiveness plane and the cost-
effectiveness acceptability curve

In general, the CEAC can also be directly compared to the EIB. The intervention
with the highest associated probability of cost-effectiveness (CEAC) will present
higher expected utilities with respect to the other comparators. For example, if two
alternative interventions are considered and one of them has an associated CEAC
value equal to 0.51, it will be considered cost-effective on average, producing a
positive differential in the utilities. The CEAC gives additional information as it gives
us additional information about the uncertainty. A probability of cost-effectiveness
of 0.51 and 1.00 will result in the same choice if analysing the ICER or the EIB,
while describing two very different situations. In the first case the difference between
the interventions is only slightly in favour of one intervention. However, in the

second situation (CEAC equal to one) the decision to implement the cost-effective
intervention has very little associated uncertainty. Notice that this property will not
hold when the underlying joint distribution of cost and effectiveness differentials is
extremely skewed, in which case an intervention that is deemed to be cost-effective,
given current evidence, may be associated with a relatively low CEAC.
The CEAC value for the chosen willingness to pay threshold is included by default
in the summary.bcea function output, described in Sects. 3.3 and 4.2.1. Functions are
included in BCEA to produce a graphical output for pairwise and multiple compar-
isons.

4.2.2.1 Example: Vaccine (Continued)

For the Vaccine example, the summary table in Sect. 3.3 reports that the CEAC is
equal to 0.529, for a willingness to pay threshold of 25 000 monetary units. This
indicates a relatively low probability of cost-effectiveness for the vaccination policy
over the status quo, as the CEAC is close to 0.5. When only two comparators are under
investigation, a CEAC value equal to 0.5 means that the two interventions have the
same probability of cost-effectiveness. This is the maximum possible uncertainty
associated with the decision-making process. Therefore, a CEAC value of 0.529
means that, even though the vaccination is cost-effective on average, the difference
with the reference comparator is modest.
An alternative way of thinking of the CEAC value is to consider all “potential
futures” determined by the current uncertainty in the parameters: under this inter-
pretation, in nearly 53% of these cases t = 1 will turn out to be cost-effective (and
thus the “correct” decision, given the available knowledge). This also states that,
for a willingness to pay of 25 000 monetary units, nearly 53% of the points in the
cost-effectiveness plane lie in the sustainability area.
From the decision-maker’s perspective it is very informative to consider the CEAC
for different willingness to pay values, as it allows them to understand the level of
confidence they can have in the decision. It also demonstrates how the willingness
to pay influences this level of confidence. In fact, regulatory agencies such as NICE
in the UK do not use a single threshold value but rather evaluate the comparative
cost-effectiveness over a set interval. As the CEAC depends strongly on the chosen
threshold, it can be sensitive to small increments or decrements in the value of the
willingness to pay and can vary substantially.
To plot the cost-effectiveness acceptability curve for a bcea object m the function
ceac.plot is used, producing the output depicted in Fig. 4.4.
> ceac.plot(m)

The CEAC curve in Fig. 4.4 increases together with the willingness to pay. As
vaccination is more expensive and more effective than the status quo, if the willing-
ness to pay increases, a higher number of simulations yield a positive incremental

Fig. 4.4 The plot of the Cost-Effectiveness Acceptability Curve. The CEAC allows the decision-maker to assess the impact of the variation of the willingness to pay on the probability of cost-effectiveness. This enables the analysis of the uncertainty in different scenarios, for different values of the maximum cost per unit increase in effectiveness the decision-maker is willing to pay

benefit. In other terms, the slope of the line defining the cost-effectiveness accept-
ability region on the cost-effectiveness plane increases, so more points are included
in the sustainability region.
The values of the CEAC for a given threshold value can be extracted directly from
the bcea object by extracting the ceac element. For example, the CEAC value for
willingness to pay values of 20 000 and 30 000 monetary units will be displayed by
running the following code:
> with(m, ceac[which(k==20000)])
[1] 0.457
> with(m, ceac[which(k==30000)])
[1] 0.586
or equivalently
> m$ceac[which(m$k==20000)]
[1] 0.457
> m$ceac[which(m$k==30000)]
[1] 0.586

The lines above will return an empty vector if the specified value for k is not present in the
grid of willingness to pay values included in the m$k element of the bcea object m.
The CEAC plot can become more informative by adding horizontal lines at given
probability values to read off the probabilities more easily. To add these lines in the
base version of the plot, simply call the function lines after the ceac.plot function. For
example, to include horizontal lines at the levels 0.25, 0.5 and 0.75, run the following
code:


Fig. 4.5 The inclusion of the panel background grid allows for an easier graphical assessment of
the values of the cost-effectiveness acceptability curve. This can be easily done by re-enabling the
panel.grid in the theme options in ggplot

> ceac.plot(m)
> lines(x=range(m$k), y=rep(0.5,2), col="grey50", lty=2)
> lines(x=range(m$k), y=rep(0.25,2), col="grey75", lty=3)
> lines(x=range(m$k), y=rep(0.75,2), col="grey75", lty=3)
For a single pairwise comparison, the ggplot version of the ceac.plot is the same
as the base version, due to the simplicity of the graph. The default output is built to
be consistent with the base version, meaning that it does not include a background
grid. This can be easily overcome using the flexibility of the ggplot2 package. For
example, it is possible to simply restore the default gridline of the theme_bw theme
by executing the following lines:
> ceac.plot(m, graph="ggplot") + theme(panel.grid=element_line())
which creates the graph displayed in Fig. 4.5.
Analysing the CEAC in the interval of willingness to pay values between 20 000
and 30 000 monetary units per QALY demonstrates that this decision is relatively
uncertain (notice that depending on the monetary unit selected, the relevant range may
vary). The probability of cost-effectiveness for the vaccination strategy compared to
the status quo is between 0.46 and 0.59, close to 0.50.

For lower thresholds the decision is clearly in favour of not implementing the
vaccination, with a CEAC below 0.25 for a willingness to pay threshold less than
10 000 monetary units per QALY. For higher values of willingness to pay, however,
the decision uncertainty is still high as the probability of cost-effectiveness does not
reach 0.75 in the considered range of willingness to pay values.

4.2.2.2 Example: Smoking Cessation (Continued)

When more than two comparators are considered in an economic analysis, the pair-
wise CEACs with respect to a single reference intervention may not give sufficient
information. For example, Fig. 4.6 shows the probability of cost-effectiveness for
Group counselling compared to the other interventions in a pairwise fashion. This
analysis gives no information about the other comparisons that do not include Group
counselling. This can potentially lead to misinterpretations since these probabilities
do not take into account the whole set of comparators. This issue is also present in
the EIB analysis, as seen in the interpretation of Fig. 3.10, which is not straightforward
in the multiple treatment comparison setting.
BCEA provides a necessary tool to overcome this problem using the multi.ce
function, which computes the probability of cost-effectiveness for each treatment
based on the utilities of all the comparators. This allows the user to analyse the overall
probability of cost-effectiveness for each comparator, taking into account all possible
interventions. To produce a CEAC plot that includes the intervention-specific cost-
effectiveness curve, the following code is used. First, the multiple treatment analysis
is performed using the bcea object m. The results must then be stored in an mce
object. Finally, the plot in Fig. 4.7 is produced by calling the mce.plot function. The
argument pos can be used to change the legend position, and in this case it is set to
top-right to avoid the legend and the curves overlapping.
> mce=multi.ce(m)
> mce.plot(mce, pos="topright")
The function multi.ce requires a bcea object as argument. It will output a list
composed of the following elements:
• m.ce, a matrix containing the values of the cost-effectiveness acceptability curves
for each intervention over the willingness to pay grid. The matrix is composed of
one row for every willingness to pay value included in the bcea object m$k (i.e.
501 if the argument wtp is not specified in the bcea function call), and columns
equal to the number of included comparators;
• ceaf, the cost-effectiveness acceptability frontier. This vector is determined by the
maximum value of the CEACs for each value in the analysed willingness to pay
grid;
• n.comparators, the number of included treatment strategies;


Fig. 4.6 The figure depicts the three pairwise CEACs, representing the comparisons between the
“Group counselling” intervention versus all other comparators taken one by one. This plot does
not give information on the probability of cost-effectiveness of each strategy when considering all
other treatment options at the same time. This issue can be overcome by analysing all comparators
at the same time

• k, the willingness to pay grid. This is equal to the grid in the original bcea object,
in this example m$k;
• interventions, a vector including the names of the compared interventions, and
equal to the interventions vector of the original bcea object.
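The quantities stored in m.ce and ceaf can be sketched from first principles: for each willingness to pay value, count how often each intervention has the highest utility across the simulations. The effectiveness and cost values below are illustrative draws, not the Smoking data:

```r
set.seed(2)
n.sims <- 1000; n.ints <- 4
# illustrative PSA simulations: one column per intervention
eff  <- matrix(rnorm(n.sims * n.ints, rep(c(1.00, 1.05, 1.10, 1.15), each = n.sims), 0.2),
               ncol = n.ints)
cost <- matrix(rnorm(n.sims * n.ints, rep(c(0, 100, 150, 200), each = n.sims), 30),
               ncol = n.ints)
k.grid <- seq(0, 500, by = 10)

# m.ce[i, t]: proportion of simulations in which intervention t has the
# highest utility k * e - c at the i-th willingness to pay value
m.ce <- t(sapply(k.grid, function(k) {
  best <- max.col(k * eff - cost)      # index of the best intervention per simulation
  tabulate(best, nbins = n.ints) / n.sims
}))
ceaf <- apply(m.ce, 1, max)            # cost-effectiveness acceptability frontier
```

By construction each row of `m.ce` sums to one, which is why the pairwise CEACs cannot be read as the probabilities computed here.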
To produce a graph of the CEACs for all comparators considered together, the
command mce.plot is used. This command accepts as inputs an object produced by
the multi.ce function and the pos argument indicating the legend position. As usual, an
option to select whether the plot should be produced using base graphics or ggplot
can be included. The legend position can be changed in the same way as for the
base ceac.plot function (Sect. 3.4.1). Again, if selecting ggplot, finer control of the
legend position is possible. The legend can be placed outside the plot limits using
a string providing the position (e.g. "bottom") or alternatively a two-element vector
specifying the relative position on the two axes; the two elements of the vector can
assume any value, so that for example c(0.5,0.5) indicates the centre of the plot and
c(1,0) the bottom-right corner inside the plot.
The results displayed in Fig. 4.7 confirm and extend the conclusions previously
drawn for the Smoking example: No intervention is the optimal choice for low


Fig. 4.7 This figure is a graphical representation of the probability of cost-effectiveness of each
treatment when all other comparators are considered. The information given is substantially different
from the pairwise CEACs, since it allows for the evaluation of the “best” treatment option over the
considered grid of willingness to pay values. The uncertainty associated with the decision can be
inferred by the distance between the treatment-specific curves

willingness to pay thresholds. However, it can also be seen that the probability of
cost-effectiveness decreases steeply as the threshold increases. Between the values
£177 and £210, the curve with the highest value is Self-help but, again, the associated
probability of cost-effectiveness is modest, lower than 0.40. In addition, it is not sub-
stantially higher than the probability of the other interventions being cost-effective
as the CEAC values are similar.
The values of the CEAC curves can be extracted from the mce object for a given
willingness to pay threshold, for example 194, using the following code:
> mce$m.ce[which(mce$k==194),]
[1] 0.1830 0.3085 0.1985 0.3100
The code above will return an empty vector if the threshold value 194 is not present
in the grid vector mce$k. A slightly more sophisticated way of preventing this error
is extracting the value for the threshold minimising the distance from the chosen
threshold, which in this case yields the same output since the chosen willingness to
pay is included in the vector.

(a) Willingness to pay up to £500 (b) Willingness to pay up to £2 500

Fig. 4.8 The cost-effectiveness acceptability frontier (CEAF) for the smoking cessation example.
The CEAF indicates the overall value of uncertainty in the comparative cost-effectiveness consider-
ing all interventions at the same time. The low value of the curve highlights high uncertainty in the
interval £150–200. This is because the average utilities for all four interventions are much closer in
this interval than for smaller and bigger values of the willingness to pay

> mce$m.ce[which.min(abs(mce$k-194)),]
[1] 0.1830 0.3085 0.1985 0.3100
The representation of the cost-effectiveness acceptability frontier can also be
produced. The frontier is defined as the maximum value of the probability of cost-
effectiveness among all comparators. It is an indication of the uncertainty associated
with choosing the cost-effective intervention. In other terms, higher frontier values
correspond to lower decision uncertainty. Once the multi.ce function has been called,
the frontier can be easily created by the following command, which produces the
output in Fig. 4.8a.
> ceaf.plot(mce)
The cost-effectiveness acceptability frontier (CEAF) indicates a high degree of
uncertainty associated with the optimal decision for willingness to pay thresholds
between £150 and 250 per life year gained. The frontier increases in value for increas-
ing thresholds, reaching about 0.60 for a value of £500 per life year gained. This result
highlights that, even if Group counselling is the optimal choice for a willingness to pay
higher than £200 per life year gained, the associated probability of cost-effectiveness
is not very high. Combining Figs. 4.7 and 4.8a makes it clear that decision uncertainty
does not derive from a comparison with No intervention, which is clearly inferior
in terms of expected utilities to the other comparisons. Its associated probability of
cost-effectiveness is zero or close to zero for thresholds higher than £350 per life
year gained. Therefore, all intervention strategies for smoking cessation are cost-effective
when compared with No intervention. The frontier remains stable around

a value equal to 0.60 for higher thresholds, as shown in Fig. 4.8b which plots the
frontier for a higher limit of the willingness to pay threshold. To produce this graph,
the economic analysis must be re-run as follows:
> m2=bcea(e, c, ref=4, interventions=treats, Kmax=2500)
> mce2=multi.ce(m2)
> ceaf.plot(mce2)
The cost-effectiveness acceptability probability remains stable around a value equal
to about 0.60, with Group counselling being the optimal choice for willingness to
pay values greater than £225 per life year gained.

Including the Cost-Effectiveness Frontier in the Plot


A small addition to the ggplot version of mce.plot allows for the inclusion of the
cost-effectiveness acceptability frontier in the plot. Adding the frontier gives an easier
interpretation of the graphical output. There is no built-in option to include it in the
graph, but the frontier can be added using the code below.
> mce.plot(mce, graph="g", pos=c(1,1)) +
+   stat_summary(fun.y=max, geom="line", colour="grey25", alpha=.3, lwd=2.5)


Fig. 4.9 The multiple-comparison cost-effectiveness acceptability curves with the overlaid accept-
ability frontier curve for the smoking cessation example. The transparency argument alpha available
in the geom_line layer makes it easy to understand graphically which comparator has the highest
probability of cost-effectiveness

The code includes some control over the appearance of the frontier in the graph. An
additional geom_line layer is added to overlay the cost-effectiveness acceptability
frontier, using the aesthetics provided by the mce.plot function. The values of the
curve over the willingness to pay grid are calculated by means of the stat_summary
function, using the maximum value of the curves at each point. Figure 4.9 demonstrates
the produced graphic.

4.3 Value of Information Analysis

As already introduced in Sect. 4.1, PSA can be undertaken to ascertain whether it is worth gathering additional information before deciding whether to reimburse a
new treatment. In a fully Bayesian setting, new evidence is easily integrated using
Bayes’ theorem. This new information simply updates the posterior distributions for
the model parameters, which in turn feeds into the economic model and informs the
cost and effectiveness measures (cf. Fig. 1.5). In most settings, this new information
about the statistical model will reduce our decision uncertainty. This reduces the
risk of decision-makers funding an inefficient treatment. In this way, the additional
information will have value to decision-makers, as it reduces the likelihood of wasting
money on a non-optimal treatment. Additionally, in certain settings this new
information will indicate that the optimal decision under current information is actually
non-optimal, saving resources.
However, collecting this additional information comes at a cost associated with
the data collection, e.g. the cost of a further clinical trial. Therefore, we want to assess
whether the potential value of additional information exceeds the cost of collecting
this information. This comes down to two components: the probability of funding
the incorrect treatment if the decision is based on current evidence, represented by
the CEAC, and the potential cost of funding a non-optimal treatment.

4.3.1 Expected Value of Perfect Information

One measure to quantify the value of additional information is known as the Expected
Value of Perfect Information (EVPI). This measure translates the uncertainty associ-
ated with the cost-effectiveness evaluation in the model into an economic quantity.
This quantification is based on the Opportunity Loss (OL), which is a measure of
the potential losses caused by choosing the most cost-effective intervention on aver-
age when it does not result in the intervention with the highest utility in a “possible
future”. A future can be thought of as obtaining enough data to know the exact value
of the utilities for the different interventions. This would allow the decision-makers
to know the optimal treatment with certainty. In a Bayesian setting, the “possible
futures” are simply represented by the samples obtained from the posterior distribution for the utilities. Thus, the opportunity loss occurs when the optimal treatment
on average is non-optimal for a specific point in the distribution for the utilities.
To calculate the EVPI practically, possible futures for the different utilities are
represented by the simulations. The utility values in each simulation are assumed to
be known, corresponding to a possible future, which could happen with a probability
based on the current available knowledge included in and represented by the model.
The opportunity loss is the difference between the maximum value of the simulation-specific (known-distribution) utility U*(θ) and the utility U^τ(θ) of the intervention resulting in the overall maximum expected utility, where τ = arg max_t E[U^t] (cf. the discussion in Sect. 4.2.1):

OL(θ) = U*(θ) − U^τ(θ). (4.1)

Usually, for a large number of simulations, the OL will be 0 in the majority of them, as the optimal treatment on average will also be the optimal treatment in most simulations. Note also that the opportunity loss is never negative: in each simulation either the current optimal treatment is also the simulation-specific optimal one, giving OL(θ) = 0, or another treatment has a higher utility value, giving OL(θ) > 0.
The EVPI is then defined as the average of the opportunity loss. This measures the average potential loss in utility incurred when the intervention that is optimal given current evidence turns out not to be optimal in a possible future:

EVPI = E[OL(θ)]. (4.2)

If the probability of cost-effectiveness is low, then more simulations will give a non-
zero opportunity loss and consequently the EVPI will be higher. This means that if
the probability of cost-effectiveness is very high, it is unlikely that more information
would be worthwhile, as the most cost-effective treatment is already evident. How-
ever, the EVPI gives additional information over the CEAC as it takes into account
the opportunity lost as well as simply the probability of cost-effectiveness.
For example, there may be a setting where the probability of cost-effectiveness
is low, so the decision-maker believes that decision uncertainty is important. How-
ever, this is simply because the two treatments are very similar in both costs and
effectiveness. In this case the OL will be low as the utilities will be similar for both
treatments for all simulations. Therefore, the cost of making the incorrect decision is
very low. This will be reflected in the EVPI but not in the CEAC and implies that the
optimal treatment can be chosen with little financial risk, even with a low probability
of cost-effectiveness.
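Given a matrix of simulated utilities, Eqs. (4.1) and (4.2) translate into a few lines of R. The utilities below are illustrative draws, not output from either example model:

```r
set.seed(3)
# illustrative simulated utilities: one row per simulation, one column per intervention
U <- cbind(rnorm(1000, 100, 10), rnorm(1000, 102, 10))

U.star <- apply(U, 1, max)        # simulation-specific maximum utility, U*(theta)
tau    <- which.max(colMeans(U))  # optimal intervention given current evidence
OL     <- U.star - U[, tau]       # opportunity loss, eq. (4.1)
EVPI   <- mean(OL)                # eq. (4.2)
```

Equivalently, the EVPI is the difference between the mean of the row-wise maxima and the maximum of the column means, which makes clear why it is zero only when the same intervention is optimal in every simulation.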

4.3.1.1 Example: Vaccine (Continued)

To give an indication of the cost of uncertainty for a chosen value of k, the EVPI is pre-
sented in the summary table. For the Vaccine example, this measure is estimated to be
equal to 2.41 monetary units for a threshold of 25 000 monetary units per QALY. The

interpretation of this value is more challenging than the CEAC as it is an unbounded measure. It is also challenging to think in terms of an “acceptable” level of opportunity loss.
However, the EVPI can also be thought of as the expected benefit of learning
the true values of all the model parameters for the statistical model. In general,
information about the model parameters is gleaned from an investigation, e.g. a
clinical trial or literature review. So, while it is not possible to learn the true values
of the model parameters in a specific trial, the EVPI does give an upper bound on
the value of any future research. If the costs associated with additional research exceed the
EVPI, then the decision should be made under current information. Therefore, the
EVPI should be compared with the cost of performing analysis to learn information
about all the model parameters.
For this example, the EVPI is 2.41 monetary units which is clearly lower than
the cost of any additional research strategy.2 Therefore, this analysis implies that
the Vaccination strategy can be implemented and that parameter uncertainty has
little cost. This is perhaps surprising, given the conclusion in Sect. 4.2.2.1 that this
example is associated with a large amount of uncertainty as the probability of cost-
effectiveness is around 0.53 for 25 000 monetary units per QALY. This shows the
importance of considering the opportunity loss, alongside the probability of cost-
effectiveness.
The EVPI can be easily analysed over different willingness to pay values using
BCEA. To represent the curve graphically, the function evi.plot is used. This function
uses the bcea object m, and produces the plot of the EVPI over the willingness to
pay grid of values m$k. To produce the output in Fig. 4.10, it is sufficient to execute
the following code:
> evi.plot(m)
The analysis of the curve over different willingness to pay values demonstrates
that the EVPI increases until the break-even point, at 20 100 monetary units. The
break-even point is where the uncertainty associated with the decision reaches its
maximum value. The EVPI also increases as the willingness to pay increases, because
the opportunity loss is related to the utility, which is larger when the willingness to
pay is higher. However, Fig. 4.10 shows the EVPI is low when compared with the cost
of any possible research strategy, even for a willingness to pay of 50 000 monetary
units.
Clearly, the reduction of parameter uncertainty by gathering additional evidence would not be cost-effective, since the value of the new information would be less than the expected losses should the non-optimal intervention be chosen. The vaccination strategy can therefore be implemented without collecting additional information.

2 In general, the EVPI value given in BCEA is the per person EVPI. To compare it with the cost of
future research, this EVPI value should be multiplied by the number of people who will receive the
treatment. This is because the cost of an incorrect decision is higher if a greater number of patients
use the treatment.
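The scaling described in the footnote above is simple arithmetic; in this sketch the per-person EVPI is the value from the Vaccine summary table, while the population size is entirely hypothetical:

```r
evpi.pp    <- 2.41     # per-person EVPI (from the summary table)
n.patients <- 60000    # hypothetical number of people receiving the treatment
evpi.pop   <- evpi.pp * n.patients   # population EVPI, to compare with research costs
```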


Fig. 4.10 The expected value of perfect information for the vaccination example for willingness
to pay thresholds between 0 and 50 000

4.3.1.2 Example: Smoking Cessation (Continued)

The analysis of the expected value of information can be carried out for multiple
comparators in the same way. Once the bcea object m is obtained, the EVPI can be
plotted using
> evi.plot(m)
As already presented in Sect. 4.2.2.2, the two break-even points can be seen in
Fig. 4.11: the first one when the decision changes from implementing “No treatment”
to “Self-help” (k1∗ = £177) and the second when the optimal decision changes to
“Group counselling” (k2∗ = £210). Notice that the EVPI is a single value even in the
multi-comparison setting. This is in contrast to all the other measures that must be
extended in the multi-comparison setting. This is because the EVPI is based on the
average opportunity loss across all the possible treatment options. For example, while
Group Counselling dominates on average for larger willingness to pay values, the
dominating treatment for a specific simulation can be either No treatment, Individual
Counselling or Self-Help. This induces one value for the opportunity loss for each
simulation, rather than three different values.
The cost-effectiveness acceptability frontier, in Fig. 4.9, showed that the probability of
cost-effectiveness rapidly decreases as the willingness to pay increases from 0 before


Fig. 4.11 The expected value of perfect information for the smoking cessation example. The two
break-even points can be seen in the plot

stabilising below 0.60 for willingness to pay values greater than £150. Therefore,
there is a moderate amount of uncertainty about which treatment is cost-effective.
However, Fig. 4.11 shows that the EVPI is relatively low. It is possible, therefore,
to proceed with Group Counselling without performing additional research. Again,
this highlights the importance of the EVPI as a tool for PSA as it takes into account
both the opportunity loss and the probability of making an incorrect decision.
Note that the EVPI must be compared with the costs of future research as in many
examples the value of resolving the uncertainty in the health economic model will
seem high. However, as the cost of future research is also typically high it can still
exceed the EVPI. The interpretation of the EVPI must be made in the context of the
modelling scenario, not against a general-purpose threshold or comparator.

4.3.2 Expected Value of Perfect Partial Information

The EVPI is a useful tool for performing PSA, especially when used in conjunction
with the CEAC and can be easily calculated as part of a BCEA standard procedure.
This allows both the CEAC and the EVPI to be provided as part of the general plot
and summary functions.

However, in the case where the EVPI is high compared to the cost of additional
research it is useful to know where to target that research to reduce the decision uncer-
tainty sufficiently. That is to say when the opportunity loss under current information
is high compared to the cost of obtaining additional information, it is important to
know how to reduce this opportunity loss as efficiently and as cheaply as possi-
ble. Additionally, in some settings, decision-makers are interested in understanding
which parameters are driving the decision uncertainty.
This is very important in health economic modelling as some of the underlying
parameters are known with relative certainty. For example, there may be a large amount
of research on the prevalence of a given disease; similarly, the cost of the treatment
may be known with reasonable precision. Evidently, investigating these parameters
further to reduce the decision uncertainty would waste valuable resources and delay
getting a potentially cost-effective intervention to market. Ideally, therefore, it would
be advisable to calculate the value of resolving uncertainty for certain parameters or
subsets of parameters in order to target research efforts.
This subset analysis would also be important in deciding whether a specific pro-
posed trial is cost-effective. In this setting, the proposed study would target some
model parameters and the expected value of learning these specific parameters would
need to exceed the cost of the proposed trial. Again, note that it is important to compare
this value with the cost of the proposed trial to ascertain whether the uncertainty is
high.
In general, the value of a subset of parameters is known as the Expected Value
of Perfect Partial Information (EVPPI); this indicator can be used to quantify the
value of resolving uncertainty in a specific parameter (or subset of parameters), while
leaving uncertainty in the remaining model parameters unchanged.
While intuitively this appears a simple extension to the general framework in
which the EVPI is computed, the quantification of the EVPPI does pose serious
computational challenges. Traditionally, the EVPPI was calculated using computa-
tionally intensive nested simulations. However, recent results [5–8] have allowed
users to approximate the EVPPI efficiently.
Crucially, these approximations are based solely on the PSA values for the parameters
(e.g. in a Bayesian setting the posterior distribution obtained using MCMC
methods), allowing these methods to be included in general-purpose software like
BCEA. To begin, the PSA simulations for the parameters themselves need to be
stored and loaded in R. These PSA samples must be the parameter simulations used
to calculate the measures of costs and effectiveness. In a Bayesian setting these will
typically be available in a BUGS/JAGS object in R as in Sect. 4.3.3.4. However, if the
health economic model has been built in Excel then the parameter simulations can
be loaded into R from a .csv file to calculate the EVPPI (see Fig. 5.5 and the
discussion in Sects. 5.2.4.1 and 5.2.5). For example, if a spreadsheet containing simulated
values for a set of relevant parameters was saved as the file PSAsimulations.csv, then
these could be imported in R using the commands
> psa_samples <- read.csv("PSAsimulations.csv", header=T)
4.3 Value of Information Analysis 115

The resulting object psa_samples could then be passed to BCEA to execute the
analysis of the EVPPI.

4.3.3 Approximation Methods for the Computation of the EVPPI

As discussed, the EVPPI is used to quantify the impact on the decision-making
process of uncertainty in a specific subset of parameters of interest, which we refer
to as the subset φ. In other words, we split the full set of parameters as θ = (φ, ψ),
where ψ is the set of “nuisance” parameters. For example, with reference to the
Vaccine example (cfr. Sect. 2.3.1), we could define φ = (β_j, γ_2, ξ, λ) and ψ to contain
all the remaining “unimportant” parameters.
From the theoretical point of view, the procedure used to compute the EVPPI
closely mimics the one used for the EVPI. However, the opportunity loss is computed
only after the uncertainty due to limited knowledge of ψ is averaged out. In this case,
the opportunity loss is again the cost of selecting the treatment that is considered to
be optimal on average given current evidence when it is less cost-effective than its
comparator(s) in some “potential futures”. The crucial difference from the case of the
EVPI, however, is that the potential futures are only concerned with the parameters
φ, meaning that we are only concerned with the optimal decision conditional on
possible future values of φ. Therefore, we first must calculate the optimal treatment
after the uncertainty due to ψ has been marginalised out:
 
$$U^*(\phi) = \max_t \mathrm{E}_{\psi|\phi}\left[\mathrm{E}_{(e,c)|\theta}\left[u(e,c;t)\right]\right] = \max_t \mathrm{E}_{\psi|\phi}\left[U_t(\theta)\right]. \qquad (4.3)$$

In line with (4.1), the opportunity loss for a specific value φ is the difference between
the value U ∗ (φ) and the utility for the intervention resulting in the overall maximum
utility
OL(φ) = U ∗ (φ) − U (φτ ),

where U (φτ ) is again the utility with ψ marginalised out. Finally, the EVPPI is the
expectation of this (conditional) opportunity loss

$$\mathrm{EVPPI} = \mathrm{E}_{\phi}\left[\mathrm{OL}(\phi)\right] = \mathrm{E}_{\phi}\left[U^*(\phi)\right] - U^*.$$

As mentioned above, this formulation adds very little theoretical complexity, with
respect to the EVPI. However, the calculation of U ∗ (φ) involves a maximisation over

a conditional expectation, which greatly complicates the computation using
simulations, implying that the PSA simulations cannot be used directly to calculate the
EVPPI without a large investment of computational time.
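To see why, consider the brute-force alternative: a nested Monte Carlo scheme in which, for every simulated value of φ, a fresh inner sample of ψ is needed to approximate the conditional expectation. The toy sketch below (with an invented two-option net benefit function nb, not taken from any of the book's examples) makes the quadratic cost explicit: N_outer × N_inner model evaluations for a single EVPPI estimate.

```r
set.seed(1)
# Invented toy model: net benefit of option t given phi (of interest) and psi (nuisance)
nb <- function(t, phi, psi) if (t == 1) phi + psi else rep(1, length(phi))

N_outer <- 500   # outer draws of phi
N_inner <- 500   # inner draws of psi for each value of phi

u_star_phi <- numeric(N_outer)
for (i in 1:N_outer) {
  phi <- rnorm(1, 1, 0.2)            # "potential future" value of phi
  psi <- rnorm(N_inner, 0, 0.3)      # nuisance parameters, averaged out by simulation
  u_star_phi[i] <- max(mean(nb(1, phi, psi)),   # E_{psi|phi}[U_1(theta)]
                       mean(nb(2, phi, psi)))   # E_{psi|phi}[U_2(theta)]
}

# Overall maximum expected utility, U* = max_t E[U_t(theta)]
phi_s <- rnorm(10000, 1, 0.2); psi_s <- rnorm(10000, 0, 0.3)
u_star <- max(mean(nb(1, phi_s, psi_s)), mean(nb(2, phi_s, psi_s)))

evppi_hat <- mean(u_star_phi) - u_star   # cost: N_outer * N_inner model runs
```

Doubling the precision of both loops quadruples the number of model evaluations, which is exactly what the regression-based approximations described next avoid.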
Consequently, the approximation methods focus on estimating this term as effi-
ciently as possible, based solely on the PSA simulations. The main idea is to use
regression to approximate the function Ut (φ) where

$$U_t(\phi) = \mathrm{E}_{\psi|\phi}\left[U_t(\theta)\right],$$

for each treatment option t. This means regression is being used to marginalise out
the uncertainty due to the nuisance parameters ψ. Once the function Ut (φ) has been
estimated for all treatment options then

$$U^*(\phi) = \max_t U_t(\phi).$$

As demonstrated in [7], it is possible to use regression to marginalise out
uncertainty, by considering the observed utility values Ut(θ) from the PSA analysis as a
set of “noisy” observations of the function Ut(φ):

$$U_t(\theta) = \mathrm{E}_{\psi|\phi}\left[U_t(\theta)\right] + \varepsilon = g_t(\phi) + \varepsilon, \qquad (4.4)$$

assuming that ε ∼ Normal(0, σε2 ). In (4.4), the computed values of Ut (θ), which are
available from the PSA process described in Sect. 4.2.1, are used as input data, in
conjunction with the simulated values of the parameters of interest. Thus, in terms
of regression, the relevant “response” is given by the values of Ut (θ), while the
“covariates”, or independent variables, are φ. Both these quantities are obtained
from the PSA process and are therefore available at no extra computational cost,
once the PSA procedure is in place.
Notice that in (4.4), the target to be estimated is the unobserved conditional
expectation Eψ|φ[Ut(θ)], which is effectively a function of φ only and which
we indicate as gt(φ). This is because uncertainty due to the other parameters has been
marginalised out, leaving a function conditional on the values of φ only.
In general, this function gt (φ) can have a complicated form implying that a method
such as linear regression would be unlikely to capture the relationship between φ and
Eψ|φ[Ut(θ)]. Therefore, a flexible regression method is needed to estimate the function
gt(φ). A possible way of doing this is to use “non-parametric” regression methods, in
which the predictor does not take a predetermined form but is constructed according
to information derived from the data. Once the function gt (φ) has been estimated,
it is possible to use the resulting estimates to find the average opportunity loss as all
the other terms can be simply calculated directly from the health economic model
output.
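To make this concrete, the following sketch estimates a single-parameter EVPPI by using a spline regression as the flexible method: the simulated net benefits are regressed on the parameter of interest, and the fitted values stand in for gt(φ). The toy data are invented; mgcv is a recommended package shipped with R.

```r
library(mgcv)  # provides gam() with spline smooths

set.seed(2)
S   <- 1000
phi <- rnorm(S, 1, 0.2)                      # PSA draws of the parameter of interest
psi <- rnorm(S, 0, 0.3)                      # nuisance parameter draws
U   <- cbind(t1 = phi + psi,                 # invented utilities for option 1
             t2 = 1 + rnorm(S, 0, 0.1))      # invented utilities for option 2

# Regress each option's utility on phi: the fitted values estimate g_t(phi)
g_hat <- apply(U, 2, function(u) fitted(gam(u ~ s(phi))))

# EVPPI = E_phi[max_t g_t(phi)] - max_t E[U_t]
evppi_hat <- mean(apply(g_hat, 1, max)) - max(colMeans(U))
```

The only inputs are the PSA draws themselves, so the estimate comes at essentially no extra computational cost once the PSA is available.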

By default, the BCEA function evppi uses Generalised Additive Models (GAM)
[9] when φ contains only one parameter; we briefly describe this in Sect. 4.3.3.1. For
the general case in which φ is multi-dimensional, BCEA resorts to a fast Gaussian
Process (GP) [10] approximation method developed for EVPPI calculation in [8],
which we present in Sect. 4.3.3.3. BCEA also implements a GP regression method
developed by [7] (discussed in Sect. 4.3.3.2 below). While this is in some cases
slightly more flexible (because of its underlying formulation), it is in general less
efficient from the computational point of view.
In addition, the two methods from [5, 6] can be used to approximate the EVPPI
for a single parameter. These are also implemented in the function evppi and are
described in Sect. 4.3.6. However, they can be considered as “deprecated”, since for
a small number of parameters in the subset φ, the GAM method is superior both in
terms of accuracy and computational speed.
The following sections give a short explanation of all these methods, so that the
user may have a fuller understanding of the technical, regression-method-specific
aspects that can be manipulated in BCEA. We note that the complexity
of the mathematical formulation is beyond the scope of this book and thus we only
sketch the basic features of the advanced methods. The reader is referred to the
relevant literature for more in-depth information.

4.3.3.1 Generalised Additive Models

GAMs model the observed utility values as a sum of smooth functions of the impor-
tant parameters φ. In BCEA, these smooth functions are splines—technically these
are piecewise polynomial functions where the degree of the polynomial defines the
smoothness of the function. The polynomial degree is selected automatically by the
evppi function. This effectively amounts to modelling



$$g_t(\phi_s) = \sum_{q=1}^{Q_\phi} h_t(\phi_{sq}),$$

where Q φ is the number of important parameters (i.e. the size of the subset φ) and
h t (·) are the smooth functions.
In standard GAM regression methods, each smooth function or spline is a function
of one parameter in the set φ. This can occasionally be too restrictive in health
economic models, especially when the values in φ are correlated, and thus the EVPPI
estimate found using this method is unreliable. It is for this reason that for multi-
dimensional problems a different regression method is used by default in BCEA.
If GAM methods are used for multi-parameter sets φ, BCEA uses a GAM with
interactions between all the parameters by default. This means that the polynomials
include cross terms and therefore the number of potential terms is greatly increased,
especially for higher numbers of parameters. Just to give an example, assuming that

cubic splines are fitted (the default choice in BCEA), considering Q φ = 3 implies
that the full GAM model has 125 parameters.
While the estimation procedure is still extremely fast, a large dataset (e.g.
S = 50 000) is required to calculate the EVPPI when φ contains more than four
parameters. Additionally, even a large dataset cannot prevent over-parametrisation
in some settings meaning that the GAM cannot be fitted accurately. It is possible to
remove these interaction terms in BCEA, as shown in Sect. 4.3.4 but, again, this can
negatively affect the accuracy of the estimate.
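The difference between the default interaction model and the additive, no-interaction alternative can be sketched in mgcv syntax as follows (the data are invented, and BCEA constructs its own formula internally; this only illustrates the distinction):

```r
library(mgcv)

set.seed(3)
phi1 <- runif(500); phi2 <- runif(500)
u    <- phi1 * phi2 + rnorm(500, 0, 0.05)   # invented utilities with an interaction

fit_int <- gam(u ~ te(phi1, phi2))          # tensor-product smooth: includes cross terms
fit_add <- gam(u ~ s(phi1) + s(phi2))       # purely additive: no interaction terms

AIC(fit_int, fit_add)                       # here the interaction model fits better
```

With genuinely interacting parameters the additive fit misses part of the signal, which is why BCEA includes the cross terms by default despite the larger number of coefficients.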

4.3.3.2 “Standard” Gaussian Process Regression

As suggested by [7], one way to overcome the limitations of GAMs is to use GP
regression to estimate the conditional expectation of the utility value (4.3). This
strategy effectively amounts to modelling the observed utility values, i.e. the regression
“response” Ut(θ), as a multivariate normal
$$\begin{pmatrix} U_t(\theta_1) \\ U_t(\theta_2) \\ \vdots \\ U_t(\theta_S) \end{pmatrix} \sim \mathrm{Normal}\left(H\beta,\ \Sigma + \sigma_\varepsilon^2 I\right), \qquad (4.5)$$

where θ s and φs are the s-th simulated values for θ and φ, respectively; σε2 is the
residual variance from the regression construction; H is a design matrix
$$H = \begin{pmatrix} 1 & \phi_{11} & \cdots & \phi_{1Q_\phi} \\ 1 & \phi_{21} & \cdots & \phi_{2Q_\phi} \\ \vdots & \vdots & & \vdots \\ 1 & \phi_{S1} & \cdots & \phi_{SQ_\phi} \end{pmatrix};$$

β is the vector of regression coefficients describing the linear relationship between the
important parameters φ and the conditional expectation of the utilities (net benefits).
This means that the mean utility value is based on a linear regression of φ. However,
a GP is more flexible than linear regression due to the covariance matrix Σ, which
is determined by a covariance function C, a matrix operator whose elements C(r, s)
describe the covariance between any two points Ut (θr ) and Ut (θ s ).
Strong and Oakley’s original formulation uses a squared exponential covariance
function C_Exp, defined by

$$C_{\mathrm{Exp}}(r,s) = \sigma^2 \exp\left[-\sum_{q=1}^{Q_\phi}\left(\frac{\phi_{rq}-\phi_{sq}}{\delta_q}\right)^2\right],$$

where φrq and φsq are the r-th and the s-th simulated values of the q-th parameter in
φ, respectively. For this covariance function, σ² is the GP marginal variance and δq
defines the “smoothness” of the relationship between two utility values with “similar”
values for φq. For small values of δq, the correlation between two conditional
expectations with similar values for φq is small, and therefore the function gt(φ) is a
very rough (variable) function of φq. The δq values are treated as hyperparameters
to be estimated from the data.
This model includes 2Qφ + 3 hyperparameters: the Qφ + 1 regression coefficients
β, the Qφ “smoothness” parameters δ = (δ1, . . . , δQφ), the marginal standard
deviation of the GP σ and the residual error σε of (4.4). The multivariate normal
structure allows the use of numerical optimisation to find the maximum a posteriori
estimates for these hyperparameters. This, however, implies a large computational
cost [11]—technically, the reason is the necessity to invert a large, dense matrix
(that is, a matrix full of non-zero entries). This means that while extremely flexible,
because there is a specific parameter δq for each of the elements in φ, the resulting
estimation can be very computationally intensive.
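To fix ideas, the covariance matrix can be built directly from the PSA draws; the sketch below (function name and data invented, using the usual negative-exponent convention for the squared exponential) also shows why fitting is costly: every likelihood evaluation involves a dense S × S matrix inversion, O(S³) per call.

```r
# Squared exponential covariance for S PSA draws of the parameters in phi.
# Phi: S x Q matrix of simulations; delta: length-Q smoothness parameters.
sq_exp_cov <- function(Phi, sigma2, delta) {
  scaled <- sweep(Phi, 2, delta, "/")   # divide each column by its delta_q
  D2 <- as.matrix(dist(scaled))^2       # sum_q ((phi_rq - phi_sq) / delta_q)^2
  sigma2 * exp(-D2)
}

set.seed(4)
Phi   <- matrix(rnorm(200), nrow = 100, ncol = 2)
Sigma <- sq_exp_cov(Phi, sigma2 = 1, delta = c(0.5, 0.5))

# Each likelihood evaluation needs solve(Sigma + sigma_eps^2 * I): O(S^3)
K <- Sigma + 0.01 * diag(100)
Kinv <- solve(K)   # this step dominates the cost as S grows
```

With tens of thousands of PSA simulations this inversion quickly becomes the bottleneck, motivating the INLA/SPDE approach described next.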

4.3.3.3 Gaussian Process Regression Based on INLA/SPDE

To overcome the computational complexity of the standard GP regression method,
it is possible to set up the problem in a more complex manner, which nonetheless has
the advantage of employing computational tricks to speed up the process of hyperpa-
rameter estimation by a considerable margin. Specifically, it avoids the computation
becoming more expensive as the number of parameters in φ increases. The basic
model can be reformulated [8] to make use of a fast approximate Bayesian com-
putation method known as the Integrated Nested Laplace Approximation (INLA)
[12] to obtain the posterior distributions for the hyperparameters and estimate the
function gt (φs ).
In a nutshell (and leaving aside all the technical details, which are beyond the scope
of this book), the idea is to modify model (4.5) so that the unknown functions gt (φ)
are estimated as a function of a linear predictor Hβ, which includes the impact of
all the parameters in φ, and a “spatially structured” effect ω, which effectively takes
care of the non-linearity and the correlation among the parameters—performing the
same role as the covariance matrix.
Crucially, it is possible to apply the Stochastic Partial Differential Equation
(SPDE) method developed in [13] in combination with the INLA procedure to esti-
mate all the relevant quantities using a “surface” that approximates the impact of the
covariance matrix at each point in the parameter space, without explicitly including
the δq parameters. Although this surface theoretically exists for all points in the
parameter space φ, it is only evaluated at a finite set of points on an irregular grid. The
value of the surface at points not on the grid is estimated by interpolation. This is a
very fast and efficient procedure.
If the grid points are too far apart, then the estimation quality is limited as the
surface will not capture all the information about the covariance matrix. However, as

the computational time is linked to the number of grid points, fewer grid points will
decrease the computational time. The evppi function creates the grid automatically,
but to decrease the computational time or increase the accuracy this grid can be
manipulated, as we show in Sect. 4.3.5.
This grid is only computationally efficient in two dimensions, so dimension reduc-
tion is used to obtain a suitable two-dimensional transformation of the parameters
to estimate the covariance matrix. This dimension reduction, known as Principal Fitted
Components, tests whether these two dimensions contain all the relevant information
about the utility values. If this test indicates that more than two dimensions are
needed to contain all the information, then this method can struggle and full
residual checking is required to validate the estimation procedure (see Figs. 4.14 and
4.15).
Before showing how BCEA can be used to compute the EVPPI using INLA and
Stochastic Partial Differential Equations (SPDEs), we mention that in order to do
so, it is necessary to install the R packages INLA and splancs. As usual, this can be
done by typing the following commands into the R terminal.
> install.packages("splancs")
> install.packages("INLA", repos="https://www.math.ntnu.no/inla/R/stable")
Notice that since INLA is not stored on CRAN, it is necessary to instruct R about the
repository at which it is available (i.e. the argument repos in the call to install.packages).
If these two packages are not installed on the user’s machine, BCEA will produce a
message to request their installation.

4.3.3.4 Example: Vaccine (Continued)

To explore the use of the evppi function, we revisit the Vaccine example. In order
to use the evppi function, the user must have access to the PSA samples/posterior
draws of the parameter vector θ, as well as a BCEA object m. For this example, the PSA
samples of the original parametrisation of the model have been used and extracted
directly from the BUGS object, vaccine, in the Vaccine workspace provided with
BCEA. If the parameter simulations are available in Microsoft Excel or a similar
spreadsheet calculator, then these simulations can simply be imported into R, e.g.
using a .csv file.
If the user is working directly from the BUGS/JAGS output, a BCEA function
called CreateInputs is available to convert this output into input for the evppi function.
This function takes as argument the BUGS/JAGS object containing the MCMC
simulations and returns a matrix, with rows equal to the number of simulations and
columns equal to the number of parameters in θ, and a vector of parameter names.
The call to CreateInputs is presented below.
> inp <- CreateInputs(vaccine)
> names(inp)
[1] "mat" "parameters"

Fig. 4.12 Plot of the EVPI and the EVPPI for β1 and β2 for different willingness to pay values

The matrix of PSA simulations is saved as the object inp$mat, which is then used
directly as an input to the evppi function. If the PSA simulations are saved in a
.csv file, then this can be used directly in the function. The other object in the inp list,
inp$parameters, is a vector of the names of the parameters in this matrix. This vector,
or more usually sub-vectors, can also be used as an input to the evppi function. If
a .csv file is used then the column headings from that file can be given as inputs
instead.
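For instance, a sub-vector of parameter names can be built by pattern matching on inp$parameters; the vector below stands in (hypothetically) for that object, imitating the Vaccine naming convention seen later in this section:

```r
# Hypothetical stand-in for inp$parameters in the Vaccine example
params <- c("beta.1.", "beta.2.", "gamma.2.", "lambda", "xi")

# Select all regression coefficients among the parameter names
phi_names <- grep("^beta", params, value = TRUE)
phi_names
```

The resulting sub-vector could then be passed as the first argument of evppi, exactly as the explicit c("beta.1.","beta.2.") call shown below.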
The basic syntax for calculating the EVPPI is as follows:
> evppi(parameter,input,he)
where parameter is a vector of values or column names for which the EVPPI is being
calculated, input is the matrix or data frame containing the parameter simulations
and he is a bcea object. For example, to calculate the EVPPI for the parameters β1
and β2 , using the default settings, in the Vaccine example the following command
would be used:
> EVPPI <- evppi(c("beta.1.","beta.2."),inp$mat,m)
As the evppi function can take some time to calculate the EVPPI, a readout of the
progress is printed. The most time-consuming part of this process is described as
Calculating fitted values for the GP regression. Depending on the complexity of the
problem and the number of simulations used, this can take minutes.
It is possible to plot the evppi object using an S3 method for objects of the evppi
class. This gives a graphic showing the EVPI and the EVPPI for all willingness to
pay values included in the m$k vector (Fig. 4.12), obtained by invoking the command
plot(EVPPI) in the R terminal.

In addition to this functionality, an evppi object contains objects that can be used
for further analysis; these can be explored by typing
> names(EVPPI)
which returns the following output:
[1] "evppi" "index" "k"
[4] "evi" "parameters" "time"
[7] "method" "fitted.costs" "fitted.effects"
These elements can be extracted for individual analysis and relate to the following
quantities.
• evppi: a vector giving the EVPPI value for each value of the willingness to pay
value in m$k.
• index: a vector detailing the column numbers for the parameters of interest.
• k: a vector of the willingness to pay values for which the EVPPI and the EVPI
have been calculated.
• evi: a vector of the values for the EVPI for different values of the willingness to
pay.
• parameters: a single character string detailing the parameters for which the EVPPI
has been calculated. This character string is used to create the legend for the
graphical output of the plot function.
• time: a list giving the computational time taken to calculate the different stages of
the EVPPI estimation procedure.
• method: the calculation method used for the EVPPI. This will be a list object (cfr.
Sect. 4.3.5.1).
• fitted.costs: this gives the estimated gt (φ) function for costs, for the non-parametric
regression methods presented in Sect. 4.3.3.
• fitted.effects: as for costs, this gives the estimated gt (φ) function for effects.
The most important element of this object is the evppi vector giving the EVPPI
values for the different willingness to pay values. In a standard analysis (using BCEA
default settings) this whole vector can be quite unwieldy. However, the following code
can be used to extract the EVPPI value for different willingness to pay thresholds,
specifically in this case 20 000 monetary units.
> EVPPI$evppi[which(EVPPI$k==20000)]
[1] 1.02189
Note that the costs and effects are fitted separately in the evppi function. Pre-
viously, the utility function Ut (θ) was used as the response for the non-parametric
regression. However, as the utility functions depend directly on the willingness to pay,
this would imply calculating the EVPPI 501 times in a standard BCEA procedure.
Therefore, to speed up computation the costs and effects are fitted separately and
then combined for each willingness to pay threshold. Another computational saving
is made by using incremental costs and effects as the response. This means that in
a health economic model with T decision options, BCEA fits 2(T − 1) regression

curves. In the process, BCEA provides a readout of the progress and presents the
computational time for each decision curve.
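This recombination step can be sketched as follows for a two-option comparison: given one regression fit each for the incremental effects and costs, the EVPPI at every threshold is recovered without refitting anything. All data below are invented, and mgcv stands in for BCEA's internal regression machinery:

```r
library(mgcv)

set.seed(5)
S       <- 1000
phi     <- rnorm(S)                                   # parameter of interest
delta_e <- 0.001 + 0.0005 * phi + rnorm(S, 0, 2e-4)   # invented incremental effects
delta_c <- 5 + rnorm(S, 0, 2)                         # invented incremental costs

# One regression fit each for effects and costs ...
fit_e <- fitted(gam(delta_e ~ s(phi)))
fit_c <- fitted(gam(delta_c ~ s(phi)))

# ... recombined for every willingness to pay threshold k
evppi_at_k <- function(k) {
  g   <- k * fit_e - fit_c         # fitted incremental net benefit, g(phi)
  inb <- k * delta_e - delta_c     # raw PSA incremental net benefit
  mean(pmax(g, 0)) - max(mean(inb), 0)   # reference option has INB = 0
}

v <- sapply(c(10000, 20000, 30000), evppi_at_k)
```

Evaluating evppi_at_k for each of the 501 default thresholds costs only a few vector operations, which is why fitting costs and effects once, rather than the net benefits per threshold, is so much faster.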

4.3.3.5 Example: Smoking (Continued)

To demonstrate the evppi function for multi-decisions we use the smoking cessation
example. Again, the simulations for each of the model parameters must be loaded into
the R workspace. These simulations are contained in the rjags object smoking_output
in the Smoking dataset—this was created by running the MTC model of Sect. 2.4.1
in JAGS. The following code is used to extract the parameter simulations from the
rjags object:
> data(Smoking)
> inp<-CreateInputs(smoking_output)
In this example, there are 25 parameters in the smoking_output dataset. However,
some of these are simply constants used to simplify the calculation of the costs and
effectiveness measures. If these constant variables are used in the evppi function, then
the following errors will occur, for single and multi-parameter EVPPI respectively:
> # Single Parameter EVPPI
> EVPPI<-evppi(c(1),inp$mat,m)

Calculating fitted values for the GAM regression

Error in smooth.construct.cr.smooth.spec
(object$margin[[i]], dat, knt) :
d.1. has insufficient unique values to support 5 knots: reduce k.

> # Multi Parameter EVPPI


> EVPPI<-evppi(c(1,2),inp$mat,m)

Finding projections

Error in eigen(Sigmahat) : infinite or missing values in ’x’


Dropping these constants from the analysis (i.e. by reconstructing the matrix
inp$mat by eliminating the columns associated with constant values), the EVPPI
can be calculated using the same code as before. Notice that, rather than using the
column names for the parameters in this example, the column number is used in the
evppi call. The Smoking example causes some difficulties for the INLA algorithm
using the default options in BCEA: this manifests as the INLA algorithm crashing and
exiting before calculating the EVPPI. To overcome this, the argument h.value is added
to the evppi call. Section 4.3.4 gives a full explanation of this argument, which in
general must be reduced when the INLA algorithm crashes.

> EVPPI<-evppi(c(2:3),inp$mat,m,h.value=0.0000005)

Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE

Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE

Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE

Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE

Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE

Finding projections
Determining Mesh
Calculating fitted values for the GP regression using INLA/SPDE
Calculating EVPPI

> plot(EVPPI)
For this example, we demonstrate the readout that should be expected from the
evppi function, which allows the user to track its progress. Notice
there are six separate readouts indicating that the fitted values for GP regression
using INLA/SPDE are found six separate times. This corresponds to the costs and
effects for the three different comparisons in the Smoking example, as discussed in
Sect. 4.3.3.4.
Figure 4.13 shows the EVPPI plot generated using the above code. Again, notice
that the two break-even points can be clearly seen for the EVPPI curve. Note that the
EVPPI is always dominated by the EVPI, but the difference between the two curves
does not remain constant for the different willingness to pay values.

4.3.4 Advanced Options

The evppi function can take extra arguments to allow users to manipulate the underly-
ing estimation procedures. In general, these extra arguments fine tune the procedure

Fig. 4.13 Plot of the EVPI and the EVPPI for parameter columns 2 and 3 in the smoking_output dataset for different willingness to pay values

to increase/decrease computational speed depending on the required accuracy of the
estimate. As a general rule, the slower the estimation procedure, the more accurate
the EVPPI estimation. Therefore, the user can prioritise either speed or accuracy. The
default settings in BCEA are chosen as a trade-off between these two considerations.

Selecting the Number of PSA Runs


The input N simply allows the user to control how many PSA runs are to be used
to calculate the EVPPI. This is useful when a large number of PSA runs, say over
5 000, are available for analysis. The estimation methods for the EVPPI are slow for
large PSA samples and can sometimes crash due to the memory required for the
estimation procedure.
It is advisable, therefore, to subset from this larger PSA sample and calculate
the EVPPI using a smaller number of PSA simulations. In some cases, particularly
for Bayesian health economic models sampled using MCMC, it is preferable to
thin the full PSA sample rather than taking the first n observations. The N argument
can therefore also be passed as a vector of row numbers. The EVPPI would then
be calculated using those rows. For example, these row numbers can be randomly
picked as follows:
> select <- sample(1:dim(e)[1],size=10,replace=F)
> select
[1] 843 328 674 989 583 635 251 496 37 1000
> EVPPI <- evppi(c("beta.1.","beta.2."),inp$mat,m,N=select)
These commands instruct R to sample 10 values out of the vector consisting of the
numbers between 1 and the size of the PSA simulations (determined by dim(e)[1],
in R terminology). To avoid sampling with replacement (which could theoretically
give rise to the same index more than once), we also set the option replace=F. Then
in the call to evppi, we can simply specify the option N=select and BCEA will only
perform the estimation of the EVPPI using the simulations 843, 328, 674, 989, 583,
635, 251, 496, 37 and 1 000 from the original PSA sample and discard any other

index (of course, it is extremely unsatisfactory to only use 10 simulations in real
applications!).
The user could also simply specify N as a vector, for example in the form
> select <- c(843,328,674,989,583,635,251,496,37,1000)
> EVPPI <- evppi(c("beta.1.","beta.2."),inp$mat,m,N=select)
which would produce the same results as above in terms of selection of PSA runs
and computation of the EVPPI.
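A deterministic alternative to random subsetting is thinning, i.e. keeping every j-th row of the PSA sample; the sketch below assumes (hypothetically) 10 000 available PSA rows:

```r
# Keep every 10th of 10000 PSA rows (sizes are hypothetical)
select <- seq(from = 1, to = 10000, by = 10)
length(select)   # 1000 rows retained

# The vector can then be passed to evppi, e.g.
# EVPPI <- evppi(c("beta.1.","beta.2."), inp$mat, m, N = select)
```

For MCMC-based PSA samples, thinning has the added benefit of reducing the autocorrelation between the retained simulations.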
Outputting the Residuals
As the estimation of the EVPPI is effectively the result of a complex statistical model,
it is important to check and evaluate the quality of this procedure. Thus, in line with
standard statistical methodology, the function evppi allows the user to assess model
fit. To do this, the option residuals must be set to TRUE or T (the default value). This
argument should be changed to FALSE or F if the evppi objects become too large.
However, if the residuals argument is TRUE, then evppi outputs the estimated
values for the function gt (φ), which can be used to check the modelling assump-
tions and thus give some confirmation that the EVPPI has been estimated correctly.
Specifically, for the non-parametric regression methods, the residuals should look
as if they were drawn from a Normal distribution—this is again in line with any
linear or non-linear regression method. In addition to this, and more importantly, the
residuals should have no underlying structure or graphical pattern.
BCEA contains a function diag.evppi that allows users to visually test these model
assumptions. A typical call for this function is the following.
> EVPPI <- evppi(c("beta.1.","beta.2."),inp$mat,m,residuals=T)
> diag.evppi(EVPPI,m,diag="residuals")
> diag.evppi(EVPPI,m,diag="qqplot")
The function diag.evppi takes an object of class evppi as its first argument.
Thus, we first start by estimating the EVPPI. Notice that in order to use diag.evppi,
the option residuals=T is set and the BCEA object m relates to the example under
investigation.
If we set diag="residuals", then diag.evppi plots the residuals against the fitted
values for both the costs and effects (see Fig. 4.14). This plot should have no structure
and be centred around 0. If the plot has clear structure, then the EVPPI estimate will
be inaccurate.
Setting diag="qqplot" shows a QQ plot indicating whether the residuals are
approximately distributed as a Normal random variable. This is a less important
consideration but nevertheless can indicate if there are problems with the estimation
procedure (see Fig. 4.15).
In a multi-decision problem, such as the Smoking example, gt(φ) must
be estimated separately for each comparison, for both costs and effects, as seen
in Sect. 4.3.3.5. There are, therefore, a total of six functions to be estimated by
regression methods. This in turn means there are six different regression fits that
need to be checked.

Fig. 4.14 The plot of residuals against fitted values for the Vaccine example for both costs and effects

This can be done using the diag.evppi function using the additional argument
int=k, where k is the column in the m$delta.e matrix for which we wish to access
the fit. For example, the call
> diag.evppi(EVPPI,m,diag="residuals",int=2)
would instruct BCEA to plot the regression fit for the second incremental costs and
effects.

4.3.5 Technical Options for Controlling the EVPPI Estimation Procedure

This section relates to the more technical aspects for controlling the non-parametric
estimation methods. This section is more advanced and assumes that the reader
has read Sect. 4.3.3 and specifically Sect. 4.3.3.3. It is not necessary to use these
additional fine-tuning procedures to produce EVPPI estimates but in some more
complicated health economic models it may be necessary to improve the EVPPI
estimation accuracy using these more advanced manipulations.
128 4 Probabilistic Sensitivity Analysis Using BCEA

Fig. 4.15 A QQ plot of the residuals in the Vaccine example, for both costs and effects

Controlling the INLA Step Size


The INLA algorithm uses numerical integration to approximate the values for the
hyperparameters for GP regression. This involves choosing a number of points at
which the hyperparameters are estimated precisely. The number of points, or more
precisely the distance between these points, must be chosen. By default, in the evppi
function, this value is set to 0.00005 which is generally small enough to get accurate
estimates for the EVPPI.
However, in some settings this grid is not refined enough and this leads to a
break-down of the numerical results, meaning that the INLA algorithm crashes. If
this does happen then the distance between these points should be reduced by adding
the h.value argument to the evppi call. Reducing the h.value will allow the INLA
algorithm to estimate the EVPPI accurately.
Controlling the Grid for the Default INLA-SPDE Method
As previously discussed, the SPDE-INLA method relies on estimating a “surface”
that represents the impact of the GP covariance matrix at certain points on a grid.
This implies that the accuracy of the EVPPI and computational time depend directly
on this grid. Therefore, there are several options in the evppi function that allow
users to control the creation of the grid and through that control the estimation of the
EVPPI.

Fig. 4.16 An example of a grid (a constrained refined Delaunay triangulation) used to approximate
the covariance matrix impact
To investigate the grid properties, the option plot=T should be set in the call to
evppi. This plots the grid for each function gt (φ) and also allows the user to save
the grid during the estimation procedure. Figure 4.16 gives a good example of a grid
used to approximate the EVPPI. The blue dots represent the PSA data whereas the
vertices of the triangles are the points where the surface is estimated.
In principle, the grids should be broadly circular with no visual outliers—blue
dots isolated from the other points by a blue boundary line. The inner section should
have dense triangles with a boundary relatively tight to the blue data points. The
outer section should be spaced away from the blue data points and can have larger
triangles. Both these sections are encased by blue boundary lines.
The closer these boundaries sit to the data, the faster the computation as there
are fewer grid points. However, the INLA method fixes the value of the surface at
the outer boundary meaning that boundaries close to the data can skew the results.
The inner (outer) boundary can be controlled using the argument convex.inner=a
(convex.outer=a’), where a (a’) is a negative value, typically between −0.2 and
−0.6 for the inner and −0.5 and −1 for the outer boundary. Notice that the value a’
should be more negative than a, as more negative values indicate that the boundary
will be further from the points of interest.
Technically, these negative values define the minimum radius of curvature of the
boundary, expressed as a fraction of the radius of the points. This means that,
if convex.inner=-0.1 then the minimum curvature of the boundary has radius equal
to one-tenth of the radius of the points, giving a boundary that hugs the points of
interest quite tightly. As this is decreased to more negative values, the boundary
is constrained to be further from the points of interest. Incidentally, Fig. 4.16 uses
values of −0.3 and −0.7 respectively for the two boundaries.
The density of the points can also be controlled with the argument cutoff=b and
max.edge=c, where b can typically be between 0.1 and 0.5 and c between 0.5 and
1.2. These values simply define the minimum and maximum (absolute) difference
between the points in this grid. Small values increase the density and larger values
decrease it, with the computation time varying inversely to the density of the grid
points.

Modifying the Linear Predictor


As mentioned in Sect. 4.3.3.3, the basic idea of the INLA/SPDE procedure is to
construct a model for gt (φ) which depends on a linear predictor H β and a “spatially
structured” component ω which basically takes care of the correlation among the Qφ
parameters in φ and is based on a two-dimensional reduction of φ.
Often, this two-dimensional reduction is sufficient to capture all the information
about φ and successfully estimate the target function gt (φ) to produce an accurate
estimation of the EVPPI. However, in some cases this reduction is not sufficient to
capture all the information and BCEA will give the user a warning:
Warning message:
In make.proj(parameter = parameter, inputs = inputs, x = x, k = k, :
The dimension of the sufficient reduction for the incremental effects , column 1 ,
is 3.
Dimensions greater than 2 imply that the EVPPI approximation
using INLA may be inaccurate.
Full residual checking using diag.evppi is required.
As is possible to see, the warning suggests residual checking as a means to identify
any issues with the estimation. Additionally, it details which regression curve may
be inaccurate, in this case the first incremental decision (int=1 for the diag.evppi
command) and specifically for the incremental effects. Figure 4.17 shows the residual
plot for the above example which is structured and therefore indicates some issues
with the estimation procedure. As suggested in [7], it is important to note that while
this error structure does show some issues with the estimation procedure, these are not
necessarily severe, as the residuals are symmetric. This means that the mean gt (φ)
is likely to be correctly estimated; as the EVPPI is based on this mean estimate,
it may still be accurate.
In general, to improve the estimation it can be necessary to use the computationally
intensive standard GP regression procedure. However, as the issues may not be severe,
the estimation can be improved with a smaller computational cost by modelling some
level of interactions between the parameters. These interactions can be added using
the int.ord argument in the evppi function which can be given as a vector, e.g.
int.ord=c(2,1). This formulation models second-order interactions for the effects
only, as the dimension reduction was unsuitable for the effects. This means, for
example when φ contains three elements, that the linear part of the model for the
effects is changed to a non-linear structure

β0 + β1 φ1s + β2 φ2s + β3 φ3s + β4 φ1s φ2s + β5 φ1s φ3s + β6 φ2s φ3s .

Clearly, this increases the number of regression parameters from (Qφ + 1) to
(Qφ + 1 + Qφ(Qφ − 1)/2), giving an increase in computational time.
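This count is easy to verify directly; in the sketch below, Q is hypothetical shorthand for Qφ, the number of parameters of interest.

```r
# Number of regression parameters with all second-order interactions:
# (Q + 1) main-effect terms plus choose(Q, 2) pairwise interaction terms
n_params <- function(Q) (Q + 1) + choose(Q, 2)

n_params(3)   # 7 coefficients, beta_0 to beta_6, as in the expression above
```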
In a multi-decision case, it may be advisable to only use interactions for the curve
where the issues occurred, in this case the first incremental effects, to avoid a large
increase in computational time. This is achieved using a list object in R. The

Fig. 4.17 The residuals against fitted values for an EVPPI estimation procedure where a
high-dimensional reduction is needed to capture all the information in φ

first element in the list is the interaction levels for the effects and the second is the
interaction levels for the costs. So in this example, the following code would be used
> interactions <- list(c(2,1,1),c(1,1,1))
> EVPPI <- evppi(1:3,inp$mat,m,int.ord=interactions)
to only use second-order interactions for the first incremental effects.
Once the EVPPI has been fitted with interactions, we must reassess whether
the EVPPI has been correctly estimated by inspecting the residuals. If this has not
improved the fit significantly then either the interaction levels can be increased further
or the non-default GP method should be used. However, both these strategies have a
greater computational cost.
EVPPI Using Non-default GP Regression
To use non-default GP regression for all regression fits, e.g. the costs and effects for
all incremental decisions, the argument method="GP" must be included in the call
to the evppi function.
As discussed in Sect. 4.3.3.2, the hyperparameters for the non-default regression
are estimated using numerical methods. This involves inverting a square matrix with
the same number of rows and columns as the PSA simulations which is compu-
tationally expensive. This means the estimation of the hyperparameters can take a
substantial time.
Therefore, the default for this method is to use 500 PSA simulations to calculate
the hyperparameters. This can be increased, to improve accuracy, using the argument
n.sims=S, where S is larger than 500. The computational speed can also be increased
by reducing this number, although clearly this affects accuracy. It is important to note
that matrix inversion has an S³ cost, implying that doubling the PSA simulation size
will increase the computational cost by a factor of 8. Additionally, the computational time is


related to the number of parameters of interest in φ. Therefore, the computational
time can increase to hours in complicated models with a large number of parameters
of interest.
EVPPI Using Non-default GAM Regression
GAM regression can be used for multi-parameter EVPPI for all regression fits using
the argument method="GAM" in the call to the evppi function. By default this will
calculate the EVPPI using full interactions for the GAM, as discussed in Sect. 4.3.3.1.
This gives maximum flexibility and therefore, in general more accuracy, inevitably at
the cost of computational time. Additionally, when the number of hyperparameters
needed to estimate the GAM approaches the number of PSA simulations, the GAM
overfits and the accuracy is compromised. For this reason, it is not recommended to
use the GAM method for more than three parameters of interest.
To avoid overfitting, it is possible to specify a different form of interaction structure
among the parameters of interest. The interaction structure is specified by adding
the argument formula in the evppi call. The formula will be passed directly to the
gam function in the mgcv package. Therefore, a full explanation of the formula
structure is given by executing ?gam or help(gam). In general, there are two important
formula types, s(), a smooth function of 1 parameter, and te(), modelling interactions
between several parameters. For example, let us assume φ = (β1 , β2 , λ); we could
specify formula="te(beta.1.,beta.2.)+s(lambda)"; this implies that gt (φ) should be
estimated by modelling interactions between β1 and β2 , which could be correlated,
and then modelling λ separately.

4.3.5.1 Mixed Strategy for EVPPI Calculations

Throughout this section it is clear that a trade-off between computational speed and
accuracy must be made. For simpler examples, the default fast GP method will be
suitable but as the examples become more complex, as seen by the residuals, it may
be necessary to increase computation time to get more accurate results.
As the evppi procedure fits several regression curves to model all incremental
costs and effects, it may be that some of the curves can be fitted with the faster
procedures, while the more complex curves may require more computational time.
To allow for this, the method argument can be given as a list object in R, similar to
the int.ord argument for including interactions.
Again, the first list element contains the methods that should be used for effects
and the second list element contains the methods for the costs. The fast GP regression
method is called "INLA". Therefore, using the same example as before, where the
first incremental effects were poorly fitted, the following code would be used:
> methods <- list(c("GP","INLA","INLA"), c("INLA","INLA","INLA"))
> EVPPI <- evppi(1:3,inp$mat,m,method=methods)

This strategy for EVPPI estimation demonstrates that while the default options for
the evppi function have been chosen as a trade-off between computational time and
accuracy, it is important to perform residual checking as part of an EVPPI procedure.
It is not recommended to use the evppi function to perform entirely “black-box”
estimation for the EVPPI as this can lead to misleading EVPPI estimates.

4.3.6 Deprecated (Single-Parameter) Methods

In this section we review two approximation methods that can be used to estimate
a single-parameter EVPPI, i.e. the case in which the set of important parameters
only contains one element and is thus indicated as a scalar, φ. As mentioned earlier,
these methods can be considered as deprecated since for such a simple setting, it is
possible to apply the GAM estimation which is accurate and efficient. Nevertheless,
we present these two older methods for completeness. For the same reason, they are
still included in the evppi function in BCEA.
The basic idea underlying the methods proposed almost at the same time and
independently by Sadatsafavi et al. [6] and Strong and Oakley [5] is that gt (φ) can
be estimated by noting that the optimal decision only changes a small number of
times, normally fewer than three, when traversing all the possible values of φ. This means
that for almost all values of φ the optimal decision is the same for φ + ε and φ − ε,
for an arbitrarily small value ε. As an example, the optimal decision may only change
once, i.e. t = 1 for φ > m and t = 0 for φ ≤ m, for some value m. Since the interest
is on a single parameter φ, these change points are simply located on a line, rather
than more complicated shapes in higher dimensions.
While the optimal decision remains constant, gt (φ) can be approximated by the
average observed utility value in that set. In other words, if the decision remains
constant in an interval Iφ = [φl ; φu ], then gt (φ) is well approximated by the average
of the observed utility values calculated using all the φ ∈ Iφ . If this process is performed
for all changes of decision that occur in the parameter (unidimensional) space, then
gt (φ) can be estimated by a small number of values, which can then be used to
compute the opportunity loss.
The problem with this theory is that we typically have no substantial information
about the values of φ which change the optimal decision. The two single-parameter
EVPPI estimation methods use different algorithms to determine these cut-off points:
Sadatsafavi et al. search for points where the optimal decision is most likely to
change. As the algorithm searches for these points, the number of expected decision
changes must be specified. Conversely, Strong and Oakley split the full PSA sample
into “small” sub-samples and thus assume that the decision remains constant within
these sub-samples. Therefore, gt (φ) is approximated simply by finding the maximum
observed utility within each of these samples. For this method, the user must specify
the number of the sub-samples to be used.
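To make the blocking idea concrete, the following self-contained sketch (simulated data; the function name evppi_blocks and all inputs are hypothetical, not BCEA code) orders the PSA draws by φ, splits them into equally sized blocks, approximates gt (φ) by the block-wise mean utility of each intervention, and takes the maximum within each block.

```r
# A minimal sketch of the "blocks" single-parameter EVPPI estimator.
# U: matrix of simulated utilities (one row per PSA draw, one column per
#    intervention); phi: the PSA draws of the parameter of interest.
evppi_blocks <- function(phi, U, n.blocks) {
  S <- length(phi)
  ord <- order(phi)                                      # sort draws by phi
  block <- rep(1:n.blocks, each = ceiling(S / n.blocks))[1:S]
  # Within each block the optimal decision is assumed constant, so g_t(phi)
  # is approximated by the block-wise mean utility of each intervention
  g <- apply(U[ord, ], 2, function(u) ave(u, block))
  mean(apply(g, 1, max)) - max(colMeans(U))
}

# Toy example: the incremental utility of t = 1 is phi - 0.5, so the true
# EVPPI is E[max(0, phi - 0.5)] = 0.125 for phi ~ Uniform(0, 1)
set.seed(1)
phi <- runif(10000)
U <- cbind(0, phi - 0.5)
evppi_blocks(phi, U, n.blocks = 50)
```

With 50 blocks of 200 draws each, the estimate lands close to the analytic value of 0.125; too few blocks would bias the estimate, while too many would make each block-wise mean noisy.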

It is possible to use these methods by simply adding the option method, set to
either "so" or "sad", to the evppi call. This instructs BCEA to use the method of Strong
and Oakley or that of Sadatsafavi et al., respectively. In addition, the user must also
specify either the number of subsets, e.g. n.blocks=50 for Strong and Oakley, or the
number of decision changes, e.g. n.seps=1 for Sadatsafavi et al.
Notice that it is possible to specify more than one parameter for these methods;
in this case, BCEA will calculate the single-parameter EVPPI separately for all the
parameters considered. For example, if we specify that the relevant parameters are
β1 and β2 , the two single-parameter EVPPIs can be obtained using the code:
> EVPPI.so <- evppi(c("beta.1.","beta.2."),inp$mat,m,method="so",n.blocks=50)
> EVPPI.sad <- evppi(c("beta.1.","beta.2."),inp$mat,m,method="sad",n.seps=1)
In this particular case, we are specifying that the full range of PSA samples should
be split into 50 “blocks”, when using the option so. Similarly, we are also instructing
evppi that we are only expecting one change in the optimal decision (indicated by
the option n.seps=1, for method="sad").
Again, it is possible to extract individual EVPPI values and plot the EVPPI when
these methods have been used. The most important distinction from the default
methods is that these methods calculate the EVPPI for different parameters and thus
more than one vector of EVPPI values is stored in the object m$evppi. These can be
extracted using the $ notation in R, which allows the user to access the elements of
an object. For example, we could extract the value of the EVPPI for the parameter
β1 and willingness to pay of 20 000 monetary units obtained using the Strong and
Oakley algorithm by typing
> EVPPI.so$evppi$`beta.1.`[which(m$k==20000)]
Similarly, the value of the EVPPI for β2 obtained using the method of Sadatsafavi
et al. is accessed using the code
> EVPPI.sad$evppi$`beta.2.`[which(m$k==20000)]
Despite their slightly different nature, because the resulting objects are still in the
class EVPPI, the plot method is still available and thus we can simply visualise the
results by typing
> plot(EVPPI.so)
> plot(EVPPI.sad)
Figures 4.18 and 4.19 show a visual indication of which parameter has a higher
impact on uncertainty. This ordering should stay constant for all willingness to
pay values.
Looking at these plots, we can observe some problems with the estimation using the
Sadatsafavi et al. method: rather than being a smooth function reaching a sharp peak at the
change points, the EVPPI jumps around a little more frequently. It is also possible

Fig. 4.18 The single-parameter EVPPI calculated using the Strong and Oakley method with 50
blocks for two parameters of the Vaccine example

Fig. 4.19 The single-parameter EVPPI calculated using the Sadatsafavi et al. method, with an
assumption of one decision change, for two parameters of the Vaccine example

to note that the two single-parameter methods give different results, suggesting that the
chosen method-specific inputs are incorrect. Information about how to choose these
inputs can be found in the original papers or in the review [11]. These choices are relatively
complex and are not implemented in BCEA, which makes these single-parameter
methods challenging to use accurately.

4.3.7 The Info-Rank Plot

Another way in which the EVPPI can be used is to provide an “overall” assessment of
the impact of each single parameter on the decision uncertainty. To this aim, BCEA
has a specialised function info.rank which produces a plot of the univariate EVPPI
for all the parameters of interest (as specified by the user). While this is not ideal,
since correlation among parameters and model structure does have an impact on

Fig. 4.20 The Info-rank plot for the Vaccine example. Each bar quantifies the proportion of the
total EVPI associated with each of the parameters used as input

the joint value of the EVPPI (which is not a linear combination of the individual
EVPPIs!), the Info-rank plot with all the model parameters ranked can be used as a
sort of Tornado diagram, a tool often used in deterministic sensitivity analysis [14].
For example, in the case of the Vaccine case study, we can obtain the Info-rank
plot for all the underlying model parameters using the following commands.
> # Creates the object with the relevant inputs
> inp <- CreateInputs(vaccine)

> # Makes the Info-Rank plot
> info.rank(inp$parameters,inp$mat,m)
where m is the BCEA object for the Vaccine example. The first argument of the
info.rank function gives the parameters for which the relative importance should be
calculated. In this setting inp$parameters implies that the relative importance should
be calculated for all parameters in the Vaccine model.
This creates the graph shown in Fig. 4.20. The graph shows a set of bars quantifying
the proportion of the total EVPI associated with each parameter. The larger the bar,
the higher the impact of a given parameter on decision uncertainty. As mentioned
above, care is needed in giving this graph an “absolute” interpretation—just because
a parameter shows a relatively low position in the Info-rank plot, does not mean that
there will be no value in investigating it in conjunction with other parameters.

However, it can be shown that the EVPPI of a set of parameters must be at least
as big as the individual EVPPI values. Therefore, parameters with high individual
EVPPI will always result in a joint parameter subset with high value. Conversely, nothing can
be said about parameters with small individual EVPPI values, especially in decision
tree models, which are typically multiplicative in the parameters. This means that
learning the value of one of these parameters has little value as the other elements
remain uncertain. However, learning all the parameters can greatly decrease decision
uncertainty and therefore has large value to the decision-maker. Nonetheless, the
Info-rank plot gives an overview, which is perhaps useful (in conjunction with expert
knowledge about the model parameters) to drive the selection of the subset φ to be
included in the full analysis of the EVPPI.
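A toy Monte Carlo example (entirely hypothetical, not part of the Vaccine model) makes this concrete: when the incremental net benefit depends on the product of two parameters, each individual EVPPI is essentially zero while the overall EVPI, and hence the joint EVPPI for the pair, is large.

```r
# Incremental net benefit depends on the product phi1 * phi2
set.seed(1)
S <- 10000
phi1 <- sample(c(-1, 1), S, replace = TRUE)
phi2 <- sample(c(-1, 1), S, replace = TRUE)
inb <- 100 * phi1 * phi2

# EVPI (learning everything): close to 50
evpi <- mean(pmax(0, inb)) - max(0, mean(inb))

# Single-parameter EVPPI for phi1: E[INB | phi1] is ~0 for either value of
# phi1, so the value of learning phi1 alone is ~0
g1 <- ave(inb, phi1)                     # conditional mean of INB given phi1
evppi1 <- mean(pmax(0, g1)) - max(0, mean(inb))
```

Here a bar for phi1 (or phi2) in an Info-rank plot would be essentially zero, even though resolving both parameters together would remove almost all decision uncertainty.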
In addition to the graph, the function info.rank automatically prints the complete
ranking for the parameters selected. This is also stored in an element $rank. The
user can control the vector of parameters to be ranked, given as either a vector of
strings giving the parameter names or a numeric vector, corresponding to the column
numbers of important parameters. For example, the code
> info.rank(parameters=c("beta.6.","gamma.2.",eta"),input=inp$mat,m)
> info.rank(parameters=c(43,48,46),input=inp$mat,m)
produce the same output, since the values 43, 48 and 46 are the indexes associ-
ated with the positions of the parameters beta.6., gamma.2. and eta in the vector
inp$parameters. The simulations for all the parameters in the model need to be stored
in a matrix, in this example inp$mat. As mentioned earlier, this may be created using
the utility function CreateInputs, or imported from a spreadsheet.
The user can also select a specific value for the willingness to pay for which the
Info-rank is computed and plotted by setting the argument wtp=value. The element
value must be one of the elements of the willingness to pay grid, i.e. the values
stored in m$k. The default value is determined as the break-even point from the
BCEA object containing the economic evaluation. It is important to note that the
ranking will often change substantially as a function of the willingness to pay.
Additional options include graphical parameters that the user can specify. For
example, the argument xlim specifies the limits of the x-axis; the argument ca
determines the font size for the axis label (with default value set at 0.7 of the full
size); cn indicates the font size for the vector of parameter names (again with default at
0.7 of the full size); mai specifies the margins of the graph (default = c(1.36,1.5,1,1));
and finally rel is a logical argument that specifies whether the ratio of EVPPI to EVPI
(which can be obtained by default as rel=TRUE) or the absolute value of the EVPPI
(rel=FALSE) should be used for the analysis.

4.4 PSA Applied to Model Assumptions and Structural Uncertainty

This section is concerned with testing the sensitivity of the cost-effectiveness analysis
to the underlying model and health economic modelling assumptions.

4.4.1 Mixed Strategy

In the previous chapters, we have explored the analysis of cost-effectiveness assuming
that technologies and interventions would be used in total isolation, i.e. they would
completely displace each other when chosen. In reality this happens rarely, as new
interventions are not completely implemented for all patients in a certain indication.
In general, the previously available strategies usually maintain a share of the market over time.
This is due to a number of factors, for example resistance to the novel intervention
or preference of use of different technologies in different patients.
When the market shares of the other available technologies cannot be set to zero,
the expected utility in the overall population can be computed as a mixture:


Ū = Σ_{t=0}^{T} qt Ut = q0 U0 + q1 U1 + · · · + qT UT

with qt ≥ 0 for all t ∈ {0, . . . , T} and Σt qt = 1. For each intervention t, the quantity qt
represents its market share and Ut its expected utility. The resulting quantity Ū can
be easily compared with the “optimal” expected utility U ∗ to evaluate the potential
losses induced by the different market composition. In other terms, the expected
utility for the chosen market scenario is the weighted average of the expected utility
of all treatment options t with the respective market share qt as weights.
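The computation of Ū and of the loss with respect to the optimal choice is straightforward; the numbers below are purely hypothetical.

```r
# Hypothetical expected utilities for T + 1 = 4 interventions and their shares
U <- c(-105.2, -112.4, -99.8, -101.3)  # expected utility of each intervention
q <- c(0.4, 0.3, 0.2, 0.1)             # market shares: q_t >= 0, summing to 1
stopifnot(all(q >= 0), isTRUE(all.equal(sum(q), 1)))

U.bar <- sum(q * U)                    # mixed-strategy expected utility
loss <- max(U) - U.bar                 # loss with respect to the optimal U*
```

With these values the mixture gives U.bar below the optimal expected utility max(U), and loss quantifies the penalty induced by the sub-optimal market composition.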

4.4.1.1 Example: Smoking Cessation (Continued)

To evaluate the uncertainty surrounding the decision under a mixed-strategy assumption
using the EVPI, the mixedAn function is provided in BCEA, which requires the
bcea object, m, and the shares for each intervention. In general, it is useful to create
an object containing the weights associated with each treatment strategy, which we call
mkt.shares. Its elements should be values between zero and one, and sum to one. The
length of the vector needs to be equal to the number of interventions defined in the
bcea object, i.e. equal to m$n.comparators, equal to four in the smoking cessation
example. It is important to note that this function allows any values for the market
shares (including negative shares and sum not equal to one), and therefore the user
needs to be careful when specifying them.

Let us assume that the market shares are driven by the willingness of the indi-
viduals to start therapies. In a possible scenario, the majority of the patients prefer
quitting smoking without any help, and the other patients are less willing to undergo
therapies as they become more costly. We can therefore imagine in this scenario that
40% of the individuals are in the no treatment group, 30% in the self-help group,
and only 20 and 10% seek individual and group counselling, respectively. The market
shares vector is then defined as
> mkt.shares = c(0.4, 0.3, 0.2, 0.1)
To produce the results for the mixed analysis, it is necessary to execute the
mixedAn function, creating an object ma of the class mixedAn.
> ma = mixedAn(m, mkt.shares)
The resulting ma object can be used in two ways:
• To extract the point estimate of the loss in EVPI for a given willingness to pay value,
by comparison with the “optimal” situation in which the most cost-effective comparator
t is adopted (such that its associated market share qt is equal to one);
• To produce a graphical analysis of the behaviour of the EVPI over a range of
willingness to pay values for the mixed strategy and “optimal” choice.
The first item can be obtained using the summary function. Since this is an S3
method for objects of class mixedAn, the relevant help page can be found by
executing ?summary.mixedAn from the R console. For example, the
point estimate for the loss in EVPI for a willingness to pay of 250 per QALY gained
can be obtained with:
> summary(ma, wtp=250)

Analysis of mixed strategy for willingness to pay parameter k = 250

Reference intervention: Group counselling (10.00% market share)


Comparator intervention(s): No treatment (40.00% market share)
: Self-help (30.00% market share)
: Individual counselling (20.00% market
share)

Loss in the expected value of information = 23.85

The two EVPI curves, for the optimal and mixed-treatment scenarios together,
can be plotted using the plot function. Both the base graphics and ggplot2 versions of
the plot are available, and can be selected by specifying the argument graph="base"
(or "b" for short, the default) or graph="ggplot2" (or "g" for short). The function
accepts other arguments, analogously to the previously presented functions, to tweak
the appearance of the graph, in addition to the choice of the plotting library to be
used (the help page can be opened by running ?plot.mixedAn in the R console). The
y-axis limits can be selected by passing a two-element vector to the argument y.limits
(defaults to NULL), while the legend position can be chosen using pos, consistent

Fig. 4.21 The expected value of perfect information under the “optimal strategy” and “mixed
strategy” scenarios for varying willingness to pay thresholds. In this case the EVPI for the mixed
strategy is always greater than for the optimal strategy, due to the sub-optimality of the market
shares leading to higher values of the opportunity loss

with other functions such as ceplane.plot. The output of the function for the code
shown below is presented in Fig. 4.21.
> plot(ma, graph="g", pos="b")

Figure 4.21 shows the behaviour of the EVPI over willingness to pay values
ranging from £0 to £500. The EVPI for the mixed strategy is greater than for the
optimal strategy scenario, in which the whole of the market share is allocated to the
intervention with the highest utility at any given willingness to pay value. The gap
between the two curves is wider at the two extreme points of the chosen interval of
WTP values, while they are relatively close in the interval between £150 and £250 per
life year gained. This is because in this interval the decision uncertainty is highest,
which decreases the opportunity loss differential between the two scenarios. Clearly,
when the optimal strategy is uncertain all strategies are similar, and therefore using
a mixed strategy is close to using the optimal strategy.
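The mechanics of this comparison can be illustrated with a self-contained R sketch (the simulated utilities below are toy values, not the smoking cessation model; the market shares in q are those of the example). Since a share-weighted average of the mean utilities can never exceed their maximum, the mixed-strategy EVPI is always at least as large as the optimal-strategy EVPI:

```r
set.seed(42)
n.sims <- 10000
# Toy simulated utilities (monetary net benefits) for four interventions
U <- cbind(rnorm(n.sims, 100, 40), rnorm(n.sims, 110, 40),
           rnorm(n.sims, 105, 40), rnorm(n.sims,  95, 40))
q <- c(0.4, 0.3, 0.2, 0.1)    # market shares defining the mixed strategy

U.star <- apply(U, 1, max)    # best achievable utility in each simulation
# EVPI under the optimal strategy: opportunity loss of always choosing
# the intervention with the highest expected utility
evpi.opt <- mean(U.star) - max(colMeans(U))
# EVPI under the mixed strategy: the expected utility is the
# share-weighted average of the interventions' expected utilities
evpi.mixed <- mean(U.star) - sum(q * colMeans(U))
evpi.mixed >= evpi.opt        # always TRUE
```

The gap evpi.mixed − evpi.opt quantifies the extra opportunity loss induced by the sub-optimal market shares.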
4.4 PSA Applied to Model Assumptions and Structural Uncertainty 141

4.4.2 Including Risk Aversion in the Utility Function

In all the analyses presented so far, the utility for the decision-maker has always been
assumed to be described by the monetary net benefit (see Sect. 1.3). This assumption
imposes a form of risk neutrality on the decision-maker, which might not be always
reasonable [2]. A scenario considering risk aversion explicitly, with different risk
aversion profiles, can be implemented by extending the form of the utility function
[3]. One of the possible ways to include the risk aversion in the decision problem is
to re-define the utility function (1.5) as:

$$u(b, r) = \frac{1}{r}\left[1 - \exp(-r b)\right]$$
where the parameter r > 0 represents the risk aversion attributed to the decision-
maker. The higher the value of r , the more risk-averse the decision-maker is consid-
ered to be, where b := ke − c is the monetary net benefit.3
It is not usually possible to make the degree of risk aversion explicit, as it is
unlikely to be known. Therefore, an analysis using different risk aversion scenarios
can be carried out to analyse the decision-making process as a function of r .
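A minimal numerical check of this utility function (plain R, not a BCEA call) shows that for r close to zero it reproduces the risk-neutral net benefit b, and that for r > 0 an uncertain benefit is valued less than a certain benefit with the same mean (Jensen's inequality):

```r
# Risk-adjusted utility: u(b, r) = (1/r) * (1 - exp(-r*b))
u <- function(b, r) (1 - exp(-r * b)) / r

b <- 100
u(b, 1e-10)          # essentially equal to b: (almost) risk neutrality
u(b, 0.005)          # concave transformation of the net benefit

# A risk-averse decision-maker values an uncertain benefit less than
# its expectation, since u is concave in b
set.seed(1)
b.sims <- rnorm(1e5, mean = 100, sd = 50)
mean(u(b.sims, 0.005)) < u(mean(b.sims), 0.005)   # TRUE
```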
The CEriskav function provided in BCEA can be used to perform health economic
analysis in the presence of risk aversion. In a similar manner to the mixed strategy
analysis presented in Sect. 4.4.1, a new object is created after the user specifies the
risk aversion parameter r . This can be a single positive number or a vector of possible
values denoting the risk aversion profile of the decision-maker. A vector r can be
defined and fed into the function together with the bcea object.

4.4.2.1 Example: Vaccine (Continued)

To perform the analysis with risk aversion we will use the Vaccine example. We will
assume that the object m of class bcea, produced by the initial call to the function
bcea and containing the results of the base case analysis, is available in the current
workspace. To assess the robustness of the results to variations in risk aversion, we
input different values of the risk aversion parameter in a vector, r:
> r <- c(0.0000000001,0.005,0.020,0.035)

The first element is taken to be sufficiently close to zero to be interpreted by BCEA
as a scenario with no risk aversion, marked as r → 0 in Fig. 4.22. Once

3 Itshould be noted that, as a result, the known-distribution utility function assumes a complex
form. For a more complete discussion please see [2].

the r vector is defined, the risk aversion analysis can be run using the function
CEriskav. This will create an object of class CEriskav assigned to the object cr:
> cr <- CEriskav(m,r=r,comparison=1)

The objects of class CEriskav can be used to produce the expected incremental
benefit and expected value of perfect information plots in BCEA. To do so it is
sufficient to call the plot function with an object of class CEriskav as argument. This
will produce the output displayed in Fig. 4.22:
> plot(cr)

Calling the plot function with a CEriskav object (i.e. the plot.CEriskav function)
will produce the two graphs in two separate graphical windows by default, creating
a new graphical device for the second plot (the EVPI). This behaviour can be
modified by passing additional arguments to the plotting function. If the argument
plot="ask" is added, the "ask" option in the graphical parameters list par will
temporarily be set to TRUE. In this way the second plot will be displayed in the
active device only after the user presses the Return key in an interactive R session
(a readline() call is used in non-interactive sessions).
Objects of class CEriskav such as cr contain several subsettable named elements:
• Ur: a four-dimensional numeric matrix containing the utility value for each simu-
lation, willingness to pay value (WTP) contained in the approximation grid vector
k, intervention and risk aversion parameter included in the vector r;
• Urstar: a three-dimensional numeric matrix containing the highest utility value
among the comparators for each simulation, willingness to pay value in the approx-
imation grid and risk aversion parameter;
• IBr: the incremental benefit matrices for all risk aversion scenarios, built as three-
dimensional matrices over the simulation, WTP values and values defined in r;
• eibr: the EIB for each pair of WTP and risk aversion values, i.e. the average of IBr
over the simulations;
• vir: the value of information, in a multi-dimensional matrix with the same structure
as IBr;
• evir: the expected value of information obtained averaging vir over the simulations;
• R: the number of risk aversion scenarios assessed;
• r: the input vector r , containing the risk aversion parameters passed to the CEriskav
function;
• k: the grid approximation of the interval of willingness to pay values taken from
the bcea object given as argument to CEriskav, in this case m.
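The relationship between IBr and eibr (and, analogously, between vir and evir) can be mimicked with toy arrays; the dimension ordering below (simulation × WTP × risk aversion) is an assumption for illustration, and can be checked by calling dim() on the actual object:

```r
set.seed(10)
n.sims <- 1000; n.k <- 5; n.r <- 4
# Toy incremental benefits: one value per simulation, WTP value and r value
IBr <- array(rnorm(n.sims * n.k * n.r), dim = c(n.sims, n.k, n.r))

# eibr is the average of IBr over the simulations:
# one EIB value for each (WTP, r) pair
eibr <- apply(IBr, c(2, 3), mean)
dim(eibr)                                  # 5 4
all.equal(eibr[1, 1], mean(IBr[, 1, 1]))   # TRUE
```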

4.4.3 Probabilistic Sensitivity Analysis to Structural Uncertainty

All the sensitivity analyses presented so far are based on the premise that the only way
in which uncertainty affects the results of the economic evaluation is either through

[Figure 4.22 about here: two plots against the willingness to pay (£0–50,000). Top panel: "EIB as a function of the risk aversion parameter"; bottom panel: "EVPI as a function of the risk aversion parameter"; each showing curves for r → 0, r = 0.005, r = 0.02 and r = 0.035.]

Fig. 4.22 The figures show the output of the plot function for the risk aversion analysis: the effect
of different risk aversion scenarios on the expected incremental benefit (EIB, top) and the expected
value of perfect information (EVPI, bottom). It can be easily noticed that the EIB departs from
linearity and that the decision uncertainty represented by the EVPI grows with increasing aversion
to risk

sampling variability for the observed data or epistemic uncertainty in the parameters
that populate the model. In other words, the economic assessment is performed
conditional on the model selected to describe the underlying phenomenon.
However, almost invariably the model structure, i.e. the set of assumptions and
alleged relationships among the quantities involved, is an approximation to a complex
reality. As Box and Draper state, "all models are wrong, but some are useful" [15].
From a hard-core Bayesian perspective, the ideal solution would be to extend the
modelling to include a prior distribution over a set of H finite, mutually exclusive and
exhaustive potential model structures M = (M_1, M_2, ..., M_H). In principle, these
may be characterised by some common features (e.g. some Markov model to represent
the natural history of a disease) and each could have slightly different assumptions;
for example, M_1 may assume a logNormal distribution for some relevant cost, while
M_2 may assume a Gamma.
In this case, the data would update all the prior distributions (over the parameters
within each model and over the distribution of models) so as to determine a poste-
rior probability that each of the Mh is the “best” model (i.e. the one that is most
supported by the data). It would be possible then to either discard the options with
too small posterior probabilities or even build some form of model average, where
the weights associated with each structure are these posterior probabilities. Leaving
aside the technical difficulties with this strategy, the main problem is the underlying
assumption that we are able to fully specify all the possible alternative models. This
is rarely possible in practical scenarios.
One possible way to overcome this issue is to aim for a less ambitious objective—
the basic idea is to formalise a relatively small set of possible models and compare
them in terms of their out-of-sample predictions. This quantifies how well the pre-
dictive distribution for a given model would fit a replicated dataset based on the
observed data. Notice that, especially in health economic evaluations, the possible
models considered are merely a (rough) approximation to the complex phenomenon
under study, so there is unlikely to be a guarantee that any of these models should
be the “true” one.
Under this strategy, it is therefore necessary to determine a measure of good-
ness of fit that can be associated with each of the models being compared. Within
the Bayesian framework, one convenient choice is to use the Deviance Information
Criterion (DIC) [16]. A technical discussion of this quantity and its limitations are
beyond the objectives of this book—for a brief discussion see [2]. Nevertheless, the
intuition behind it is that it represents a measure of model fit based on (a function
of) the likelihood D(θ) = −2 log p(y | θ) and a term p_D, which is used to penalise
model complexity. The reason for the inclusion of the penalty is that models contain-
ing too many parameters will tend to overfit the observed data, i.e. do particularly
well in terms of explaining the realised dataset, but are likely to perform poorly on
other occurrences of similar data.
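This intuition can be checked on a toy example unrelated to the health economic models above: for a Normal sample with known standard deviation and a single unknown mean, the penalty p_D (the mean posterior deviance minus the deviance at the posterior mean) should be close to 1, the number of free parameters:

```r
set.seed(1)
y <- rnorm(30, mean = 1, sd = 1)              # toy observed data
# Posterior draws for the mean under a flat prior (known sd = 1)
theta <- rnorm(5000, mean = mean(y), sd = 1 / sqrt(30))

# Deviance: D(theta) = -2 log p(y | theta)
D <- sapply(theta, function(th) -2 * sum(dnorm(y, th, 1, log = TRUE)))
D.bar <- mean(D)                              # posterior mean deviance
D.hat <- -2 * sum(dnorm(y, mean(theta), 1, log = TRUE))
p.D <- D.bar - D.hat                          # effective number of parameters
DIC <- D.bar + p.D
p.D                                           # close to 1
```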
From the technical point of view, it is fairly easy to compute the DIC as a by-
product of the MCMC procedure. Structural PSA can be then performed using the
following steps.

1. Define a (relatively small) set of possible models M = (M_1, M_2, ..., M_H).
   These may differ in terms of distributional assumptions, but may also contain
   slightly different features;
2. Fit all the models and, for each h = 1, ..., H, compute DIC_h. This usually has
   no extra computational cost;
3. Derive a set of weights to be associated with each of the models. One possible
   solution discussed in [2] is to use
   $$w_h = \frac{\exp(-0.5\,\Delta\mathrm{DIC}_h)}{\sum_{h=1}^{H} \exp(-0.5\,\Delta\mathrm{DIC}_h)}, \qquad (4.6)$$
   where $\Delta\mathrm{DIC}_h = \left|\min_h(\mathrm{DIC}_h) - \mathrm{DIC}_h\right|$;
4. Use the weights w_h to compute an average model M*.

4.4.3.1 Example: Chemotherapy

To consider this strategy, we consider a model discussed in [2] used to evaluate a
new chemotherapy drug (t = 1) against the standard of care (t = 0). A graphical
representation of the modelling structure is shown in Fig. 4.23: in each treatment
option, patients may experience blood-related side effects (with a probability π_t)
or can have a successful treatment, where no side effects occur, with probability
(1 − π_t). In the latter case, the measure of effectiveness e_t is set to 1, while the only
cost accrued is the acquisition of the drug, c_t^drug, which varies in the two arms. In the
presence of side effects, patients may require ambulatory care (with probability γ,
which we assume is independent of the treatment assigned); this is coded as a non-
effective treatment and thus e_t = 0; the patients in this branch of the decision tree
will accrue costs c_t^drug + c^amb. Similarly, the patients who experience the worst side
effects will go into hospital, which we simplistically assume happens with probability
(1 − γ), and will accrue costs c_t^drug + c^hosp.
The new drug is supposed to produce a reduction in the chance of side effects, so
that we can model π_1 = ρπ_0, where π_0 is a baseline value, which is associated with
the standard of care, and ρ is the reduction factor.
We can express the assumptions in terms of a fairly simple model, where the
number of patients experiencing side effects is SE_t ~ Binomial(π_t, n_t); condition-
ally on the realisation of this event, the number of patients requiring ambulatory
care is A_t ~ Binomial(γ, SE_t). We also assume a vague prior on the probability
of side effects under the standard of care and on the probability of ambulatory care,
e.g. π_0, γ ~ Beta(1, 1), independently. A clinical trial may be conducted to investigate
the new drug and we assume to have observed that n_0 = 111, SE_0 = 27, A_0 = 17,
n_1 = 103, SE_1 = 18 and A_1 = 14. Notice that the trial data are about a sample of
patients (n_0, n_1) in the two arms, while we are interested in extrapolating the model
to the target population, made up of N individuals (as shown in Fig. 4.23).
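Because the Beta priors are conjugate to the Binomial likelihoods, the posterior distributions for π_0 and γ are available in closed form, which gives a useful sanity check on the MCMC output (a sketch; the counts are those reported above, and γ pools the ambulatory-care data from both arms since it is common to them):

```r
# Observed trial data
n0 <- 111; SE0 <- 27; A0 <- 17
n1 <- 103; SE1 <- 18; A1 <- 14

# pi0 | data ~ Beta(1 + SE0, 1 + n0 - SE0)
a.pi0 <- 1 + SE0; b.pi0 <- 1 + n0 - SE0
mean.pi0 <- a.pi0 / (a.pi0 + b.pi0)        # posterior mean, about 0.25

# gamma | data ~ Beta(1 + A0 + A1, 1 + (SE0 - A0) + (SE1 - A1))
a.g <- 1 + A0 + A1; b.g <- 1 + (SE0 - A0) + (SE1 - A1)
mean.gamma <- a.g / (a.g + b.g)            # posterior mean, about 0.68

c(mean.pi0, mean.gamma)
```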

Fig. 4.23 Graphical representation of the Chemotherapy model, in terms of a decision tree

We also assume that some information, e.g. from registries, is present to inform
the prior distribution for the cost of ambulatory and hospital care and that this
can be encoded in the following form: c^amb ~ logNormal(4.77, 0.17) and c^hosp ~
logNormal(8.60, 0.17). These imply that we are expecting them to realistically vary
in the intervals £(85; 165) and £(3813; 7738), respectively. Moreover, we know
that the drugs have fixed costs c_0^drug = 110 and c_1^drug = 520.
Arguably, the crucial parameter in this model is the reduction in the probability
of side effects ρ. We assume that only limited evidence is available on the actual
effectiveness for t = 1 and thus consider two somewhat contrasting scenarios. In
the first one, we model ρ ∼ Normal(0.8, 0.2), to express the belief that the new
chemotherapy is on average 20% more effective than the standard of care, with some
relatively limited variability around this estimate. The second scenario assumes a
“sceptical” view and models ρ ∼ Normal(1, 0.2). This implies that on average the
new chemotherapy is no better than the status quo, while allowing for uncertainty
in this assessment. Figure 4.24 shows a graphical representation of these two prior
distributions—panel (a) shows the “enthusiastic” prior, while panel (b) depicts the
“sceptical” prior.
The model assumptions can be encoded in the following BUGS/JAGS code (notice
that again we define the two treatment arms as t = 1, 2 for the standard of care and
the new drug, respectively).
model {
# Observed data
for (t in 1:2) {
SE[t] ~ dbin(pi[t],n[t])

[Figure 4.24 about here: two histograms, panels (a) and (b), of the prior distributions for ρ; x-axis: "Reduction in the chance of side effects".]

Fig. 4.24 Prior assumptions on the reduction factor for the chance of side effects, ρ. Panel a assumes
an “enthusiastic” prior, where the new chemotherapy is assumed to be on average better than the
standard of care, while allowing for a possibility that this is actually not the correct case; panel b
shows a “sceptical” prior, under which the new chemotherapy is assumed to be on average just as
effective as the standard of care

    A[t] ~ dbin(gamma,SE[t])
  }

  # Priors for clinical parameters
  pi[1] ~ dbeta(1,1)
  rho ~ dnorm(m.rho,tau.rho)
  tau.rho <- pow(sigma.rho,-2)
  pi[2] <- rho*pi[1]
  gamma ~ dbeta(1,1)

  # Priors for the costs
  c.amb ~ dlnorm(4.77,0.17)
  c.hosp ~ dlnorm(8.60,0.18)

  # Prediction for the whole target population
  for (t in 1:2) {
    # Estimated patients with side effects & ambulatory care
    SE.pred[t] ~ dbin(pi[t],N)
    A.pred[t] ~ dbin(gamma,SE.pred[t])

    # Defines economic output
    mu.e[t] <- 1-pi[t]
    mu.c[t] <- c.drug[t]*(1-pi[t]+pi[t]*(1-gamma)+pi[t]*gamma)+
      pi[t]*(1-gamma)*c.hosp+pi[t]*gamma*c.amb
  }
}
We allow for the two different formulations of the model by passing two sets of
values for the parameter m.rho (set to 0.8 and 1 in the two cases). In this particular
case, we are assuming that the standard deviation sigma.rho does not vary in the two
scenarios, but of course this assumption could (and perhaps should) be relaxed.
The two models can be run using the R2jags package and the results may be stored
in R objects, say chemo_enth and chemo_scep, both of class rjags and containing
the results of the MCMC simulations.4
> # Loads the model results from the BCEA website
> library(BCEA)
> load("http://www.statistica.it/gianluca/BCEA/chemo_PSA.Rdata")

> # Stores the two model files in the object 'models'
> models <- list(chemo_enth,chemo_scep)

> # Defines effects & costs as lists
> e <- list(chemo_enth$sims.list$mu.e,chemo_scep$sims.list$mu.e)
> c <- list(chemo_enth$sims.list$mu.c,chemo_scep$sims.list$mu.c)

> # Performs structural PSA using BCEA
> ints <- c("Standard of care","New chemotherapy")
> m_enth <- bcea(e[[1]],c[[1]],ref=2,interventions=ints)
> m_scep <- bcea(e[[2]],c[[2]],ref=2,interventions=ints)
> m_avg <- struct.psa(models,e,c,ref=2,interventions=ints)
In the code above, first we upload the results from the two models which are stored
in the remote server. Then we create an R list in which we include the two objects
chemo_enth and chemo_scep. The simulated values for the economic output defined
in the model code in the nodes mu.e and mu.c can be accessed in list format,
via the sims.list element inside the two model objects. For simplicity, we save these
in two new objects e and c—notice that these will be lists containing two elements
each, one from the “enthusiastic” and one from the “sceptical” model.
With these variables we can now perform basic analysis for each of the two models;
we do this by calling the function bcea and applying it to the relevant elements of
e and c—the notation e[[1]] instructs R to consider the first element of the object e
(i.e. the simulated values for the effectiveness variable for the “enthusiastic” model,
which is the first to be included in the object e). Similarly, e[[2]] indicates the second
element (associated with the “sceptical” model).
Finally, we can use the BCEA function struct.psa and perform the structural PSA.
This function takes the list of models and the two lists including the measures of
effectiveness and costs as arguments. The other (optional) arguments have the same

4 We provide these in an R file which is downloadable from the website
http://www.statistica.it/gianluca/BCEA/chemo_PSA.Rdata.
4.4 PSA Applied to Model Assumptions and Structural Uncertainty 149

format as the standard call to the function bcea and allow the user to specify the
reference intervention and a vector of labels.
The resulting object m_avg is a list, whose elements can be explored by typing the
following commands
> # Lists the elements of the object m_avg
> names(m_avg)
[1] "he"  "w"   "DIC"

> # Shows the weights given to each model
> m_avg$w
[1] 0.4347864 0.5652136

> # Shows the DIC associated with each model
> m_avg$DIC
[1] 23.34652 22.82182
The element m_avg$w is the vector of the weights associated with each of the models,
as defined in (4.6). In this case, the two models show relatively similar levels of fit
in terms of the DIC and thus the two values w1 and w2 are relatively close to 0.5,
although, in light of the observed data, the sceptical model seems to be preferred and
thus associated with a slightly higher weight. In other words, the evidence is perhaps
not strong enough to support the bold claim that the new drug is substantially more
effective than the standard of care.
This is consistent with the analysis of the two DICs, which can be explored
from the element m_avg$DIC; the values are fairly close, although the one for the
“sceptical” model is lower, indicating a (slightly) better fit to the observed data.
The element he stored in the object m_avg is an object of class bcea. Thus,
we can apply any BCEA command to the object m_avg$he; for example we could
type plot(m_avg$he), or contour2(m_avg$he). Figure 4.25 shows a comparison of
the results from the three models. Panels (a) and (b) present the contour plot for the
two original models, while panel (c) depicts the results of the economic evaluation
for the model average. In the latter, the value of the ICER is some sort of compromise
between the values of the two original models, while the variability of the posterior
distribution p(Δ_e, Δ_c) appears to be reduced.
Figure 4.26 shows a comparison of the three models in terms of the CEAC in
panel (a) and of the EVPI in panel (b). Uncertainty in the model average has a
lower impact on the decision-making process, as the CEAC is higher and the EVPI is
lower for the model average. This is true for all the selected values of the willingness
to pay, k.

Fig. 4.25 Comparison of the three models. Panel a shows the cost-effectiveness plane for the
“enthusiastic” model, while panel b considers the “sceptical” model and panel c depicts the model
average result. In this case, the contour in panel c shows lower variability as it is more tightly centred
around the mean (i.e. the estimated ICER)

The figures can be obtained by typing the following commands in the R terminal.
> # Plots the CEACs
> plot(m_avg$he$k,m_avg$he$ceac,t="l",lwd=2,xlab="Willingness to pay",
+   ylab="Probability of cost effectiveness")
> points(m_enth$k,m_enth$ceac,t="l",lty=2)
> points(m_scep$k,m_scep$ceac,t="l",lty=3)
> legend("bottomright",c("Model average","Enthusiastic model","Sceptical model"),
+   bty="n",lty=c(1,2,3),cex=.8)

[Figure 4.26 about here: panel (a) plots the probability of cost effectiveness (CEAC) and panel (b) the expected value of information (EVPI) against the willingness to pay (£0–50,000), each comparing the model average, the enthusiastic model and the sceptical model.]

Fig. 4.26 Comparison of the three models. Panel a shows the CEACs computed for the “enthu-
siastic” (dashed line), “sceptical” (dotted line) and the model average (solid line), while panel b
shows the analysis of the expected value of information in the three cases. In this case, the model
average is also associated with lower impact of uncertainty in the final decision, as indicated by the
higher value of the CEAC (for all k considered) as well as the lower value of the EVPI

> # Plots the EVPIs
> rg <- range(m_avg$he$evi,m_enth$evi,m_scep$evi)
> plot(m_avg$he$k,m_avg$he$evi,t="l",lwd=2,xlab="Willingness to pay",
+   ylab="Expected value of information",ylim=rg)
> points(m_enth$k,m_enth$evi,t="l",lty=2)
> points(m_scep$k,m_scep$evi,t="l",lty=3)
> legend("bottomright",c("Model average","Enthusiastic model","Sceptical model"),
+   bty="n",lty=c(1,2,3),cex=.8)

References

1. K. Claxton, J. Health Econ. 18, 342 (1999)
2. G. Baio, Bayesian Methods in Health Economics (Chapman Hall/CRC Press, Boca Raton, FL,
2012)
3. G. Baio, A.P. Dawid, Stat. Methods Med. Res. (2011). doi:10.1177/0962280211419832
4. B. van Hout, M. Gordon, F. Rutten, Health Econ. 3, 309 (1994)
5. M. Strong, J. Oakley, Med. Decis. Making 33(6), 755 (2013)
6. M. Sadatsafavi, N. Bansback, Z. Zafari, M. Najafzadeh, C. Marra, Value Health 16(2), 438
(2013)
7. M. Strong, J. Oakley, A. Brennan, Med. Decis. Making 34(3), 311 (2014)
8. A. Heath, I. Manolopoulou, G. Baio, Statistics in Medicine (2016). doi:10.1002/sim.6983,
http://onlinelibrary.wiley.com/doi/10.1002/sim.6983/full
9. T. Hastie, R. Tibshirani, Generalized Additive Models, vol. 43 (CRC Press, 1990)
10. C. Rasmussen, C. Williams, Gaussian Processes for Machine Learning (2006). ISBN:
026218253X

11. A. Heath, I. Manolopoulou, G. Baio, A review of methods for the analysis of the expected
value of information (2015). ArXiv e-prints arXiv:1507.02513
12. H. Rue, S. Martino, N. Chopin, J. R. Stat. Soc. B 71, 319 (2009)
13. F. Lindgren, H. Rue, J. Lindström, J. R. Stat. Soc. Ser. B (Stat. Methodol.) 73(4), 423 (2011)
14. A. Briggs, M. Weinstein, E. Fenwick, J. Karnon, M. Schulpher, A. Paltiel, Value Health 15,
835 (2012)
15. G. Box, N. Draper, Empirical Model Building and Response Surfaces (Wiley, New York, NY,
1987)
16. D. Spiegelhalter, K. Abrams, J. Myles, Bayesian Approaches to Clinical Trials and Health-Care
Evaluation (Wiley, Chichester, UK, 2004)
Chapter 5
BCEAweb: A User-Friendly Web-App to Use BCEA

5.1 Introduction

In this chapter, we introduce BCEAweb, a web interface for BCEA. BCEAweb is a web
application aimed at everyone who does not use R to develop economic models
and wants a user-friendly way to analyse both the assumptions and the results of
a health economic evaluation. The results of any probabilistic model can be very
easily imported into the web-app, and the outcomes are analysed using a wide array
of standardised functions. The chapter will introduce the use of the main functions
of BCEAweb and how to use its capabilities to produce results summaries, tables and
graphs.
The interface allows the user to produce a huge array of analysis outputs, both
in tabular and graphical form, in a familiar environment such as a web page. The
only inputs needed are the outputs of a probabilistic health economic model, be
it frequentist (based, for example, on bootstrapping techniques) or fully Bayesian.
It also includes functionalities to produce a full report and analyse the inputs of a
probabilistic analysis to test the distributional assumptions.
Throughout this chapter, we will make use of the two examples used so far to show
the functionalities of BCEA. The Vaccine and Smoking Cessation models introduced
in the previous chapters will be used as practical examples of how to use BCEAweb
in real-world examples.

5.2 BCEAweb: A User-Friendly Web-App to Use BCEA

The vast majority of health economics models are built in MS Excel. This is because
the users of these models are familiar with the software, and it is accepted by virtually
all health authorities and decision makers across the world, including the National

This chapter was written by Gianluca Baio, Andrea Berardi, Anna Heath and Polina
Hadjipanayiotou.

© Springer International Publishing AG 2017 153


G. Baio et al., Bayesian Cost-Effectiveness Analysis with the R package BCEA,
Use R!, DOI 10.1007/978-3-319-55718-2_5

Institute for Health and Care Excellence (NICE), the Pharmaceutical Benefits Advi-
sory Committee (PBAC) and the Canadian Agency for Drugs and Technologies in
Health (CADTH). While models programmed in Excel are presented in a familiar,
user-friendly fashion, the software itself can prove to be a limitation when building
complex models. The intricate wiring and referencing style of these models is often the
cause of programming errors. Very often models rely on Visual Basic for Applica-
tions (VBA) for Excel for complex procedures.
We acknowledge that Excel models are usually sufficient to demonstrate the
value for money of new (or existing) technologies. However, often the presentation
of the results is lacking, and the calculation of more complex quantities is inefficient
(e.g. the CEAC for multiple comparators) or not feasible at all (e.g. the EVPPI), mostly
because VBA lacks the mathematical or statistical capabilities and flexibility of a
language such as R. BCEAweb is aimed at researchers and health economists who would
like to expand the scope of their analyses without re-building models from scratch,
programming additional analyses of the outputs and, perhaps most importantly, doing
so without using R.
The main objective of BCEAweb is to make all the functionalities included in BCEA
available and easy to use for everyone, without writing a single line of code. The
programme was inspired by the Sheffield Accelerated Value of Information (SAVI)
web-app [1], which can be accessed at the webpage http://savi.shef.ac.uk/SAVI/.
The focus of SAVI is the research on the methods of calculation of the expected
value of (partial) information developed by Strong and colleagues (and presented
in Sect. 4.3.3.2), and it also offers facilities to calculate cost-effectiveness summary
measures. On the other hand, the purpose of BCEAweb is to offer an easy and stan-
dardised way to produce outputs from a health economic evaluation, with the EVPPI
among them. It should also be noted that BCEAweb includes an EVPPI calculation
method which is faster than the one implemented in SAVI for multidimensional
problems, based on the work on the EVPPI presented in [2].
The strength of BCEAweb is that it allows many different input formats, includ-
ing csv files obtained from spreadsheet calculators (e.g. MS Excel), OpenBUGS/
JAGS and R itself. The outputs can be saved either individually from the web-app, or
by exporting a modular summary in Word or pdf format, which also includes brief
and flexible interpretations of the results. The report is modular, allowing users to
choose the sections to be included.

5.2.1 A Brief Technical Overview of BCEAweb

The web application BCEAweb is developed entirely in R-Shiny, a web application
framework for R [3], and is available at the web page https://egon.stats.ucl.ac.uk/
projects/BCEAweb/. BCEAweb connects the user to an R server so that the main
functionalities of BCEA are quickly and easily available using a familiar web interface.
This makes it possible to build an interactive application directly from R. A typical
shiny application server is mainly based on two R scripts, the files server.R and

ui.R. The first includes all the functions and R commands to be run, while the latter
contains the code building the user-interface controlling the layout and appearance
of the web application. These are managed server-side, producing a web page relying
on HTML, CSS and Javascript without the programmer having to write in languages
other than R. Additionally the web-app relies on additional files used in functions
such as report generation, which are localised and accessed only on the server.
On the client side, a modern web browser supporting Javascript is capable of
displaying the web-app. When accessing it through the internet, all the calculations
are performed by the server, so that even the more demanding operations do not rely
on the user's device. The application can also be run locally, when a connection
is not available or when potentially sensitive data cannot be shared on the internet,
but obviously in this case the execution is performed on the users’ machine.
Any modern browser is able to display BCEAweb: both remotely, accessing it on
the internet, and locally, by running it through R. The R package is needed only
if running the web-app locally, while an internet connection is required to access
BCEAweb remotely.

5.2.2 Note on Data Import

As already mentioned, the output of an economic model can be imported easily, and
in different formats, into BCEAweb. The functionalities of the web-app require two
different sets of inputs: the simulated (or sampled) values of the parameters, used to
test the distributional assumptions of the PSA, and the PSA results, i.e. the costs and
health effects resulting from each set of simulated (or sampled) parameter values,
used to summarise and produce the results of the probabilistic analysis.
The latter set of values (i.e. the PSA results) is saved by virtually all health
economic models, as it is needed to produce outputs such as the cost-effectiveness
plane. However, the sampled or simulated parameter values are not always saved
in Excel models, as they are usually discarded. Therefore, to use the tools that
check the parametric distributional assumptions and that calculate the EVPPI and
info-rank values, the analyst needs to make sure that the economic model saves
both the simulated parameter values and the PSA results. Details on how to format
the inputs for the web-app are reported in Sects. 5.2.4.1 and 5.2.5.

5.2.3 Introduction to the Interface

At first glance, BCEAweb looks like a regular web page, and it actually is. The
welcome page, first displayed when accessing the web-app, is shown in Fig. 5.1.
This page provides details about what BCEAweb is, how to use it, and how it fits
into the general process of a health economic evaluation. Many hyperlinks, coloured
in orange, are included throughout the text. The tabs at the top of the page can be
clicked to navigate through the different sections of BCEAweb.

Fig. 5.1 The landing page of BCEAweb provides information on the web-app and how to use it. The
buttons at the top of the page are used to navigate through the pages. The web-app is run locally
in the examples pictured throughout this chapter, and thus the address bar refers to the IP address
localhost resolves to
As shown in Fig. 5.1, BCEAweb is fundamentally divided into six main pages, each
described in a section of this chapter. These can be easily accessed from any page
by clicking on the respective label in the navigation bar:
• Welcome: the landing page shown in Fig. 5.1. The welcome page includes explanations
about what BCEAweb is and does, and provides a basic usage guide;
• Check assumptions: this page allows the users to check the distributional
assumptions underpinning the probabilistic sensitivity analysis;
• Economic analysis: where the model outcomes are uploaded and the main settings
are specified. It provides several tools for the analysis of the economic results,
showing them on a cost-effectiveness plane and including other analyses such
as the EIB and the efficiency frontier;
• Probabilistic Sensitivity Analysis: calculates and shows results for tools
commonly used in the probabilistic sensitivity analysis, i.e. the pairwise and
multiple-comparison CEAC and CEAF;

• Value of information: allows the calculation of the EVPI, EVPPI and the info-
rank summary, based on the value of partial perfect information;
• Report: a standardised report, with modular sections, can be created and
downloaded from this page.

5.2.4 Check Assumptions

The first tab of the web-app is called “Check Assumptions”. It is sufficient to click
on its name in the navigation bar to access it. On this page, the user can upload the
simulations of the parameters used in the iterations of the PSA from any probabilistic
form of an economic model. These functionalities are particularly useful to test
easily for violations of distributional assumptions, which might arise, for example,
from miscalculated distribution parameters in models with a very large number of
inputs. Rather than checking the values one by one, the empirical (i.e. observed)
distributions of the sampled values can be inspected to ensure they are correct.
This functionality extends BCEA and is not included in the
R package. The web-app presents the page shown in Fig. 5.2.
The drop-down menu at the top of the grey box on the left of the page allows the
selection of the preferred input data format. The data can be fed into the web-app
in three different formats:
• Spreadsheet, by saving the simulations in a csv (comma-separated values) file.
This is particularly useful if the simulations are produced in a spreadsheet
programme, such as MS Excel, LibreOffice or Google Sheets;

Fig. 5.2 The “Check assumptions” tab before importing any data. The left section is reserved for
inputs and parameters, while outputs are displayed on the right side of the page

• BUGS, if the values of the parameters were obtained from software such as OpenBUGS
or JAGS. The user will be required to upload the index file and a number of CODA
files equal to the number of simulated Markov chains;
• R, by providing an RDS file created by saving an object returned by a BUGS
programme run from R, containing the values of the simulations. The file needs to
be saved using the saveRDS function.
The data need to be saved in a file (or files, in the case of a CODA input from a BUGS
programme) which will be imported into BCEAweb. The content of the files needs to
be formatted in a standardised way so that the web-app can successfully import it.
The data formats for the files are described below.

5.2.4.1 Importing Parameter Simulation Data Files in BCEAweb

The spreadsheet input form is the easiest way to provide inputs if the economic model
is programmed in software such as MS Excel, as the parameters can be easily
exported to a csv file. The data need to be arranged in a matrix with parameters by
column and iterations by row, as shown in Fig. 5.3. The first row is reserved for the
parameter names.
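As an illustration, a toy set of parameter simulations can be written out in exactly this layout with base R; all parameter names and distributions below are hypothetical:

```r
# Hypothetical example: 1,000 PSA iterations of three model parameters,
# arranged as BCEAweb requires (parameters by column, iterations by row).
set.seed(1)
n.sims <- 1000
params <- data.frame(
  p.response = rbeta(n.sims, 10, 40),     # e.g. probability of response
  c.treat    = rgamma(n.sims, 25, 0.05),  # e.g. treatment cost
  u.response = rbeta(n.sims, 30, 10)      # e.g. utility while responding
)
# write.csv puts the parameter names in the first row, as in Fig. 5.3
write.csv(params, file = "psa_parameters.csv", row.names = FALSE)
```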
Fig. 5.3 The data format for csv files to be imported in the “Check assumptions” tab of BCEAweb.
The first row is dedicated to the names of the parameters, which will be used by the web-app to
populate the parameter menus

To import the values of simulations performed in R using a BUGS programme such
as OpenBUGS or JAGS, it is necessary to save the output object into an RDS file. The
output object (e.g. obtained from the jags function) needs to be a list, with
the simulation values located in the element $BUGSoutput$sims.matrix. This
is the default structure produced by the BUGS-to-R connection packages. The object needs to
be saved (with any name) in an external file by using the saveRDS function.
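A minimal sketch of this structure, using a mock object rather than a real jags fit (the element names mirror those described above; the values are invented):

```r
# Minimal mock of the structure returned by R2jags/R2OpenBUGS: the web-app
# reads the simulations from $BUGSoutput$sims.matrix. Values are illustrative.
set.seed(2)
sims <- cbind(beta  = rnorm(500, 0, 1),
              gamma = rgamma(500, 2, 1))
mock.fit <- list(BUGSoutput = list(sims.matrix = sims))

# Save with any file name; BCEAweb expects an RDS file created by saveRDS()
saveRDS(mock.fit, file = "jags_output.rds")

# Reading the file back recovers the same object
identical(readRDS("jags_output.rds")$BUGSoutput$sims.matrix, sims)  # TRUE
```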
If the simulations are performed directly in a BUGS programme, the easiest way
to import the results into BCEAweb is to save the sampled values as CODA files. Any
number of chains can be imported into the web-app. A file for each of the chains, in
addition to the CODA index file, will have to be produced and uploaded into BCEAweb.
Importing a file using either the R or BUGS format will make additional functionalities
available. These are diagnostic tools useful for checking convergence and the
presence of autocorrelation in the series, as described in Sect. 5.2.4.2 below.

5.2.4.2 Using the Check Assumptions Tab

The page will display a loading bar showing the transfer status of the data. As soon
as the input files are imported, the web-app will immediately show a histogram of
the distribution of the first parameter in the dataset, together with a table reporting
summary statistics of the parameter distribution at the bottom of the chart, as depicted
in Fig. 5.4. The variables to be displayed can be picked from the menu on the left
side of the page, either by clicking on the menu and selecting one in the list or by
typing the parameter label in the box, which will activate the search function. The
width of the histogram bars can also be set by varying the number of bins using the
slider at the bottom of the left side of the page. Increasing the number of bins increases
the number of bars (i.e. decreases the number of observations per bar), while
decreasing the number of bins widens the bars and reduces their number.
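The effect of the bin slider can be mimicked in base R with hist(), purely as an illustration on simulated values:

```r
# Illustrative only: how the number of bins changes the histogram display.
set.seed(3)
x <- rbeta(1000, 10, 40)  # hypothetical parameter simulations

h.coarse <- hist(x, breaks = 10, plot = FALSE)  # few, wide bars
h.fine   <- hist(x, breaks = 50, plot = FALSE)  # many, narrow bars

# More bins -> more bars, i.e. fewer observations per rectangle
length(h.fine$counts) > length(h.coarse$counts)  # TRUE
sum(h.coarse$counts) == sum(h.fine$counts)       # TRUE: same data in both
```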
Trace plots can be shown for each parameter distribution by clicking on the respective
button in the navigation bar. This functionality is not very useful when importing
data from a spreadsheet, as it can only show whether there are any unexpected
sampling behaviours dependent on the iteration number. On the other hand, if the
simulations are passed to BCEAweb in a BUGS or R format and the values are sampled
from multiple, parallel chains, these can be easily compared variable by variable
using the trace plots.
Additional tools are available if the input data come from multiple parallel chains and
are imported in the web-app in the BUGS or R format. By choosing one of these two options
in the input format menu, additional navigation buttons will appear compared to
the spreadsheet input option. Clicking on them reveals additional diagnostic tools,
useful for checking the convergence of the Markov chains and the presence of
autocorrelation: the Gelman-Rubin plots and an analysis of the effective sample size and
of the autocorrelation. These are not included when importing data from a spreadsheet,
as electronic models programmed in MS Excel or similar programmes generally do
not make use of parallel sampling from multiple chains. These tools are not discussed
in detail here; interested readers can find a description of these and other diagnostics
in [4].
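As a rough base-R sketch of what these diagnostics measure, the formulas can be hand-rolled and applied to artificial chains (BCEAweb itself relies on the standard implementations; this is only meant to show what is being checked):

```r
# Hand-rolled versions of the diagnostics reported for BUGS/R inputs,
# applied to two simulated "chains" (illustrative data, not a real model).
set.seed(4)
chains <- matrix(rnorm(2 * 1000), ncol = 2)  # 1000 iterations, 2 chains

# Potential scale reduction factor (Gelman-Rubin R-hat)
rhat <- function(ch) {
  n <- nrow(ch)
  W <- mean(apply(ch, 2, var))         # within-chain variance
  B <- n * var(colMeans(ch))           # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)  # close to 1 at convergence
}

# Effective sample size from the autocorrelation function
ess <- function(x) {
  rho <- acf(x, lag.max = 50, plot = FALSE)$acf[-1]
  rho <- rho[cumsum(rho < 0.05) == 0]  # keep leading non-negligible lags
  length(x) / (1 + 2 * sum(rho))
}

rhat(chains)      # ~1 for these independent draws
ess(chains[, 1])  # close to the nominal number of draws
```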

Fig. 5.4 The “Check assumptions” tab after data have been imported in a spreadsheet form.
Additional analysis tools are available when importing simulations in the R or BUGS format (not
shown here)

5.2.5 Economic Analysis

The basic analysis of the cost-effectiveness results is carried out in the “Economic
analysis” tab. On this page, the cost-effectiveness results are uploaded and the
economic analysis begins. Analogously to the “Check assumptions” tab, the user can
upload the results of a probabilistic model in three different formats: spreadsheet,
BUGS and R. The data need to be arranged as explained in the previous section
for each of the available formats, with parameters by column and iterations by row.
The model outputs need to be ordered so that the health outcomes and costs for each
intervention alternate, as shown in Fig. 5.5.
The same data arrangement is required if supplying the values in the R format:
health outcomes and costs need to be provided, alternated for each of the included
comparisons, in a matrix or data.frame object. This object needs to be saved in
RDS format using the saveRDS function for the web-app to be able to read it. Health
outcomes and costs must be provided in the same alternated arrangement also when
using a BUGS programme; one file containing the values for each of the simulated
Markov chains and the CODA index file need to be uploaded onto the web-app.
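A hypothetical example of preparing PSA results in both the spreadsheet and R formats (column names and values are invented; the only requirement is the alternated effectiveness/cost arrangement):

```r
# Hypothetical PSA output for two interventions, arranged as BCEAweb expects:
# effectiveness and cost columns alternated per intervention (cf. Fig. 5.5).
set.seed(5)
n.sims <- 1000
psa <- data.frame(
  e1 = rnorm(n.sims, 8.0, 0.5),   # health outcomes, intervention 1
  c1 = rnorm(n.sims, 1000, 100),  # costs, intervention 1
  e2 = rnorm(n.sims, 8.2, 0.5),   # health outcomes, intervention 2
  c2 = rnorm(n.sims, 1500, 100)   # costs, intervention 2
)
write.csv(psa, "psa_results.csv", row.names = FALSE)  # spreadsheet format
saveRDS(as.matrix(psa), "psa_results.rds")            # R (RDS) format
```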
Fig. 5.5 The data format for spreadsheet files to be uploaded in the “Economic analysis” tab.
The first row is reserved for variable names, which are not used but are included for consistency
with the input format of the “Check assumptions” tab

The “Economic analysis” page, displayed in Fig. 5.6, allows the user to choose
the parameters to be used by the underlying BCEA engine to analyse the results. On
the left side of the page the user can define the grid of willingness to pay (WTP)
thresholds used in the analysis by specifying the threshold range and the value of the
“step”, i.e. the increment from one grid value to the next. The three parameters are
required to have values such that the difference between the maximum and minimum
threshold is divisible by the step. The default values of BCEA produce a 501-element
grid, which is generally fine enough to capture differences in the economic results
conditional on the WTP. Increasing the grid density by decreasing the step value slows
down the computations; therefore, users are advised to change the grid density
carefully. The cost-effectiveness threshold can also be set on this page. This value
acts as the cut-off for the decision analysis and determines whether interventions are
cost-effective by comparing the chosen threshold with the ICER estimated in the
economic analysis. This parameter is used in the functions included in BCEAweb, such
as the cost-effectiveness plane and the cost-effectiveness summary.
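For instance, under the defaults mentioned above (assumed here to be a range of 0 to 50,000 with a step of 100), the grid can be reproduced as:

```r
# WTP grid construction: the range must be divisible by the step, and the
# assumed defaults (0 to 50,000, step 100) yield a 501-element grid.
wtp.min <- 0
wtp.max <- 50000
step    <- 100
stopifnot((wtp.max - wtp.min) %% step == 0)  # divisibility requirement

k <- seq(wtp.min, wtp.max, by = step)
length(k)  # 501
```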
Once the PSA data are uploaded to BCEAweb, additional options become available
on the left of the page. These are the names of the compared interventions, which
can be changed by the user, and the intervention chosen as reference. These options
match the ones used in the bcea function presented in Sect. 3.2. When the data are
uploaded and these additional options are set, the analysis is performed by clicking
on the “Run the analysis” grey button at the bottom of the page. The page will
display the cost-effectiveness summary, as shown in Fig. 5.6.
As soon as the analysis is run, BCEAweb produces the cost-effectiveness analysis
summary, the cost-effectiveness plane, the expected incremental benefit plot and the
cost-effectiveness efficiency frontier. These can be accessed by clicking on the
sub-navigation bar; each of them will display the respective standardised output produced
by BCEA:

Fig. 5.6 The “Economic Analysis” tab once the data have been uploaded and the model run. The
vaccine example was used to produce the results in the figure

• “2.1 Cost-effectiveness analysis” shows the analysis summary
(Sect. 3.3), also included in Fig. 5.6;
• “2.2 Cost-effectiveness plane” includes the cost-effectiveness plane
(Sect. 3.4). The appearance of the graph can be changed by selecting a different
reference intervention and by varying the value of the WTP threshold, which
will interactively (i.e. without the need of re-running the analysis) change the
cost-effectiveness acceptability region;
• “2.3 Expected incremental benefit” shows the EIB plot (presented in
Sect. 3.5), including the 95% credible intervals and the break-even point(s), if
any;
• “2.4 Cost-effectiveness efficiency frontier”, which includes the
efficiency frontier plot and summary table. The efficiency frontier plot is described
in Sect. 3.7.
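The quantities behind these outputs can be sketched in base R for a hypothetical pairwise comparison; this mirrors the definitions used by BCEA (ICER as the ratio of mean increments, EIB as the mean incremental monetary net benefit over the WTP grid), not its actual code:

```r
# Base-R sketch of the "Economic analysis" summary quantities, for invented
# incremental costs and effects of a reference vs a comparator intervention.
set.seed(6)
n.sims  <- 1000
delta.e <- rnorm(n.sims, 0.2, 0.1)   # incremental effectiveness
delta.c <- rnorm(n.sims, 500, 150)   # incremental cost

icer <- mean(delta.c) / mean(delta.e)  # summary-table ICER

k   <- seq(0, 50000, by = 100)         # willingness-to-pay grid
eib <- sapply(k, function(t) mean(t * delta.e - delta.c))  # EIB curve

# Break-even point: the threshold at which the EIB crosses zero (= the ICER)
k.star <- k[which(eib >= 0)[1]]
c(icer = icer, break.even = k.star)
```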

5.2.6 Probabilistic Sensitivity Analysis

The probabilistic sensitivity analysis (PSA) tab of BCEAweb includes tools to
determine the uncertainty associated with the decision-making process. This section
of the web-app is focused on calculating and summarising the probability of
cost-effectiveness of the compared interventions. The calculations in this tab are
based on the data uploaded to the web-app in the “Economic analysis” tab. If
the importing procedure has already been carried out, the results are available to
all tabs of the web-app, and there is no need to re-upload them. In this case the
“Probabilistic sensitivity analysis” page does not require any additional
input, and the graphs will be displayed automatically.
The “Probabilistic sensitivity analysis” tab is structured in three
subsections. Each of them contains a different tool to analyse the probability of cost-
effectiveness of the compared interventions: the pairwise cost-effectiveness accept-
ability curve (CEAC), the multiple-comparison CEAC and the cost-effectiveness
acceptability frontier (CEAF). These tools are described in more detail in Sect. 4.2.2.
The plots are shown automatically if the data have already been imported into the
web-app (if not, these will have to be re-uploaded from the “Economic analysis”
tab). The page will appear as shown in Fig. 5.7. The three curves can be accessed
by clicking on the respective buttons in the navigation sub-menu on the top-left
corner of the page. The pairwise CEAC and the CEAF values can be downloaded
by clicking on the “Download CEAC table” (or “Download CEAF table” for the
frontier). This will let the user download a file in format csv including the values
of the CEAC (or CEAF). Additionally the CEAC or CEAF estimates can be queried
for any given threshold value included in the grid approximation (specified in the
“Economic analysis” tab) directly in the web-app. Changing the wtp value in the
drop-down menu will return the respective probability of cost-effectiveness. However
if multiple interventions are compared, as in Fig. 5.7, the CEAC value shown will
refer to the first curve plotted. In the case shown the CEAC values shown (i.e. 0.6600)
indicates the probability of “Group counselling” being cost-effective when compared
to “No contact”.
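The pairwise CEAC underlying this tab is simply the proportion of simulations in which the reference intervention has the higher monetary net benefit, at each threshold; a base-R sketch on invented increments:

```r
# CEAC(k) = Pr(k * delta.e - delta.c > 0), estimated by the proportion of
# PSA simulations favouring the reference intervention (toy data below).
set.seed(7)
n.sims  <- 1000
delta.e <- rnorm(n.sims, 0.2, 0.5)   # incremental effectiveness
delta.c <- rnorm(n.sims, 500, 300)   # incremental cost

k    <- seq(0, 50000, by = 100)      # willingness-to-pay grid
ceac <- sapply(k, function(t) mean(t * delta.e - delta.c > 0))

ceac[k == 25000]  # probability of cost-effectiveness at a 25,000 threshold
```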

Fig. 5.7 The “Probabilistic sensitivity analysis” tab, showing the cost-effectiveness acceptability
curves for the smoking cessation example

5.2.7 Value of Information

The “Value of information” tab is focused on three tools for the value of infor-
mation analysis:
• Expected value of perfect information, or EVPI, is the monetary value (based on
the monetary net benefit utility function) attributable to removing all uncertainty
from the economic model (and thus from the decision process);
• Expected value of partial perfect information, or EVPPI, is the value attributable
to a reduction in the uncertainty associated with a single parameter or a specific
set of parameters in the economic model;
• Info-rank plot, an extension of the tornado plot. This is useful to assess how
parameters contribute to the uncertainty in the model by ranking them in decreasing
order of the ratio of the single-parameter EVPPI to the EVPI for a given WTP threshold
(shown in Fig. 5.8).
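The EVPI reported by the web-app follows the standard definition, which can be sketched in base R on a hypothetical two-intervention PSA: the expected net benefit attainable under perfect information minus that of the best decision under current information.

```r
# EVPI at a given threshold, computed from invented PSA outputs for two
# interventions (effectiveness and cost matrices, one column each).
set.seed(8)
n.sims <- 1000
eff  <- cbind(rnorm(n.sims, 8.0, 0.5),  rnorm(n.sims, 8.2, 0.5))
cost <- cbind(rnorm(n.sims, 1000, 100), rnorm(n.sims, 1500, 200))

k  <- 25000            # willingness-to-pay threshold
nb <- k * eff - cost   # monetary net benefit, per simulation and intervention

# Perfect information: pick the best option in every simulation;
# current information: pick the option with the best expected net benefit.
evpi <- mean(apply(nb, 1, max)) - max(colMeans(nb))
evpi                   # always non-negative
```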
The EVPI tab works in a similar fashion to the CEAC, as it does not require
any data other than the PSA results imported into the web-app in the “Economic
analysis” tab. Analogously to the CEAC, the values can be explored by using the
drop-down menu on the page, as well as downloaded in a csv file by clicking on the
“Download EVPI table” button.
The “Info-rank” and “EVPPI” tabs require that the parameter simulations, uploaded
in the “Check assumptions” tab, and the probabilistic economic results, uploaded
in the “Economic analysis” tab, are both available. If these are not available in BCEAweb,
it will not be possible to calculate the EVPPI and the info-rank table. Once both
the parameter values and economic results are correctly imported, the parameter
selection menus in the “Info-rank” and “EVPPI” tabs (labelled “Select relevant
parameters” and “Select parameters to compute the EVPPI”, respectively)
will display a list of the model parameters. These can be either selected from the
drop-down menu or searched by typing the parameter labels into the menu field. Any
number of parameters can be selected; choosing “All parameters” selects all of
them at the same time. The analyses are run by selecting the parameters and clicking
on the grey “Run” button at the bottom of the page. In addition, the EVPI, EVPPI
and Info-rank tables of values can be downloaded in csv format by clicking on the
grey “Download” button.
The EVPPI tab in BCEAweb includes alternative methods for the estimation of
the EVPPI, and allows for fine-tuning the parameters of the procedure based on the
chosen methodology. In addition to a plot comparing the full-model EVPI and the
EVPPI for the selected parameter (or set of parameters), BCEAweb also includes a
“Diagnostic” tab, which can be accessed from the EVPPI page. It includes residual
plots for both the costs and the effectiveness, so that the user can check for unexpected
behaviours in the model fit which might make the EVPPI estimates unreliable. BCEAweb
allows users to check the reliability of the estimation and also provides the instruments
to address simple issues: for example, non-linearities in the distribution of the
residuals can be addressed by increasing the interaction order when using the INLA-SPDE
estimation method.
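A base-R sketch of the regression idea behind the single-parameter EVPPI [1], together with the kind of residual check the “Diagnostic” tab offers; the data are a toy model, and loess stands in for the GAM/INLA-SPDE regressions actually used:

```r
# Regression-based single-parameter EVPPI: fit E[INB | theta] by smoothing,
# then take the expected positive part of the fitted values (toy example).
set.seed(9)
n.sims <- 1000
theta  <- rnorm(n.sims)                          # parameter of interest
inb    <- 5000 * theta + rnorm(n.sims, 0, 2000)  # incremental net benefit

fit <- loess(inb ~ theta)   # smooth estimate of E[INB | theta]
g   <- fitted(fit)

# Two-intervention EVPPI: E[max(0, E[INB|theta])] - max(0, E[INB])
evppi <- mean(pmax(g, 0)) - max(0, mean(g))

# Residual check: no structure left means the regression captured the signal
res <- residuals(fit)
cor(res, theta)   # should be close to 0 for an adequate fit
```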

Fig. 5.8 The Info-rank plot for the vaccine example, including all parameters, at a threshold of
25,000 monetary units. While the psi.5 parameter stands out compared to the other variables, the
ratio of its single-parameter EVPPI to the total EVPI is below 1%

5.2.8 Report

The “Report” tab allows users to download the outputs produced by BCEAweb in a
standardised report in either pdf or Microsoft Word format. The list of sections
is displayed on the page, and checking the respective box will include that output
section in the report produced by the web-app. It is worth noting that the correct data
need to be available for each selected section to be produced correctly. The outputs
included in the report will depend on the parameters specified in the BCEAweb tabs for
each of the selected sections, e.g. the comparator names, the variables included in the
EVPPI analysis, etc.
To download the report from BCEAweb, it is sufficient to select the required sections,
choose the preferred output format and click on the grey “Download”
button at the bottom of the page. The report will be generated and downloaded; please
note that in some internet browsers, depending on the user settings, the report might
be displayed as a new web page. An example of a report generated by the web-app
for the vaccine example is shown in Fig. 5.9. The cost-effectiveness plane and cost-
effectiveness acceptability curve sections were selected to produce the report for the
vaccine example.

Fig. 5.9 The standardised report produced by BCEAweb for the vaccine example. This report was
obtained by selecting the cost-effectiveness plane and CEAC sections only

References

1. M. Strong, J. Oakley, A. Brennan, Medical Decision Making 34(3), 311 (2014)
2. A. Heath, I. Manolopoulou, G. Baio, Statistics in Medicine (2016). doi:10.1002/sim.6983, http://onlinelibrary.wiley.com/doi/10.1002/sim.6983/full
3. W. Chang, J. Cheng, J. Allaire, Y. Xie, J. McPherson, Shiny: Web Application Framework for R (2015), http://CRAN.R-project.org/package=shiny
4. G. Baio, Bayesian Methods in Health Economics (Chapman Hall/CRC Press, Boca Raton, 2012)
Index

B
Bayesian inference, 2, 10
Bayesian modelling
  parameters, 3, 4, 19, 94–96, 109, 111, 114, 137
  posterior, 7, 9, 37, 96, 149
  prior, 3, 4, 7, 49
BCEA
  function, 62, 63, 66, 69, 89, 92, 117, 120, 126, 135, 141, 149
  installation, 120
  object, 62, 65, 66, 74, 76, 81, 83, 84, 97, 101, 102, 104, 111, 112, 120, 121, 136
  plot, 66, 69, 74, 84
  summary, 72, 97, 101
BCEAweb
  data input, 154
  produce report, 155, 159, 161, 165

C
Case studies
  smoking, 43, 63, 68, 69, 72, 73, 75, 77, 83, 91, 105, 107, 108, 112, 113, 124, 126, 138, 163
  vaccine, 27, 63, 65, 66, 71–73, 75, 82, 85, 87, 115, 127, 135, 136
CEAC
  ceac.plot, 87
  multi-decision, 99, 101, 109, 149
Computation
  Bayesian, 2, 9, 10, 95
  R2jags, 154, 158
  R2OpenBUGS, 24, 154, 158
Cost-effectiveness
  acceptability curve (CEAC), 67, 71, 74, 79, 87, 88, 99, 100, 103, 163
  ceac.plot, 87, 101, 102, 104
  efficiency frontier (CEEF), 90
Cost-effectiveness plane
  ceplane.plot, 15, 16, 70, 74, 77, 84, 86
  contour, 84, 85
  contour2, 87

D
Decision analysis, 17, 20, 43, 96, 161
Deviance
  deviance information criterion (DIC), 144
Distribution
  beta (dbeta, betaPar), 4, 5, 9, 29, 36
  binomial (dbin), 4
  gamma (dgamma), 6–8
  lognormal (lognPar, dlnorm), 29, 144
  normal (dnorm), 5, 6, 84
  triangular, 54
  uniform, 39
  uniform (dunif), 7

E
Efficiency frontier
  ceef.plot, 89
  mce.plot, 104, 108, 109
Expected incremental benefit (EIB)
  eib.plot, 83, 84
Expected value of perfect information (EVPI)
  evi.plot, 112
  opportunity loss (OL), 115
Expected value of perfect partial information (EVPPI)
  createinputs, 120, 125, 137
  diag.evppi, 126, 130
  evppi, 120
  Gaussian process, 117
  generalised additive models (GAM), 117, 131
  info.rank, 135
  integrated nested laplace approximation (INLA), 120
  mixed strategy, 132
  residuals, 126
  single-parameter, 132
  stochastic partial differential equations (SPDEs), 120

G
Ggplot, 69, 70, 76–78, 84, 86, 105, 108

H
Health economic evaluation, 1, 2, 7, 10, 12, 13, 16, 20, 21, 23, 62, 63, 71, 89, 97, 144, 153–155

I
Incremental benefit (IB)
  ib.plot, 81
Incremental cost-effectiveness ratio (ICER), 14–16, 70, 72–77, 81, 82, 84, 87, 89, 91, 93, 94, 96, 97, 100, 149, 161

M
Markov Chain Monte Carlo
  convergence, 40, 159
  Gibbs sampling, 10, 125
Monetary net benefit, 14, 80, 141

P
Probabilistic sensitivity analysis
  PSA, 94, 96, 109, 113, 114, 116, 120, 121, 125, 132, 134, 155, 161
  table, 96
Probability
  posterior, 3, 51, 144
  prior, 7, 145

Q
Quality adjusted life years (QALYs), 103

R
R
  ggplot2, 61
  INLA, 120
  R2jags, 154
  R2OpenBUGS, 154
  shiny, 154
  stats, 154
  triangle, 54, 128
  workspace, 18, 34, 39, 53
Risk aversion, 141–143

S
Structural uncertainty
  CEriskav, 141
  createinputs, 125
  mixedAn, 138
  PSA, 94
  struct.psa, 149

W
Willingness to pay (wtp), 72, 98, 160
