
Journal of Evaluation in Clinical Practice ISSN 1356-1294

Causality, mathematical models and statistical association: dismantling evidence-based medicine

R. Paul Thompson BA MA PhD


Professor, Institute for the History and Philosophy of Science, and Department of Ecology and Evolutionary Biology, University of Toronto,
Toronto, Ontario, Canada

Abstract
From humble beginnings, largely at the medical school at McMaster University, Canada, the evidence-based medicine (EBM) movement has enjoyed a spectacular rise in international acceptance over the last 25 years. Randomized controlled trials (RCTs) and systematic reviews based on them have pride of place (the gold standard) in EBM’s hierarchy of evidence; models and theories are relegated to the bottom of the hierarchy. In the last decade, RCTs have been extensively criticized. I briefly rehearse those criticisms because they are an important backdrop to the criticism of EBM developed in this paper. In essence, the argument developed here is that RCTs use mathematics solely as a tool of analysis rather than as the language of the science and that this fundamentally affects the validity of causal claims. As EBM gives pride of place to RCTs and devalues theoretical models – a devaluation that would be incomprehensible to a physicist or biologist – the validity of EBM’s causal claims and knowledge claims is weak and far from a ‘gold standard’.

Keywords
causality, evidence-based medicine, mathematical model, randomized controlled trial, scientific theory

Correspondence
Professor R. Paul Thompson
Institute for the History and Philosophy of Science and Technology
Victoria College
University of Toronto
91 Charles Street West
Toronto, ON
Canada M5S 1K7
E-mail: p.thompson@utoronto.ca

Accepted for publication: 8 January 2010

doi:10.1111/j.1365-2753.2010.01383.x

I. Introduction

Sections II through VI of this paper build a foundation for the central thesis, expounded in section VII, that the construction of scientific theories is essential to scientific knowledge (i.e. understanding the properties of things within a domain of science and their dynamical behaviour). The construction of scientific theories requires that mathematics be employed as the language of science and not merely as a tool of analysis. Evidence-based medicine (EBM), resting as it does on randomized controlled trials (RCTs) as the gold standard of evidence, places scant, if any, emphasis on constructing theories and uses mathematics as a mere tool of data analysis and experimental design.

In section II, I highlight that EBM is a slogan and, like all slogans, it needs to be unpacked and analysed. In section III, I do some of that unpacking and analysis. I do this in reverse order beginning with the concept of ‘medicine’, then examining the concept ‘based’, and finally moving to ‘evidence’, the concept on which this paper focuses. As RCTs are foundational for EBM and an important focus of my criticism of EBM, section IV provides a brief overview of RCTs to ensure that a reader and I have the same essential understanding. At the heart of RCTs is the probability calculus (and, built upon it, statistics). Section V examines aspects of probability and its interpretations that are relevant to my thesis. Specifically, it distinguishes mathematical calculi (systems) from their interpretations. This is crucial in the case of probability because there are at least four much debated interpretations. RCTs rely on a contentious interpretation, the frequency interpretation. The reliance of RCTs on the frequency interpretation lays bare that the mathematics is being used merely as a tool of analysis. In section VI, I illustrate the difference between using mathematics as the language of a science and using it as a mere tool of analysis. I do this by looking at Mendel’s work on hybridization – work that formed the foundation for modern population genetics. I show that Mendel began his work by using mathematics as a tool of analysis but made a crucial transition to using mathematics as the language in which to describe entities and their behaviours. It was this transition that resulted in a powerful and robust account of the dynamics of hybridization; a dynamics that provided a rich and deep understanding of the phenomena being observed. In section VII, I draw the thread of this journey together by focusing on the nature and role of theories in science and the ways in which RCTs, and, by association, EBM, provide a non-theoretical and impoverished account of the phenomena they study. I conclude with some suggestions for the kinds of

© 2010 Blackwell Publishing Ltd, Journal of Evaluation in Clinical Practice 16 (2010) 267–275 267

research directions that might shed light on why, despite compelling and unanswered criticisms, RCTs are still regarded by regulators and advocates of EBM as the gold standard of evidence.

II. The power and the poverty of slogans

A good slogan is a rhetorically powerful tool. In some of the most successful cases, the slogan derives its rhetorical power from the apparent unreasonableness of accepting its contradiction. Two recent examples are ‘smart growth’ related to the increasing size of cities, and ‘sustainable development’ related to a range of economic issues – poverty reduction, increases in agricultural yield and the like. Who is likely to object to housing and feeding people and to relieving poverty in developing countries? And, who would be cavalier enough to suggest that growth should be stupid rather than smart or development should be unsustainable rather than sustainable? EBM is yet another example of a rhetorically powerful slogan. Who in their right mind would suggest that medicine should not be based on evidence? As with all such slogans, however, the substantial issues lie in the meanings of the terms used. What exactly, for example, does ‘smart’ or ‘sustainable’ mean in the specific context?

III. Evidence-based medicine: the concepts

Each of the three terms in ‘EBM’ is complex and problematic. Consider first the term ‘medicine’; the domain of medicine is vast, encompassing, for example, research domains such as molecular genetics, immunology, physiology and endocrinology and such clinical domains as family practice, dermatology and paediatrics. The methodology, experimental techniques, uses of models, etc. in medical molecular genetics are similar – some might argue identical – to those found in general molecular biology; in these fields RCTs play a very small and limited role – but more of that later. Clinical areas of medicine such as family practice are strikingly different; prophylaxis, diagnosis, therapy and prognosis rely heavily on clinical experience, skill in differential diagnosis, and RCTs. Given these differences, what ‘medicine’ is being referred to in ‘evidence-based medicine’?

Clinical medicine is clearly identified as the target of EBM by the movement’s founders, which leaves a significant and fundamental portion of medicine out of the EBM domain; one, of course, would expect those other areas to involve evidence but, for the doctrine of EBM, they fall outside its purview – as will be clear when we turn to what EBM means by ‘evidence’. Things are still pretty murky, however, because clinical medicine itself has many faces and fuzzy boundaries. Family practice seems a reasonable icon of clinical medicine but what about ophthalmology that has a clear clinical component but also involves surgical techniques and contributes to, and draws on, research in optical theory and cell biology? Areas of clinical medicine such as ophthalmology and internal medicine are a mixture of clinical practice, surgical (mechanical) intervention and research biology.

Within the ideology of EBM, ‘medicine’ appears restricted to the patient–doctor interaction, specifically the domains of diagnosis and therapeutic intervention, and even more specifically therapeutic interventions involving pharmaceuticals, lifestyle modifications and the like. With these clarifications, the scope of EBM within the larger domain of medicine has shrunk considerably.

The second term in EBM, ‘based’, is also not without conceptual problems. Many areas of clinical medicine – such as ophthalmology, gynaecology and internal medicine – are an inextricable mixture of clinical practice and research in human biology. In these domains, talk of one being ‘based’ on the other is conceptually puzzling. If evidence and clinical practice are inextricably interconnected, one cannot, in any non-artificial sense, be based on the other. Hence, either EBM employs a muddled and puzzling concept of ‘based’, or the scope of EBM has to be narrowed even further.

The most significant issues with EBM, however, are not with the B and M but the E; it is that concept that occupies the rest of this paper. The superficial meaning of ‘evidence’ in EBM is clear. Below the surface, however, lurk problems of clarity, and problems with its metaphysical, epistemological, logical and mathematical assumptions. EBM has an explicit ‘Hierarchy of Evidence’. Pride of place in this scheme is given to RCTs. In a recent compendium on EBM [1], evidence is organized into four primary levels (A–D); within the primary levels A and B are sublevels (e.g. A: level 1b). RCTs are ranked as A: level 1b, and systematic reviews (SR) – with homogeneity – are ranked as A: level 1a – the very top. At the bottom of the levels of evidence (D) is, ‘expert opinion without explicit critical appraisal, or based on physiology, bench research or “first principles” ’ [emphasis added] [1]. As systematic reviews supervene on individual RCTs, individual RCTs are the ‘gold standard’ of evidence. This stands in stark contrast to sciences such as physics, chemistry and biology.

The contention of this paper is that the foundation of the EBM edifice has significant engineering flaws; the entire epistemological, logical and mathematical foundations have cracks and fissures and it has been poured, not on the bedrock claimed, but on sand. Although what I identify as a crack and fissure is different from, but additive to, what other critics have identified, we all share the view that the engineering flaws are profound and troubling. What is remarkable about EBM is its continuing lure in the face of relentless and compelling criticism.

A large and growing body of literature has identified flaws and weaknesses with EBM and particularly with its enshrining of RCTs as the gold standard of evidence in medicine (medicine, of course, as narrowly circumscribed above). For example, Salsburg has argued that the current application of probability is mathematically and philosophically flawed [2]. Kravitz et al. have argued that the heterogeneity of treatment effects is profound and not incorporated into the interpretation of data or in patient application [3]. Ashcroft has argued that the concept and nature of ‘clinical effectiveness’ is fundamentally unclear [4]. Schaffner has argued that the concept of ‘causality’ in RCTs is unclear [5]. Howson and Urbach have exposed numerous problems with the frequentist interpretation of probability, the lack of clarity about randomness and the lack of justification for the requirement of randomization in RCTs – as also has Bennett [6,7]. All these critics offer remedies; surprisingly neither the criticisms nor the remedies seem to have had any impact on the doctrines of EBM. In the first decade of the 21st century, the criticisms have continued to mount (again without discernible impact on EBM); Upshur, Worrall, Bluhm,


Borgerson and Cartwright, to mention a few key people, have provided critical appraisals [8–13].

IV. Randomized controlled trials

The mantra of RCTs owes a great deal to Ronald A. Fisher who held that randomization, control and repeatability were jointly sufficient conditions for the assertion of a causal connection. If a target population was randomized into two groups, one of which received an intervention while the other did not (the control group) and the experimental circumstances were carefully managed and if, when the experimental circumstances were repeated, the same outcome occurred, then one could assert absolutely that the intervention caused the specific outcome. According to this view, randomization ensures that any potentially confounding factors (factors, other than the intervention itself, that might affect the outcome) are equally distributed in the two groups. Hence, the assumption is that the only difference between the two groups is the intervention. The control group serves as the required comparator – the outcome in the absence of the intervention (the putative cause). Repetition of the trial increases confidence that there are no distorting artefacts of the design, conduct, data collection and analysis of the trial. It also increases confidence in the assumption of a structural identity of the intervention and control groups. Contemporary RCTs add ‘blinding’ to the mix. Most commonly RCTs are ‘double blind’, which means neither those conducting the experiment nor those in the two groups (commonly referred to as ‘arms’ of the study) know which group is receiving the intervention. In some cases, again without the knowledge of those conducting the experiment or those in the two groups, the groups are switched partway through the trial: control becomes intervention, intervention becomes control. There are other variants as well but this will suffice as capturing the essential and most common features of RCTs. The assumed goal of all this is to uncover causes; however, it does this in name only, as I expose later. But, an equally important feature of the structure of RCTs is to allow the application of mathematical analysis (especially probability and statistics). As a result, this has all the appearance of scientific rigour. A closer examination, however, will uncover some reasons for scepticism.

V. Probability: axiomatizations and interpretations

The now most widely accepted mathematical axiomatization of probability theory is Kolmogorov’s 1933 set-theoretic axiomatization [14]. To the extent that mathematicians almost universally accept his axiomatization, the mathematical theory of probability is a settled matter. Things are considerably less settled when one moves to the interpretation of probability. There are two broad interpretations: an objective interpretation and a subjective interpretation. Each broad interpretation admits of refined interpretations. Before exploring these, however, it is worth pausing to consider what it means for these to be ‘interpretations’ and why this matters to RCTs and therefore to EBM.

The essential distinction between an axiomatization and its interpretation rests on specific understandings of the meaning, and hence application, of the system. For example, game theory can be set out in purely mathematical symbolic form. Abstract axioms can be provided and theorems deduced from them. When interpreted in an economic context, some of the symbolic elements will be interpreted as interacting economic agents in a market structure. In an ecological context, some of the symbolic elements will be interpreted as predators or prey interacting in an ecosystem. In a social psychology context, some of the symbolic elements will be interpreted as interacting members of a social group exhibiting co-operation or deception and so on. The abstract axiomatic system is the same; the interpretation depends on the specific domain of application.

An instructive example of the nature and importance of the distinction between a mathematical theory and its interpretation can be drawn from geometry. Euclid wrote his Elements around 300 BCE. In it (13 books in all), he set out the postulates of geometry [15]. Although a landmark work during his time and for centuries after, by modern standards, his explicit postulates are deemed inadequate. As Boyer points out, Euclid often introduced other postulates into his proofs [16]. Later formulations of geometry provided more rigorous postulates (axioms). One of Euclid’s postulates is: if a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which the angles are less than two right angles (see Fig. 1). This has become known as the parallel postulate.

Figure 1 Euclid’s formulation of the parallel postulate (interior angles x and y on the same side, with x + y < 180°).

A more familiar formulation (a variant on John Playfair’s 1795 formulation) is: given a line and a point not lying on that line, one and only one new line can be drawn through the given point such that the two lines never cross when extended infinitely in both directions (i.e. they are parallel) [17]. Proof of this postulate fascinated and eluded generations of mathematicians from Euclid to the 19th century. During the 1820s, Nikolai Lobachevski and Janos Bolyai independently developed a new ‘non-Euclidean’ geometry by assuming one of the contradictions of the parallel postulate [18,19]. They assumed that more than one line passing through the given point can be parallel to the given line. Subsequently, Eugenio Beltrami proved that this non-Euclidean geometry was consistent if Euclid’s was consistent [20]. Bernhard Riemann, in a lecture in 1854 (later published), indicated that various non-Euclidean geometries were possible [21]. The one that today bears his name assumes another contradiction of the postulate, namely that no lines can be drawn through the given point that are parallel to the given line.

Hence, by the latter part of the 19th century, there were a number of geometries all shown to be internally consistent if Euclid’s is consistent. The importance of this for the focus of this paper lies in its laying bare the need to separate a mathematical calculus from its physical interpretation or application. The standard interpretations of these geometries are given using geodesic models. For example,


one model for Euclidean geometry is the geometry of a plane (a flat sheet of paper for example); one model for Riemannian geometry (no lines can be drawn parallel to a given line through a point outside that line) is the geometry of the surface of a sphere; hyperbolic geometry (there are many lines that can be drawn parallel to a given line through a point outside that line) can be modelled as the geometry of the surface of a hyperboloid.

This array of geometries and their interpretations has yet another dimension. Even after it was shown that consistent non-Euclidean geometries were mathematically valid, and geodesic models for each were articulated, the actual nature of physical space was still deemed to conform to Euclidean geometry. But in 1912–1914, all that changed; space in Einsteinian general relativity is deemed to be Riemannian (non-Euclidean). This is captured well by Hilbert whose development of geometry:

  . . . emphasized that the undefined terms in geometry should not be assumed to have any properties beyond those indicated in the axioms. The intuitive-empirical level of the older geometric views must be disregarded, and points, lines, and planes are to be understood merely as elements of certain given sets. . . . Similarly, the undefined relations are to be treated as abstractions indicating nothing more than a correspondence or mapping. ([16], p. 609)

This brief discourse on geometry is intended to explicate the distinction between (A) a mathematical calculus and its interpretation; and (B) its application, or not, to empirical phenomena. In order to apply a domain of mathematics to the understanding of empirical phenomena – such as the nature of space and the movement of things in it – one must provide an interpretation. Some interpretations are more suited to the task than others. Also, each interpretation brings with it commitments and assumptions (as seen, for example, in (1)–(4) in the next paragraph). This is no small issue in the case of RCTs. The mathematical probability calculus employed in the analysis of RCT data assumes, mostly without justification or even user awareness, an interpretation of the calculus. And allusion to the fact that probability arose in the context of games of chance (i.e. in an empirical context) no more justifies the collapsing of the distinction than does the observation that geometry arose in the context of understanding or resolving empirical matters. There is much hard work to be done to justify any empirical/scientific use of the probability calculus.

Generally, four interpretations of the probability calculus are recognized; these are set out crisply by Donald Gillies [22]:

  1 The logical theory identifies probability with degree of rational belief. It is assumed that given the same evidence, all rational human beings will entertain the same degree of belief in a hypothesis or prediction (John Maynard Keynes held this view). [23]
  2 The subjective theory identifies probability with the degree of belief of a particular individual. Here it is no longer assumed that all rational human beings with the same evidence will have the same degree of belief in a hypothesis or prediction. Differences of opinion are allowed.
  3 The frequency theory defines the probability of an outcome as the limiting frequency with which that outcome appears in a long series of similar events.
  4 The propensity theory, or at least one of its versions, takes probability to be a propensity inherent in a set of repeatable conditions. To say that probability of a particular outcome is p is to claim that the repeatable conditions have a propensity such that, if they were to be repeated a large number of times, they would produce a frequency of the outcome close to p.

RCTs assume, for the most part, the frequency interpretation. The significance of this for my thesis, as will become clear in section VII, lies in the fact that the frequency interpretation assumes that probability is a relationship of events to each other and not a relationship between theory and evidence. The logical theory or subjective theory (usually in a Bayesian guise) is more suited to exploring and justifying the relationship between theory and evidence. A quick look back to the ways Gillies describes the different interpretations will reveal that for (1) and (2), the emphasis is on ‘evidence’ and the rational ‘degree of belief’ that evidence warrants. In the case of (3) and (4), the emphasis is on ‘events’ and ‘outcomes’ and their connections. RCTs relate events to each other so in that respect the adoption of the frequency interpretation in the context of RCTs is defensible. Because RCTs explore the connections among events (such as treatments and outcomes, lifestyle choices and health status, and such), considerable reconceptualization of the nature, purposes and methods of RCTs would be required to make a subjectivist theory such as Bayesian approaches even remotely appropriate. And, moving RCTs into a framework where the essential connection was evidence to belief and not events to events would refocus the purpose and methods to exactly where I will argue they should be: the relationship between theory and evidence. That would not be a pleasing outcome for the EBM hierarchy of evidence because a robust, and therefore explanatory, theory would become the gold standard, not RCTs.

As indicated, the frequency interpretation is suited to RCTs because RCTs explore the relationship between events. As a result, RCTs have been subjected to an extensive body of criticism, a brief indication of which I note here in passing and without amplification of the details of the arguments – those are found in the source documents cited. Howson and Urbach have argued that the frequency interpretation is deeply flawed and that the Neyman–Pearson theory of significance tests commonly employed within it fails to place statistical inductive reasoning on an entirely objective and rational basis [6]. The statistician David Salsburg is more forceful:

  And so, the Neyman-Pearson formulation lays in rubble at our feet. It is an arbitrary construction with no apparent relationship to the needs of clinical research. It rests on the rotten beam of frequentist probability. ([2], pp. 23–24)

Salsburg is no lightweight or ‘ivory tower’ theoretician. He was honoured with the Distinguished Statistician Lifetime Achievement Award from the Pharmaceutical Research and Manufacturers Association for his theoretical and applied contributions to statistics.

Howson and Urbach also argue that the emphasis on the necessity and value of randomization is unwarranted:

  We have argued that randomization does not solve the problem for which it was designed, and, moreover, there are good reasons for not regarding it as an absolute precondition on trials. ([6], p. 154)

Many mathematicians and philosophers have provided arguments along the same lines: RCTs rest on an irreparably flawed frequency interpretation of probability, and the highly touted ‘randomization’ fails to deliver the promised goods. These are indeed damaging criticisms but I think there is a more fundamental


epistemological problem with EBM and RCTs; a problem to which I now turn.

VI. Mathematics in science: lessons from Mendel

Mathematics plays at least two different roles in science. One role is as the language of science. In the same way that English for me is the language in which I express things, for example, ‘I am hungry’, so mathematics is the language in which scientific claims are expressed, for example, V(dX/dt) = I - F1(X) + F2(Y) expresses the rate of insulin change in the Bolie description of insulin-glucose regulation [24]. The other is as a tool of analysis. Failure to appreciate the fundamental epistemological and logical differences between these two roles underlies the deep flaws in EBM and its untenable claims regarding RCTs and causality and their evidential role in science. A straightforward and instructive example of the distinction in the roles of mathematics can be found in the work of Gregor Mendel [25]. Hence, I spend some time in this section elucidating Mendel’s reasoning and methodology; the goal is to expose clearly the distinction I am drawing and which I will be employing in the next section.

Mendel was interested in hybridization in plants (inter-fertilizing two varieties of a plant) and set out to discover what happens in subsequent generations of intra-bred hybrids. His explicit goal was to discover generally applicable laws (see [25], pp. 8–11). Part of Mendel’s success lay in his clarity on the experimental requirements. One requirement was accurate bookkeeping. He identified three important bookkeeping aspects: determination of the number of distinct forms, careful organization of progeny by generations, determination of statistical (numerical) relations. A second requirement was to construct an experimental design that would reveal answers (see [25], pp. 9–10) and then to carry it out with precision and care.

His experimental design had six steps. First, plants must be selected that will yield answers. Hence: they must have differentiating characteristics, the offspring in each generation must be able to be protected from foreign pollen, they should suffer no marked reduction of fertility through the generations, they must remain ‘constant without any exception’ (this means checking that seeds obtained from ‘seedsmen’ breed true through many generations). Mendel chose peas of the genus Pisum because they had these required characteristics.

Second, observation of successive generations must be carefully undertaken to ensure generational delineation and correct assignment of results. Third, a methodical process must be carefully followed. Mendel obtained seeds of 34 ‘more or less’ distinct varieties which he checked to ensure that they bred true. He then selected 22 for the experiment and focused on seven characters which differed in form in plants that bred true:

  Round vs. wrinkled peas
  Yellow vs. orange peas (seen through the transparent seed coats)
  Seed coats white vs. grey, grey-brown, leather brown
  Smooth or wrinkled ripe seed pods
  Green vs. yellow unripe seed pods
  Axial or terminal flowers
  Long vs. short stem (he chose 6–7 feet and 3/4–1 1/2 feet).

Fourth, numerous plants with each character difference need to be crossed (his term was ‘trial’) to obtain hybrids (F1 generation). Having done this, he observed that only one characteristic of each pair of characteristics, a, b (e.g. wrinkled seeds or orange peas) was present in all the F1 progeny. He then intra-fertilized the hybrids to obtain the F2 generation. He observed that, for each pair of characteristics a, b, they emerged in F2 in the ratio 3:1 (see Table 1).

Fifth, intra-fertilize b, and intra-fertilize a. He observed that b, when intra-fertilized, bred true to form in all subsequent generations, whereas a, when intra-fertilized, yielded a ratio of 3a : 1b. Sixth, continue intra-fertilization of the offspring plants. This, he observed, led to the same results in all subsequent generations (see Fig. 2).

Figure 2 Mendel’s pattern of crosses and results (F0: a, b, cross purebreds; F1: a, then intra-fertilize a and intra-fertilize b; F2: a, b; F3: a, b; b continues to breed true).

The crucial point to note is that to this point Mendel has been using mathematics to tabulate and calculate ratios. Hence, he has

Table 1 Mendel’s experiments (trials) in tabular form ([25], pp. 16–17)

Trial  Description           Outcome 1          Outcome 2         Ratio (1 divided by 2)
1      Form of seed          Smooth = 5474      Wrinkled = 1850   2.958918919
2      Colour of seed        Yellow = 6022      Green = 2001      3.009495252
3      Colour of seed coats  Grey brown = 705   White = 224       3.147321429
4      Form of pods          Smooth = 882       Wrinkled = 299    2.949832776
5      Colour unripe pods    Green = 428        Yellow = 152      2.815789474
6      Flower position       Axial = 651        Terminal = 207    3.144927536
7      Stem length           Long = 785         Short = 277       2.833935018
Total                        14 947             5010              2.983433134
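
The arithmetic behind Table 1 is the 'tool of analysis' use of mathematics discussed in this section: counts are tabulated and divided. A minimal Python sketch, assuming only the counts printed in the table, recomputes each ratio and the pooled ratio. The names `TRIALS`, `ratio` and `pooled_totals` are illustrative and do not come from the source.

```python
# Counts for the seven trials, copied from Table 1 as (description, outcome 1, outcome 2).
TRIALS = [
    ("Form of seed", 5474, 1850),
    ("Colour of seed", 6022, 2001),
    ("Colour of seed coats", 705, 224),
    ("Form of pods", 882, 299),
    ("Colour unripe pods", 428, 152),
    ("Flower position", 651, 207),
    ("Stem length", 785, 277),
]


def ratio(first: int, second: int) -> float:
    """Return outcome 1 divided by outcome 2."""
    return first / second


def pooled_totals(trials):
    """Sum each outcome column over all trials."""
    total_first = sum(first for _, first, _ in trials)
    total_second = sum(second for _, _, second in trials)
    return total_first, total_second


if __name__ == "__main__":
    # Print one ratio per trial, then the ratio of the column totals.
    for name, first, second in TRIALS:
        print(f"{name:22s} {ratio(first, second):.6f}")
    total_first, total_second = pooled_totals(TRIALS)
    print(f"{'Pooled':22s} {ratio(total_first, total_second):.6f}")
```

Nothing in this calculation says anything about why the ratios cluster near 3:1; it only summarizes the observations, which is the limitation the section goes on to press.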


been using mathematics as an instrument (a tool of analysis); mathematics – especially statistics – used in this way is, without question, a powerful instrument. However, Mendel’s genius was in moving beyond this use of mathematics to using it as the language in which to describe the dynamics of a system, a step which EBM – as I argue in the next section – does not take.

Describing the dynamics of a system requires three key elements. First, the entities of the system (e.g. for Mendel, ‘factors’; today ‘alleles’) and their characteristics or properties (e.g. dominance and recessiveness) must be specified. For Mendel, these were postulated and not observed entities. This is the case with many entities in robust contemporary scientific theories. Second, a mathematical account of the relationships among the entities must be provided. This specification of the entities and their properties and relationships is the specification of an ontology for the system. Third, a mathematical specification of the behaviour of the system over time must be provided (e.g. segregation and recombination). These are regularities or, to be more mathematically precise, transformation functions in the system, and they specify the dynamics of the system. That is, they specify the ways in which the system will change (transform) over time. They are often differential equations (ordinary or partial) but could be any of a number of other equations such as recursion equations.

Mendel provided all the required elements of a dynamical system. He postulated that the a-plants in F2 and following generations are really composed of two factors, (A) and (a), in the ratio 1A : 2Aa : 1a; factors are the entities in his system. As b-plants from the F1 generation onward breed true, the ratio of the factors for both intra-breedings is (using Mendel’s own notation): A + 2Aa + a. Today, this would be expressed: AA + 2Aa + aa.

Mendel’s groundbreaking contribution was to set out mathematically, using the factors as the entities of the system, the dynamics of his system of heredity (see Table 2).

Table 2 Mendel’s model of generational sequence based on the 1:2:1 ratio

            Seed outcome         Ratios
Generation  A     Aa    a        A       : Aa : a
1           1     2     1        1       : 2  : 1
2           6     4     6        3       : 2  : 3
3           28    8     28       7       : 2  : 7
4           120   16    120      15      : 2  : 15
5           496   32    496      31      : 2  : 31
n                                2^n − 1 : 2  : 2^n − 1

Each plant is assumed to produce four seeds per generation ([25], p. 22).
Example: generation 2, seed outcome A: 4 from A, plus 2 from Aa (2 plants, 4 seeds from each = 8 seeds in 1:2:1 ratio = 2A : 4Aa : 2aa), giving 6 in total.

This, for Mendel, is a purely mathematical model of what one would expect based on the existence of the entities (factors) with their specified properties and the dynamics of their behaviour.

Mendel continued his experimentation by examining multiple characters through the generations. He began with two characters (round (A) vs. wrinkled (a) seed and yellow (B) vs. green (b) seed colour) and then moved to three and then four. The outcomes of these multiple-character experiments, with some mathematical manipulations, revealed that each character conforms to the 1:2:1 ratio predicted mathematically by the model [although differences can be drawn (see Thompson [26]), for the purposes of this paper ‘model’, ‘theory’ and ‘mathematical model’ are used interchangeably]. Mendel continued his experimentation using other plants.

Mendel summed up his findings and his theoretical model (dynamical system) as follows:

  So far as experience goes, we find it in every case confirmed that constant progeny can only be formed when the egg cells and the fertilising pollen are of like character, so that both are provided with the material for creating quite similar individuals, as is the case with normal fertilisation of pure species. We must therefore regard it as certain that exactly similar factors must be at work also in the production of the constant forms in the hybrid. Since the various constant forms are produced in one plant, or even in one flower of a plant, the conclusion appears logical that in the ovaries of the hybrids there are formed as many sorts of egg cells, and in the anthers as many sorts of pollen cells, as there are possible combination forms, and that the egg and pollen cells agree in their internal composition with those separate forms. (p. 29)

Mendel’s theory was amplified and deepened in the 1920s beginning with the contribution of the mathematician G. H. Hardy who mathematically demonstrated that in every generation after the first the same proportion of alleles is obtained: p²AA : 2pqAa : q²aa [27]. This is the famous Hardy–Weinberg equilibrium (Wilhelm Weinberg, 1862–1937, a doctor in Stuttgart, Germany, also published it in 1908). It is one of the cornerstones of modern population genetics; it serves as an equilibrium principle within the dynamical system, playing a role similar to that played by Newton’s first law in his dynamical system. In both cases, they state that if nothing happens nothing will happen. Other principles specify what kinds of things will cause something to happen.

VII. Theories in science: ontology, dynamics and evidence

The lesson embedded in the Mendel example is that the strength and credibility of knowledge claims in a domain of science are directly proportional to:
1 the robustness of the formulation of a mathematical system (an ontology and dynamics formulated using the language of an appropriate mathematical calculus), and
2 the strength of the demonstration that empirical phenomena within that domain are found to be in accordance with the system’s ontology and dynamics (the system and the phenomena are isomorphic).

Let me draw out this lesson by juxtaposing two quotations from Mendel with one from Sackett.

Mendel wrote:

  In point of fact it is possible to demonstrate theoretically that this hypothesis would fully suffice to account for the development of the hybrids in the separate generations, if we might at the same time assume that the various kinds of egg and pollen cells were formed in the hybrid on the average in equal numbers. (p. 29, emphasis added)

  Experimentally therefore the theory is confirmed that the pea hybrids form egg and pollen cells which, in their constitution,
  represent in equal numbers all constant forms which result from the combination of the characters united in fertilisation. (p. 34, emphasis added)

Points (1) and (2) were clearly fundamental to Mendel’s methodology and reasoning and, it is worth adding, to the subsequent edifice of contemporary population genetics. By contrast, Sackett and company claim that the lowest level of evidence (D), 10th place out of 10 levels of evidence, is:

  Expert opinion without explicit critical appraisal, or based on physiology, bench research or ‘first principles’. (p. 175, emphasis added)

In an attempt to make sense of the starkness of this contrast, one might speculate that perhaps Sackett and others are not referring to theoretical knowledge – knowledge derived from well-confirmed and accepted theories. Perhaps the authors mean ‘physiology’ in some special sense that does not include, for example, our knowledge of the feedback (self-regulating) mechanisms of various hormonal systems; physiology would then be more akin to descriptive anatomical knowledge. Perhaps they have in mind as ‘bench research’ something along the lines of discovering, refining and using PCR. Perhaps ‘first principles’ is an oblique reference to metaphysical systems or to the appeal to ‘common sense’ à la Thomas Reid rather than a reference to well-confirmed axioms or postulates of a theory, such as Newton’s laws of motion and the law of gravitational attraction [28]. Given statements made elsewhere in the book, this does not strike me as a fruitful line of interpretation; furthermore, if something like this is intended, then an even more worrying feature of EBM emerges: theoretical knowledge fails to even make the list of levels of evidence.

Perhaps Sackett and other EBM advocates think that clinical medicine is unique. That is, when it comes to decisions about medical treatments, theories (models) play no role or, at best, a very minor role; in brief, the best evidential bases for clinical medicine are RCTs. But the case for clinical medicine being unique has yet to be made. In what ways do evidence and decision making in clinical medicine differ from evidence and decision making in computer and electrical engineering? In both cases the systems are extremely complex. In both cases, there are self-regulating sub-systems. In both cases, interventions (modifications) have a cascade of effects. In both cases, different physical (physiological) architectures produce dramatically different responses to the same intervention. Yet an engineer would never relegate theories and models to the back burner of evidence with respect to the outcome of an intervention; and she would not rely principally, instead, on an RCT in which different architectures are randomized. Theories and models are the stuff and substance of the causal framework for an engineer, and a causal framework is essential in determining ‘why’ a particular outcome occurred as a result of an intervention. As I set out more fully in what follows, RCTs, at best, determine ‘that’ a particular outcome occurred.

Theories are a fundamental element of the epistemological underpinnings of modern science. Among many other things, theories integrate a large body of otherwise disparate knowledge; they make possible explanations (answering ‘why?’ questions) and predictions; they ground counterfactual claims; they allow new knowledge to be generated by exploring the implications of a theory’s ontology (the entities it claims to exist and their properties) and dynamics (the ways a system can change over time). Although there are different conceptions of how theories can best be formulated, all agree on the utility and centrality of theories to a robust science. Furthermore, on all such conceptions, the distinction is made between mathematics as the language of a domain of science and mathematics as a tool of analysis.

The account of scientific theories that most sharply delineates this distinction has been called the semantic view of theories; the name, however, matters little in this context [26,29]. On this account of theories, theories are mathematical models (such as Mendel’s): some are very sophisticated and complex, others quite simple. The model is formulated using some field of mathematics; the formulation at a minimum specifies the ontology (the entities it claims to exist and their properties: variables and parameters) and dynamics (the ways a system can change over time: mathematical functions such as recursion equations and differential equations) of a system. The explicit claim is that empirical phenomena within the domain for which the model was developed have exactly the same ontology and dynamics as the model. Justifying this claim is an extra-theoretical endeavour. Mature sciences such as physics, chemistry and biology encompass both endeavours: the endeavour to formulate a mathematical model and the endeavour to demonstrate the sameness of the model’s ontology and dynamics and that of the phenomena under study. Formulating a model requires using mathematics as the language of the system. Demonstrating sameness of model and phenomena involves many things including other theories (models). For example, using a light microscope to observe chromosomal segregation as a partial demonstration of the isomorphism of Mendel’s model with phenomena requires the theory of optics.

Evidence-based medicine seems both to eschew theories and models and, by giving pride of place to RCTs, to employ mathematics principally as a tool of analysis. The branch of mathematics employed in RCTs is probability and statistics – largely in the statistical analysis of data from a trial but also in designing trials and in meta-analyses (systematic reviews) of trials. This use of probability and statistics as a tool of analysis contrasts sharply with their use in statistical mechanics and in population genetics. In both of these cases, the domain of mathematics employed is probability and statistics but in both cases the mathematical calculus is used as the language of the system. And, in both cases, mathematics is used to formulate a system (a state space, an ontology and a dynamics). In these cases, probability and statistics play an essential role in the formulation of a causal dynamical system – in contrast to their role in RCTs. This difference is not innocuous.

First, it should be noted that many areas of medical research and medical knowledge involve models in which mathematics is used as the language in which to describe dynamical systems. These include physiology (which Sackett et al. explicitly relegate to the lowest level of evidence), immunology, medical genetics, neurosciences and similar fields. Epidemiology and biostatistics (broadly understood to include RCTs to determine the efficacy of pharmaceutical interventions as well as lifestyle modifications, etc.) is the field where RCTs are prominent and models of dynamical systems are rare. RCTs dominate epidemiology and biostatistics but play a vastly smaller, if any, role in other fields of scientific enquiry and knowledge. Modern physics, astronomy, chemistry and biology make exceedingly little use of RCTs. But why should this matter? Perhaps RCTs are well suited to epidemiological research. Nancy Cartwright cuts to the core on this question.

  In an RCT, if we are lucky, we find the average difference in effect produced by the treatment in the population sampled. That does not tell us what the overall outcome on this effect in question would be from introducing the treatment in some particular way in some uncontrolled situation, even if we consider introducing it only in the very population sampled. For that we need a causal model. Even less does it tell us about ‘side-effects’ of introducing the treatment, either from the treatment itself or from our way of implementing it. These too are crucial in calculating the costs and benefits of a proposed policy. Or, as Heckman argues, suppose one wants to predict what portion of the population will experience a given degree of improvement. RCTs do not deliver that kind of result. Again we need a causal model. ([13], p. 238)

Causal models are those that use mathematics as a language with which to formulate a dynamical system. As Cartwright correctly observes, RCTs fail to give answers to crucially important questions, and EBM’s almost total reliance on RCTs and eschewing of dynamical models means that it is never in a position to provide answers to these questions. Probability and statistics have important uses as tools of analysis in many areas of science but it is always in the context of a dynamical model; the mathematics can be used, for example, to normalize data for comparison with the model or to determine goodness of fit between the model’s prediction and the empirical data. And, in fields like statistical mechanics and population genetics, probability and statistics are used as the language in which their dynamical systems are specified. By contrast, probability and statistics in RCTs provide an analysis of data but without reference to any dynamical causal model. As such, ‘why?’ questions cannot be answered. To answer the question, ‘why does this pharmaceutical produce this effect in this individual?’ one needs to employ dynamical models from a particular combination of fields such as physiology, biochemistry, endocrinology and immunology.

The distinction I have drawn regarding the uses of mathematics is important in at least two respects. First, it distinguishes clearly between explanations (scientific answers to ‘why’ questions) that employ dynamical models and explanations that employ analytical tools. The former provide an account of the entities involved and the dynamics of their behaviour. The latter provide, at best and in ideal circumstances, a link between an input and an output. Some might be willing to assert, again in ideal circumstances, that the input is a cause and the output is an effect; suffice it here to note that there are a host of dangers in doing so. More importantly, using those labels should not delude one into thinking that a scientific explanation has been given; just how and why the input and output are connected is not answered by using mathematics to analyse data. RCTs do not provide scientific explanations (the how and the why); dynamical models do.

To illustrate this point, consider iodine and thyroxin production. We, of course, have considerable knowledge of the connection, which is why this will serve well as an example; but to make the point, one has to assume a time when the connection is being explored. In this early period of investigation, a researcher might have noticed that an enlargement of the thyroid gland (goitre) occurs more frequently in areas of the world where access to dietary iodine is restricted (some regions of central Africa and central South America, for example). The researcher constructs an RCT using a population of central South American children in an area with a high incidence of goitres. In one arm, the randomly selected and assigned subjects are given an iodine pill in a dose approximating that found by chemical analysis to be part of the diet in the Mediterranean region. The other arm receives a pill with no active ingredients. The trial is robust and has 10 000 people in each arm and runs for 10 years; it is double blind. The results indicate a statistically significant decrease in goitres in the iodine arm (at the 0.01 significance level). What do we now know? With some confidence we can declare that there is a connection between iodine deficiency and goitres and that iodine supplements are efficacious in preventing goitres. This is obviously a therapeutically useful piece of knowledge but it does not provide any answer to the question, ‘why does an iodine deficiency produce goitres?’

The answer to that question requires the construction of a dynamical model of a homeostatic endocrine mechanism. Thyroid-stimulating hormone (TSH) is secreted by the thyrotrope cells of the anterior pituitary. Hypothalamic thyrotropin-releasing hormone (TRH) stimulates the pituitary production of TSH. This stimulates thyroid hormone synthesis and its secretion. Thyroid-secreted hormones inhibit the production of TRH. This is the feedback (homeostatic) endocrine dynamics. Ingested iodine is bound to serum proteins, especially albumin. The thyroid gland extracts the albumin-bound iodine (unbound iodine is excreted in the urine) from the blood stream. This is part of a dynamical system involving NIS (Na⁺/I⁻ symporter). When access to dietary iodine is restricted the extraction process is depressed. As iodine is an essential component in thyroxin (T4), the production of thyroxin (T4) by the thyroid is disrupted. As a result, TRH inhibition is decreased leading to higher levels of TRH and, as a consequence, TSH production is increased. TSH stimulates the thyroid to increase thyroxin production. In the absence of iodine, however, it cannot do so; its increase in size is an ineffective attempt to respond to the continual stimulation. Although the complete dynamics are much more complex, this suffices to show that one now has an answer to why restricted access to dietary iodine results in goitres and why providing an iodine supplement reduces or eliminates the problem. And, the entire system can be described by a set of differential equations. This dynamical model explains scientifically the causal structure of the relationship. That is, it explains the relationship as opposed to simply demonstrating that the relationship exists. Hence, contrary to EBM, this is the gold standard of evidence: a powerful understanding of the dynamical mechanisms, the availability of which makes an RCT otiose. By contrast with this deep understanding, the RCT – EBM’s gold standard – provides, as Bluhm and Borgerson note, a crude empiricist description [11].

The second importance of the distinction I have drawn between the different uses of mathematics is that the distinction makes clear that although mathematics is central to scientific research and knowledge, not all uses of mathematics are equal. Using mathematics as a language with which to formulate dynamical systems allows deep explanations of phenomena. Using mathematics as a tool of analysis does not. Failure to make this distinction may make an area of research appear far more rigorous and substantial in its knowledge claims than is warranted, just because it employs mathematics in a dazzling and impressive way; a way that nonetheless remains a tool of data analysis.
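
The TRH, TSH and thyroxin feedback loop described in this section can be caricatured as three coupled differential equations. The Python sketch below is a deliberately crude toy and not a physiological model: the functional forms, the unit rate constants, the saturating iodine term and the function name `simulate` are all assumptions introduced here for illustration, and thyroid growth, NIS transport and iodine binding are left out. It is meant to show only the qualitative behaviour that the verbal account predicts, namely that restricting iodine lowers T4 and raises TRH and TSH.

```python
"""Toy negative-feedback sketch of the TRH -> TSH -> T4 loop (illustrative only)."""


def simulate(iodine, t_end=200.0, dt=0.01):
    """Integrate the toy system with forward Euler and return (trh, tsh, t4).

    All quantities are in arbitrary units. Every rate constant is set to 1.0
    purely for illustration; these values are assumptions of the sketch,
    not physiological measurements.
    """
    trh, tsh, t4 = 0.5, 0.5, 0.5  # arbitrary starting state
    # Fraction of maximal T4 synthesis that the available iodine permits
    # (a saturating form chosen only for simplicity).
    iodine_factor = iodine / (iodine + 1.0)
    for _ in range(int(t_end / dt)):
        d_trh = 1.0 / (1.0 + t4) - trh   # T4 inhibits TRH production
        d_tsh = trh - tsh                # TRH stimulates TSH production
        d_t4 = tsh * iodine_factor - t4  # TSH drives T4 synthesis, limited by iodine
        trh += dt * d_trh
        tsh += dt * d_tsh
        t4 += dt * d_t4
    return trh, tsh, t4


normal = simulate(iodine=1.0)
deficient = simulate(iodine=0.1)

print("normal    (TRH, TSH, T4):", normal)
print("deficient (TRH, TSH, T4):", deficient)
```

In this sketch the entities (hormone levels) and the transformation rules (the three rate equations) are stated explicitly, so the rise in TSH under iodine restriction follows from the model itself. An analysis of trial data alone supplies no such derivation.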

VIII. The lure of EBM and RCTs

Just as using mathematics as a tool was an important first step in Mendel’s development of his model, so RCTs can play, and do play, an important investigative role in medicine. The central claim in this paper is that to answer ‘why’ questions, to provide a causal account of phenomena and to integrate disparate bits of knowledge into a cohesive whole requires the formulation of a dynamical system. A formulation of such a dynamical system is, and since at least Galileo has always been, the real ‘gold standard’ in science.

Given the large body of exceptionally robust and sophisticated theorizing and model building in medicine (e.g. physiology, immunology, medical genetics) and the barrage of criticisms of RCTs – most unanswered, perhaps even unnoticed by advocates and practitioners – the obvious question is: why are RCTs considered by regulators and researchers in certain fields to be the ‘gold standard’ of evidence? There is much important work to be done in answering this question. Some obvious arenas of investigation can be identified: things such as the political dynamics of medicine (funding, authority, prestige and the like), the sociology of decision making and defending of decisions, and the psychology and politics of litigation aversion. Also, in an environment of open intellectual investigation, an investigation of the role of dogma and ideology ought not to be summarily disallowed.

Vladimir Lenin is purported to have said, ‘A lie told often enough becomes the truth’; a chilling but seemingly correct observation. Slogans become mantras and, in turn, become doctrines and ideologies, which ultimately become entrenched in research and regulatory communities. By the time the flaws and foundation-of-sand are detected and exposed, the doctrines and ideologies have been propagated often enough to become unassailable truth; truth on which rest the vested interests of many. Even if one believes that such an investigation will not yield fruit, there are far too many examples in history – even the history of science – to make such an investigation off limits or to assume in advance that the thesis is dubious.

References

1. Sackett, D. L., Straus, S. E., Richardson, W. S., Rosenberg, W. & Haynes, R. B. (2000) Evidence-Based Medicine: How to Practice and Teach EBM, 2nd edn. Edinburgh: Harcourt Publishers Limited (see especially pp. 173–177).
2. Salsburg, D. (1993) The use of statistical methods in the analysis of clinical studies. Journal of Clinical Epidemiology, 46, 17–27.
3. Kravitz, R. L., Duan, N. & Braslow, J. (2004) Evidence-based medicine, heterogeneity of treatment effects, and the trouble with averages. The Milbank Quarterly, 82, 661–687.
4. Ashcroft, R. (2002) What is clinical effectiveness? Studies in History and Philosophy of Biological and Biomedical Sciences, 33, 219–233.
5. Schaffner, K. F. (1993) Clinical trials and causation: Bayesian perspectives. Statistics in Medicine, 12, 1477–1494.
6. Howson, C. & Urbach, P. (1989) Scientific Reasoning: The Bayesian Approach. La Salle, IL: Open Court.
7. Bennett, D. J. (1998) Randomness. Cambridge, MA: Harvard University Press.
8. Upshur, R. E. G. (2005) Looking for rules in a world of exceptions: reflections on evidence-based practice. Perspectives, 48, 477–489.
9. Worrall, J. (2002) What evidence in evidence-based medicine? Philosophy of Science, 69, S316–S330.
10. Bluhm, R. (2005) From hierarchy to network: a richer view of evidence for evidence-based medicine. Perspectives, 48, 535–547.
11. Bluhm, R. & Borgerson, K. (forthcoming) Evidence-based medicine. In Philosophy of Medicine (ed. F. Gifford). Amsterdam: Elsevier.
12. Borgerson, K. (2005) Evidence-based alternative medicine. Perspectives, 48, 502–515.
13. Cartwright, N. (2007) Hunting Causes and Using Them. Cambridge: Cambridge University Press.
14. Kolmogorov, A. N. (1933) Foundations of the Theory of Probability, 1st English edition, 1950; 2nd English edition, 1956. New York: Chelsea Pub Co.
15. Euclid of Alexandria (circa 300 BCE) Elements (first printed edition (1482) – a Latin translation by Johannes Campanus).
16. Boyer, C. B. (1991) A History of Mathematics, 2nd edn. New York: John Wiley & Sons.
17. Playfair, J. (1795) Elements of Geometry; Containing the First Six Books of Euclid, with Two Books on the Geometry of Solids. To Which are Added, Elements of Plane and Spherical Trigonometry. Edinburgh: printed for Bell & Bradfute, and G. G. & J. Robinson, London.
18. Lobachevski, N. I. (1840) Geometrical Researches on the Theory of Parallels. English translation in Roberto Bonola, Non-Euclidean Geometry: A Critical and Historical Study of Its Development. New York: Dover, 1955.
19. Bolyai, J. (1832) The science of absolute space. English translation in Roberto Bonola, Non-Euclidean Geometry: A Critical and Historical Study of Its Development. New York: Dover, 1955.
20. Beltrami, E. (1902) Opere Matematiche, Vol. 1. Milano: Ulrico Hoepli.
21. Riemann, B. (1868) Über die Hypothesen welche der Geometrie zu Grunde liegen (On the hypotheses which underlie geometry) [given first as a lecture June 10, 1854 from his Habilitationsschrift on the foundations of geometry].
22. Gillies, D. (2000) Philosophical Theories of Probability. London: Routledge.
23. Keynes, J. M. (1921) A Treatise on Probability. London: Macmillan and Co (reprinted 2004 by Dover Publications, Inc., Mineola, New York).
24. Bolie, V. W. (1960) Coefficients of normal blood glucose regulation. Journal of Applied Physiology, 16, 783–788.
25. Mendel, G. (1865) Versuche über Pflanzenhybriden. Verhandlungen des Naturforschenden Vereins in Brünn, 4, 3–47 (all references are to Fisher’s reprinting and modification of Bateson’s English translation, Bateson, W. (1909) Mendel’s Principles of Heredity. Cambridge: Cambridge University Press).
26. Thompson, P. (2007) Formalisations of evolutionary biology. In Handbook of the Philosophy of Science: Philosophy of Biology (eds M. Matthen & C. Stephens), pp. 485–523. Amsterdam: Elsevier.
27. Hardy, G. H. (1908) Mendelian proportions in a mixed population. Science, 28, 49–50.
28. Reid, T. (1764) An Inquiry into the Human Mind on the Principles of Common Sense. Dublin: Alexander Ewing.
29. Thompson, P. (1989) The Structure of Biological Theories. Albany, NY: State University of New York Press.
