Abstract

We argue that the apparent inconsistency between people's intuitions about chance and the normative predictions of probability theory, as expressed in judgments about randomness and coincidences, can be resolved by focussing on the evidence observations provide about the processes that generated them rather than their likelihood. This argument is supported by probabilistic modeling of sequence and number production, together with two experiments that examine judgments about coincidences.

People are notoriously inaccurate in their judgments about randomness, such as whether a sequence of heads and tails like HHTHTTHT is more random than the sequence HHHHHHHH. Intuitively, the former sequence seems more random, but both sequences are equally likely to be produced by a random generating process that chooses H or T with equal probability, such as a fair coin. This kind of question is often used to illustrate how our intuitions about chance deviate from the normative standards set by probability theory. Our intuitions about coincidental events, which seem to be defined by their improbability, have faced similar criticism from statisticians (e.g., Diaconis & Mosteller, 1989).

The apparent inconsistency between our intuitions about chance and the formal structure of probability theory has provoked attention from philosophers and mathematicians, as well as psychologists. As a result, a number of definitions of randomness exist in both the mathematical (e.g., Chaitin, 2001; Kac, 1983; Li & Vitanyi, 1997) and the psychological (e.g., Falk, 1981; Lopes, 1982) literature. These definitions vary in how well they satisfy our intuitions, and can be hard to reconcile with probability theory. In this paper, we will argue that there is a natural relationship between people's intuitions about chance and the normative standards of probability theory. Traditional criticism of people's intuitions about chance has focused on the fact that people are poor estimators of the likelihood of events being produced by a particular generating process. The models we present turn this question around, asking how much more likely a set of events makes a particular generating process. This question may be far more useful in natural inference situations, where it is often more important to reason diagnostically than predictively, attempting to infer the structure of our world from the data we observe.

Randomness

Reichenbach (1934/1949) is credited with having first suggested that mathematical novices will be unable to produce random sequences, instead showing a tendency to overestimate the frequency with which outcomes alternate. Subsequent research has provided support for this claim (reviewed in Bar-Hillel & Wagenaar, 1991; Tune, 1964; Wagenaar, 1972), with both sequences of numbers (e.g., Budescu, 1987; Rabinowitz, Dunlap, Grant, & Campione, 1989) and two-dimensional black and white grids (Falk, 1981). In producing binary sequences, people alternate with a probability of approximately 0.6, rather than the 0.5 that is seen in sequences produced by a random generating process. This preference for alternation results in subjectively random sequences containing fewer long runs, such as an uninterrupted series of heads in a set of coin flips, than might be expected by chance (Lopes, 1982).

Theories of subjective randomness

A number of theories have been proposed to account for the accuracy of Reichenbach's conjecture. These theories have included postulating that people develop a concept of randomness that differs from the true definition of the term (e.g., Budescu, 1987; Falk, 1981; Skinner, 1942), and that limited short-term memory might contribute to people's responses (Baddeley, 1966; Kareev, 1992, 1995; Wiegersma, 1982). Most recently, Falk and Konold (1997) suggested that the concept of randomness can be connected to the subjective complexity of a sequence, characterized by the difficulty of specifying a rule by which a sequence can be generated. This idea is related to a notion of complexity based on description length (Li & Vitanyi, 1997), and has been considered elsewhere in psychology (Chater, 1996).

The account of randomness that has had the strongest influence upon the wider literature of cognitive psychology is Kahneman and Tversky's (1972) suggestion that people may be attempting to produce sequences that are representative of the output of a random generating process. For sequences, this means that the number of elements of each type appearing in the sequence should correspond to the overall probability with which these elements occur. Random sequences should also maintain local representativeness, such that subsequences demonstrate the appropriate probabilities.
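The alternation bias described above can be illustrated with a small simulation. The sketch below compares a fair coin against a generator that alternates with probability 0.6, the approximate rate reported in the production studies; the sequence length, trial count, and seed are arbitrary choices of ours.

```python
import random

def generate(n, p_alt, rng):
    """Binary sequence in which each element differs from its
    predecessor with probability p_alt."""
    seq = [rng.choice("HT")]
    for _ in range(n - 1):
        flip = rng.random() < p_alt
        seq.append(("T" if seq[-1] == "H" else "H") if flip else seq[-1])
    return "".join(seq)

def longest_run(seq):
    """Length of the longest run of identical symbols."""
    best = cur = 1
    for a, b in zip(seq, seq[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

rng = random.Random(0)
n, trials = 20, 10_000
fair = sum(longest_run(generate(n, 0.5, rng)) for _ in range(trials)) / trials
subj = sum(longest_run(generate(n, 0.6, rng)) for _ in range(trials)) / trials
# the alternation-biased generator yields shorter maximal runs on average
```

Comparing `subj` with `fair` shows the effect directly: biasing alternation to 0.6 shortens the longest run, so "subjectively random" output contains fewer of the uninterrupted streaks that a fair coin actually produces.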
Formalizing representativeness

A major challenge for a theory of randomness based upon representativeness is to express exactly what it means for an outcome to be representative of a random generating process. One interpretation of this statement is that the outcome provides evidence for having been produced by a random generating process. This interpretation has the advantage of submitting easily to formalization in the language of probability theory.

If we are considering two candidate processes by which an outcome could be generated, one random, and one containing systematic regularities, the total evidence in favor of the random generating process can be assessed by the logarithm of the ratio of the probabilities of these processes:

    log [P(random | x) / P(regular | x)]    (1)

where P(random | x) and P(regular | x) are the probabilities of a random and a regular generating process respectively, given the outcome x.

This quantity can be computed using the odds form of Bayes' rule:

    P(random | x) / P(regular | x) = [P(x | random) / P(x | regular)] [P(random) / P(regular)]    (2)

in which the term on the left-hand side of the equation is called the posterior odds, and the first and second terms on the right-hand side are called the likelihood ratio and prior odds, respectively. Of the latter two terms, the specific outcome x influences only the likelihood ratio. Thus the contribution of x to the evidence in favour of a random generating process can be measured by the logarithm of the likelihood ratio,

    random(x) = log [P(x | random) / P(x | regular)]    (3)

This method of assessing the weight of evidence for a particular hypothesis provided by an observation is often used in Bayesian statistics, and the log likelihood-ratio given above is called a Bayes factor (Kass & Raftery, 1995). The Bayes factor for a set of independent observations will be the sum of their individual Bayes factors, and the expression has a clear information theoretic interpretation (Good, 1979). The above expression is also closely connected to the notion of minimum description length, connecting this approach to randomness with the ideas of Falk and Konold (1997) and Chater (1996).

Defining regularity

Evaluating the evidence that a particular outcome provides for a random generating process requires computing two probabilities: P(x | random) and P(x | regular). The first of these probabilities follows from the definition of the random generating process. For example, P(HHTHTTHT | random) is (1/2)^8, as it would be for any sequence of the same length. However, computing P(x | regular) requires specifying the probability of the observed outcome resulting from a generating process that involves regularities. While this probability is hard to define, it is in general easy to compute P(x | h_i), where h_i might be some hypothesised regularity. In the case of sequences of heads and tails, for instance, h_i might correspond to a particular probability of observing heads, P(H) = p. In this case P(x | h_i) is p^4 (1 - p)^4. Using the calculus of probability, we can obtain P(x | regular) by summing over a set of hypothesized regularities, H:

    P(x | regular) = sum over h_i in H of P(x | h_i) P(h_i | regular)    (4)

where P(h_i | regular) is a prior probability on h_i. In all applications discussed in this paper, we make the simplifying assumption that P(h_i | regular) is uniform over all h_i in H. However, we stress that this assumption is not necessary for the models we create, and the prior may in fact differ from uniformity in some realistic judgment contexts.

Random sequences

For the case of binary sequences, such as those that might be produced by flipping a coin, possible regularities can be divided into two classes. One class assumes that flips are independent, and the regularities it contains are assertions about the value of P(H). The second class includes regularities that make reference to properties of subsequences containing more than a single element, such as alternation, runs, and symmetries. Since this second class is less well defined, it is instructive to examine the account that can be obtained just by using the first class of regularities.

Taking the set of regularities to be all values of p = P(H) in [0, 1], we have P(x | random) = (1/2)^(H+T) and P(x | regular) = the integral from 0 to 1 of p^H (1 - p)^T dp, where (H, T) are the sufficient statistics of a particular sequence containing H heads and T tails. Completing the integral, it follows from (3) that

    random(H, T) = log C(H+T, H) - f(H+T)    (5)

where C(H+T, H) is the binomial coefficient and f(H+T) = (H+T) log 2 - log(H+T+1), a fixed function of the total number of flips in the sequence. This result has a number of appealing properties. Firstly, it is maximized when H = T, which is consistent with Kahneman and Tversky's (1972) original description of the representativeness of random sequences. Secondly, the ratio involved essentially measures the size of the set of sequences sharing the same number of heads and tails. A sequence like HHHHHHHH is unique in its composition, whereas HHTHTTHT has a composition much more commonly obtained by flipping a coin eight times.

The Zenith radio data

Having defined a framework for analyzing the subjective randomness of sequences, we have the opportunity to develop a specific model. One classic data set concerning
the production of random sequences is the Zenith radio data. These data were obtained as a result of an attempt by the Zenith corporation to test the hypothesis that people are sensitive to psychic transmissions. On several occasions in 1937, a radio program took place during which a group of psychics would transmit a randomly generated binary sequence to the receptive minds of their listeners. The listeners were asked to write down the sequence that they received, one element at a time. The binary choices included heads and tails, light and dark, black and white, and several symbols commonly used in tests of psychic abilities, and all sequences contained a total of five symbols. Listeners then mailed in their responses, which were analyzed. These responses demonstrated strong preferences for particular sequences, but there was no systematic effect of the actual sequence that was transmitted (Goodfellow, 1938). The data are thus a rich source of information about response preferences for random sequences. The relative frequencies of the different sequences, collapsed over choice of first symbol, are shown in the upper panel of Figure 1.

[Figure 1: Probability of each sequence, 00001 through 01111, collapsed over choice of first symbol, in the Zenith radio data (upper panel) and under the model (lower panel).]

The probability of each response can be modeled in terms of the extent to which a head results in a more random outcome than a tail, assessed over the subsequences starting one step back, two steps back, and so forth:

    L_k = sum from i = 1 to k-1 of [random(H_i + 1, T_i) - random(H_i, T_i + 1)]    (6)

where the (H_i, T_i) are the tallies of heads and tails counting back i steps in the sequence. We can then convert this quantity into a probability using a logistic function, to give a probability distribution for the kth response, R_k:

    P(R_k = H) = 1 / (1 + e^(-λ L_k))    (7)
               = prod (T_i + 1)^λ / [prod (T_i + 1)^λ + prod (H_i + 1)^λ]    (8)

where each product runs from i = 1 to k-1. The parameter λ scales the effect that L_k has on the resulting probability. The probability of the sequence as a whole is then the product of the probabilities of the R_k, and the result defines a probability distribution over the set of binary sequences of length k.
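The sequence model of Equations (5) through (7) can be sketched directly. The code below is a minimal reading of those equations: the tallies H_i and T_i count symbols in the last i responses, the first element is taken to be uniform, and the value of the scaling parameter `lam` is an arbitrary placeholder rather than a fitted estimate.

```python
import math

def randomness(H, T):
    """random(H, T) = log C(H+T, H) - f(H+T), Equation (5),
    with f(H+T) = (H+T) log 2 - log(H+T+1)."""
    n = H + T
    return math.log(math.comb(n, H)) - (n * math.log(2) - math.log(n + 1))

def sequence_probability(seq, lam=0.5):
    """Probability assigned to a binary 'H'/'T' sequence by
    Equations (6) and (7); the first element is chosen uniformly."""
    p = 0.5
    for k in range(1, len(seq)):
        L = 0.0
        for i in range(1, k + 1):          # look back 1, 2, ..., k steps
            window = seq[k - i:k]
            Hi, Ti = window.count("H"), window.count("T")
            L += randomness(Hi + 1, Ti) - randomness(Hi, Ti + 1)
        p_head = 1.0 / (1.0 + math.exp(-lam * L))
        p *= p_head if seq[k] == "H" else 1.0 - p_head
    return p
```

Because the evidence term in Equation (6) reduces to log[(T_i + 1)/(H_i + 1)], a run of identical symbols drives the predicted probability of repeating it below one half, so the model assigns higher probability to alternating sequences such as HTHTH than to runs such as HHHHH, reproducing the alternation bias.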
This explanation is suggestive of the kinds of regular generating processes that could be involved in producing numbers: properties like those described by Kubovy and Psotka (1976), such as being even numbers, powers of 2, or occupying special positions such as endpoints.

Taking the arithmetic properties of numbers to constitute hypothetical regularities, we can specify the quantities necessary to compute random(x). Our h_i are sets of numbers that share some property, such as the set of even numbers between 0 and 9. For any h_i, we define P(x | h_i) to be 1/|h_i| for x in h_i and 0 otherwise, where |h_i| is the size of the set. This means that observations generated from a regularity are uniformly sampled from that regularity. Setting P(h_i | regular) to give equal weight to all h_i, we can compute P(x | regular).

This model can be applied to the data of Kubovy and Psotka (1976). Since there are ten possible responses, we have P(x | random) = 1/10. Taking hypothetical regularities of multiples of 2 ({0, 2, 4, 6, 8}), multiples of 3 ({3, 6, 9}), multiples of 5 ({0, 5}), powers of 2 ({2, 4, 8}), and endpoints ({0, 1, 9}), we obtain the values of random(x) shown in the lower panel of Figure 2. Randomness also needs to be included in H, so that random(x) is defined when x is not in any other regularity. Its inclusion is analogous to the incorporation of a noise process, and is in fact formally identical in this case. The order of the model predictions is a parameter-free result, and gives the ordinal correlation r_s = 0.99. Applying a single-parameter power transformation to the predictions, y' = (y - min(y))^0.98, gives r = 0.95.

Figure 2: The upper panel shows number production data from Kubovy and Psotka (1976), taken from 1,770 participants choosing numbers between 0 and 9. The lower panel shows the transformed predictions of the randomness model.
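The number model is compact enough to write out in full. The sketch below uses the regularity sets listed in the text; treating the additional "randomness" regularity as a uniform distribution over all ten digits, and giving it the same prior weight as the arithmetic regularities, is our reading of the noise-process remark above.

```python
import math

REGULARITIES = {
    "multiples of 2": {0, 2, 4, 6, 8},
    "multiples of 3": {3, 6, 9},
    "multiples of 5": {0, 5},
    "powers of 2": {2, 4, 8},
    "endpoints": {0, 1, 9},
    "random": set(range(10)),  # noise hypothesis: uniform over all digits
}

def digit_randomness(x):
    """random(x) = log [P(x|random) / P(x|regular)] for a digit x,
    with P(x|regular) computed as in Equation (4) under a uniform
    prior over the regularities."""
    p_random = 1 / 10
    p_regular = sum(1 / len(h) for h in REGULARITIES.values() if x in h)
    p_regular /= len(REGULARITIES)
    return math.log(p_random / p_regular)
```

Under this set of regularities, 7 belongs to none of the arithmetic sets, only to the noise hypothesis, so it receives the highest randomness score of the ten digits, while digits covered by several small sets, such as 0, score lowest.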
Coincidences

The surprising frequency with which unlikely events tend to occur has drawn attention from a number of psychologists and statisticians. Diaconis and Mosteller (1989), in their analysis of such phenomena, define a coincidence as "a surprising concurrence of events, perceived as meaningfully related, with no apparent causal connection" (p. 853). They go on to suggest that the surprising frequency of these events is due to the flexibility that we allow in identifying meaningful relationships. Together with the fact that everyday life provides a vast number of opportunities for coincidences to occur, our willingness to tolerate near misses and to consider each of a number of possible concurrences meaningful contributes to explaining the frequency with which coincidences occur. Diaconis and Mosteller suggested that the surprise that people show at the solution to the Birthday Problem (the fact that only 23 people are required to give a 50% chance of two people sharing the same birthday) suggests that similar neglect of combinatorial growth contributes to the underestimation of the likelihood of coincidences. Psychological research addressing coincidences seems consistent with this view, suggesting that selective memory (Hintzman, Asher, & Stern, 1978) and preferential weighting of first-hand experiences (Falk & MacGregor, 1983) might facilitate the under-estimation of the probability of events.

Not just likelihood

The above analyses reflect the same bias that made it difficult to construct a probabilistic account of randomness: the notion that people's judgments reflect the likelihood of particular outcomes. Subjectively, coincidences are events that seem unlikely, and are hence surprising when they occur. However, just as with random sequences, sets of events that are equally likely to be produced by a random generating process differ in the degree to which they seem to be coincidences. Following Diaconis and Mosteller's suggestion that the Birthday Problem provides a domain for the investigation of coincidences, consider the kinds of coincidences formed by sets of birthdays. If we meet four people and find out that their birthdays are October 4, October 4, October 4, and October 4, this is a much bigger coincidence than if the same people have birthdays May 14, July 8, August 21, and December 25, despite the fact that these sets of birthdays are equally likely to be observed by chance. The way that these sets of birthdays differ is that one of them contains an obvious regularity: all four birthdays occur on the same day.

Modeling coincidences

Just as sequences differ in the amount of evidence they provide for having been produced by a random generating process, sets of birthdays differ in how much evidence
they provide for having been produced by a process that contains regularities. We argue that the amount of evidence that an event provides for a regular generating process will correspond to how big a coincidence it seems, and that this can be computed in the same way as for randomness:

    coincidence(x) = log [P(x | regular) / P(x | random)]    (9)

To apply this model we have to define the regularities H. For birthdays, these regularities should correspond to relationships that can exist among dates. Our model of coincidences used a set of regularities that reflected proximity in date (from 1 to 30 days), belonging to the same calendar month, and having the same calendar date (e.g., January 17, March 17, September 17, December 17). We also assumed that each year consists of 12 months of 30 days each. Thus, for a set of n birthdays, X = {x_1, ..., x_n}, we have P(X | random) = (1/360)^n. In defining P(X | regular), we want to respect the fact that regularities among birthdays are still striking even when they are embedded in noise; for instance, February 2, March 26, April 3, June 12, June 12, June 12, June 12, November 22 still provides strong evidence for a regularity in the generating process. To allow the model to tolerate noisy regularities, we can introduce a noise term into P(X | h_i). The probability calculus lets us integrate out unwanted parameters, resulting in adding a numerical free parameter alpha to the model. In particular, P(X | h_i) = the integral from 0 to 1 of P(X | h_i, alpha) P(alpha) d alpha. Assuming that the birthdays in X are independent given h_i and alpha, we take

    P(x_j | h_i, alpha) = alpha/|h_i| + (1 - alpha)/360  if x_j in h_i;  (1 - alpha)/360  otherwise    (10)

Participants rated sets of dates that included 4 birthdays within one week across a month boundary, 4 birthdays in the same calendar month, 4 birthdays with the same calendar dates, and 2 same day, 4 same day, and 4 same date with an additional 4 unrelated birthdays, as well as 4 same week with an additional 2 unrelated birthdays. These dates were delivered in a questionnaire. Each participant was instructed to rate how big a coincidence each set of dates was, using a scale in which 1 denoted no coincidence and 10 denoted a very big coincidence.

The results of the experiment and the model predictions are shown in the top and middle panels of Figure 3, respectively. Again, the ordinal predictions of the model are parameter free, with r_s = 0.94. Applying the transformation y' = (y - min(y))^0.48 gives r = 0.95. The main discrepancies between the model and the data are the four birthdays that occur in the same calendar month, and the ordering of the random dates. The former could be addressed by increasing the prior probability given to the regularity of being in the same calendar month; clearly this was given greater weight by the participants than by the model. Explaining the increase in the judged coincidence with larger sets of unrelated dates is more difficult, but may be a result of opportunistic coincidences: as more dates are provided, participants have more opportunities to identify complex regularities or find dates of personal relevance. This process can be incorporated into the model, at the cost of greater complexity.

[Figure 3: "How big a coincidence?" Ratings (top panel) and model predictions (middle panel) for each set of dates, including 2 in 2 days, 2 same day, 4 same day, 4 same week, 4 same month, 4 same date, versions with additional unrelated birthdays, and 2, 4, 6, and 8 random dates.]
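The coincidence model of Equations (9) and (10) can be sketched as follows, under the text's 360-day calendar. To keep the example short we use a single hypothetical family of regularities, "all birthdays fall on day d", rather than the full set of proximity, same-month, and same-date regularities described above, and we integrate alpha out numerically under a uniform prior; both simplifications are ours.

```python
import math

DAYS = 360  # 12 months of 30 days, as assumed in the text

def likelihood(dates, h, alpha):
    """P(X | h, alpha), Equation (10): each date is drawn from the
    regularity h with probability alpha, otherwise uniformly."""
    p = 1.0
    for d in dates:
        p *= (alpha / len(h) if d in h else 0.0) + (1 - alpha) / DAYS
    return p

def coincidence(dates, regularities, grid=200):
    """coincidence(X) = log [P(X|regular) / P(X|random)], Equation (9),
    with alpha integrated out numerically under a uniform prior."""
    p_random = (1 / DAYS) ** len(dates)
    p_regular = 0.0
    for h in regularities:
        # trapezoidal rule over alpha in [0, 1]
        ys = [likelihood(dates, h, i / grid) for i in range(grid + 1)]
        p_regular += (sum(ys) - 0.5 * (ys[0] + ys[-1])) / grid
    p_regular /= len(regularities)
    return math.log(p_regular / p_random)

def day(month, d):
    """Map a (month, day) pair onto 0..359 in the 12 x 30 calendar."""
    return (month - 1) * 30 + (d - 1)

# One illustrative family of regularities: all birthdays on day d.
same_day = [{d} for d in range(DAYS)]
four_same = [day(10, 4)] * 4                           # four October 4ths
four_spread = [day(5, 14), day(7, 8), day(8, 21), day(12, 25)]
# the four identical birthdays yield far more evidence for a regularity
```

Even though both sets have probability (1/360)^4 under the random process, `coincidence(four_same, same_day)` is large and positive while `coincidence(four_spread, same_day)` is not, matching the intuition that four shared birthdays are the bigger coincidence.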