
The Earth Is Round (p < .05): Rejoinder

Jacob Cohen
Department of Psychology, New York University

I am greatly pleased and thankful to the many readers who responded to my article on null hypothesis significance testing (NHST; Cohen, December 1994). The purpose of the article was to begin a crusade to replace meaningless NHST by placing confidence limits on effect sizes. The responses that came to me were generally positive, and those that raised questions were stimulating.

To those who rushed to the defense of NHST, I concede that there are circumstances in which the direction, not the size of an effect, is central to the purpose of research. An example is a strictly controlled experiment, such as a clinical trial (although even in a clinical trial, nothing is lost and much may be gained with confidence limits). But the ritual of nil hypothesis testing has so dominated our research practice that it has inhibited our interest in the magnitude of the phenomena we study and the units in which they are measured, the basic stuff of which quantitative sciences are made. Parker (1995, this issue) worries about the equality of the units of our measures. The problem with his examples is twofold.

One problem is his presumption that a demonstration of equality of units in some abstract sense is a necessary condition for effect size measurement. Such a demonstration cannot be necessary, as it is not possible. Instead, measurement proceeds "in intimate relation with the empirical-theoretical structure of a scientific field" (Cliff, 1993, p. 61). Such a structure has existed for IQ for many years with no great concern about the equality of IQ units. I wouldn't claim that every IQ unit is equal to every other IQ unit, but I think that, averaged over subjects and IQ units, they are equal enough.

The other problem is that the use of timidity ratings, number of chili peppers eaten, and number of correct identifications in Parker's (1995) examples arise from no web of relations in an empirical-theoretical structure, but are ad hoc measures, adequate only to the task of performing a significance test. Parker is quite right: We gain little (and risk a bellyache) from two to four chili peppers. And that, most emphatically, is the problem. Only by developing measures that psychologists in a given area can agree upon and use in their research can we have meaningful measurement units with which to build a cumulative scientific structure.

McGraw (1995, this issue) asserts that in "purely exploratory research that is unguided by any rigorous theoretical conceptualization and that has no literature to draw upon . . . the prior probability [that H0 is true] is large" (p. 1100). Because I hold that the nil hypothesis is never true, its prior probability is zero. Even if H0 is taken to mean trivially small, there is much persuasive evidence that the crud factor described by Meehl (1990) and Lykken (whom he cited) is likely to ensure that its prior probability is not large.

I used a high prior probability in my schizophrenia example to demonstrate how greatly mistaken one can be when one takes the p value as bearing on the truth of the null hypothesis. Although one cannot, of course, know the size of the crud factor in any given domain, I don't find McGraw's (1995) graph at all reassuring.

Frick's (1995, this issue) comment does not examine the meaning, in the context of confidence intervals, of "95% probable." It means that if I were to repeatedly draw random samples from this population and set up for each sample a 95% confidence interval, my intervals would include the estimated population parameter 95% of the time. In fact, it means that over a lifetime of research, during which I computed many such intervals for different populations and different parameters, I would similarly succeed in including the parameters I was estimating. This procedure in no way posits any null hypothesis.

Incidentally, I do not question the validity of NHST, but rather its widespread misinterpretation. If I reject the null hypothesis at the 5% level, then I can correctly assert that if it were true, I would have obtained results like those in hand less than 5% of the time. I cannot correctly assert that the probability that the null hypothesis is true is less than 5%. And apart from this misinterpretation, there is little point in rejecting the nil hypothesis, which, I repeat, is always false.

Baril and Cannon (1995, this issue) concede that I am correct in asserting that the probability that H0 is true is zero, but they then redefine H0 to mean that the parameter being estimated is trivially small, rather than nil, literally and exactly zero. Furthermore, they concede that reversed conditional probabilities are not equal. They take me to task, however, for using "inappropriate" and "irrelevant" (p. 1099) examples. They complain that in the schizophrenia example, H0 (e.g., being normal), instead of being zero or trivially small, has a high probability, and that it is rarely the case that H0 is certain, as in the Congress example.

My examples were not intended to model NHST as used "in the real world" (Baril & Cannon, 1995, p. 1099), but rather to demonstrate how wrong one can be when the logic of NHST is violated. I must point out that H0 is generally a hypothetical statement of fact that is to be assessed using the rules of logic and is neither small (as in the nil hypothesis, which they seem to be assuming) nor large (as they take my Congress example to be). R. A. Fisher (1951) dubbed it the null hypothesis because it was the hypothesis to be nullified.

I must say that I am surprised by how much resistance I encounter to using confidence limits. Confidence limits not only tell you the status of the null (or nil) hypothesis, but also give you an idea of just how big the effect is. Without them, we relegate our conclusions to the form, in Tukey's (1969, p. 86) immortal phrase, "if you pull on it, it gets longer!"

REFERENCES

Baril, G. L., & Cannon, J. T. (1995). What is the probability that null hypothesis testing is meaningless? American Psychologist, 50, 1098-1099.

Cliff, N. (1993). What is and what isn't measurement. In G. Keren & C. Lewis (Eds.), A handbook for data analysis in the behavioral sciences: Methodological issues (pp. 59-93). Hillsdale, NJ: Erlbaum.

Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.

Fisher, R. A. (1951). Statistical methods for research workers. Edinburgh, Scotland: Oliver & Boyd. (Original work published 1925)

Frick, R. W. (1995). A problem with confidence intervals. American Psychologist, 50, 1102-1103.

McGraw, K. O. (1995). Determining false alarm rates in null hypothesis testing research. American Psychologist, 50, 1099-1100.

Meehl, P. E. (1990). Why summaries of research on psychological theories are often uninterpretable. Psychological Reports, 66 (Monograph Supplement 1-V66), 195-244.

Parker, S. (1995). The "difference of means" may not be the "effect size." American Psychologist, 50, 1101-1102.

Tukey, J. W. (1969). Analyzing data: Sanctification or detective work? American Psychologist, 24, 83-91.

December 1995 American Psychologist 1103
