10 1016@j Jbusres 2012 02 023

Journal of Business Research 66 (2013) 1261–1266
Contents lists available at SciVerse ScienceDirect
Journal of Business Research
Is two-tailed testing for directional research hypotheses tests legitimate?☆

Hyun-Chul Cho a, 1, Shuzo Abe b,⁎
a
Hanyang University, Republic of Korea
b
Waseda University, Japan
a r t i c l e i n f o a b s t r a c t
Article history: This paper demonstrates that there is currently a widespread misuse of two-tailed testing for directional re-
Received 1 April 2011 search hypotheses tests. One probable reason for this overuse of two-tailed testing is the seemingly valid be-
Received in revised form 1 September 2011 liefs that two-tailed testing is more conservative and safer than one-tailed testing. However, the authors
Accepted 1 November 2011
examine the legitimacy of this notion and find it to be flawed. A second and more fundamental cause of
Available online 29 February 2012
the current problem is the pervasive oversight in making a clear distinction between the research hypothesis
Keywords:
and the statistical hypothesis. Based upon the explicated, sound relationship between the research and sta-
Hypothesis testing tistical hypotheses, the authors propose a new scheme of hypothesis classification to facilitate and clarify
One-tailed testing the proper use of statistical hypothesis testing in empirical research.
Two-tailed testing © 2012 Elsevier Inc. All rights reserved.
Statistical hypothesis
Research hypothesis in existential form
Research hypothesis in non-existential form
1. Introduction (2) to explicate why it slows down the process of rigorous knowledge
building in the fields of marketing and business, and (3) to propose a
Standard textbooks on statistics clearly state that non-directional new scheme of hypothesis classification to eliminate such unnoticed
research hypotheses should be tested using two-tailed testing while confusion.
one-tailed testing is appropriate for testing directional research hy- There are three probable reasons for this widespread overuse of
potheses (e.g., Churchill & Iacobucci, 2002, p. 660; Pfaffenberger & two-tailed testing for directional research hypotheses tests. One rea-
Patterson, 1987, pp. 403–569). However, during the actual conduct son is technical in nature; quite often, the currently available software
of statistical testing, this advice is not often heeded. According to programs of standard statistical tools, such as EQS and AMOS in SEM
our observation of 492 recent empirical articles that have used struc- and regression analysis in SPSS, only provide two-tailed p values with
tural equation modeling (SEM), regression analysis, and analysis of regard to the t tests on the free parameters to be tested. When ANOVA
variance (ANOVA) in five selected marketing research-related jour- is used in experimental research, the significance level on the upper
nals, the Journal of Marketing, Journal of Marketing Research, Marketing tail in the F-table looks like one-tailed testing, but what it actually
Science, Journal of Consumer Research, and Advances in Consumer Re- provides is “right-sided” two-tailed testing (Kaiser, 1960). It is then
search (2001–2005), there were 2703 (N = 2703) research hypothe- understandable that busy researchers who may be unaware of these
ses in total. Overall, 90.9% (n = 2458) of them are expressed in facts only report the p values provided by their statistical software
directional form, but only 9.1% (n = 245) of them are described in without considering the difference of p values between one-tailed
non-directional form. and two-tailed testing.
However, to test directional research hypotheses, 74.8% (n= 368) of The second reason is that some marketing researchers may think
the articles used two-tailed testing while only 10.6% (n= 52) of the ar- that when using two-tailed testing, they can be more rigorous or con-
ticles used one-tailed testing, and 14.6% (n= 72) of them were non- servative in their empirical research. This conception is thought to be
classifiable. The purposes of this article are (1) to investigate why this influenced by arguments for substituting two-tailed testing for one-
practice, which runs counter to standard textbook advice, is happening, tailed testing in other fields of social sciences, mainly in psychology
(e.g., Braver, 1975; Howell, 2007, pp. 98–100; Kimmel, 1957).
The third reason is that a number of researchers have not paid
☆ The authors thank Adam Turner at Hanyang University and Richard P. Bagozzi at enough attention to making a clear distinction and a logical linkage be-
the University of Michigan for their helpful comments. tween the research hypothesis (RH) (i.e., RH expressed by a verbal
⁎ Corresponding author at: Faculty of Commerce, Waseda University, 1-6-1 Nishiwaseda, statement about some testable relationship between concepts) and its
Shinjyuku-ku, Tokyo 169-8050, Japan.
E-mail addresses: chohyunc@hanyang.ac.kr (H.-C. Cho), shuzo@waseda.jp (S. Abe).
statistical hypothesis (i.e., H0 and H1 expressed by a pair of complemen-
1
Department of Business Administration, Hanyang University (ERICA Campus), tary parameters). They may not be aware of the difference between the
1271 Sa-3-dong, Sangrok-gu, Ansan-si, Gyeonggi-do, 425-791, Republic of Korea. two expressive modes or aware of the logical flow in which the research
0148-2963/$ – see front matter © 2012 Elsevier Inc. All rights reserved.
doi:10.1016/j.jbusres.2012.02.023
1262 H.-C. Cho, S. Abe / Journal of Business Research 66 (2013) 1261–1266
hypothesis should be translated into the statistical hypothesis. A then it is sufficient to use a more rigid level of significance within
convenience-oriented mixing of the research and statistical hypotheses one-tailed testing rather than resorting to inaccurate two-tailed test-
may be a cause of the overuse of two-tailed testing even in situations ing. In this case, one can achieve both objectives simultaneously. In
where the use of two-tailed testing is inappropriate. this sense, because one-tailed testing is more liberal than two-tailed
The first reason described above is directly accountable for the testing at a given level of significance, if one wants to maintain the
current overuse of two-tailed testing. However, it rests on assumptions same rigorousness between them in t- or F-testing, then it is enough
that favor two-tailed testing, which we find problematic. Therefore, it is to halve both the two-tailed p value and the relevant two-tailed α in
necessary to examine if the arguments in support of two-tailed testing the “p b α criterion” (e.g., “p = 0.09: p b 0.10” for inexact two-tailed
are valid. The authors believe, however, that the fundamental cause of testing, however, “p = 0.045: p b 0.05” for exact one-tailed testing).
the current problems lies in the third category: oversight of the proper In particular, we need to recall the technical factor that all the calcu-
sequence of hypothesis formation. As long as researchers lack clarity lated p values in F tests are always two-tailed p values (Kaiser, 1960)
and rigor in defining the logical relationships between the research hy- even though they are located on the upper tales in the F distributions.
pothesis and the statistical hypothesis, the current practice will prevail. Therefore, for directional decisions in F tests using planned contrasts,
Based upon careful examination of these points, this paper proposes the two-tailed p values should be halved. This technical factor may
new definitions of hypothesis to prevent these misconceptions that appear to be a trivial matter; however, when one scrutinizes the
weaken and mislead scientific methods in empirical research. true impact it may bring about in the current and future research in
marketing and business, its meaning is more than significant. If re-
2. Examinations of reasons in favor of two-tailed testing searchers do not pay enough attention to the exactness in making
their judgments with regard to their empirical testing results, it
According to the proponents of two-tailed testing, two major would lead to inaccurate or mistaken empirical conclusions in each
reasons support the preference for two-tailed to one-tailed testing individual study, and in the long run, it would have cumulative, im-
(for a more detailed discussion, see Braver, 1975; Harris, 1995, measurable, negative effects on knowledge building.
pp. 248–251; Howell, 2007, pp. 98–100; Kimmel, 1957; Pillemer, Here, it is important to note that the authors of the present study
1991). First, by substituting two-tailed testing for one-tailed testing are not advocating looser criteria with one-tailed testing; what is
at a given level of significance α, researchers subsequently come to use needed is to grasp the exact statistics in interpreting statistical testing
a larger magnitude of the critical value and can conduct a more conser- results. Being inexact under the name of being conservative is not rec-
vative, rigorous test. Second, researchers occasionally find cases where ommendable. In that sense, the authors think that merely reporting
the sign of the parameter to be tested turn out to be significant in the testing results such as “It was found to be statistically significant at
opposite direction from the one anticipated. Using two-tailed testing the 0.05 level of significance (i.e., p b 0.05).” is not good enough. Re-
provides researchers with a “safeguard” against this unexpected result. searchers need to report the exact p values, confidence intervals (if
necessary), and effect sizes, such as correlation coefficients and the
2.1. Is two-tailed testing a more rigorous test? mean differences to facilitate knowledge accumulation in their field
(Board of Scientific Affairs, 1996). In the field of psychology, a con-
With regard to the first cause mentioned above, two-tailed testing tinuing controversy (Harlow, Mulaik, & Steiger, 1997; Nickerson,
does provide a more conservative, rigid test at a given α level. However, 2000) has been revolving around the limitations of null hypothesis
researchers need to recognize that substituting two-tailed testing for significance testing.
one-tailed testing leads to a lack of logical consistency and an inaccurate
or mistaken empirical conclusion at a given level of significance α. It is 2.2. Can two-tailed testing be a safeguard against an unexpected result?
worth noting that statistical testing is just one part of empirical testing
for acquiring scientific knowledge. Research hypotheses are usually Pertaining to the second concern, at first glance there appears to
derived from a “theoretical framework or model” (e.g., Chandran & be some value in finding out the significance in the opposite direction
Morwitz, 2005; Selnes & Sallis, 2003) or from a “conceptual background with two-tailed testing. For some proponents of two-tailed testing
or model” (e.g., Slotegraaf & Inman, 2004; Smeesters & Mandel, (e.g., Burke, 1953), this seems to be the single most important reason
2006). The directionality or non-directionality of the research hy- to support two-tailed testing, even if it creates a logically decoupled
pothesis is usually determined by such theoretical grounds. In turn, state between the directional research hypothesis and its statistical
the directionality/non-directionality of each research hypothesis hypothesis. In this case, however, the role of statistical testing is sim-
determines the directionality/non-directionality of its alternative ple and clear: it is to provide supportive or non-supportive evidence
hypothesis. to the directional research hypothesis (e.g., RH1: … has a positive in-
For example, if a research hypothesis presumes a positive relation- fluence on…) and its underlying theory. The research hypothesis
ship between two constructs, then right-tailed testing is appropriate; RH1 is supported when the calculated value of the parameter is posi-
however, if the relationship is a negative one, then left-tailed testing tive and statistically significant by right-tailed testing. Otherwise,
is undertaken. Because two-tailed testing, by its very nature, does even in the case where the test result is significant in the opposite di-
not reflect the directionality of the research hypothesis, it is apparent rection, RH1 is not supported (e.g., Piamphongsant & Mandhachitara,
that it loses the exact logical connection between the directional re- 2008; Shukla, 2011; Srite, Galvin, Ahuja, & Karahanna, 2007). This re-
search hypothesis and its statistical hypothesis. Here, it must be em- sult is because the researchers failed to acquire supporting evidence
phasized that the whole process of empirical theory testing must be for the RH1 and its underlying theory. The latter corresponds to
done logically and consistently as follows: (1) the deduction of the re- cases where the calculated value of the parameter is positive but
search hypothesis from the theory, (2) the exact translation of the re- not significant, or the calculated value is negative regardless of its sig-
search hypothesis into its statistical hypothesis, (3) conducting the nificance. Therefore, to empirically negate the directional research
test of the statistical hypothesis by sample data, (4) supporting (or hypothesis RH1, it is sufficient to find out whether the calculated
not supporting) the research hypothesis, and (5) finally, supporting value is insignificant in the hypothesized direction. This process can
(or not supporting) the theory. Thus, preserving statistical conserva- be exemplified in the following action-related example. A firm is con-
tiveness at the expense of logical exactness is not a well-conceived templating the test of a new product X, which is expected to outsell
approach. In other words, researchers need to use an approach that the existing product Y. If the market test data indicate that the perfor-
is both rigorous and logically consistent. If one wants to achieve a mance of X is not significantly better than that of Y, then the new
more rigorous test result when testing a directional hypothesis, product X should not be introduced. Checking if the performance of
H.-C. Cho, S. Abe / Journal of Business Research 66 (2013) 1261–1266 1263
X is significantly worse than that of Y is unnecessary and does not is not in accord with this basic rule. The authors believe that the cur-
bring about any subsequent change in the course of action (Kimmel, rent overuse of two-tailed testing stems from the favorable views on
1957, criterion 2). two-tailed testing and from researchers’ blindness because they are
Two other reasons presented by the proponents of two-tailed unaware that they are breaking this basic rule. Let us now see how
testing are that finding such unexpected results may benefit the this rule, which dictates the flow from the research hypothesis to
development of theories in the field (Burke, 1953) and that in social the statistical hypothesis, is often violated in the fields of marketing,
sciences, it is rare to find a well-formulated theory to deduce research business, and other social sciences.
hypotheses that accompany stable predictions (Burke, 1953; Pillemer, The following type of hypothesis expression appears frequently in
1991). The authors cannot completely deny the points in this line a large amount of the methodology literature.
of argument, but it also contains weaknesses. Although it is very
important to thoroughly investigate the true cause of an unanticipated H1. Males self-disclose more than females.
opposite result, such as possible research-design or theory deficiency,
it is worth noting that it should be regarded as a matter to be considered H0. There is no difference between males and females with respect to
within the discussion section of an empirical research paper. Finding an self-disclosure (italics added) (Frey, Botan, & Kreps, 2000, p. 326).
unexpected result and using it for the development of a newly revised
It is clear that this example contains definitional confusion be-
theory belongs to the context of discovery. It should be handled
tween the expressive mode of the research hypothesis (RH expressed
separately from the context of justification, which is theory testing. As
by a verbal statement about some testable relationship between con-
far as the statistical test of the directional research hypothesis is con-
cepts) and that of the statistical hypothesis (H0 and H1 expressed by a
cerned, it should be conducted in the most logical and systematic way
pair of complementary parameters). There is neither a clear distinc-
possible. Bringing the exploratory research purpose into the hypothesis
tion nor an exact linkage between the research hypothesis, RH, and
testing process and making it a necessary routine procedure should be
its statistical hypothesis, H0 and H1. “RH per se” is identified with
avoided. In this case, researchers should apply a fundamental principle
the alternative hypothesis H1, and the “opposite of RH” is identified
of contemporary scientific philosophy; that is, researchers should not
with the null hypothesis H0. In addition, the verbally expressed “H0:
mix up the context of justification with the context of discovery
…no difference…” and the verbally expressed “H1: …more than…”
(Hunt, 1991, pp. 22–26).
are not complementary to each other. Under such confusion, (1) sta-
For the present stage of theory development in the fields of market-
tistical testing would be misunderstood as a direct test between “H0:
ing and business, it is necessary to admit that we have not yet reached
the opposite of RH” and “H1: RH per se,” (2) because the verbally
the stage of having a core of well-established theories. However, this
expressed H0 involves only a “no difference (i.e., =) sign,” it is likely
awareness does not mean that marketing and business researchers
to mislead us into believing that two-tailed testing is applicable, and
should become less theory-oriented. Although the theories we are
(3) the verbally expressed null hypothesis has been frequently mis-
testing may not be well-established ones, we should deduce research
understood as the “object of falsification” (e.g., Ledford, 2001; Miller
hypotheses as rigidly as possible and follow the most logical steps in
& Fox, 2001) or as the “object of strong inference” (e.g., Brinberg,
testing them. Empirical tests conducted under the light of “theory”
Lynch, & Sawyer, 1992; Ruscio, 1999), although the “straw-person”
(e.g., Anderson, 1983; Hunt, 1991, p. 168), rather than through reasons
null hypothesis has no basis in theory. One should remember that
of convenience, will contribute better to the sound development of
“falsification (Popper, 1959)” is Popper's alleged basic nature of scien-
scientific knowledge. One cannot overemphasize the vital role of
tific knowledge and that “strong inference (Platt, 1964)” is a process
the research hypothesis as well as its underlying theory, however
of pitting competing research hypotheses against each other where
naïve the stage of the theory may be, in conducting statistical testing.
each is based on a different theory.
It is the underlying “naïve” theory of the research hypothesis that
However, if one clearly distinguishes the expressive mode of the
provides the logical reason why construct A positively or negatively
research hypothesis from that of the statistical hypothesis, then the
influences construct B. The “naïve” theory provides us with the starting
exact logical linkage between the research hypothesis (i.e., RH) and
point for the future development of better theories. At this point, some
its statistical hypothesis (i.e., H0 and H1) can be made as follows:
researchers (e.g., Burke, 1953) may argue that one-tailed testing should
be applied only when the situation accompanies absolute reasons to
RH. “Males self-disclose more than females.”
believe a particular result. However, this notion is also untenable
because a research hypothesis is inferential in its very nature. The true H0. μMale ≤ μFemale [the negation of the assertion]
purpose of statistical testing is to test the statistical validity of that
inference. H1. μMale > μFemale [the assertion]
The impact of properly using one-tailed testing would be tremen-
dous. Researchers could avoid making inaccurate or mistaken empirical Evidently, because the null hypothesis, “H0: μMale ≤ μFemale,”
conclusions at a given level of significance α (e.g., “p = 0.08: p > 0.05” includes the “≤” sign, there is no way to apply two-tailed testing
for inexact two-tailed testing, however, “p = 0.04: p b 0.05” for exact to it. It is worth noting that statistical testing is the test between
one-tailed testing). With a given sample size, the result of one-tailed “H0: μMale ≤ μFemale” and “H1: μMale > μFemale”, and this test result
testing would provide more accurate information whether a directional should be used as indirect evidence for supporting or not-supporting
research hypothesis is supported or not. RH; that is, “rejecting/not-rejecting the ‘straw-person’ H0” leads to
“accepting/not-accepting the ‘substantive’ H1” and this result, in turn,
3. The confusion between the research and statistical hypotheses leads to “supporting/not-supporting RH,” respectively (e.g., Kerlinger
& Lee, 2000, p. 279).
3.1. Prevailing confusion Clarifying the difference between the research and statistical hy-
potheses is important; it prevents us from classificatory confusion
The previous section has shown that the reasons for supporting caused by the oversight of the coexistence of two different types of
two-tailed testing for directional research hypotheses tests are not research hypotheses. Almost all the research hypotheses derived
as legitimate as they first appeared. Additionally, the previous argu- from theories can be classified into two types: the research hypothe-
ments have made it clear that, because the directional statistical hy- sis in existential form (RHEF) and the research hypothesis in non-
pothesis should be derived from the directional research hypothesis, existential form (RHNF). The former RHEF, which is most common,
using two-tailed testing to test this directional research hypothesis involves verbal assertions about the existence of some testable
relationships between concepts, such as “has a negative influence on” and then moving back to specifying the research hypothesis and, as a
(e.g., Bagozzi, 1980), “is positively related to” (e.g., Loken & Ward, consequence, overlooking a certain category of the research hypothesis
1990), or “is greater than” (e.g., Biehal & Sheinin, 2007). However, (i.e., RHNF), which should be handled with additional carefulness. Un-
the latter RHNF is occasionally set up as a research hypothesis. It usu- less researchers make a clear distinction between the research and sta-
ally includes verbal assertions about the nonexistence of some testable tistical hypotheses and observe the correct relationships between them,
relationship between concepts, such as “has no influence on” (e.g., less rigorous use of statistical testing is hard to eliminate. Thus, while it
Bagozzi, 1980), “is unrelated to” (e.g., Loken & Ward, 1990), or is less convenient and looks somewhat redundant, it is important to fol-
“there is no difference” (e.g., Biehal & Sheinin, 2007). Unfortunately, low the orderly sequence of the empirical testing procedure even when
RHEF has been frequently misunderstood as the alternative hypothe- researchers use statistical null hypothesis testing methods.
sis and RHNF as the null hypothesis (e.g., Balachander & Ghose, 2003). It may be thought that making a clear distinction between the re-
In translating RHEF or RHNF into its own statistical hypothesis, it is search hypothesis and the statistical hypothesis is simple enough. Al-
important to note the fundamental difference, as in the following though it may look simple at a glance, it is not that easy, because the
example: problem lies between two different levels of methodology. Generally
speaking, there are three levels of methodology. The uppermost level
RH1. “Regret has no influence on complaint intentions, whereas is that of scientific philosophy, in which the basic questions addressed
satisfaction has a negative influence on complaint intentions” (Tsiros are “How can we draw a demarcation line between scientific knowl-
& Mittal, 2000, H6; this is a good example involving both RHEF and edge and all the other types of knowledge?” or “How can we assure
RHNF). that scientific knowledge is making an advance?” The second level
of methodology is concerned with the development and test of a spe-
RH1a. “Regret has no influence on complaint intentions” (italics added).
cific theory. It is related to logical and empirical validity of a theory.
[RHNF]
Finally, the third level of methodology is more technical in nature.
H0. γ11 = 0 (the assertion) Generally, it is related to how to conduct statistical testing. All three
levels are important, but no less important are the interrelationships
H1. γ11 ≠ 0 (the negation of the assertion) between and among the three levels. This article mainly discusses the
relationships between the second and third levels of methodology
RH1b. “Satisfaction has a negative influence on complaint intentions” and asserts that the second level of methodological decision dictates
(italics added). [RHEF] the lower level consideration. Some researchers who pay attention
H0. γ21 ≥ 0 (the negation of the assertion) to the third level may miss the hierarchical relationships between
and among the three levels of the methodology.
H1. γ21 b 0 (the assertion)
3.2. Two legitimate situations for two-tailed testing
When translating each of the two types of the research hypothesis
into its statistical hypothesis, as statisticians commonly note, the null This paper does not propose the elimination of two-tailed testing
hypothesis H0 and the alternative hypothesis H1 should be mutually in the context of theory testing because there are two different
exclusive and collectively exhaustive (e.g., Casella & Berger, 1990, p. types of situations where two-tailed testing is appropriate. First,
345); an equality sign (i.e., =, ≤, or ≥) should always appear in the when researchers do not have sufficient level of knowledge that pro-
null hypothesis, H0 (e.g., Harnett & Soni, 1991, p. 331). The above ex- vides for the directionality of the research hypotheses, the research
ample clearly shows a distinction and a logical linkage between the hypotheses may take the form of non-directional ones and the subse-
expressive mode of the research hypothesis (i.e., RHEF and RHNF) quent use of two-tailed testing is appropriate (e.g., Gravetter &
and that of the statistical hypothesis (i.e., each pair of H0 and H1). Wallnau, 2007, p. 253). Second, two-tailed testing should be used
Note that in the case of translating RHEF, such as RH1b, into its statis- when researchers set up non-directional research hypotheses (i.e.,
tical hypothesis, researchers should place the assertion in the alterna- RHNF) pertaining to the nonexistence of some testable relationship
tive hypothesis H1; however, when translating RHNF, such as RH1a, between concepts, such as “has no influence on” (e.g., Bagozzi,
into its statistical hypothesis, the assertion should be put in the null 1980), “has no relation to” (e.g., Loken & Ward, 1990), or “is no differ-
hypothesis H0, rather than in the alternative hypothesis H1. Note ence” (e.g., Biehal & Sheinin, 2007).
that this logic of proof to accept H0 in RHNF is weak. As Wonnacott
and Wonnacott (1984, p. 277) note, “Statistical testing does not pro- 3.3. Four new definitions regarding research and statistical hypotheses
vide a formal criterion to accept the null hypothesis.” If the test result
that the null hypothesis H0 of RHNF is accepted, it does not guarantee Now, based upon a clear distinction and upon an exact connection
that H0 is true. All one can do is adopt a smaller confidence interval so between the expressive mode of the research hypothesis (i.e., RHEF
that it includes the value assumed by H0 (frequently 0, but not always and RHNF) and that of the statistical hypothesis (i.e., H0 and H1),
0), then one can have stronger evidence for the null hypothesis (Frick, the authors propose the following four new definitions with regard
1995; Nickerson, 2000). Researchers need to prepare a larger sample to the research hypothesis and the statistical hypothesis. These four
size to achieve a small confidence interval. definitions are meant to remove the prevailing confusion in defini-
The above examinations of the current definitional and classificatory tional and classificatory hypothesis use associated with the process
confusion with regard to the research and statistical hypotheses have of empirical hypothesis testing in marketing and business research.
shown that without a clear distinction between the research hypothesis They provide marketing and business researchers not only with a
and the statistical hypothesis, the basic rule of empirical testing where new scheme of hypothesis classification but also with clear guidelines
the two expressive modes of the research hypothesis (i.e., RHEF and for the orderly setting up of the statistical hypothesis, H0 and H1.
RHNF) determine the form of the statistical hypothesis is easily violated.
As a result, a less rigorous form of statistical testing, such as adopting Definition 1. The research hypothesis derived from theory and/or
two-tailed testing for a directional research hypothesis test, has become extended observation is a verbal statement about the assertion of
a common practice. The current confusion involves other forms of mis- some testable relationship between two or more concepts.
understanding as well, such as starting from the use of logic of disproof
(i.e., proof by contradiction) as the basis for setting up the null and alter- Definition 2. The statistical hypothesis is the assertion and its negation
native hypotheses (e.g., Anderson, Sweeney, & Williams, 1999, p. 330) that are expressed by a pair of mutually exclusive and collectively
H.-C. Cho, S. Abe / Journal of Business Research 66 (2013) 1261–1266 1265
exhaustive parameters or statistical terms about some characteristic research hypothesis and the statistical hypothesis. To prevent such
of a population distribution. Between the assertion and its negation confusion, the authors have proposed to make a clearer distinction
described by a pair of complementary parameters or statistical and an exact connection between the expressive mode of the re-
terms, the part involving an equality sign (i.e., =, ≤, or ≥) always search hypothesis (i.e., RHEF or RHNF expressed by a verbal state-
appears in the null hypothesis H0, and the other part not involving ment about some testable relationship between concepts) and that
an equality sign always appears in the alternative hypothesis H1. of the statistical hypothesis (i.e., H0 and H1 expressed by a pair of
complementary parameters). Additionally, this article has shown
Definition 3. The expressive mode of the research hypothesis con- the two legitimate situations where two-tailed testing is to be used:
sists of the research hypothesis in existential form (i.e., RHEF) and for every non-directional RHEF, such as “RH2: …has an effect on…”,
the research hypothesis in non-existential form (i.e., RHNF). RHEF is and for all non-directional RHNF, such as “RH3: … has no relation
a research hypothesis that involves the assertion about the existence to….”
of some testable relationship between concepts, such as “has a posi- Finally, based upon the foregoing arguments, the authors have
tive effect on,” “has a negative relation to,” or “there is a difference.” proposed a new scheme of hypothesis classification called “four
On the other hand, RHNF is a research hypothesis that includes the new definitions of the research and statistical hypotheses” to more ef-
assertion about the nonexistence of some testable relationship be- fectively contribute to scientific knowledge development in the fields
tween concepts, such as “has no effect on,” “has no relation to,” or of marketing and business.
“there is no difference.” In rare cases, if the assertion has two forms
together, such as “has no influence or negative influence on,” then it References
is classified as RHNF.
Anderson, P. F. (1983, fall). Marketing, scientific progress, and scientific method. Journal
Definition 4. The research hypothesis should be translated into the of Marketing, 47, 18–31.
Anderson, D. R., Sweeney, D. J., & Williams, T. A. (1999). Statistics for business and
statistical hypothesis. In the case of translating RHEF into its statistical
economics (7th ed.). Cincinnati, OH: South-Western.
hypothesis H0 and H1, researchers should put the negation of the as- Bagozzi, R. P. (1980, spring). Performance and satisfaction in an industrial sales force:
sertion expressed by the parameter(s) or statistical terms in the null An examination of their antecedents and simultaneity. Journal of Marketing, 44,
65–77.
hypothesis H0, and the assertion described by the parameter(s) or
Balachander, S., & Ghose, S. (2003, Jan.). Reciprocal spillover effects: A strategic benefit
statistical terms in the alternative hypothesis H1. However, when of brand extensions. Journal of Marketing, 67, 4–13.
translating RHNF into its statistical hypothesis H0 and H1, researchers Biehal, G. J., & Sheinin, D. A. (2007, April). The influence of corporate messages on the
should put the assertion expressed by the parameter(s) or statistical product portfolio. Journal of Marketing, 71, 12–25.
Board of Scientific Affairs (1996). Task force on statistical inference initial report. American
terms in the null hypothesis H0 and the negation of the assertion de- Psychological Association http://www.apa.org/science/leadership/bsa/statistical/
scribed by the parameter(s) or statistical terms in the alternative hy- tfsi-initial-report.pdf
pothesis H1. Braver, S. L. (1975). On splitting the tails unequally: A new perspective on one- versus
two-tailed tests. Educational and Psychological Measurement, 35(2), 283–301.
Brinberg, D., Lynch, J. G., Jr., & Sawyer, A. G. (1992, Sep.). Hypothesized and confounded
explanations in theory tests: A Bayesian analysis. Journal of Consumer Research, 19,
4. Conclusions 139–154.
Burke, C. J. (1953). A brief note on one-tailed tests. Psychological Bulletin, 50(5),
According to the authors’ observation of 2703 research hypotheses 384–387.
Casella, G., & Berger, R. L. (1990). Statistical inference. Pacific Grove, CA: Wadsworth &
in JM, JMR, MS, JCR, and ACR (2001–2005), 90.9% (n = 2458) are judged Brooks/Cole.
to be directional research hypotheses. Of the 492 articles that contain Chandran, S., & Morwitz, V. G. (2005, Sep.). Effects of participative pricing on consumers’
the directional research hypotheses, 368 articles (74.8%) use two- cognitions and actions: A goal theoretic perspective. Journal of Consumer Research,
32, 249–259.
tailed testing, 52 articles (10.6%) use one-tailed testing, and the Churchill, G. A., Jr., & Iacobucci, D. (2002). Marketing research: Methodological foundations
remaining 72 articles (14.6%) were non-classifiable. The authors (8th ed.). Fort Worth, TX: Harcourt College Publishers.
have identified three categories of reasons that caused this over- Frey, L. R., Botan, C. H., & Kreps, G. L. (2000). Investigating communication: An introduction
to research methods (2nd ed.). Needham, Heights, MA: Allyn & Bacon.
whelming use of two-tailed testing for directional research hypothe- Frick, R. W. (1995). Accepting the null hypothesis. Memory & Cognition, 23(1), 132–138.
ses. One category of reasons is researchers’ oversight of the currently Gravetter, F. J., & Wallnau, L. B. (2007). Statistics for the behavioral sciences (7th ed.).
available p values in t- or F tests usually being those for two-tailed Belmont, CA: Thomson Wadsworth.
Harlow, L. L., Mulaik, S. A., & Steiger, J. H. (1997). What if there were no significance tests?
testing. For correct directional decisions, we have noted that the
Mahwah, NJ: Lawrence Erlbaum Associates, Publishing.
two-tailed p values should be halved. Otherwise, researchers are likely Harnett, D. L., & Soni, A. K. (1991). Statistical methods for business and economics. Reading,
to draw inaccurate or mistaken empirical conclusions at a given level MA: Addison-Wesley.
of significance α (e.g., “p = 0.08: p > 0.05” for inexact two-tailed test- Harris, M. B. (1995). Basic statistics for behavioral science research. Needham Heights,
MA: Allyn & Bacon.
ing, however, “p = 0.04: p b 0.05” for exact one-tailed testing). Howell, D. C. (2007). Statistical methods for psychology (6th ed.). Belmont, CA: Thomson
The second category of reasons is the notion that two-tailed test- Wadsworth.
ing is scientifically more conservative and safer. This paper's conclu- Hunt, S. D. (1991). Modern marketing theory: Critical issues in the philosophy of marketing
science. Cincinnati, OH: South-Western Publishing Co.
sion is that the arguments made by the proponents for two-tailed Kaiser, H. F. (1960, May). Directional statistical decisions. Psychological Review, 67,
testing are not legitimate. Although two-tailed testing is more conser- 160–167.
vative in theory, it decouples the link between the directional re- Kerlinger, F. N., & Lee, H. B. (2000). Foundations of behavioral research (4th ed.). Fort
Worth, TX: Harcourt College Publishers.
search hypothesis and its statistical hypothesis, possibly leading to Kimmel, H. D. (1957). Three criteria for the use of one-tailed tests. Psychological Bulletin,
doubly inflated p values. The authors have also shown that the argu- 54(4), 351–353.
ment for finding the significant result in the opposite direction has Ledford, S. K. (2001). The implications of Kumho tire: Applying Daubert analysis to
warning-label testimony in products liability cases. Indiana Law, 76, 465–483.
meaning only in the context of discovery rather than in the context Loken, B., & Ward, J. (1990, Sep.). Alternative approaches to understanding the
of justification. In the case of testing the research hypothesis and its determinants of typicality. Journal of Consumer Research, 17, 111–126.
underlying theory, researchers should not simultaneously address Miller, H. T., & Fox, C. J. (2001). The epistemic community. Administration & Society,
32(6), 668–685.
the context of discovery and that of justification. The authors have
Nickerson, R. S. (2000). Null hypothesis significance testing: A review of an old and
also shown that no matter how naïve the present stage theories continuing controversy. Psychological Methods, 5(2), 241–301.
might be, theory-oriented appraisal of the collected data remains Pfaffenberger, R. C., & Patterson, J. P. (1987). Statistical methods: For business and
important. economics (3rd ed.). Homewood, IL: Irwin.
Piamphongsant, T., & Mandhachitara, R. (2008). Psychological antecedents of career
The third and most fundamental category of reasons is that there women's fashion clothing conformity. Journal of Fashion Marketing and Management,
is frequently no clear distinction and no exact linkage between the 12(4), 438–455.
Pillemer, D. B. (1991). One- versus two-tailed hypothesis tests in contemporary Slotegraaf, R. J., & Inman, J. J. (2004, Aug.). Longitudinal shifts in the drivers of satisfaction
educational research. Educational Researcher, 20(9), 13–17. with product quality: The role of attribute resolvability. Journal of Marketing Research,
Platt, J. R. (1964). Strong inference. Science, 146(3642), 347–353. 41, 269–280.
Popper, K. R. (1959). The logic of scientific discovery. New York: Harper and Row Smeesters, D., & Mandel, N. (2006, March). Positive and negative media image effects
(Original work published 1934 in German). on the self. Journal of Consumer Research, 32, 576–582.
Ruscio, J. (1999). Statistical models and strong inference in social judgment research. Srite, M., Galvin, J. E., Ahuja, M. K., & Karahanna, E. (2007). Effects of individuals’
Psycoloquy, 10(27) social bias 17. psychological states on their satisfaction with the GSS process. Information
Selnes, F., & Sallis, J. (2003, July). Promoting relationship learning. Journal of Marketing, Management, 44, 535–546.
67, 80–95. Tsiros, M., & Mittal, V. (2000, March). Regret: A model of its antecedents and
Shukla, P. (2011). Impact of interpersonal influences, brand origin and brand image consequences in consumer decision. Journal of Consumer Research, 26, 401–417.
on luxury purchase intentions: Measuring interfunctional interactions and a Wonnacott, T. H., & Wonnacott, R. J. (1984). Introductory statistics for business and
cross-national comparison. Journal of World Business, 46(2), 242–252. economics (3rd ed.). New York: John Wiley & Sons.

10 1016@j Jbusres 2012 02 023

Caricato da

Informazioni sul documento

Titolo originale

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

10 1016@j Jbusres 2012 02 023

Caricato da

Copyright:

Formati disponibili

Journal of Business Research 66 (2013) 1261–1266

Contents lists available at SciVerse ScienceDirect

Journal of Business Research

Is two-tailed testing for directional research hypotheses tests legitimate?☆

Potrebbero piacerti anche