Statistical significance
History
Statistical significance dates to the 1700s,
in the work of John Arbuthnot and Pierre-
Simon Laplace, who computed the p-value
for the human sex ratio at birth, assuming
a null hypothesis of equal probability of
male and female births; see p-value
§ History for details.[18][19][20][21][22][23][24]
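The calculation attributed to Arbuthnot above amounts to a one-sided sign test. The sketch below is a minimal illustration, not a reconstruction of his working. The function name is hypothetical, and the input of 82 consecutive years with more male than female births is the figure commonly reported for his London records, not something stated in this section.

```python
from fractions import Fraction
from math import comb

def sign_test_p_value(successes, trials):
    """One-sided p-value: P(X >= successes) for X ~ Binomial(trials, 1/2)."""
    # Count the outcomes at least as extreme as the one observed,
    # then divide by the 2**trials equally likely outcomes under the null.
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return Fraction(tail, 2 ** trials)

# Assumed input: 82 years, with male births in excess in every one of them.
p = sign_test_p_value(82, 82)
print(f"p-value under equal probability of male and female births: {float(p):.3e}")
```

Exact rational arithmetic is used so that the very small tail probability is not lost to rounding before it is printed.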
Related concepts
The significance level α is the threshold
that the p-value must fall below for the
experimenter to reject the null hypothesis.
Equivalently, α is the probability of
mistakenly rejecting the null hypothesis
when the null hypothesis is true (a type I
error).[4]
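The reading of α as a false-rejection probability can be checked by simulation. The sketch below is illustrative only: it assumes normally distributed data with a known standard deviation, a two-sided z-test, and hypothetical function names. It repeatedly draws samples for which the null hypothesis is true and counts how often the test rejects at α = 0.05.

```python
import math
import random

def two_sided_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided z-test p-value for H0: mean == mu0, with sigma known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

def false_rejection_rate(alpha=0.05, n=30, trials=20000, seed=0):
    """Fraction of true-null experiments in which p falls below alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true here: the data really have mean 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p_value(sample) < alpha:
            rejections += 1
    return rejections / trials

rate = false_rejection_rate()
print(f"share of true nulls rejected at alpha = 0.05: {rate:.4f}")
```

Under these assumptions the printed share should land close to 0.05, with some simulation noise.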
Stringent significance thresholds in specific fields
Limitations
Researchers who focus solely on whether
their results are statistically significant
might report findings that are not
substantive[43] and not replicable.[44][45]
Statistical significance also differs from
practical significance: a statistically
significant result may reflect an effect too
small to matter in practice.[46]
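The gap between statistical and practical significance can be made concrete with a small calculation. The sketch below is a hedged illustration: it assumes a one-sample z-test with known standard deviation and an observed mean shift of exactly 0.01 standard deviations, a value chosen here purely as an example of a negligible effect. The function and variable names are hypothetical.

```python
import math

def z_test_p_value(effect_in_sd, n):
    """Two-sided p-value when the observed mean differs from the null
    value by `effect_in_sd` standard deviations in a sample of size n."""
    z = effect_in_sd * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2.0))

tiny_effect = 0.01  # assumed standardized effect, negligible in practice

p_small_n = z_test_p_value(tiny_effect, 100)
p_large_n = z_test_p_value(tiny_effect, 1_000_000)

print(f"n = 100:       p = {p_small_n:.3g}")
print(f"n = 1,000,000: p = {p_large_n:.3g}")
```

The effect size is identical in both cases; only the sample size changes. With enough data the same negligible effect crosses any conventional significance threshold, which is why the effect size should be reported alongside the p-value.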
Effect size
Reproducibility
Challenges
Overuse in some journals
See also
A/B testing, ABX test
Fisher's method for combining
independent tests of significance
Look-elsewhere effect
Multiple comparisons problem
Sample size
Texas sharpshooter fallacy (gives
examples of tests where the
significance level was set too high)
References
1. Sirkin, R. Mark (2005). "Two-sample t
tests". Statistics for the Social Sciences
(3rd ed.). Thousand Oaks, CA: SAGE
Publications, Inc. pp. 271–316. ISBN 1-412-
90546-X.
2. Borror, Connie M. (2009). "Statistical
decision making". The Certified Quality
Engineer Handbook (3rd ed.). Milwaukee,
WI: ASQ Quality Press. pp. 418–472.
ISBN 0-873-89745-5.
3. Myers, Jerome L.; Well, Arnold D.; Lorch
Jr., Robert F. (2010). "Developing
fundamentals of hypothesis testing using
the binomial distribution". Research design
and statistical analysis (3rd ed.). New York,
NY: Routledge. pp. 65–90. ISBN 0-805-
86431-8.
4. Dalgaard, Peter (2008). Introductory
Statistics with R. New York: Springer.
pp. 155–56. doi:10.1007/978-0-387-79054-1_9.
ISBN 978-0-387-79053-4.
5. Johnson, Valen E. (October 9, 2013).
"Revised standards for statistical
evidence". Proceedings of the National
Academy of Sciences. National Academies
of Science. 110: 19313–19317.
doi:10.1073/pnas.1313476110.
PMC 3845140. Retrieved 3 July 2014.
6. Redmond, Carol; Colton, Theodore
(2001). "Clinical significance versus
statistical significance". Biostatistics in
Clinical Trials. Wiley Reference Series in
Biostatistics (3rd ed.). West Sussex, United
Kingdom: John Wiley & Sons Ltd. pp. 35–
36. ISBN 0-471-82211-6.
7. Cumming, Geoff (2012). Understanding
The New Statistics: Effect Sizes, Confidence
Intervals, and Meta-Analysis. New York,
USA: Routledge. pp. 27–28.
8. Krzywinski, Martin; Altman, Naomi (30
October 2013). "Points of significance:
Significance, P values and t-tests". Nature
Methods. Nature Publishing Group. 10 (11):
1041–1042. doi:10.1038/nmeth.2698.
Retrieved 3 July 2014.
9. Sham, Pak C.; Purcell, Shaun M. (17 April
2014). "Statistical power and significance
testing in large-scale genetic studies".
Nature Reviews Genetics. Nature Publishing
Group. 15 (5): 335–346.
doi:10.1038/nrg3706. Retrieved 3 July
2014.
10. Altman, Douglas G. (1999). Practical
Statistics for Medical Research. New York,
USA: Chapman & Hall/CRC. p. 167.
ISBN 978-0412276309.
11. Devore, Jay L. (2011). Probability and
Statistics for Engineering and the Sciences
(8th ed.). Boston, MA: Cengage Learning.
pp. 300–344. ISBN 0-538-73352-7.
12. Craparo, Robert M. (2007). "Significance
level". In Salkind, Neil J. Encyclopedia of
Measurement and Statistics. 3. Thousand
Oaks, CA: SAGE Publications. pp. 889–891.
ISBN 1-412-91611-9.
13. Sproull, Natalie L. (2002). "Hypothesis
testing". Handbook of Research Methods: A
Guide for Practitioners and Students in the
Social Science (2nd ed.). Lanham, MD:
Scarecrow Press, Inc. pp. 49–64. ISBN 0-
810-84486-9.
14. Babbie, Earl R. (2013). "The logic of
sampling". The Practice of Social Research
(13th ed.). Belmont, CA: Cengage Learning.
pp. 185–226. ISBN 1-133-04979-6.
15. Faherty, Vincent (2008). "Probability and
statistical significance". Compassionate
Statistics: Applied Quantitative Analysis for
Social Services (With exercises and
instructions in SPSS) (1st ed.). Thousand
Oaks, CA: SAGE Publications, Inc. pp. 127–
138. ISBN 1-412-93982-8.
16. McKillup, Steve (2006). "Probability
helps you make a decision about your
results". Statistics Explained: An
Introductory Guide for Life Scientists (1st
ed.). Cambridge, United Kingdom:
Cambridge University Press. pp. 44–56.
ISBN 0-521-54316-9.
17. Myers, Jerome L.; Well, Arnold D.; Lorch
Jr, Robert F. (2010). "The t distribution and
its applications". Research Design and
Statistical Analysis (3rd ed.). New York, NY:
Routledge. pp. 124–153. ISBN 0-805-86431-
8.
18. Brian, Éric; Jaisson, Marie (2007).
"Physico-Theology and Mathematics
(1710–1794)". The Descent of Human Sex
Ratio at Birth. Springer Science & Business
Media. pp. 1–25. ISBN 978-1-4020-6036-6.
19. Arbuthnot, John (1710). "An argument
for Divine Providence, taken from the
constant regularity observed in the births of
both sexes" (PDF). Philosophical
Transactions of the Royal Society of
London. 27 (325–336): 186–190.
doi:10.1098/rstl.1710.0011.
20. Conover, W.J. (1999), "Chapter 3.4: The
Sign Test", Practical Nonparametric
Statistics (Third ed.), Wiley, pp. 157–176,
ISBN 0-471-16068-7
21. Sprent, P. (1989), Applied
Nonparametric Statistical Methods (Second
ed.), Chapman & Hall, ISBN 0-412-44980-3
22. Stigler, Stephen M. (1986). The History
of Statistics: The Measurement of
Uncertainty Before 1900. Harvard University
Press. pp. 225–226. ISBN 0-674-40341-X.
23. Bellhouse, P. (2001), "John Arbuthnot",
in Statisticians of the Centuries by C.C.
Heyde and E. Seneta, Springer, pp. 39–42,
ISBN 0-387-95329-9
24. Hald, Anders (1998), "Chapter 4. Chance
or Design: Tests of Significance", A History
of Mathematical Statistics from 1750 to
1930, Wiley, p. 65
25. Cumming, Geoff (2011). "From null
hypothesis significance to testing effect
sizes". Understanding The New Statistics:
Effect Sizes, Confidence Intervals, and
Meta-Analysis. Multivariate Applications
Series. East Sussex, United Kingdom:
Routledge. pp. 21–52. ISBN 0-415-87968-X.
26. Fisher, Ronald A. (1925). Statistical
Methods for Research Workers. Edinburgh,
UK: Oliver and Boyd. p. 43. ISBN 0-050-
02170-2.
27. Poletiek, Fenna H. (2001). "Formal
theories of testing". Hypothesis-testing
Behaviour. Essays in Cognitive Psychology
(1st ed.). East Sussex, United Kingdom:
Psychology Press. pp. 29–48. ISBN 1-841-
69159-3.
28. Quinn, Geoffrey R.; Keough, Michael J.
(2002). Experimental Design and Data
Analysis for Biologists (1st ed.). Cambridge,
UK: Cambridge University Press. pp. 46–69.
ISBN 0-521-00976-6.
29. Neyman, J.; Pearson, E.S. (1933). "The
testing of statistical hypotheses in relation
to probabilities a priori". Mathematical
Proceedings of the Cambridge
Philosophical Society. 29: 492–510.
doi:10.1017/S030500410001152X.
30. "Conclusions about statistical
significance are possible with the help of
the confidence interval. If the confidence
interval does not include the value of zero
effect, it can be assumed that there is a
statistically significant result." "Confidence
Interval or P-Value?".
doi:10.3238/arztebl.2009.0335.
31. StatNews #73: Overlapping Confidence
Intervals and Statistical Significance
32. Neyman, J. (1937). "Outline of a Theory
of Statistical Estimation Based on the
Classical Theory of Probability".
Philosophical Transactions of the Royal
Society A. 236: 333–380.
doi:10.1098/rsta.1937.0005.
JSTOR 91337.
33. Meier, Kenneth J.; Brudney, Jeffrey L.;
Bohte, John (2011). Applied Statistics for
Public and Nonprofit Administration (3rd
ed.). Boston, MA: Cengage Learning.
pp. 189–209. ISBN 1-111-34280-6.
34. Healy, Joseph F. (2009). The Essentials
of Statistics: A Tool for Social Research
(2nd ed.). Belmont, CA: Cengage Learning.
pp. 177–205. ISBN 0-495-60143-8.
35. McKillup, Steve (2006). Statistics
Explained: An Introductory Guide for Life
Scientists (1st ed.). Cambridge, UK:
Cambridge University Press. pp. 32–38.
ISBN 0-521-54316-9.
36. Heath, David (1995). An Introduction To
Experimental Design And Statistics For
Biology (1st ed.). Boston, MA: CRC press.
pp. 123–154. ISBN 1-857-28132-2.
37. Hinton, Perry R. (2010). "Significance,
error, and power". Statistics explained (3rd
ed.). New York, NY: Routledge. pp. 79–90.
ISBN 1-848-72312-1.
38. Vaughan, Simon (2013). Scientific
Inference: Learning from Data (1st ed.).
Cambridge, UK: Cambridge University
Press. pp. 146–152. ISBN 1-107-02482-X.
39. Bracken, Michael B. (2013). Risk,
Chance, and Causation: Investigating the
Origins and Treatment of Disease (1st ed.).
New Haven, CT: Yale University Press.
pp. 260–276. ISBN 0-300-18884-6.
40. Franklin, Allan (2013). "Prologue: The
rise of the sigmas". Shifting Standards:
Experiments in Particle Physics in the
Twentieth Century (1st ed.). Pittsburgh, PA:
University of Pittsburgh Press. pp. ii–iii.
ISBN 0-822-94430-8.
41. Clarke, GM; Anderson, CA; Pettersson,
FH; Cardon, LR; Morris, AP; Zondervan, KT
(February 6, 2011). "Basic statistical
analysis in genetic case-control studies".
Nature Protocols. 6 (2): 121–33.
doi:10.1038/nprot.2010.182.
PMC 3154648. PMID 21293453.
42. Barsh, GS; Copenhaver, GP; Gibson, G;
Williams, SM (July 5, 2012). "Guidelines for
Genome-Wide Association Studies". PLoS
Genetics. 8 (7): e1002812.
doi:10.1371/journal.pgen.1002812.
PMC 3390399. PMID 22792080.
43. Carver, Ronald P. (1978). "The Case
Against Statistical Significance Testing".
Harvard Educational Review. 48: 378–399.
44. Ioannidis, John P. A. (2005). "Why most
published research findings are false".
PLoS Medicine. 2: e124.
doi:10.1371/journal.pmed.0020124.
PMC 1182327. PMID 16060722.
45. Amrhein, Valentin; Korner-Nievergelt,
Fränzi; Roth, Tobias (2017). "The earth is
flat (p > 0.05): significance thresholds and
the crisis of unreplicable research". PeerJ.
5: e3544. doi:10.7717/peerj.3544.
46. Hojat, Mohammadreza; Xu, Gang
(2004). "A Visitor's Guide to Effect Sizes".
Advances in Health Sciences Education.
47. Pedhazur, Elazar J.; Schmelkin, Liora P.
(1991). Measurement, Design, and Analysis:
An Integrated Approach (Student ed.). New
York, NY: Psychology Press. pp. 180–210.
ISBN 0-805-81063-3.
48. Stahel, Werner (2016). "Statistical Issue
in Reproducibility". Reproducibility:
Principles, Problems, Practices, and
Prospects: 87–114.
49. "CSSME Seminar Series: The argument
over p-values and the Null Hypothesis
Significance Testing (NHST) paradigm".
www.education.leeds.ac.uk. School of
Education, University of Leeds. Retrieved
2016-12-01.
50. Novella, Steven (February 25, 2015).
"Psychology Journal Bans Significance
Testing". Science-Based Medicine.
51. Woolston, Chris (2015-03-05).
"Psychology journal bans P values".
Nature. 519 (7541): 9.
doi:10.1038/519009f.
52. Siegfried, Tom (2015-03-17). "P value
ban: small step for a journal, giant leap for
science". Science News. Retrieved
2016-12-01.
53. Antonakis, John (February 2017). "On
doing better science: From thrill of
discovery to policy implications". The
Leadership Quarterly. 28 (1): 5–21.
doi:10.1016/j.leaqua.2017.01.006.
54. Wasserstein, Ronald L.; Lazar, Nicole A.
(2016-04-02). "The ASA's Statement on p-
Values: Context, Process, and Purpose".
The American Statistician. 70 (2): 129–133.
doi:10.1080/00031305.2016.1154108.
ISSN 0003-1305.
55. García-Pérez, Miguel A. (2016-10-05).
"Thou Shalt Not Bear False Witness Against
Null Hypothesis Significance Testing".
Educational and Psychological
Measurement: 0013164416668232.
doi:10.1177/0013164416668232.
ISSN 0013-1644.
56. Benjamin, Daniel; et al. (2017).
"Redefine statistical significance". Nature
Human Behaviour. 1: 0189.
doi:10.1038/s41562-017-0189-z.
57. Chawla, Dalmeet (2017). "'One-size-fits-
all' threshold for P values under fire".
Nature. doi:10.1038/nature.2017.22625.
58. Amrhein, Valentin; Greenland, Sander
(2017). "Remove, rather than redefine,
statistical significance". Nature Human
Behaviour. 1: 0224.
doi:10.1038/s41562-017-0224-0.
59. Vyse, Stuart. "Moving Science's
Statistical Goalposts". csicop.org. CSI.
Retrieved 10 July 2018.
Further reading
Ziliak, Stephen and Deirdre McCloskey
(2008), The Cult of Statistical
Significance: How the Standard Error
Costs Us Jobs, Justice, and Lives. Ann
Arbor, University of Michigan Press,
2009. ISBN 978-0-472-07007-7. Reviews
and reception (compiled by Ziliak).
Thompson, Bruce (2004). "The
"significance" crisis in psychology and
education". Journal of Socio-Economics.
33: 607–613.
doi:10.1016/j.socec.2004.09.034.
Chow, Siu L. (1996). Statistical
Significance: Rationale, Validity and
Utility, Volume 1 of series Introducing
Statistical Methods, Sage Publications
Ltd, ISBN 978-0-7619-5205-3. Argues
that statistical significance is useful in
certain circumstances.
Kline, Rex (2004). Beyond Significance
Testing: Reforming Data Analysis
Methods in Behavioral Research.
Washington, DC: American
Psychological Association.
Nuzzo, Regina (2014). Scientific
method: Statistical errors. Nature Vol.
506, p. 150–152 (open access).
Highlights common misunderstandings
about the p value.
Cohen, Jacob (1994). The earth is
round (p<.05). American Psychologist.
Vol 49, p. 997–1003. Reviews problems
with null hypothesis statistical testing.
External links
Wikiversity has learning resources about
Statistical significance
Retrieved from
"https://en.wikipedia.org/w/index.php?title=Statistical_significance&oldid=867619954"