American Journal of Epidemiology Vol. 137, No. 5: 485-496
Copyright © 1993 by The Johns Hopkins University School of Hygiene and Public Health
Review article
p Values, Hypothesis Tests, and Likelihood: Implications for Epidemiology of a Neglected Historical Debate
From the Division of Biostatistics, Oncology Center, The Johns Hopkins University School of Medicine, Baltimore, MD.
Reprint requests to Dr. Steven N. Goodman, 550 N. Broadway, Suite 1103, Baltimore, MD 21205.
It is not generally appreciated that the p value, as conceived by R. A. Fisher, is not compatible with the Neyman-Pearson hypothesis test in which it has become embedded. The p value was meant to be a flexible inferential measure, whereas the hypothesis test was a rule for behavior, not inference. The combination of the two methods has led to a reinterpretation of the p value simultaneously as an "observed error rate" and as a measure of evidence. Both of these interpretations are problematic, and their combination has obscured the important differences between Neyman and Fisher on the nature of the scientific method and inhibited our understanding of the philosophic implications of the basic methods in use today. An analysis using another method promoted by Fisher, mathematical likelihood, shows that the p value substantially overstates the evidence against the null hypothesis. Likelihood makes clearer the distinction between error rates and inferential evidence and is a quantitative tool for expressing evidential strength that is more appropriate for the purposes of epidemiology than the p value.
hypothesis tests; inference; likelihood; p values; significance tests
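The abstract's claim that the p value overstates the evidence against the null hypothesis can be illustrated with a small calculation. The sketch below is not taken from the article; it assumes a normally distributed test statistic and compares the likelihood of the null hypothesis with the likelihood of the single alternative best supported by the data (the alternative located exactly at the observed value). Under those assumptions the ratio is exp(-z²/2), which is the smallest likelihood ratio any simple alternative can produce. The function name `min_likelihood_ratio` is hypothetical.

```python
import math
from statistics import NormalDist


def min_likelihood_ratio(p_two_sided: float) -> float:
    """Likelihood of the null divided by the likelihood of the
    best-supported alternative, for a normal test statistic.

    Assumption (not from the source): the test statistic is standard
    normal under the null, and the p value is two-sided.
    """
    # Convert the two-sided p value to the corresponding z score.
    z = NormalDist().inv_cdf(1.0 - p_two_sided / 2.0)
    # Ratio of normal densities at the observed z: null (mean 0)
    # versus the alternative centered on the observed value.
    return math.exp(-z * z / 2.0)


for p in (0.05, 0.01, 0.001):
    lr = min_likelihood_ratio(p)
    print(f"p = {p}: minimum likelihood ratio = {lr:.4f}")
```

Under these assumptions, p = 0.05 corresponds to a minimum likelihood ratio of roughly 0.15, so even the best-supported alternative is favored over the null by less than 7 to 1. That is considerably weaker evidence than the figure "0.05" suggests when it is read as a direct measure of evidence.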
This article has been cited by other articles:
P. Cummings and T. D. Koepsell. P Values vs Estimates of Association With Confidence Intervals. Arch Pediatr Adolesc Med, February 1, 2010; 164(2): 193-196.
D. Curran-Everett. Explorations in statistics: hypothesis tests and P values. Advan Physiol Educ, June 1, 2009; 33(2): 81-86.
G. T. Fosgate. Practical sample size calculations for surveillance and diagnostic investigations. J Vet Diagn Invest, January 1, 2009; 21(1): 3-14.
L. S. Freedman. An analysis of the controversy over classical one-sided tests. Clinical Trials, December 1, 2008; 5(6): 635-640.
J. P. A. Ioannidis. Effect of Formal Statistical Significance on the Credibility of Observational Associations. Am. J. Epidemiol., August 15, 2008; 168(4): 374-383.
J. Wakefield. Reporting and interpretation in genome-wide association studies. Int. J. Epidemiol., June 1, 2008; 37(3): 641-653.
S. Greenland. Multiple comparisons and association selection in general epidemiology. Int. J. Epidemiol., June 1, 2008; 37(3): 430-434.
R. Hubbard and R. M. Lindsay. Why P Values Are Not a Useful Measure of Evidence in Statistical Significance Testing. Theory Psychology, February 1, 2008; 18(1): 69-88.
J. Lessler, D. A. T. Cummings, S. Fishman, A. Vora, and D. S. Burke. Transmissibility of swine flu at Fort Dix, 1976. J R Soc Interface, August 22, 2007; 4(15): 755-762.
R. Hubbard and J. S. Armstrong. Why We Don't Really Know What Statistical Significance Means: Implications for Educators. Journal of Marketing Education, August 1, 2006; 28(2): 114-120.
S. Greenland. Bayesian perspectives for epidemiological research: I. Foundations and basic methods. Int. J. Epidemiol., June 1, 2006; 35(3): 765-775.
S. Greenland. Response: Bayesian perspectives for epidemiological research. Int. J. Epidemiol., June 1, 2006; 35(3): 777-778.
K. P. Spindler, J. E. Kuhn, W. Dunn, C. E. Matthews, F. E. Harrell Jr, and R. S. Dittus. Reading and Reviewing the Orthopaedic Literature: A Systematic, Evidence-based Medicine Approach. J. Am. Acad. Ortho. Surg., July 1, 2005; 13(4): 220-229.
R. Hubbard. Alphabet Soup: Blurring the Distinctions Between p's and α's in Psychological Research. Theory Psychology, June 1, 2004; 14(3): 295-327.
R. A. Crosby and R. Rothenberg. In STI interventions, size matters. Sex Transm Inf, April 1, 2004; 80(2): 82-85.
N. Tahri-Daizadeh, D.-A. Tregouet, V. Nicaud, N. Manuel, F. Cambien, and L. Tiret. Automated Detection of Informative Combined Effects in Genetic Association Studies of Complex Traits. Genome Res., August 1, 2003; 13(8): 1952-1960.
J. A. C. Sterne, G. D. Smith, and D. R. Cox. Sifting the evidence--what's wrong with significance tests? Physical Therapy, August 1, 2001; 81(8): 1464-1469.
H. C. Kraemer, E. Stice, A. Kazdin, D. Offord, and D. Kupfer. How Do Risk Factors Work Together? Mediators, Moderators, and Independent, Overlapping, and Proxy Risk Factors. Am J Psychiatry, June 1, 2001; 158(6): 848-856.
J. A. C. Sterne, G. D. Smith, and D. R. Cox. Sifting the evidence--what's wrong with significance tests? Another comment on the role of statistical methods. BMJ, January 27, 2001; 322(7280): 226-231.
S. N. Goodman. Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy. Ann Intern Med, June 15, 1999; 130(12): 995-1004.
J. P. A. Ioannidis, J. C. Cappelleri, and J. Lau. Issues in Comparisons Between Meta-analyses and Large Trials. JAMA, April 8, 1998; 279(14): 1089-1093.
R. M. Szabo. Current Concepts Review - Principles of Epidemiology for the Orthopaedic Surgeon. J. Bone Joint Surg. Am., January 1, 1998; 80(1): 111-20.
R. Hubbard, R. A. Parsa, and M. R. Luthy. The Spread of Statistical Significance Testing in Psychology: The Case of the Journal of Applied Psychology, 1917-1994. Theory Psychology, August 1, 1997; 7(4): 545-554.
J. P. A. Ioannidis and J. Lau. Evolution of treatment effects over time: Empirical insight from recursive cumulative metaanalyses. PNAS, January 30, 2001; 98(3): 831-836.