Letter to the Editor
THE AUTHORS REPLY
1 Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115
2 Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115
(e-mail: stdls{at}channing.harvard.edu)
We agree with Drs. Neogi and Zhang (1) that direct calculations of point and interval estimates of univariate and multivariate-adjusted risk ratios and risk differences are useful for communicating results to the scientific community and the general public, and for public health policy and programmatic purposes. We are happy that our macro (2) (http://www.hsph.harvard.edu/faculty/spiegelman/relrisk8.html) will make it easier to perform these calculations. In fact, since our editorial note was published, nearly 1,000 people have visited the Web page where the SAS macro has been posted, an indication of the interest in the public health community in performing these calculations.
Because Neogi and Zhang's remark (1) that the prevalence odds ratio equals the incidence rate ratio holds only under very restrictive circumstances, we share Greenland and Rothman's skepticism about the practical utility of this equation in the usual complex multifactorial settings in which most epidemiologic research is conducted. To quote Greenland and Rothman, "Seldom is prevalence of direct interest in etiologic applications of epidemiologic research. Since prevalence reflects both the incidence rate and the probability of surviving the disease, studies of prevalence or studies based on prevalent cases yield associations that reflect the determinants of survival with disease just as much as the causes of disease. The study of prevalence can be misleading in the paradoxical situation in which better survival from a disease and therefore a higher prevalence follow from the action of preventive agents that mitigate the disease once it occurs" (3, p. 44).
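One way to see why the equality is so restrictive is the textbook steady-state identity, prevalence odds = incidence rate × mean disease duration. The sketch below is only an illustration: it assumes a stationary population with no migration, and the function names, rates, and durations are hypothetical, not taken from any of the cited studies.

```python
def prevalence_odds(incidence_rate, mean_duration):
    """Steady-state prevalence odds: incidence rate times mean disease duration."""
    return incidence_rate * mean_duration


def prevalence_odds_ratio(rate_exposed, rate_unexposed, dur_exposed, dur_unexposed):
    """Prevalence odds ratio = incidence rate ratio x ratio of mean durations."""
    return (prevalence_odds(rate_exposed, dur_exposed)
            / prevalence_odds(rate_unexposed, dur_unexposed))


# Hypothetical exposure that doubles the incidence rate (per person-year).
incidence_rate_ratio = 0.02 / 0.01

# Same mean duration (years) in both groups: the two measures agree.
por_equal_duration = prevalence_odds_ratio(0.02, 0.01, 5.0, 5.0)

# The exposure also halves survival with disease: the prevalence odds ratio
# drops to the null even though the incidence rate ratio is unchanged.
por_shorter_duration = prevalence_odds_ratio(0.02, 0.01, 2.5, 5.0)

print(incidence_rate_ratio, por_equal_duration, por_shorter_duration)
```

Under these assumptions, the prevalence odds ratio recovers the incidence rate ratio only when the exposure leaves mean duration unchanged, which is the point of the quoted passage.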
The conditions under which the prevalence ratio can be equated with the risk ratio depend in a complex way upon age-specific incidence and survival rates following diagnosis, in- and outmigration rates to and from the study population, the rarity of the disease, and relative risks for incidence and survival for all of the risk factors for each. The mathematics of this have been worked out by Miettinen (4), Preston (5), Keiding (6), Alho (7), Newman (8), and others.
There are cases where the risk factors for incidence are the same as those for survival, although the magnitudes of the relative risks are not necessarily identical; for example, obesity (9, 10), weight gain (11, 12), and physical inactivity (13, 14) are risk factors for both breast cancer incidence and earlier case fatality. On the other hand, alcohol intake is a risk factor for breast cancer incidence (15) but does not appear to be a risk factor for decreased survival following breast cancer diagnosis (16).
We agree with Drs. Tian and Liu (17) that, in their example (18), the log-binomial model does not fit the data well when age is modeled as a linear term. These authors raise an interesting philosophical question. Taken to its limit to sharpen the contrast between the two points of view, their question is, Is the goal of data analysis in epidemiology and public health to obtain a model that fits the data well but estimates a parameter that is not of interest, or is the goal to estimate the parameter of interest even if the model does not fit well? We believe that, in epidemiology and public health, the goal of analysis is typically to estimate the parameter of interest, and the goal of good study design should be to ensure that the data provide sufficient information to do so.
It is not clear in the example presented by Tian and Liu (17) whether the primary goal of their analysis was, in fact, to estimate prevalence ratios for coronary heart disease in relation to increasing age. Rather, the goal of their analysis appeared to be descriptive. If so, model fit would be of primary importance, and we agree that the nonparametric smoothed regression that they presented is the best approach.
If estimation of prevalence ratios as a function of age were the goal of the analysis, then other considerations prevail. The nonparametric regression indicates that the prevalence of coronary heart disease may increase sub-log-linearly at the very high age range of the data. This feature can be studied by introducing a nonlinear term to the log-binomial regression model and assessing its significance. If the nonlinear term appears important, either statistically or for other reasons, it can be included in the final model, all the while retaining the parameterization of interest, that of the prevalence ratio. If this nonlinear log-binomial model fails to converge, the estimating equations can instead be fit with Poisson weights; these equations still provide a valid estimate. Importantly, only eight of the 100 subjects were older than age 60 years, and six of them were cases (of 43 total), which gives us very little power to draw conclusions about prevalence ratios over age 60 years. Had the confidence intervals for the graphs been presented along with the "point estimates," this would have been clear.
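To illustrate how little information eight subjects carry, the sketch below computes approximate 95 percent intervals for the crude prevalence above and at or below age 60 years, using only the counts quoted in the text (6 of 8 versus the remaining 37 of 92). The Wilson score interval is our choice for this illustration, not a method used in the letters, and the function name is hypothetical.

```python
import math


def wilson_interval(cases, n, z=1.96):
    """Wilson score interval for a binomial proportion (approximate 95% by default)."""
    p = cases / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Counts quoted in the text: 8 of 100 subjects were older than 60 years,
# 6 of those 8 were cases, and there were 43 cases in total.
older = wilson_interval(6, 8)
younger = wilson_interval(43 - 6, 100 - 8)

print("prevalence, age > 60:  %.2f to %.2f" % older)
print("prevalence, age <= 60: %.2f to %.2f" % younger)
```

The interval for the older group spans roughly half of the unit interval, so almost any shape of the age-prevalence curve beyond age 60 years is compatible with these data.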
Finally, we agree with Drs. Petersen and Deddens (19) that prevalence ratios are most efficiently estimated by the log-binomial model, whose estimates are maximum likelihood. When the log-binomial model fails to converge, these authors suggest the use of the COPY algorithm, an unpublished procedure whose statistical properties have not been formally established, and which may provide maximum likelihood estimates of the prevalence ratio and its 95 percent confidence intervals when the standard computational algorithm is unable to estimate the quantities of interest. We believe it is preferable to use methods with well-understood properties, such as those utilized in our macro, %RELRISK8 (2).
It should be noted that a fundamental issue underlying both Petersen and Deddens' (19) and Tian and Liu's (17) comments is that there is an underlying restriction on the parameter space in both the likelihood and the estimating equations methods that is being ignored in PROC GENMOD and by COPY. Technically, all of these algorithms should incorporate the linear constraints on the parameters, β'X_i ≤ 0, i = 1, ..., n, where β is the vector of log prevalence or risk ratios corresponding to the vector of risk factors, X_i; i indexes the subjects in the study; and n is the sample size of the study.
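As a minimal sketch of what this restriction means in practice, the code below flags subjects whose linear predictor β'X_i is positive, that is, whose fitted probability exp(β'X_i) would exceed 1. The coefficients, ages, and function name are hypothetical, and we assume for illustration that each X_i carries a leading 1 so that β includes an intercept.

```python
import math


def constraint_violations(beta, rows):
    """Return (index, linear predictor) for each row where beta'x > 0."""
    violations = []
    for i, x in enumerate(rows):
        eta = sum(b * xj for b, xj in zip(beta, x))
        if eta > 0:  # fitted probability exp(eta) would exceed 1
            violations.append((i, eta))
    return violations


# Hypothetical log-binomial fit: intercept, then log prevalence ratio per year of age.
beta = [-3.0, 0.045]
rows = [[1.0, age] for age in (25, 40, 55, 69)]

bad = constraint_violations(beta, rows)
for i, eta in bad:
    print("subject %d: fitted probability %.2f exceeds 1" % (i, math.exp(eta)))
```

An algorithm that ignores the constraint can return estimates like these, in which only the oldest hypothetical subject violates the restriction; a constrained fit would have to keep every linear predictor at or below zero.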
We thank all of the authors for their interesting remarks on our editorial note (2).
ACKNOWLEDGMENTS
Conflict of interest: none declared.
References
1. Neogi T, Zhang Y. Re: "Easy SAS calculations for risk or prevalence ratios and differences." (Letter). Am J Epidemiol 2006;163:1157.
2. Spiegelman D, Hertzmark E. Easy SAS calculations for risk or prevalence ratios and differences. Am J Epidemiol 2005;162:199–200.
3. Greenland S, Rothman KJ. Measures of effect and measures of association. In: Rothman KJ, Greenland S, eds. Modern epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven, 1998:44–5.
4. Miettinen O. Estimability and estimation in case-referent studies. Am J Epidemiol 1976;103:226–35.
5. Preston SH. Relations among standard epidemiologic measures in a population. Am J Epidemiol 1987;126:336–45.
6. Keiding N. Age-specific incidence and prevalence: a statistical perspective. J R Stat Soc (A) 1991;154:371–412.
7. Alho JM. On prevalence, incidence and disease duration in stable populations. Biometrics 1992;48:578–92.
8. Newman SC. Odds ratio estimation in a steady state population. J Clin Epidemiol 1988;41:59–65.
9. van den Brandt PA, Spiegelman D, Yaun SS, et al. Pooled analysis of prospective cohort studies on height, weight, and breast cancer risk. Am J Epidemiol 2000;152:514–27.
10. Daling JR, Malone KE, Doody DR, et al. Relation of body mass index to tumor markers and survival among young women with invasive ductal breast carcinoma. Cancer 2001;92:720–9.
11. Huang Z, Hankinson SE, Colditz GA, et al. Dual effects of weight and weight gain on breast cancer risk. JAMA 1997;278:1407–11.
12. Kroenke CH, Chen WY, Rosner B, et al. Weight, weight gain and survival after breast cancer. J Clin Oncol 2005;23:1370–8.
13. Bianchini F, Kaaks R, Vainio H. Weight control and physical activity in cancer prevention. Obes Rev 2002;3:5–8.
14. Holmes MD, Chen WY, Feskanich D, et al. Physical activity and survival after breast cancer diagnosis. JAMA 2005;293:2479–86.
15. Smith-Warner SA, Spiegelman D, Yaun SS, et al. Alcohol and breast cancer in women: a pooled analysis of cohort studies. JAMA 1998;279:535–40.
16. Holmes MD, Stampfer MJ, Colditz GA, et al. Dietary factors and the survival of women with breast carcinoma. Cancer 1999;86:826–35.
17. Tian L, Liu K. Re: "Easy SAS calculations for risk or prevalence ratios and differences." (Letter). Am J Epidemiol 2006;163:1157–8.
18. Hosmer DW, Lemeshow S. Applied logistic regression. New York, NY: John Wiley & Sons, 1989:2–3.
19. Petersen MR, Deddens JA. Re: "Easy SAS calculations for risk or prevalence ratios and differences." (Letter). Am J Epidemiol 2006;163:1158–9.