American Journal of Epidemiology Advance Access originally published online on June 29, 2006
American Journal of Epidemiology 2006 164(4):401-402; doi:10.1093/aje/kwj236
Letter to the Editor
THE AUTHORS REPLY
1 Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT
2 Department of Medical Education, Griffin Hospital, Derby, CT
(e-mail: lisa.calvocoressi{at}yale.edu)
We are pleased to respond to Radespiel-Tröger et al.'s comment (1) on our recent Journal article (2), in which we used recursive partitioning to examine predictors of mammography screening. This response allows us to elaborate on our methodological strategy and to participate in dialogue about the strengths and limitations of this emerging statistical procedure for analysis of epidemiologic data.
The authors (1) assert that the data presented in figure 1 of our paper (2, p. 1220) are based on the sample alone, thus yielding a biased estimate of the proportion of women who would be misclassified in a practical setting. We agree that misclassification rates based on the sample itself may be biased and that a validation sample or resampling technique should be used. We direct the authors to page 1217 of our paper, where we indicate that our results are based on a 10-fold cross-validation procedure. The authors' calculation of the proportion of women who were not classified correctly is consistent with the estimate provided by the classification and regression tree (CART).
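The distinction between a sample-based (resubstitution) error rate and a 10-fold cross-validated error rate can be sketched as follows. This is an illustration only, not the analysis from our paper: the data are synthetic, a one-variable decision stump stands in for a full CART tree, and the names `fit_stump`, `k_fold_error`, and `make_data` are hypothetical.

```python
import random

def fit_stump(rows):
    """Choose the binary feature and polarity with the fewest training errors.
    rows: list of (features, label), all values 0 or 1."""
    n_features = len(rows[0][0])
    best = None
    for j in range(n_features):
        for polarity in (0, 1):
            # polarity 1 predicts x[j]; polarity 0 predicts 1 - x[j]
            errors = sum((x[j] if polarity else 1 - x[j]) != y for x, y in rows)
            if best is None or errors < best[0]:
                best = (errors, j, polarity)
    return best[1], best[2]

def predict(model, x):
    j, polarity = model
    return x[j] if polarity else 1 - x[j]

def error_rate(model, rows):
    return sum(predict(model, x) != y for x, y in rows) / len(rows)

def k_fold_error(rows, k=10, seed=0):
    """Each row is classified by a model fitted without the fold containing it."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    wrong = 0
    for fold in folds:
        held_out = set(fold)
        train = [rows[i] for i in idx if i not in held_out]
        model = fit_stump(train)
        wrong += sum(predict(model, rows[i][0]) != rows[i][1] for i in fold)
    return wrong / len(rows)

def make_data(n=200, n_features=20, seed=1):
    """Synthetic data: the label follows feature 0, with about 30% noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = tuple(rng.randint(0, 1) for _ in range(n_features))
        y = x[0] if rng.random() > 0.3 else 1 - x[0]
        rows.append((x, y))
    return rows

data = make_data()
resub = error_rate(fit_stump(data), data)   # fitted and evaluated on the same rows
cv = k_fold_error(data, k=10)               # evaluated on held-out rows only
print(f"resubstitution error: {resub:.3f}")
print(f"10-fold CV error:     {cv:.3f}")
```

Because the resubstitution estimate reuses the rows that selected the split, it tends to be optimistic, whereas the cross-validated estimate classifies each observation with a model that never saw it.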
The authors state that alternative categorizations of variables in our model (2) would "almost surely lead to a completely different tree" (1, p. 400). With the exception of age, all variables included in the tree analysis were entered into the model as collected (as categorical variables). When age was entered in its continuous form, CART dichotomized that variable as less than 49.5 years versus 49.5 years or older, consistent with the dichotomization of the categorical variable in our analysis (ages 40–49 and ages 50–79 years). With the continuous age variable, CART selected a tree that was larger than that reported in figure 1; however, the upper nodes of that tree were identical to those of the tree in figure 1, which used the categorical age variable.
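A cut point such as 49.5 years arises naturally when ages are recorded in whole years and candidate thresholds are taken midway between adjacent observed values, a common convention in tree software. The sketch below assumes that convention and uses invented toy data, not the study data; `gini` and `best_threshold` are hypothetical helper names.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(values, labels):
    """Return (cut, weighted_impurity) for the best binary split of a
    continuous variable, trying midpoints between adjacent distinct values."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    n = len(pairs)
    best = None
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2
        left = [y for v, y in pairs if v < cut]
        right = [y for v, y in pairs if v >= cut]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or weighted < best[1]:
            best = (cut, weighted)
    return best

# Toy data (illustrative only): one woman per year of age from 40 to 79,
# outcome mostly 0 below age 50 and mostly 1 from age 50, with two exceptions.
ages = list(range(40, 80))
outcome = [1 if a >= 50 else 0 for a in ages]
outcome[ages.index(44)] = 1
outcome[ages.index(63)] = 0

cut, impurity = best_threshold(ages, outcome)
print(cut)
```

With these toy data the selected cut falls between ages 49 and 50, which corresponds to the same two groups as the categories 40–49 and 50–79 years.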
The authors (1) point out that one should examine competing splits. We agree. As we note in our paper (2, p. 1223), it is possible that important predictors (i.e., strong competing splits) may be overlooked in the partitioning process. Although not reported, we did examine the competing splits provided by CART. Among women aged 50–79 years, perceived susceptibility was a strong competitor to the split on usefulness of mammography. Because perceived susceptibility was selected as the next node in the tree for that age group, interpretation of the results of our analysis is not altered by this competing split. However, there was also a strong competitor to perceived susceptibility among older women; that is, knowledge of screening guidelines. This variable may therefore be important to consider, although it did not appear in the tree. Among women aged 40–49 years, anxiety during the index screening compared with expectations was a strong competitor to perceived susceptibility. A construct related to perceived susceptibility, this variable may also be important.
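The idea of a competing split can be illustrated by ranking every candidate variable at a node by how much it reduces impurity: the tree shows only the winner, but close runners-up may matter equally. The sketch below uses Gini impurity as the criterion, which is an assumption for illustration; the counts are invented, and the variable names (`useful`, `susceptible`, `knows_guidelines`) and helper functions are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def impurity_decrease(rows, var, outcome="adherent"):
    """Reduction in Gini impurity from splitting the node on a binary variable."""
    parent = [r[outcome] for r in rows]
    left = [r[outcome] for r in rows if r[var] == 0]
    right = [r[outcome] for r in rows if r[var] == 1]
    n = len(rows)
    children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(parent) - children

def rank_splits(rows, candidates):
    """All candidate splits at this node, best first."""
    scored = [(impurity_decrease(rows, v), v) for v in candidates]
    return sorted(scored, reverse=True)

def make_rows(n, useful, susceptible, knows_guidelines, adherent):
    row = {"useful": useful, "susceptible": susceptible,
           "knows_guidelines": knows_guidelines, "adherent": adherent}
    return [dict(row) for _ in range(n)]

# Invented counts for one node (illustrative only, not study data).
rows = (make_rows(14, 1, 1, 1, 1) + make_rows(4, 1, 0, 1, 1) +
        make_rows(2, 1, 1, 0, 0) + make_rows(14, 0, 0, 0, 0) +
        make_rows(3, 0, 1, 0, 1) + make_rows(3, 0, 0, 1, 0))

ranked = rank_splits(rows, ["useful", "susceptible", "knows_guidelines"])
for decrease, var in ranked:
    print(f"{var:18s} {decrease:.3f}")
```

In this toy node the first-ranked variable would appear in the tree, yet the remaining two achieve nearly the same improvement, which is the situation in which examining competitors is informative.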
We agree with the authors (1) that recursive partitioning is but one method of data analysis needed to identify variables of importance in adherence to screening guidelines. Weaknesses of the approach, namely, potential instability and arbitrariness, are described in our paper (2) and by Radespiel-Tröger et al. (1). A major strength is that recursive partitioning may reveal combinations of predictors and important risk subgroups that might be obscured by standard parametric linear models. Recursive partitioning is also an informative exploratory method for selecting discriminating predictors and their interactions from a large number of candidate variables. Adding recursive partitioning to the standard assemblage of multivariable methods thus represents a potentially important advance in understanding the interplay among risk factors. However, as is the case with any statistical method (including logistic regression), recursive partitioning should be carried out with appropriate caution by researchers who are aware of its strengths and limitations.
ACKNOWLEDGMENTS
Conflict of interest: none declared.
References
- Radespiel-Tröger M, Hothorn T, Pfahlberg AB, et al. Re: "Applying recursive partitioning to a prospective study of factors associated with adherence to mammography screening guidelines." (Letter). Am J Epidemiol 2006;164:400–1.
- Calvocoressi L, Stolar M, Kasl SV, et al. Applying recursive partitioning to a prospective study of factors associated with adherence to mammography screening guidelines. Am J Epidemiol 2005;162:1215–24.