American Journal of Epidemiology Advance Access originally published online on January 27, 2008
American Journal of Epidemiology 2008 167(5):523-529; doi:10.1093/aje/kwm355
Invited Commentary: Variable Selection versus Shrinkage in the Control of Multiple Confounders
From the Departments of Epidemiology and Statistics, University of California, Los Angeles, CA
Correspondence to Dr. Sander Greenland, Department of Epidemiology, School of Public Health, University of California, Los Angeles, CA 90095-1772 (e-mail: lesdomes{at}ucla.edu).
Received for publication February 5, 2007. Accepted for publication September 12, 2007.
| ABSTRACT |
|---|
After screening out inappropriate or doubtful covariates on the basis of background knowledge, one may still be left with many potential confounders. It is then tempting to use statistical variable-selection methods to reduce the number used for adjustment. Nonetheless, there is no agreement on how selection should be conducted, and it is well known that conventional selection methods lead to confidence intervals that are too narrow and p values that are too small. Furthermore, theory and simulation evidence have found no selection method to be uniformly superior to adjusting for all well-measured confounders. Nonetheless, control of all measured confounders can lead to problems for conventional model-fitting methods. When these problems occur, one can apply modern techniques such as shrinkage estimation, exposure modeling, or hybrids that combine outcome and exposure modeling. No selection or special software is needed for most of these techniques. It thus appears that statistical confounder selection may be an unnecessary complication in most regression analyses of effects.
Bayesian methods; collapsibility; confounding; epidemiologic methods; regression; shrinkage; validity; variable selection
| INTRODUCTION |
|---|
Epidemiologists often have available a set of many potential confounders for a targeted exposure-disease relation and may attempt to select or delete variables from this set using statistical methods. Hoffmann et al. (1) give a method based on a collapsibility test, and there are several other collapsibility tests that can be applied in the same way (2–6). For example, earlier authors (3, 4) gave much simpler collapsibility tests based on differencing the adjusted and unadjusted variances, which they showed can be applied to odds ratios and to rate ratios (4) and which extend to generalized linear models, including logistic and log-linear rate models (5, 6).
These methods may be useful when the primary research question is whether control of a set of covariates makes an "important" difference in settings like the one at hand. Yet, more often this question is a preliminary one, asking which variables should be used for confounding control in the current analysis. Are such variable-selection methods still needed for this task? In the past millennium, the answer was "yes," given the severe limitations that existed on the number of variables that could be controlled in one model (7). Nonetheless, their application was always theoretically defective in failing to account for selection effects on subsequent tests and confidence intervals for the exposure effect. Fixing this defect is not simple.
Fortunately, statistical confounder selection may no longer be necessary in typical applications. No selection strategy has proven uniformly better than adjusting for all well-measured confounders after preliminary screening to remove doubtful or inappropriate covariates. Statistical problems from controlling many covariates can be addressed by using modern adjustment techniques, such as shrinkage estimation and exposure modeling. If one wishes to delete covariates to simplify analysis and presentation, the safest approach may be to allow deletion only if it has negligible impact on the final confidence interval for the exposure effect.
| CONFOUNDER SELECTION |
|---|
Background screening of candidates
Background knowledge usually identifies covariates that are best controlled no matter what (typically age and sex) or matching factors in case-control studies that predict exposure strongly (8). Nonetheless, many other covariates can be excluded from control on the basis of background knowledge, for example, because they are affected by exposure or disease (8–10). Causal diagrams provide useful rules for such background screening, including rules for identifying covariate sets that are minimally sufficient for confounder control, assuming the diagram is correct (9, 10).
Some covariates that pass this background screening might be excluded because they are judged to have too little variation, too many missing data, or too much measurement error to be worth including. Unfortunately, there are as yet no well-founded guidelines for making these judgments. Nonetheless, such preliminary concerns are best raised early enough to influence data collection, for collecting too many data may tax quality control and subject cooperation, thus degrading the accuracy and completeness of data while increasing drop-out rates.
The structure of confounder selection
If background screening identifies Z as a confounder, the only theoretical rationale for deleting Z is that the resulting bias is negligible or is more than offset by a reduction in variance. Most confounder-selection methods do not address the bias-variance trade-off, however, and instead focus only on the size of the bias. To describe these methods, suppose we have a logistic risk model or log-linear rate model (such as a proportional-hazards model) in which θ is the coefficient of a covariate Z, and βa is the corresponding "Z-adjusted" exposure coefficient. Also, let βu be the "unadjusted" exposure coefficient in the model without Z. The θ, βa, and βu are parameters, representing what would be obtained in a sample so large that random error was negligible.
If βa ≠ βu (equivalently, if βa – βu ≠ 0), the exposure coefficient is noncollapsible over Z, and βa – βu measures the degree of noncollapsibility. Confounding by Z is often measured by the change in exposure-disease association upon adjustment for Z, βa – βu, or a function of it such as the ratio of adjusted and unadjusted ratios exp(βa)/exp(βu) = exp(βa – βu). These measures can be useful approximations in some settings, but can be misleading, as will be discussed below. The remaining discussion will focus on βa – βu but applies to other measures as well. It also applies if Z is a list (vector) of covariates (1, 5, 6).
Significance testing
Although common, selection based on testing the Z coefficient θ is logically flawed, in part because θ is just one factor that determines the degree of confounding by Z (7–13). The latter problem is addressed by testing collapsibility instead (1–6).
Significance testing of collapsibility selects Z if the null hypothesis βa = βu (collapsibility) is rejected (2), that is, if the observed change β̂a – β̂u is "significant." Examples and simulations have found, however, that use of p < 0.05 for "significant coefficient" or "significant change" (α = 0.05) too often leads to deletion of important confounders (false negative decisions) (14–17). Use of much higher α levels for the confounder test (e.g., α = 0.20, selecting if p < 0.20) mitigates this problem (15–17).
A more general objection, however, is that significance testing treats false negative error (deleting a confounder) as secondary to false positive error (selecting a harmless nonconfounder). It has been argued that this ranking of error is backward: Deleting a confounder introduces bias and thus can be justified only if one can infer that the bias is tolerably small or worth the precision gain, whereas retaining a harmless nonconfounder only reduces precision (which is the price paid for protection against confounding by Z) (7, 11–14).
The objection to significance testing may be seen more clearly by using confidence intervals. Deleting Z when the test of βa = βu is nonsignificant is the same as deleting Z if the confidence interval for βa – βu includes zero. In small studies, the interval will be wider and thus more likely to include zero; hence, significance testing is more likely to delete covariates in small than in large studies, even if the covariates are important confounders (12, 13, 16). The same interval may also include large differences, in which case nonsignificance of the observed difference β̂a – β̂u provides no assurance that the true difference βa – βu is small.
Equivalence testing
To obtain some assurance that Z can be safely deleted, some authors consider equivalence testing, in which Z is deleted only if the confidence interval for βa – βu includes no large value (1, 5, 17). For example, with use of –0.1 to 0.1 as the range of tolerable bias from deleting Z (corresponding roughly to 10 percent bias on the odds-ratio or rate-ratio scale), Z would be deleted only if the interval for βa – βu fell in this range (1, 5, 17). With use of a 95 percent or even an 80 percent confidence level and a 10 percent bias tolerance, equivalence testing may delete few covariates or none in smaller studies; this is because it cannot delete a covariate unless the interval for βa – βu is narrower than the range of tolerable bias.
More often, Z is deleted if the change in the point estimate, β̂a – β̂u, falls in the designated tolerable range (11–14). This approach corresponds to using a 0 percent confidence interval for equivalence testing. Some authors recommend instead deleting Z only if the difference in the confidence intervals for βa and βu appears negligible (8, p. 258; 14). This refinement limits the impact of deletions on inferences about the exposure effect.
Parallels and differences in the methods
Significance and equivalence testing are based on the same computation, that is, an estimate β̂a – β̂u of βa – βu, and its variance estimate. For stratified analyses or for small βa – βu (which are the focus of equivalence testing), an approximate variance for β̂a – β̂u is just the variance of β̂a minus the variance of β̂u estimated from models with and without Z, that is, the adjusted minus the unadjusted variance (3–5). Clogg et al. (6) give a more general variance formula, while Hoffmann et al. (1) give another formula that represents βa and βu within a single model. All of these formulas extend to vector Z. Nonetheless, at this time there is no evidence regarding the relative performance of the different collapsibility tests that result.
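The significance and equivalence criteria can be compared side by side in a short Python sketch. It is only an illustration, not the procedure of any cited reference: it assumes the approximate variance var(β̂a) – var(β̂u) and a normal approximation, the function and argument names are hypothetical, and the numerical inputs are invented.

```python
import math
from statistics import NormalDist

def collapsibility_checks(b_adj, var_adj, b_unadj, var_unadj,
                          alpha=0.20, tolerance=0.1):
    """Compare Z-adjusted and unadjusted exposure coefficients.

    Uses the approximate variance var(b_adj) - var(b_unadj) for the
    difference, which is only intended for small differences.
    """
    diff = b_adj - b_unadj
    var_diff = var_adj - var_unadj
    if var_diff <= 0:
        raise ValueError("variance difference must be positive here")
    se = math.sqrt(var_diff)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = diff - z_crit * se, diff + z_crit * se
    return {
        "difference": diff,
        "interval": (lower, upper),
        # Significance testing keeps Z only if the interval excludes zero.
        "keep_by_significance": not (lower <= 0.0 <= upper),
        # Equivalence testing deletes Z only if the whole interval lies
        # inside the tolerable-bias range; otherwise Z is kept.
        "keep_by_equivalence": not (-tolerance <= lower and upper <= tolerance),
    }

# Invented inputs resembling a small study with a wide interval.
result = collapsibility_checks(b_adj=0.50, var_adj=0.09,
                               b_unadj=0.35, var_unadj=0.05)
```

With these invented inputs the interval for the difference includes zero but extends beyond the tolerance, so the significance criterion would delete Z while the equivalence criterion would retain it.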
As mentioned above, significance testing (whether of θ = 0 or of βa = βu) is more likely to delete variables in smaller samples, where many variables will not be clearly confounding; in contrast, equivalence testing is more likely to delete variables in larger samples, where some variables may clearly emerge as unimportant. Because variances are larger, unnecessary control is of greater concern in small samples. These considerations arguably favor significance testing over equivalence testing. Many other criteria are possible. Nonetheless, limited simulations thus far have yet to find a confounder-selection method that always outperforms others when the α levels and bias tolerance are set to limit serious false negative error (16, 17).
Addressing distortions produced by selection
In practice, variable selection is dogged by the fact that the final variance estimates tend to be downwardly biased if they do not account for the selection, while the point estimates may suffer related distortions (18–24). Various procedures have been proposed to address these problems (21–24). A simple fix involves resampling ("bootstrapping") the data with replacement and then repeating the entire analysis procedure, including the variable selection. Confidence limits and p values can be computed from the resulting simulation output (25, 26). Unfortunately, simply using percentiles or the variance of the simulated exposure-disease associations (which is common practice) can give inaccurate confidence limits and p values (25, 26), especially when selection is part of the process (20). All told, it is difficult to construct valid statistics that account for selecting variables or model forms on the basis of the data (23, 27).
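The idea of repeating the entire procedure, selection included, within each bootstrap resample can be sketched as follows. This is a toy illustration under stated assumptions: one binary exposure, one binary covariate, a change-in-estimate selection rule with a 0.1 tolerance, inverse-variance pooling with 0.5 added to every cell, and invented counts. All function names are hypothetical.

```python
import math
import random

def log_odds_ratio(pairs):
    """Log odds ratio and variance from (exposure, outcome) pairs.

    Adds 0.5 to each cell so that empty cells do not cause errors.
    """
    a = sum(1 for x, y in pairs if x == 1 and y == 1) + 0.5
    b = sum(1 for x, y in pairs if x == 1 and y == 0) + 0.5
    c = sum(1 for x, y in pairs if x == 0 and y == 1) + 0.5
    d = sum(1 for x, y in pairs if x == 0 and y == 0) + 0.5
    return math.log(a * d / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def analyze(data, tolerance=0.1):
    """Whole procedure: select on change in estimate, then estimate.

    data holds (exposure, covariate, outcome) triples of 0/1 values.
    """
    crude, _ = log_odds_ratio([(x, y) for x, z, y in data])
    num = den = 0.0
    for level in (0, 1):
        est, var = log_odds_ratio([(x, y) for x, z, y in data if z == level])
        num += est / var  # inverse-variance pooling over strata of Z
        den += 1 / var
    adjusted = num / den
    # Keep the adjusted estimate only if adjustment changes it enough.
    return adjusted if abs(adjusted - crude) > tolerance else crude

def bootstrap_estimates(data, n_boot=500, seed=0):
    """Resample records with replacement and rerun analyze() each time."""
    rng = random.Random(seed)
    n = len(data)
    return sorted(
        analyze([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )

# Invented counts keyed by (exposure, covariate, outcome).
counts = {(1, 1, 1): 30, (1, 1, 0): 20, (0, 1, 1): 10, (0, 1, 0): 15,
          (1, 0, 1): 8, (1, 0, 0): 22, (0, 0, 1): 12, (0, 0, 0): 60}
data = [key for key, n in counts.items() for _ in range(n)]
estimates = bootstrap_estimates(data, n_boot=200)
```

The sketch stops at the sorted bootstrap estimates on purpose. As the text notes, turning them into confidence limits by simple percentiles or a variance can be inaccurate when selection is part of the process, so that step needs the more careful methods of the cited references.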
Problems with the noncollapsibility criterion
A problem for any analysis is that Z may not be a confounder even if the exposure-disease association is noncollapsible (βa ≠ βu) and Z precedes exposure and disease. That is, Z may look like a confounder both statistically and temporally, but its control may only worsen bias. One way this can happen is if the relations of Z to the exposure and disease are themselves confounded (9, 10). For example, under figure 1 with Z binary, upon control of Z we would find βa ≠ βu; nonetheless, it is the adjusted coefficient βa that is confounded (albeit by A and B rather than by Z).
For logistic and proportional-hazards models, noncollapsibility can occur even if there is no confounding of any relation (28–33). For example, if one uses maximum-likelihood estimation, the observed change
| CONTROLLING ALL MEASURED CONFOUNDERS |
|---|
To minimize the risk of confounding and to avoid the problems and complexities of variable selection, after preliminary screening we might fit a model regressing incidence or prevalence on exposure and all the measured potential confounders. As mentioned above, however, conventional fitting methods such as maximum likelihood (whether unconditional or conditional) may then suffer from estimate inflation (sparse-data bias) or may fail to converge. These problems are likely if there are too few cases or noncases per model coefficient, or if the cases or noncases are too concentrated at single covariate values (30–34). Nonetheless, inflation may occur even if there is only one confounder (32, 33).
Real examples have exhibited large inflation of odds ratios because of sparse data, often unrecognized in the original reports (32–36). To control inflation and introduce background information into the model, one can use shrinkage methods, also known as random-coefficient, ridge, Stein, empirical-Bayes, semi-Bayes, hierarchical, multilevel, and penalized estimation (8, chap. 21; 24, 31–35, 37–54). These methods pull ("shrink") coefficient estimates toward zero or more generally toward values expected from background information (Bayesian analysis) or estimated from the data (empirical Bayes). Given a covariate Z with coefficient θ, the pull is proportional to the estimated variance of the estimate θ̂; thus, unstable estimates are shrunk more than stable ones.
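One way to see why unstable estimates are shrunk more is the familiar inverse-variance weighted average of a prior mean and a conventional estimate. The sketch below assumes that approximation, which is not spelled out in the text; the function name, the prior variance of 0.5, and the example numbers are all hypothetical.

```python
def shrink(estimate, variance, prior_mean=0.0, prior_variance=0.5):
    """Inverse-variance weighted average of a prior mean and an estimate.

    The fraction of the distance to the prior mean that the estimate is
    pulled equals variance / (variance + prior_variance), so it grows
    with the estimated variance.
    """
    w_data = 1.0 / variance
    w_prior = 1.0 / prior_variance
    shrunk = (w_data * estimate + w_prior * prior_mean) / (w_data + w_prior)
    shrunk_variance = 1.0 / (w_data + w_prior)
    return shrunk, shrunk_variance

# The same estimate of 1.0 with a small and a large estimated variance.
stable, _ = shrink(1.0, variance=0.05)
unstable, _ = shrink(1.0, variance=2.0)
```

Here the stable estimate moves only slightly toward zero, whereas the unstable one is pulled most of the way.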
Variable selection can be viewed as crude shrinkage in which θ̂ is either pulled all the way to zero (deleted) or else not shrunk at all (19). Shrinkage methods allow more flexibility by letting a variable be partially deleted to an extent determined by the instability of θ̂, although some varieties (e.g., the Lasso (24)) may completely delete certain variables. Related approaches shrink the adjusted exposure-coefficient estimate β̂a toward the unadjusted estimate β̂u in order to minimize the estimated mean-squared error (17, 55).
The theoretical advantages of shrinkage over conventional methods have been confirmed in many real examples and simulation studies (24, 32–35, 38–54). Of great practical importance, however, is that shrinkage can be done with ordinary software for logistic, Poisson, or Cox models via the method of prior data (35, 41, 48, 53, 56–58).
Shrinkage via prior data
Shrinkage estimation with prior data first appeared in the 18th century (41). To describe the original method of Laplace, suppose we wish to estimate the probability of disease π from a random sample with A cases and B noncases out of N = A + B total. The method adds one case and one noncase to the data, thus shifting from the maximum likelihood estimate A/N of π to the shrinkage estimate (A + 1)/(N + 2), which is closer to 1/2. Similarly, the estimate of θ = logit(π) = ln[π/(1 – π)] is shifted from the maximum likelihood estimate ln(A/B) to ln[(A + 1)/(B + 1)], which is closer to logit(1/2) = 0. Thus, Laplace's method shrinks the estimate of θ toward zero by adding two "prior observations" to the data. The degree of shrinkage depends on the sizes of A and B, becoming important only if A or B is small.
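Laplace's adjustment is simple enough to compute directly. The following sketch uses a hypothetical function name and invented counts to contrast a small sample with a large one.

```python
import math

def laplace_shrinkage(cases, noncases):
    """Return ML and Laplace-shrunk estimates of the risk and its logit."""
    n = cases + noncases
    ml_risk = cases / n
    shrunk_risk = (cases + 1) / (n + 2)
    # The ML logit is undefined when either count is zero.
    ml_logit = math.log(cases / noncases) if cases > 0 and noncases > 0 else None
    shrunk_logit = math.log((cases + 1) / (noncases + 1))
    return ml_risk, shrunk_risk, ml_logit, shrunk_logit

small = laplace_shrinkage(1, 9)      # risk 0.1 moves to 2/12
large = laplace_shrinkage(100, 900)  # risk 0.1 moves to 101/1002
```

Both samples have a maximum likelihood risk of 0.1, but the added case and noncase change the small-sample estimate noticeably and the large-sample estimate hardly at all.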
Laplace's method can also be applied when θ is the coefficient of Z in a logistic risk or a Poisson (log-linear) rate model (48): We add two "prior-data" records to the data, one with Z = 1 and the other with Z = 0. Each record has one case and 100,000 persons or person-years, with all other variables set to zero. These two records are a stratum of "prior data" with an odds ratio or rate ratio of 1, giving an estimate of ln(1) = 0 for the log ratio θ. This prior stratum pulls the estimate of θ toward zero. To avoid confounding of the prior data with the actual data, we also add an indicator variable PriorZ to the regression, with PriorZ = 1 for the two added records and = 0 for all other records.
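The data augmentation just described can be sketched as a small helper that appends the two prior records and the indicator to grouped data. The record layout (dictionaries with `cases` and `denominator` fields), the function name, and the example counts are assumptions made for illustration; the augmented records would then be passed to whatever ordinary regression software is in use.

```python
def add_prior_records(records, covariate, all_covariates,
                      prior_cases=1, denominator=100_000):
    """Return a new list with two prior-data records for one covariate.

    Each input record is a dict holding 'cases', 'denominator', and one
    entry per covariate. The originals are copied, not modified.
    """
    indicator = "Prior" + covariate
    # Actual data get indicator = 0.
    augmented = [dict(r, **{indicator: 0}) for r in records]
    for value in (1, 0):  # one record with Z = 1, one with Z = 0
        prior = {name: 0 for name in all_covariates}  # other variables zero
        prior[covariate] = value
        prior[indicator] = 1
        prior["cases"] = prior_cases
        prior["denominator"] = denominator
        augmented.append(prior)
    return augmented

# Invented grouped data with exposure X and covariate Z.
data = [
    {"cases": 12, "denominator": 4000, "X": 1, "Z": 1},
    {"cases": 5, "denominator": 6000, "X": 0, "Z": 1},
    {"cases": 7, "denominator": 9000, "X": 1, "Z": 0},
    {"cases": 3, "denominator": 8000, "X": 0, "Z": 0},
]
augmented = add_prior_records(data, "Z", ["X", "Z"])
```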
Using just one case per prior record corresponds to 95 percent prior certainty that exp(θ) is between 1/39 and 39 and has little impact on most coefficients. If we wish to narrow the probable range for exp(θ) and thus induce more shrinkage, we add more prior cases. For example, the use of four cases per record corresponds roughly to 95 percent prior certainty that exp(θ) is between 1/4 and 4, whereas the use of 16 cases per record corresponds to 95 percent prior certainty that exp(θ) is between 1/2 and 2 (56).
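The rough correspondence between prior cases and prior limits can be checked with a normal approximation in which the prior variance of the log ratio is 1/cases for each of the two prior records. The approximation and the function name are assumptions of this sketch, not formulas quoted from the text.

```python
import math
from statistics import NormalDist

def approx_prior_limits(prior_cases, level=0.95):
    """Approximate prior limits for the ratio, given cases per prior record."""
    variance = 2.0 / prior_cases  # 1/cases from each of the two records
    z = NormalDist().inv_cdf(0.5 + level / 2)
    upper = math.exp(z * math.sqrt(variance))
    return 1 / upper, upper

limits_4 = approx_prior_limits(4)    # close to (1/4, 4)
limits_16 = approx_prior_limits(16)  # close to (1/2, 2)
```

For one prior case the same approximation gives limits near 1/16 and 16 rather than the 1/39 and 39 quoted above, so the quoted figure presumably rests on an exact prior distribution rather than a normal approximation; that reading is an inference, not something stated in the text.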
There are many other extensions of the prior-data method. To pull the estimated θ toward a nonzero number θ0, we set the denominator in the second record to exp(θ0) × 100,000 (56). To pull more in one direction than another, we skew the prior data (53, 58). We add two prior records and one prior indicator for each coefficient we wish to shrink; the number of cases added and the ratio of denominators may differ among coefficients (48). To shrink coefficients in conditional-logistic or proportional-hazards regression, however, we add counts of discordant matched pairs instead of counts of individuals, and we need no prior indicator (48, 57).
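The denominator adjustment for a nonzero prior center can be verified numerically. In this sketch the first record has Z = 1 and the second has Z = 0, and the function and field names are hypothetical.

```python
import math

def prior_pair(theta0=0.0, prior_cases=1, base_denominator=100_000):
    """Two prior records whose rate ratio (Z = 1 versus Z = 0) is exp(theta0)."""
    first = {"Z": 1, "cases": prior_cases, "denominator": base_denominator}
    second = {"Z": 0, "cases": prior_cases,
              "denominator": math.exp(theta0) * base_denominator}
    return first, second

def implied_log_rate_ratio(first, second):
    """Log of the rate in the Z = 1 record over the rate in the Z = 0 record."""
    rate1 = first["cases"] / first["denominator"]
    rate0 = second["cases"] / second["denominator"]
    return math.log(rate1 / rate0)

# Center the prior on a rate ratio of 2.
first, second = prior_pair(theta0=math.log(2))
center = implied_log_rate_ratio(first, second)
```

Enlarging the denominator of the Z = 0 record lowers its rate, so the prior stratum has a log rate ratio equal to the chosen center.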
For exp(θ) and the prior data to make sense, 0 and 1 must be meaningfully different values for Z, which may require recoding of Z. For example, if Z is diastolic blood pressure in millimeters of mercury, we may subtract 80 mm to make Z = 0 correspond to 80 mm and then divide by 10 to make Z = 1 correspond to 90 mm; exp(θ) would now be the odds ratio or rate ratio comparing 90 mm (Z = 1) with 80 mm (Z = 0).
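The recoding in the blood pressure example amounts to a shift and a rescale, which can be written as a one-line helper (the function name is hypothetical):

```python
def recode(value, reference=80.0, unit=10.0):
    """Shift and rescale so that 0 and 1 are meaningfully different values."""
    return (value - reference) / unit

z_80 = recode(80)    # reference pressure maps to Z = 0
z_90 = recode(90)    # 10 mm above the reference maps to Z = 1
z_105 = recode(105)  # intermediate values scale linearly
```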
Shrinkage for control of collinear confounders
Unlike conventional methods, shrinkage can be applied when the covariates are collinear and conventional methods fail completely. In particular, shrinkage allows control for potential confounding between primary recorded variables and derived quantities, such as confounding between food and nutrient effects.
As an example, table 1 shows the results of applying various methods to a study of 140 breast cancer cases and 222 matched controls (34, 59). Nutrient intakes were derived from a food questionnaire using composition tables; hence, the 35 derived nutrient measurements are collinear with the 87 food items. Keeping all 35 nutrients in the conditional logistic model identifies omega-3 fatty acids and phytoestrogens as negatively associated with breast cancer and beta-carotene as positively associated. Stepwise (backward) deletion using an α-to-remove of 0.10 only slightly changes these results.
Although such analyses go unquestioned in the literature, they are not credible insofar as they assume that all the food coefficients are zero. That is, they assume that each food Z has no residual effect (no effect beyond that due to the measured nutrients). These assumptions are necessitated by the fact that conventional methods cannot fit a model with collinear covariates and so cannot include all the foods as confounders. This limitation leaves them vulnerable to large net confounding: Even if none of the omitted variables has a large effect by itself, their net effect taken together could be considerable. This possibility is a greater validity threat to the stepwise analysis, because it also assumes that most nutrient coefficients are zero.
Shrinkage methods allow one to address residual confounding by entering each food in the model, along with prior data that limit the size of the residual food effects. In the example, entering 16 case-control pairs with Z = 1 for the case and Z = 0 for the control, as well as 16 pairs with Z = 0 for the case and Z = 1 for the control, is equivalent to assuming a normal prior for the Z coefficient that places 95 percent probability on the odds ratio exp(θ) being between 1/2 and 2 (48). When all 87 foods are entered in the regression along with these prior data, the nutrient results look far less certain than they do in the conventional analysis; refer to the final column of table 1.
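The effect of adding prior pairs is easiest to see in a deliberately simplified setting: 1:1 matched pairs and a single binary covariate, where the conditional maximum likelihood odds ratio is the ratio of the two discordant-pair counts. That setting, the function name, and the counts below are assumptions for illustration and do not reproduce the multi-covariate analysis in table 1.

```python
import math

def paired_log_or(case_exposed_pairs, control_exposed_pairs, prior_pairs=0):
    """Log odds ratio and variance from discordant matched-pair counts.

    prior_pairs is added to both discordant counts, as in the prior data
    described in the text.
    """
    n10 = case_exposed_pairs + prior_pairs
    n01 = control_exposed_pairs + prior_pairs
    return math.log(n10 / n01), 1 / n10 + 1 / n01

# Invented sparse data: 9 pairs with only the case exposed, 3 with only
# the control exposed.
raw, raw_var = paired_log_or(9, 3)
shrunk, shrunk_var = paired_log_or(9, 3, prior_pairs=16)
```

With these invented counts the odds ratio moves from 3 to 25/19, and the variance of the log odds ratio falls because the prior pairs add information.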
The omission of foods from the conventional analyses corresponds to the extreme of betting 100 percent that every food coefficient is zero. The same result would be obtained by entering each food with an infinite number of matched pairs as prior data. The omission of an additional 20 nutrients by the stepwise analysis corresponds to extending this unsupportable certainty to most of the nutrients. The unwarranted certainty that residual food effects are absent translates into overly narrow confidence intervals for the nutrient effects.
Such excessive certainty in conventional results may help to explain refutations of epidemiologic findings by randomized trials (60). Extensions of shrinkage methods to bias modeling allow incorporation of further sources of uncertainty, such as selection bias, misclassification, and unmeasured confounders, leading to greater caution in inferences about highly "statistically significant" or precise-looking associations (61).
| EXPOSURE MODELING |
|---|
Many clinical studies have good information for treatment prediction, which allows improved confounder control via treatment (exposure) modeling. Methods that use exposure modeling include adjustment by exposure regression (11, 62, 63), exposure probability (propensity scoring) (64, 65), inverse-probability weighting (65, 66), and hybrid methods that use both exposure and outcome models for adjustment (67–69).
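As one concrete instance of exposure modeling, inverse-probability weighting can be sketched for a single categorical confounder, with the exposure probability estimated by stratum proportions. This is a toy cohort example with invented counts and hypothetical function names, and it assumes every stratum contains both exposed and unexposed subjects.

```python
def ipw_risks(records):
    """Weighted risks among exposed and unexposed subjects.

    records holds (exposure, confounder, outcome) triples with 0/1
    exposure and outcome. Each stratum of the confounder must contain
    both exposed and unexposed subjects.
    """
    totals, exposed = {}, {}
    for x, z, y in records:
        totals[z] = totals.get(z, 0) + 1
        exposed[z] = exposed.get(z, 0) + x
    weighted_cases = {0: 0.0, 1: 0.0}
    weighted_n = {0: 0.0, 1: 0.0}
    for x, z, y in records:
        p = exposed[z] / totals[z]  # estimated P(exposed | Z = z)
        weight = 1 / p if x == 1 else 1 / (1 - p)
        weighted_cases[x] += weight * y
        weighted_n[x] += weight
    return weighted_cases[1] / weighted_n[1], weighted_cases[0] / weighted_n[0]

def expand(x, z, cases, total):
    """Expand grouped counts into individual records."""
    return [(x, z, 1)] * cases + [(x, z, 0)] * (total - cases)

# Invented cohort: risk is 0.5 when Z = 1 and 0.1 when Z = 0 regardless
# of exposure, but exposure is much more common when Z = 1.
data = (expand(1, 1, 40, 80) + expand(0, 1, 10, 20)
        + expand(1, 0, 2, 20) + expand(0, 0, 8, 80))
risk_exposed, risk_unexposed = ipw_risks(data)
```

In these data the crude risks are 0.42 and 0.18, while the weighted risks are both 0.30, reflecting the absence of an exposure effect within levels of Z.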
Although exposure modeling is a burgeoning topic, present methods have difficulties and need more development, especially for case-control analysis. For example, exposure-effect estimates from exposure modeling can suffer variance inflation if the fitted model predicts exposure very well, even if no confounding is present. Again, screening out nonconfounders on subject-matter grounds is essential, and statistical variable selection among what remains is problematic. Collapsibility tests based on exposure models do not yet exist, while methods that test only the covariate-exposure association can easily delete important confounders, yet worsen variance inflation by selecting nonconfounding but strong exposure predictors (70). Relevant avenues for investigation include development of valid methods for variable selection and shrinkage in exposure and hybrid modeling.
Exposure modeling does have a sample-size advantage in cohort studies when the outcome is rare but exposure is common, for then exposure risk can be more stably estimated than outcome risk (71). This advantage does not lead to greater efficiency, however, because the final precision remains limited by the small numbers of cases. The sample-size advantage can shift to outcome modeling when the outcome is more common in the study than the exposure, as in many case-control studies. Furthermore, exposure modeling is prone to several statistical artifacts in case-control studies (72).
| CONCLUSION |
|---|
Examination of the results of different analytical approaches is helpful, especially when one understands the strengths and weaknesses of each method. Confounder selection methods have numerous weaknesses that should encourage direct examination of results from control of all measured confounders when that is feasible, even if only for reference. When full control leads to problems for conventional regression programs, one can apply modern techniques such as shrinkage, exposure modeling, or hybrid methods as outlined above, rather than rely on conventional variable-selection algorithms that can easily delete or ignore important confounders. Versions of these techniques can be performed with common software packages, and they can provide more accurate estimates and uncertainty assessments than can conventional methods. Although no statistical method is a panacea for epidemiologic problems, these methods deserve to become part of the regular curriculum in epidemiologic analysis.
| ACKNOWLEDGMENTS |
|---|
The author thanks Babette Brumback, Katherine Hoggatt, and Charles Poole for helpful comments.
Conflict of interest: none declared.
| References |
|---|
- Hoffmann K, Pischon T, Schulz M, et al. A statistical test for the equality of differently adjusted incidence rate ratios. Am J Epidemiol (2008) 167:517–22.
[Abstract/Free Full Text] - Whittemore AS. Collapsing multidimensional contingency tables. J R Stat Soc (B) (1978) 40:328–40.
- Greenland S, Mickey RM. Closed form and dually consistent methods for inference on strict collapsibility in 2 x 2 x K and 2 x J x K tables. Appl Stat (1988) 37:335–43.[CrossRef]
- Hausman JA. Specification tests in econometrics. Econometrica (1978) 46:1251–71.[CrossRef][Web of Science]
- Greenland S, Maldonado G. Inference on collapsibility in generalized linear models. Biom J (1994) 36:771–82.[CrossRef][Web of Science]
- Clogg CC, Petkova E, Haritou A. Statistical methods for comparing regression coefficients between models (with discussion). Am J Sociol (1995) 100:1261–312.[CrossRef]
- Robins JM, Greenland S. The role of model selection in causal inference from nonexperimental data. Am J Epidemiol (1986) 123:392–402.
[Free Full Text] - Rothman KJ, Greenland S. Modern epidemiology. 2nd ed. (1998) Philadelphia, PA: Lippincott-Raven.
- Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology (1999) 10:37–48.[CrossRef][Web of Science][Medline]
- Pearl J. Causality. (2000) New York, NY: Cambridge University Press.
- Miettinen OS. Stratification by a multivariate confounder score. Am J Epidemiol (1976) 104:609–20.
[Abstract/Free Full Text] - Greenland S, Neutra RR. Control of confounding in the assessment of medical technology. Int J Epidemiol (1980) 9:361–7.
[Abstract/Free Full Text] - Greenland S. Cautions in the use of preliminary test estimators. Stat Med (1989) 8:669–73.[CrossRef][Web of Science]
- Greenland S. Modeling and variable selection in epidemiologic analysis. Am J Public Health (1989) 79:340–9.
[Abstract/Free Full Text] - Dales LD, Ury HK. An improper use of statistical significance testing in studying covariables. Int J Epidemiol (1978) 4:373–5.
- Mickey RM, Greenland S. The impact of confounder selection criteria on effect estimation. Am J Epidemiol (1989) 129:125–37.
[Abstract/Free Full Text] - Maldonado G, Greenland S. Simulation study of confounder-selection strategies. Am J Epidemiol (1993) 138:923–36.
[Abstract/Free Full Text] - Sclove SL, Morris C, Radhakrishna R. Non-optimality of preliminary-test estimators for the mean of a multivariate normal distribution. Ann Math Stat (1972) 43:1481–90.[CrossRef]
- Leamer EE. Specification searches. (1978) New York, NY: Wiley.
- Freedman DA, Navidi W, Peters SC. On the impact of variable selection in fitting regression equations. In: On model uncertainty and its statistical implications—Dijlestra TK, ed. (1988) Berlin, Germany: Springer-Verlag. 1–16.
- Viallefont V, Raftery AE, Richardson S. Variable selection and Bayesian model averaging in case-control studies. Stat Med (2001) 20:3215–30.[CrossRef][Web of Science][Medline]
- Hill JL, McCulloch RE. Bayesian nonparametric modeling for causal inference. J Am Stat Assoc (in press).
- Breiman L. The little bootstrap and other methods for dimensionality selection in regression: X-fixed prediction error. J Am Stat Assoc (1992) 87:738–54.[CrossRef][Web of Science]
- Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. (2001) New York, NY: Springer-Verlag.
- Efron B, Tibshirani R. An introduction to the bootstrap. (1993) New York, NY: Chapman and Hall.
- Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, and what? Stat Med (2000) 19:1141–64.[CrossRef][Web of Science][Medline]
- Sinisi SE, van der Laan MJ. Deletion/substitution/addition algorithm in learning with applications in genomics. Stat Appl Genet Mol Biol (2004) 3. article 18. (Electronic article). (http://www.bepress.com/sagmb/vol3/iss1/art18/).
- Greenland S. Absence of confounding does not correspond to collapsibility of the rate ratio or rate difference. Epidemiology (1996) 7:498–501.[Web of Science][Medline]
- Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Stat Sci (1999) 14:29–46.[CrossRef][Web of Science]
- Pike MC, Hill AP, Smith PG. Bias and efficiency in logistic analyses of stratified case-control studies. Int J Epidemiol (1980) 9:89–95.
[Abstract/Free Full Text] - Peduzzi P, Concato J, Kemper E, et al. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol (1996) 49:1373–9.[CrossRef][Web of Science][Medline]
- Greenland S. Small-sample bias and corrections for conditional maximum-likelihood odds-ratio estimators. Biostatistics (2000) 1:113–22.[Abstract]
- Greenland S, Schwartzbaum JA, Finkle WD. Problems from small samples and sparse data in conditional logistic regression analysis. Am J Epidemiol (2000) 151:531–9.
[Abstract/Free Full Text] - Greenland S. When should epidemiologic regressions use random coefficients? Biometrics (2000) 56:915–21.[CrossRef][Web of Science][Medline]
- Greenland S. Putting background information about relative risks into conjugate priors. Biometrics (2001) 57:663–70.[CrossRef][Web of Science][Medline]
- Mejia-Arangure JM, Fajardo-Guiterrez A, Perez-Saldivar ML, et al. Magnetic fields and acute leukemia in children with Down syndrome. Epidemiology (2007) 18:158–61.[CrossRef][Web of Science][Medline]
- Efron B, Morris C. Stein's estimation rule and its competitors—an empirical Bayes approach. J Am Stat Assoc (1973) 68:117–30.[CrossRef][Web of Science]
- Efron B, Morris CN. Data analysis using Stein's estimator and its generalizations. J Am Stat Assoc (1975) 70:311–19.[CrossRef][Web of Science]
- Morris CN. Parametric empirical Bayes: theory and applications (with discussion). J Am Stat Assoc (1983) 78:47–65.[CrossRef][Web of Science]
- Copas JB. Regression, prediction, and shrinkage. J R Stat Soc (B) (1983) 45:311–54.
- Good IJ. Good thinking. (1983) Minneapolis, MN: University of Minnesota Press.
- Thomas DC, Semiatycki J, Dewar R, et al. The problem of multiple inference in studies designed to generate hypotheses. Am J Epidemiol (1985) 122:1080–95.
[Abstract/Free Full Text] - Greenland S. Methods for epidemiologic analyses of multiple exposures: a review and a comparative study of maximum-likelihood, preliminary testing, and empirical-Bayes regression. Stat Med (1993) 12:717–36.[Web of Science][Medline]
- Leonard T, Hsu JSJ. Bayesian methods. (1999) New York, NY: Cambridge University Press.
- Greenland S. Multilevel modeling and model averaging. Scand J Work Environ Health (1999) 25(suppl 4):43–8.[Web of Science][Medline]
- Greenland S. Principles of multilevel modelling. Int J Epidemiol (2000) 29:158–67.
[Abstract/Free Full Text] - Carlin B, Louis TA. Bayes and empirical-Bayes methods of data analysis. (2000) 2nd ed. New York, NY: Chapman and Hall.
- Greenland S. Bayesian perspectives for epidemiologic research. II. Regression analysis. Int J Epidemiol (2007) 36:195–202.
- Witte JS, Greenland S. Simulation study of hierarchical regression. Stat Med (1996) 15:1161–70.
- Greenland S. Second-stage least squares versus penalized quasi-likelihood for fitting hierarchical models in epidemiologic analysis. Stat Med (1997) 16:515–26.
- Aragaki CC, Greenland S, Probst-Hensch NM, et al. Hierarchical modeling of gene-environment interactions: estimating NAT2* genotype-specific dietary effects on adenomatous polyps. Cancer Epidemiol Biomarkers Prev (1997) 6:307–14.
- Witte JS, Greenland S, Kim LL, et al. Multilevel modeling in epidemiology with GLIMMIX. Epidemiology (2000) 11:684–8.
- Greenland S. Generalized conjugate priors for Bayesian analysis of risk and survival regressions. Biometrics (2003) 59:92–9.
- Greenland S. Reducing mean squared error in the analysis of stratified epidemiologic studies. Biometrics (1991) 47:773–5.
- Greenland S. Bayesian perspectives for epidemiologic research. I. Foundations and basic methods. Int J Epidemiol (2006) 35:765–78.
- Greenland S, Christensen R. Data augmentation for Bayesian and semi-Bayes analyses of conditional-logistic and proportional-hazards regression. Stat Med (2001) 20:2421–8.
- Greenland S. Prior data for non-normal priors. Stat Med (2007) 26:3578–90.
- Ursin G, Aragaki CC, Paganini-Hill A, et al. Oral contraceptives and premenopausal bilateral breast cancer: a case-control study. Epidemiology (1992) 3:414–19.
- Lawlor DA, Davey Smith G, Bruckdorfer KR, et al. Those confounded vitamins: what can we learn from the differences between observational versus randomized trial evidence? Lancet (2004) 363:1724–7.
- Greenland S. Multiple-bias modeling for analysis of observational data (with discussion). J R Stat Soc (A) (2005) 168:267–308.
- Robins JM, Mark SD, Newey WK. Estimating exposure effects by modeling the expectation of exposure conditional on confounders. Biometrics (1992) 48:479–95.
- Brumback BA, Greenland S, Redman M, et al. The intensity-score approach to adjusting for confounding. Biometrics (2003) 59:274–85.
- Rosenbaum PR. Observational studies. 2nd ed. (2002) New York, NY: Springer-Verlag.
- Lunceford JK, Davidian M. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Stat Med (2004) 23:2937–60.
- Robins JM, Hernán MA, Brumback BA. Marginal structural models and causal inference in epidemiology. Epidemiology (2000) 11:550–60.
- Rubin DB, Thomas N. Combining propensity score matching with additional adjustments for prognostic covariates. J Am Stat Assoc (2000) 95:573–85.
- Bang H, Robins JM. Doubly robust estimation in missing data and causal inference models. Biometrics (2005) 61:962–72.
- Kang JDY, Schafer JL. Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data (with discussion). Stat Sci. in press.
- Brookhart MA, Schneeweiss S, Rothman KJ, et al. Variable selection for propensity score models. Am J Epidemiol (2006) 163:1149–56.
- Cepeda MS, Boston R, Farrar JT, et al. Comparison of logistic regression versus propensity score when the number of events is low and there are multiple confounders. Am J Epidemiol (2003) 158:280–7.
- Mansson R, Hennessy S, Joffe MM. On the estimation and use of propensity scores in case-control and case-cohort studies. Am J Epidemiol (2007) 166:332–9.