American Journal of Epidemiology Advance Access originally published online on March 28, 2007
American Journal of Epidemiology 2007 165(10):1110-1118; doi:10.1093/aje/kwm074
PRACTICE OF EPIDEMIOLOGY
Performance of Propensity Score Calibration: A Simulation Study
1 Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
2 Division of Preventive Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
3 Department of Epidemiology, Harvard School of Public Health, Boston, MA
4 Department of Epidemiology, Boston University School of Public Health, Boston, MA
5 Research Triangle Institute, Research Triangle Park, NC
6 Department of Biostatistics, Harvard School of Public Health, Boston, MA
Correspondence to Dr. Til Stürmer, Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, Harvard Medical School, 1620 Tremont Street, Suite 3030, Boston, MA 02120 (e-mail: til.sturmer{at}post.harvard.edu).
Received for publication November 1, 2005. Accepted for publication July 13, 2006.
| ABSTRACT |
|---|
Confounding can be a major source of bias in nonexperimental research. The authors recently introduced propensity score calibration (PSC), which combines propensity scores and regression calibration to address confounding by variables unobserved in the main study by using variables observed in a validation study. Here, the authors assess the performance of PSC using simulations in settings with and without violation of the key assumption of PSC: that the error-prone propensity score estimated in the main study is a surrogate for the gold-standard propensity score (i.e., it contains no additional information on the outcome). The assumption can be assessed if data on the outcome are available in the validation study. If data are simulated allowing for surrogacy to be violated, results depend largely on the extent of violation. If surrogacy holds, PSC leads to bias reduction between 32% and 106% (>100% representing overcorrection). If surrogacy is violated, PSC can lead to an increase in bias. Surrogacy is violated when the direction of confounding of the exposure-disease association caused by the unobserved variable(s) differs from that of the confounding due to observed variables. When surrogacy holds, PSC is a useful approach to adjust for unmeasured confounding using validation data.
bias (epidemiology); cohort studies; confounding factors (epidemiology); epidemiologic methods; models, statistical; propensity score calibration; research design
Abbreviations: OR, odds ratio; PSC, propensity score calibration; PSEP, error-prone propensity score; PSGS, gold-standard propensity score
| INTRODUCTION |
|---|
Confounding can be a major source of bias in nonexperimental research. Studies often lack data on important potential confounders, such as smoking and body mass index in pharmacoepidemiologic studies that use claims data, or laboratory or blood pressure measurements in questionnaire-based studies. Various methods have been proposed for assessing the sensitivity of observed associations to the possible effect of unobserved confounders (1-12), but only one of these can address the joint confounding due to multiple unobserved confounders (12). We recently introduced propensity score calibration (PSC) (13), which combines propensity scores (14) and regression calibration developed to correct for measurement error (15, 16). Our goal was to address the joint confounding by variables unobserved in the main cohort study by using variables observed in a cross-sectional validation study. We previously demonstrated that this method worked well in one specific pharmacoepidemiologic example, that is, in the assessment of the effect of nonsteroidal antiinflammatory drugs on short-term all-cause mortality (13), without requiring outcome information in the validation study.
As we have noted previously (13), PSC, like regression calibration, is dependent on the assumption that the error-prone variable is a surrogate for the gold-standard variable, that is, that the error-prone propensity score is independent of disease given the gold-standard propensity score and exposure (17, 18). Thus, under surrogacy, the error-prone propensity score serves as a proxy for the gold-standard propensity score with measurement error that is independent of the outcome. Surrogacy is plausible in many settings, especially when the gold-standard and error-prone variables are observed at baseline in a cohort study in which the disease outcome occurs later in time (17). For example, it is plausible that a single day's blood pressure contributes no information on incidence of cardiovascular disease beyond that given by true long-term blood pressure (17). By contrast, self-reported total cholesterol values, which might be considered surrogates for (unavailable) serum cholesterol values, have been observed to be stronger predictors of cardiovascular outcomes than measured serum cholesterol values in the Women's Health Study (19). Thus, self-reported cholesterol is not a surrogate for measured cholesterol in this setting, and use of regression calibration to correct for measurement error in the self-reported cholesterol values, based on measured serum cholesterol level in a validation study, would be invalid because surrogacy is violated.
Here we present the results of a simulation study assessing the performance of PSC under a wide range of parameter constellations and in settings with and without violation of surrogacy, and we discuss the meaning of surrogacy using practical examples.
| METHODS |
|---|
Propensity score calibration
Assume a main cohort study with a dichotomous exposure of interest, A, a dichotomous outcome of interest, Y, and information on two confounders, X1 and X2. An additional third confounder, C, is observed in a separate validation study only, together with exposure A and confounders X1 and X2.
To control for confounding in the main study, we first estimate the propensity score given the two observed confounders, X1 and X2, by fitting a logistic regression model with exposure as the dependent variable. Since this propensity score is estimated without information on the third confounder C, we call this the error-prone propensity score (PSEP):
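$$\mathrm{PS}_{\mathrm{EP}} = \Pr(A = 1 \mid X_1, X_2) = \left\{1 + \exp\left[-(\alpha_0 + \alpha_1 X_1 + \alpha_2 X_2)\right]\right\}^{-1} \tag{1}$$

The exposure-outcome association is then estimated in the main study, controlling for PSEP:

$$E(Y \mid A, \mathrm{PS}_{\mathrm{EP}}) = \left\{1 + \exp\left[-(\beta_0 + \beta_1 A + \beta_2 \mathrm{PS}_{\mathrm{EP}})\right]\right\}^{-1} \tag{2}$$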
In the validation sample, we then estimate both PSEP, as a function of X1 and X2 (equation 1), and the gold-standard propensity score (PSGS), as a function of X1, X2, and C. Following the general notation introduced for the propensity score, we define
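$$\mathrm{PS}_{\mathrm{GS}} = \Pr(A = 1 \mid X_1, X_2, C) = \left\{1 + \exp\left[-(\gamma_0 + \gamma_1 X_1 + \gamma_2 X_2 + \gamma_3 C)\right]\right\}^{-1} \tag{3}$$

and fit the linear measurement error model

$$E(\mathrm{PS}_{\mathrm{GS}} \mid A, \mathrm{PS}_{\mathrm{EP}}) = \lambda_0 + \lambda_1 A + \lambda_2 \mathrm{PS}_{\mathrm{EP}} \tag{4}$$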
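Both propensity scores are ordinary logistic regressions of exposure on covariates, so the estimation step can be illustrated with a small dependency-free sketch. This is not the authors' code: the helper names `expit`, `solve`, `fit_logistic`, and `propensity` are hypothetical, and the hand-written Newton-Raphson fitter stands in for whatever statistical package would be used in practice.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_logistic(rows, y, n_iter=25):
    """Newton-Raphson logistic fit; rows hold covariates WITHOUT the intercept.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(design, y):
            mu = expit(sum(c * v for c, v in zip(coef, xi)))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    info[j][k] += w * xi[j] * xi[k]
        step = solve(info, grad)
        coef = [c + s for c, s in zip(coef, step)]
    return coef


def propensity(coef, rows):
    """Predicted probability of exposure for each covariate row."""
    return [expit(coef[0] + sum(c * v for c, v in zip(coef[1:], r))) for r in rows]


# Usage (hypothetical variable names):
#   PS_EP: fit on [x1, x2] in the main study (and again in the validation sample).
#   PS_GS: fit on [x1, x2, c] in the validation sample only.
# coef_ep = fit_logistic([[x1, x2] for ...], exposure)
# ps_ep = propensity(coef_ep, [[x1, x2] for ...])
```

The same fitter serves both scores; only the covariate list differs, which is the sense in which PSEP is the "error-prone" version of PSGS.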
Assuming that the outcome is a function of the exposure and PSGS,

$$E(Y \mid A, \mathrm{PS}_{\mathrm{GS}}) = \left\{1 + \exp\left[-(\delta_0 + \delta_1 A + \delta_2 \mathrm{PS}_{\mathrm{GS}})\right]\right\}^{-1} \tag{5}$$

regression calibration yields the corrected estimates

$$\hat{\delta}_1 = \hat{\beta}_1 - \hat{\lambda}_1 \hat{\beta}_2 / \hat{\lambda}_2, \qquad \hat{\delta}_2 = \hat{\beta}_2 / \hat{\lambda}_2 \tag{6}$$

as a function of the error-prone estimates (β1, β2), the coefficients of A and PSEP in the main-study outcome model (equation 2), and of λ1 and λ2, the coefficients of A and PSEP in the measurement error model (equation 4). We used the logistic link in equations 2 and 5 despite the noncollapsibility of the odds ratio under exchangeability of exposed and unexposed given PSGS, because regression calibration has not yet been evaluated for relative risk or Poisson outcome models. Corrected estimates for the variances account for the additional uncertainty caused by the estimation of the measurement error model parameters in the validation study. Regression calibration can also be implemented as a single imputation of the gold-standard variable based on the parameters of the measurement error model in the validation study (equation 4) and the values of X1, X2, and A observed in the main study (17). Since in PSC the gold-standard variable is a propensity score, single imputation of the PSGS makes it possible to implement matching on or stratification by this imputed propensity score, rather than controlling for the propensity score as a single continuous covariate in the outcome model (20). Analyses matched on the propensity score might be advantageous, since they are not based on comparisons outside a common range of propensity scores for exposed and unexposed (20). Stratification by the propensity score can also be restricted to this common range.
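The two ways of applying regression calibration described above, correcting the coefficients or imputing the gold-standard score, reduce to a few lines of arithmetic. The sketch below is illustrative only and assumes a linear measurement error model of the form E[PSGS | A, PSEP] = lam0 + lam1·A + lam2·PSEP; the function names `calibrate` and `impute_ps_gs` and all numeric inputs are hypothetical.

```python
def calibrate(beta1, beta2, lam1, lam2):
    """Regression calibration correction of error-prone log odds ratios.

    beta1, beta2: coefficients of A and PS_EP in the main-study outcome model.
    lam1, lam2:   coefficients of A and PS_EP in the validation-study
                  measurement error model for PS_GS.
    Returns the corrected coefficients of A and PS_GS.
    """
    if lam2 == 0:
        raise ValueError("PS_EP carries no information about PS_GS")
    corrected_ps = beta2 / lam2
    corrected_a = beta1 - lam1 * corrected_ps
    return corrected_a, corrected_ps


def impute_ps_gs(a, ps_ep, lam0, lam1, lam2):
    """Single imputation of PS_GS from exposure and PS_EP (no outcome used)."""
    return lam0 + lam1 * a + lam2 * ps_ep


# Hypothetical numbers, for illustration only.
corrected_a, corrected_ps = calibrate(beta1=0.3, beta2=4.0, lam1=0.05, lam2=0.8)
imputed = impute_ps_gs(a=1, ps_ep=0.2, lam0=0.01, lam1=0.05, lam2=0.8)
```

The imputation route is the one that permits matching or stratification, because it produces a score for every main-study observation rather than a corrected coefficient.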
For these reasons, we implemented PSC in all simulations by first imputing missing values of PSGS based on equation 4 (i.e., as a function of exposure and PSEP but not disease outcome) rather than by using equation 6. We then matched a single unexposed observation to every exposed observation on this imputed value of PSGS using greedy matching (21). Greedy matching starts using a very narrow caliper of the propensity score (to the fifth decimal place) to find an unexposed match for every exposed observation and, if unsuccessful, widens the caliper in one-decimal-place steps up to the first decimal place (21). Greedy matching is a frequently used algorithm in this setting (22) because it achieves close matching with a high proportion of exposed observations for which an unexposed match can be found. The proportion of exposed observations that can be matched to unexposed ones on the imputed PSGS is an inverse function of the ability of the propensity score to predict exposure; in our study, it was above 85 percent in most scenarios (range, 72-99 percent). High values are necessary for a causal contrast (what would have happened to the exposed had they been unexposed). The values in our simulations are well within the range observed in published applications of propensity score analysis (22).
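The widening-caliper idea can be sketched as follows. This is a simplified illustration rather than the published macro (21): the function name `greedy_match` is hypothetical, and interpreting "to the d-th decimal place" as an absolute difference below 10^-d is an assumption of this sketch.

```python
def greedy_match(ps_exposed, ps_unexposed, max_digits=5, min_digits=1):
    """1:1 greedy matching without replacement on a propensity score.

    Pass d accepts a pair only if the two scores differ by less than 10**-d.
    d runs from max_digits (narrowest caliper) down to min_digits (widest),
    so close matches are made first and never undone.
    Returns a list of (exposed_index, unexposed_index) pairs.
    """
    pairs = []
    free_exposed = set(range(len(ps_exposed)))
    free_unexposed = set(range(len(ps_unexposed)))
    for digits in range(max_digits, min_digits - 1, -1):
        caliper = 10.0 ** -digits
        for i in sorted(free_exposed):
            best, best_gap = None, caliper
            for j in sorted(free_unexposed):
                gap = abs(ps_exposed[i] - ps_unexposed[j])
                if gap < best_gap:
                    best, best_gap = j, gap
            if best is not None:
                pairs.append((i, best))
                free_exposed.discard(i)
                free_unexposed.discard(best)
    return pairs


# Hypothetical scores: the first exposed observation matches at five digits,
# the second only at one digit, and the third stays unmatched.
exposed = [0.20001, 0.35, 0.90]
unexposed = [0.200012, 0.31, 0.5, 0.21]
matches = greedy_match(exposed, unexposed)
matched_share = len(matches) / len(exposed)
```

Exposed observations left in `free_exposed` after the last pass are dropped from the matched analysis, which is why the proportion matched is worth reporting.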
We then estimated the exposure-outcome association in matched pairs, using conditional logistic regression to increase efficiency. To obtain 95 percent confidence limits for this estimate, we drew 1,000 bootstrap samples of matched pairs with replacement. These data sets were again analyzed with conditional logistic regression. We used the empirical distribution of these estimates (2.5th and 97.5th percentiles) to assign lower and upper bounds of the 95 percent confidence interval of the PSC estimate and assessed whether the true value of the log odds ratio for the exposure-outcome association was covered by this nonparametric confidence interval.
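A percentile bootstrap over matched pairs can be sketched without a regression library, because for 1:1 pairs with one exposed and one unexposed member the conditional maximum likelihood estimate of the odds ratio is the ratio of the two kinds of outcome-discordant pairs. The function names below are hypothetical, the data are invented, and skipping resamples with an empty discordant cell is a simplification of this sketch, not something described in the text.

```python
import math
import random


def matched_pair_log_or(pairs):
    """Conditional ML log odds ratio from (y_exposed, y_unexposed) pairs.

    Only pairs with discordant outcomes contribute: log(n10 / n01).
    Returns None when either discordant count is zero.
    """
    n10 = sum(1 for ye, yu in pairs if ye == 1 and yu == 0)
    n01 = sum(1 for ye, yu in pairs if ye == 0 and yu == 1)
    if n10 == 0 or n01 == 0:
        return None
    return math.log(n10 / n01)


def bootstrap_ci(pairs, n_boot=1000, seed=0):
    """Percentile 95% interval, resampling whole matched pairs."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(pairs, k=len(pairs))
        estimate = matched_pair_log_or(resample)
        if estimate is not None:
            estimates.append(estimate)
    if not estimates:
        raise ValueError("no bootstrap sample had both kinds of discordant pairs")
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[max(int(0.975 * len(estimates)) - 1, 0)]
    return lower, upper


def covers(lower, upper, true_log_or):
    """Coverage indicator for one simulated data set."""
    return lower <= true_log_or <= upper


# Invented example: 30 pairs where only the exposed member had the outcome,
# 20 where only the unexposed member did, and 950 concordant pairs.
example_pairs = [(1, 0)] * 30 + [(0, 1)] * 20 + [(0, 0)] * 950
point = matched_pair_log_or(example_pairs)
```

Averaging `covers(...)` over the simulated data sets of one scenario gives the coverage probability reported for that scenario.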
Simulation study
Let the exposure of interest, A, and the outcome of interest, Y, both be dichotomous variables. The confounders X1, X2, and C are independent standard normal variables with a mean of 0 and a variance of 1. The probability that an observation is exposed (A = 1) given confounders X1, X2, and C corresponds to PSGS (equation 3).
The probability that an observation has the outcome (Y = 1) given the exposure A and PSGS is given by equation 5. Using this model, the association between individual confounders X1, X2, and C and disease is defined by their association with exposure and the association of PSGS with disease. In particular, the association between the confounder C and disease cannot be varied independently of the confounders X1 and X2.
In these simulations with disease as a function of A and PSGS (equation 5), surrogacy (13, 17, 18) of PSEP is present by design, because PSEP is based on a subset of covariates contained in PSGS and disease is a (log)linear function of PSGS (and A) only.
To allow surrogacy to be violated, we conducted a second set of simulations wherein the expected value of the dichotomous disease outcome Y given the exposure A, as well as the confounders X1, X2, and C, was defined as
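$$E(Y \mid A, X_1, X_2, C) = \left\{1 + \exp\left[-(\theta_0 + \theta_A A + \theta_1 X_1 + \theta_2 X_2 + \theta_3 C)\right]\right\}^{-1} \tag{7}$$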
Using these expected values, we simulated 1,000 data sets for each parameter constellation. Although it is simulated for the whole data set, the confounder C is deleted from the main study and only observed in a random validation sample, whereas X1 and X2 are observed in both the main study and the validation sample.
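One replicate of the second set of simulations might be generated along the following lines. This is a sketch under stated assumptions, not the authors' program: the function names are hypothetical, the intercepts are chosen by bisection to hit the target exposure prevalence and cumulative incidence (the text does not say how they were set), and the odds ratios for C are supplied by the caller.

```python
import math
import random


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve_intercept(linear_parts, target):
    """Bisection for the intercept giving a mean probability equal to target."""
    low, high = -20.0, 20.0
    for _ in range(60):
        mid = (low + high) / 2.0
        mean_p = sum(expit(mid + lp) for lp in linear_parts) / len(linear_parts)
        if mean_p < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def simulate(n, or_ca, or_cy, or_ay=1.0, p_a=0.2, i_y=0.01, pct_val=10, seed=0):
    """Simulate one data set; C is kept only in a random validation sample."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    c = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Exposure from the gold-standard propensity score (X1, X2 inversely related).
    lin_a = [-0.405 * u - 0.405 * v + math.log(or_ca) * w
             for u, v, w in zip(x1, x2, c)]
    intercept_a = solve_intercept(lin_a, p_a)
    a = [1 if rng.random() < expit(intercept_a + lp) else 0 for lp in lin_a]

    # Outcome from exposure and the individual covariates (X1, X2 risk factors).
    lin_y = [math.log(or_ay) * ai + 0.405 * u + 0.405 * v + math.log(or_cy) * w
             for ai, u, v, w in zip(a, x1, x2, c)]
    intercept_y = solve_intercept(lin_y, i_y)
    y = [1 if rng.random() < expit(intercept_y + lp) else 0 for lp in lin_y]

    # Delete C outside the validation sample.
    in_validation = set(rng.sample(range(n), round(n * pct_val / 100)))
    return [
        {"a": a[i], "y": y[i], "x1": x1[i], "x2": x2[i],
         "c": c[i] if i in in_validation else None}
        for i in range(n)
    ]
```

Calling `simulate(10000, or_ca=2.0, or_cy=2.0)` would give one data set with the main study size and validation fraction of the basic scenario; repeating with different seeds gives the replicates for a parameter constellation.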
Since our validation study contains outcome information, we are able to assess surrogacy. Following the logistic form of equation 7, we fitted a logistic regression model with the disease outcome Y as a function of A, PSGS, and PSEP:
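$$\operatorname{logit}(Y) = \eta_0 + \eta_1 A + \eta_2 \mathrm{PS}_{\mathrm{GS}} + \eta_3 \mathrm{PS}_{\mathrm{EP}} \tag{8}$$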
In the absence of a specific test for surrogacy, we used two measures. First, we performed a likelihood ratio test for the predictive value of PSEP independent of PSGS and A, that is, comparing the full model (equation 8) with a model without PSEP. Second, we assessed the percentage of the variance in Y explained by PSGS and PSEP that is due to PSGS. This ratio of pseudo-R2 values was calculated as 100 times the ratio of two likelihood ratio statistics (23): the numerator compares the logistic regression model logit(Y) = η′0 + η′1A + η′2PSGS with the nested logistic regression model logit(Y) = η″0 + η″1A, and the denominator compares the full model (equation 8) with the same nested model logit(Y) = η″0 + η″1A. Values close to the maximum possible value of 100 percent suggest that surrogacy holds.
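Both diagnostics depend only on the maximized log-likelihoods of three nested logistic models, so they can be computed as in the sketch below. The function name `surrogacy_diagnostics` and the example log-likelihood values are hypothetical; the p value uses the identity that the upper tail of a chi-square distribution with 1 degree of freedom at x equals erfc(sqrt(x/2)).

```python
import math


def surrogacy_diagnostics(ll_a, ll_a_gs, ll_full):
    """Two surrogacy diagnostics from maximized log-likelihoods.

    ll_a:    model with exposure A only
    ll_a_gs: model with A and PS_GS
    ll_full: model with A, PS_GS, and PS_EP
    Returns (p value of the 1-df likelihood ratio test for PS_EP,
             percentage of explained variation attributable to PS_GS).
    """
    lr_ep = max(2.0 * (ll_full - ll_a_gs), 0.0)
    p_value = math.erfc(math.sqrt(lr_ep / 2.0))

    lr_gs = 2.0 * (ll_a_gs - ll_a)
    lr_both = 2.0 * (ll_full - ll_a)
    pct_gs = 100.0 * lr_gs / lr_both if lr_both > 0 else float("nan")
    return p_value, pct_gs


# Invented log-likelihoods: PS_EP adds little once PS_GS is in the model.
p_value, pct_gs = surrogacy_diagnostics(ll_a=-560.0, ll_a_gs=-550.0, ll_full=-549.5)
```

A large p value together with a percentage near 100 is the pattern the text associates with surrogacy holding; a small p value with a low percentage points the other way.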
Parameters
The parameters used in the basic scenario, as well as the range of parameter values covered in these simulations, are shown in table 1. In the basic scenario, we assume a prevalence of the exposure of 20 percent (PA = 0.2), a cumulative incidence of disease of 1 percent (IY = 0.01), no association between the exposure of interest and the disease (odds ratio (OR) for the exposure-disease association (ORAY) = 1), a main study size of 10,000 (Nmain = 10,000), and a 10 percent validation sample (%val = 10).
In all scenarios of both the first and second sets of simulations, both X1 and X2 are inversely associated with exposure (γ1 = -0.405 and γ2 = -0.405 in equation 3, corresponding to an ORXA of 0.67).
In the first set of simulations (equation 5), the associations between the confounders and disease are defined by the association between PSGS and disease. The value for the basic scenario (δ2 = 9, the coefficient of PSGS in equation 5) reflects a reasonable relation between the propensity score, which is bounded between 0 and 1 and often has low variability, and risk of disease. In the second set of simulations (equation 7), both X1 and X2 are risk factors for disease (θ1 = 0.405 and θ2 = 0.405, corresponding to an ORXY of 1.5). Therefore, X1 and X2 lead to confounding towards lower values of the exposure-disease association.
| RESULTS |
|---|
In table 2, we present the results of the first set of simulations, that is, results obtained when simulating the disease as a function of exposure and PSGS (equation 5), so that surrogacy holds by design. The following parameters are varied around the value of the basic scenario: the cumulative incidence of disease IY, the odds ratio for the exposure of interest-disease association ORAY, the odds ratio for the unobserved confounder-exposure association ORCA, the log odds ratio for the PSGS-disease association δ2 (the coefficient of PSGS in equation 5), the size of the main study Nmain, and the percentage of persons in the random validation sample %val. These parameters are varied while keeping all other parameters at the value of the basic scenario (presented in italic type in table 1). In particular, the true ORAY is 1.0 in all scenarios, except for the two rows with ORAY = 2 and ORAY = 0.5. For easy comparison of the magnitude and direction of confounding, we present the median crude ORAY and the median ORAY adjusting for PSEP based on the observed covariates X1 and X2 only (equation 2) in the main study.
In all scenarios assessed, median estimates of ORAY from PSC are close to the true value and the percentage of bias reduction (where applicable) is 71-110 percent, except when Nmain = 1,000. A bias reduction of 100 represents complete control of bias (no residual confounding), and values exceeding 100 represent overcorrection. In some scenarios, the percentage of bias reduction either is undefined (since the expected value of the estimator controlling for X1 and X2 only is unbiased) or is not meaningful, since there is little residual bias (and thus the denominator of the percentage of bias reduction is close to 0). The coverage of the 95 percent confidence interval ranges from 86.0 percent to 95.9 percent (except when Nmain = 1,000) and is nearly nominal in many scenarios. Coverage is reduced with increasing incidence of disease (IY), odds ratios for the exposure-disease association (ORAY) pointing away from the null, and decreasing size of the validation study (%val).
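The behavior of the percentage of bias reduction described above (100 for complete control, more than 100 for overcorrection, undefined when there is no residual bias to remove) can be made concrete with a small sketch. The exact formula and scale are not given in this text, so computing it on the log odds ratio scale, with the residual bias of the PSEP-adjusted estimate as the denominator, is an assumption; the function name and the numbers are hypothetical.

```python
import math


def percent_bias_reduction(or_true, or_ps_ep, or_psc, tol=1e-12):
    """Share of the residual bias of the PS_EP-adjusted estimate removed by PSC.

    Bias is measured on the log odds ratio scale (an assumption of this sketch).
    Returns None when the PS_EP-adjusted estimate is already unbiased, because
    the denominator is then zero and the percentage is undefined.
    """
    bias_before = math.log(or_ps_ep) - math.log(or_true)
    bias_after = math.log(or_psc) - math.log(or_true)
    if abs(bias_before) < tol:
        return None
    return 100.0 * (bias_before - bias_after) / bias_before


complete = percent_bias_reduction(or_true=1.0, or_ps_ep=1.5, or_psc=1.0)
overcorrected = percent_bias_reduction(or_true=1.0, or_ps_ep=1.5, or_psc=0.95)
undefined = percent_bias_reduction(or_true=1.0, or_ps_ep=1.0, or_psc=1.02)
```

A negative value would mean that PSC moved the estimate further from the truth than adjustment for PSEP alone, the situation the text reports when surrogacy is violated.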
In table 3, we show the results obtained when disease occurrence is simulated as a function of exposure and the three individual covariates (equation 7) in order to allow surrogacy to be violated. The table presents the median odds ratio (and interquartile range) for the exposure-disease association ORAY, the median percentage of bias reduction for selected parameters using PSC, and results of the two diagnostic assessments for violation of the surrogacy assumption. Instead of the log odds ratio for the PSGS-disease association (δ2, the coefficient of PSGS in equation 5) varied in the first set of simulations, ORCY is varied for ORCA values of 0.5 and 2.0, respectively.
Since surrogacy can be violated when disease occurrence is simulated as a function of the individual covariates (equation 7), we include the results of the two proposed diagnostic evaluations for the surrogacy assumption, that is, the likelihood ratio test for PSEP and the percentage of variance of Y due to PSGS and PSEP that is explained by PSGS.
When the surrogacy assumption holds, that is, when the median p value from the likelihood ratio test is higher than 0.3 and the percentage of variance explained by PSGS is more than 73.8 percent in the scenarios assessed, the median ORAY is very close to the true value and the percentage of bias reduction ranges from 32 percent to 106 percent accordingly (except when Nmain = 1,000). The coverage of the 95 percent confidence interval is close to nominal in these scenarios, with coverage decreasing with increasing incidence of disease (IY) and decreasing size of the validation sample (%val).
PSC is biased, however, when ORCA = 2 and ORCY = 2, when ORCA = 0.5 and ORCY = 0.5, when ORCA = 0.5 and ORCY = 1, and when ORCA = 2 and ORCY = 1 (some scenarios are presented twice to allow easy assessment of variation of one of the parameters). These scenarios can be characterized by the additional confounding due to the unobserved covariate C's not acting in the same direction as the confounding by the observed covariates X (see arrows in table 3). They all show indications of violation of the surrogacy assumption: The median p value from the likelihood ratio test is 0.2 or less, and the percentage of variance explained by PSGS is low (less than 45.5 percent in the scenarios assessed).
The only apparent exception seems to be the scenario with ORCA = 1, where the likelihood ratio test (p = 0.05) and the percentage of variance explained by PSGS (43.6 percent) indicate a violation of the surrogacy assumption but PSC is nevertheless unbiased. Since there is by definition no residual confounding when ORCA = 1, the analysis controlling for X1 and X2 leads to an unbiased estimate. When ORCA = 1, PSC is unbiased despite indications of violation of surrogacy, since C is not associated with exposure and therefore is not a confounder. When ORCA = 1, surrogacy is violated, since the inclusion of C adds unnecessary variability to PSGS as compared with PSEP (24). Therefore, surrogacy is a sufficient but not always necessary condition for PSC to be valid.
| DISCUSSION |
|---|
We evaluated the performance of PSC using simulations over a wide range of parameter values. These results should be interpreted in light of the specific parameter values we selected for our settings. These values resulted in strong, but not unrealistic, unmeasured confounding in the main study (e.g., as might be plausible for the association between self-selected hormone therapy and myocardial infarction in postmenopausal women). PSC was always valid in the first set of simulations (table 2), that is, when simulating the disease as a (log)linear function of PSGS according to the target model of PSC, so that surrogacy holds by design. The second set of simulations, however, indicates that the approach may increase rather than decrease bias if surrogacy is violated (table 3). Generally speaking, surrogacy is violated when the direction of confounding of the exposure-disease association caused by the unobserved variable(s) differs from that of the confounding due to observed variables. One can use different diagnostics to assess violations of surrogacy if the validation study contains sufficient information on the outcome.
Despite the intuition that adding an unmeasured confounder to the propensity score would always introduce differential measurement error and thus violate surrogacy, surrogacy holds when the direction of confounding of the observed and unobserved variable(s) is the same, as evinced by our simulations. In such settings, adding the confounder to the propensity score increases the strength of association between the propensity score and the disease outcome. Therefore, the entire association between PSEP and the outcome might be captured in PSGS, which results in surrogacy. If the direction of the confounding by the unmeasured confounder(s) is different from the direction of the measured confounder(s), including the unmeasured confounder(s) in the propensity score reduces the strength of association between the propensity score and the disease outcome. Therefore, PSEP is more strongly associated with disease risk than is PSGS, thus violating surrogacy.
Simulations cannot allow a quantitative assessment of how frequently surrogacy holds or is violated in epidemiologic studies. Many informal and other formal sensitivity analyses of residual confounding also depend on the assumption of unidirectionality of confounding (25-27). This assumption is plausible in many epidemiologic settings, but not all, especially since PSC addresses the joint confounding of a set of observed and unobserved covariates rather than a single covariate. In such a setting, the surrogacy assumption might be plausible if an underlying and well-understood framework for confounding is consistent with surrogacy. Practical examples for such a framework include variables used in claims data (e.g., chronic obstructive pulmonary disease, being admitted to a nursing home) as a proxy for the unmeasured covariate of interest (e.g., smoking and frailty, respectively). In such settings, a more refined propensity score based on alternative measures with less error than those measured in the main study (e.g., smoking and activities of daily living or cognitive function, respectively) might be hypothesized to contain all the relevant information on propensity of exposure captured in an error-prone propensity score. Thus, surrogacy might be a plausible assumption in such settings.
The direction of confounding introduced by any single unobserved covariate may be unpredictable and thus clearly lead to a violation of surrogacy of PSEP estimated without information on that covariate. Prior knowledge about the association of that covariate with the study outcome might be used as a warning sign in case outcome information is not available in the validation study. As in regression analyses, inclusion of a covariate unrelated to disease should be avoided. In propensity score analyses, not including such a covariate would lead to an increase in efficiency (24). Because PSC is used to adjust for unmeasured confounding, including covariates from a validation study thought to be unrelated to disease (ORCY = 1) would make no sense. Including only covariates from the validation study that truly are risk factors for the disease outcome would avoid biased results due to violation of surrogacy in two out of the four settings assessed where PSC was biased.
The assumption necessary for PSC to be valid can further be conceptualized in the framework of instrumental variables (28). One critical assumption of instrumental variable analysis is that the instrument is unrelated to the outcome given the exposure of interest (29). Similarly, PSC is valid if PSEP is independent of disease given the exposure of interest and PSGS.
If the validation study contains data on disease outcome, fitting a model of the outcome as a function of these two propensity scores and exposure in the (internal) validation study allows one to test surrogacy before applying PSC. The proposed tests for surrogacy performed well in the scenarios we assessed. The cutpoints we used were chosen according to the scenarios assessed, however, and are arbitrary. The power of the likelihood ratio test to detect violations of surrogacy will depend on the size of the validation study. Therefore, the percentage of variance in outcome explained by PSGS might be preferred in validation studies with few outcomes.
PSC based on regression calibration as proposed by Rosner et al. (15, 16) had a tendency to "overadjust" even in scenarios where surrogacy was met. Furthermore, standard errors of the adjusted estimates obtained from regression calibration were consistently smaller than the empirical standard errors of the adjusted estimates across all simulations, leading to nonnominal coverage of the confidence intervals (data not shown). Therefore, we implemented PSC as a single imputation according to Carroll et al. (17), which allowed matching on the imputed PSGS. Matching on the imputed PSGS and using bootstrapping to obtain a robust estimate of the variance solved both problems, resulting in nominal coverage probabilities for most scenarios assessed.
Coverage probabilities decreased with increasing incidence of the disease outcome and decreasing size of the validation sample. A rare disease outcome is a general assumption of regression calibration (16) and is likely to be exacerbated in PSC owing to the problem of the noncollapsibility of the odds ratio under exchangeability of exposed and unexposed given PSGS (30). The lower coverage probabilities with smaller sizes of the validation study might be an indication of problems due to model misspecification or nonconvergence. Coverage probabilities are meaningless for biased estimators, since they would approach 0 with increasing study size. Despite this, we present coverage probabilities for all scenarios, because scenarios with only small residual bias are likely to converge to unbiased ones with increasing study size and number of simulations.
Regression calibration approximations are known to fail when the measurement error is large (31, 32), as when the correlation between the estimated error-prone and gold-standard measurements is weak. Kuha (31) observed that the performance of regression calibration degrades if the product of the squared estimate and its mean squared error exceeds 0.5. In our scenarios, the median of this value ranged from 0.48 to 0.66, corresponding to a range in which problems can be expected in a large proportion of simulated data sets. Since PSGS captures all of the confounding in a single covariate, misspecification of its association with the outcome is likely to reduce its ability to control for confounding (33). The linear measurement error model of regression calibration is only one possible model for relating PSGS to PSEP.
Besides surrogacy, the validity of PSC is dependent on additional assumptions underlying all epidemiologic analyses. Even with validation data, it is unlikely that all confounders are measured with sufficient accuracy, and therefore unmeasured confounding can never be completely ruled out. Residual or unmeasured confounding is only one aspect of uncertainty in epidemiologic studies (34), and considering multiple forms of bias is worthwhile (35, 36).
We conclude that the use of PSC to adjust for unmeasured confounding with validation data is a useful approach for reducing residual bias when the error-prone propensity score estimated in the main study is a surrogate for the true gold-standard propensity score. Like any method addressing unmeasured covariates or missing data, PSC is not a substitute for having all covariates adequately measured. If surrogacy is violated, PSC might increase rather than decrease bias. In the usual setting of validation studies without information on outcomes, PSC is likely but not guaranteed to improve estimates if an underlying theory about the confounding pattern is consistent with the necessary assumption. Adding measures of disease outcome to validation studies would allow epidemiologists to expand the application of PSC without relying on a strong surrogacy assumption.
| ACKNOWLEDGMENTS |
|---|
This project was funded by a grant (RO1 023178) from the National Institute on Aging.
Dr. Til Stürmer does not accept personal compensation of any kind from industry but has received salary support from unrestricted research grants from the pharmaceutical industry to the Brigham and Women's Hospital. Dr. Stürmer does not have a conflict of interest regarding the content of this manuscript. Dr. Robert J. Glynn has received grant support from AstraZeneca, Bristol-Myers Squibb, Merck & Company, Novartis International AG, and Pfizer Inc.
| NOTES |
|---|
Editor's note: An invited commentary on this article appears on page 1119, and the authors' response appears on page 1122.
| References |
|---|
- Cornfield J, Haenszel W, Hammond EC, et al. Smoking and lung cancer: recent evidence and a discussion of some questions. J Natl Cancer Inst (1959) 22:173203.[ISI][Medline]
- Rosenbaum PR, Rubin DB. Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. J R Stat Soc Ser B (1983) 45:21218.
- Rosenbaum PR. Sensitivity analysis for certain permutation inferences in matched observational studies. Biometrika (1987) 74:1326.
[Abstract/Free Full Text] - Rosenbaum PR. Sensitivity analysis for matched case-control studies. Biometrics (1991) 47:87100.[CrossRef][ISI][Medline]
- Greenland S. Basic methods for sensitivity analysis of bias. Int J Epidemiol (1996) 25:110716.
[Abstract/Free Full Text] - Lin DY, Psaty BM, Kronmal RA. Assessing the sensitivity of regression results to unmeasured confounders in observational studies. Biometrics (1998) 54:94863.[CrossRef][ISI][Medline]
- Robins JM, Rotnitzky A, Scharfstein DO. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In: Statistical models in epidemiologyHalloran ME, Berry DA, eds. (1999) New York, NY: Springer Publishing Company. 192.
- Little RJA, Rubin DA. Statistical analysis with missing data (2002) 2nd ed. New York, NY: John Wiley and Sons, Inc.
- Rosenbaum P. Observational studies (2002) 2nd ed. New York, NY: Springer Publishing Company.
- Little RJ, Rubin DB. Causal effects in clinical and epidemiological studies via potential outcomes: concepts and analytical approaches. Annu Rev Public Health (2000) 21:12145.[CrossRef][ISI][Medline]
- Schneeweiss S, Glynn RJ, Tsai EH, et al. Adjusting for unmeasured confounders in pharmacoepidemiologic claims data using external information: the example of COX2 inhibitions and myocardial infarction. Epidemiology (2005) 16:1724.[CrossRef][ISI][Medline]
- MacLehose RF, Kaufman S, Kaufman JS, et al. Bounding causal effects under uncontrolled confounding using counterfactuals. Epidemiology (2005) 16:54855.[CrossRef][ISI][Medline]
- Stürmer T, Schneeweiss S, Avorn J, et al. Adjusting effect estimates for unmeasured confounding with validation data using propensity score calibration. Am J Epidemiol (2005) 162:279–89.
- Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika (1983) 70:41–55.
- Rosner B, Willett WC, Spiegelman D. Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Stat Med (1989) 8:1051–69.
- Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol (1990) 132:734–45.
- Carroll RJ, Ruppert D, Stefanski LA. Measurement error in nonlinear models (1995) London, United Kingdom: Chapman and Hall Ltd.
- Carroll RJ, Stefanski LA. Approximate quasi-likelihood estimation in models with surrogate predictors. J Am Stat Assoc (1990) 85:652–63.
- Huang P-Y, Buring JE, Ridker PM, et al. Awareness, accuracy, and predictive validity of self-reported cholesterol in women. In: J Gen Intern Med. (in press).
- Stürmer T, Schneeweiss S, Brookhart MA, et al. Analytic strategies to adjust confounding using exposure propensity scores and disease risk scores: nonsteroidal antiinflammatory drugs and short-term mortality in the elderly. Am J Epidemiol (2005) 161:891–8.
- Parsons LS. Reducing bias in a propensity score matched-pair sample using greedy matching techniques. (Paper 214-26). In: SUGI 26 proceedings. (Proceedings of the 26th annual SAS Users Group International conference, Long Beach, California, April 22–25, 2001). Cary, NC: SAS Institute, Inc, 2001. (http://www2.sas.com/proceedings/sugi26/p214-26.pdf).
- Stürmer T, Joshi M, Glynn RJ, et al. A review of applications of propensity score methods showed increased use but infrequently different estimates compared with other methods. J Clin Epidemiol (2006) 59:437–47.
- McFadden D. The measurement of urban travel demand. J Public Econ (1974) 3:303–28.
- Brookhart MA, Schneeweiss S, Rothman KJ, et al. Variable selection for propensity score models. Am J Epidemiol (2006) 163:1149–56.
- Cook JR, Stefanski LA. Simulation-extrapolation estimation in parametric measurement error models. J Am Stat Assoc (1994) 89:1314–28.
- Fung KY, Krewski D. Evaluation of regression calibration and SIMEX methods in logistic regression when one of the predictors is subject to additive measurement error. J Epidemiol Biostat (1999) 4:65–74.
- Rothman KJ, Wentworth CE 3rd. Mortality of cystic fibrosis patients treated with tobramycin solution for inhalation. Epidemiology (2003) 14:55–9.
- Greenland S. An introduction to instrumental variables for epidemiologists. Int J Epidemiol (2000) 29:722–9.
- Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables. J Am Stat Assoc (1996) 91:444–55.
- Gail MH, Wieand S, Piantadosi S. Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika (1984) 71:431–44.
- Kuha J. Corrections for exposure measurement error in logistic regression models with an application to nutritional data. Stat Med (1994) 13:1135–48.
- Stürmer T, Thürigen D, Spiegelman D, et al. The performance of methods for correcting measurement error in case-control studies. Epidemiology (2002) 13:507–16.
- Rubin DB. The use of matched sampling and regression adjustment to remove bias in observational studies. Biometrics (1973) 29:185–203.
- Maclure M, Schneeweiss S. Causation of bias: the episcope. Epidemiology (2001) 12:114–22.
- Lash TL, Fink AK. Semi-automated sensitivity analysis to assess systematic errors in observational epidemiologic data. Epidemiology (2003) 14:451–8.
- Greenland S. Multiple-bias modeling for analysis of observational data. J R Stat Soc Ser A (2005) 168:267–306.
- Cochran WG. The effectiveness of adjustment by subclassification in removing bias in observational studies. Biometrics (1968) 24:295–313.
Related articles in Am. J. Epidemiol.:
- J. Michael Oakes and Timothy R. Church. Invited Commentary: Advancing Propensity Score Methods in Epidemiology. Am. J. Epidemiol. 2007 165: 1119-1121.
- Til Stürmer, Sebastian Schneeweiss, Kenneth J. Rothman, Jerry Avorn, and Robert J. Glynn. Stürmer et al. Respond to "Propensity Score Methods in Epidemiology". Am. J. Epidemiol. 2007 165: 1122-1123.