American Journal of Epidemiology Advance Access originally published online on April 3, 2007
American Journal of Epidemiology 2007 165(12):1454-1461; doi:10.1093/aje/kwm034
PRACTICE OF EPIDEMIOLOGY
Using Probabilistic Corrections to Account for Abstractor Agreement in Medical Record Reviews
1 Boston University School of Public Health, Boston, MA
2 Boston University School of Medicine, Boston, MA
3 Boston Medical Center, Boston, MA
4 Wake Forest University School of Medicine, Winston-Salem, NC
5 Kaiser Permanente Southern California, Los Angeles, CA
6 Group Health Center for Health Studies, Seattle, WA
7 HealthPartners Research Foundation, Minneapolis, MN
8 University of Massachusetts Medical School, Worcester, MA
9 Fallon Community Health Plan, Worcester, MA
10 Henry Ford Health System, Detroit, MI
11 Yale University School of Medicine, New Haven, CT
12 Lovelace Health Systems, Albuquerque, NM
Correspondence to Dr. Timothy L. Lash, Department of Epidemiology, Boston University School of Public Health, 715 Albany Street, TE3, Boston, MA 02118 (e-mail: tlash{at}bu.edu).
Received for publication June 14, 2006. Accepted for publication December 6, 2006.
| ABSTRACT |
|---|
The quality of medical record abstracts is often characterized in a reliability substudy. These results usually indicate agreement, but not the extent to which lack of agreement affects associations observed in the complete data. In this study, medical records were reviewed and abstracted for patients diagnosed with stage I or stage II breast cancer between 1990 and 1994 at one of six US Cancer Research Network sites. For a subsample, interrater reliability data were available. The authors calculated conventional hazard ratios and 95% confidence intervals for the association of demographic, tumor, and treatment characteristics with recurrence rate. These conventional estimates of effect were compared with three sets of estimates and 95% simulation intervals that took account of the uncertainty assessed by lack of agreement in the reliability substudy. The rate of recurrence was associated with increasing cancer stage and with treatment modality but not with demographic characteristics. The hazard ratios and simulation intervals that took account of the reliability data showed that the simulation interval grew wider as the sources of uncertainty taken into account grew more complete, but the associations expected a priori remained readily apparent. While many investigators use reliability data only as a metric for data quality, a more thorough approach can also quantitatively depict the uncertainty in the observed associations.
breast neoplasms; data collection; epidemiologic methods; medical records
Abbreviations: CCI, Charlson comorbidity index; IRR, interrater reliability
| INTRODUCTION |
|---|
Medical record review provides data with which to confirm subject eligibility, ascertain disease outcomes, or characterize disease severity, treatments received, or comorbid conditions (1). Abstractors review medical records to collect requisite data using a standardized form accompanied by a code book to assure uniform coding decisions. With the advent of electronic medical record data, some data fields can be completed through direct linkage of electronic medical records with a data collection system (2). Nonetheless, many data fields still require review of multiple data sources or free text to abstract the information.
Abstractors should receive uniform training (3, 4), including an explanation of the data collection form (4) and its code book and pilot practice with a sample similar to the study sample (2). These pilot reviews should be compared with a review completed by an experienced reviewer (2) to identify abstractors who require additional training and data collection items that require revision to assure quality data. Once data collection begins, abstractors should be blinded to information that might affect coding decisions (3, 4), although separate reviews by two persons to ascertain outcomes and independent variables may be too costly. In this case, single reviewers who ascertain both should be blind to the study's hypotheses.
Resources should be allocated to collect reliability and validity data during the period of medical record review (3, 4). Abstractors should reabstract a proportion of the medical records for measurement of intrarater reliability, and a proportion of the medical records should be abstracted by an experienced reviewer for measurement of interrater reliability (IRR). If medical record review occurs over an extended study period, more than one set of reliability studies ought to be conducted (1). Conventionally, research articles present these measures of reliability and some measure of intraclass correlation (5) (e.g., the kappa statistic (6, 7)), along with an assurance that they reflect high-quality data, but investigators seldom conduct a quantitative assessment of the uncertainty in an effect estimate attributable to the observed inconsistencies.
We postulated that agreement between abstractors would correlate with the sensitivity and specificity of classification. We used results from an IRR study to inform a simulation study that quantified the bias and uncertainty introduced by classification errors.
| MATERIALS AND METHODS |
|---|
The parent study enrolled 1,859 women aged 65 years or older with early-stage breast cancer who received health care in one of six geographically dispersed community-based integrated health-care systems (8). The study aims were to identify patient and tumor characteristics associated with receipt of treatment and adverse breast cancer outcomes. The study protocol was approved by institutional review boards at the coordinating center and the enrollment sites.
Population
We identified potentially eligible women diagnosed with American Joint Committee on Cancer (9) stage I, IIA, or IIB breast cancer in one of six health-care delivery systems participating in the National Cancer Institute-funded Cancer Research Network (Group Health Center for Health Studies, Seattle, Washington; Kaiser Permanente Southern California, Pasadena, California; Lovelace Health Systems, Albuquerque, New Mexico; Henry Ford Health System, Detroit, Michigan; HealthPartners, Minneapolis, Minnesota; and Fallon Community Health Plan, Worcester, Massachusetts). Eligible patients had a histologically confirmed first breast neoplasm diagnosed between 1990 and 1994 and had been enrolled in their health plan for at least 12 months before and after their diagnosis, unless they died within the first year after diagnosis. We excluded women with other malignancies, except nonmelanoma skin cancer, diagnosed within 5 years before or 30 days after the breast cancer diagnosis and women with bilateral breast cancer. For this analysis, we restricted the sample to women who received either breast-conserving surgery or mastectomy.
Data collection
We collected demographic, tumor, treatment, comorbidity, and follow-up data from medical records and electronic data sources, including cancer registry, administrative, and clinical databases. We initially populated the database with electronically available data and verified all preloaded data by medical record review, except cancer registry data elements reported to the National Cancer Institute's Surveillance, Epidemiology, and End Results registry (10). One person used a standard procedure to train all medical record abstractors at all sites. Standardized medical record reviews were conducted on-site by the trained abstractors, and the data were entered directly into a computer-based data collection system.
Analytic variables
Breast cancer recurrence.
We defined breast cancer recurrence as invasive cancer diagnosed in the same breast, in the lymph nodes, or at a distant site at least 120 days after the original diagnosis or after completion of the last surgery in the first course of treatment, whichever was later. We followed women to the first of the following: date of disenrollment, date of recurrence, date of death, or 10 years from the date of diagnosis.
Demographic data.
We gathered information on each woman's date of birth from cancer registry databases for sites with cancer registries (Group Health, Kaiser Permanente, Lovelace, Henry Ford) and from the women's medical records at the sites without cancer registries (Fallon, HealthPartners). We classified women into age groups of 65 to <70, 70 to <80, and ≥80 years.
Comorbidity data.
We collected information on comorbid conditions diagnosed at least 1 year before breast cancer diagnosis. We used this information to calculate the Charlson comorbidity index (CCI) (11), which has been validated in a breast cancer cohort (12) and used in previous studies of older breast cancer patients (12, 13).
Tumor data.
At sites with electronic cancer registry records, we collected information on date of diagnosis, tumor size, node evaluation, histology, differentiation, and estrogen receptor status from the registry, unless it was missing, in which case we collected the information from the women's medical records. At sites without cancer registry records, we collected this information from the medical records.
Treatment data.
We gathered information on surgical, radiation, and systemic therapies from electronic cancer registry files, when available, and supplemented these data with information from the women's medical records. We collected information about whether women had an axillary lymph node dissection or sentinel node biopsy, underwent breast-conserving surgery or mastectomy, completed radiation therapy, completed chemotherapy, and initiated systemic hormonal therapy.
IRR data.
We selected 125 records for reabstraction at random within enrollment sites from those completed in the first half of the data collection period. We chose a sample size of 125 as a number that would yield reasonably precise estimates of agreement at a reasonable cost for the resources required for reabstraction. For a similar reason, we selected 54 elements from a total of 658 to be reabstracted by an experienced reviewer at each site.
Statistical analysis
Conventional analysis.
We calculated the frequency and proportion of the cohort in each category of each of the analytic variables, as well as the crude recurrence rate and hazard ratio associating breast cancer recurrence with the variable's categories. We used multivariable Cox proportional hazards modeling to model the hazard of recurrence as a function of age at diagnosis, baseline comorbidity, tumor size, node status, histology, histologic grade, receipt of axillary node evaluation, receipt of breast-conserving surgery, receipt of radiation therapy, receipt of tamoxifen therapy (within strata of estrogen receptor expression), and receipt of chemotherapy.
Analysis of IRR data.
For each variable included in the IRR exercises, we calculated the proportion of concordant and discordant responses gathered by the exercise.
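As a minimal sketch of this calculation, the Python snippet below computes overall concordance and concordance within each originally assigned category from paired abstractions. The function name and the record-level data are illustrative assumptions: only the 83 of 88 concordant CCI = 0 records and the 15 of 125 discordant CCI records come from the article, and the split of the remaining discordant records across categories is hypothetical.

```python
import numpy as np

def concordance_by_category(original, reabstracted):
    """Return overall concordance and concordance within each original category."""
    original = np.asarray(original)
    reabstracted = np.asarray(reabstracted)
    agree = original == reabstracted
    per_category = {
        str(cat): float(agree[original == cat].mean())
        for cat in np.unique(original)
    }
    return float(agree.mean()), per_category

# Illustrative records: 88 women originally coded CCI = 0, of whom 83 were coded
# 0 again on reabstraction. The breakdown of the other rows is hypothetical.
original = np.array(["0"] * 88 + ["1-2"] * 30 + ["3-4"] * 7)
reabstracted = np.array(
    ["0"] * 83 + ["1-2"] * 5       # 5 discordant among original CCI = 0
    + ["1-2"] * 24 + ["0"] * 6     # 6 discordant among original CCI = 1-2
    + ["3-4"] * 3 + ["1-2"] * 4    # 4 discordant among original CCI = 3-4
)

overall, per_category = concordance_by_category(original, reabstracted)
print(f"overall concordance: {overall:.3f}")
for cat, value in per_category.items():
    print(f"CCI {cat}: {value:.3f}")
```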
Monte Carlo simulation.
We used Monte Carlo simulation to assess the bias and uncertainty introduced by classification errors. We have previously described our general strategy for modifying data sets by simulation to account for uncertain classification (14, 15). We used the proportion of concordant responses to inform three estimates of the sensitivity of correct classification for each category of each variable.
Fixed scenario.
Initially, we set the sensitivity equal to the IRR concordance in that category. For example, we set the sensitivity of classification of having had a CCI value of 0 equal to 94 percent, because 83 of 88 of the women originally classified as having a CCI value of 0 were also reported to have that value during the IRR abstraction. For dichotomous variables, we also set the specificity equal to the IRR concordance in the reference category.
Binomial scenario.
We parameterized a binomial distribution for the sensitivity (and specificity, for dichotomous variables) by setting the probability of success equal to the fixed sensitivity (or specificity) and the number of trials equal to the number of women originally observed in the category. We then drew a random number from the binomial distribution and used this number in the numerator of the sensitivity or specificity for the iteration. For example, we set the probability of success equal to the observed 83 of 88 women initially classified as having a CCI value of 0 who were reported to have a CCI value of 0 during the IRR abstraction, and we set the number of trials equal to 88. Drawing from this binomial distribution can yield anywhere from 0 to 88 correctly classified subjects, resulting in sensitivity that ranges from 0 percent (0/88 = 0 percent) to 100 percent (88/88 = 100 percent) but is centered on the expectation of 94 percent (83/88 = 94 percent).
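The fixed and binomial scenarios can be sketched in a few lines of Python using the CCI = 0 example (83 concordant of 88). This is an illustration under stated assumptions rather than the code used in the study: the function name, the random seed, and the 5,000 draws shown here are choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(2007)  # arbitrary seed, for reproducibility only

concordant, n_original = 83, 88               # CCI = 0 example from the text
fixed_sensitivity = concordant / n_original   # fixed scenario: one constant value

def draw_binomial_sensitivity(rng, p_success, n_trials, size=None):
    """Binomial scenario: draw a count of correctly classified women and
    divide by the number of trials to obtain a sensitivity for one iteration."""
    return rng.binomial(n_trials, p_success, size=size) / n_trials

draws = draw_binomial_sensitivity(rng, fixed_sensitivity, n_original, size=5000)
low, high = np.percentile(draws, [2.5, 97.5])
print(f"fixed sensitivity: {fixed_sensitivity:.3f}")
print(f"binomial draws: mean {draws.mean():.3f}, central 95% {low:.3f} to {high:.3f}")
```

Each draw is bounded by 0 and 1 and the draws center on the fixed value, which is the behavior described above.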
Trapezoidal scenario.
We parameterized a trapezoidal distribution for the sensitivity (and specificity, for dichotomous variables) by setting the maximum equal to 100 percent, the upper mode equal to the fixed sensitivity (or specificity), the lower mode equal to the upper mode minus the greater of 10 percent and twice the difference between the maximum and the upper mode, and the minimum equal to the lower mode minus the greater of 10 percent and twice the difference between the maximum and the upper mode. We required all four parameters of the trapezoidal distribution to exceed 50 percent; otherwise we assigned a value of 50 percent. For example, the maximum sensitivity for having a CCI value of 0 equaled 100 percent, the upper mode equaled the fixed sensitivity (94.4 percent), the lower mode equaled 83.1 percent (94.4 percent minus 11.3 percent), and the minimum equaled 71.8 percent (83.1 percent minus 11.3 percent). For variables with more than two levels, we induced a correlation of 0.8 between the trapezoidal distributions assigned to each level of the variable (15).
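The parameter rule for the trapezoidal scenario, and one way to draw from the resulting distribution, can be sketched as follows. The function names are hypothetical, the sampler uses a generic inverse-CDF construction that is assumed here rather than taken from the study's software, and the correlation of 0.8 between levels is not modeled. Because the sketch starts from the unrounded 83/88, its parameters may differ slightly from the rounded percentages quoted in the example.

```python
import numpy as np

def trapezoid_params(fixed, floor=0.50, min_step=0.10):
    """Return (minimum, lower mode, upper mode, maximum) for one category."""
    maximum = 1.0
    upper_mode = fixed
    # Step is the greater of 10 percent and twice (maximum - upper mode).
    step = max(min_step, 2.0 * (maximum - upper_mode))
    lower_mode = upper_mode - step
    minimum = lower_mode - step
    # Any parameter below 50 percent is set to 50 percent.
    return tuple(max(floor, v) for v in (minimum, lower_mode, upper_mode, maximum))

def sample_trapezoid(rng, a, b, c, d, size):
    """Inverse-CDF draws from a trapezoid with minimum a, modes b <= c, maximum d."""
    u = rng.uniform(size=size)
    h = 2.0 / ((d - a) + (c - b))   # density height on the flat section
    f_b = h * (b - a) / 2.0         # cumulative probability at the lower mode
    f_c = 1.0 - h * (d - c) / 2.0   # cumulative probability at the upper mode
    x = b + (u - f_b) / h           # flat section between the modes
    left, right = u < f_b, u > f_c
    x[left] = a + np.sqrt(2.0 * u[left] * (b - a) / h)
    x[right] = d - np.sqrt(2.0 * (1.0 - u[right]) * (d - c) / h)
    return x

rng = np.random.default_rng(2007)  # arbitrary seed
minimum, lower_mode, upper_mode, maximum = trapezoid_params(83 / 88)
draws = sample_trapezoid(rng, minimum, lower_mode, upper_mode, maximum, size=5000)
print(f"parameters: {minimum:.3f}, {lower_mode:.3f}, {upper_mode:.3f}, {maximum:.3f}")
print(f"draws range from {draws.min():.3f} to {draws.max():.3f}")
```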
These distributions describe the sensitivity (s) (or specificity, t) of classification. However, the simulation method requires an estimate of the predictive value, which is also a function of the prevalence of the value (p). The relation is depicted in the following equations used to calculate the positive predictive value (PPV) and negative predictive value (NPV) for a dichotomous (j = 1 or 0) variable (indexed by i).
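$$\mathrm{PPV}_i = \frac{s_i\,p_i}{s_i\,p_i + (1 - t_i)(1 - p_i)}, \qquad \mathrm{NPV}_i = \frac{t_i\,(1 - p_i)}{t_i\,(1 - p_i) + (1 - s_i)\,p_i}$$

Here p_i denotes the prevalence of the value j = 1 of variable i.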
Given the dependence of predictive values on prevalence, we calculated predictive values within strata of another variable expected a priori to be strongly associated with the variable under consideration. For example, we expected the prevalence of a CCI value of 0 to be highest in the youngest age group and lowest in the oldest age group. Table 1 shows that this relation was observed and shows its effect on the predictive values. As the prevalence decreases, the positive predictive value decreases and the negative predictive value increases. We calculated predictive values for recurrence, receipt of chemotherapy, and tumor size within strata of node status. We also calculated predictive values for 1) receipt of tamoxifen therapy within strata of estrogen receptor expression, 2) receipt of radiation therapy within strata of breast-conserving surgery, and 3) CCI value within strata of age.
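The dependence of the predictive values on prevalence can be illustrated with a short Python sketch based on the standard Bayes relation among sensitivity (s), specificity (t), and prevalence (p). The sensitivity, specificity, and stratum prevalences below are hypothetical values chosen for illustration, not the estimates reported in table 1.

```python
def positive_predictive_value(s, t, p):
    """P(truly 1 | classified 1) for sensitivity s, specificity t, prevalence p."""
    return s * p / (s * p + (1.0 - t) * (1.0 - p))

def negative_predictive_value(s, t, p):
    """P(truly 0 | classified 0) for sensitivity s, specificity t, prevalence p."""
    return t * (1.0 - p) / (t * (1.0 - p) + (1.0 - s) * p)

# Hypothetical inputs: prevalence of CCI = 0 assumed to fall with age.
sensitivity, specificity = 0.94, 0.90
prevalence_by_age = {"65 to <70": 0.80, "70 to <80": 0.70, ">=80": 0.60}

ppv = {g: positive_predictive_value(sensitivity, specificity, p)
       for g, p in prevalence_by_age.items()}
npv = {g: negative_predictive_value(sensitivity, specificity, p)
       for g, p in prevalence_by_age.items()}

for group in prevalence_by_age:
    print(f"{group}: PPV {ppv[group]:.3f}, NPV {npv[group]:.3f}")
```

With these inputs the positive predictive value falls and the negative predictive value rises as prevalence declines, which is the pattern described in the text.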
After calculating the predictive values, we conducted a Bernoulli trial for every variable included in the IRR and for every record in the data set with the probability of success equal to the predictive value. When the Bernoulli trial returned a finding of "true," we did not change the original value assigned to the variable in that record. When the Bernoulli trial returned a finding of "false," we changed the original value assigned to the variable in that record. For dichotomous variables, we changed values of 1 to 0 and values of 0 to 1. For multilevel variables, we assigned a new value through imputation informed by the IRR results. For example, a person originally classified as having a CCI value of 3 or 4 whose Bernoulli trial returned a finding of "false" would have a 50 percent probability of reclassification to CCI = 0 and a 50 percent probability of reclassification to CCI = 1 or 2.
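The reclassification step for one iteration can be sketched as below. The function names, the predictive values, and the toy records are assumptions made for illustration. The 50/50 reallocation for CCI = 3 or 4 follows the example in the text, while the other reallocation probabilities are hypothetical.

```python
import numpy as np

def reclassify_dichotomous(values, ppv, npv, rng):
    """Keep each 0/1 value with probability equal to its predictive value;
    otherwise flip it (1 becomes 0, 0 becomes 1)."""
    values = np.asarray(values)
    prob_correct = np.where(values == 1, ppv, npv)
    keep = rng.uniform(size=values.shape) < prob_correct  # one Bernoulli trial per record
    return np.where(keep, values, 1 - values)

def reclassify_multilevel(values, predictive_value, realloc, rng):
    """When the Bernoulli trial fails, assign one of the other categories
    with the probabilities given in realloc[original category]."""
    out = list(values)
    for i, v in enumerate(values):
        if rng.uniform() >= predictive_value[v]:
            categories, probs = realloc[v]
            out[i] = str(rng.choice(categories, p=probs))
    return out

rng = np.random.default_rng(2007)  # arbitrary seed

radiation = np.array([1, 1, 0, 1, 0, 0, 1, 0])      # toy dichotomous variable
print(reclassify_dichotomous(radiation, ppv=0.97, npv=0.95, rng=rng))

cci = ["0", "1-2", "3-4", "0", "3-4"]                # toy multilevel variable
predictive_value = {"0": 0.95, "1-2": 0.85, "3-4": 0.80}   # hypothetical
realloc = {
    "0": (["1-2", "3-4"], [0.9, 0.1]),               # hypothetical split
    "1-2": (["0", "3-4"], [0.7, 0.3]),               # hypothetical split
    "3-4": (["0", "1-2"], [0.5, 0.5]),               # 50/50 example from the text
}
print(reclassify_multilevel(cci, predictive_value, realloc, rng))
```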
We then analyzed this modified data set to obtain an estimate of the associations, adjusted for the simulated misclassification errors, by reestimating the multivariable proportional hazards model. By repeating the method over 5,000 iterations, we accumulated a frequency distribution of the associations, from which we obtained a median as a point estimate and a simulation interval (the 2.5th and 97.5th percentiles) that reflected adjustment for the misclassification bias and additional uncertainty contributed by classification errors. To simultaneously incorporate random error, we subtracted the product of the proportional hazards model's covariance matrix and a vector of random normal deviates from the model's parameter vector. Accumulating these results generated a frequency distribution that simultaneously accounted for random error and uncertainty due to classification errors (14, 15).
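The accumulation of results across iterations can be sketched as follows. To keep the example self-contained, `toy_refit` is a hypothetical stand-in that returns a log hazard ratio and standard error instead of refitting a Cox model to a reclassified data set. For a single coefficient, the random-error step is assumed to be the standard error multiplied by a standard normal deviate, subtracted from the estimate.

```python
import numpy as np

def simulation_interval(refit, n_iter, rng):
    """Accumulate adjusted log hazard ratios over n_iter iterations and return
    the median with the 2.5th and 97.5th percentiles on the hazard ratio scale."""
    draws = np.empty(n_iter)
    for k in range(n_iter):
        log_hr, se = refit(rng)                 # estimate from one modified data set
        draws[k] = log_hr - rng.standard_normal() * se   # add random error
    low, median, high = np.percentile(draws, [2.5, 50.0, 97.5])
    return float(np.exp(median)), float(np.exp(low)), float(np.exp(high))

def toy_refit(rng):
    """Hypothetical stand-in for reclassifying the data and refitting the model:
    the log hazard ratio varies across iterations around log(1.5)."""
    return np.log(1.5) + rng.normal(0.0, 0.05), 0.10

rng = np.random.default_rng(2007)  # arbitrary seed
hr_median, hr_low, hr_high = simulation_interval(toy_refit, n_iter=5000, rng=rng)
print(f"hazard ratio {hr_median:.2f} (95% simulation interval {hr_low:.2f}, {hr_high:.2f})")
```

In a real analysis the stand-in would be replaced by a function that draws classification parameters, reclassifies every record, and reestimates the multivariable proportional hazards model.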
| RESULTS |
|---|
We enrolled 1,859 breast cancer patients, 1,836 of whom underwent either mastectomy or breast-conserving surgery and formed the sample for this substudy.
Table 2 shows the descriptive characteristics of the cohort, crude breast cancer recurrence rates, crude hazard ratios, and mutually adjusted hazard ratios. These conventional results are similar to those from an analysis of receipt of guideline-appropriate therapy and recurrence in the full study sample (16).
Table 3 shows the results of the IRR substudy. For most variables, concordance between the initial abstracted value and the reabstracted value was quite good (≥90 percent). The IRR data (table 3) also provide face validity for the notion that the sensitivity and specificity of classification are proportional to the IRR concordance. For example, we expected tumor size to be easily and accurately abstracted because it was either populated from another data source or abstracted from the medical record, where it is readily found in the pathology report. Only two of the 125 reabstracted records contained a discordant tumor size. In contrast, the CCI always requires review of the medical record and is a composite variable constructed by abstracting reports of multiple diseases recorded in multiple and diverse records. Therefore, we expected that the reabstracted records would have poorer agreement with the original value, and we found that 15 of the 125 reabstracted records contained a discordant value.
Figure 1 illustrates the four probability density distributions used to model the probability of correct classification, using the IRR data for the sensitivity of classification of a CCI equal to zero as an example.
Figure 2 depicts the results of the Monte Carlo simulations using the four probability distributions. The characteristics expected a priori to be associated with recurrence hazard were, in fact, observed to be associated with recurrence. For example, women with larger tumors, more positive lymph nodes, or poorly differentiated tumors had a higher hazard of recurrence. Breast-conserving surgery conferred a higher hazard of recurrence than mastectomy, but radiation therapy, which ordinarily accompanies breast-conserving surgery but less often accompanies mastectomy, reduced the hazard of recurrence. In general, the trapezoidal scenario yielded the greatest uncertainty, as measured by the vertical length of the intervals, and the conventional model yielded the least uncertainty.
| DISCUSSION |
|---|
The IRR study showed that medical record review yielded high-quality data, since most values assigned to variables did not change upon reabstraction by an experienced reviewer. Most investigators would only report this characterization of the data quality, but we chose to quantify the additional uncertainty in the effect estimates contributed by the uncertainty in classification of the analytic variables. We implemented three models to quantify that uncertainty, none of which eradicated the ability to discern the associations strongly expected a priori.
Of the scenarios we implemented, the binomial scenario may provide an appropriate compromise between modeling the uncertainty due to classification errors and overstating that uncertainty. The conventional model assumes no errors in classification, which is unlikely given the results of the IRR substudy. The fixed scenario sets the classification error rates equal to the results observed in the IRR substudy. It is unlikely that the true classification error rates exactly equal those observed in the reliability substudy, both because we only expect them to be correlated and because of the potential for chance variation in the substudy sample. The binomial scenario accounts for the subsample variability but on average preserves the correlation. The trapezoidal scenario probably overstates the uncertainty in the classification error rates.
Monte Carlo simulation methods, such as the ones implemented here, extend conventional confidence intervals to account for the assigned distributions of bias parameters (17), such as abstractor agreement, and thereby account for sources of uncertainty beyond random error (18). Alternative methods for addressing measurement error in Cox proportional hazards models have been proposed (19–23). Many rely on an additive error model (24, 25) or an assumption of perfect validity data (26), neither of which restriction applies to our method. Others become computationally prohibitive as the number of covariates subject to classification errors grows large (20, 22). The method of Zucker and Spiegelman (20), which applies directly to misclassified discrete covariates such as those used in our analysis, would require specification of too large a matrix to analyze all of our variables simultaneously.
However, our method should be considered with the following limitations in mind. First, substituting a measure of agreement for the classification error rate is questionable. Were a true gold standard available, an actual measure of classification error rates would be preferable. For example, errors in the medical record, such as an erroneous report of tumor size, would result in classification error because the abstracted tumor size would not equal the true tumor size. Nonetheless, since tumor size is easily abstracted, it is likely that the two abstractors would both record the same erroneous value, thereby underrepresenting the classification error rate. Despite this limitation, it does seem that measures of agreement are correlated with classification error rates (27, 28), as suggested by the poorer agreement for items thought to be more difficult to abstract (e.g., the CCI).
Second, not all variables in the model of recurrence rate were included in the IRR substudy, so we could not model all of them simultaneously. We selected variables for the substudy based, in part, on an expectation for them to be susceptible to differences in agreement and for them to be strongly related to recurrence rate. Age group, for example, is likely to be easily and correctly abstracted, so it should have very high agreement. Age group is unlikely to be strongly related to recurrence rate. Therefore, we do not expect that our exclusion of age group or similar variables from the IRR substudy affected the results.
Third, the trapezoidal scenario introduced a positive correlation, so that classification error rates drawn from the tail of a distribution would be likely to also be drawn from the same tail in a different category of the same variable. For example, if abstractors were simulated as being good at assessing comorbidity equal to 1, they were also probably simulated as being good at assessing comorbidity equal to 2. It is possible, though, that abstractors were good at assessing comorbidity equal to 0 but poor at assessing comorbidity greater than 0, simply because they were not skilled at identifying comorbid diseases represented in the medical record. In this circumstance, a negative correlation would better represent the data. Ultimately, the simulation model must represent the error-generating mechanisms (4, 27) and their interactions, not all of which may be well understood.
We calculated predictive values within strata of a second variable strongly related to the first. A full Bayesian model might simultaneously take account of all other variables, rather than just one, and thereby more completely model the data-generating mechanism. In addition, a more complete simulation study would evaluate a wider range of parameters, whereas findings reported in this study were based on parameters actually estimated in the cohort.
Despite these limitations, we contend that a quantitative estimate of the uncertainty contributed by classification errors is superior to a qualitative assessment of data quality based on the agreement between raters (29). Each of the aforementioned limitations is amenable to alternative modeling strategies that might yield important differences in the results. If so, those differences should inspire investigators to collect additional validation data that better characterize the association between the variables and the outcome, taking account of the classification errors. A simple characterization of the agreement between raters, which almost always seems to find that agreement was "good," is unlikely to inspire collection of additional validation data for better characterizing the true uncertainty in estimates of effect.
| ACKNOWLEDGMENTS |
|---|
This study was supported by Public Health Service grant R01 CA093772 ("Breast Cancer Treatment Effectiveness in Older Women"; Rebecca A. Silliman, Principal Investigator) from the National Cancer Institute, National Institutes of Health, US Department of Health and Human Services.
The authors thank the people who supported this project, including the site project managers, programmers, and medical record abstractors: Group Health Center for Health Studies: Linda Shultz, Kristin Delaney, Margaret Farrell-Ross, Mary Sunderland, Millie Magner, and Beth Kirlin; Meyers Primary Care Institute, Fallon Community Health Plan: Kimberly Hill, Jackie Fuller, Doris Hoyer, and Janet Guilbert; Henry Ford Health System: Sharon Hensley Alford, Karen Wells, Patricia Baker, and Rita Montague; HealthPartners: Maribet McCarty and Alex Kravchik; Kaiser Permanente Southern California: Julie Stern, Janis Yao, Michelle McGuire, Erica Hnatek-Mitchell, and Noemi Manlapaz; Lovelace Health Systems: Judith Hurley, Hans Petersen, and Melissa Roberts.
This research was presented at the 38th Annual Meeting of the Society for Epidemiologic Research (Toronto, Ontario, Canada, June 27–30, 2005) and the 2005 Joint Statistical Meetings (Minneapolis, Minnesota, August 7–11, 2005).
Conflict of interest: none declared.
| References |
|---|
- Yawn BP, Wollan P. Interrater reliability: completing the methods description in medical records review studies. Am J Epidemiol (2005) 161:974–7.
- Cassidy LD, Marsh GM, Holleran MK, et al. Methodology to improve data quality from chart review in the managed care setting. Am J Manag Care (2002) 8:787–93.
- Reisch LM, Scura Fosse J, Beverly K, et al. Training, quality assurance, and assessment of medical record abstraction in a multisite study. Am J Epidemiol (2003) 157:546–51.
- Gilbert EH, Lowenstein SR, Koziol-McLain J, et al. Chart reviews in emergency medicine research: where are the methods? Ann Emerg Med (1996) 27:305–8.
- Sheikh K. Re: "Interrater reliability: completing the methods description in medical records review studies." (Letter). Am J Epidemiol (2005) 162:919.
- Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics (1977) 33:159–74.
- Yawn BP, Wollan PM. Re: "Interrater reliability: completing the methods description in medical records review studies." (Reply to letter). Am J Epidemiol (2005) 162:919–20.
- Enger SM, Thwin SS, Buist DS, et al. Breast cancer treatment of older women in integrated health care settings. J Clin Oncol (2006) 24:4377–83.
- Fleming ID, Cooper JS, Henson DE, et al. AJCC cancer staging manual (1997) 5th ed. Philadelphia, PA: Lippincott Williams & Wilkins.
- Johnson CH, ed. SEER program coding and staging manual 2004 (2004) 4th ed. Bethesda, MD: National Cancer Institute. (NIH publication no. 04-5581). (http://seer.cancer.gov/tools/codingmanuals/).
- Charlson ME, Pompei P, Ales KL, et al. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis (1987) 40:373–83.
- Hebert-Croteau N, Brisson J, Latreille J, et al. Compliance with consensus recommendations for the treatment of early stage breast carcinoma in elderly women. Cancer (1999) 85:1104–13.
- Newschaffer CJ, Bush TL, Penberthy LT. Comorbidity measurement in elderly female breast cancer patients with administrative and medical records data. J Clin Epidemiol (1997) 50:725–33.
- Lash TL, Fink AK. Semi-automated sensitivity analysis to assess systematic errors in observational epidemiologic data. Epidemiology (2003) 14:451–8.
- Fox MP, Lash TL, Greenland S. A method to automate probabilistic sensitivity analyses of misclassified binary variables. Int J Epidemiol (2005) 34:1370–6.
- Geiger AM, Thwin SS, Lash TL, et al. Recurrences and second primary breast cancers in older women with early stage disease initially. Cancer (2007) 109:966–74.
- Greenland S. Interval estimation by simulation as an alternative to and extension of confidence intervals. Int J Epidemiol (2004) 33:1389–97.
- Greenland S. Multiple bias modelling for analysis of observational data. J R Stat Soc Ser A Stat Soc (2005) 168:267–306.
- Hu P, Tsiatis AA, Davidian M. Estimating the parameters in the Cox model when covariate variables are measured with error. Biometrics (1998) 54:1407–19.
- Zucker DM, Spiegelman D. Inference for the proportional hazards model with misclassified discrete-valued covariates. Biometrics (2004) 60:324–34.
- Zucker DM. A pseudo-partial likelihood method for semiparametric survival regression with covariate errors. J Am Stat Assoc (2005) 100:1264–77.
- Li Y, Ryan L. Inference on survival data with covariate measurement error: an imputation-based approach. Scand J Stat (2006) 33:169–90.
- Liu K, Stone RA, Mazumdar S, et al. Covariate measurement error in the Cox model: a simulation study. Commun Stat Simul Comput (2004) 33:1077–93.
- Xie SX, Wang CY, Prentice RL. A risk set calibration method for failure time regression by using a covariate reliability sample. J R Stat Soc Ser B Stat Methodol (2001) 63:855–70.
- Huang YJ, Wang CY. Cox regression with accurate covariates unascertainable: a nonparametric-correction approach. J Am Stat Assoc (2000) 95:1209–19.
- Hu CC, Lin DY. Cox regression with covariate measurement error. Scand J Stat (2002) 29:637–55.
- Horwitz RI, Yu EC. Assessing the reliability of epidemiologic data obtained from medical records. J Chronic Dis (1984) 37:825–31.
- Boyd NF, Pater JL, Ginsburg AD, et al. Observer variation in the classification of information from medical records. J Chronic Dis (1979) 32:327–32.
- Lash TL. Heuristic thinking and inference from observational epidemiology. Epidemiology (2007) 18:67–72.