American Journal of Epidemiology Advance Access originally published online on January 14, 2008
American Journal of Epidemiology 2008 167(6):641-643; doi:10.1093/aje/kwm368
Invited Commentary: The Use of Imperfect Data—Compromise or Compromising?
From the Epidemiology Department, Emory University, Atlanta, GA
Correspondence to Dr. Penelope Howards, Epidemiology Department, Emory University, 1518 Clifton Road, NE, Atlanta, GA 30322 (e-mail: penelope.howards{at}emory.edu).
Received for publication October 30, 2007. Accepted for publication November 8, 2007.
| ABSTRACT |
|---|
Automated databases are appealing resources because they contain detailed data that are relatively accessible, but there are also critical gaps in the data available. Researchers may compromise by trying to fill those gaps with proxy variables, but how appropriate these surrogates are is rarely known. In this issue (Am J Epidemiol 2008;167:633–640), Toh et al. consider the effect of using two algorithms to estimate the timing of medication use during pregnancy in the absence of gestational-age data. Although the delivery-date algorithm has promising sensitivity and specificity, this is true only under very specific conditions that seem unlikely to hold generally. Furthermore, it seems difficult to know a priori when those conditions do hold. There are times when using imperfect data is acceptable, but, at other times, the data are too imperfect to be helpful. Automated databases are certainly valuable, but they should be used with caution and where possible should be linked to databases that can fill the critical gaps.
bias (epidemiology); gestational age; medical records systems, computerized; pregnancy outcome
Automated databases, such as pharmacy records from health maintenance organizations, are appealing resources because they contain detailed health-related data (e.g., drug name, dose, date that the prescription was filled) that are accessible by using a fairly simple study protocol. However, other important variables are completely absent (e.g., whether the medication was actually consumed, potential confounders, gestational age for pregnant women). In the face of imperfect or incomplete information, researchers often try to resolve critical data gaps by using proxy variables. Unfortunately, it is rarely possible to directly evaluate how such approximations affect substantive conclusions, but, in this issue of the Journal, Toh et al. (1) examine how well timing of medication use during pregnancy can be assessed in the absence of gestational-age data.
They compare two algorithms to estimate medication use during the first trimester of pregnancy with the standard dating assessment based on the last menstrual period (1). The algorithms were selected from the literature and include the delivery-date algorithm (2), which assumes that all included pregnancies end in delivery 270 days after conception, and the pregnancy-indicator algorithm (3), which assumes that all included pregnancies involved prenatal visits (or pregnancy tests) at approximately the same time early in pregnancy. Toh et al. (1) examine the sensitivity and specificity of these algorithms in relation to dating based on the last menstrual period. They also perform sensitivity analyses to evaluate how various possible sensitivities of the algorithms would affect the estimated risk ratio.
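As a rough illustration of why the delivery-date assumption matters, the following Python sketch contrasts the two ways of placing the first-trimester window. The 270-day figure comes from the description above; the 14-day offset from last menstrual period to conception, the 90-day trimester length, the function names, and the example dates are all assumptions made for illustration and are not taken from Toh et al. (1) or Andrade et al. (2).

```python
from datetime import date, timedelta

ASSUMED_GESTATION_DAYS = 270   # from the text: delivery assumed 270 days after conception
LMP_TO_CONCEPTION_DAYS = 14    # assumption for illustration only
TRIMESTER_DAYS = 90            # assumption: first trimester = first 90 days after conception


def first_trimester_from_delivery(delivery):
    """Delivery-date approach: count back a fixed 270 days to an estimated conception."""
    conception = delivery - timedelta(days=ASSUMED_GESTATION_DAYS)
    return conception, conception + timedelta(days=TRIMESTER_DAYS - 1)


def first_trimester_from_lmp(lmp):
    """Reference approach: date the window from the last menstrual period (LMP)."""
    conception = lmp + timedelta(days=LMP_TO_CONCEPTION_DAYS)
    return conception, conception + timedelta(days=TRIMESTER_DAYS - 1)


def is_exposed(fill_date, window):
    """True if the prescription fill date falls inside the window (inclusive)."""
    start, end = window
    return start <= fill_date <= end


# Hypothetical pregnancy: same LMP, delivered either at term or 8 weeks early.
lmp = date(2007, 1, 1)
term_delivery = date(2007, 10, 12)     # exactly 270 days after LMP + 14 days
preterm_delivery = date(2007, 8, 17)   # 56 days earlier than the term delivery
fill = date(2007, 4, 1)                # prescription filled late in the true first trimester

lmp_window = first_trimester_from_lmp(lmp)
term_window = first_trimester_from_delivery(term_delivery)
preterm_window = first_trimester_from_delivery(preterm_delivery)

print(is_exposed(fill, lmp_window))      # True
print(is_exposed(fill, term_window))     # True
print(is_exposed(fill, preterm_window))  # False
```

In this hypothetical preterm pregnancy, the delivery-date window starts 56 days too early, so a prescription filled late in the true first trimester is classified as unexposed.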
The pregnancy-indicator algorithm is theoretically appealing because, as the authors point out (1), the time of the first prenatal visit is not dependent on the outcome of the pregnancy. However, women enter prenatal care at widely variable gestational ages, so the fundamental assumption of this method is flawed, even when excluding women whose first prenatal visit was less than 7 months before the delivery date. (Exclusion of these women, in and of itself, could be a source of bias if late entry into prenatal care is associated with medication use and the outcome of interest.) As would be expected, Toh et al. (1) found the sensitivity of this method to be poor.
The delivery-date algorithm performed much better overall (for women meeting the original algorithm inclusion criteria, for all women, and for women with term births) but poorly for preterm births (1). As originally constructed (2), the delivery-date algorithm excludes a number of conditions associated with preterm birth in an attempt to limit the population to term births (because preterm births have shorter gestations but are not directly identifiable without gestational-age data). As Toh et al. (1) point out, this limitation undermines the utility of the algorithm. Many pregnancy outcomes that could be of interest are associated with early gestational ages at birth, including preterm birth itself, low birth weight, and some birth defects. In fact, it is difficult to think of outcomes that might be of interest and that would not be associated with gestational age. Therefore, excluding these pregnancies could lead to misleading results. Even if exclusion of preterm pregnancies were acceptable, exclusion of pregnancies affected by conditions related to preterm birth could be problematic. These conditions are not perfectly correlated with preterm birth. Therefore, some non-preterm births would be unintentionally excluded, and some preterm births would remain in the study population with possibly misclassified exposure status. Furthermore, the complications themselves might be outcomes of interest or be associated with the outcome of interest.
Toh et al. (1) considered medication use during the entire first trimester, but, as they mention, the sensitivity and specificity of the algorithms would change depending on the exposure window of interest. Teratogenicity of medications is one area of traditional concern. While organogenesis occurs during the first trimester, different organs develop during different gestational weeks, so, for specific birth defects, examining medication use during key gestational weeks or key gestational months may be more appropriate than considering the entire first trimester. The sensitivity and specificity of the algorithms would presumably decrease with shorter exposure windows of interest, so studies of specific birth defects would be more prone to error. As Toh et al. point out, the performance of the algorithms is also dependent on the length of time the medication is used. Medication used daily throughout pregnancy would be easy to identify even with a flawed algorithm, but medication used sporadically would be more difficult to capture.
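The dependence on window length can be made concrete with a deliberately simplified calculation. The sketch below assumes that the algorithm's window has the same length as the true window but is shifted by a fixed dating error, and that a single day of medication use is equally likely to fall anywhere in the true window. Under those assumptions (which are illustrative and not drawn from Toh et al.), the chance that the shifted window still captures the exposure is the fraction of the true window that the two windows share. The function name and the 14-day error are hypothetical.

```python
def capture_probability(window_days, dating_error_days):
    """Fraction of the true exposure window still covered by a window of the
    same length that has been shifted by dating_error_days."""
    overlap = max(0.0, window_days - abs(dating_error_days))
    return overlap / window_days


# Same hypothetical 14-day dating error, progressively shorter windows.
for window in (90, 28, 7):
    print(window, round(capture_probability(window, 14), 2))
# prints:
# 90 0.84
# 28 0.5
# 7 0.0
```

With the same dating error, a trimester-long window loses only a small share of true exposures, whereas a one-week window can miss them entirely.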
It is possible that the sensitivities observed by Toh et al. (1) are optimistic even given the criteria they used. Medication use was based on recall up to 6 months after delivery. As the authors state, the point of the paper was to assess the timing of medication use, which theoretically would not be affected by the accuracy of the exposure recall (1) because once exposure status during the first trimester is assigned based on the last menstrual period, the only remaining question is whether the algorithms assign the same status. Nevertheless, women may be more aware of medication use after they know they are pregnant. Therefore, they may not remember medications they used after they were pregnant but before they knew they were pregnant. These women would currently be classified as unexposed based on the last menstrual period dating and the algorithms (assuming they did not use medications later in pregnancy), but it is possible that if the exposure had been recalled correctly (or an automated database had been used), they would have been classified as exposed based on the last menstrual period and unexposed based on the algorithms. Early in the first trimester is the very time when the last menstrual period–based trimester and the algorithms overlap inconsistently; therefore, for longer pregnancies, forgotten medication use early in the trimester would increase the denominator (true positives) but not the numerator (observed positives) of the sensitivities. Thus, it is possible that the observed sensitivities reported in the paper may be higher than the true sensitivities, although it is unclear how much of a problem this actually is.
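The argument above can be written out as a short worked expression. The symbols are introduced here for illustration only: A is the number of pregnancies classified as exposed by both the last menstrual period dating and the algorithm, B is the number classified as exposed by the last menstrual period dating but not by the algorithm, and k is the number of pregnancies with unrecalled early medication use that, had it been recorded, would have been exposed by the last menstrual period dating but unexposed by the algorithm. This is a simplification that assumes every forgotten exposure falls in that one cell.

```latex
\[
  \mathrm{Se}_{\text{obs}} = \frac{A}{A + B},
  \qquad
  \mathrm{Se}_{\text{true}} = \frac{A}{A + B + k}
  \;\le\; \mathrm{Se}_{\text{obs}}
  \quad (k \ge 0).
\]
```

The forgotten exposures enlarge only the denominator, so the observed sensitivity can overstate the true sensitivity, and the two coincide when k = 0.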
Toh et al. (1) went beyond considering the sensitivities and specificities of the algorithm and examined how the observed sensitivities would affect estimated relative risks in two sensitivity analyses: one in which the sensitivities were the same for cases and noncases (nondifferential misclassification of exposure) and one in which they differed (differential misclassification). As would be expected on average, when the sensitivities were the same for the cases and the noncases, the results were biased toward the null, but, under the authors' assumptions, the magnitude of the bias was small and would not lead to different substantive conclusions compared with the unbiased results (1). It is not clear what, if any, conditions would be required to introduce substantial bias when misclassification of medication use is not differential by outcome. In contrast, the sensitivity analyses in which the sensitivities differed by case/noncase status led to bias that could affect the conclusions drawn. It seems plausible if not likely that the sensitivity for the cases and the noncases would be different, especially if the outcome is associated with gestational age at birth, so these results are of concern.
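The contrast between the two scenarios can be sketched with expected cell counts. The Python example below is not a reproduction of the sensitivity analyses in Toh et al. (1); the cohort size, risks, sensitivities, specificities, and function name are hypothetical values chosen only to show the direction of the bias.

```python
def observed_risk_ratio(n_exposed, n_unexposed, risk_exposed, risk_unexposed,
                        sens_case, spec_case, sens_noncase, spec_noncase):
    """Expected risk ratio after exposure misclassification, with sensitivity
    and specificity allowed to differ between cases and noncases."""
    # True 2x2 table (expected counts).
    a = n_exposed * risk_exposed        # exposed cases
    b = n_exposed - a                   # exposed noncases
    c = n_unexposed * risk_unexposed    # unexposed cases
    d = n_unexposed - c                 # unexposed noncases

    # Reclassify exposure separately within cases and within noncases.
    a_obs = sens_case * a + (1 - spec_case) * c
    c_obs = (1 - sens_case) * a + spec_case * c
    b_obs = sens_noncase * b + (1 - spec_noncase) * d
    d_obs = (1 - sens_noncase) * b + spec_noncase * d

    risk_obs_exposed = a_obs / (a_obs + b_obs)
    risk_obs_unexposed = c_obs / (c_obs + d_obs)
    return risk_obs_exposed / risk_obs_unexposed


# Hypothetical cohort with a true risk ratio of 2.0 (4% vs. 2%).
cohort = dict(n_exposed=1000, n_unexposed=9000,
              risk_exposed=0.04, risk_unexposed=0.02)

rr_true = observed_risk_ratio(**cohort, sens_case=1.0, spec_case=1.0,
                              sens_noncase=1.0, spec_noncase=1.0)
rr_nondifferential = observed_risk_ratio(**cohort, sens_case=0.8, spec_case=1.0,
                                         sens_noncase=0.8, spec_noncase=1.0)
rr_differential = observed_risk_ratio(**cohort, sens_case=0.6, spec_case=1.0,
                                      sens_noncase=0.8, spec_noncase=1.0)

print(round(rr_true, 2), round(rr_nondifferential, 2), round(rr_differential, 2))
# prints: 2.0 1.96 1.42
```

With these made-up inputs, equal sensitivities for cases and noncases move the estimate only slightly toward the null, whereas a lower sensitivity among cases attenuates it considerably.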
Another drawback to the algorithms is that spontaneous abortion cannot be studied. The delivery-date method would not be appropriate because there would not be 270 days between conception and loss, and the pregnancy-indicator method would also drop spontaneous abortions because the time between the first prenatal visit and pregnancy termination would be less than 7 months. Admittedly, spontaneous abortion data are difficult to capture, but if the interest is in the teratogenicity of drugs, then it is important to consider the possibility that the medications lead to an increased risk of fetal death due to severe malformations. It may not be adequate to examine the effect of the medication among pregnancies that survive to term (or even preterm). Automated databases might still be helpful, but they would need to be linked to medical records that include gestational-age data for all pregnancies, including those that terminate early.
There are times when it is appropriate and necessary to use imperfect data rather than wait for a possibly unattainable ideal, but there are other times when the data are so imperfect that it is better not to use them because they are unlikely to provide valid information. It is difficult to identify which case is which without outside knowledge. Toh et al. (1) have provided a valuable contribution to the literature by actively exploring the effect of using two algorithms to assess the timing of exposure during pregnancy in the absence of gestational-age data. Their work suggests that the delivery-date algorithm could be useful provided the following conditions are met: the exposure is not associated with gestational age, the exposure window of interest is large (e.g., the entire first trimester), the medication is used more than sporadically, medication use is not focused early in pregnancy, and the exposure is not associated with the exclusion criteria. It seems unlikely that all these conditions would hold generally. Even if there are circumstances in which the delivery-date algorithm is adequate, it may not be possible to identify those circumstances a priori. If the effect of the medication is unknown, how is it possible to know in advance that it does not directly or indirectly affect gestational age?
One could argue that studies using automated databases are exploratory and, where needed, would be followed by studies designed to explore specific promising or important hypotheses. However, Toh et al.'s results (1) suggest that, at least under the specified assumptions, using the algorithm data can result in strong biases toward the null. If published, such research would lend weight to the idea that the medications of interest are harmless and might prevent studies with more idealized designs from being performed. On the other hand, it is not possible or desirable to develop a large cohort study or other expensive study design to examine an exposure that is unlikely to have an effect. Perhaps the best compromise would be to use the automated databases but spend the extra time and money to link them to other relevant medical records. Linking to birth certificates should at least provide a better (although still imperfect) estimate of gestational age for pregnancies that resulted in birth. It would be more difficult to find and link to data on potential confounders, but when confounding is likely to be strong, it would be essential.
There is a danger that bias can be introduced by filling critical data gaps with imperfect proxy data. One step in the right direction is to perform sensitivity analyses to evaluate the adequacy and potential effect of the surrogate data, but such analyses are limited by their assumptions. Toh et al. (1) went one step further and were able to actually compare the first-trimester algorithms with the last menstrual period standard. Their findings should be considered when weighing the evidence of studies using algorithms to estimate gestational age. Still, uncertainty remains as to the conditions under which their results hold. Although more expensive and resource intensive, designing studies that link automated databases to complementary data sources is a worthy goal.
| ACKNOWLEDGMENTS |
|---|
The author thanks Suzanne M. Gilboa and Pamela J. Mink for insightful comments on an earlier draft of the manuscript.
Conflict of interest: none declared.
| References |
|---|
1. Toh S, Mitchell AA, Werler MM, et al. Sensitivity and specificity of computerized algorithms to classify gestational periods in the absence of information on date of conception. Am J Epidemiol (2008) 167:633–40.
2. Andrade SE, Raebel MA, Morse AN, et al. Use of prescription medications with a potential for fetal harm among pregnant women. Pharmacoepidemiol Drug Saf (2006) 15:546–54.
3. Hardy JR, Leaderer BP, Holford TR, et al. Safety of medications prescribed before and during early pregnancy in a cohort of 81 975 mothers from the UK General Practice Research Database. Pharmacoepidemiol Drug Saf (2006) 15:555–64.