American Journal of Epidemiology Advance Access originally published online on May 28, 2008
American Journal of Epidemiology 2008 168(2):212-224; doi:10.1093/aje/kwn104
PRACTICE OF EPIDEMIOLOGY
On the Estimation of Additive Interaction by Use of the Four-by-two Table and Beyond
1 Department of Epidemiology and Biostatistics, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
2 Robarts Clinical Trials, Robarts Research Institute, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
Correspondence to Dr. G. Y. Zou, Department of Epidemiology and Biostatistics, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada N6A 5C1 (e-mail: gzou{at}robarts.ca).
Received for publication September 14, 2007. Accepted for publication March 27, 2008.
| ABSTRACT |
|---|
A four-by-two table with its four rows representing the presence and absence of gene and environmental factors has been suggested as the fundamental unit in the assessment of gene-environment interaction. For such a table to be more meaningful from a public health perspective, it is important to estimate additive interaction. A confidence interval procedure proposed by Hosmer and Lemeshow has become widespread. This article first reveals that the Hosmer-Lemeshow procedure makes an assumption that confidence intervals for risk ratios are symmetric and then presents an alternative that uses the conventional asymmetric intervals for risk ratios to set confidence limits for measures of additive interaction. For the four-by-two table, the calculation involved requires no statistical programs but only elementary calculations. Simulation results demonstrate that this new approach can perform almost as well as the bootstrap. Corresponding calculations in more complicated situations can be simplified by use of routine output from multiple regression programs. The approach is illustrated with three examples. A Microsoft Excel spreadsheet and SAS codes for the calculations are available from the author and the Journal's website, respectively.
bootstrap; genotype-environment interaction; logistic regression; proportional hazards models; risk ratio
Abbreviations: AP, attributable proportion due to interaction; CI, confidence interval; OR, odds ratio; RERI, relative excess risk due to interaction; RR, risk ratio; SA, simple asymptotic; SI, synergy index
| INTRODUCTION |
|---|
In 1976, it was recognized that "[a]s more risk factors become established as probable causes in the elaboration of disease etiology, scientists will turn their attention increasingly to the question of interaction (synergy or antagonism) of the causes" (1, p. 506). Scientists can now study literally thousands of genes and their interactions with environmental factors, thanks to the Human Genome Project.
It has been suggested that at the fundamental core of assessing gene-environment interaction is a four-by-two table; note that the original article refers to the table as a two-by-four table (2). However, conducting proper inferences is the ultimate goal of any research (3, p. 2). Furthermore, on the basis of the sufficient component cause model (4), it is more meaningful to assess interaction on the additive scale (1). This is because information concerning an additive interaction between two factors is more relevant to disease prevention and intervention (5, 6; 7, chapters 6 and 10). For example, if the joint effect of two factors surpasses the sum of their single effects, then reduction of either one would also reduce the risk of the other factor in producing the disease.
There has been little discussion concerning appropriate statistical methods for estimating additive interactions. As a result, a simple asymptotic approach proposed by Hosmer and Lemeshow (8) has proliferated in the literature (9–12), despite its well-documented poor performance (13).
The purpose of this article is to present an alternative approach for constructing accurate confidence intervals (CIs) for measures of additive interaction. The desirable performance of this new approach is the result of incorporating the asymmetric confidence limits for risk ratios (or odds ratios), in contrast to the simple asymptotic approach that forces confidence limits for risk ratios (RRs) to be symmetric. The central idea is to recover the variances needed for measures of interactions from confidence limits for RRs. For the four-by-two table, the calculations involved can be done in a spreadsheet or by a hand-held calculator. Simulation results demonstrate that the new approach is accurate enough to replace the bootstrap. The new approach may also be applied to more complicated situations by using output from standard multiple regression programs. Three worked examples are presented. All calculations were done by use of a Microsoft Excel (Microsoft Corporation, Redmond, Washington) spreadsheet that is available from the author upon request. SAS codes (SAS Institute, Inc., Cary, North Carolina) using routine regression output to obtain confidence limits are supplementary material posted on the Journal's website (http://aje.oupjournals.org/).
| THE FOUR-BY-TWO TABLE AND ESTIMATION OF MEASURES OF ADDITIVE INTERACTION |
|---|
Let G and E denote two risk factors, with their presence and absence reflected by 1 and 0, respectively. In the case of gene-environment interaction, three possible biallelic genotypes may be readily handled. For example, one can assume a dominant mode of gene action so that the genotype AA and Aa are equivalent and coded as 1, and aa coded as 0. Thus, a contingency table may be formed with four rows representing gene and environment combinations 11, 10, 01, and 00 and two columns representing disease status (yes and no) as follows:
[Four-by-two table: rows for the gene-environment combinations 11, 10, 01, and 00; columns for disease status (yes, no).]
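Depending on the study design, one can estimate either odds ratios (ORs) in a case-control study or RRs in a cohort study, as illustrated in figures 1 and 2, respectively. The three measures of additive interaction devised by Rothman (14, chapter 15), in terms of RR, are the relative excess risk due to interaction (RERI),

$$\mathrm{RERI} = RR_{11} - RR_{10} - RR_{01} + 1,$$

the attributable proportion due to interaction (AP),

$$\mathrm{AP} = \frac{\mathrm{RERI}}{RR_{11}} = \frac{RR_{11} - RR_{10} - RR_{01} + 1}{RR_{11}},$$

and the synergy index (SI),

$$\mathrm{SI} = \frac{RR_{11} - 1}{(RR_{10} - 1) + (RR_{01} - 1)},$$

where $RR_{11}$, $RR_{10}$, and $RR_{01}$ denote the risk ratios for the combinations 11, 10, and 01 relative to the reference combination 00. The absence of additive interaction corresponds to RERI = 0, AP = 0, and SI = 1.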
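As a minimal sketch, the three measures can be computed from the three risk ratios as follows. The function name `additive_interaction` and the input values are illustrative assumptions and do not come from the article.

```python
def additive_interaction(rr11, rr10, rr01):
    """Return (RERI, AP, SI) from three risk ratios.

    rr11: risk ratio for exposure to both factors (reference: neither factor).
    rr10, rr01: risk ratios for exposure to one factor alone.
    """
    reri = rr11 - rr10 - rr01 + 1.0
    ap = reri / rr11
    denom = rr10 + rr01 - 2.0
    # SI is undefined when the two single-factor excess risks sum to zero.
    si = (rr11 - 1.0) / denom if denom != 0 else float("nan")
    return reri, ap, si


# Hypothetical risk ratios, chosen only to illustrate the arithmetic.
reri, ap, si = additive_interaction(rr11=4.0, rr10=2.0, rr01=1.5)
print(reri, ap, si)
```

With these made-up inputs the joint excess risk (3.0) exceeds the sum of the single-factor excess risks (1.5), so all three measures point to a positive additive interaction.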
| CONFIDENCE INTERVALS FOR MEASURES OF ADDITIVE INTERACTION |
|---|
Because the sampling distributions for single RRs are positively skewed, introductory texts in epidemiology suggest that inferences be conducted on the log scale. Since log-transformation cannot be applied to RERI (it could be negative), Hosmer and Lemeshow (8) suggested a simple asymptotic (SA) approach by which the 95 percent confidence limits are obtained by subtracting from and adding to the point estimate 1.96 times its standard error.
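Both RERI and AP may be parameterized as

$$\theta_1 - \theta_2 - \theta_3 + 1,$$

where $\theta_1 = RR_{11}$, $\theta_2 = RR_{10}$, and $\theta_3 = RR_{01}$ for RERI and $\theta_1 = 1/RR_{11}$, $\theta_2 = RR_{10}/RR_{11}$, and $\theta_3 = RR_{01}/RR_{11}$ for AP. Therefore, the problem reduces to constructing CIs for $\theta_1 - \theta_2 - \theta_3 + 1$ using the estimates $\hat\theta_i$ and their confidence limits $(l_i, u_i)$, $i = 1, 2, 3$.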
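The Appendix details a general approach to construction of the CI for a linear combination of parameters. Since the basic idea is to recover variance estimates needed for setting confidence limits for functions of parameters, the method may be referred to as "MOVER," indicating the method of variance estimates recovery. By Appendix equations A7 and A8, a $(1 - \alpha)100$ percent CI (L, U) for $1 + \theta_1 - \theta_2 - \theta_3$ is given by

$$L = 1 + \hat\theta_1 - \hat\theta_2 - \hat\theta_3 - \sqrt{(\hat\theta_1 - l_1)^2 + (u_2 - \hat\theta_2)^2 + (u_3 - \hat\theta_3)^2 - 2r_{12}(\hat\theta_1 - l_1)(u_2 - \hat\theta_2) - 2r_{13}(\hat\theta_1 - l_1)(u_3 - \hat\theta_3) + 2r_{23}(u_2 - \hat\theta_2)(u_3 - \hat\theta_3)} \tag{1}$$

and

$$U = 1 + \hat\theta_1 - \hat\theta_2 - \hat\theta_3 + \sqrt{(u_1 - \hat\theta_1)^2 + (\hat\theta_2 - l_2)^2 + (\hat\theta_3 - l_3)^2 - 2r_{12}(u_1 - \hat\theta_1)(\hat\theta_2 - l_2) - 2r_{13}(u_1 - \hat\theta_1)(\hat\theta_3 - l_3) + 2r_{23}(\hat\theta_2 - l_2)(\hat\theta_3 - l_3)}. \tag{2}$$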
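The estimated correlation $r_{ij}$, $i = 1, 2$, $j = 2, 3$, may be obtained as

$$r_{ij} = \frac{\widehat{\mathrm{cov}}(\hat\theta_i, \hat\theta_j)}{\sqrt{\widehat{\mathrm{var}}(\hat\theta_i)\,\widehat{\mathrm{var}}(\hat\theta_j)}}. \tag{3}$$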
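The following Python sketch shows one way the MOVER limits for $1 + \theta_1 - \theta_2 - \theta_3$ might be coded, assuming the limits take the variance-recovery form described in the Appendix (distances from each estimate to its own confidence limits, combined with the estimated correlations). The function name `mover_reri` and the numerical inputs are hypothetical and are not taken from the article or its spreadsheet.

```python
import math


def mover_reri(est, lower, upper, r12=0.0, r13=0.0, r23=0.0):
    """MOVER confidence limits for 1 + theta1 - theta2 - theta3.

    est, lower, upper: 3-tuples holding the point estimates and the
    (possibly asymmetric) confidence limits of theta1, theta2, theta3.
    r12, r13, r23: estimated correlations between the three estimates.
    """
    t1, t2, t3 = est
    l1, l2, l3 = lower
    u1, u2, u3 = upper
    point = 1.0 + t1 - t2 - t3

    # Lower limit: theta1 enters with a plus sign, so use its lower-side
    # distance; theta2 and theta3 enter with a minus sign, so use their
    # upper-side distances.
    a, b, c = t1 - l1, u2 - t2, u3 - t3
    low_sq = a * a + b * b + c * c - 2 * r12 * a * b - 2 * r13 * a * c + 2 * r23 * b * c

    # Upper limit: the mirror image of the above.
    a, b, c = u1 - t1, t2 - l2, t3 - l3
    up_sq = a * a + b * b + c * c - 2 * r12 * a * b - 2 * r13 * a * c + 2 * r23 * b * c

    # Guard against tiny negative values caused by rounding.
    return point - math.sqrt(max(low_sq, 0.0)), point + math.sqrt(max(up_sq, 0.0))


# Hypothetical risk ratios with asymmetric 95 percent limits; correlations
# are set to zero here purely to keep the arithmetic easy to follow.
L, U = mover_reri(est=(4.0, 2.0, 1.5), lower=(2.5, 1.2, 0.9), upper=(6.4, 3.3, 2.5))
print(L, U)
```

When the supplied limits are symmetric (estimate plus or minus 1.96 standard errors), this function returns the same interval as the SA method, which is one way to check an implementation.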
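It can be shown with equations 1 and 2 that the SA method by Hosmer and Lemeshow (8) is a consequence of assuming symmetric confidence limits for RRs. To see this for RERI, one needs to replace $\hat\theta_1 - l_1$ and $u_1 - \hat\theta_1$ by $z_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\hat\theta_1)}$, $\hat\theta_2 - l_2$ and $u_2 - \hat\theta_2$ by $z_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\hat\theta_2)}$, and $\hat\theta_3 - l_3$ and $u_3 - \hat\theta_3$ by $z_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\hat\theta_3)}$. Similar exercises will result in the SA CI for AP. This brings out the failing point of the SA approach: it has implicitly assumed that confidence limits for the RR are given by $\widehat{RR} \pm z_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\widehat{RR})}$. Failing to see this point may have resulted in the proliferation of the SA method (9–12).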
Now, since the derivation of the MOVER method (equations 1 and 2) did not assume symmetric confidence limits for $\theta_i$, one can use sensible confidence limits for RRs, such as $\exp\left[\ln\widehat{RR} \pm z_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\ln\widehat{RR})}\right]$, in the construction of confidence intervals for RERI and AP.
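For a cohort study, conventional asymmetric limits for a single RR can be obtained on the log scale and then back-transformed. The sketch below uses the standard large-sample variance of the log risk ratio for two independent binomial samples; the function name `rr_log_ci` and the counts are illustrative assumptions, and the formulas in the article's figures may be laid out differently.

```python
import math


def rr_log_ci(a, n1, c, n0, z=1.96):
    """Risk ratio with log-based (asymmetric) confidence limits.

    a / n1: number diseased / total in the exposed group.
    c / n0: number diseased / total in the reference group.
    """
    rr = (a / n1) / (c / n0)
    # Large-sample variance of ln(RR) for two independent binomial samples.
    se_log = math.sqrt(1.0 / a - 1.0 / n1 + 1.0 / c - 1.0 / n0)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed are diseased.
rr, low, high = rr_log_ci(a=30, n1=100, c=10, n0=100)
print(rr, low, high)
```

The upper limit lies farther from the point estimate than the lower limit does, which is the asymmetry that the MOVER limits carry over to RERI and AP.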
Furthermore, denoting $\theta_1 = \ln(RR_{11} - 1)$ and $\theta_2 = -\ln(RR_{10} + RR_{01} - 2)$, Appendix equations A5 and A6 may be applied to ln SI that, in turn, can be used to obtain confidence limits for SI. With the expressions in figures 1–3, the results for SI will be identical to those obtained by use of the methods proposed by Rothman (1).
| SIMULATION STUDY |
|---|
Despite the justification provided in the Appendix, the proposed procedure for measures of interaction is based on asymptotic theory. Simulation studies were therefore undertaken to evaluate its performance.
For AP, a method based on ln(1 – AP) (18) was also included. The studies were performed in the context of a case-control design, with the understanding that the statistical theory is identical regardless of whether the OR, RR, or hazard ratio is selected as the effect measure.
The first study used 20 OR combinations (2 values of RR10 × 2 values of RR01 × 5 values of RR11) and a sample size of 250 in each of the case and control groups, as in the study reported by Assmann et al. (13). The MOVER approach was compared with the SA approach (8) and the bias-corrected and accelerated (BCa) bootstrap approach (3, pp. 184–188). For each parameter combination, 1,000 replicates were performed. The number of resamples for the bootstrap was also set to 1,000. The proportions of control subjects exposed to G alone, E alone, and both G and E were 0.1, 0.2, and 0.1, respectively. The exposure probability distribution for the case subjects was then calculated by use of the specific values of RR11, RR10, and RR01. Data for the cases and controls were generated separately from multinomial distributions. A value of 0.5 was added to cells with a count of 0 so that ORs could be calculated.
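The data-generating step described above can be sketched in Python as follows. This is only an illustration of the design, not the author's simulation code: the odds ratios are made-up values (the 20 combinations actually used are not listed in the text), and the helper names `case_exposure_probs` and `draw_counts` are hypothetical. The case distribution is obtained by weighting each control exposure probability by the corresponding odds ratio and renormalizing.

```python
import random

CELLS = ("11", "10", "01", "00")  # G-E combinations; "10" means G alone


def case_exposure_probs(control_probs, odds_ratios):
    """Exposure distribution among cases implied by the control
    distribution and the odds ratios (reference cell "00" has OR = 1)."""
    weights = {cell: control_probs[cell] * odds_ratios.get(cell, 1.0) for cell in CELLS}
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}


def draw_counts(probs, n, rng):
    """Draw one multinomial sample of size n; add 0.5 to empty cells."""
    draws = rng.choices(CELLS, weights=[probs[c] for c in CELLS], k=n)
    counts = {cell: float(draws.count(cell)) for cell in CELLS}
    return {cell: (v if v > 0 else 0.5) for cell, v in counts.items()}


rng = random.Random(2008)  # fixed seed so the sketch is reproducible
control_probs = {"11": 0.1, "10": 0.1, "01": 0.2, "00": 0.6}
odds_ratios = {"11": 4.0, "10": 2.0, "01": 2.0}  # illustrative values only

case_probs = case_exposure_probs(control_probs, odds_ratios)
cases = draw_counts(case_probs, 250, rng)
controls = draw_counts(control_probs, 250, rng)

# Sample odds ratio for joint exposure versus no exposure.
or11 = (cases["11"] * controls["00"]) / (cases["00"] * controls["11"])
print(case_probs, or11)
```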
An additional simulation study without the bootstrap was conducted to see whether the SA approach could perform reasonably well in sample sizes of 1,000 in each of the case and control groups.
A third simulation study was performed to assess the performance of the MOVER method compared with the SA method in situations with small exposure probabilities. With the other parameters set as in the first simulation, the probabilities of controls exposed to G alone, E alone, and both G and E were 0.05, 0.05, and 0.05, respectively.
Because the main focus was on the extent to which the empirical coverage of the CI matched with the nominal 95 percent level, the first criterion was whether or not the coverage rate was within the range of 93.6–96.4 percent. The difference between the two miscoverage rates was the second criterion, where smaller differences were preferred. The reason to set balanced miscoverage errors as the second criterion is that a CI should contain possible parameter values that are not too large and not too small. An advertised 95 percent CI is supposed to miss about 2.5 percent from each side.
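The range of 93.6–96.4 percent is consistent with the nominal level plus or minus 1.96 binomial standard errors for 1,000 replicates. The text does not state how the range was chosen, so the short check below rests on that assumption.

```python
import math

nominal = 0.95     # advertised coverage
replicates = 1000  # simulated data sets per parameter combination

# Simulation standard error of an estimated coverage proportion.
half_width = 1.96 * math.sqrt(nominal * (1.0 - nominal) / replicates)
low, high = nominal - half_width, nominal + half_width
print(f"{100 * low:.1f} to {100 * high:.1f} percent")  # 93.6 to 96.4 percent
```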
The coverage rate for each method was calculated as the proportion of the 1,000 CIs constructed that contained the values of the additive interaction. The left miscoverage rate was obtained by calculating the proportion of the upper limits that were less than the parameter value, while the right miscoverage rate was obtained as the proportion of the lower limits larger than the parameter value.
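The three summary quantities just defined can be computed from a list of simulated intervals as in the following sketch. The function name `coverage_summary` and the four toy intervals are hypothetical.

```python
def coverage_summary(intervals, true_value):
    """Coverage and one-sided miscoverage rates for a list of (lower, upper) CIs."""
    n = len(intervals)
    covered = sum(1 for lo, hi in intervals if lo <= true_value <= hi)
    # Left miscoverage: the whole interval lies below the parameter value.
    left = sum(1 for lo, hi in intervals if hi < true_value)
    # Right miscoverage: the whole interval lies above the parameter value.
    right = sum(1 for lo, hi in intervals if lo > true_value)
    return {"coverage": covered / n, "left": left / n, "right": right / n}


# Four toy intervals evaluated against a true parameter value of 1.0.
summary = coverage_summary([(0.0, 2.0), (1.5, 3.0), (-1.0, 0.5), (0.5, 1.5)], 1.0)
print(summary)
```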
The results in table 1 show that, for RERI, the SA approach missed the target coverage range of 93.6–96.4 percent in 14 of 20 cases. The poor performance is more pronounced when the miscoverage rates are considered. In contrast, the MOVER approach provided coverage rates within the range in all cases but one, for which the rate was 96.5 percent. The overall performance is very comparable to that of the bootstrap.
For AP, the SA approach actually provides overall coverage rates that are within the range, but in a lop-sided manner. In particular, the high right miscoverage rates indicate that this approach tends to provide lower confidence limits that are too high. A possible consequence is false positive results. Table 1 also demonstrates that the MOVER approach and the ln(1 – AP) approach provided slightly better coverage results, but the miscoverage rates are not balanced as is the case for the bootstrap approach. Nonetheless, these miscoverage rates seem to be reasonable from a practical perspective. Table 2 shows that increasing sample sizes to 2,000 subjects can have only a limited effect in improving the performance of the SA approach, especially when the miscoverage rates are considered.
As predicted by the theoretical results above, further simulation results with small exposure probabilities (table 3) demonstrate that, for RERI, the MOVER approach performed satisfactorily, while the SA approach deteriorated. Interestingly, the ln(1 – AP) approach performed very well. These results also demonstrate that there exists room for improvement in the case of AP when the exposure probabilities are small. Since the MOVER approach draws its validity from the confidence limits for ORs, future research may focus on adopting better CIs for OR (19) or for RR (20).
| EXAMPLES |
|---|
Example 1: negative confidence limits for ORs used by the SA method to obtain those for RERI
This example concerns smoking and alcohol use in relation to oral cancer among male veterans (8; 14, chapter 15). The data are presented in a four-by-two table in figure 4. An application of the naive SA method results in a 95 percent CI of –1.83, 9.31 for RERI (8). As discussed above, this CI is a consequence of applying symmetric intervals for ORs in equations 1 and 2. In other words, the SA method has implicitly used a symmetric interval for OR given by
|
Example 2: falsely claimed interaction resulting from the SA method
Consider a data set arising from a case-cohort design in the Atherosclerosis Risk in Communities (ARIC) Study (21, 22), where it is of interest to determine the interaction between a susceptibility genotype, glutathione S-transferase M1 polymorphism (GSTM1), and smoking on the risk of incident coronary heart disease. A total of 458 incident cases of coronary heart disease occurred in the population of 14,239 eligible participants during the period from 1989 to the end of 1993. A cohort of 986 participants, including 36 incident cases, was selected from the eligible population. After exclusion of 118 subjects with missing GSTM1 data, the final sample of 1,290 with the outcome variable "time to coronary heart disease diagnosis" was analyzed by Cox proportional hazards regression, taking into account the feature of the case-cohort design by using a weighting scheme (23). Specifically, the weights in the denominator of the pseudolikelihood are one for cases that arise outside the subcohort and the inverse of the sampling fraction for subcohort controls. In addition, the subcohort cases are weighted by the inverse of the sampling fraction before failure and by one at failure. Valid variances can then be estimated using the sandwich error approach (23).
After adjustment for 10 covariates, it was reported (12) that the estimated coefficients for the GSTM1 susceptibility genotype (yes/no), ever smoking (yes/no), and their product term are given, respectively, by $\hat\beta_1 = 0.0543$, $\hat\beta_2 = 0.2826$, and $\hat\beta_3 = 0.5869$. Application of the MOVER approach with the use of figure 5 results in a 95 percent CI for RERI of –0.013, 2.505. On the basis of the simulation results presented above, one should doubt that "we found a statistically significant additive interaction between susceptibility genotype and ever smoking for the risk of incident CHD [coronary heart disease]" (12, p. 232), which was based on the CI of 0.052, 2.222 that was derived using the SA method. With regard to the attributable proportion due to interaction, the MOVER approach provided a 95 percent CI of –0.023, 0.729, which is very comparable to that from the ln(1 – AP) transformation method (–0.025, 0.706) but very different from the one provided by the SA method (0.108, 0.794). (Note that reference 12 contains errors in the expression for
). Again, there is insufficient evidence to suggest additive interaction as claimed (12, 22).
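The point estimates behind these intervals follow directly from the three reported coefficients, as the short sketch below illustrates. The variable names are illustrative; the confidence limits are not reproduced here because they also require the variances and covariances of the coefficients, which are not given in the text.

```python
import math

# Reported coefficients: genotype, ever smoking, and their product term.
b_gene, b_smoke, b_product = 0.0543, 0.2826, 0.5869

rr10 = math.exp(b_gene)                        # genotype alone
rr01 = math.exp(b_smoke)                       # smoking alone
rr11 = math.exp(b_gene + b_smoke + b_product)  # both factors

reri = rr11 - rr10 - rr01 + 1.0
ap = reri / rr11
print(round(reri, 3), round(ap, 3))
```

The resulting values, roughly 1.14 for RERI and 0.45 for AP, sit at the midpoints of the symmetric SA intervals quoted above (0.052, 2.222 and 0.108, 0.794), whereas the MOVER intervals are not centered on them.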
Example 3: exaggerated interaction using ORs in a cohort study
This data set arose from a cohort study in which it was of interest to investigate the effect of age and body mass index (weight (kg)/height (m)²) on diastolic blood pressure (17). To form a four-by-two table, we coded age ≥40 years as 1 and age <40 years as 0, while body mass index ≥25 was coded as 1 and body mass index <25 as 0. The outcome, diastolic blood pressure ≥90 mmHg, was classified as hypertension and coded as 1, and <90 mmHg was coded as 0. The four-by-two table and associated calculation are given in figure 5. Although the RERI = 1.3 in terms of RRs, the data were analyzed by logistic regression with the percentile bootstrap, resulting in a RERI of 2.7 (95 percent CI: 1.3, 4.4) (17). As the measures of interaction are defined in terms of RRs, it is much more appropriate to discuss additive interaction in terms of RRs when it is possible, using either regression programs (15, 16) or the formulas presented here (figure 5). With figure 5, the estimated RERI is 1.34 (95 percent CI: 0.31, 2.37), and AP is 21.5 percent (95 percent CI: 5.6, 34.6). When estimating measures of interaction in terms of ORs, the new approach would result in RERI = 2.71 (95 percent CI: 1.25, 4.45) and AP = 33.0 percent (95 percent CI: 16.1, 46.0). Although the direction of the interaction would be unchanged, the magnitude would be exaggerated if ORs were used as the effect measure (17). The intuitive explanation is that the first term in RERI is a product of three RRs, and thus a slight exaggeration of each will result in a large overestimation of RERI.
| CONCLUDING REMARKS |
|---|
This article has proposed a simple approach to construction of confidence intervals for measures of additive interaction. This approach works because it acknowledges the fact that confidence limits for risk ratios are asymmetric. The article has also demonstrated that one can appropriately analyze the four-by-two table without having to use a statistical program. In the case of multivariable models, there is no need to recode the risk variables prior to using a regression program (8–11).
Furthermore, this article has shown that the RR should always be the first choice of effect measure for single risk factors because, as shown in the third example (17), the exaggeration of the OR can be more pronounced in assessing additive interaction. Regression models resulting in RR should be adopted if covariate adjustment is desired (15, 16, 24).
Although additive regression models are available for assessing additive interaction (25), multiplicative models are still more accessible and commonly used by epidemiologists (26), even when assessment of additive interaction is desired (14, chapter 15). This may be in part because additive models require specialized software to fit, and in part because it is straightforward to estimate measures of additive interaction using routinely available output from multiplicative software (14, chapter 15). The results in this article help to remove the obstacle of CI construction for measures of additive interaction and thus help to meet a real challenge of gene-environment interaction, that is, conducting appropriate inferences for disease prevention (5–7).
| APPENDIX |
|---|
Construction of Confidence Interval for Linear Functions of Parameters
Recall that the asymmetric confidence limits for RRs are readily available either by hand calculation or from regression programs. The strategy here is to use these limits to recover the variance estimates, without destroying the asymmetric feature of sampling distribution for RRs, in setting confidence intervals for a linear function of several RRs. The underlying principle has been discussed in the case of constructing CIs for differences between two parameters in general (20) and applied to correlations in particular (27). A summary is presented here followed by a generalized framework for a linear combination of parameters.
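To begin, consider construction of a $(1 - \alpha)100$ percent two-sided confidence interval for $\theta_1 + \theta_2$, where the two estimates $\hat\theta_1$ and $\hat\theta_2$ are for the moment assumed to be independently distributed. The lower limit may be given by

$$L = \hat\theta_1 + \hat\theta_2 - z_{\alpha/2}\sqrt{\mathrm{var}(\hat\theta_1) + \mathrm{var}(\hat\theta_2)} \tag{A1}$$

and the upper limit by

$$U = \hat\theta_1 + \hat\theta_2 + z_{\alpha/2}\sqrt{\mathrm{var}(\hat\theta_1) + \mathrm{var}(\hat\theta_2)}, \tag{A2}$$

where $z_{\alpha/2}$ denotes the upper $\alpha/2$ quantile from the standard normal distribution.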
Equations A1 and A2 contain unknown terms $\mathrm{var}(\hat\theta_i)$ (i = 1, 2), which may be estimated by two approaches: one assumes that $\mathrm{var}(\hat\theta_i)$ is independent of $\theta_i$, while the other makes no such assumption. Confidence limits from the former are symmetric, and those from the latter are asymmetric and usually perform better (3, p. 180). The focus here is to derive asymmetric confidence intervals since the variance for the estimated RR is a function of RR itself.
By the duality between hypothesis testing and confidence interval construction, a $(1 - \alpha)100$ percent two-sided CI should contain all such parameter values that cannot be rejected by a test at the $\alpha$ level (3, p. 157). In other words, L is the minimum value of $\theta_1 + \theta_2$ satisfying

$$\frac{\left[(\hat\theta_1 + \hat\theta_2) - (\theta_1 + \theta_2)\right]^2}{\mathrm{var}(\hat\theta_1) + \mathrm{var}(\hat\theta_2)} \le z_{\alpha/2}^2 .$$

For $\theta_1 + \theta_2$, the variance estimate for obtaining L should be estimated in the neighborhood of L, or $\min(\theta_1) + \min(\theta_2)$. Among the plausible values provided by the two pairs of confidence limits ($l_1, u_1$ and $l_2, u_2$), $\theta_1 + \theta_2 = l_1 + l_2$ is close to L. This implies that the variance can be estimated at $\theta_1 = l_1$ and $\theta_2 = l_2$.
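Again, by the duality of hypothesis testing and confidence interval construction, the limit $l_i$ satisfies $l_i = \hat\theta_i - z_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\hat\theta_i)}$, so that the variance estimate at $\theta_i = l_i$ is

$$\widehat{\mathrm{var}}(\hat\theta_i) = \frac{(\hat\theta_i - l_i)^2}{z_{\alpha/2}^2}.$$

Substituting these estimates into equation A1 gives

$$L = \hat\theta_1 + \hat\theta_2 - \sqrt{(\hat\theta_1 - l_1)^2 + (\hat\theta_2 - l_2)^2}. \tag{A3}$$

Analogous steps, with the variances estimated at $\theta_i = u_i$, give

$$U = \hat\theta_1 + \hat\theta_2 + \sqrt{(u_1 - \hat\theta_1)^2 + (u_2 - \hat\theta_2)^2}. \tag{A4}$$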
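Equations A3 and A4 may be extended to $\theta_1 - \theta_2 = \theta_1 + (-\theta_2)$ by recognizing that the confidence limits for $-\theta_i$ are given by $(-u_i, -l_i)$. These equations may also be extended to incorporate dependency between $\hat\theta_1$ and $\hat\theta_2$. Let $r = \widehat{\mathrm{corr}}(\hat\theta_1, \hat\theta_2)$, and then a confidence interval for $\theta_1 + \theta_2$ is given by

$$L = \hat\theta_1 + \hat\theta_2 - \sqrt{(\hat\theta_1 - l_1)^2 + (\hat\theta_2 - l_2)^2 + 2r(\hat\theta_1 - l_1)(\hat\theta_2 - l_2)} \tag{A5}$$

and

$$U = \hat\theta_1 + \hat\theta_2 + \sqrt{(u_1 - \hat\theta_1)^2 + (u_2 - \hat\theta_2)^2 + 2r(u_1 - \hat\theta_1)(u_2 - \hat\theta_2)}. \tag{A6}$$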
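By mathematical induction (28, pp. 28–29), it can be shown that a $(1 - \alpha)100$ percent confidence interval (L, U) for $\sum_{i=1}^{k} c_i\theta_i$, where the $c_i$'s are constants, is given by

$$L = \sum_{i=1}^{k} c_i\hat\theta_i - \sqrt{\sum_{i=1}^{k}\left(c_i\hat\theta_i - l_i'\right)^2 + 2\sum_{i<j} r_{ij}'\left(c_i\hat\theta_i - l_i'\right)\left(c_j\hat\theta_j - l_j'\right)} \tag{A7}$$

and

$$U = \sum_{i=1}^{k} c_i\hat\theta_i + \sqrt{\sum_{i=1}^{k}\left(u_i' - c_i\hat\theta_i\right)^2 + 2\sum_{i<j} r_{ij}'\left(u_i' - c_i\hat\theta_i\right)\left(u_j' - c_j\hat\theta_j\right)}, \tag{A8}$$

where $l_i' = \min(c_i l_i, c_i u_i)$ and $u_i' = \max(c_i l_i, c_i u_i)$ are the confidence limits for $c_i\theta_i$, and $r_{ij}' = \widehat{\mathrm{corr}}(c_i\hat\theta_i, c_j\hat\theta_j)$, which equals the estimated correlation between $\hat\theta_i$ and $\hat\theta_j$ multiplied by the sign of $c_i c_j$.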
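A general routine for a linear combination of parameters might look like the following Python sketch. It assumes the limits are built from the distances between each $c_i\hat\theta_i$ and its own confidence limits, with the sign of $c_i c_j$ applied to the correlation of $\hat\theta_i$ and $\hat\theta_j$; the function name `mover_linear`, the optional constant term, and the numerical inputs are hypothetical.

```python
import math


def mover_linear(coefs, est, lower, upper, corr=None, const=0.0):
    """MOVER confidence limits for const + sum_i coefs[i] * theta_i.

    coefs: constants c_i.  est, lower, upper: estimates and confidence
    limits of each theta_i.  corr: matrix of estimated correlations
    between the estimates (independence is assumed when omitted).
    """
    k = len(coefs)
    if corr is None:
        corr = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    point = const + sum(c * t for c, t in zip(coefs, est))

    # Distance from c_i * theta_i down to its lower limit and up to its
    # upper limit; a negative c_i swaps the roles of l_i and u_i.
    down = [c * t - min(c * l, c * u) for c, t, l, u in zip(coefs, est, lower, upper)]
    up = [max(c * l, c * u) - c * t for c, t, l, u in zip(coefs, est, lower, upper)]

    def half_width(dist):
        total = 0.0
        for i in range(k):
            for j in range(k):
                r = 1.0 if i == j else corr[i][j]
                sign = 1.0 if coefs[i] * coefs[j] >= 0 else -1.0
                total += sign * r * dist[i] * dist[j]
        return math.sqrt(max(total, 0.0))  # guard against rounding error

    return point - half_width(down), point + half_width(up)


# Hypothetical inputs: RERI = 1 + theta1 - theta2 - theta3 with
# independent estimates and asymmetric limits.
L, U = mover_linear(
    coefs=(1, -1, -1), est=(4.0, 2.0, 1.5),
    lower=(2.5, 1.2, 0.9), upper=(6.4, 3.3, 2.5), const=1.0,
)
print(L, U)
```

Setting `coefs=(1, 1)` with a 2 × 2 correlation matrix gives limits of the form in equations A5 and A6, and `coefs=(1, -1, -1)` with `const=1.0` gives the combination used for RERI and AP.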
| ACKNOWLEDGMENTS |
|---|
Guang Yong Zou is a recipient of the Early Researcher Award, Ontario Ministry of Research and Innovation, Canada. This work was also partially supported by an Individual Discovery Grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada.
The author gratefully acknowledges Julia Taleban for comments on drafts of the manuscript and help on the Excel spreadsheet.
Conflict of interest: none declared.
| REFERENCES |
|---|
1. Rothman KJ. The estimation of synergy or antagonism. Am J Epidemiol (1976) 103:506–11.
2. Botto LD, Khoury MJ. Commentary: facing the challenge of gene-environment interaction: the two-by-four table and beyond. Am J Epidemiol (2001) 153:1016–20.
3. Efron B, Tibshirani RJ. An introduction to the bootstrap. (1993) New York, NY: Chapman & Hall/CRC.
4. Rothman KJ. Causes. Am J Epidemiol (1976) 104:587–92.
5. Rothman KJ, Greenland S, Walker AM. Concepts of interaction. Am J Epidemiol (1980) 112:467–70.
6. Darroch J. Biologic synergy and parallelism. Am J Epidemiol (1997) 145:661–8.
7. Szklo M, Nieto FJ. Epidemiology: beyond the basics. 2nd ed. (2006) Sudbury, MA: Jones and Bartlett Publishers, Inc.
8. Hosmer DW, Lemeshow S. Confidence interval estimation of interaction. Epidemiology (1992) 3:452–6.
9. Lundberg M, Fredlund P, Hallqvist J, et al. A SAS program calculating three measures of interaction with confidence intervals. (Letter). Epidemiology (1996) 7:655–6.
10. Andersson T, Alfredsson L, Kallberg H, et al. Calculating measures of biological interaction. Eur J Epidemiol (2005) 20:575–9.
11. Kallberg H, Ahlbom A, Alfredsson L. Calculating measures of biological interaction using R. Eur J Epidemiol (2006) 21:571–3.
12. Li R, Chambless L. Test for additive interaction in proportional hazards models. Ann Epidemiol (2007) 17:227–36.
13. Assmann SF, Hosmer DW, Lemeshow S, et al. Confidence intervals for measures of interaction. Epidemiology (1996) 7:286–90.
14. Rothman KJ. Modern epidemiology. (1986) Boston, MA: Little, Brown & Co.
15. Zou GY. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol (2004) 159:702–6.
16. Spiegelman D, Hertzmark E. Easy SAS calculations for risk or prevalence ratios and differences. Am J Epidemiol (2005) 162:199–200.
17. Knol MJ, van der Tweel I, Grobbee DE, et al. Estimating interaction on an additive scale between continuous determinants in a logistic regression model. Int J Epidemiol (2007) 36:1111–18.
18. Walker AM. Proportion of disease attributable to the combined effect of two factors. Int J Epidemiol (1981) 10:81–5.
19. Gart JJ, Thomas DG. The performance of three approximate confidence limit methods for the odds ratio. Am J Epidemiol (1982) 115:453–70.
20. Zou GY, Donner A. Construction of confidence limits about effect measures: a general approach. Stat Med (2008) 27:1693–702.
21. The ARIC Investigators. The Atherosclerosis Risk in Communities (ARIC) Study. Am J Epidemiol (1989) 129:687–702.
22. Li R, Boerwinkle E, Olshan AF, et al. Glutathione S-transferase genotype as a susceptibility factor in smoking-related coronary heart disease. Atherosclerosis (2000) 149:451–62.
23. Barlow WE. Robust variance estimation for the case-cohort design. Biometrics (1994) 50:1064–72.
24. Wacholder S. Binomial regression in GLIM: estimating risk ratios and risk differences. Am J Epidemiol (1986) 123:174–84.
25. Greenland S. Tests for interaction in epidemiologic studies: a review and a study of power. Stat Med (1983) 2:243–51.
26. Levy PS, Stolte K. Statistical methods in public health and epidemiology: a look at the recent past and projections for the next decade. Stat Methods Med Res (2000) 9:41–55.
27. Zou GY. Toward using confidence intervals to compare correlations. Psychol Methods (2007) 12:399–413.
28. Kolmogorov AN. Introductory real analysis. Translated by Silverman RA. (1975) New York, NY: Dover Publications.