American Journal of Epidemiology Advance Access originally published online on September 27, 2006
American Journal of Epidemiology 2006 164(12):1242-1250; doi:10.1093/aje/kwj335
The Question of Nonlinearity in the Dose-Response Relation between Particulate Matter Air Pollution and Mortality: Can Akaike's Information Criterion be Trusted to Take the Right Turn?
From the School of Finance and Applied Statistics, College of Business and Economics, Australian National University, Canberra, Australia
Correspondence to Dr. Steven Roberts, School of Finance and Applied Statistics, College of Business and Economics, The Australian National University, Canberra ACT 0200, Australia (e-mail: steven.roberts{at}anu.edu.au).
Received for publication November 7, 2005. Accepted for publication May 8, 2006.
| ABSTRACT |
|---|
The shape of the dose-response relation between particulate matter air pollution and mortality is crucial for public health assessment, and departures of this relation from linearity could have important regulatory consequences. A number of investigators have studied the shape of the particulate matter-mortality dose-response relation and concluded that the relation could be adequately described by a linear model. Some of these researchers examined the hypothesis of linearity by comparing Akaike's Information Criterion (AIC) values obtained under linear, piecewise linear, and spline alternative models. However, at the current time, the efficacy of the AIC in this context has not been assessed. The authors investigated AIC as a means of comparing competing dose-response models, using data from Cook County, Illinois, for the period 1987–2000. They found that if nonlinearities exist, the AIC is not always successful in detecting them. In a number of the scenarios considered, AIC was equivocal, picking the correct simulated dose-response model about half of the time. These findings suggest that further research into the shape of the dose-response relation using alternative model selection criteria may be warranted.
air pollution; epidemiologic methods; models, statistical; mortality
Abbreviations: AIC, Akaike's Information Criterion; PM, particulate matter
| INTRODUCTION |
|---|
The majority of investigators who have studied the relation between particulate matter (PM) air pollution and mortality have assumed a linear dose-response relation, that is, that the incremental effect of an increase in PM on mortality is the same for all PM exposures (1). However, knowledge about the shape of the dose-response relation between PM and mortality is crucial for public health assessment (2), and departures of the relation from linearity could have important consequences. For example, if there is a threshold level below which PM exposure has no effect on mortality, regulations that reduce PM exposure that is already below this level will have little public health impact. Alternatively, if there is a change-point level above which the incremental effect of PM exposure on mortality increases or decreases, regulations designed to reduce PM exposure based on a simple linear dose-response model may not, in fact, have their intended public health consequences, depending on how mortality responds to exposure levels beyond the change point.
A number of recent studies have investigated the shape of the dose-response relation using a variety of methods ranging from parametric piecewise linear and spline models to nonparametric smoothing (1, 3–7). Similar methods have also been used to investigate the shape of the dose-response relation between emergency hospital admissions for myocardial infarction and PM exposure (8). The goal of using these techniques is to allow more flexibility than is permitted by a linear model, so that nonlinearities in the dose-response relation can be captured. For the most part, these investigators have concluded that the relation between PM and mortality is adequately described by a linear model. In a number of studies this conclusion was based partly on comparing linear and nonlinear models using Akaike's Information Criterion (AIC) (1, 3, 4, 9). The AIC is a model selection criterion that combines a measure of the lack of fit of a particular model with a penalty for the number of parameters in the model, models with smaller AIC values being preferred. However, at the current time, the efficacy of the AIC for the purpose of selecting an appropriate dose-response model relating PM exposure and mortality has not been assessed. We addressed this shortfall in the literature by investigating AIC as a means of comparing linear and nonlinear dose-response models to determine which is more appropriate for modeling the relation between PM and mortality.
| MATERIALS AND METHODS |
|---|
The data used in this paper were obtained from the publicly available database of the National Morbidity, Mortality, and Air Pollution Study, funded by the Health Effects Institute (Boston, Massachusetts). The data we extracted consisted of daily time series of mortality, weather, and PM for Cook County, Illinois, for the period 1987–2000. The mortality time-series data were the number of nonaccidental deaths of persons aged 65 years or older. The weather time-series data were the 24-hour averages of temperature and dew point temperature. The PM data were the ambient 24-hour concentrations of particulate matter less than 10 µm in diameter, measured in units of µg/m3. Table 1 provides summary statistics for the PM, weather, and mortality data available from Cook County for the period 1987–2000. Further details on the data used can be obtained at the website of the Internet-based Health and Air Pollution Surveillance System (http://www.ihapss.jhsph.edu/).
To evaluate AIC as a method for testing the hypothesis of linearity in the PM-mortality dose-response relation, we needed to generate mortality time series for which the true relation was known. This was achieved using a method which has previously been shown to generate realistic mortality time series that are modulated by time-varying covariates (10). In this case, the covariates were the actual Cook County concurrent and lagged temperature, concurrent and lagged dew point temperature, day-of-the-week effects, and parametric slow-changing time trends. The method proceeds by estimating the effect of the covariates on mortality and adding to these estimated effects an explicitly specified PM-mortality dose-response relation. The model fitted to the Cook County mortality time series to estimate the effect of the covariates on mortality was
[Expression 1: equation image not recoverable]
The explicitly specified dose-response relation g(PMt) took two forms in our study. The first was a "no-threshold" piecewise linear relation (no-threshold) with a change point at a PM exposure of either 25 µg/m3 or 50 µg/m3, with PM exposures above the change point having a different incremental effect on mortality than those below the change point. The second was a "threshold" piecewise linear relation (threshold) with PM exposure below 25 µg/m3 having no effect on mortality and PM exposure above 50 µg/m3 having a different incremental effect on mortality than exposures between 25 µg/m3 and 50 µg/m3. The no-threshold and threshold dose-response relations can be expressed as follows:
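$$g(\mathrm{PM}_t) = \beta_1 \mathrm{PM}_t + \beta_2 (\mathrm{PM}_t - \tau)_+ \quad \text{(no-threshold)}$$

$$g(\mathrm{PM}_t) = \beta_1 (\mathrm{PM}_t - 25)_+ + \beta_2 (\mathrm{PM}_t - 50)_+ \quad \text{(threshold)}$$

where (x)+ equals x when x is positive and 0 otherwise, and τ is the change point (25 µg/m3 or 50 µg/m3). For the no-threshold model, a one-unit increment in PM below the change point corresponds to a 100 × [exp(β1) − 1] ≈ 100 × β1 percent increase in expected mortality. For a one-unit increment in PM above the change point, expected mortality increases by 100 × [exp(β1 + β2) − 1] ≈ 100 × (β1 + β2) percent. For the threshold model, PM exposures below 25 µg/m3 have no effect on mortality. For PM exposures between 25 µg/m3 and 50 µg/m3, expected mortality increases by 100 × [exp(β1) − 1] ≈ 100 × β1 percent for each one-unit increment in PM. For PM exposures above 50 µg/m3, the corresponding value is given by 100 × [exp(β1 + β2) − 1] ≈ 100 × (β1 + β2) percent. Figure 1 depicts mortality time series generated using the no-threshold model with (β1 = 0.0005, β2 = 0.0005, τ = 25) and the threshold model with (β1 = 0.0005, β2 = 0.0005), together with the actual Cook County mortality time series for comparison. From this figure, it is evident that the generated mortality time series are reasonable facsimiles of the actual Cook County mortality time series, retaining the key features of that series.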
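The two piecewise linear relations can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names (`hinge`, `g_no_threshold`, `g_threshold`, `percent_increase`) are invented here, and the symbol τ for the change point is an assumption about notation. The sketch evaluates g at a given PM concentration and converts it to a percent increase in expected mortality relative to a PM concentration of 0 µg/m3.

```python
import math


def hinge(x, knot):
    """Return (x - knot)+, i.e. x - knot when positive and 0 otherwise."""
    return max(x - knot, 0.0)


def g_no_threshold(pm, beta1, beta2, tau):
    """No-threshold relation: slope beta1 below tau, beta1 + beta2 above it."""
    return beta1 * pm + beta2 * hinge(pm, tau)


def g_threshold(pm, beta1, beta2, lower=25.0, upper=50.0):
    """Threshold relation: no effect below `lower`, slope beta1 between the
    two change points, and slope beta1 + beta2 above `upper`."""
    return beta1 * hinge(pm, lower) + beta2 * hinge(pm, upper)


def percent_increase(g_value):
    """Percent increase in expected mortality relative to PM = 0."""
    return 100.0 * (math.exp(g_value) - 1.0)


if __name__ == "__main__":
    for pm in (25.0, 75.0):
        no_thr = percent_increase(g_no_threshold(pm, 0.001, 0.0005, 25.0))
        thr = percent_increase(g_threshold(pm, 0.001, 0.0005))
        print(f"PM={pm:5.1f}  no-threshold: {no_thr:.2f}%  threshold: {thr:.2f}%")
```

With (β1 = 0.001, β2 = 0.0005, τ = 25), the no-threshold relation gives exp(0.025) − 1, about 2.53 percent, at a PM concentration of 25 µg/m3.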
To estimate the shape of the actual dose-response relation g(PMt), the following Poisson log-linear model was fitted to each generated mortality time series:
$$\log(\mu_t) = f(\mathrm{PM}_t) + \text{confounders}_t$$

where μt is the mean mortality count on day t and confounders_t has exactly the same specification as the confounders_t term used to generate mortality in expression 1. Here, f(PMt) is the form of the dose-response relation used to model mortality, that is, to estimate g(PMt), where PMt is the current day's 24-hour average PM concentration. Four candidate forms of f(PMt) were considered: 1) linear, f(PMt) = γPMt; 2) one-change-point piecewise linear, f(PMt) = γ1PMt + γ2(PMt − τ)+; 3) spline, f(PMt) = ns(PMt, knots = c(25,50)); and 4) two-change-point piecewise linear, f(PMt) = γ1PMt + γ2(PMt − 25)+ + γ3(PMt − 50)+. The one-change-point piecewise linear model was fitted allowing the change point τ to take values of 5–75 µg/m3 inclusive, using increments of 5 µg/m3. This "grid" method has been used in previous studies to fit a single change-point piecewise linear model (3). Note that the no-threshold model can be couched as a one-change-point piecewise linear model. The spline model was fitted using natural cubic splines with knots at 25 µg/m3 and 50 µg/m3, the same locations as the change points in the assumed threshold relation. Natural cubic splines are cubic polynomials within each pair of knots with smooth connections between adjoining segments. The two-change-point piecewise linear model was considered in addition to the spline model because the threshold model can be considered a two-change-point piecewise linear model with change points at 25 µg/m3 and 50 µg/m3. Piecewise linear models with one change point and spline models with two fixed knots have been commonly used in the literature to estimate the shape of the dose-response relation (3, 7). In this literature, spline models with two fixed knots have been preferred over piecewise linear models with two change points.

AIC has been used by some investigators to test the hypothesis of a linear dose-response relation (1, 3, 4). In these studies, AIC was computed for each competing model as −2 × log(likelihood) + 2 × (no. of parameters). The model with the smallest AIC value was then chosen as the "best." If the linear model had the smallest AIC value, this was used as evidence to support the conclusion that the dose-response relation was adequately described by a linear model. The first term in AIC is a measure of the lack of fit of the model under consideration, and the second is a penalty for the number of parameters in the model, designed to prevent overfitting. The use of a penalty term in model selection criteria such as AIC is useful in our context, since both the piecewise linear model and the spline model have additional parameters over and above those in the linear model.
Through these additional parameters, the piecewise linear and spline models will always fit the mortality data at least as well as the linear model can. The goal of using the penalty term is to try to ensure that models with additional parameters are selected only if these parameters improve the fit to the data by more than the penalty they incur. In a situation where the no-threshold model represents the true dose-response relation, AIC should prefer the one-change-point piecewise linear model over competing models, as the no-threshold relation is a one-change-point piecewise linear model. Similarly, AIC should prefer the two-change-point piecewise linear model over competing models when the threshold model represents the true dose-response relation.
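The AIC comparison described above can be illustrated with a short Python sketch. It is a hedged illustration only: the helper names (`poisson_loglik`, `aic`, `candidate_models`, `select_by_aic`) are hypothetical, and the parameter counts cover just the dose-response coefficients, on the assumption that the confounder terms are common to every candidate and therefore add the same penalty to each.

```python
import math


def poisson_loglik(counts, means):
    """Poisson log-likelihood of observed counts given fitted means."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1)
               for y, mu in zip(counts, means))


def aic(loglik, n_params):
    """AIC = -2 x log(likelihood) + 2 x (no. of parameters)."""
    return -2.0 * loglik + 2.0 * n_params


def candidate_models():
    """Candidates for the no-threshold simulations, mapped to the number
    of dose-response coefficients (assumed counts, see lead-in)."""
    models = {"linear": 1, "spline(knots=25,50)": 3}
    for tau in range(5, 80, 5):  # change points 5, 10, ..., 75 ug/m3
        models[f"one-change-point(tau={tau})"] = 2
    return models


def select_by_aic(logliks, n_params):
    """Return the name of the model with the smallest AIC, plus all scores."""
    scores = {name: aic(logliks[name], n_params[name]) for name in logliks}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Made-up log-likelihoods: the extra parameter improves the fit by
    # only 0.5, less than the penalty of 1 per parameter on this scale.
    logliks = {"linear": -100.0, "one-change-point(tau=25)": -99.5}
    counts = {"linear": 1, "one-change-point(tau=25)": 2}
    print(select_by_aic(logliks, counts))
```

In the toy call, the piecewise model raises the log-likelihood by 0.5 but pays for one extra parameter, so its AIC (203) exceeds that of the linear model (202) and the linear model is selected.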
We conducted simulations to investigate the efficacy of AIC for testing the hypothesis of linearity in the dose-response relation. We generated mortality time series using the no-threshold or threshold models and a range of (β1, β2) combinations. For each dose-response relation type g(PMt) and (β1, β2) combination, 1,000 mortality time series were generated. For the mortality time series generated using the no-threshold model, we investigated the percentage of times AIC chose the linear model over 16 competing nonlinear models: the 15 one-change-point piecewise linear models and the spline model. Likewise, for mortality generated using the threshold model, we investigated the percentage of times AIC chose the linear model over two competing nonlinear models: the two-change-point piecewise linear model and the spline model.
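The data-generating step of such a simulation might look like the following Python sketch. It rests on stated assumptions: the confounder term in the study was estimated from the actual Cook County series, whereas here a made-up seasonal baseline (`make_toy_inputs`) and a made-up uniform PM series stand in for it, and all function names are hypothetical. The Poisson sampler uses Knuth's multiplication method, which is adequate for the modest daily means involved.

```python
import math
import random


def hinge(x, knot):
    """Return (x - knot)+."""
    return max(x - knot, 0.0)


def g_threshold(pm, beta1, beta2, lower=25.0, upper=50.0):
    """Threshold dose-response relation with change points at 25 and 50."""
    return beta1 * hinge(pm, lower) + beta2 * hinge(pm, upper)


def sample_poisson(mean, rng):
    """Draw one Poisson variate (Knuth's method; suitable for modest means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_mortality(pm_series, log_baseline, g, rng):
    """Daily counts with mean exp(baseline_t + g(PM_t))."""
    return [sample_poisson(math.exp(base + g(pm)), rng)
            for pm, base in zip(pm_series, log_baseline)]


def make_toy_inputs(n_days, rng):
    """Hypothetical stand-ins for the PM series and the confounder term."""
    pm = [rng.uniform(5.0, 100.0) for _ in range(n_days)]
    log_baseline = [math.log(80.0) + 0.1 * math.cos(2 * math.pi * day / 365.25)
                    for day in range(n_days)]
    return pm, log_baseline


if __name__ == "__main__":
    rng = random.Random(2006)
    pm, log_baseline = make_toy_inputs(365, rng)
    deaths = simulate_mortality(
        pm, log_baseline, lambda x: g_threshold(x, 0.0005, 0.0005), rng)
    print(deaths[:10], sum(deaths) / len(deaths))
```

Repeating the generation step 1,000 times per (β1, β2) combination, fitting each candidate model, and recording which has the smallest AIC would complete the loop; the model-fitting step is omitted here.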
We computed two additional quantities for each simulation scenario that provided information on the consequences of concluding that a linear relation was appropriate. The first quantity was the increase in the expected mortality count for PM concentrations of 25 µg/m3 and 75 µg/m3 relative to a PM concentration of 0 µg/m3 for the actual dose-response relation that generated mortality and for a corresponding linear relation. For the actual dose-response relation, these expected values were given by exp(g(PMt = 25)) − 1 and exp(g(PMt = 75)) − 1, respectively. The expected values for the linear relation were estimated using the average values of exp(25γ̂) − 1 and exp(75γ̂) − 1 over the subset of the 1,000 simulated data sets where the linear model was selected as best (γ̂ represents an estimate of γ obtained from fitting the linear model to a generated mortality time series). The second quantity of interest computed was the difference in the total number of expected deaths attributable to PM exposure over the 14-year simulation period, calculated using the actual dose-response relation and a corresponding linear relation,

[Expression 2: equation image not recoverable]
| RESULTS |
|---|
Tables 2 and 3 show the results of the simulations for mortality generated using the no-threshold model, and table 4 shows the results for mortality generated using the threshold model. Each row summarizes the results of simulations based on mortality time series generated using the given (β1, β2) combination. For example, table 2 indicates that over the 1,000 mortality time series generated using the no-threshold model with (β1 = 0.001, β2 = 0.0005, τ = 25), the linear model was selected as having a lower AIC value than the competing nonlinear models for 30 percent of the 1,000 generated mortality time series. Analogous information for the simulations based on the threshold model can be found in table 4.
Tables 3 and 4 provide information about the consequences of concluding that a linear relation is appropriate when the actual relation follows a no-threshold or threshold model, respectively. For each (β1, β2) combination, the values in these tables give the expected mortality counts for PM concentrations of 25 µg/m3 and 75 µg/m3 relative to a PM concentration of 0 µg/m3 for the actual model that generated mortality and a corresponding linear model. For example, table 3 indicates that for the no-threshold model with (β1 = 0.001, β2 = 0.0005, τ = 25), the expected mortality count for a day with a PM concentration of 25 µg/m3 will be 2.53 percent higher than that for a day with a PM concentration of 0 µg/m3. The average value for this quantity over the 299 simulated data sets for which the linear model was selected as best was 3.71 percent.

First, we consider the no-threshold dose-response simulations (table 2). Here, the performance of AIC in detecting the true shape of the underlying dose-response relation was best when β1 = 0.001, correctly selecting a piecewise linear model in approximately two thirds or more of simulations, even for relatively mild nonlinearities. In a number of situations AIC performed particularly well, detecting the correct model in all simulations. For the smaller values β1 = 0.0005 and β1 = 0.00025, the performance of AIC deteriorated somewhat; it picked the correct model in only about 55 percent of the simulations for some cases when the change point was equal to 25 µg/m3. Note that for these simulations, the spline model was selected as best only rarely, because the one-change-point piecewise linear model had one less parameter than the spline model and was able to capture exactly the piecewise-linear nature of the "true" no-threshold relation. The scenarios with β2 = 0 in the table are of particular interest because they reflect an exactly linear relation. For these scenarios, AIC selected the linear model as best only about 45 percent of the time. Put another way, even when the actual dose-response relation is linear, we might expect AIC to conclude that nonlinearities exist more than half of the time.
For mortality generated using the threshold model (table 4), the performance of AIC was qualitatively similar to the no-threshold case when β1 = 0.001. However, the deterioration in the performance of AIC for the smaller values β1 = 0.0005 and β1 = 0.00025 was much more pronounced than in the no-threshold case, with the rate at which AIC failed to detect the true shape of the dose-response relation exceeding 70 percent in a number of cases when β1 = 0.00025. Again, in these cases, although such a finding seems a rather negative one for AIC, it should be noted that the extent of the overall nonlinearity in the true relation was small relative to other cases considered.
In cases where AIC was essentially equivocal as to which candidate model was best, it is vital to understand the consequences of an incorrect choice, bearing in mind, of course, that in real-world analyses such insight is rarely, if ever, available. For the nonlinear dose-response relations considered in the simulations, the values in tables 3 and 4 show that incorrectly concluding that a linear relation is appropriate can sometimes result in large under- or overestimates of the increase in expected mortality for PM concentrations of 25 µg/m3 and 75 µg/m3 relative to a PM concentration of 0 µg/m3. Perhaps more evocatively, consider the difference in the total number of expected deaths attributable to PM exposure calculated using the actual dose-response relation and a linear relation over the 14-year simulation period. Using expression 2, the values of this quantity for the three no-threshold scenarios with a change point at 25 µg/m3 (β1 = 0.001, β2 = 0.0005), (β1 = 0.0005, β2 = 0.00025), and (β1 = 0.00025, β2 = 0.000125) were 4,221 lives, 2,015 lives, and 852 lives, respectively. These scenarios were considered because for each unique value of β1 they were the scenarios for which AIC performed least effectively at detecting nonlinearities. The analogous values for the three threshold scenarios in which AIC performed least effectively (β1 = 0.001, β2 = 0.0005), (β1 = 0.0005, β2 = 0.00025), and (β1 = 0.00025, β2 = 0.000125) were 4,990 lives, 2,486 lives, and 1,167 lives, respectively. These differences suggest that even for situations where the nonlinearities in the dose-response relation appear mild, incorrectly concluding that a linear relation is appropriate can have a large impact on the estimate of the number of deaths attributable to PM exposure.
| DISCUSSION |
|---|
This work does not suggest that the relation between PM and mortality is or is not linear. Rather, it suggests that AIC may not be able to detect any nonlinearities should they exist. To help set the reported performance of AIC in an appropriate context, it is important to recognize and assess the extent of the nonlinearities present in the "true" dose-response relations from which the data were simulated. In scenarios where the true curve is close to linear, at least to the eye, it simply may not be reasonable to expect AIC to detect such departures from the simple linear case a large percentage of the time. To clarify this issue, we produced plots of the true dose-response relations corresponding to scenarios in which AIC performed well and scenarios in which AIC performed least effectively. Figure 2 depicts six cases of the no-threshold relation with a change point equal to 25 µg/m3: three corresponding to situations where AIC performed particularly effectively at detecting nonlinearities (solid line) and three corresponding to situations where AIC performed less effectively (dotted line). Figure 3 contains similar plots for the threshold relation. The y-axes on these figures were restricted to the 0–8 percent range, the chosen scale based on realistic effect levels reported in previous multicity studies of the PM-mortality dose-response relation in both the United States and Europe (1, 3, 6, 12).
For the three scenarios depicted in figure 2, the cases for which AIC performed least effectively are arguably those for which the extent of the nonlinearity was slight. Of course, even in these cases, extending the lines on either side of the change point reveals that simple visual inspection of the curves can be somewhat misleading. Nevertheless, the performance of AIC in these cases suggests little statistical difference in the quality of fit between linear and piecewise linear alternatives. In these cases, therefore, a prudent model selection procedure would necessarily consider factors other than quality of fit and number of fitted parameters in further assessing the models, for example, by considering the predictive ability of each candidate model.
For the three scenarios depicted in figure 3, the performance of AIC is more obviously of concern. In each case for which AIC performed least effectively, the nonlinearities in the true relations appear clear, and we might have expected AIC to detect such nonlinearities more frequently than we observed in our simulation study.
A number of investigators have formally tested the hypothesis of linearity in the PM-mortality dose-response relation by comparing AIC values obtained from linear, piecewise linear, and spline models. Based partly on these tests, these researchers have concluded that linear models are adequate for assessing the effect of PM on mortality, the premise being that, if nonlinearities existed, the more flexible piecewise linear and spline models would have fitted the data better and hence been favored by AIC. The results of our simulations have shown that if nonlinearities exist in the dose-response relation, AIC is not always successful in detecting them. In a number of the simulation scenarios considered, AIC was equivocal, picking the correct dose-response relation only about half of the time. One possible reaction to this result is that in some of these scenarios such equivocation is proper, that is, that the correct and linear models are, in a sense, close and that AIC is justifiably unable to demonstrably prefer one over the other. However, we believe it is valuable to assess the behavior of AIC in both cases of mild nonlinearity and cases of extreme nonlinearity, and we feel that the results are useful for practitioners who routinely use AIC in setting their expectations as to how it will perform in such cases.
The results of this analysis are important because they show that the evidence from past investigations about the shape of the PM-mortality dose-response relation may not be as compelling as previously thought and that further research may be needed to more reliably determine whether there are nonlinearities in the relation between PM and mortality. One such area of research would be the development of alternative model selection techniques or criteria which are able to select the appropriate dose-response relation a larger proportion of the time than does AIC. Note that not all investigators who have studied the shape of the dose-response relation have based their decision on the AIC. Some investigators have simply fitted smooth models that allowed enough flexibility to model nonlinearities and then interpreted the shape of the resulting fitted dose-response curves (6). However, even in these situations, it would be desirable to have a more formal way of determining the appropriateness of a linear fit to the dose-response relation.
The two forms of the dose-response relation considered in this paper have important regulatory implications. If the effect of PM on mortality follows a no-threshold relation, this means that any "tightening" of PM emission standards will result in a reduction in PM-induced mortality. However, the piecewise linear form of the curve means that the incremental reduction in mortality will not be constant. If the effects of PM on mortality follow a threshold dose-response relation, the regulatory implications are more complex. In this situation, tightening of PM emission standards that are already below the threshold level will offer no additional benefits in terms of reduced mortality. Tightening of emission standards that are above the threshold will, like the no-threshold dose-response relation, result in a nonconstant reduction in PM-induced mortality. Our simulations indicate that even in situations of mild nonlinearity, incorrectly concluding that a linear relation is appropriate can result in large under- or overestimates of the total number of deaths attributable to PM exposure. For these reasons, departures of the dose-response relation from linearity must be discovered in order to accurately assess the benefits, if any, of regulations tightening PM emission standards.
| ACKNOWLEDGMENTS |
|---|
Conflict of interest: none declared.
| References |
|---|
1. Samoli E, Analitis A, Touloumi G, et al. (2005) Estimating the exposure-response relationship between particulate matter and mortality within the APHEA multicity project. Environ Health Perspect 113:88–95.
2. Levy JI. (2003) Issues and uncertainties in estimating the health benefits of air pollution control. J Toxicol Environ Health Part A 66:1865–71.
3. Daniels MJ, Dominici F, Samet JM, et al. (2000) Estimating particulate matter-mortality dose-response curves and threshold levels: an analysis of daily time-series for the 20 largest US cities. Am J Epidemiol 152:397–406.
4. Kim SY, Lee JT, Hong YC, et al. (2004) Determining the threshold effect of ozone on daily mortality: an analysis of ozone and mortality in Seoul, Korea, 1995–1999. Environ Res 94:113–19.
5. Koop G and Tole L. (2004) An investigation of thresholds in air pollution-mortality effects. (Department of Economics, University of Leicester, Leicester, United Kingdom) (Working paper 04/20).
6. Schwartz J and Zanobetti A. (2000) Using meta-smoothing to estimate dose-response trends across multiple studies, with application to air pollution and daily death. Epidemiology 11:666–72.
7. Smith RL, Spitzner D, Kim Y, et al. (2000) Threshold dependence of mortality effects for fine and coarse particles in Phoenix, Arizona. J Air Waste Manag Assoc 50:1367–79.
8. Zanobetti A and Schwartz J. (2005) The effect of particulate air pollution on emergency admissions for myocardial infarction: a multicity case-crossover analysis. Environ Health Perspect 113:978–82.
9. Akaike H. (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN and Csaki F (Eds.). Second International Symposium on Information Theory, Budapest, Hungary, 1972. (Akademiai Kiado, Budapest, Hungary) 267–81.
10. Roberts S and Switzer P. (2004) Mortality displacement and distributed lag models. Inhal Toxicol 16:879–88.
11. R Development Core Team. (2004) R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. (http://www.R-project.org).
12. Dominici F, Daniels M, Zeger SL, et al. (2002) Air pollution and mortality: estimating regional and national dose-response relationships. J Am Stat Assoc 97:100–11.