American Journal of Epidemiology Advance Access originally published online on September 17, 2007
American Journal of Epidemiology 2007 166(9):994-1002; doi:10.1093/aje/kwm231
Invited Commentary: Effect Modification by Time-varying Covariates
1 Department of Epidemiology, Harvard School of Public Health, Boston, MA
2 Department of Biostatistics, Harvard School of Public Health, Boston, MA
3 Department of Economics, Universidad Di Tella, Buenos Aires, Argentina
Correspondence to Dr. Miguel A. Hernán, Department of Epidemiology, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115 (e-mail: miguel_hernan{at}post.harvard.edu).
Received for publication November 21, 2006. Accepted for publication March 9, 2007.
| ABSTRACT |
|---|
Marginal structural models (MSMs) allow estimation of effect modification by baseline covariates, but they are less useful for estimating effect modification by evolving time-varying covariates. Rather, structural nested models (SNMs) were specifically designed to estimate effect modification by time-varying covariates. In their paper, Petersen et al. (Am J Epidemiol 2007;166:985–993) describe history-adjusted MSMs as a generalized form of MSM and argue that history-adjusted MSMs allow a researcher to easily estimate effect modification by time-varying covariates. However, history-adjusted MSMs can result in logically incompatible parameter estimates and hence in contradictory substantive conclusions. Here the authors propose a more restrictive definition of history-adjusted MSMs than the one provided by Petersen et al. and compare the advantages and disadvantages of using history-adjusted MSMs, as opposed to SNMs, to examine effect modification by time-dependent covariates.
causality; confounding factors (epidemiology); longitudinal studies; nested model; observational data; structural model; time-dependent covariate
Abbreviations: MSM, marginal structural model; SNM, structural nested model
| INTRODUCTION |
|---|
Marginal structural models (MSMs) are being increasingly used to estimate the effects of time-varying treatments or exposures. Unlike conventional statistical methods, MSMs allow consistent estimation of the effect of a time-varying treatment on an outcome of interest even when there is confounding by time-varying covariates affected by earlier treatment. However, MSMs have an important limitation. As was pointed out by Robins (1, 2) and Hernán et al. (3), MSMs naturally allow estimation of effect modification by baseline covariates, but they are less useful for estimating effect modification by evolving time-varying covariates. Rather, structural nested models (SNMs) were specifically designed to estimate effect modification by time-varying covariates.
In this issue of the Journal, Petersen et al. (4) describe history-adjusted MSMs, a generalized form of MSM that was first proposed by Joffe et al. (5) and studied in detail by van der Laan et al. (6). Petersen et al. argue that history-adjusted MSMs allow a researcher to easily estimate effect modification by time-varying covariates, thus overcoming an important shortcoming of standard MSMs.
However, as we explain below, this apparent advantage of history-adjusted MSMs over standard MSMs comes at a price: History-adjusted MSMs can produce logically incompatible parameter estimates and hence result in contradictory substantive conclusions. As a consequence, clinicians or other decision-makers relying on history-adjusted MSMs to decide the best course of action can be left without guidance. In this commentary, we clarify how history-adjusted MSMs differ from standard MSMs and describe the conditions under which incompatible parameter estimates can arise in the former. We also propose a more restrictive definition of history-adjusted MSMs than the one provided by Petersen et al. (4) and compare the advantages and/or disadvantages of using history-adjusted MSMs, as opposed to SNMs, to examine effect modification by time-dependent covariates.
| STANDARD MARGINAL STRUCTURAL MODELS |
|---|
We start by briefly reviewing standard MSMs using Petersen et al.'s notation. To simplify the exposition, we assume a closed cohort with a well-defined time of enrollment for each subject and no loss to follow-up. Time is measured in periods (e.g., months) since time of enrollment, m = 0, until the end of follow-up, m = K + 1. We denote the treatment received in month m as A(m) and covariates measured at the start of month m as L(m). A subject's chronologically ordered data are therefore L(0), A(0), L(1), A(1), ...., L(K), A(K), L(K + 1). In Petersen et al.'s article (4), a subject's A(m) is 1 for the times m that the subject stays on the failing antiretroviral treatment and 0 after switching to another treatment, and CD4 T-cell count Y(m) is a component of the vector L(m). A nondynamic treatment regime that specifies the treatment at each time from time m through time t – 1 is denoted by a(m, t – 1) = {a(m), a(m + 1), ..., a(t – 1)}. For example, in the paper by Petersen et al. (4), a(m, t – 1) = {1, 1, 0, 0, 0, ..., 0} would be the regime "switch from the failing treatment at time m + 2 and continue on the new treatment through t – 1." The counterfactual (or potential) variable Ya(m)(t) represents a subject's CD4 T-cell count measured at time t had the subject followed regime a(m, t – 1). In the paper by Petersen et al. (4), t is m + 8.
A standard MSM can be used to model the mean CD4 T-cell count Ya(m)(t) at time t under all possible nondynamic treatment regimes from baseline time m to t – 1, that is, E[Ya(m)(t)], where E[X] is the expected value or mean of the random variable X. If so desired, the model may be made conditional on baseline variables V(m) to model the conditional mean E[Ya(m)(t)|V(m)] as a function of a(m, t – 1) and V(m). Here V(m) is a vector whose components may include any function of a subject's treatment and covariate history measured before A(m). The model is not defined until the analyst chooses a baseline time m, a response time t, and a functional form for E[Ya(m)(t)|V(m)]. The choice of the times m and t turns out to be a key point in the comparison of standard versus history-adjusted MSMs, so we defer the discussion of this topic to the next section. For now, let us think of m and t as two fixed times after the time of enrollment—for example, m = 1 and t = 9. As to the choice of a functional form for E[Ya(m)(t)|V(m)], the analyst needs to use her subject-matter knowledge to decide what functions of treatment (e.g., duration of treatment, average treatment dose) and baseline variables V(m) are the most appropriate.
An example of a standard MSM is
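$$E[Y_{a(m)}(t) \mid V(m)] = \alpha_0 + \alpha_1 V(m) + \beta_0\,\mathrm{dur}[a(m, t-1)] + \beta_1\,\mathrm{dur}[a(m, t-1)]\,V(m),$$
where dur[a(m, t – 1)] is the duration of treatment under regime a(m, t – 1), and α = (α0, α1) and β = (β0, β1) are unknown parameter vectors.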
Before comparing standard and history-adjusted MSMs in the next section, we point out one important warning for the causal interpretation of a standard MSM. Suppose the baseline time m exceeds 0 and V(m) is a vector with two components: "duration of treatment before m" and "CD4 T-cell count at m." Further suppose that both the estimate of the main effect of "treatment duration before m" and the estimate of the interaction between "treatment duration before m" and dur[a(m, t – 1)] ("treatment duration from m onwards") are large and highly significant. One cannot conclude that "treatment duration before m" has a causal effect on the response Y(t), because these results are compatible with 1) unmeasured confounding for treatment before m or 2) selection bias. To understand why those results might be explained by selection bias, consider the following scenario: 1) Treatment before m is a cause of CD4 T-cell count at m but not a cause of CD4 T-cell count at t, Y(t), and 2) an unmeasured genetic trait that is unassociated with treatment history is a cause of CD4 T-cell count at m and also causes Y(t) both directly and by interacting with treatment subsequent to m. When conditions 1 and 2 hold, conditioning the analysis on CD4 T-cell count at m, a common effect of the genetic trait and treatment before m, induces an association between treatment before m and the unmeasured genetic trait, and therefore between treatment before m and Y(t) (7). The causal directed acyclic graph shown in figure 1 depicts this situation with A(m–), C(m), A(m+), and Y(t) representing treatment before baseline m, CD4 T-cell count at m, treatment after baseline m, and outcome at time t, respectively. We refer to this association as "selection bias" because it exists even when both treatment before m has no causal effect on Y(t) and the genetic trait responsible for the selection bias is marginally unassociated with treatment before m and thus is a nonconfounder.
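The selection-bias scenario can be checked with a small simulation. The following is a minimal sketch under stated assumptions, not an analysis from the paper: prior treatment and the unmeasured trait are coded as independent binary variables, "high CD4 count at m" is a hypothetical common effect of both, and the outcome depends on the trait only. All names and probabilities are illustrative.

```python
import random

def simulate(n=20000, seed=0):
    """Prior treatment and an unmeasured trait are independent; both raise
    CD4 count at m, but only the trait affects the outcome Y(t)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a_prior = rng.random() < 0.5            # treatment before m (hypothetical coding)
        gene = rng.random() < 0.5               # unmeasured trait, independent of treatment
        c_high = a_prior or gene                # high CD4 at m: common effect of both
        y = float(gene) + rng.gauss(0.0, 0.5)   # outcome depends on the trait only
        rows.append((a_prior, c_high, y))
    return rows

def mean_difference(rows):
    """Mean of Y among the previously treated minus mean among the untreated."""
    treated = [y for a, _, y in rows if a]
    untreated = [y for a, _, y in rows if not a]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

rows = simulate()
marginal = mean_difference(rows)
# Conditioning on the common effect: restrict to subjects with high CD4 at m.
conditional = mean_difference([r for r in rows if r[1]])
print(f"marginal difference:    {marginal:+.3f}")
print(f"conditional difference: {conditional:+.3f}")
```

In this sketch the marginal difference is close to 0, whereas the difference within the high-CD4 stratum is clearly negative even though prior treatment has no effect on the outcome.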
| STANDARD VERSUS HISTORY-ADJUSTED MARGINAL STRUCTURAL MODELS |
|---|
Below we discuss the choice of the response time t and the baseline time m. As we will see, these choices are intimately connected with the definitions of standard and history-adjusted MSMs.
Let us first discuss the choice of the response time t. The above standard MSM models the mean outcome at a single fixed time t (e.g., t = 9), and thus we say that it is a univariate MSM. However, a standard MSM need not be univariate. If we are willing to assume that the above model holds for all possible values of t greater than baseline time m, we can simultaneously model the mean of the outcome at all times t > m. The MSM is then multivariate. The procedure for the estimation, via inverse probability weighting, of the parameters of a multivariate MSM requires only a minor generalization of the procedure used for univariate MSMs (see Hernán et al. (8) and the Appendix for details). If one believes this multivariate MSM to be unrealistic because it assumes that the effect of treatment does not depend on the time t, one can make the model more flexible and allow for treatment effects that vary with time by replacing α = (α0, α1) and β = (β0, β1) with time-specific parameter vectors, αt = (α0,t, α1,t) and βt = (β0,t, β1,t).
Let us now turn our attention to the choice of the baseline time m. In many longitudinal studies, the effect of treatment received at time m from enrollment will be confounded unless one can adjust for high-quality time-varying laboratory, clinical, and treatment data collected over a number of periods prior to m. Any time m at which such high-quality data are available is eligible to be the baseline time of an MSM, although generally the earliest eligible time is chosen. For example, if measurements of treatment in the past month and CD4 T-cell counts in the previous 2 months were needed to control confounding for the effect of current treatment, then the earliest possible baseline time would be m = 1 if CD4 T-cell measurements began at the time of enrollment m = 0. (See Robins et al. (9) for a more detailed discussion.)
However, rather than using precisely one eligible baseline time (e.g., m = 1), one could decide to use all eligible baseline times m = 1, 2, 3, ... before t. Thus, in the above univariate MSM, we could see m as an index for multiple baseline times instead of as a single fixed time m. If one believes the MSM with multiple baseline times to be unrealistic because it assumes that the effect of treatment does not depend on the baseline time m, one can make the model more flexible and allow for treatment and covariate effects that vary with time by replacing α = (α0, α1) and β = (β0, β1) with time-specific parameter vectors, αm = (α0,m, α1,m) and βm = (β0,m, β1,m), which are indexed by the eligible baseline times m < t.
A univariate MSM with multiple baseline times appears closely analogous to a multivariate MSM, except with multiple baseline times m per subject substituted for multiple response times t. In fact, the procedure for estimation, via inverse probability weighting, of the parameters of a univariate MSM with multiple baseline times and of a multivariate MSM are also analogous (see Appendix).
Petersen et al. (4) refer to MSMs with multiple baseline times as "history-adjusted MSMs" (6, 10). MSMs with multiple baseline times can be divided into two mutually exclusive groups. For a given MSM and outcome time t, let num(t) count the number of different baseline times m for which the MSM models the effect of regimes beginning at m on the outcome Y(t). The first group is composed of MSMs for which num(t) exceeds 1 for one or more outcome times t. This group includes MSMs, similar to those considered in an earlier paper by van der Laan et al. (6), that model the effect on an outcome Y(t) of treatment regimes beginning at all times m prior to t. The second group is composed of MSMs for which num(t) is 1 for all outcome times t. This group includes the MSM discussed by Petersen et al. (4) that restricts the set of outcome times to months 8 and later and only models the effect on each outcome Y(t) of treatment regimes beginning at time m = t – 8. We propose that the use of the term "history-adjusted MSM" be reserved for the first group, for the following reasons.
First, restricting the name "history-adjusted" to MSMs in group 1 is more in keeping with Petersen et al.'s conceptualization of the difference between history-adjusted MSMs and standard MSMs (4, 10). Specifically, in their abstract, the authors state that "unlike standard MSMs, history-adjusted MSMs can be used to estimate modification of treatment effects by time-varying covariates" (4, p. 985). However, this claim is true only for MSMs in group 1: To estimate effect modification by a time-varying covariate on a response Y(t), we must, by definition, model effect modification at two or more times m, since otherwise we could regard the covariate as non-time-varying. In contrast to MSMs in group 1, MSMs in group 2 are like standard MSMs in that, for a given response Y(t), they estimate the magnitude of effect modification by past time-varying covariates only at a single baseline time m—for example, m = t – 8. For this reason, we can regard MSMs in group 2 to be simply a collection of ordinary MSMs that, just like multivariate MSMs, allow increased estimation efficiency 1) by assuming that the parameters corresponding to different members of the collection are related and 2) by using more realistic working models than the independence model for within-individual correlations.
Second, it was only MSMs in group 1 that Robins (1, 2) was warning against when he stated that MSMs could not be easily used to estimate effect modification by evolving time-dependent covariates. This is because, as we explain below and in the Appendix, only MSMs in group 1 can be incompatible and thus lead to logical inconsistencies. Henceforth we refer only to models in group 1 as history-adjusted MSMs.
| MODEL INCOMPATIBILITY IN HISTORY-ADJUSTED MARGINAL STRUCTURAL MODELS |
|---|
Below we show that the apparently nearly exact analogy between history-adjusted MSMs and a standard multivariate MSM goes only so far. Specifically, a history-adjusted MSM, unlike a standard multivariate MSM, may be an incompatible model. We say a model is incompatible if there exist any logically inconsistent (incompatible) parameter values. A familiar case of an incompatible model is a linear regression model Pr(D = 1|X) = α0 + α1X for a binary outcome D. For example, if the covariate X takes values 0, 1, ..., 100 and one fits this model by a method (such as ordinary least squares) that does not impose the constraint that predicted probabilities must lie between 0 and 1, one can easily obtain incompatible parameter estimates, that is, estimates α̂0 and α̂1 for which α̂0 + α̂1X falls below 0 or above 1 for some value of X. In contrast, the logistic regression model logit Pr(D = 1|X) = α0 + α1X is compatible, because e^(α0 + α1X)/(1 + e^(α0 + α1X)) is always between 0 and 1.
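The linear probability example can be reproduced numerically. The sketch below is an illustration under assumptions that are not in the paper: the data are simulated from a hypothetical logistic dose-response curve centered at X = 50, and the sample size and seed are arbitrary. It fits the linear model by ordinary least squares and lists the values of X at which the fitted "probability" leaves [0, 1].

```python
import math
import random

def ols_fit(xs, ds):
    """Closed-form ordinary least squares for D = a0 + a1 * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_d = sum(ds) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(xs, ds))
    a1 = sxd / sxx
    return mean_d - a1 * mean_x, a1

def expit(z):
    """Inverse logit; always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(1)
xs = [rng.randint(0, 100) for _ in range(5000)]
# Hypothetical truth: logistic dose-response centered at X = 50.
ds = [1.0 if rng.random() < expit(0.1 * (x - 50)) else 0.0 for x in xs]

a0_hat, a1_hat = ols_fit(xs, ds)
linear_preds = [a0_hat + a1_hat * x for x in range(101)]
out_of_range = [x for x, p in enumerate(linear_preds) if p < 0 or p > 1]
print("OLS estimates:", round(a0_hat, 3), round(a1_hat, 4))
print("values of X with fitted probability outside [0, 1]:", out_of_range)

# The same coefficients plugged into the logistic form cannot leave (0, 1).
logistic_preds = [expit(a0_hat + a1_hat * x) for x in range(101)]
```

The point of the second computation is only that the logistic form maps any pair of coefficients to valid probabilities; it is not a logistic fit.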
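We now provide an informal explanation of why history-adjusted MSMs may be incompatible (see the Appendix for a formal treatment). Let us start by considering our original univariate MSM with the only response time t equal to K + 1 but now with multiple baseline times m:
$$E[Y_{a(m)}(K+1) \mid V(m)] = \alpha_0 + \alpha_1 V(m) + \beta_0\,\mathrm{dur}[a(m, K)] + \beta_1\,\mathrm{dur}[a(m, K)]\,V(m).$$
This model assumes the following:
- The direct effect of baseline treatment a(m) is the same as the effect of each subsequent component of the treatment.
- The effect of treatment from m + 1 to K is the same regardless of (i.e., is not modified by) the value of the baseline treatment a(m).
- The effect of (baseline and subsequent) treatment is the same for all baseline times.
A more flexible model that relaxes these assumptions is
$$\begin{aligned}
E[Y_{a(m)}(K+1) \mid V(m)] = {} & \alpha_0 + \alpha_1 V(m) \\
& + \beta^{(1)}_0 a(m) + \beta^{(1)}_1 a(m) V(m) + \beta^{(1)}_2 a(m)(K - m) \\
& + \beta^{(2)}_0\,\mathrm{dur}[a(m+1, K)] + \beta^{(2)}_1\,\mathrm{dur}[a(m+1, K)]\,V(m) \\
& + \beta^{(2)}_2\,\mathrm{dur}[a(m+1, K)](K - m) + \beta^{(2)}_3\,a(m)\,\mathrm{dur}[a(m+1, K)],
\end{aligned}$$
where the parameter vector $\beta^{(1)} = (\beta^{(1)}_0, \beta^{(1)}_1, \beta^{(1)}_2)$ encodes the direct effect of baseline treatment when subsequent treatment is withheld, a(m + 1, K) = 0. We will henceforth refer to this simply as the direct effect of baseline treatment. The parameter vector $\beta^{(2)} = (\beta^{(2)}_0, \beta^{(2)}_1, \beta^{(2)}_2, \beta^{(2)}_3)$ encodes the effect of subsequent cumulative treatment. The estimates of the model parameters, (α̂, β̂(1), β̂(2)), can be obtained by inverse probability weighting (see Appendix).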
Suppose, as an example, that 1) the components of β̂(1) are all negative and lie within the interval (–3/4, –1/4) and a joint 95 percent confidence interval for β(1) only includes vectors with all components lying between –1 and –0.01 and 2) the components of β̂(2) all exceed 10 and a joint 95 percent confidence interval for β(2) includes only vectors with all components exceeding 8. The negative β̂(1) implies that, for each m, baseline treatment a(m) has a negative effect on Y(K + 1) when a(m + 1, K) = 0. The positive β̂(2) implies that cumulative treatment from m + 1 to K has a large positive effect. Furthermore, suppose the confidence intervals imply that the opposite signs of the estimated effects of baseline versus subsequent treatment cannot be explained by sampling variability.
However, it is logically impossible for cumulative treatment from m + 1 to K to have a large positive effect on Y(K + 1) if, for each time s greater than m, a(s) alone has a negative effect. This implies that the history-adjusted MSM is an incompatible model and the parameter estimates β̂(1) and β̂(2) are logically inconsistent. This result is made precise in theorem 1, shown in the Appendix, where it is formally proven that pairs (β(1), β(2)) with β(1) negative and β(2) positive are logically incompatible.
The incompatible estimates of β̂(1) and β̂(2) also result in logically inconsistent statements about clinical strategies. Specifically, theorem 1 shows that all components of β̂(1) being less than 0 implies that the estimated optimal treatment regime starting from any eligible time m is the regime 0(m), "always withhold treatment from m." However, in the Appendix, we also show that all components of β̂(2) being positive and larger in absolute value than those of β̂(1) implies that the regime 1(m), "always take treatment starting at m," is (estimated) to be preferable to the regime 0(m). The preceding two statements are logically inconsistent and taken together would leave a health-care provider without any guidance as to a reasonable treatment strategy. Thus, an analyst committed to using history-adjusted MSMs would face two undesirable alternatives: to use a compatible but unrealistic, and therefore probably very badly misspecified, model or to use a more realistic but incompatible model that may lead to logically inconsistent estimates.
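The contradiction can be made concrete with numbers. The sketch below is illustrative only: it uses the functional forms given for r1 and r2 in the Appendix, and the estimates, the covariate value, and K are hypothetical values chosen to match the pattern described in the example (all components of the first vector negative, all components of the second exceeding 10).

```python
def r1(a_m, v, k_minus_m, b1):
    """Direct effect of baseline treatment a(m) when later treatment is withheld."""
    return b1[0] * a_m + b1[1] * a_m * v + b1[2] * a_m * k_minus_m

def r2(a_m, dur_later, v, k_minus_m, b2):
    """Effect of cumulative treatment from m + 1 to K."""
    return (b2[0] * dur_later + b2[1] * dur_later * v
            + b2[2] * dur_later * k_minus_m + b2[3] * a_m * dur_later)

K = 8                               # hypothetical last treatment time
v = 1.0                             # hypothetical nonnegative covariate value
b1_hat = (-0.5, -0.5, -0.5)         # hypothetical estimate: all components negative
b2_hat = (11.0, 11.0, 11.0, 11.0)   # hypothetical estimate: all components exceed 10

direct_effects = []
always_vs_never = []
for m in range(K):  # baseline times with at least one later treatment time
    direct = r1(1, v, K - m, b1_hat)
    # Under "always treat from m", a(m) = 1 and dur[a(m + 1, K)] = K - m.
    total = direct + r2(1, K - m, v, K - m, b2_hat)
    direct_effects.append(direct)
    always_vs_never.append(total)

# Every direct effect is negative, which points to "always withhold" ...
print("direct effects:", direct_effects)
# ... yet the fitted model ranks "always treat" above "always withhold".
print("always treat minus always withhold:", always_vs_never)
```

With these values every direct effect is negative at every baseline time, while the fitted contrast between "always treat" and "always withhold" is positive at every baseline time, which is the pair of statements that cannot both hold.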
Of course, the use of incompatible models only poses a difficulty if incompatible estimates are likely to occur. It is clear that an ordinary least-squares fit of our linear Bernoulli regression model will frequently result in incompatible estimates. It may be less clear that an inverse probability weighting fit of our incompatible history-adjusted MSM can also easily result in incompatible estimates. However, in the model used in our example, incompatible estimates may occur if 1) the model is somewhat misspecified in that the effect of subsequent treatment on the mean outcome actually depends on a much more complicated function of a(m + 1, K) than the assumed linear dependence on dur[a(m + 1, K)] and 2) for most times j, A(j) is highly correlated with the part of that complicated function of A(m + 1, K) that is uncorrelated with dur[A(m + 1, K)]. In the Appendix, we argue that it may be prohibitively difficult to develop an empirical test of fit for a history-adjusted MSM that reliably indicates that conditions 1 and 2 have not only occurred but are of sufficient magnitude to produce estimates which suffer from incompatibility to such an extent that the clinically relevant inferences may be compromised.
| STRUCTURAL NESTED MODELS VERSUS HISTORY-ADJUSTED MARGINAL STRUCTURAL MODELS |
|---|
We have seen that the problem in fitting a history-adjusted MSM by inverse probability weighting is that the estimates β̂(1) and β̂(2) may be logically incompatible. An SNM avoids this problem because it models only the direct effect of treatment a(m), indexed by the parameter β(1), and leaves the effect of subsequent treatment unmodeled; the parameter β(1) is estimated by g-estimation (see Appendix).
Indeed, the fact that we can estimate β(1) by g-estimation rather than inverse probability weighting is a second important benefit (in addition to avoiding model incompatibility) of using an SNM rather than a history-adjusted MSM. As we describe in the Appendix, inverse probability weighting estimation requires a "positivity assumption" (13) and is sensitive to the presence of extreme weights, either true or estimated. In contrast, g-estimation does not require a positivity assumption and is much less affected by extreme weights. The Appendix also contains a brief discussion of approaches other than g-estimation to handling model incompatibility and of how incompatible models might be used for goodness-of-fit testing and model selection.
In the absence of model misspecification or confounding by unmeasured factors, both inverse probability weighting estimation of standard or history-adjusted MSMs and g-estimation of SNMs allow one to estimate the effect of a time-varying treatment even when there is time-dependent confounding by time-varying covariates affected by earlier treatment. However, for the reasons discussed above, we would recommend that SNMs rather than history-adjusted MSMs be the routine model choice for investigation of effect modification by evolving time-varying covariates. Nevertheless, we also encourage comparison of the results obtained with SNMs to those obtained with history-adjusted MSMs to deepen our understanding of and experience with these new models. Only through such comparisons will we learn whether incompatible estimates occur with history-adjusted MSMs frequently enough to be of concern.
| APPENDIX |
|---|
Here we prove a number of results mentioned in the main text as well as briefly touch on certain more advanced issues. Our discussion is restricted to structural mean models, that is, models for the conditional mean of a counterfactual outcome.
Estimation of the parameters of marginal structural models (MSMs)
Throughout we use the following notational conventions. Capital letters such as L(m) refer to random variables, that is, variables that can take on different values for different study subjects. Small letters such as l(m) refer to the possible values of L(m). Overbar variables with a time t in parentheses denote the history of the variable from 0 to t, and overbars without parentheses denote the entire history, that is, Ā(t) = {A(0), ..., A(t)} and Ā = Ā(K). In addition, we use underbars to denote future values of a variable in the following way: A̲(m, t) = {A(m), A(m + 1), ..., A(t – 1), A(t)} is the A-history from time m through time t and A̲(m) = A̲(m, K) is a subject's treatment history from m to the end of the study. Similarly, we let a̲(m, t) and a̲(m) = a̲(m, K) denote a possible treatment history from m to t and from m to K, respectively. By convention, a(m) = 0 denotes either no treatment or a standard treatment at time m. Thus, the history a̲(m, t) = 0̲(m, t) stands for the history "withhold treatment from m through t" or "receive the standard treatment from m through t."
Let H(m) = {L̄(m), Ā(m – 1)} be the entire covariate and treatment history prior to receiving treatment A(m), and let V(m) be a subvector of H(m) that is of interest as an effect modifier. We may sometimes choose V(m) to be all of H(m). Let Ya(m)(t) be a subject's counterfactual outcome at time t if the subject had received his observed treatment regime Ā(m – 1) up to time m and history a(m) from m onwards.
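A standard, univariate MSM models the mean of the counterfactual outcome Ya(m)(t) at time t > m as a function of the possible treatment histories a(m, t – 1) from time m to time t – 1 and the baseline covariates V(m). For example,
$$E[Y_{\underline{a}(m)}(t) \mid V(m)] = \alpha^*_0 + \alpha^*_1 V(m) + \beta^*_0\,\mathrm{dur}[a(m, t-1)] + \beta^*_1\,\mathrm{dur}[a(m, t-1)]\,V(m),$$
where $\alpha^* = (\alpha^*_0, \alpha^*_1)$ and $\beta^* = (\beta^*_0, \beta^*_1)$ are unknown parameter vectors and dur[a(m, t – 1)] is the duration of treatment from m through t – 1.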
Under the assumption of no unmeasured confounders for the effect of the time-varying treatment A(m), the parameters of the univariate MSM can be estimated by weighted least squares with estimated stabilized inverse probability weights depending on the baseline time m and response time t:
$$\widehat{SW}(m, t-1) = \prod_{j=m}^{t-1} \frac{\hat f[A(j) \mid \bar A(j-1), V(m)]}{\hat f[A(j) \mid H(j)]}.$$
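Stabilized inverse probability weights of this kind are products, over the treatment times from m to t – 1, of a ratio of two fitted probabilities of the treatment actually received. The sketch below assumes the usual form, with a numerator that conditions on past treatment and V(m) and a denominator that conditions on the full history H(j). The function name and the fitted probabilities are hypothetical inputs, not output from any real model.

```python
def stabilized_weight(numerator_probs, denominator_probs):
    """Product over j = m, ..., t - 1 of the ratio of fitted probabilities of
    the treatment actually received at time j."""
    if len(numerator_probs) != len(denominator_probs):
        raise ValueError("need one numerator and one denominator per time point")
    weight = 1.0
    for num, den in zip(numerator_probs, denominator_probs):
        if den <= 0.0:
            raise ValueError("positivity violated: denominator probability is 0")
        weight *= num / den
    return weight

# Hypothetical subject whose treatment was about as likely given the full
# history as given V(m) alone: the weight stays at 1.
typical = stabilized_weight([0.5] * 8, [0.5] * 8)

# Hypothetical subject whose observed treatment was unlikely given the full
# history at each of 8 months: the weight explodes.
extreme = stabilized_weight([0.5] * 8, [0.1] * 8)

print(typical, extreme)
```

The second subject shows how small denominators compound across time points, which is the source of the extreme weights discussed later in the Appendix.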
Model incompatibility in history-adjusted MSMs
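To explain why history-adjusted MSMs may be incompatible, we will consider a univariate MSM with t = K + 1 that allows the effect of the treatment a(m) on Y(K + 1) to differ from the effect of later treatments. It will be helpful to decompose E[Ya(m)(K + 1)|V(m)] into the sum of three functions:
- The conditional mean of Y(K + 1) when treatment is withheld from m onwards:
$$r_0(K+1, m, V(m)) = E[Y_{\underline{0}(m)}(K+1) \mid V(m)]$$
- The direct effect of treatment a(m) on Y(K + 1) when treatment from m + 1 onwards is withheld (i.e., a(m + 1, K) = 0):
$$r_1(K+1, m, a(m), V(m)) = E[Y_{a(m), \underline{0}(m+1)}(K+1) \mid V(m)] - E[Y_{\underline{0}(m)}(K+1) \mid V(m)]$$
- The effects of treatment a(m + 1, K) from m + 1 onwards (including effects due to interactions with treatment a(m) at m):
$$r_2(K+1, m, a(m, K), V(m)) = E[Y_{\underline{a}(m)}(K+1) \mid V(m)] - E[Y_{a(m), \underline{0}(m+1)}(K+1) \mid V(m)]$$
Parametric models for these three functions together define a model with parameters (α*, β*), where β* = (β(1)*, β(2)*), for E[Ya(m)(K + 1)|V(m)]. As a concrete example, suppose we specify
- $r_0(K+1, m, V(m), \alpha^*) = \alpha^*_0 + \alpha^*_1 V(m)$;
- $r_1(K+1, m, a(m), V(m), \beta^{(1)*}) = \beta^{(1)*}_0 a(m) + \beta^{(1)*}_1 a(m) \times V(m) + \beta^{(1)*}_2 a(m)(K - m)$; and
- $r_2(K+1, m, a(m, K), V(m), \beta^{(2)*}) = \beta^{(2)*}_0\,\mathrm{dur}[a(m+1, K)] + \beta^{(2)*}_1\,\mathrm{dur}[a(m+1, K)]\,V(m) + \beta^{(2)*}_2\,\mathrm{dur}[a(m+1, K)](K - m) + \beta^{(2)*}_3\,a(m)\,\mathrm{dur}[a(m+1, K)]$.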
A given treatment regime g(m) = {gm{h(m)}, gm+1{h(m + 1)}, ..., gK{h(K)}} has a subject follow her observed treatment history up to m and then, at each time j ≥ m, determines her treatment dose at j by the value of a given function gj{h(j)} of past treatment and covariate history h(j). If, for each j ≥ m, gj{h(j)} gives the same value a(j) for all past h(j), we can say that the regime g(m) is nondynamic and write the regime as a(m) = {a(m), ..., a(K)}, as in the main text. Otherwise, the regime is dynamic. The following is an immediate consequence of theorem 4 in the paper by Robins (12).
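Theorem 1.
Suppose the sequential randomization assumption holds (i.e., there are no unmeasured confounders) for all m and that all treatments a(m) are coded as nonnegative. Suppose that for each m the effect of a(m) on the mean of Y(K + 1) is less than 0 when a(m + 1, K) = 0, that is, for all m, H(m), and a(m) ≠ 0:
$$r_1(K+1, m, a(m), H(m)) < 0.$$
Then, for all m, H(m), and a(m, K),
$$r_2(K+1, m, a(m, K), H(m)) \le 0.$$
Furthermore, for each m, the regime 0(m), "always withhold treatment from m," is the optimal regime among all dynamic and nondynamic regimes g(m).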
We now discuss the relevance of theorem 1 for the example given in the text. Suppose all levels h(m) of H(m) are coded as nonnegative and V(m) = H(m). It follows from the theorem that the inverse-probability-weighted estimates β̂(1) and β̂(2) in our example are logically inconsistent (in the sense that no actual distribution exists with these parameter values), since all components of β̂(1) being negative and all components of β̂(2) being positive imply that our estimate r1(K + 1, m, a(m), V(m), β̂(1)) of r1(K + 1, m, a(m), H(m)) is negative but our estimate r2(K + 1, m, a(m, K), V(m), β̂(2)) of r2(K + 1, m, a(m, K), V(m)) is positive for all H(m), which contradicts the above theorem.
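Furthermore, the last part of theorem 1 implies, as stated in the text, that our negative estimate β̂(1) means that the regime 0(m) is the estimated optimal regime. We next verify our claim in the text that components of β̂(2) positive and larger in absolute value than those of β̂(1) imply that the regime 1(m), "always take treatment starting at m," is estimated to be preferable to the regime 0(m). Note that, for m < K, the estimated difference between the mean outcomes under regimes 1(m) and 0(m) is
$$\begin{aligned}
& r_1(K+1, m, 1, V(m), \hat\beta^{(1)}) + r_2(K+1, m, \underline{1}(m, K), V(m), \hat\beta^{(2)}) \\
& \quad = \hat\beta^{(1)}_0 + \hat\beta^{(1)}_1 V(m) + \hat\beta^{(1)}_2 (K - m) + (K - m)\left[\hat\beta^{(2)}_0 + \hat\beta^{(2)}_1 V(m) + \hat\beta^{(2)}_2 (K - m) + \hat\beta^{(2)}_3\right],
\end{aligned}$$
which is positive because V(m) is nonnegative, K – m is at least 1, and each component of β̂(2) exceeds the absolute value of the corresponding component of β̂(1).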
Finally, we stress that incompatible estimates often pose no difficulty when they cannot result in logically contradictory estimates of substantively important effects. As an example, Robins and Rotnitzky (14) argued that using incompatible models and estimates to construct generalized doubly robust estimators posed no problem, because the models served simply as statistical tools for reducing bias. In contrast, use of incompatible history-adjusted MSMs can be problematic, because they are substantive tools used to estimate treatment effects.
Structural nested models (SNMs) for handling model incompatibility
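Standard inverse probability weighting methods require that the positivity assumption f[a(j)|H(j)] > 0 hold for all possible values of a(j) and (essentially) all histories H(j). Even when the positivity assumption holds, the denominator of the estimated stabilized weights for baseline time m and response time K + 1,
$$\prod_{j=m}^{K} \hat f[A(j) \mid H(j)],$$
can be very close to 0 for some subjects, which produces extreme weights. This is more likely when K – m is large, when the treatment A(j) has many levels or is continuous, and when there are many continuous covariates in L(j).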
In contrast, g-estimation of a structural nested mean model r1(K + 1, m, a(m), H(m), β(1)*) for the direct effect r1(K + 1, m, a(m), H(m)) of treatment a(m) does not require the positivity assumption and is much less affected by K – m being large, the treatment A(j) having many levels or being continuous, and there being many continuous covariates in L(j). First, one does not divide by estimates of f[A(j)|H(j)], so the problem of extreme weights does not exist. In fact, those subjects who would have the most extreme weights and thus cause the most trouble for inverse probability weighting make a much smaller contribution to the g-estimation analysis, thereby causing little trouble. Second, for continuous or many-leveled A(j)'s, one need only model the mean of A(j) given H(j), a much easier task than modeling the entire density function f[a(j)|H(j)]. Third, even if K – m is large, one can choose not to model the mean of A(j) given H(j) for large j near K, thereby trading off some loss of precision for better bias control.
Another apparent advantage of estimating r1(K + 1, m, a(m), H(m)) by g-estimation of an SNM rather than inverse probability weighting estimation of an MSM is that no model for r0(K + 1, m, H(m)) is required. However, this advantage is only apparent; Robins (1) describes a modification of an MSM, referred to as a "semiparametric regression MSM," that also does not require a model for r0(K + 1, m, H(m)) and is fitted by inverse probability weighting.
In our example, we took V(m) to be the entire past H(m). When V(m) and H(m) differ, a model for r1(K + 1, m, a(m), V(m)) is referred to as a "marginal structural nested model"; as befits its name, a hybrid of g-estimation and inverse probability weighting estimation is used to estimate the model parameters. (See van der Laan and Robins (16) for details.)
Finally, g-estimation of an SNM, unlike inverse probability weighting estimation of an MSM, has not been possible when the response Y(t) was a dichotomous indicator of disease status, except under the rare disease assumption. Hence, history-adjusted MSMs might be preferred to SNMs for nonrare dichotomous responses. However, recent work by van der Laan et al. (17) and Richardson and Robins (T. Richardson and J. Robins, Harvard School of Public Health, unpublished data) holds the promise that, in the near future, g-estimation of SNMs may be extended to cover nonrare dichotomous responses.
We now describe how, after obtaining a g-estimate β̂(1) of the parameter β(1)* of an SNM r1(K + 1, m, a(m), H(m); β(1)*), we can use Monte Carlo simulation to estimate E[Yg(m)(K + 1)] for any g(m) without having to model r2(K + 1, m, a(m, K), H(m)). First we estimate E[Y(0(m))(K + 1)] by the sample average of
$$Y(K+1) - \sum_{j=m}^{K} r_1(K+1, j, A(j), H(j), \hat\beta^{(1)}).$$
We then proceed as follows:
- First, for k = m, ..., K, fit a parametric model for f[l(k)|l̄(k – 1), ā(k – 1)] to the data and let f̂[l(k)|l̄(k – 1), ā(k – 1)] denote the estimate of f[l(k)|l̄(k – 1), ā(k – 1)] under the model.
- Do the following for v = 1, ..., V, with V selected to be very large:
  - a) Choose hv(m) = {l̄v(m), āv(m – 1)} to be the value of H(m) for a subject randomly drawn from the n study subjects.
  - b) Recursively for k = m + 1, ..., K, draw lv(k) from f̂[l(k)|l̄v(k – 1), āv(k – 1)] with the treatment history from m to k – 1 determined by the regime g(m).
  - c) Let $\hat\Delta_{g(m),v} = \sum_{j=m}^{K} r_1(K+1, j, a_v(j), h_v(j), \hat\beta^{(1)})$.
- Let $\hat E[Y_{g(m)}(K+1)] = \hat E[Y_{\underline{0}(m)}(K+1)] + \sum_{v=1}^{V} \hat\Delta_{g(m),v}/V$ be the estimate of E[Yg(m)(K + 1)].
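The simulation steps above can be sketched in code. This is a toy illustration under loudly stated assumptions, not the authors' implementation: the blip function `r1`, the covariate model `draw_covariate`, the baseline histories, the estimate of the mean under "always withhold," and the value of `beta_hat` are all hypothetical stand-ins for quantities that would be fitted to real data.

```python
import random

def r1(a_j, l_hist, beta):
    """Hypothetical SNM blip: effect of a(j) on Y(K + 1) when later treatment
    is withheld, modified by the current covariate l(j)."""
    return a_j * (beta[0] + beta[1] * l_hist[-1])

def draw_covariate(rng, l_hist, a_hist):
    """Hypothetical fitted model for l(k) given the past: normal autoregression."""
    return rng.gauss(0.8 * l_hist[-1] + 0.3 * a_hist[-1], 1.0)

def estimate_mean_under_regime(regime, baseline_histories, e_y0_hat, beta_hat,
                               m, K, n_draws=2000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # a) draw an observed baseline history h(m) = (l(0..m), a(0..m-1))
        l_obs, a_obs = rng.choice(baseline_histories)
        l_hist, a_hist = list(l_obs), list(a_obs)
        delta = 0.0
        for j in range(m, K + 1):
            a_j = regime(j, l_hist, a_hist)      # treatment dictated by g(m)
            delta += r1(a_j, l_hist, beta_hat)   # c) accumulate the blips
            a_hist.append(a_j)
            if j < K:                            # b) simulate l(j + 1)
                l_hist.append(draw_covariate(rng, l_hist, a_hist))
        total += delta
    return e_y0_hat + total / n_draws

def never_treat(j, l_hist, a_hist):
    return 0

def always_treat(j, l_hist, a_hist):
    return 1

def treat_if_low(j, l_hist, a_hist):
    """Dynamic regime: treat only while the current covariate is below 0."""
    return 1 if l_hist[-1] < 0 else 0

m, K = 1, 4
baseline_histories = [([0.2, -0.1], [1]), ([1.0, 0.5], [0]), ([-0.7, -1.2], [1])]
e_y0_hat = 350.0          # hypothetical estimate of the mean under 0(m)
beta_hat = (-0.5, 0.2)    # hypothetical g-estimate

for regime in (never_treat, always_treat, treat_if_low):
    est = estimate_mean_under_regime(regime, baseline_histories, e_y0_hat,
                                     beta_hat, m, K)
    print(regime.__name__, round(est, 2))
```

A regime is represented here as a function of the time index and the simulated history, so nondynamic and dynamic regimes are handled by the same loop.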
The above approach is based on theorem 4 in the paper by Robins (12). Alternative approaches to the estimation of E[Yg(m)(K + 1)] for both dynamic and nondynamic regimes, based on other recent extensions of MSMs and SNMs, have been developed by Orellana et al. (18), van der Laan et al. (10), Murphy et al. (19), and Robins (20).
Alternative approaches to handling model incompatibility
We now discuss alternative approaches to handling model incompatibility and consider their possible application to history-adjusted MSMs.
Saturated models.
If an incompatible model is saturated, one will never obtain incompatible parameter estimates. Thus, in the context of our linear probability example, if we fit the saturated incompatible model
[equation missing in source]
Replacing models with approximations.
All models are incorrect. Van der Laan et al. (6) argue that it is therefore more honest to redefine ß(1)* and ß(2)* in the history-adjusted MSM of our example to be the limits of the inverse-probability-weighted estimates $\hat{\beta}^{(1)}$ and $\hat{\beta}^{(2)}$ as the sample size goes to infinity. They then view r1(K + 1, m, a(m), V(m); ß(1)*) and r2(K + 1, m, a(m, K), V(m); ß(2)*) as approximations of, rather than models for, r1(K + 1, m, a(m), V(m)) and r2(K + 1, m, a(m, K), V(m)). From this point of view, since there are no models, there is no possibility of model or parameter incompatibility. Thus, neither ß(1)* and ß(2)* nor $\hat{\beta}^{(1)}$ and $\hat{\beta}^{(2)}$ can be incompatible.
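In symbols, the redefinition described in this paragraph can be written as below. The notation is an assumption introduced here for illustration: plim denotes a limit in probability and n is the sample size; the original text says only that the parameters are the limits of the weighted estimates as the sample size goes to infinity.

```latex
\beta^{(j)*} \;\equiv\; \operatorname*{plim}_{n \to \infty} \hat{\beta}^{(j)}, \qquad j = 1, 2 .
```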
Our difficulty with this approach is that it does nothing to solve our problem; it simply sweeps the problem under the rug. In the context of our example, a health-care provider remains without a clue as to a reasonable treatment strategy, since she can still deduce from theorem 1 that it is logically impossible for both r1(K + 1, m, a(m), V(m); ß(1)*) to be a good approximation of r1(K + 1, m, a(m), V(m)) and r2(K + 1, m, a(m, K), V(m); ß(2)*) to be a good approximation of r2(K + 1, m, a(m, K), V(m)).
Exploiting incompatible models for goodness-of-fit (GOF) testing and model selection.
We say that a model indexed by a parameter vector $\theta$ is correctly specified if there is a true (and therefore compatible) value $\theta^*$ of $\theta$ under which the data were generated. All saturated models are correctly specified. In contrast to a saturated model, if one fits a correctly specified incompatible model that is not saturated, one may obtain incompatible parameter estimates; however, a 1 – $\alpha$ confidence interval for $\theta^*$ must include the true compatible parameter vector $\theta^*$ and, thus, a compatible parameter value with probability at least 1 – $\alpha$. Therefore, we can perform a valid (albeit conservative) $\alpha$-level GOF test of the null hypothesis that an incompatible model is correctly specified by rejecting the null hypothesis whenever a 1 – $\alpha$ confidence interval for $\theta^*$ fails to contain a compatible parameter value $\theta$. If the GOF test accepts, we accept the null hypothesis of correct specification, and the set of compatible parameter values $\theta$ in the 1 – $\alpha$ confidence interval for $\theta^*$ forms a 1 – $\alpha$ confidence set for $\theta^*$.
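The decision rule of this GOF test can be illustrated with a small Python sketch. Everything named here is a hypothetical assumption for illustration: the confidence region is taken to be a box given by per-coordinate bounds, the compatibility condition is supplied by the caller as a predicate `is_compatible`, and the search is a plain grid search. A grid can miss compatible values that fall between grid points, so this sketch may reject too often and is not a substitute for an exact algorithm.

```python
from itertools import product


def compatible_values_in_region(bounds, is_compatible, points_per_axis=21):
    """Return grid points inside a box-shaped confidence region that
    satisfy the caller-supplied compatibility predicate.

    bounds:          list of (lower, upper) pairs, one per coordinate.
    is_compatible:   function mapping a parameter tuple to True/False.
    points_per_axis: grid resolution per coordinate (must be >= 2).
    """
    axes = []
    for lower, upper in bounds:
        step = (upper - lower) / (points_per_axis - 1)
        axes.append([lower + i * step for i in range(points_per_axis)])
    # Enumerate the full grid and keep only the compatible points.
    return [theta for theta in product(*axes) if is_compatible(theta)]


def gof_test_rejects(bounds, is_compatible, points_per_axis=21):
    """Reject correct specification when no compatible value is found
    in the confidence region (on the grid)."""
    found = compatible_values_in_region(bounds, is_compatible, points_per_axis)
    return len(found) == 0


# Toy compatibility rule, purely hypothetical: a parameter pair is
# "compatible" only if at least one coordinate is (numerically) zero.
def toy_rule(theta):
    return abs(theta[0]) < 1e-9 or abs(theta[1]) < 1e-9


rejects_far_from_zero = gof_test_rejects([(0.5, 1.5), (0.2, 0.8)], toy_rule)
rejects_covering_zero = gof_test_rejects([(-0.5, 0.5), (0.2, 0.8)], toy_rule)
```

When the test does not reject, the list returned by `compatible_values_in_region` plays the role of the reported set of compatible values, up to the grid resolution.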
If, as would be the case in our example with $\theta^*$ = (ß(1)*, ß(2)*), our GOF test rejects, we enlarge our model by increasing the dimension of $\theta^*$ (for example, by adding quadratic interactions with time, ß a(m)(K – m)² and ß dur[a(m + 1, K)](K – m)², each with its own new coefficient ß) and then test whether the enlarged model fits. If not, we continue enlarging until we finally have a model that fits, and we report the set of compatible values of the enlarged parameter $\theta$ contained in the 1 – $\alpha$ confidence interval for the enlarged $\theta^*$ as a 1 – $\alpha$ confidence set for $\theta^*$. The actual coverage of these intervals would not be 1 – $\alpha$, but appropriate corrections could be worked out. Furthermore, one needs an algorithm for finding the set of compatible values $\theta$ in a given 1 – $\alpha$ confidence interval for $\theta^*$, which is a highly nontrivial problem. In addition, the power properties of this procedure are almost certainly poor.
With much additional work, it is conceivable that this GOF-testing-based model selection strategy might someday become, in certain settings, a viable alternative to the strategy of using SNMs rather than history-adjusted MSMs. A naive reader might think the strategy based on GOF testing even has certain advantages over the use of SNMs, since, if the model r1(K + 1, m, a(m), V(m); ß(1)*) is badly misspecified, the GOF approach might detect such misspecification while the most straightforward use of g-estimation will not.
However, if one really wishes to perform a GOF test of the model r1(K + 1, m, a(m), V(m); ß(1)*) with V(m) = H(m), GOF tests based on g-estimation of enlargements of the model should be more efficacious and powerful than the above inverse-probability-weighting-based GOF test of compatibility of the model r1(K + 1, m, a(m), V(m); ß(1)*) with the model r2(K + 1, m, a(m, K), V(m); ß(2)*), since the latter model may itself be badly misspecified. Thus, we are skeptical that the use of incompatible models for GOF testing and model selection will prove beneficial.
| ACKNOWLEDGMENTS |
|---|
This work was supported by National Institutes of Health grants R37-AI032475 and R01-HL080644.
Conflict of interest: none declared.
| References |
|---|
1. Robins JM. Marginal structural models. In: 1997 Proceedings of the American Statistical Association, Section on Bayesian Statistical Science (1998) Alexandria, VA: American Statistical Association. 1–10.
2. Robins JM. Marginal structural models versus structural nested models as tools for causal inference. In: Halloran E, Berry D, eds. Statistical models in epidemiology: the environment and clinical trials (1999) New York, NY: Springer-Verlag. 95–134.
3. Hernán MA, Brumback B, Robins JM. Marginal structural models to estimate the causal effect of zidovudine on the survival of human immunodeficiency virus-positive men. Epidemiology (2000) 11:561–70.
4. Petersen M, Deeks S, Martin J, et al. History-adjusted marginal structural models for estimating time-varying effect modification. Am J Epidemiol (2007) 166:985–93.
5. Joffe M, Santanna J, Feldman H. Partially marginal structural models for causal inference. (Abstract). Am J Epidemiol (2001) 153(suppl):S261.
6. van der Laan MJ, Petersen ML, Joffe MM. History-adjusted marginal structural models and statically-optimal dynamic treatment regimens. Int J Biostat (2005) 1. article 4. (Electronic article). (http://www.bepress.com/ijb/vol1/iss1/4).
7. Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology (2004) 15:615–25.
8. Hernán MA, Brumback B, Robins JM. Estimating the causal effect of zidovudine on CD4 count with a marginal structural model for repeated measures. Stat Med (2002) 21:1689–709.
9. Robins JM, Hernán MA, Siebert U. Effects of multiple interventions. In: Ezzati M, Lopez AD, Rodgers A, et al, eds. Comparative quantification of health risks: global and regional burden of disease attributable to selected major risk factors (2004) Vol II. Geneva, Switzerland: World Health Organization. 2191–230.
10. van der Laan MJ. Causal effect models for intention to treat and realistic individualized treatment rules. (U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 203) (2006) Berkeley, CA: Division of Biostatistics, School of Public Health, University of California, Berkeley. (http://www.bepress.com/ucbbiostat/paper203).
11. Robins JM. The analysis of randomized and non-randomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. In: Sechrest L, Freeman H, Mulley A, eds. Health service research methodology: a focus on AIDS (1989) Washington, DC: National Center for Health Services Research, US Public Health Service. 113–59.
12. Robins JM. Correcting for non-compliance in randomized trials using structural nested mean models. Commun Stat (1994) 23:2379–412.
13. Hernán MA, Robins JM. Estimating causal effects from epidemiological data. J Epidemiol Community Health (2006) 60:578–86.
14. Robins JM, Rotnitzky A. Comment on the Bickel and Kwon article, "Inference for semiparametric models: some questions and an answer." Stat Sinica (2001) 11:920–36.
15. Robins JM, Rotnitzky A, Zhao LP. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J Am Stat Assoc (1995) 90:106–21.
16. van der Laan M, Robins JM. Unified methods for censored and longitudinal data and causality (2003) New York, NY: Springer Verlag.
17. van der Laan MJ, Hubbard AE, Jewell NP. Estimation of treatment effects in randomized trials with noncompliance and a dichotomous outcome. (U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 157) (2004) Berkeley, CA: Division of Biostatistics, School of Public Health, University of California, Berkeley. (http://www.bepress.com/ucbbiostat/paper157).
18. Orellana L, Rotnitzky A, Robins JM. Generalized marginal structural models for estimating optimal treatment regimes. (Technical report) (2006) Boston, MA: Department of Biostatistics, Harvard School of Public Health.
19. Murphy SA. Optimal dynamic treatment regimes. J R Stat Soc B (2003) 65:331–66.
20. Robins JM. Optimal structural nested models for optimal sequential decisions. In: Lin DY, Heagerty P, eds. Proceedings of the Second Seattle Symposium on Biostatistics (2004) New York, NY: Springer Publishing Company.