American Journal of Epidemiology Advance Access originally published online on August 21, 2006
American Journal of Epidemiology 2006 164(7):697-705; doi:10.1093/aje/kwj256

American Journal of Epidemiology Copyright © 2006 by the Johns Hopkins Bloomberg School of Public Health All rights reserved; printed in U.S.A.

Practice of Epidemiology

Familial Relative Risk Estimates for Use in Epidemiologic Analyses

Yutaka Yasui¹, Polly A. Newcomb²,³, Amy Trentham-Dietz³ and Kathleen M. Egan⁴

¹ Department of Public Health Sciences, School of Public Health, University of Alberta, Edmonton, Alberta, Canada
² Cancer Prevention Program, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA
³ University of Wisconsin Comprehensive Cancer Center, Madison, WI
⁴ Vanderbilt University School of Medicine and Vanderbilt-Ingram Cancer Center, Nashville, TN

Correspondence to Dr. Yutaka Yasui, Department of Public Health Sciences, University of Alberta, 13-106J Clinical Sciences Building, Edmonton, Alberta T6G 2G3, Canada (e-mail: yyasui@ualberta.ca).

Received for publication August 7, 2005. Accepted for publication March 21, 2006.


    ABSTRACT
 
Commonly used crude measures of disease risk or relative risk in a family, such as the presence/absence of disease or the number of affected relatives, do not take into account family structures and ages at disease occurrence. The Family History Score incorporates these factors and has been used widely in epidemiology. However, the Family History Score is not an estimate of familial relative risk; rather, it corresponds to a measure of statistical significance against a null hypothesis that the family's disease risk is equal to that expected from reference rates. In this paper, the authors consider an estimate of familial relative risk using the empirical Bayes framework. The approach uses a two-level hierarchical model in which the first level models familial relative risk and the second considers a Poisson count of the number of affected relatives given the familial relative risk from the first level. The authors illustrate the utility of this methodology in a large, population-based case-control study of breast cancer, showing that, compared with commonly used summaries of family history including the Family History Score, the new estimates are more strongly associated with case-control status and more clearly detect effect modification of an environmental risk factor by familial relative risk.

Bayes theorem; family; Poisson distribution; regression analysis; risk


Abbreviations: AFB, age at first birth; CBCS II, Collaborative Breast Cancer Study II; FSIR, Familial Standardized Incidence Ratio; MLE, maximum likelihood estimator


    INTRODUCTION
 
Estimates of disease relative risk in families have important utilities in investigations of disease etiology. They are used to examine whether the disease of interest clusters in certain families and whether its etiology has a familial component. They are also used to adjust for familial aggregations when evaluating the effects of other nonfamilial etiologic factors in epidemiologic studies. Furthermore, familial relative risk estimates are used to examine effect modification of an etiologic factor according to levels of disease relative risk in families. Finally, a valid assessment of familial relative risk may have important clinical utility in triaging persons for more involved genetic screening and informing family members about potential risks.

In spite of the important utilities, family history information is often handled rather crudely in epidemiologic analyses. A commonly used summary of family history is a binary indicator (yes/no) of whether study participants have affected family members, often gender specific, in first- or second-degree relatives. Another summary that carries a little more information is the number of affected family members. These crude summaries have two critical deficiencies in view of their use as familial relative risk estimates. First, they do not account for family size, structure, or ages of family members. Larger families and families with older members are naturally more likely to have members who have developed chronic diseases such as cancer. Second, the crude summaries do not take chance into account: families with identical familial relative risk levels, sizes, structures, and ages can yield different numbers of affected members by chance alone.

Kerber (1) proposed the Familial Standardized Incidence Ratio (FSIR) as a measure of familial relative risk that accounts for family size, structure, and ages of family members. Boucher and Kerber (2) applied a linear empirical Bayes approach to log{1 + log(1 + FSIR)} with a normality assumption on its underlying true values. In this paper, we extend Kerber's method for estimating familial relative risk levels, applying empirical Bayes estimation methods with a nonparametric discrete prior distribution to overcome the deficiencies of the crude summaries. Following a brief review of the Family History Score (3), which was proposed for the same reasons as described above, we explain why it is actually not an estimate of familial relative risk. The utility of the new method proposed here is shown in a large, population-based case-control study of breast cancer. Two main points are illustrated. First, the empirical Bayes estimates of familial relative risk are associated with case-control status more strongly than other summary measures of family history, including Family History Scores. Second, they detect an effect modification of an environmental risk factor according to the level of familial relative risk more clearly than do other summary measures. In the Discussion section of this paper, we outline the potential use of the empirical Bayes familial relative risk estimates in other areas of public health and clinical research.


    FAMILY HISTORY SCORE
 
A method previously proposed to overcome the deficiencies of the crude summaries and used widely in epidemiologic analyses is the Family History Score (3). In this approach, an expected risk of the disease of interest is computed for each family member by using a set of external reference rates for the disease. For the ith family's jth member, the expected risk $E_{ij}$ is given by the cumulative risk of the disease under observation (4):

$$E_{ij} = 1 - \exp\Bigl(-\sum_k \lambda_k t_{ijk}\Bigr),$$

where $\lambda_k$ is the external reference rate for the kth stratum (e.g., age-sex-race–defined stratum) and $t_{ijk}$ is the length of time that the ith family's jth member spent under observation in the kth stratum. Ages of family members are accounted for in the computation of the expected risks. The Family History Score $\mathrm{FHS}_i$ for the ith family is defined by

$$\mathrm{FHS}_i = \frac{\sum_j O_{ij} - \sum_j E_{ij}}{\sqrt{\sum_j E_{ij}(1 - E_{ij})}},$$

where $O_{ij}$ is the disease indicator of the ith family's jth member. If the disease is rare, then $E_{ij}$ is approximately equal to $\sum_k \lambda_k t_{ijk}$ and $1 - E_{ij} \approx 1$, resulting in a simpler formula:

$$\mathrm{FHS}_i \approx \frac{\sum_j O_{ij} - \sum_j E_{ij}}{\sqrt{\sum_j E_{ij}}}.$$
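The two steps above (expected risk per relative, then the score per family) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the three reference rates, and the person-time values are all hypothetical.

```python
import math

def expected_risk(rates, years):
    """Cumulative risk 1 - exp(-sum_k rate_k * time_k) for one family member."""
    cumulative_hazard = sum(r * t for r, t in zip(rates, years))
    return 1.0 - math.exp(-cumulative_hazard)

def family_history_score(observed, expected):
    """(sum O - sum E) / sqrt(sum E(1 - E)) for one family."""
    variance = sum(e * (1.0 - e) for e in expected)
    return (sum(observed) - sum(expected)) / math.sqrt(variance)

# Hypothetical reference rates (per person-year) for three age strata.
rates = [0.0002, 0.001, 0.003]
# Hypothetical years that each of three relatives spent in each stratum.
family_years = [[20, 20, 15], [20, 20, 0], [20, 5, 0]]
# Disease indicators: only the first relative is affected.
observed = [1, 0, 0]

expected = [expected_risk(rates, years) for years in family_years]
fhs = family_history_score(observed, expected)
print([round(e, 4) for e in expected], round(fhs, 2))
```

The oldest relative contributes most of the expected count, which is how ages and family structure enter the score.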

The Family History Score $\mathrm{FHS}_i$ is in the form of a test statistic, which suggests a measure of statistical significance against a null hypothesis that the disease risk for each family member is equal to the expected risk computed from the external reference rates. A Bernoulli random variable $O_{ij}$ (the disease indicator of the ith family's jth member) has the "success probability" $E_{ij}$ (the expected risk) under the null hypothesis and, accordingly, we have

$$\mathrm{E}\Bigl(\sum_j O_{ij}\Bigr) = \sum_j E_{ij}, \qquad \mathrm{Var}\Bigl(\sum_j O_{ij}\Bigr) = \sum_j E_{ij}(1 - E_{ij}),$$

where the variance formula assumes that $O_{ij}$'s within each family are uncorrelated. The Family History Score $\mathrm{FHS}_i$ can then be seen as a test statistic in the form of $\{X - \mathrm{E}(X)\}/\sqrt{\mathrm{Var}(X)}$, with $X = \sum_j O_{ij}$, that usually leads to a standard normal large-sample distribution, where the large sample refers to the size of each family being large.

A Family History Score is actually not an estimate of the familial relative risk level. It is a test statistic for a null hypothesis that the disease risk for each family member is equal to the expected risk computed from the external reference rates. Statistical significance determined by the observed value of a test statistic is a function of a sample size (i.e., family size, structure, and ages) as well as the degree of departure from the null hypothesis (i.e., familial relative risk levels). Data for larger families tend to give higher statistical significance and therefore larger absolute values of Family History Scores given the same level of familial risk. Note also that the numeric values of Family History Scores cannot be interpreted directly. They suggest statistical significance levels determined according to a known probability distribution of the test statistic. In other words, Family History Scores order families by statistical significance against the null hypotheses, but their numeric values require a metric, the known probability distribution of the test statistic, in order to have interpretable numeric distances between them. These considerations have led us to a different approach to estimating familial relative risk levels, which shares similarities with the methods of Kerber (1) and of Boucher and Kerber (2).
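The dependence on family size can be made explicit with a short rearrangement of the rare-disease form of the score. The notation here is assumed for illustration: $O_i$ and $E_i$ denote the observed and expected numbers of affected members in family i, and $\mathrm{SIR}_i = O_i / E_i$.

```latex
\mathrm{FHS}_i \approx \frac{O_i - E_i}{\sqrt{E_i}}
             = \left(\frac{O_i}{E_i} - 1\right)\sqrt{E_i}
             = (\mathrm{SIR}_i - 1)\,\sqrt{E_i}
% Two hypothetical families with the same SIR_i = 2:
%   E_i = 0.1  gives  FHS_i \approx 0.32
%   E_i = 0.4  gives  FHS_i \approx 0.63
```

For a fixed ratio of observed to expected cases, the score grows with the square root of the expected count, so it mixes the size of the family with the size of the excess risk.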


    EMPIRICAL BAYES ESTIMATES OF FAMILIAL RELATIVE RISK
 
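We define the familial relative risk of the disease for the ith family, $\theta_i$, as the relative risk of the disease shared by the members of the ith family relative to the external reference. Our model is

$$O_{ij} \mid \theta_i \sim \mathrm{Bernoulli}\Bigl(1 - \exp\Bigl(-\theta_i \sum_k \lambda_k t_{ijk}\Bigr)\Bigr),$$

where $O_{ij}$'s (the disease indicators) are Bernoulli random variables conditionally independent given $\theta_i$'s, $\lambda_k$ is the external reference rate for the kth stratum, and $t_{ijk}$ is the time that the jth member of the ith family spent under observation in that stratum. We may estimate $\theta_i$ by maximizing the sum of the Bernoulli log-likelihood for the ith family. The score equation that the maximum likelihood estimator (MLE) $\hat\theta_i$ satisfies is

$$\sum_j \Bigl(\sum_k \lambda_k t_{ijk}\Bigr)\, \frac{O_{ij} - \bigl\{1 - \exp\bigl(-\hat\theta_i \sum_k \lambda_k t_{ijk}\bigr)\bigr\}}{1 - \exp\bigl(-\hat\theta_i \sum_k \lambda_k t_{ijk}\bigr)} = 0,$$

and the MLE can be simplified to $\hat\theta_i = \sum_j O_{ij} / \sum_j E_{ij}$, the standardized mortality (or incidence) ratio, under the rare disease assumption, where $E_{ij}$ is the expected risk of the jth member. The precision of the MLEs varies across families, however, because $\hat\theta_i$ is based solely on the ith family's data, and family sizes, structures, and ages differ across families. Small families with $\sum_j O_{ij} \ge 1$ could yield extremely high values of $\hat\theta_i$'s just by chance alone.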
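As a rough illustration of how the exact MLE compares with the standardized incidence ratio, the sketch below solves the Bernoulli score equation by bisection. It is a minimal sketch under stated assumptions: the function names are hypothetical, each relative's cumulative reference hazard (the sum of rate times time over strata) is supplied directly, and the family of ten relatives is invented.

```python
import math

def score(theta, observed, hazards):
    """Derivative of the Bernoulli log-likelihood with respect to theta."""
    total = 0.0
    for o, h in zip(observed, hazards):
        p = -math.expm1(-theta * h)  # 1 - exp(-theta * h), computed stably
        total += h * (o / p - 1.0)
    return total

def mle_theta(observed, hazards, lo=1e-8, hi=1e4, iters=200):
    """Root of the score function, found by bisection on a log scale."""
    if sum(observed) == 0:
        return 0.0        # likelihood is maximized at the boundary theta = 0
    if all(o == 1 for o in observed):
        return math.inf   # likelihood keeps increasing in theta
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if score(mid, observed, hazards) > 0:
            lo = mid      # score is decreasing in theta, so the root is above mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical family: ten relatives, each with cumulative reference hazard 0.05,
# one of whom is affected.
hazards = [0.05] * 10
observed = [1] + [0] * 9

mle = mle_theta(observed, hazards)
sir = sum(observed) / sum(hazards)   # rare-disease simplification
print(round(mle, 3), round(sir, 3))
```

In this invented family the two estimates are close (about 2.1 versus 2.0); a single case in a family with a much smaller expected count would push both far higher, which is the instability described above.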

Similar difficulties with the MLEs can occur in other biostatistical applications such as estimation of small-area disease risks (5) and comparison of risk across hospitals for a given medical procedure (6). A common feature shared by these problems is that there are many parameters to be estimated, each of which is indexed by one of the units of various sizes (e.g., families, small areas, and hospitals), and the data available from each unit are limited. As a consequence, extreme values of MLEs occur for small units corresponding to very large variances of MLEs.

Such difficulties with MLEs can be alleviated by the use of hierarchical models in which the familial relative risks $\theta_i$'s are considered random quantities and are modeled in an additional hierarchical layer. Specifically, the hierarchical model takes the form

$$O_{ij} \mid \theta_i \sim \mathrm{Bernoulli}\Bigl(1 - \exp\Bigl(-\theta_i \sum_k \lambda_k t_{ijk}\Bigr)\Bigr), \qquad \theta_i \sim G,$$

where G denotes a probability distribution over positive real numbers. Let us call the layers for the disease indicators $O_{ij}$ and for $\theta_i$ the "observable level" and the "latent level" of the hierarchical model, respectively. The latent level assumes common stochastic features for $\theta_i$'s, which provide additional information on shared characteristics of $\theta_i$'s that are not used to compute MLEs. By adding the latent level, estimators of $\theta_i$'s can "borrow strength" from other units (e.g., families) by combining the information on each individual unit with that on the common characteristics of $\theta_i$'s.

For the distribution G of the familial relative risks $\theta_i$'s, we propose the use of a (nonparametric) discrete distribution with K levels of familial relative risk $\{r_k\}$ and their associated probabilities $\{p_k\}$. While G can be a (parametric) continuous distribution such as gamma or lognormal distributions, the nonparametric G has an advantage in its flexible shape, determined by the data. Maximum likelihood estimation of the nonparametric G has been discussed by a number of authors (7–9). To compute the MLE of G, we used the C.A.MAN program (Computer Assisted Mixture ANalysis) of Böhning et al. (10) and their freeware (11). Once the MLE of G is computed, the empirical Bayes estimate of $\theta_i$ is given by the posterior mean of $\theta_i$ with the MLE $\hat{G} = \{\hat{r}_k, \hat{p}_k\}$:

$$\hat\theta_i^{EB} = \frac{\sum_{k=1}^{K} \hat{r}_k\, \hat{p}_k\, f(\mathbf{O}_i \mid \hat{r}_k)}{\sum_{k=1}^{K} \hat{p}_k\, f(\mathbf{O}_i \mid \hat{r}_k)},$$

where $f(\mathbf{O}_i \mid \hat{r}_k)$ is the probability of observing the realization vector $\mathbf{O}_i = (O_{i1}, O_{i2}, \ldots)$ of the ith family's disease indicators given $\theta_i = \hat{r}_k$. Note that $\hat\theta_i^{EB}$ is of the form of a weighted average of $\{\hat{r}_k\}$.
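The posterior-mean calculation can be sketched as follows, assuming the Bernoulli model described above and a discrete prior that has already been estimated. Everything concrete here is hypothetical: the function names, the three support points and their probabilities, and the family of ten relatives with a cumulative reference hazard of 0.05 each.

```python
import math

def bernoulli_likelihood(theta, observed, hazards):
    """Probability of the observed disease indicators for a given relative risk."""
    like = 1.0
    for o, h in zip(observed, hazards):
        p = -math.expm1(-theta * h)  # 1 - exp(-theta * h)
        like *= p if o == 1 else (1.0 - p)
    return like

def eb_posterior_mean(observed, hazards, support, probs):
    """Weighted average of the support points, weights = prior x likelihood."""
    weights = [p * bernoulli_likelihood(r, observed, hazards)
               for r, p in zip(support, probs)]
    total = sum(weights)
    return sum(r * w for r, w in zip(support, weights)) / total

# Hypothetical discrete prior (NOT the distribution estimated in the paper).
support = [0.5, 1.0, 3.0]
probs = [0.5, 0.4, 0.1]

hazards = [0.05] * 10
eb_one = eb_posterior_mean([1] + [0] * 9, hazards, support, probs)   # one affected
eb_none = eb_posterior_mean([0] * 10, hazards, support, probs)       # none affected
print(round(eb_one, 3), round(eb_none, 3))
```

Because the estimate is a weighted average of the support points, it can never leave their range, which is the shrinkage that tames the extreme MLEs of small families.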


    APPLICATION TO AN EPIDEMIOLOGIC INVESTIGATION OF BREAST CANCER ETIOLOGY
 
As an example, we apply the proposed familial risk estimates to a large, population-based case-control study of breast cancer. Two main points are illustrated. First, the empirical Bayes estimates of familial relative risk are associated with case-control status more strongly than other summary measures of family history, including Family History Scores. Second, these estimates detect an effect modification of an environmental risk factor according to the level of familial relative risk more clearly than do other summary measures.

Collaborative Breast Cancer Study II
The data used in this illustration were derived from the Collaborative Breast Cancer Study II (CBCS II); the CBCS II study protocol was approved by the institutional review boards of the participating institutions (12, 13). Briefly, CBCS II was a case-control study of breast cancer in which cases were female residents of Wisconsin, Massachusetts (excluding metropolitan Boston), and New Hampshire with a new diagnosis of invasive breast cancer reported to each state's cancer registry from January 1992 through December 1994 and aged 50–79 years at the time of diagnosis. Of the 6,839 eligible cases, 5,685 completed the standardized telephone interview (83 percent). Community controls were randomly selected in each state by using two sampling frames: those 50–64 years of age were selected from lists of licensed drivers, and those 65–79 years of age were chosen from rosters of Medicare beneficiaries. The controls were selected at random within age strata to yield an age distribution similar to that of the cases within each state. Of the 7,655 potential controls, 5,951 completed the telephone interview (78 percent).

A 40-minute telephone interview elicited information on the number of sisters and daughters for each participant, their current ages, and the age of their mother. If these female relatives were deceased, the interview inquired about their age at death. Participants were asked whether these first-degree female relatives were ever diagnosed with cancer (including breast cancer) and, if so, the type of cancer and age at diagnosis. The interview also covered reproductive history, physical activity, selected dietary items, alcohol consumption and tobacco use, use of exogenous hormones, body height and weight, personal medical history, and demographic factors.

Empirical Bayes estimates of familial relative risk of breast cancer
Using the first-degree female family history data collected in CBCS II, we estimated familial relative risk levels of breast cancer by using the empirical Bayes method. For each first-degree female family member, we calculated her person-years at risk of breast cancer incidence stratifying by 5-year age segments from birth to the earlier occurrence of death or the reference date of her family's enrolled subject. Reference dates for study subjects were defined as the date of diagnosis for breast cancer cases and, for controls, the date randomly sampled from the dates of diagnosis among cases within the same 5-year age stratum (on average, 1 year prior to interview). We then multiplied each person-time segment by the corresponding age-specific reference rate of breast cancer incidence among White females taken from the data of the Surveillance, Epidemiology, and End Results Program registry (14). Summing the products of the above multiplication for each family member yielded her expected risk $E_{ij}$ of developing breast cancer. Since it is reasonable to assume the rare disease condition for breast cancer, we were able to approximate the model by

$$O_i \mid \theta_i \sim \mathrm{Poisson}(\theta_i E_i), \qquad \theta_i \sim G,$$

where $O_i = \sum_j O_{ij}$ is the observed number of affected family members, $E_i = \sum_j E_{ij}$ is the expected number, and $\theta_i$ is the familial relative risk of the ith family.
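The person-years bookkeeping described above can be sketched as below. This is an illustration only: the function names are hypothetical, the age-specific rates are invented placeholders rather than Surveillance, Epidemiology, and End Results values, and follow-up beyond the last age band (90 years here) is simply ignored.

```python
def person_years_by_band(exit_age, band_width=5, n_bands=18):
    """Years spent in each age band [0,5), [5,10), ... from birth to exit_age."""
    years = []
    for b in range(n_bands):
        start = b * band_width
        years.append(max(0.0, min(exit_age - start, band_width)))
    return years

def expected_count(exit_ages, rates):
    """Rare-disease expected count: sum of rate x person-years over relatives and bands."""
    total = 0.0
    for age in exit_ages:
        bands = person_years_by_band(age, n_bands=len(rates))
        total += sum(r * t for r, t in zip(rates, bands))
    return total

# Placeholder incidence rates per person-year for 18 five-year bands (made up).
rates = [0.0] * 5 + [0.0001, 0.0003, 0.0006, 0.0012, 0.002, 0.0025,
                     0.003, 0.0035, 0.004, 0.0045, 0.0045, 0.0045, 0.0045]

# Hypothetical family: mother followed to age 78, sisters to ages 52 and 47.
# Exit age = age at death or at the participant's reference date, whichever is earlier.
e_i = expected_count([78, 52, 47], rates)
print(round(e_i, 4))
```

With real reference rates in place of the placeholders, `e_i` would play the role of the expected count that enters the Poisson mean together with the familial relative risk.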

We fitted this model by using the vertex exchange algorithm with the Newton-Raphson full-optimization step-length procedure in the C.A.MAN program (UNIX version) (10, 11). The initial parameter grid was chosen as 10 equally spaced points between a relative risk of 0.1 and 5.0. The algorithm was stopped based on the maximum directional derivative with an accuracy level of 0.00001. The C.A.MAN program identified seven grid points ($K$ = 7) with positive support, which were then refined with the program's EM algorithm. The resulting nonparametric MLE of G is shown in figure 1. Three of the seven points were very close to each other around a relative risk of 2.6 because the EM algorithm was stopped by a practical convergence criterion (10): it stopped at the 806th step. However, this does not have any important consequences, as evident from several examples in the paper that described the C.A.MAN program in detail (10). Specifically, we can interpret figure 1 as showing five relative risk clusters, instead of seven, and the numerical values of the empirical Bayes estimates $\hat\theta_i^{EB}$ would have changed negligibly if the algorithm had run for a longer time.
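For readers without access to C.A.MAN, the idea of estimating mixing weights on a grid can be illustrated with a plain EM iteration. This sketch deliberately swaps in a simpler technique: it keeps the 10 grid points fixed and updates only their weights, so it is neither the vertex exchange algorithm nor the grid-refining EM step used in the analysis. The function names and the toy counts are hypothetical.

```python
import math

def poisson_pmf(o, mean):
    """Poisson probability of count o for a positive mean."""
    return math.exp(-mean + o * math.log(mean) - math.lgamma(o + 1))

def em_mixing_weights(counts, expected, grid, n_iter=500):
    """EM updates for the weights of a Poisson mixture on a FIXED grid of relative risks."""
    k = len(grid)
    weights = [1.0 / k] * k
    # Likelihood of each family's count at each grid point (computed once).
    like = [[poisson_pmf(o, r * e) for r in grid] for o, e in zip(counts, expected)]
    for _ in range(n_iter):
        new = [0.0] * k
        for row in like:
            denom = sum(w * l for w, l in zip(weights, row))
            for j in range(k):
                new[j] += weights[j] * row[j] / denom   # posterior membership
        weights = [v / len(like) for v in new]          # average over families
    return weights

# 10 equally spaced relative risks between 0.1 and 5.0, as in the starting grid.
grid = [0.1 + j * (5.0 - 0.1) / 9 for j in range(10)]

# Toy data (invented): observed and expected numbers of affected relatives.
counts = [0, 0, 1, 0, 0, 0, 2, 1]
expected = [0.05, 0.10, 0.20, 0.30, 0.10, 0.15, 0.25, 0.40]

weights = em_mixing_weights(counts, expected, grid)
print([round(w, 3) for w in weights])
```

With only eight toy families the fitted weights carry no meaning; the point is the structure of the update, in which each weight becomes the average posterior probability that a family belongs to that grid point.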


FIGURE 1. Nonparametric maximum likelihood estimates of the familial relative risk distribution of breast cancer in the Collaborative Breast Cancer Study II (Wisconsin; Massachusetts, excluding metropolitan Boston; and New Hampshire, 1992–1994).

 
With this nonparametric MLE of G, the empirical Bayes estimate $\hat\theta_i^{EB}$ of the familial relative risk level for the ith participant was calculated by the posterior-mean equation. Figure 2 displays the empirical Bayes familial relative risk estimates $\hat\theta_i^{EB}$ according to the expected counts $E_i$ of breast cancer cases in the families. The empirical Bayes familial relative risk estimates are lower for families with larger expected counts for a given observed count of affected family members, $O_i$. This is sensible because, for a given observed count of affected family members, $O_i$, true familial relative risk should tend to be lower with a larger expected count of affected family members. For CBCS II participants with no family history of breast cancer ($O_i = 0$), the empirical Bayes estimates are all less than 1.0.
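The downward trend with the expected count follows directly from the form of the posterior mean under the Poisson approximation. The derivation below is a sketch in assumed notation: $r_k$ and $p_k$ are the support points and probabilities of the estimated prior, $O_i$ and $E_i$ are the observed and expected counts, and $w_k$ are the normalized posterior weights.

```latex
\begin{align*}
w_k &= \frac{p_k\, r_k^{O_i} e^{-r_k E_i}}{\sum_l p_l\, r_l^{O_i} e^{-r_l E_i}},
\qquad
\hat\theta_i^{EB} = \sum_k r_k\, w_k, \\[4pt]
\frac{\partial \hat\theta_i^{EB}}{\partial E_i}
 &= -\Bigl\{ \sum_k r_k^2\, w_k - \Bigl(\sum_k r_k\, w_k\Bigr)^{2} \Bigr\}
  = -\operatorname{Var}(\theta_i \mid O_i, E_i) \;\le\; 0 .
\end{align*}
```

The factor $E_i^{O_i}/O_i!$ of the Poisson probability cancels between numerator and denominator. Since the derivative is minus a posterior variance, the estimate cannot increase with $E_i$ for a fixed $O_i$; with $O_i = 0$ it starts at the prior mean as $E_i \to 0$ and falls from there.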


FIGURE 2. Empirical Bayes familial relative risk estimates of breast cancer for participants in the Collaborative Breast Cancer Study II (Wisconsin; Massachusetts, excluding metropolitan Boston; and New Hampshire, 1992–1994) according to the expected number of affected family members.

 
To contrast with the empirical Bayes estimates, the Family History Score values were plotted (figure 3). Recall that Family History Scores are indicators of statistical significance, not estimates of familial relative risk. With very small differences in the expected count of affected family members, $E_i$, a range of observed counts of affected family members, $O_i$, can lead to the same Family History Score; for example, a Family History Score of 6 can arise from families with $(O_i, E_i)$ = (1, 0.03), (2, 0.10), (3, 0.22), and (4, 0.37). Extremely large Family History Score values were observed among the families with the smallest expected counts. These features make Family History Scores clearly unsuitable for use as estimates of familial relative risk levels.
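The four example families can be checked with the rare-disease form of the score, (O - E) / sqrt(E). The short sketch below also prints the ratio O / E for each pair to show how differently the same score would read as a relative risk; the variable names are illustrative.

```python
import math

# (observed count, expected count) pairs quoted in the text.
pairs = [(1, 0.03), (2, 0.10), (3, 0.22), (4, 0.37)]

# Rare-disease Family History Score and the observed-to-expected ratio.
scores = [(o - e) / math.sqrt(e) for o, e in pairs]
ratios = [o / e for o, e in pairs]

for (o, e), s, r in zip(pairs, scores, ratios):
    print(f"O={o}, E={e:.2f}: score={s:.2f}, O/E={r:.1f}")
```

All four scores are close to 6, while the observed-to-expected ratios range from roughly 11 to 33.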


FIGURE 3. Family History Scores of breast cancer for participants in the Collaborative Breast Cancer Study II (Wisconsin; Massachusetts, excluding metropolitan Boston; and New Hampshire, 1992–1994) according to the expected number of affected family members.

 
Main effects of family history on disease risk
We examined the degree of association between the case-control status of the CBCS II participants and their familial relative risk estimates to assess the strength of evidence for familial aggregation. We fitted a conditional logistic regression model, conditioned on age group and US state (corresponding to the study design), to the case-control data of the CBCS II with their familial relative risk estimates as a sole covariate (unadjusted analysis) and with a set of adjustment variables (adjusted analysis). The adjustment variables included participants' age at menarche, parity, age at first birth (AFB), age at menopause, body mass index, exogenous hormone use, alcohol consumption, and educational level. Matching on age and the state of residence in the design of the CBCS II was accounted for in the analysis as strata of the conditional logistic regression. Table 1 presents the deviance explained and odds ratio estimates by each type of familial relative risk estimate in the unadjusted and adjusted conditional logistic regression analyses. The amount of deviance explained was used to measure the strength of association between disease status and familial relative risk estimates.



 
TABLE 1. Model deviance and odds ratio estimates with 95% confidence intervals from conditional logistic regression analyses of Collaborative Breast Cancer Study II data (Wisconsin; Massachusetts, excluding metropolitan Boston; and New Hampshire, 1992–1994) using various summary measures of familial risk of breast cancer as covariates

 
Empirical Bayes estimates explained the largest amount of deviance in the unadjusted analysis and nearly the largest in the adjusted analysis using only 1 degree of freedom, close to the categorical observed counts that used 4 degrees of freedom. Family History Scores did not show as strong associations as empirical Bayes estimates, even when the scores were categorized into five groups (negative and quartiles of positive scores). This finding was consistent with our description earlier that Family History Scores are not estimates of familial relative risk. The results shown in table 1 suggest that empirical Bayes estimates of familial relative risk provide higher power in the assessment of the main effects of family history (familial aggregation) on disease risk than either the crude summaries or Family History Scores.

Examination of an indication of gene-environmental interaction
Colditz et al. (15) and Egan et al. (16) reported that the effects of reproductive factors on breast cancer risk were modified by family history. Following this intriguing finding, we assessed the effect modification of parity/AFB effects according to familial relative risk levels. We created a covariate of parity and AFB by forming four categories of reproductive patterns: 1) nulliparous, 2) AFB before age 20 years, 3) AFB at age 20–29 years, and 4) AFB at age 30 years or older. Using the same conditional logistic regression models as those described above (unadjusted and adjusted analyses), we tested an interaction of the parity-AFB covariate with familial relative risk estimates. Three types of familial relative risk estimates were examined, and the results of the unadjusted analysis are shown in table 2 (the adjusted analysis gave very similar odds ratio estimates, which are not shown in the tables).



 
TABLE 2. Odds ratio estimates with 95% confidence intervals for parity/age at first birth according to various summary measures of familial risk of breast cancer from conditional logistic regression analyses of Collaborative Breast Cancer Study II data (Wisconsin; Massachusetts, excluding metropolitan Boston; and New Hampshire, 1992–1994)

 
The top third of table 2 shows the odds ratio estimates and 95 percent confidence intervals for each category of the parity-AFB covariate by presence/absence of family history. The interaction of the parity-AFB covariate and family history was not clear from the odds ratio estimates and was not statistically significant: $\chi^2$ = 1.74 with 3 degrees of freedom yielding p = 0.63 in the unadjusted analysis (p = 0.78 in the adjusted analysis). The middle third of this table shows the interaction of the parity-AFB covariate with whether the number of affected first-degree female relatives was two or more. The odds ratio estimates suggest the presence of an effect modification, but the test for interaction was not statistically significant: $\chi^2$ = 4.60 with 3 degrees of freedom yielding p = 0.20 in the unadjusted analysis (p = 0.30 in the adjusted analysis).

The bottom third of table 2 shows the interaction of the parity-AFB covariate with whether the empirical Bayes estimate of familial relative risk was 1.75 or more (i.e., top 2 percent). The odds ratio estimates suggest a pattern of effect modification similar to that with the number of affected first-degree female relatives, but the test for interaction was statistically more significant: $\chi^2$ = 8.26 with 3 degrees of freedom yielding p = 0.04 in the unadjusted analysis (p = 0.09 in the adjusted analysis). Table 2 illustrates a situation in which a dichotomy by the empirical Bayes estimates of familial relative risk detects effect modification of an environmental risk factor according to familial relative risk levels more clearly than crude dichotomies of family history information.
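The unadjusted p values quoted for the three interaction tests can be reproduced from the reported chi-square statistics. The sketch below uses the closed-form upper-tail probability of a chi-square distribution with 3 degrees of freedom, which needs only the standard library; the function name and dictionary labels are illustrative.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Test statistics reported for the unadjusted analyses.
reported = {
    "presence/absence of family history": 1.74,
    "two or more affected relatives": 4.60,
    "empirical Bayes estimate >= 1.75": 8.26,
}
p_values = {label: chi2_sf_3df(stat) for label, stat in reported.items()}

for label, p in p_values.items():
    print(f"{label}: p = {p:.2f}")
```

The closed form applies only to 3 degrees of freedom; for other degrees of freedom a general routine such as a statistics library's chi-square survival function would be needed.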


    DISCUSSION
 
We considered empirical Bayes estimates of familial relative risk levels and compared their utilities with those of 1) crude summaries of family history and 2) Family History Scores. In the analysis of the CBCS II, an epidemiologic study of breast cancer, the empirical Bayes estimates were shown to be associated with case-control status more strongly and to detect modified effects of an environmental risk factor by familial relative risk level more clearly than either crude summaries of family history or Family History Scores.

A similar approach was previously proposed by Kerber (1) and Boucher and Kerber (2) that used kinship coefficients to weight data for different relatives. Kerber used the FSIR (i.e., MLEs) (1), whereas Boucher and Kerber applied a linear empirical Bayes approach to log{1 + log(1 + FSIR)}, approximating its variance by the delta method (2). If the observed counts of cancer-history-positive relatives are large (unlike in our example, where they were mostly 0 or 1 with a small number of families with 2, 3, or 4 cases), use of FSIRs as estimates of familial relative risk and the approximations in the use of the linear Bayes estimator and the delta method would be sensible. We considered an empirical Bayes approach here, motivated by the very small (unstable) number of cancer-history-positive relatives in each family.

We specifically chose the nonparametric form of the distribution G of the familial relative risks $\theta_i$'s. Doing so allows 1) an interpretation of figure 1 that suggests five clusters of families with varying relative risks of breast cancer and 2) a posterior probability estimate of the cluster membership for each family. Specifically, relative risks are grouped into five clusters of (0.02, 0.59, 0.92, 1.90, 2.64) with corresponding mixture probabilities of (0.11, 0.47, 0.20, 0.12, 0.10). Given the observed count, $o_i$, of the ith family, we can calculate its posterior probability estimate of kth-cluster membership by $\hat{p}_k f(o_i \mid \hat{r}_k) / \sum_l \hat{p}_l f(o_i \mid \hat{r}_l)$, where $\hat{r}_k$ and $\hat{p}_k$ are the relative risk and mixture probability of the kth cluster and $f$ is the Poisson probability of the observed count. For example, the cutoff value 1.75 for familial relative risk estimates used in table 2 was the smallest among all 0.25-increment cutoff values at which the posterior probability of belonging to one of the greater-than-1 relative risk clusters (1.90, 2.64) was higher than that of belonging to one of the less-than-1 relative risk clusters (0.02, 0.59, 0.92). Using a gamma or log-normal prior distribution as G would eliminate these advantageous features of the nonparametric prior. Furthermore, although it makes little difference in the rank order of empirical Bayes estimates with a gamma, log-normal, or nonparametric prior (5), the nonparametric prior smooths the estimates to a lesser extent and has been shown empirically to perform better in relative risk prediction (17).
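Using the five clusters and mixture probabilities quoted above, the cluster-membership probabilities and the corresponding posterior mean can be sketched under the Poisson approximation. The cluster values are the rounded figures from the text, so results will differ slightly from the published estimates; the function names and the example counts are hypothetical.

```python
import math

# Rounded relative risk clusters and mixture probabilities quoted in the text.
clusters = [0.02, 0.59, 0.92, 1.90, 2.64]
probs = [0.11, 0.47, 0.20, 0.12, 0.10]

def posterior_membership(o, e):
    """Posterior probability of each cluster given observed count o and expected count e."""
    # Poisson kernel r^o * exp(-r * e); the factor e^o / o! cancels on normalizing.
    weights = [p * r ** o * math.exp(-r * e) for r, p in zip(clusters, probs)]
    total = sum(weights)
    return [w / total for w in weights]

def eb_estimate(o, e):
    """Posterior mean of the familial relative risk."""
    return sum(r * q for r, q in zip(clusters, posterior_membership(o, e)))

# Hypothetical families: (observed count, expected count).
for o, e in [(0, 0.1), (1, 0.1), (1, 0.5), (2, 0.1)]:
    post = posterior_membership(o, e)
    high = post[3] + post[4]   # probability of a greater-than-1 cluster
    print(f"O={o}, E={e}: EB={eb_estimate(o, e):.2f}, P(high-risk cluster)={high:.2f}")
```

The prior mean of these rounded clusters is about 0.96, which is why a family with no affected relatives always receives an estimate below 1.0 in this sketch.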

Another potential application of the empirical Bayes estimates of familial relative risk is as a predictor in risk prediction models. Risk-prediction models such as the Gail model for estimating breast cancer risk use crude summaries of family history as a predictor (18). It would be of interest to determine whether the empirical Bayes estimates of familial relative risk can refine the validity of such risk prediction models. To reach this objective, epidemiologic studies need to collect modest additional data on family structure and ages, requiring a longer survey, so that empirical Bayes estimates can be computed in addition to crude summaries of family history. The additional data collection requirement is an issue that must be weighed against its gain in using the proposed method.

The empirical Bayes estimates of familial relative risk and the associated posterior probability estimates of relative-risk-cluster membership can also be used in identifying high-risk families in clinical and research settings. Identification of high-risk families is important for clinical monitoring and counseling. Not knowing exactly how to interpret family history information will contribute to confusion, anxiety, and worry about the possibility of getting the disease. Note that the relative risk assessed by the empirical Bayes estimates is not necessarily hereditary (and was not modeled specifically as such). Some familial risk can be elevated because of an environmental factor(s) shared by family members. Such shared factors may be modifiable and, if that is the case, have important implications regarding disease prevention.


    ACKNOWLEDGMENTS
 
The parent project was supported by grants CA47305, CA69664, and CA47147 from the National Cancer Institute. Y. Y. is partially supported by the Canada Research Chair Program, Canadian Institutes of Health Research, and the Alberta Heritage Foundation for Medical Research.

The authors thank the following for their help during the conduct of the parent epidemiologic study: Drs. Patrick L. Remington, Henry Anderson, Walter C. Willett, Meir J. Stampfer, Linda Titus-Ernstoff, E. Robert Greenberg, and John A. Baron, for their advice and assistance; the staff of the three state cancer registries, for providing data and technical support; and the project staff in the three states, for their dedication.

Conflict of interest: none declared.


    References

  1. Kerber RA. Method for calculating risk associated with family history of a disease. Genet Epidemiol 1995;12:291–301.
  2. Boucher KM, Kerber RA. Measures of familial aggregation as predictors of breast-cancer risk. J Epidemiol Biostat 2001;6:377–85.
  3. Khoury MJ, Beaty TH, Cohen BH. Fundamentals of genetic epidemiology. New York, NY: Oxford University Press, 1993.
  4. Clayton D, Hills M. Statistical models in epidemiology. New York, NY: Oxford University Press, 1993.
  5. Clayton D, Kaldor J. Empirical Bayes estimates of age-standardized relative risks for use in disease mapping. Biometrics 1987;43:671–81.
  6. Christiansen CL, Morris CN. Hierarchical Poisson regression modeling. J Am Stat Assoc 1997;92:618–32.
  7. Laird NM. Nonparametric maximum likelihood estimation of a mixing distribution. J Am Stat Assoc 1978;73:805–11.
  8. Titterington DM, Smith AFM, Makov UE. Statistical analysis of finite mixture distributions. New York, NY: John Wiley & Sons, 1985.
  9. Böhning D. Computer assisted analysis of mixtures and applications: meta-analysis, disease mapping, and others. London, United Kingdom: Chapman & Hall/CRC, 1999.
  10. Böhning D, Schlattmann P, Lindsay BG. Computer assisted analysis of mixtures (C.A.MAN): statistical algorithms. Biometrics 1992;48:283–303.
  11. The Program C.A.MAN (Computer Assisted Analysis of Mixtures). Dankmar B Böhning and Peter Schlattmann, Free University Berlin, Berlin, Germany, 1997. (http://www.personal.rdg.ac.uk/~sns05dab/Software.html).
  12. Newcomb PA, Egan KM, Titus-Ernstoff L, et al. Lactation in relation to postmenopausal breast cancer. Am J Epidemiol 1999;150:174–82.
  13. Trentham-Dietz A, Newcomb PA, Egan KM, et al. Weight change and risk of postmenopausal breast cancer. Cancer Causes Control 2000;11:533–42.
  14. Surveillance, Epidemiology, and End Results (SEER) Program (www.seer.cancer.gov [general home page]); SEER*Stat Database: Incidence – SEER 9 Regs Public-Use, Nov 2004 Sub (1973–2002), National Cancer Institute, DCCPS, Surveillance Research Program, Cancer Statistics Branch, released April 2005, based on the November 2004 submission. (http://www.seer.cancer.gov/publicdata/ [more specific information]).
  15. Colditz GA, Rosner BA, Speizer FE. Risk factors for breast cancer according to family history of breast cancer: for the Nurses' Health Study Research Group. J Natl Cancer Inst 1996;88:365–71.
  16. Egan KM, Stampfer MJ, Rosner BA, et al. Risk factors for breast cancer in women with a breast cancer family history. Cancer Epidemiol Biomarkers Prev 1998;7:359–64.
  17. Yasui Y, Liu H, Benach J, et al. An empirical evaluation of various priors in the empirical Bayes estimation of small area disease risks. Stat Med 2000;19:2409–20.
  18. Gail MH, Brinton LA, Byar DP, et al. Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst 1989;81:1879–86.
