Imputation methods for missing data for polygenic models.

Brooke Fridley; Kari Rabe; Mariza de Andrade

doi:10.1186/1471-2156-4-s1-s42

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Methods to handle missing data have been an area of statistical research for many years. Little has been done within the context of pedigree analysis. In this paper we present two methods for imputing missing data for polygenic models using family data. The imputation schemes take into account familial relationships and use the observed familial information for the imputation. A traditional multiple imputation approach and multiple imputation or data augmentation approach within a Gibbs sampler for the handling of missing data for a polygenic model are presented. We used both the Genetic Analysis Workshop 13 simulated missing phenotype and the complete phenotype data sets as the means to illustrate the two methods. We looked at the phenotypic trait systolic blood pressure and the covariate gender at time point 11 (1970) for Cohort 1 and time point 1 (1971) for Cohort 2. Comparing the results for three replicates of complete and missing data incorporating multiple imputation, we find that multiple imputation via a Gibbs sampler produces more accurate results. Thus, we recommend the Gibbs sampler for imputation purposes because of the ease with which it can be extended to more complicated models, the consistency of the results, and the accountability of the variation due to imputation.
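
The abstract refers to accounting for the variation due to imputation. As a general illustration only (this is not the authors' code, and the abstract does not state which pooling formula they used), one standard way to carry that variation into a pooled estimate in multiple imputation is Rubin's rules: average the per-imputation estimates, then add the between-imputation variance, inflated by (1 + 1/m), to the average within-imputation variance. The function name `pool_rubin` and the example numbers below are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances with Rubin's rules.

    estimates: point estimate from each imputed data set
    variances: squared standard error of each of those estimates
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Hypothetical estimates from three imputed data sets (illustrative numbers only).
    est, var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(est, var)
```

The between-imputation term is what separates a multiply imputed analysis from a single filled-in data set: if every imputation gave the same estimate, the total variance would reduce to the within-imputation average.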

Original language	English (US)
Journal	BMC Genetics
Volume	4 Suppl 1
DOIs	https://doi.org/10.1186/1471-2156-4-s1-s42
State	Published - 2003

ASJC Scopus subject areas

Genetics
Genetics (clinical)

Cite this

@article{4c12977b3b3647afb3bf5a81715ed337,

title = "Imputation methods for missing data for polygenic models.",

abstract = "Methods to handle missing data have been an area of statistical research for many years. Little has been done within the context of pedigree analysis. In this paper we present two methods for imputing missing data for polygenic models using family data. The imputation schemes take into account familial relationships and use the observed familial information for the imputation. A traditional multiple imputation approach and multiple imputation or data augmentation approach within a Gibbs sampler for the handling of missing data for a polygenic model are presented. We used both the Genetic Analysis Workshop 13 simulated missing phenotype and the complete phenotype data sets as the means to illustrate the two methods. We looked at the phenotypic trait systolic blood pressure and the covariate gender at time point 11 (1970) for Cohort 1 and time point 1 (1971) for Cohort 2. Comparing the results for three replicates of complete and missing data incorporating multiple imputation, we find that multiple imputation via a Gibbs sampler produces more accurate results. Thus, we recommend the Gibbs sampler for imputation purposes because of the ease with which it can be extended to more complicated models, the consistency of the results, and the accountability of the variation due to imputation.",

author = "Brooke Fridley and Kari Rabe and {de Andrade}, Mariza",

year = "2003",

doi = "10.1186/1471-2156-4-s1-s42",

language = "English (US)",

volume = "4 Suppl 1",

journal = "BMC Genetics",

issn = "1471-2156",

publisher = "BioMed Central",

}

TY - JOUR

T1 - Imputation methods for missing data for polygenic models.

AU - Fridley, Brooke

AU - Rabe, Kari

AU - de Andrade, Mariza

PY - 2003

Y1 - 2003

AB - Methods to handle missing data have been an area of statistical research for many years. Little has been done within the context of pedigree analysis. In this paper we present two methods for imputing missing data for polygenic models using family data. The imputation schemes take into account familial relationships and use the observed familial information for the imputation. A traditional multiple imputation approach and multiple imputation or data augmentation approach within a Gibbs sampler for the handling of missing data for a polygenic model are presented. We used both the Genetic Analysis Workshop 13 simulated missing phenotype and the complete phenotype data sets as the means to illustrate the two methods. We looked at the phenotypic trait systolic blood pressure and the covariate gender at time point 11 (1970) for Cohort 1 and time point 1 (1971) for Cohort 2. Comparing the results for three replicates of complete and missing data incorporating multiple imputation, we find that multiple imputation via a Gibbs sampler produces more accurate results. Thus, we recommend the Gibbs sampler for imputation purposes because of the ease with which it can be extended to more complicated models, the consistency of the results, and the accountability of the variation due to imputation.

UR - http://www.scopus.com/inward/record.url?scp=34248645120&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34248645120&partnerID=8YFLogxK

U2 - 10.1186/1471-2156-4-s1-s42

DO - 10.1186/1471-2156-4-s1-s42

M3 - Article

C2 - 14975110

AN - SCOPUS:34248645120

SN - 1471-2156

VL - 4 Suppl 1

JO - BMC Genetics

JF - BMC Genetics

ER -
