European American stratification in ovarian cancer case control data: The utility of genome-wide data for inferring ancestry

Paola Raska, Edwin Iversen, Ann Chen, Zhihua Chen, Brooke L. Fridley, Jennifer Permuth-Wey, Ya Yu Tsai, Robert A. Vierkant, Ellen L. Goode, Harvey Risch, Joellen M. Schildkraut, Thomas A. Sellers, Jill Barnholtz-Sloan

Research output: Contribution to journal, Article

3 Citations (Scopus)

Abstract

We investigated the ability of several principal components analysis (PCA)-based strategies to detect and control for population stratification using data from a multi-center study of epithelial ovarian cancer among women of European-American ethnicity. These included a correction based on an ancestry informative markers (AIMs) panel designed to capture European ancestral variation and corrections utilizing un-thinned genome-wide SNP data; case-control samples were drawn from four geographically distinct North-American sites. The AIMs-only and genome-wide first principal components (PC1) both corresponded to the previously described North or Northwest-Southeast axis of European variation. We found that the genome-wide PCA captured this primary dimension of variation more precisely and identified additional axes of genome-wide variation of relevance to epithelial ovarian cancer. Associations evident between the genome-wide PCs and study site corroborate North American immigration history and suggest that undiscovered dimensions of variation lie within Northern Europe. The structure captured by the genome-wide PCA was also found within control individuals and did not reflect the case-control variation present in the data. The genome-wide PCA highlighted three regions of local linkage disequilibrium (LD), corresponding to the lactase (LCT) gene on chromosome 2, the human leukocyte antigen system (HLA) on chromosome 6, and a common inversion polymorphism on chromosome 8. These features did not compromise the efficacy of PCs from this analysis for ancestry control. This study concludes that although AIMs panels are a cost-effective way of capturing population structure, genome-wide data should preferably be used when available.
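
To make the PCA-based approach concrete, here is a minimal toy sketch in Python. It is not the authors' pipeline. It simulates genotypes for two hypothetical subpopulations, standardizes each SNP by centering and scaling by sqrt(p(1-p)) (a common normalization, assumed here rather than taken from the paper), and extracts PC1 by power iteration. All function names, sample sizes, and allele-frequency settings are illustrative assumptions.

```python
import math
import random


def simulate_genotypes(n_per_group, n_snps, rng, shift=0.2):
    """Toy 0/1/2 genotypes for two groups whose allele frequencies differ."""
    genotypes = [[0] * n_snps for _ in range(2 * n_per_group)]
    for j in range(n_snps):
        p_a = rng.uniform(0.3, 0.7)
        p_b = p_a + rng.choice((-shift, shift))  # stays within [0.1, 0.9]
        for i in range(2 * n_per_group):
            p = p_a if i < n_per_group else p_b
            # Each individual carries two alleles, each drawn independently.
            genotypes[i][j] = (rng.random() < p) + (rng.random() < p)
    return genotypes


def standardize(genotypes):
    """Center each SNP and scale by sqrt(p(1-p)); drop monomorphic SNPs."""
    n = len(genotypes)
    columns = []
    for column in zip(*genotypes):
        p = sum(column) / (2 * n)  # pooled allele frequency estimate
        if 0.0 < p < 1.0:
            scale = math.sqrt(p * (1.0 - p))
            columns.append([(g - 2.0 * p) / scale for g in column])
    return [list(row) for row in zip(*columns)]


def first_pc(rows, rng, n_iter=100):
    """Leading eigenvector of the sample-by-sample matrix (power iteration)."""
    n, m = len(rows), len(rows[0])
    gram = [[sum(a * b for a, b in zip(rows[i], rows[k])) / m
             for k in range(n)] for i in range(n)]
    vec = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(n_iter):
        vec = [sum(g * v for g, v in zip(gram_row, vec)) for gram_row in gram]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec


rng = random.Random(0)  # fixed seed so the toy run is repeatable
N_PER_GROUP = 15
genotypes = simulate_genotypes(N_PER_GROUP, 1000, rng)
pc1 = first_pc(standardize(genotypes), rng)
group_a, group_b = pc1[:N_PER_GROUP], pc1[N_PER_GROUP:]
print("group A PC1 range:", min(group_a), max(group_a))
print("group B PC1 range:", min(group_b), max(group_b))
```

On this simulated data PC1 separates the two groups, which is the kind of axis one would include as a covariate when testing case-control association. A real analysis would involve far more samples and SNPs and would normally rely on dedicated numerical software rather than pure-Python loops.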

Original language: English (US)
Article number: e35235
Journal: PLoS One
Volume: 7
Issue number: 5
DOI: https://doi.org/10.1371/journal.pone.0035235
State: Published - May 9 2012

Fingerprint

Ovarian Neoplasms
ancestry
Genes
Genome
Principal Component Analysis
Chromosomes
inversion polymorphism
Lactase
Chromosomes, Human, Pair 8
Chromosomes, Human, Pair 6
Chromosomes, Human, Pair 2
European Americans
Emigration and Immigration
Northern European region
beta-galactosidase

ASJC Scopus subject areas

  • Agricultural and Biological Sciences (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Medicine (all)

Cite this

Raska, P., Iversen, E., Chen, A., Chen, Z., Fridley, B. L., Permuth-Wey, J., ... Barnholtz-Sloan, J. (2012). European American stratification in ovarian cancer case control data: The utility of genome-wide data for inferring ancestry. PLoS One, 7(5), [e35235]. https://doi.org/10.1371/journal.pone.0035235

@article{fc43da1ab9f54cb982665be1d21f4c0b,
title = "European American stratification in ovarian cancer case control data: The utility of genome-wide data for inferring ancestry",
author = "Paola Raska and Edwin Iversen and Ann Chen and Zhihua Chen and Fridley, {Brooke L.} and Jennifer Permuth-Wey and Tsai, {Ya Yu} and Vierkant, {Robert A.} and Goode, {Ellen L} and Harvey Risch and Schildkraut, {Joellen M.} and Sellers, {Thomas A.} and Jill Barnholtz-Sloan",
year = "2012",
month = "5",
day = "9",
doi = "10.1371/journal.pone.0035235",
language = "English (US)",
volume = "7",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "5",

}

TY - JOUR
T1 - European American stratification in ovarian cancer case control data
T2 - The utility of genome-wide data for inferring ancestry
AU - Raska, Paola
AU - Iversen, Edwin
AU - Chen, Ann
AU - Chen, Zhihua
AU - Fridley, Brooke L.
AU - Permuth-Wey, Jennifer
AU - Tsai, Ya Yu
AU - Vierkant, Robert A.
AU - Goode, Ellen L
AU - Risch, Harvey
AU - Schildkraut, Joellen M.
AU - Sellers, Thomas A.
AU - Barnholtz-Sloan, Jill
PY - 2012/5/9
Y1 - 2012/5/9
UR - http://www.scopus.com/inward/record.url?scp=84860733538&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84860733538&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0035235
DO - 10.1371/journal.pone.0035235
M3 - Article
C2 - 22590501
AN - SCOPUS:84860733538
VL - 7
JO - PLoS One
JF - PLoS One
SN - 1932-6203
IS - 5
M1 - e35235
ER -