Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel

Olivier Delaneau, Jonathan Marchini, Gil A. McVean, Peter Donnelly, Gerton Lunter, Jonathan L. Marchini, Simon Myers, Anjali Gupta-Hinch, Zamin Iqbal, Iain Mathieson, Andy Rimmer, Dionysia K. Xifara, Angeliki Kerasidou, Claire Churchhouse, David M. Altshuler, Stacey B. Gabriel, Eric S. Lander, Namrata Gupta, Mark J. Daly, Mark A. DePristo, Eric Banks, Gaurav Bhatia, Mauricio O. Carneiro, Guillermo Del Angel, Giulio Genovese, Robert E. Handsaker, Chris Hartl, Steven A. McCarroll, James C. Nemesh, Ryan E. Poplin, Stephen F. Schaffner, Khalid Shakir, Pardis C. Sabeti, Sharon R. Grossman, Shervin Tabrizi, Ridhi Tariyal, Heng Li, David Reich, Richard M. Durbin, Matthew E. Hurles, Senduran Balasubramaniam, John Burton, Petr Danecek, Thomas M. Keane, Anja Kolb-Kokocinski, Shane McCarthy, James Stalker, Michael Quail, Qasim Ayub, Yuan Chen, Alison J. Coffey, Vincenza Colonna, Ni Huang, Luke Jostins, Aylwyn Scally, Klaudia Walter, Yali Xue, Yujun Zhang, Ben Blackburne, Sarah J. Lindsay, Zemin Ning, Adam Frankish, Jennifer Harrow, Tyler S. Chris, Gonçalo R. Abecasis, Hyun Min Kang, Paul Anderson, Tom Blackwell, Fabio Busonero, Christian Fuchsberger, Goo Jun, Andrea Maschio, Eleonora Porcu, Carlo Sidore, Adrian Tan, Mary Kate Trost, David R. Bentley, Russell Grocock, Sean Humphray, Terena James, Zoya Kingsbury, Markus Bauer, R. Keira Cheetham, Tony Cox, Michael Eberle, Lisa Murray, Richard Shaw, Aravinda Chakravarti, Andrew G. Clark, Alon Keinan, Juan L. Rodriguez-Flores, Francisco M. De La Vega, Jeremiah Degenhardt, Evan E. Eichler, Paul Flicek, Laura Clarke, Rasko Leinonen, Richard E. Smith, Xiangqun Zheng-Bradley, Kathryn Beal, Fiona Cunningham, Javier Herrero, William M. McLaren, Graham R. S. Ritchie, Jonathan Barker, Gavin Kelman, Eugene Kulesha, Rajesh Radhakrishnan, Asier Roa, Dmitriy Smirnov, Ian Streeter, Iliana Toneva, Richard A. Gibbs, Huyen Dinh, Christie Kovar, Sandra Lee, Lora Lewis, Donna Muzny, Jeff Reid, Min Wang, Fuli Yu, Matthew Bainbridge, Danny Challis, Uday S. Evani, James Lu, Uma Nagaswamy, Aniko Sabo, Yi Wang, Jin Yu, Gerald Fowler, Walker Hale, Divya Kalra, Eric D. Green, Bartha M. Knoppers, Jan O. Korbel, Tobias Rausch, Adrian M. Stütz, Charles Lee, Lauren Griffin, Chih Heng Hsieh, Ryan E. Mills, Marcin Von Grotthuss, Chengsheng Zhang, Xinghua Shi, Hans Lehrach, Ralf Sudbrak, Vyacheslav S. Amstislavskiy, Matthias Lienhard, Florian Mertes, Marc Sultan, Bernd Timmermann, Marie Laure Yaspo, Sudbrak Ralf Herwig, Elaine R. Mardis, Richard K. Wilson, Lucinda Fulton, Robert Fulton, George M. Weinstock, Asif Chinwalla, Li Ding, David Dooling, Daniel C. Koboldt, Michael D. McLellan, John W. Wallis, Michael C. Wendl, Qunyuan Zhang, Gabor T. Marth, Erik P. Garrison, Deniz Kural, Wan Ping Lee, Wen Fung Leong, Alistair N. Ward, Jiantao Wu, Mengyao Zhang, Deborah A. Nickerson, Can Alkan, Fereydoun Hormozdiari, Arthur Ko, Peter H. Sudmant, Jeanette P. Schmidt, Christopher J. Davies, Jeremy Gollub, Teresa Webster, Brant Wong, Yiping Zhan, Stephen T. Sherry, Chunlin Xiao, Deanna Church, Victor Ananiev, Zinaida Belaia, Dimitriy Beloslyudtsev, Nathan Bouk, Chao Chen, Robert Cohen, Charles Cook, John Garner, Timothy Hefferon, Mikhail Kimelman, Chunlei Liu, John Lopez, Peter Meric, Yuri Ostapchuk, Lon Phan, Sergiy Ponomarov, Valerie Schneider, Eugene Shekhtman, Karl Sirotkin, Douglas Slotta, Hua Zhang, Jun Wang, Xiaodong Fang, Xiaosen Guo, Min Jian, Hui Jiang, Xin Jin, Guoqing Li, Jingxiang Li, Yingrui Li, Xiao Liu, Yao Lu, Xuedi Ma, Shuaishuai Tai, Meifang Tang, Bo Wang, Guangbiao Wang, Honglong Wu, Renhua Wu, Ye Yin, Wenwei Zhang, Jiao Zhao, Meiru Zhao, Xiaole Zheng, Hans Lachlan, Lin Fang, Qibin Li, Zhenyu Li, Haoxiang Lin, Binghang Liu, Ruibang Luo, Haojing Shao, Bingqiang Wang, Yinlong Xie, Chen Ye, Chang Yu, Hancheng Zheng, Hongmei Zhu, Hongyu Cai, Hongzhi Cao, Yeyang Su, Zhongming Tian, Huanming Yang, Ling Yang, Jiayong Zhu, Zhiming Cai, Jian Wang, Marcus W. Albrecht, Tatiana A. Borodina, Adam Auton, Seungtai C. Yoon, Jayon Lihm, Vladimir Makarov, Hanjun Jin, Wook Kim, Ki Cheol Kim, Srikanth Gottipati, Danielle Jones, David N. Cooper, Edward V. Ball, Peter D. Stenson, Bret Barnes, Scott Kahn, Kai Ye, Mark A. Batzer, Miriam K. Konkel, Jerilyn A. Walker, Daniel G. MacArthur, Monkol Lek, Mark D. Shriver, Carlos D. Bustamante, Simon Gravel, Eimear E. Kenny, Jeffrey M. Kidd, Phil Lacroute, Brian K. Maples, Andres Moreno-Estrada, Fouad Zakharia, Brenna Henn, Karla Sandoval, Jake K. Byrnes, Eran Halperin, Yael Baran, David W. Craig, Alexis Christoforides, Tyler Izatt, Ahmet A. Kurdoglu, Shripad A. Sinari, Nils Homer, Kevin Squire, Jonathan Sebat, Vineet Bafna, Kenny Ye, Esteban G. Burchard, Ryan D. Hernandez, Christopher R. Gignoux, David Haussler, Sol J. Katzman, W. James Kent, Bryan Howie, Andres Ruiz-Linares, Emmanouil T. Dermitzakis, Tuuli Lappalainen, Scott E. Devine, Xinyue Liu, Ankit Maroo, Luke J. Tallon, Jeffrey A. Rosenfeld, Leslie P. Michelson, Andrea Angius, Francesco Cucca, Serena Sanna, Abigail Bigham, Chris Jones, Fred Reinier, Yun Li, Robert Lyons, David Schlessinger, Philip Awadalla, Alan Hodgkinson, Taras K. Oleksyk, Juan C. Martinez-Cruzado, Yunxin Fu, Xiaoming Liu, Momiao Xiong, Lynn Jorde, David Witherspoon, Jinchuan Xing, Brian L. Browning, Iman Hajirasouliha, Ken Chen, Cornelis A. Albers, Mark B. Gerstein, Alexej Abyzov, Jieming Chen, Yao Fu, Lukas Habegger, Arif O. Harmanci, Xinmeng Jasmine Mu, Cristina Sisu, Suganthi Balasubramanian, Mike Jin, Ekta Khurana, Declan Clarke, Jacob J. Michaelson, Chris O'Sullivan, Kathleen C. Barnes, Neda Gharani, Lorraine H. Toji, Norman Gerry, Jane S. Kaye, Alastair Kent, Rasika Mathias, Pilar N. Ossorio, Michael Parker, Charles N. Rotimi, Charmaine D. Royal, Sarah Tishkoff, Marc Via, Walter Bodmer, Gabriel Bedoya, Gao Yang, Chu Jia You, Andres Garcia-Montero, Alberto Orfao, Julie Dutil, Lisa D. Brooks, Adam L. Felsenfeld, Jean E. McEwen, Nicholas C. Clemm, Mark S. Guyer, Jane L. Peterson, Audrey Duncanson, Michael Dunn, Leena Peltonen

Research output: Contribution to journal (Article)

136 Scopus citations

Abstract

A major use of the 1000 Genomes Project (1000GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First, the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000GP haplotype reference set for use by the human genetics community. Using a set of validation genotypes at SNPs and bi-allelic indels, we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.

Original language: English (US)
Article number: 3934
Journal: Nature Communications
Volume: 5
DOIs: https://doi.org/10.1038/ncomms4934
State: Published - Jun 13 2014
Externally published: Yes

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Chemistry (all)
  • Physics and Astronomy (all)
  • Medicine (all)

Cite this

Delaneau, O., Marchini, J., McVean, G. A., Donnelly, P., Lunter, G., Marchini, J. L., Myers, S., Gupta-Hinch, A., Iqbal, Z., Mathieson, I., Rimmer, A., Xifara, D. K., Kerasidou, A., Churchhouse, C., Altshuler, D. M., Gabriel, S. B., Lander, E. S., Gupta, N., Daly, M. J., ... Peltonen, L. (2014). Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nature Communications, 5, [3934]. https://doi.org/10.1038/ncomms4934