On the suitability of short reads of 16S rRNA for phylogeny-based analyses in environmental surveys

Patricio Jeraldo, Nicholas D Chia, Nigel Goldenfeld

Research output: Contribution to journal › Article

38 Citations (Scopus)

Abstract

Pyrosequencing platforms have been widely used in 16S rRNA deep sequencing of organisms sampled from environmental surveys. Despite the massive number of reads generated by these platforms, the reads only cover short regions of the gene, and the use of these short reads has recently been called into question for phylogeny-based and diversity analyses. We explore the limits of the use of short reads by quantifying the loss of information, and its effect on phylogeny. Using available nearly-full-length reads from published clone libraries and databases, and simulated short reads created from these reads, we show that for selected regions of the gene, short reads contain a surprisingly high amount of biological information, making them suitable to resolve an approximate phylogeny. In particular, we find that the V6 region is significantly poorer than the V1-V3 region in its representation of phylogenetic relationships. We conclude that the use of short reads, combined with a careful choice of the gene region used, and a thorough alignment procedure, can yield phylogenetic information comparable with that obtained from nearly-full-length 16S rRNA reads.
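
The idea of creating simulated short reads from nearly-full-length reads can be illustrated with a minimal sketch. It is not the authors' pipeline: the region coordinates, function names, and toy sequences below are hypothetical placeholders, and the sketch assumes the reads are already aligned so that a region occupies the same columns in every read.

```python
from typing import Dict, Tuple

# Hypothetical column ranges (0-based, half-open) for two regions in an
# alignment of full-length reads. Placeholder values, not taken from the paper.
REGIONS: Dict[str, Tuple[int, int]] = {
    "V1-V3": (0, 12),
    "V6": (20, 26),
}


def simulate_short_read(full_read: str, region: str) -> str:
    """Cut a simulated short read out of an aligned full-length read."""
    start, end = REGIONS[region]
    return full_read[start:end]


def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    # Toy aligned reads (hypothetical data, 26 columns each).
    read_a = "ACGTACGTACGTACGTACGTACGTAC"
    read_b = "ACGTACGTACGAACGTACGTACGTAC"
    for name in REGIONS:
        short_a = simulate_short_read(read_a, name)
        short_b = simulate_short_read(read_b, name)
        print(name, p_distance(short_a, short_b))
    print("full", p_distance(read_a, read_b))
```

Comparing the distance computed on each simulated region with the distance computed on the full reads gives a rough sense of how much pairwise information a region retains; a real analysis would build and compare trees rather than single distances.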

Original language: English (US)
Pages (from-to): 3000-3009
Number of pages: 10
Journal: Environmental Microbiology
Volume: 13
Issue number: 11
DOI: 10.1111/j.1462-2920.2011.02577.x
State: Published - Nov 2011
Externally published: Yes

Fingerprint

Phylogeny
phylogeny
ribosomal RNA
Genes
High-Throughput Nucleotide Sequencing
gene
phylogenetics
Clone Cells
Databases
genes
clone
Surveys and Questionnaires
clones
organisms

ASJC Scopus subject areas

  • Microbiology
  • Ecology, Evolution, Behavior and Systematics

Cite this

On the suitability of short reads of 16S rRNA for phylogeny-based analyses in environmental surveys. / Jeraldo, Patricio; Chia, Nicholas D; Goldenfeld, Nigel.

In: Environmental Microbiology, Vol. 13, No. 11, 11.2011, p. 3000-3009.

@article{d73c625c725d408aaf226bf82ae0afb9,
title = "On the suitability of short reads of 16S rRNA for phylogeny-based analyses in environmental surveys",
abstract = "Pyrosequencing platforms have been widely used in 16S rRNA deep sequencing of organisms sampled from environmental surveys. Despite the massive number of reads generated by these platforms, the reads only cover short regions of the gene, and the use of these short reads has recently been called into question for phylogeny-based and diversity analyses. We explore the limits of the use of short reads by quantifying the loss of information, and its effect on phylogeny. Using available nearly-full-length reads from published clone libraries and databases, and simulated short reads created from these reads, we show that for selected regions of the gene, short reads contain a surprisingly high amount of biological information, making them suitable to resolve an approximate phylogeny. In particular, we find that the V6 region is significantly poorer than the V1-V3 region in its representation of phylogenetic relationships. We conclude that the use of short reads, combined with a careful choice of the gene region used, and a thorough alignment procedure, can yield phylogenetic information comparable with that obtained from nearly-full-length 16S rRNA reads.",
author = "Jeraldo, Patricio and Chia, {Nicholas D} and Goldenfeld, Nigel",
year = "2011",
month = "11",
doi = "10.1111/j.1462-2920.2011.02577.x",
language = "English (US)",
volume = "13",
pages = "3000--3009",
journal = "Environmental Microbiology",
issn = "1462-2912",
publisher = "Wiley-Blackwell",
number = "11",
}

TY - JOUR

T1 - On the suitability of short reads of 16S rRNA for phylogeny-based analyses in environmental surveys

AU - Jeraldo, Patricio

AU - Chia, Nicholas D

AU - Goldenfeld, Nigel

PY - 2011/11

Y1 - 2011/11

N2 - Pyrosequencing platforms have been widely used in 16S rRNA deep sequencing of organisms sampled from environmental surveys. Despite the massive number of reads generated by these platforms, the reads only cover short regions of the gene, and the use of these short reads has recently been called into question for phylogeny-based and diversity analyses. We explore the limits of the use of short reads by quantifying the loss of information, and its effect on phylogeny. Using available nearly-full-length reads from published clone libraries and databases, and simulated short reads created from these reads, we show that for selected regions of the gene, short reads contain a surprisingly high amount of biological information, making them suitable to resolve an approximate phylogeny. In particular, we find that the V6 region is significantly poorer than the V1-V3 region in its representation of phylogenetic relationships. We conclude that the use of short reads, combined with a careful choice of the gene region used, and a thorough alignment procedure, can yield phylogenetic information comparable with that obtained from nearly-full-length 16S rRNA reads.

UR - http://www.scopus.com/inward/record.url?scp=80055082991&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80055082991&partnerID=8YFLogxK

U2 - 10.1111/j.1462-2920.2011.02577.x

DO - 10.1111/j.1462-2920.2011.02577.x

M3 - Article

C2 - 21910812

AN - SCOPUS:80055082991

VL - 13

SP - 3000

EP - 3009

JO - Environmental Microbiology

JF - Environmental Microbiology

SN - 1462-2912

IS - 11

ER -