Abstract
We assembled the sequences from deep RNA sequencing experiments conducted by the Genotype-Tissue Expression (GTEx) project to create a new catalog of human genes and transcripts, called CHESS. The new database contains 42,611 genes, of which 20,352 are potentially protein-coding and 22,259 are noncoding, and a total of 323,258 transcripts. These include 224 novel protein-coding genes and 116,156 novel transcripts. We detected over 30 million additional transcripts at more than 650,000 genomic loci, nearly all of which are likely nonfunctional, revealing a heretofore unappreciated amount of transcriptional noise in human cells. The CHESS database is available at http://ccb.jhu.edu/chess.
Original language | English (US) |
---|---|
Article number | 208 |
Journal | Genome Biology |
Volume | 19 |
Issue number | 1 |
State | Published - Nov 28 2018 |
Keywords
- GTEx
- Human gene count
- RNA sequencing
- Transcriptome
- Transcriptome assembly
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics
- Genetics
- Cell Biology