We describe a method for identifying candidate cancer biomarkers by analyzing numeric approximations of the tissue specificity of human genes. These approximations were calculated from predicted tissue expression distributions, which were derived by mapping expressed sequence tags (ESTs) to the human genome sequence using a binary indexing algorithm. The tissue-specificity values facilitated high-throughput analysis of human genes and enabled the identification of genes highly specific to particular tissues. Tissue expression distributions for several genes were compared with estimates obtained from other public gene expression datasets and were experimentally validated using quantitative RT-PCR on RNA isolated from several human tissues. Our results demonstrate that most human genes (∼98%) are expressed in many tissues (low specificity) and that only a small number of genes possess highly specific tissue expression profiles. These genes constitute a rich dataset from which novel therapeutic targets and novel diagnostic serum biomarkers may be selected.
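The idea of reducing a gene's tissue expression distribution to a single specificity value can be illustrated with a minimal sketch. The abstract does not state the formula used, so this example substitutes the tau index, a widely used specificity measure, as a stand-in. It should not be read as the authors' method. The function names, the tissue names, and the EST counts are all hypothetical, and the normalization by library size is an assumption about how raw EST counts might be made comparable across tissues.

```python
from typing import Dict


def expression_distribution(
    est_counts: Dict[str, int], library_sizes: Dict[str, int]
) -> Dict[str, float]:
    """Turn per-tissue EST counts into a distribution that sums to 1.

    Assumption: counts are divided by the total number of ESTs sampled
    from each tissue (library size, must be positive) before rescaling.
    """
    rates = {t: est_counts.get(t, 0) / library_sizes[t] for t in library_sizes}
    total = sum(rates.values())
    if total == 0:
        # Gene has no mapped ESTs in any tissue.
        return {t: 0.0 for t in rates}
    return {t: r / total for t, r in rates.items()}


def tau_specificity(distribution: Dict[str, float]) -> float:
    """Tau index: 0 for uniform expression, 1 for a single-tissue gene.

    This is a stand-in measure, not the one defined in the paper.
    """
    values = list(distribution.values())
    n = len(values)
    peak = max(values) if values else 0.0
    if n < 2 or peak == 0:
        return 0.0
    return sum(1 - v / peak for v in values) / (n - 1)


# Hypothetical data: EST library sizes and counts for two made-up genes.
library_sizes = {"liver": 100, "brain": 100, "prostate": 100}
broad_gene = {"liver": 5, "brain": 5, "prostate": 5}
specific_gene = {"prostate": 5}

broad_tau = tau_specificity(expression_distribution(broad_gene, library_sizes))
specific_tau = tau_specificity(expression_distribution(specific_gene, library_sizes))
```

Under this stand-in measure, a gene with ESTs spread evenly across tissues scores at the low end of the scale and a gene seen in only one tissue scores at the high end, which mirrors the low-specificity versus high-specificity distinction drawn in the abstract.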
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics