TY - JOUR
T1 - Significance analysis of microarray transcript levels in time series experiments
AU - Di Camillo, Barbara
AU - Toffolo, Gianna
AU - Nair, Sreekumaran K.
AU - Greenlund, Laura J.
AU - Cobelli, Claudio
PY - 2007/3/8
Y1 - 2007/3/8
AB - Background: Microarray time series studies are essential for understanding the dynamics of molecular events. To limit the analysis to those genes that change expression over time, a necessary first step is to select differentially expressed transcripts. A variety of methods have been proposed for this purpose; however, these methods are seldom applicable in practice since they require a large number of replicates, often available only for a limited number of samples. In this data-poor context, we evaluate the performance of three selection methods, using synthetic data, over a range of experimental conditions. Application to real data is also discussed. Results: Three methods are considered for identifying differentially expressed genes in data-poor conditions. Method 1 uses a threshold on individual samples based on a model of the experimental error. Method 2 calculates the area of the region bounded by the time series expression profiles, and considers the gene differentially expressed if the area exceeds a threshold based on a model of the experimental error. These two methods are compared to Method 3, recently proposed in the literature, which uses spline fits to compare time series profiles. Application of the three methods to synthetic data indicates that Method 2 outperforms the other two in both Precision and Recall when short time series are analyzed, while Method 3 outperforms the other two for long time series. Conclusion: These results help guide the choice of algorithm for data-poor time series expression studies, depending on the length of the time series.
UR - http://www.scopus.com/inward/record.url?scp=34248145293&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34248145293&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-8-S1-S10
DO - 10.1186/1471-2105-8-S1-S10
M3 - Article
C2 - 17430554
AN - SCOPUS:34248145293
SN - 1471-2105
VL - 8
JO - BMC bioinformatics
JF - BMC bioinformatics
IS - SUPPL. 1
M1 - S10
ER -