TY - JOUR
T1 - Inferring gene regulatory networks by integrating ChIP-seq/chip and transcriptome data via LASSO-type regularization methods
AU - Qin, Jing
AU - Hu, Yaohua
AU - Xu, Feng
AU - Yalamanchili, Hari Krishna
AU - Wang, Junwen
N1 - Funding Information:
This work was supported by funding from the Research Grants Council, Hong Kong SAR, China (Grant number 781511M), and the National Natural Science Foundation of China, China (Grant numbers 91229105 and 11101186).
PY - 2014/6/1
Y1 - 2014/6/1
N2 - Inferring gene regulatory networks from gene expression data at the whole-genome level remains an arduous challenge, especially in higher organisms, where the number of genes is large but the number of experimental samples is small. The accuracy of current methods at genome scale has been reported to drop significantly from Escherichia coli to Saccharomyces cerevisiae because of the increase in the number of genes. This limits the applicability of current methods to more complex genomes, such as those of human and mouse. The least absolute shrinkage and selection operator (LASSO) is widely used for gene regulatory network inference from gene expression profiles. However, the accuracy of LASSO on large genomes is not satisfactory. In this study, we apply two extended models of LASSO, the L0 and L1/2 regularization models, to infer gene regulatory networks from both high-throughput gene expression data and transcription factor binding data in mouse embryonic stem cells (mESCs). We find that both the L0 and L1/2 regularization models significantly outperform LASSO in network inference. Incorporating interactions between transcription factors and their targets remarkably improved the prediction accuracy. The current study demonstrates the efficiency and applicability of these two models for gene regulatory network inference from integrative omics data in large genomes. Applying the two models will help biologists study gene regulation in higher model organisms on a genome-wide scale.
AB - Inferring gene regulatory networks from gene expression data at the whole-genome level remains an arduous challenge, especially in higher organisms, where the number of genes is large but the number of experimental samples is small. The accuracy of current methods at genome scale has been reported to drop significantly from Escherichia coli to Saccharomyces cerevisiae because of the increase in the number of genes. This limits the applicability of current methods to more complex genomes, such as those of human and mouse. The least absolute shrinkage and selection operator (LASSO) is widely used for gene regulatory network inference from gene expression profiles. However, the accuracy of LASSO on large genomes is not satisfactory. In this study, we apply two extended models of LASSO, the L0 and L1/2 regularization models, to infer gene regulatory networks from both high-throughput gene expression data and transcription factor binding data in mouse embryonic stem cells (mESCs). We find that both the L0 and L1/2 regularization models significantly outperform LASSO in network inference. Incorporating interactions between transcription factors and their targets remarkably improved the prediction accuracy. The current study demonstrates the efficiency and applicability of these two models for gene regulatory network inference from integrative omics data in large genomes. Applying the two models will help biologists study gene regulation in higher model organisms on a genome-wide scale.
KW - ChIP-seq/chip
KW - Gene regulatory networks
KW - Integrative omics data
KW - LASSO-type regularization methods
KW - Transcriptome
UR - http://www.scopus.com/inward/record.url?scp=84901463273&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84901463273&partnerID=8YFLogxK
U2 - 10.1016/j.ymeth.2014.03.006
DO - 10.1016/j.ymeth.2014.03.006
M3 - Article
C2 - 24650566
AN - SCOPUS:84901463273
SN - 1046-2023
VL - 67
SP - 294
EP - 303
JO - Methods
JF - Methods
IS - 3
ER -