TY - GEN
T1 - Classifying protein complexes from candidate subgraphs using fuzzy machine learning model
AU - Xu, Bo
AU - Lin, Hongfei
AU - Yang, Zhihao
AU - Wagholikar, Kavishwar B.
AU - Liu, Hongfang
PY - 2012
Y1 - 2012
N2 - Many computational methods have been applied to identify protein complexes from experimentally obtained protein-protein interaction (PPI) networks. The task is very challenging because of unreliable interactions in PPI networks, the multi-functionality of proteins, and the complex connectivity of the PPI network. In this study, we address unreliable interactions in protein complex identification using Genetic-Algorithm Fuzzy Naïve Bayes (GAFNB), which takes unreliability into consideration. Many existing methods can produce large numbers of candidate subgraphs, so we focus on classifying protein complexes from these subgraphs by considering the fuzzy attribute of PPI. We experimented with two datasets of size 10,371 and 986, each containing 493 positive protein complexes from the MIPS and TAP-MS datasets. We compared the performance of GAFNB with Naïve Bayes (NB). The results show that GAFNB performed better, which indicates that a fuzzy model is more suitable when unreliability is present. It is necessary to consider unreliability when identifying protein complexes.
AB - Many computational methods have been applied to identify protein complexes from experimentally obtained protein-protein interaction (PPI) networks. The task is very challenging because of unreliable interactions in PPI networks, the multi-functionality of proteins, and the complex connectivity of the PPI network. In this study, we address unreliable interactions in protein complex identification using Genetic-Algorithm Fuzzy Naïve Bayes (GAFNB), which takes unreliability into consideration. Many existing methods can produce large numbers of candidate subgraphs, so we focus on classifying protein complexes from these subgraphs by considering the fuzzy attribute of PPI. We experimented with two datasets of size 10,371 and 986, each containing 493 positive protein complexes from the MIPS and TAP-MS datasets. We compared the performance of GAFNB with Naïve Bayes (NB). The results show that GAFNB performed better, which indicates that a fuzzy model is more suitable when unreliability is present. It is necessary to consider unreliability when identifying protein complexes.
KW - Machine Learning
KW - Naïve Bayes
KW - Protein complexes
UR - http://www.scopus.com/inward/record.url?scp=84875590936&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84875590936&partnerID=8YFLogxK
U2 - 10.1109/BIBMW.2012.6470213
DO - 10.1109/BIBMW.2012.6470213
M3 - Conference contribution
AN - SCOPUS:84875590936
SN - 9781467327466
T3 - Proceedings - 2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2012
SP - 640
EP - 647
BT - Proceedings - 2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2012
T2 - 2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2012
Y2 - 4 October 2012 through 7 October 2012
ER -