Background: Many computational approaches have been developed to detect protein complexes from proteinprotein interaction (PPI) networks. However, these PPI networks are always built from high-throughput experiments. The presence of unreliable interactions in PPI network makes this task very challenging. Methods: In this study, we proposed a Genetic-Algorithm Fuzzy Naïve Bayes (GAFNB) filter to classify the protein complexes from candidate subgraphs. It takes unreliability into consideration and tackles the presence of unreliable interactions in protein complex. We first got candidate protein complexes through existed popular methods. Each candidate protein complex is represented by 29 graph features and 266 biological property based features. GAFNB model is then applied to classify the candidate complexes into positive or negative. Results: Our evaluation indicates that the protein complex identification algorithms using the GAFNB model filtering outperform original ones. For evaluation of GAFNB model, we also compared the performance of GAFNB with Naïve Bayes (NB). Results show that GAFNB performed better than NB. It indicates that a fuzzy model is more suitable when unreliability is present. Conclusions: We conclude that filtering candidate protein complexes with GAFNB model can improve the effectiveness of protein complex identification. It is necessary to consider the unreliability in this task.
ASJC Scopus subject areas
- Molecular Biology