TY - JOUR
T1 - NetGO: improving large-scale protein function prediction with massive network information
AU - You, Ronghui
AU - Yao, Shuwei
AU - Xiong, Yi
AU - Huang, Xiaodi
AU - Sun, Fengzhu
AU - Mamitsuka, Hiroshi
AU - Zhu, Shanfeng
N1 - © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2019/7/2
Y1 - 2019/7/2
AB - Automated function prediction (AFP) of proteins is of great significance in biology. AFP can be regarded as a large-scale multi-label classification problem in which a protein can be associated with multiple Gene Ontology terms as its labels. Based on our GOLabeler, a state-of-the-art method for the third Critical Assessment of Functional Annotation (CAFA3), in this paper we propose NetGO, a web server that further improves the performance of large-scale AFP by incorporating massive protein-protein network information. Specifically, NetGO has three advantages in its use of network information: (i) NetGO relies on a powerful learning-to-rank framework from machine learning to effectively integrate both sequence and network information of proteins; (ii) NetGO uses the massive network information of all species (>2000) in STRING, rather than that of only a few specific species; and (iii) NetGO can still use network information to annotate a protein by homology transfer, even if the protein is not contained in STRING. Separating training and testing data with the same time-delayed settings as CAFA, we comprehensively examined the performance of NetGO. Experimental results demonstrate that NetGO significantly outperforms GOLabeler and other competing methods. The NetGO web server is freely available at http://issubmission.sjtu.edu.cn/netgo/.
UR - https://www.biorxiv.org/content/early/2018/10/11/439554
UR - http://www.mendeley.com/research/netgo-improving-largescale-protein-function-prediction-massive-network-information
U2 - 10.1093/nar/gkz388
DO - 10.1093/nar/gkz388
M3 - Article
C2 - 31106361
SN - 0305-1048
VL - 47
SP - W379
EP - W387
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - W1
ER -