Identification of Human Protein Subcellular Location with Multiple Networks
Current Proteomics (IF 0.8), Pub Date: 2022-06-01, DOI: 10.2174/1570164619666220531113704
Lei Chen, Rui Wang

Background: Protein function is closely related to the protein's location within the cell, so determining a protein's subcellular location helps uncover its functions. However, traditional biological experiments for determining subcellular location are costly and inefficient, and cannot meet today's needs. In recent years, many computational models have been built to identify the subcellular location of proteins. Most of these models use features derived from protein sequences. Recently, features extracted from protein-protein interaction (PPI) networks have become popular in studying various protein-related problems.

Objective: A novel model with features derived from multiple PPI networks was proposed to predict protein subcellular location.

Methods: Protein features were obtained with a newly designed network embedding algorithm, Mnode2vec, which is a generalized version of the classic Node2vec algorithm. Two classic classification algorithms, support vector machine and random forest, were employed to build the model.

Results: The model performed well and was superior to a model using features extracted by Node2vec. It also outperformed some classic models. Furthermore, Mnode2vec was found to produce powerful features when the path length was small.

Conclusion: The proposed model can be a powerful tool for determining protein subcellular location, and Mnode2vec can efficiently extract informative features from multiple networks.

Updated: 2022-06-01