Bond strength between receptor binding domain of spike protein and human angiotensin converting enzyme-2 using machine learning.
bioRxiv - Biophysics. Pub Date: 2024-04-18, DOI: 10.1101/2024.04.16.589808
Abdulmateen Adebiyi, Puja Adhikari, Praveen Rao, Wai-Yim Ching

The spike protein (S-protein) of SARS-CoV-2 plays an important role in binding, fusion, and host entry. In this study, we used machine learning (ML) to predict the interatomic bond strength between the receptor binding domain (RBD) and angiotensin converting enzyme-2 (ACE2), obtaining predictions that match the results of expensive ab initio calculations. We collected bond order results from ab initio calculations and selected a total of 18 variables, such as bond type, bond length, elements and their coordinates, and others, to train the ML models. We then trained five well-known regression models: Decision Tree regression, KNN regression, XGBoost, Lasso regression, and Ridge regression. We tested these models on two datasets, wild type (WT) and Omicron variant (OV). In the first setting, we used 90% of each dataset for training and 10% for testing to predict the bond order. XGBoost outperformed all the other models on the WT dataset, achieving an R² score of 0.997. It also outperformed all the other models on the OV dataset, with an R² score of 0.9998. In the second setting, we trained all the models on the WT (or OV) dataset and predicted the bond order on the OV (or WT) dataset. Interestingly, Decision Tree outperformed all the other models in both cases. It achieved an R² score of 0.997.
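
The two evaluation settings described in the abstract can be illustrated with a small, standard-library-only sketch. It is not the authors' pipeline: the data are synthetic, a single hypothetical feature (bond length) replaces the 18 variables, a hand-written KNN regressor stands in for the five models, and all function names (`make_synthetic`, `split_90_10`, `evaluate`) are assumptions made for illustration. The only part taken directly from the text is the protocol: a 90%/10% split within one dataset, then training on one dataset and predicting on the other, both scored with R².

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_X, train_y, x, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def make_synthetic(n, seed):
    """Synthetic stand-in data (assumption): bond order decays with bond length."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        length = rng.uniform(1.0, 3.5)  # hypothetical bond length
        X.append((length,))
        y.append(math.exp(-1.5 * (length - 1.0)) + rng.gauss(0.0, 0.005))
    return X, y


def split_90_10(X, y, seed=0):
    """Shuffle indices, then use 90% for training and 10% for testing."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(0.9 * len(idx))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def evaluate(train_X, train_y, test_X, test_y, k=3):
    """Fit-free KNN evaluation: predict each test point and return R²."""
    preds = [knn_predict(train_X, train_y, x, k) for x in test_X]
    return r2_score(test_y, preds)


# Two synthetic datasets standing in for WT and OV (not real data).
wt_X, wt_y = make_synthetic(300, seed=1)
ov_X, ov_y = make_synthetic(300, seed=2)

# Setting 1: 90% train / 10% test within a single dataset.
r2_within = evaluate(*split_90_10(wt_X, wt_y))

# Setting 2: train on one dataset, predict on the other.
r2_cross = evaluate(wt_X, wt_y, ov_X, ov_y)

print(f"Setting 1 (WT 90/10 split): R2 = {r2_within:.4f}")
print(f"Setting 2 (train WT, test OV): R2 = {r2_cross:.4f}")
```

The scores printed by this sketch reflect only the synthetic data and say nothing about the values reported in the abstract. In practice, library implementations of the five regressors would replace the hand-written KNN, but the evaluation logic would keep the same shape.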

Updated: 2024-04-19