Improved QSAR models for PARP-1 inhibition using data balancing, interpretable machine learning, and matched molecular pair analysis
Molecular Diversity (IF 3.8), Pub Date: 2024-02-20, DOI: 10.1007/s11030-024-10809-9
Anish Gomatam, Bhakti Umesh Hirlekar, Krishan Dev Singh, Upadhyayula Suryanarayana Murty, Vaibhav A. Dixit

The poly(ADP-ribose) polymerase-1 (PARP-1) enzyme is an important target in the treatment of breast cancer. Current treatment options include the drugs Olaparib, Niraparib, Rucaparib, and Talazoparib; however, these drugs can cause severe side effects, including hematological toxicity and cardiotoxicity. Although in silico models for predicting PARP-1 activity have been developed, they suffer from low specificity, a narrow applicability domain, and a lack of interpretability. To address these issues, a comprehensive machine learning (ML)-based quantitative structure–activity relationship (QSAR) approach for the informed prediction of PARP-1 activity is presented. Classification using the Synthetic Minority Oversampling Technique (SMOTE) for data balancing gave robust and predictive models based on the K-nearest neighbor algorithm (accuracy 0.86, sensitivity 0.88, specificity 0.80). Regression models were built on structurally congeneric datasets, with the models for the phthalazinone class and fused cyclic compounds giving the best performance. In accordance with the Organisation for Economic Co-operation and Development (OECD) guidelines, a mechanistic interpretation is proposed using Shapley Additive Explanations (SHAP) to identify the topological features that are important for differentiating between PARP-1 actives and inactives. Moreover, an analysis of the PARP-1 dataset revealed the prevalence of activity cliffs, which may negatively impact the models' predictive performance. Finally, a set of chemical transformation rules was extracted using matched molecular pair analysis (MMPA); these rules provide mechanistic insights and can guide medicinal chemists in the design of novel PARP-1 inhibitors.
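To make the data-balancing step and the reported metrics more concrete, the following is a minimal sketch of the idea behind SMOTE and of how accuracy, sensitivity, and specificity are derived from predictions. It is not the authors' implementation: the function names `smote` and `classification_metrics`, the two-dimensional toy "descriptors", and the label vectors are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic minority samples (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours,
    which is the core idea of SMOTE.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = active)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of actives recovered
        "specificity": tn / (tn + fp),  # fraction of inactives recovered
    }


# Toy data (assumed, not from the paper): four minority-class points.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_points, n_new=6, k=2)

# Toy predictions (assumed): 4 actives, 2 inactives.
metrics = classification_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
print(len(new_points), metrics)
```

Oversampling of this kind is normally applied to the training split only, so that the reported metrics reflect performance on real, unsynthesized compounds.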




Updated: 2024-02-21