Leveraging 2D molecular graph pretraining for improved 3D conformer generation with graph neural networks
Computers & Chemical Engineering (IF 4.3), Pub Date: 2024-02-07, DOI: 10.1016/j.compchemeng.2024.108622
Kumail Alhamoud, Yasir Ghunaim, Abdulelah S. Alshehri, Guohao Li, Bernard Ghanem, Fengqi You

Predicting stable 3D molecular conformations from 2D molecular graphs is a challenging and resource-intensive task, yet it is critical for various applications, particularly drug design. Density functional theory (DFT) calculations set the standard for molecular conformation generation, yet they are computationally intensive. Deep learning offers more computationally efficient approaches, but struggles to match DFT accuracy, particularly on complex drug-like structures. Additionally, the steep computational demands of assembling 3D molecular datasets constrain the broader adoption of deep learning. This work aims to utilize the abundant 2D molecular graph datasets for pretraining a machine learning model, a step that involves initially training the model on a different task with a wealth of data before fine-tuning it for the target task of 3D conformation generation. We build on GeoMol, an end-to-end graph neural network (GNN) method for predicting atomic 3D structures and torsion angles. We examine the limitations of the GeoMol method and introduce new baselines to enhance molecular graph embeddings. Our computational results show that 2D molecular graph pretraining enhances the quality of generated 3D conformers, yielding a 7.7% average improvement over state-of-the-art sequential methods. These advancements not only facilitate superior 3D conformation generation but also emphasize the potential of leveraging pretrained graph embeddings to boost performance in 3D chemical tasks with GNNs.

Updated: 2024-02-07