Meta learning addresses noisy and under-labeled data in machine learning-guided antibody engineering
Cell Systems (IF 9.3) Pub Date: 2024-01-08, DOI: 10.1016/j.cels.2023.12.003
Mason Minot, Sai T. Reddy

Machine learning-guided protein engineering is rapidly progressing; however, collecting high-quality, large datasets remains a bottleneck. Directed evolution and protein engineering studies often require extensive experimental processes to eliminate noise and label protein sequence-function data. Meta learning has proven effective in other fields in learning from noisy data via bi-level optimization given the availability of a small dataset with trusted labels. Here, we leverage meta learning approaches to overcome noisy and under-labeled data and expedite workflows in antibody engineering. We generate yeast display antibody mutagenesis libraries and screen them for target antigen binding followed by deep sequencing. We then create representative learning tasks, including learning from noisy training data, positive and unlabeled learning, and learning out-of-distribution properties. We demonstrate that meta learning has the potential to reduce experimental screening time and improve the robustness of machine learning models by training with noisy and under-labeled training data.
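The abstract does not spell out the bi-level optimization it refers to, so the following is only a minimal sketch of one common variant, not the authors' method. It assumes a logistic-regression model and a one-step example-reweighting rule: the inner level takes a weighted gradient step on the noisy data, and the outer level scores each noisy example by how much a step on it would lower the loss on the small trusted set. To first order that score is the dot product between the example's gradient and the trusted-set gradient. All function names, the learning rate, and the toy data are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def per_example_grads(theta, X, y):
    """Gradient of the logistic loss for each row of X, shape (n, d)."""
    return (sigmoid(X @ theta) - y)[:, None] * X


def example_weights(theta, X_noisy, y_noisy, X_trusted, y_trusted):
    """Outer level: weight noisy examples by agreement with the trusted set."""
    g_noisy = per_example_grads(theta, X_noisy, y_noisy)
    g_trusted = per_example_grads(theta, X_trusted, y_trusted).mean(axis=0)
    # A positive dot product means a gradient step on this example would
    # also reduce the trusted-set loss (first-order approximation).
    w = np.maximum(0.0, g_noisy @ g_trusted)
    total = w.sum()
    return w / total if total > 0 else w


def reweighted_fit(X_noisy, y_noisy, X_trusted, y_trusted, lr=0.5, steps=200):
    """Inner level: gradient descent on the reweighted noisy loss."""
    theta = np.zeros(X_noisy.shape[1])
    for _ in range(steps):
        w = example_weights(theta, X_noisy, y_noisy, X_trusted, y_trusted)
        g = per_example_grads(theta, X_noisy, y_noisy)
        theta -= lr * (w @ g)
    return theta


# Toy data (assumed for illustration): two identical inputs, one mislabeled.
X_noisy = np.array([[1.0, 0.0], [1.0, 0.0]])
y_noisy = np.array([1.0, 0.0])  # the second label is deliberately wrong
X_trusted = np.array([[1.0, 0.0]])
y_trusted = np.array([1.0])

weights = example_weights(np.zeros(2), X_noisy, y_noisy, X_trusted, y_trusted)
theta = reweighted_fit(X_noisy, y_noisy, X_trusted, y_trusted)
```

In this toy case the mislabeled example's gradient points against the trusted gradient, so it receives zero weight and the fit is driven by the correctly labeled example alone. Real sequence-function data would need a richer model and mini-batched trusted gradients.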




Updated: 2024-01-08