Assisted neuroscience knowledge extraction via machine learning applied to neural reconstruction metadata on NeuroMorpho.Org
Brain Informatics, Pub Date: 2022-11-07, DOI: 10.1186/s40708-022-00174-4
Kayvan Bijari 1, 2 , Yasmeen Zoubi 1, 2 , Giorgio A Ascoli 1, 2, 3

The amount of unstructured text produced daily in scholarly journals is enormous. Systematically identifying, sorting, and structuring information from such a volume of data is increasingly challenging for researchers even in delimited domains. Named entity recognition is a fundamental natural language processing tool that can be trained to annotate, structure, and extract information from scientific articles. Here, we harness state-of-the-art machine learning techniques and develop a smart neuroscience metadata suggestion system accessible both by humans through a user-friendly graphical interface and by machines via an Application Programming Interface. We demonstrate a practical application to the public repository of neural reconstructions, NeuroMorpho.Org, thus expanding the existing web-based metadata management system currently in use. Quantitative analysis indicates that the suggestion system reduces personnel labor by at least 50%. Moreover, our results show that larger training datasets with the same software architecture are unlikely to further improve performance without ad-hoc heuristics, due to intrinsic ambiguities in neuroscience nomenclature. All components of this project are released open source for community enhancement and extensions to additional applications.

Updated: 2022-11-07