Challenges of a Data Ecosystem for scientific data
Data & Knowledge Engineering (IF 2.5), Pub Date: 2023-10-19, DOI: 10.1016/j.datak.2023.102236
Edoardo Ramalli, Barbara Pernici

Data Ecosystems (DEs) are used across various fields and applications. They facilitate collaboration between organizations, such as companies or research institutions, enabling them to share data and services. A DE can boost research outcomes by managing, and extracting value from, the growing volume of data generated and shared over recent decades. However, adopting DE solutions for scientific data remains difficult for R&D departments and scientific communities. Scientific data are challenging to manage and, as a result, a considerable part of this information still needs to be annotated and organized before it can be shared. This work discusses the challenges of employing DEs in scientific domains and the corresponding potential mitigations. First, scientific data and their typologies are contextualized; then their unique characteristics are discussed. Their typically high heterogeneity and uncertainty make it difficult to assess their consistency and accuracy. In addition, this work discusses the specific requirements that scientific communities express when integrating a DE solution into their workflows. Together, the unique properties of scientific data and these domain-specific requirements create a challenging setting for adopting DEs. The challenges are formulated as general research questions, and this work explores the corresponding solutions in terms of data management aspects. Finally, the paper presents a real-world scenario with more technical details.




Updated: 2023-10-19