Privacy-preserving federated machine learning on FAIR health data: A real-world application, Computational and Structural Biotechnology Journal



Computational and Structural Biotechnology Journal (IF 6), Pub Date: 2024-02-17, DOI: 10.1016/j.csbj.2024.02.014
A. Anil SINACI, Mert GENCTURK, Celia ALVAREZ-ROMERO, Gokce Banu LALECI ERTURKMEN, Alicia MARTINEZ-GARCIA, María José ESCALONA-CUARESMA, Carlos Luis PARRA-CALDERON

This paper introduces a privacy-preserving federated machine learning (ML) architecture built upon Findable, Accessible, Interoperable, and Reusable (FAIR) health data. The aim is an architecture that executes classification algorithms in a federated manner, so that health data owners can build models collaboratively without sharing their datasets. Using an agent-based architecture, a privacy-preserving federated ML algorithm was developed to create a global predictive model from multiple local models. The work involved formally defining the algorithm in two steps, data preparation and federated model training on FAIR health data, and constructing an architecture whose components support the algorithm's execution. The solution was validated by five healthcare organizations using their own health datasets. The five organizations transformed their datasets into Health Level 7 Fast Healthcare Interoperability Resources (HL7 FHIR) via a common FAIRification workflow and software set, thereby generating FAIR datasets. Each organization deployed a Federated ML Agent within its secure network, connected to a cloud-based Federated ML Manager. System testing was conducted on a use case predicting 30-day readmission risk for chronic obstructive pulmonary disease (COPD) patients, and the federated model achieved an accuracy of 87%. The paper demonstrates a practical application of privacy-preserving federated ML among five distinct healthcare entities and highlights the value of FAIR health data for machine learning when used in a federated manner that protects privacy without sharing data. The solution leverages FAIR datasets from multiple healthcare organizations for federated ML while safeguarding sensitive health datasets and meeting legislative privacy and security requirements.
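The agent/manager split described in the abstract can be illustrated with a minimal sketch. The abstract does not state how local models are combined, so the code below assumes a sample-weighted average of logistic regression weights (a FedAvg-style rule) purely as a stand-in. The function names (train_local, aggregate, federated_train) and the synthetic five-site data are hypothetical and are not taken from the paper or its software.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_local(weights, rows, labels, lr=0.5, epochs=20):
    """Hypothetical agent step: gradient descent on one site's data only."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    # Only the weights and the sample count leave the site, never the rows.
    return w, len(rows)

def aggregate(updates):
    """Hypothetical manager step: sample-weighted average of local weights.
    This rule is an assumption; the abstract does not specify the aggregation."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[j] * n for w, n in updates) / total for j in range(dim)]

def federated_train(sites, dim, rounds=10):
    """Alternate local training and central aggregation for a few rounds."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [train_local(global_w, rows, labels) for rows, labels in sites]
        global_w = aggregate(updates)
    return global_w

def predict(weights, x):
    return 1 if sigmoid(sum(w * xi for w, xi in zip(weights, x))) >= 0.5 else 0

def make_site(rng, n):
    """Synthetic stand-in data: [bias, feature]; label is 1 when feature > 0."""
    rows = [[1.0, rng.uniform(-1, 1)] for _ in range(n)]
    labels = [1 if r[1] > 0 else 0 for r in rows]
    return rows, labels

rng = random.Random(0)
sites = [make_site(rng, n) for n in (30, 50, 40, 60, 20)]  # five simulated sites
model = federated_train(sites, dim=2)
```

In this sketch each simulated site returns only a weight vector and a sample count, which mirrors the idea that datasets stay inside each organization's network. A real deployment would add transport security, authentication, and the FHIR-based data preparation step, none of which is modeled here.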


Updated: 2024-02-17
