Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers
Nature Machine Intelligence (IF 23.8) · Pub Date: 2024-02-07 · DOI: 10.1038/s42256-023-00773-8
Harry Coppock, George Nicholson, Ivan Kiskin, Vasiliki Koutra, Kieran Baker, Jobie Budd, Richard Payne, Emma Karoune, David Hurley, Alexander Titcomb, Sabrina Egglestone, Ana Tendero Cañadas, Lorraine Butler, Radka Jersakova, Jonathon Mellor, Selina Patel, Tracey Thornley, Peter Diggle, Sylvia Richardson, Josef Packham, Björn W. Schuller, Davide Pigoli, Steven Gilmour, Stephen Roberts, Chris Holmes

Recent work has reported that respiratory audio-trained AI classifiers can accurately predict SARS-CoV-2 infection status. However, it has not yet been determined whether such model performance is driven by latent audio biomarkers with true causal links to SARS-CoV-2 infection or by confounding effects, such as recruitment bias, present in observational studies. Here we undertake a large-scale study of audio-based AI classifiers as part of the UK government’s pandemic response. We collect a dataset of audio recordings from 67,842 individuals, with linked metadata, of whom 23,514 had positive polymerase chain reaction tests for SARS-CoV-2. In an unadjusted analysis, similar to that in previous works, AI classifiers predict SARS-CoV-2 infection status with high accuracy (ROC–AUC = 0.846 [0.838–0.854]). However, after matching on measured confounders, such as self-reported symptoms, performance is much weaker (ROC–AUC = 0.619 [0.594–0.644]). Upon quantifying the utility of audio-based classifiers in practical settings, we find them to be outperformed by predictions on the basis of user-reported symptoms. We make best-practice recommendations for handling recruitment bias, and for assessing audio-based classifiers by their utility in relevant practical settings. Our work provides insights into the value of AI audio analysis and the importance of study design and treatment of confounders in AI-enabled diagnostics.
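The gap between the unadjusted and matched analyses can be sketched with a toy simulation. All numbers below are illustrative assumptions, not the study's data, and stratifying on a single symptom flag is only a crude stand-in for the paper's matched design: when a confounder (here, symptoms) raises both the infection rate and the audio score, an unadjusted ROC-AUC credits the symptom signal to the audio classifier, and evaluating within confounder strata removes that boost.

```python
import random

def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney rank-sum formulation (assumes no ties)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(r + 1 for r, i in enumerate(order) if labels[i] == 1)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

random.seed(0)

# Hypothetical toy population: symptoms raise both the infection probability
# and the mean audio score, while infection itself shifts the score only weakly.
data = []
for _ in range(5000):
    symptomatic = random.random() < 0.4
    infected = random.random() < (0.55 if symptomatic else 0.12)
    score = random.gauss(1.0 if symptomatic else 0.0, 1.0)   # symptom-driven component
    score += random.gauss(0.15 if infected else 0.0, 0.2)    # weak "true" audio signal
    data.append((symptomatic, infected, score))

labels = [int(i) for _, i, _ in data]
scores = [s for _, _, s in data]
unadjusted = roc_auc(labels, scores)
print(f"unadjusted ROC-AUC: {unadjusted:.3f}")

# Evaluate within each symptom stratum, so the confounder can no longer
# drive the ranking of cases over non-cases.
strata_auc = {}
for flag, name in ((True, "symptomatic"), (False, "asymptomatic")):
    ys = [int(i) for s, i, _ in data if s == flag]
    sc = [x for s, _, x in data if s == flag]
    strata_auc[name] = roc_auc(ys, sc)
    print(f"{name} stratum ROC-AUC: {strata_auc[name]:.3f}")
```

In this sketch the unadjusted AUC lands well above either within-stratum AUC, mirroring the paper's drop from 0.846 to 0.619 once measured confounders are matched on; only the weak infection-linked component of the score survives stratification.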


