Abstract
Updating predictive clinical artificial intelligence (AI) models, which should be part of a department's quality improvement process, presents an ethical dilemma. One must decide whether to withdraw the AI model so that the relevant data can be collected from a treatment-naive patient population, or to keep it in use and rely on causal inference techniques to obtain that information. Withdrawing an AI model from patient care may be problematic if the model is considered standard of care, while causal inference will be unreliable if the statistical assumptions it requires do not hold. Each updating strategy therefore carries patient risks, yet a lack of reliable data endangers future patients, and continuing to use an outdated AI model likewise exposes patients to risk. Here I propose a high-level ethical framework, epistemic risk management, that provides guidance on which route of model updating should be taken, based on how likely it is that the assumptions made when creating the original AI model, and those required for causal inference, hold true. This approach balances our uncertainty about the AI's status as standard of care against the risk of failing to obtain the necessary data, so as to increase the probability of benefiting both current and future patients in whose care the AI is used.
Data availability
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.
Acknowledgements
I would like to thank Matthew Sperrin and Robert Palmer for helpful advice on this manuscript, as well as Nathan Proudlove for the opportunity to explore this topic during my studies and for comments on an assignment on which this submission is based; moreover, I thank David A. Jenkins for pointing me to some key papers. I also want to thank the anonymous reviewers for their constructive feedback during the review process.
Funding
This article is based on an assignment I wrote for the A4 module of my HSST / DClinSci program in Health Informatics for which I am funded by Health Education and Improvement Wales.
Ethics declarations
Competing interests
None to declare.
Cite this article
Pruski, M. Ethics framework for predictive clinical AI model updating. Ethics Inf Technol 25, 48 (2023). https://doi.org/10.1007/s10676-023-09721-x