Ethics framework for predictive clinical AI model updating

  • Original Paper

Ethics and Information Technology

Abstract

Updating predictive clinical artificial intelligence (AI) models, which should be part of the departmental quality-improvement process, presents an ethical dilemma. One must decide whether withdrawing the AI model from patient care is necessary to obtain the relevant information from an AI-naive patient population, or whether causal inference techniques can be used to obtain this information while the model remains in use. Withdrawing the model may be problematic if it is considered the standard of care, while causal inference will be unreliable if the required statistical assumptions do not hold. Each updating strategy therefore carries risks for current patients, yet a lack of reliable data endangers future patients, and keeping an outdated model in service also exposes patients to risk. Here I propose a high-level ethical framework, epistemic risk management, that provides guidance on which route of model updating to take, based on how likely it is that the assumptions made when creating the original AI model, and the assumptions required for causal inference, hold true. This approach balances our uncertainty about the model's status as the standard of care against the risk of failing to obtain the necessary data, so as to increase the probability of benefiting both the current and future patients in whose care the AI is used.
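To make the decision logic concrete, the sketch below encodes one possible reading of this framework: pick the updating route from estimated probabilities that the relevant assumption sets hold. This is a hypothetical illustration only; the function name, the probability inputs, and the 0.5 threshold are my assumptions, not values or procedures specified in the paper.

```python
# Hypothetical sketch of the epistemic-risk-management decision rule described
# in the abstract. All names, thresholds, and probability estimates are
# illustrative assumptions, not taken from the paper.

def choose_updating_route(p_model_assumptions: float,
                          p_causal_assumptions: float,
                          threshold: float = 0.5) -> str:
    """Choose a model-updating route from estimated assumption probabilities.

    p_model_assumptions: estimated probability that the assumptions made when
        creating the original AI model still hold (a proxy for how strong the
        model's claim to standard-of-care status is).
    p_causal_assumptions: estimated probability that the statistical
        assumptions required for causal inference hold in the deployed setting.
    """
    if p_causal_assumptions >= threshold:
        # Causal inference is expected to be reliable, so the model can be
        # updated without withdrawing it from patient care.
        return "update via causal inference; model stays in use"
    if p_model_assumptions < threshold:
        # The original model's assumptions are doubtful, weakening its claim
        # to standard-of-care status: withdrawing it to collect data from an
        # AI-naive population is the lower-risk route.
        return "withdraw model and collect data from a naive population"
    # Neither route is clearly justified: both withdrawal and causal
    # inference carry substantial risk, so escalate for further review.
    return "escalate to quality-improvement / ethics review"


# Example: causal-inference assumptions are doubtful, but the model's own
# assumptions likely still hold, so neither route is clearly preferable.
print(choose_updating_route(p_model_assumptions=0.8,
                            p_causal_assumptions=0.3))
```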

Data availability

Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.


Acknowledgements

I would like to thank Matthew Sperrin and Robert Palmer for helpful advice on this manuscript, as well as Nathan Proudlove for the opportunity to explore this topic during my studies and for comments on an assignment on which this submission is based; moreover, I thank David A. Jenkins for pointing me to some key papers. I also want to thank the anonymous reviewers for their constructive feedback during the review process.

Funding

This article is based on an assignment I wrote for the A4 module of my HSST / DClinSci program in Health Informatics, for which I am funded by Health Education and Improvement Wales.

Author information

Corresponding author

Correspondence to Michal Pruski.

Ethics declarations

Competing interests

None to declare.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pruski, M. Ethics framework for predictive clinical AI model updating. Ethics Inf Technol 25, 48 (2023). https://doi.org/10.1007/s10676-023-09721-x
