Managing Data Drift in Medical AI Models: Solutions

Author: Martin Willemink


While medical AI has many potential benefits, some challenges have not yet been tackled. One of the major challenges is the limited generalizability of many AI algorithms. A medical AI algorithm that is trained in hospital A may give unexpected results when applied in hospital B. Why does this happen? Certain parameters differ between hospitals. If hospital A has Siemens CT scanners and hospital B has Philips CT scanners, the images look different. If the AI algorithm is trained only on Siemens CT images, it will probably perform less well on Philips CT images. This is only one example of a parameter that may differ. There are many others, such as the age and race distribution of the patient cohort and the imaging acquisition and reconstruction protocols. One way to address this problem is to train the AI algorithm on a large heterogeneous dataset.

But even if an algorithm is developed with a large heterogeneous dataset, its performance may degrade over time. This is caused by a phenomenon called data drift [1]. This February, MIT’s Jameel Clinic published a study evaluating the effect of data drift on the performance of AI models [2]. The researchers replicated commercial sepsis prediction models on a public dataset (MIMIC-IV [3]) at different time windows and found that the performance (area under the curve) of an Epic algorithm dropped significantly, by 8 to 20%. One of the reasons for this decrease was the switch in 2015 from ICD-9 codes to ICD-10 codes, which provide a more detailed disease classification. Another important factor was a change in microbiology sampling practices in the ICU. Since the AI models were trained on older data, their performance went down after these changes.
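The kind of evaluation described above, scoring one model separately on each time window, can be sketched in a few lines of Python. This is a minimal illustration and not the method used in the study [2]: the window names, labels, and scores are made up, and the helper names `auc` and `auc_by_window` are hypothetical.

```python
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative)
    pairs in which the positive case gets the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_window(records):
    """records: iterable of (window, label, score) tuples.
    Returns a dict mapping each window to the AUC within that window."""
    grouped = defaultdict(lambda: ([], []))
    for window, label, score in records:
        grouped[window][0].append(label)
        grouped[window][1].append(score)
    return {w: auc(ys, ss) for w, (ys, ss) in sorted(grouped.items())}


# Made-up toy predictions (assumption): the same model scored on an
# earlier and a later time window.
records = [
    ("window_1", 1, 0.9), ("window_1", 1, 0.8),
    ("window_1", 0, 0.3), ("window_1", 0, 0.2),
    ("window_2", 1, 0.9), ("window_2", 1, 0.4),
    ("window_2", 0, 0.6), ("window_2", 0, 0.2),
]
print(auc_by_window(records))
```

A drop in the per-window value from one period to the next is the signal to look for. In practice each window would need enough cases for the estimate to be stable.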


Small changes in clinical protocols may thus have large effects on AI model performance. For example, performance may be affected if a hospital installs a new CT scanner with a slightly different image quality than its older scanners. A few months ago, Siemens introduced a new type of CT scanner: photon-counting detector CT, which provides additional information with improved image quality [4]. There are currently no AI algorithms that have been trained with this type of data.

Several solutions to the data drift problem have been proposed, such as alerting the physician when drift is detected [5]. This helps signal that there is an issue, but it does not automatically recalibrate the AI model. The ideal solution is access to a real-time dataset that can be used to keep optimizing AI algorithms. The FDA also recognizes the importance of data drift and is therefore considering changes to medical device regulation for adaptive AI technologies [6].
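As a rough illustration of what a drift alert could look like, the sketch below compares the distribution of a single numeric input feature in recent data against a reference sample using the population stability index (PSI). This is a swapped-in, generic technique and not the calibration-drift method of [5]. The function names, the bin count, the example values, and the 0.2 alert threshold (a common rule of thumb) are all assumptions.

```python
import math


def psi(reference, current, n_bins=4, eps=1e-6):
    """Population stability index between two samples of one feature.
    Bins are equal-width over the range of the reference sample."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data

    def fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = fractions(reference), fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def drift_alert(reference, current, threshold=0.2):
    """Return True when the PSI exceeds the (assumed) alert threshold."""
    return psi(reference, current) > threshold


# Made-up feature values (assumption): training-era data versus recent data.
reference = [1, 2, 3, 4, 5, 6, 7, 8]
recent = [6, 7, 8, 8, 7, 8, 7, 8]
print(drift_alert(reference, reference))  # identical samples
print(drift_alert(reference, recent))     # shifted samples
```

As the paragraph above notes, an alert like this only flags the shift. Updating or recalibrating the model still requires fresh data.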

Segmed partners with hospitals to de-identify and structure real-time medical data, and to make it machine readable and searchable. 

  • Are you an AI developer who wants to work with diverse, real-time, de-identified data to help prevent data drift? Sign up for the beta version of Segmed Insight!
  • Are you a healthcare executive who sees the value of AI for your patients? Learn more about how Segmed can help you!

References:

[1] Samuel G Finlayson, Adarsh Subbaswamy, Karandeep Singh, John Bowers, Annabel Kupke, Jonathan Zittrain, Isaac S Kohane, Suchi Saria. The Clinician and Dataset Shift in Artificial Intelligence. New England Journal of Medicine. 2021;385:283-286. https://www.nejm.org/doi/full/10.1056/NEJMc2104626

[2] Janice Yang, Ludvig Karstens, Casey Ross, Adam Yala. AI gone astray: How subtle shifts in patient data send popular algorithms reeling, undermining patient safety. February 28, 2022. https://www.statnews.com/2022/02/28/sepsis-hospital-algorithms-data-shift/.

[3] Alistair Johnson, Lucas Bulgarelli, Tom Pollard, Steven Horng, Leo Anthony Celi, Roger Mark. MIMIC-IV (version 0.4). PhysioNet. August 13, 2020. https://doi.org/10.13026/a3wn-hq05.

[4] Martin J Willemink, Thomas M Grist. Counting Photons: The Next Era for CT Imaging? Radiology. 2022 Feb 15:213203. https://doi.org/10.1148/radiol.213203.

[5] Sharon E Davis, Robert A Greevy, Thomas A Lasko, Colin G Walsh, Michael E Matheny. Detection of calibration drift in clinical prediction models to inform model updating. Journal of Biomedical Informatics. December 2020;112:103611. https://doi.org/10.1016/j.jbi.2020.103611

[6] US Food & Drug Administration. Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) Action Plan. September 22, 2021. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device.