The researchers first developed a deep neural network model using 19,784 images (5,725 positive COVID-19 cases) from four sites within the Henry Ford Health System in Michigan. They then tested how well the model generalized to seven external, publicly available test sets.
Across the four internal test sites, the model achieved area under the curve (AUC) values ranging from 0.84 to 0.87 when identifying COVID-19 on x-ray. When the model was applied to the external test datasets, by contrast, AUC values were lower, ranging from 0.77 to 0.83.
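For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative one. The sketch below is a minimal illustration of that definition, not the researchers' evaluation code: the `auc` helper, the variable names, and the toy labels and scores are all hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = COVID-19 positive, 0 = negative).
# These numbers are invented for illustration only.
internal_labels = [1, 1, 1, 0, 0, 0]
internal_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

external_labels = [1, 1, 1, 0, 0, 0]
external_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.1]

internal_auc = auc(internal_labels, internal_scores)
external_auc = auc(external_labels, external_scores)

print(f"internal AUC: {internal_auc:.2f}")
print(f"external AUC: {external_auc:.2f}")
```

Scoring the same model separately on each internal site and each external dataset, as in this toy comparison, is what makes a drop in generalizability visible. In practice a library routine such as scikit-learn's `roc_auc_score` would typically replace the hand-written loop.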
"AI models trained from one, or a limited number of clinical sites, will drop in performance when they are applied to external test datasets," the researchers wrote.
Attend the session to learn more.