AI models for classifying lung nodule malignancy on chest CT imaging show high sensitivity but only moderate specificity when tested against external datasets, according to a study published June 3 in Radiology: Artificial Intelligence.
The finding suggests these types of models could offer “a potential role as adjunctive tools for ruling out malignancy” rather than classifying it, wrote a team led by Oke Dimas Asmara, MD, of Frisius Medical Center in Leeuwarden, the Netherlands.
“Current AI models may support rule-out of malignancy in lung nodules; however, moderate specificity limits their use for definitive classification of malignant nodules,” the group noted.
Lung cancer is the leading cause of cancer death and the most commonly diagnosed cancer worldwide, the group noted. Recent studies have reported that AI-based malignancy classification on CT imaging shows promise, but how to integrate this technology into clinical practice isn’t clear, they explained, writing that “an important gap remains in understanding how AI performance generalizes across different clinical environments,” that is, asymptomatic screening populations versus symptomatic patients.
The researchers searched PubMed, Embase, Web of Science, CINAHL, and the Cochrane Library in January 2025 to identify studies that evaluated AI models for malignancy classification of lung nodules on chest CT, using pathology and/or at least two years of follow-up as reference standards. The search produced 21 studies that included 7,454 lung nodules; lung cancer prevalence ranged from 5.7% to 91.5%.
All of the AI models used in these studies were based on deep learning. Of the included studies, 17 (81%) involved Asian populations, 15 (71%) used non-screening populations, 14 (67%) reported on 2D or 3D convolutional neural network (CNN) architectures, and eight (38%) focused on predefined malignancy thresholds.
The group reported the following overall results:
- Pooled sensitivity was 88%.
- Pooled specificity was 75%.
- Positive likelihood ratio was 3.55, while negative likelihood ratio was 0.16.
- Area under the receiver operating characteristic curve (AUROC) was 0.89.
- The diagnostic odds ratio was 22.4.
- Heterogeneity was high, with I² greater than 90%.
- Studies that used 2D or 3D CNNs showed higher specificity than those that did not report an architecture (83% versus 58%, p = 0.03).
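The relationship among these pooled figures can be checked with a short calculation. The sketch below is illustrative rather than the study's own method: it applies the standard textbook definitions of likelihood ratios and the diagnostic odds ratio to the reported sensitivity and specificity, and the function names are hypothetical. Because the study's pooled values come from a meta-analytic model, the simple point calculation lands close to, but not exactly on, the reported 3.55 and 22.4. The post-test probabilities assume, purely for illustration, that the lowest and highest prevalence values among the included studies are used as pre-test probabilities.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity (standard definitions)."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_prob, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pooled values reported in the article.
sens, spec = 0.88, 0.75
lr_pos, lr_neg = likelihood_ratios(sens, spec)
dor = lr_pos / lr_neg  # diagnostic odds ratio

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")

# Illustrative assumption: use the extremes of the reported prevalence range
# (5.7% and 91.5%) as pre-test probabilities, with the reported LR- of 0.16.
for prevalence in (0.057, 0.915):
    p_after_negative = post_test_probability(prevalence, 0.16)
    print(f"prevalence {prevalence:.1%}: "
          f"probability of malignancy after a negative AI result = {p_after_negative:.1%}")
```

Under these assumptions, a negative AI result at 5.7% prevalence leaves roughly a 1% probability of malignancy, while at 91.5% prevalence it still leaves roughly 63%. That is one way to read the rule-out framing: it holds mainly where prevalence is low.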
Asmara and colleagues urged further research that uses standardized thresholds, diverse populations, and detailed architecture reporting to measure the real-world clinical impact of using AI to classify lung nodules.



















