A team of researchers led by Robert Harris, PhD, of teleradiology services provider vRad, described how they created a deep-learning model that can be used to prioritize -- at a high level of specificity -- injured patients with free-air findings on abdomen or chest CT exams.
"We've been able to show through live prospective data over two weeks that we can quantify the total amount of free air within each study passing through our [teleradiology] system and get this free-air volume metric for use in active [worklist] prioritization," Harris said.
A critical pathology
Free air in the chest and abdomen is typically a critical pathology and can come in subtypes such as pneumoperitoneum, pneumothorax, and soft-tissue gas, according to Harris. The researchers hypothesized that a segmentation-based convolutional neural network (CNN) could identify the different subtypes of free air on CT images of the chest and abdomen, enabling these patients to be prioritized in the clinical workflow.
"Roughly 450 chest or abdomen CT studies pass through our system every day," Harris said. "By identifying which studies have these critical pathologies, we can boost these to the top of our worklist and make sure those patients are seen faster. We can then measure our success by running natural language processing on the clinical patient reports to determine our model accuracy on prospective live data."
To develop their deep-learning algorithm, the researchers first trained three separate natural language processing (NLP) models to search their database for reports containing findings of free air. The searches relied on keywords for the main subtypes of free air.
After using the NLP models to extract the studies that were positive for free air from their database, the researchers segmented the free-air subtypes on the images using a thresholding brush tool. They also randomly sampled image slices from a control group of 539 chest/abdomen CT studies that were negative for free air.
Harris and colleagues ended up with a combined training and validation dataset of 15,362 negative images, 3,679 pneumoperitoneum images, 2,704 soft-tissue gas images, and 1,298 pneumothorax images. Of these, 90% were used for training and 10% for validation.
Next, the DICOM images with free air were converted into two types of JPEG images: a vascular window that's good for seeing soft tissue in the abdomen and a lung window that enables better visualization of pneumothorax, Harris said. Both of these image types were used as inputs for the CNN.
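The windowing step can be illustrated with a minimal sketch that maps CT values in Hounsfield units (HU) to 8-bit gray levels. The window level and width values below are commonly used presets chosen here as assumptions; the researchers did not report the settings they used, and the function names are hypothetical.

```python
def apply_window(hu_values, level, width):
    """Clip HU values to a window and rescale them to 0-255 gray levels."""
    low = level - width / 2.0
    high = level + width / 2.0
    gray = []
    for hu in hu_values:
        clipped = min(max(hu, low), high)
        gray.append(round((clipped - low) / (high - low) * 255))
    return gray

# Assumed presets (not stated in the article).
VASCULAR_WINDOW = {"level": 40, "width": 400}    # soft tissue in the abdomen
LUNG_WINDOW = {"level": -600, "width": 1500}     # air-filled lung, pneumothorax

def two_window_input(hu_values):
    """Return the two windowed versions of one slice used as model inputs."""
    return (apply_window(hu_values, **VASCULAR_WINDOW),
            apply_window(hu_values, **LUNG_WINDOW))

# A flattened toy "slice": air, lung, water, soft tissue, dense material.
sample_hu = [-1000, -600, 0, 40, 400]
vascular, lung = two_window_input(sample_hu)
```

In the vascular window, air and lung both collapse to black, which is why a second, wider lung window is useful for separating free air from normal lung tissue.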
The CNN outputs three segmentation maps, one for each subtype of free air. These segmentation maps are summed across all slices and multiplied by the voxel size to calculate a free-air volume metric for each subtype, according to Harris.
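As a rough sketch of that calculation, the volume for one subtype is the count of positive voxels across all slices multiplied by the volume of a single voxel. The sketch assumes the segmentation maps have already been binarized to 0 or 1 and that voxel spacing is given in millimeters; the function name and the toy data are hypothetical.

```python
def free_air_volume_cm3(masks, voxel_size_mm):
    """Compute free-air volume from binary segmentation masks.

    masks: list of slices, each a 2D list of 0/1 values.
    voxel_size_mm: (dx, dy, dz) voxel spacing in millimeters.
    """
    dx, dy, dz = voxel_size_mm
    voxel_volume_mm3 = dx * dy * dz
    positive_voxels = sum(sum(sum(row) for row in sl) for sl in masks)
    return positive_voxels * voxel_volume_mm3 / 1000.0  # mm^3 to cm^3

# Two toy 2x3 slices for one subtype, four positive voxels in total.
toy_masks = [
    [[0, 1, 1], [0, 0, 0]],
    [[0, 1, 0], [0, 1, 0]],
]
volume = free_air_volume_cm3(toy_masks, (1.0, 1.0, 5.0))
```

With 1 x 1 x 5 mm voxels, four positive voxels correspond to 20 mm³, or 0.02 cm³. The same function would be applied separately to each of the three subtype maps.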
After training and validation, the model was evaluated on a test set of 200 studies, including 36 pneumoperitoneum, 30 pneumothorax, and 22 soft-tissue gas cases. The threshold for a positive finding was 1 cm³ for pneumoperitoneum, 10 cm³ for pneumothorax, and 0.25 cm³ for soft-tissue gas. On the test set, the algorithm achieved an area under the curve of 0.981 for pneumoperitoneum, 0.915 for pneumothorax, and 0.856 for soft-tissue gas.
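A minimal sketch of how such per-subtype thresholds might be applied to flag a study and move it up a worklist follows. The threshold values come from the test-set description above; treating a volume equal to the threshold as positive, the study records, and the function names are assumptions made for illustration.

```python
# Thresholds for a positive finding, in cm^3 (values from the test-set description).
THRESHOLDS_CM3 = {
    "pneumoperitoneum": 1.0,
    "pneumothorax": 10.0,
    "soft_tissue_gas": 0.25,
}

def positive_subtypes(volumes_cm3):
    """Return the subtypes whose volume meets its threshold (>= is an assumption)."""
    return [subtype for subtype, threshold in THRESHOLDS_CM3.items()
            if volumes_cm3.get(subtype, 0.0) >= threshold]

def prioritize(worklist):
    """Move flagged studies to the front; the stable sort keeps arrival order otherwise."""
    return sorted(worklist, key=lambda study: not positive_subtypes(study["volumes_cm3"]))

# Hypothetical studies in arrival order.
worklist = [
    {"id": "study-A", "volumes_cm3": {"pneumoperitoneum": 0.2, "pneumothorax": 3.0}},
    {"id": "study-B", "volumes_cm3": {"pneumothorax": 25.0}},
    {"id": "study-C", "volumes_cm3": {"soft_tissue_gas": 0.3}},
]
ordered_ids = [study["id"] for study in prioritize(worklist)]
```

Here study-A stays at the bottom because its 3 cm³ of pneumothorax falls below the 10 cm³ threshold, while study-B and study-C are moved ahead of it.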
The algorithm was then placed into their practice's teleradiology workflow for two weeks, and it was used to screen a total of 6,290 CT studies for free air.
Performance for detecting free air by subtype in 2 weeks of clinical practice
| Positive studies prioritized | 27 | 24 | 29 |
"We didn't have this activated for live worklist prioritization yet because it's still experimental, but if we had turned it on, this is what the results would have been," Harris said.
In those two weeks, 65 studies with free air -- some with multiple subtypes -- were flagged as positive in the worklist, he stated. To reduce the number of false positives, the researchers used thresholds for each subtype that yielded approximately 95% specificity.
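One way to arrive at a threshold with a target specificity is to take a high quantile of the volumes measured in studies known to be negative, then check how many positive studies still exceed it. The sketch below assumes that approach and uses made-up volumes; it is not the researchers' stated procedure, and the function names are hypothetical.

```python
import math

def threshold_for_specificity(negative_volumes, target=0.95):
    """Pick the observed volume at or below which `target` of negatives fall."""
    ordered = sorted(negative_volumes)
    # The small epsilon guards against floating-point error in target * n.
    k = max(0, math.ceil(target * len(ordered) - 1e-9) - 1)
    return ordered[k]

def specificity(negative_volumes, threshold):
    """Fraction of negative studies that are not flagged (volume <= threshold)."""
    return sum(v <= threshold for v in negative_volumes) / len(negative_volumes)

def sensitivity(positive_volumes, threshold):
    """Fraction of positive studies that are flagged (volume > threshold)."""
    return sum(v > threshold for v in positive_volumes) / len(positive_volumes)

# Made-up model output volumes in cm^3 for one subtype.
negatives = [0.0] * 15 + [0.1, 0.2, 0.3, 0.5, 2.0]
positives = [0.05, 0.3, 0.6, 4.0, 12.0, 0.4, 30.0, 0.2]

thr = threshold_for_specificity(negatives, target=0.95)
spec = specificity(negatives, thr)
sens = sensitivity(positives, thr)
```

In this toy data the threshold that holds specificity at 95% misses the positive studies with only small volumes of free air, which mirrors the trade-off between specificity and sensitivity described by Harris.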
"At this specificity, our sensitivity was in the 40s for all three subtypes, indicating that we're catching a little under half of all patients with free air, which does leave room for further improvement but certainly is good for all of those patients with critical conditions that are getting prioritized," he said.
The amount of free air per patient varied greatly, he noted.
"Some of the patients that are missed just have tiny specks of free air, while many of the patients that are caught have huge regions of air that are much more important clinically," Harris said.
In other results, the researchers observed that false positives were common in patients with pleural effusions, ileus, or colitis, as well as in those who had undergone cholecystectomy.