A key theme at the 2026 meeting of the International Society for Magnetic Resonance in Medicine (ISMRM) is standardization.
Leading a May 13 talk on large data analysis in the age of AI, Ofer Pasternak, PhD, from Massachusetts General Brigham Hospital and Harvard Medical School in Boston, emphasized the importance of data standardization while also encouraging innovation. The U.S. National Institutes of Health (NIH) has been promoting standardization through various workgroups, he noted.
However, "we're not there yet in the terms that everybody's collecting exactly the same data," said Pasternak, an associate professor in psychiatry and assistant professor in radiology at Harvard. "It's good that it's not exactly like that. We still want to have some innovation in data collection, but especially in the clinical domain it's best to collect data that is comparable, similar to what others are collecting."
Pasternak also highlighted strategies for pooling data for large, multisite studies, ideally with a standardized approach that includes standardized image acquisition and standardized quality control.
"Start from existing large studies as a guide for standardization," Pasternak advised. "We want large studies in order to validate generalizable international findings. There are multiple datasets that are available and, in general, those studies become a major catalyst of research, especially in the age of AI."
Mega studies
To that end, resources for large data analysis (mega studies) have been pooled from around the world. Among the largest publicly available datasets is the UK Biobank, which houses imaging data from over 100,000 individuals "and it's perfectly standardized," explained Daniel Rueckert, PhD, chair of AI in Healthcare and Medicine at Technical University of Munich (TUM) and TUM University Hospital in Germany.
The UK Biobank's imaging substudy has amassed whole-body MRI scans, dedicated brain imaging, and cardiovascular imaging from those participants, while the wider cohort of about 500,000 people contributes lifestyle and physical activity data and genetic data and is also linked to clinical records.
"That's very important because you can study outcome in this relatively healthy population," Rueckert said. His lab is among those conducting correlation analyses of UK Biobank data to develop image-derived phenotypes.
Large cohorts provide the basis for developing risk-prediction models that then might be replicated, Rueckert explained.
Using "multimodal data, not only from the imaging side, but also from the non-imaging side, you can study different organ systems, different disease groups, and you can, for example, define normative groups," Rueckert said. "One of the things which I think is very interesting to see is that, of course, different modalities have different value for this type of risk prediction."
Cardiac MRI
A simple foundation model for cardiac MRI illustrates the point, according to Rueckert, who described a unimodal imaging model that takes imaging data and then performs downstream image analysis tasks.
"Building such a model is relatively straightforward," he said. However, enriching it with multimodal information is even more interesting. In this scenario, an encoder–decoder model encodes the data, is forced to compress it, and then decodes it again, Rueckert explained. The end product is a latent space representation of the data that requires no labels.
From phenotypes such as left ventricular mass, stroke volume, and systolic volume, "you can use these features to make predictions or to derive from that latent space representation, for example, a segmentation of the heart."
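Schematically, the encoder–decoder setup Rueckert described can be sketched as below. All dimensions, weights, and function names here are illustrative placeholders, not the actual model: the encoder compresses an image into a small latent code without any labels, and a separate linear head reads phenotype estimates off that latent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a flattened image, a compressed latent code, and
# three phenotypes (e.g., LV mass, stroke volume, systolic volume).
IMG_DIM, LATENT_DIM, N_PHENOTYPES = 1024, 32, 3

W_enc = rng.normal(scale=0.02, size=(LATENT_DIM, IMG_DIM))
W_dec = rng.normal(scale=0.02, size=(IMG_DIM, LATENT_DIM))
W_head = rng.normal(scale=0.02, size=(N_PHENOTYPES, LATENT_DIM))

def encode(x):
    return np.tanh(W_enc @ x)   # forced compression: 1024 -> 32

def decode(z):
    return W_dec @ z            # reconstruction: 32 -> 1024

def phenotype_head(z):
    return W_head @ z           # latent -> phenotype estimates

image = rng.normal(size=IMG_DIM)   # stand-in for a cardiac MR slice
latent = encode(image)
recon = decode(latent)
phenotypes = phenotype_head(latent)

print(latent.shape, recon.shape, phenotypes.shape)  # (32,) (1024,) (3,)
```

In a real pipeline the encoder and decoder would be deep networks trained with a reconstruction loss; the point of the sketch is only the shape of the information flow.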
However, Rueckert suggested the study becomes richer by adding tabular data from the UK Biobank dataset -- smoking, diet factors, drinking, for example -- ultimately encoding that tabular information into the latent space, essentially forcing the two representations to be aligned.
[Figure: Latent space representation. Daniel Rueckert, PhD, and ISMRM]
By adding multimodal data, "you can deduce some of the information you might only be able to see in imaging from the tabular information" ... for example, "you can predict left ventricular ejection fraction just from the tabular data." It works to some degree but doesn't work perfectly, and Rueckert cautioned researchers not to overinterpret the associations.
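The alignment idea behind that result can be reduced to a toy sketch: encode the tabular fields into the same latent space as the imaging data, then penalize the distance between the two codes. Latent and tabular dimensions and the loss form below are hypothetical, chosen only to illustrate the mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical shared latent size; tabular fields (smoking, diet,
# drinking, ...) represented as a small numeric vector.
LATENT_DIM, TAB_DIM = 32, 8

W_tab = rng.normal(scale=0.1, size=(LATENT_DIM, TAB_DIM))

def encode_tabular(t):
    return np.tanh(W_tab @ t)

def alignment_loss(z_img, z_tab):
    # Cosine-style alignment: 0 when the latents point the same way, 2 when opposite.
    cos = z_img @ z_tab / (np.linalg.norm(z_img) * np.linalg.norm(z_tab))
    return 1.0 - cos

z_img = rng.normal(size=LATENT_DIM)   # latent from the imaging encoder
tabular = rng.normal(size=TAB_DIM)    # toy values for the tabular fields
z_tab = encode_tabular(tabular)

loss = alignment_loss(z_img, z_tab)
print(round(loss, 3))
```

Once training drives this loss down, the same head that reads ejection fraction from the imaging latent can be applied to the tabular latent alone, which is the behavior Rueckert described as working "to some degree."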
Vision–language models
Foundation models in MRI, however, are generally underexplored because MR imaging pipelines are fragmented, explained Fang Liu, PhD, associate professor of radiology at Harvard Medical School in Boston, during a Tuesday talk. Liu highlighted various task-based models that have been developed for the brain and knee, as well as for disease diagnosis and pathology detection.
"Deep learning has improved individual steps, yet most models are trained for a single anatomy or task and lack generalization across scanner types, protocols, and clinical settings," Liu's group explained. "They also overlook linguistic information that is central to radiological reasoning and documentation."
For a different approach, Liu's group developed a single unified foundation model, called OmniMRI, that incorporates 2D slice, 3D volume, and text inputs.
"The model follows a multistage learning paradigm including vision pretraining, vision–language alignment and modeling, and multi-task instruction tuning," the group explained in its poster presentation.
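The three stages the group names can be laid out as a pipeline skeleton. The function names and contents below are illustrative only, not OmniMRI's actual code; they show the order of operations: self-supervised vision pretraining, then vision–language alignment, then multi-task instruction tuning.

```python
def vision_pretraining(volumes):
    """Stage 1: self-supervised pretraining on 2D slices / 3D volumes."""
    return {"vision_encoder": f"pretrained on {len(volumes)} volumes"}

def vision_language_alignment(model, image_text_pairs):
    """Stage 2: align image embeddings with report-text embeddings."""
    model["aligned_pairs"] = len(image_text_pairs)
    return model

def instruction_tuning(model, tasks):
    """Stage 3: multi-task instruction tuning across downstream tasks."""
    model["tasks"] = sorted(tasks)
    return model

# Toy run with placeholder data sizes
model = vision_pretraining(volumes=range(100))
model = vision_language_alignment(model, image_text_pairs=range(50))
model = instruction_tuning(
    model, tasks={"segmentation", "detection", "classification", "reporting"}
)
print(model["tasks"])  # ['classification', 'detection', 'reporting', 'segmentation']
```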
OmniMRI involved curating and harmonizing 80 publicly available datasets from multiple institutions and vendors. Liu and colleagues collected data from 58,739 patients (62.7% from North America, 18.6% from Europe, and 18.6% from Asia), 228,438 volumes, and 19.4 million slices -- spanning diverse anatomies such as the brain, breast, knee, and prostate.
This unified framework consolidates task-specific pipelines into a generalist solution, according to the group. In an evaluation against classic deep learning and other models, multitask OmniMRI demonstrated strong performance at reconstruction (SSIM 0.87), segmentation (Dice 0.77), detection (IoU 0.68), classification (F1 0.91), and report generation (BLEU 0.26), according to Liu.
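The segmentation and detection scores quoted above are standard overlap metrics. For reference, a minimal implementation of Dice and IoU on binary masks (the toy masks below are made up to show the arithmetic):

```python
import numpy as np

def dice(pred, gt):
    # Dice = 2|A∩B| / (|A| + |B|) on binary masks
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def iou(pred, gt):
    # IoU = |A∩B| / |A∪B|
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

# Two 16-pixel squares offset by one pixel: they overlap in a 3x3 region.
pred = np.zeros((8, 8), dtype=bool); pred[2:6, 2:6] = True
gt   = np.zeros((8, 8), dtype=bool); gt[3:7, 3:7] = True

print(round(dice(pred, gt), 4), round(iou(pred, gt), 4))  # 0.5625 0.3913
```

Note that Dice is always at least as large as IoU for the same pair of masks, which is why segmentation papers tend to quote higher-looking numbers than detection papers.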
[Figure: OmniMRI in the context of reporting. Fang Liu, PhD, and ISMRM]
Importantly, OmniMRI is not immediately translatable to the clinical setting. However, the approach points to a vision for the field going forward.
"I think it will redefine the MRI workflow from a fragmented pipeline into a more end-to-end intelligent system," Liu said. "The overall radiology-in-the-loop system for the MRI pipeline might be realized using this type of specific foundation model in MRI."
In the meantime, the model needs to be comprehensively evaluated, and fine-tuning needs to improve for body parts that are not represented in the database, Liu said, acknowledging R21, R01, and R56 funding support from the NIH for the project.
Check out AuntMinnie's full coverage of ISMRM 2026 on our ShowCast.