By Nick Klenske
Unstructured reports are an ongoing challenge in radiology—a challenge that can limit the ability to extract standardized information for research, quality improvement and AI development.
“The rapid evolution of large language models (LLMs) offers promising opportunities for radiology report annotation, particularly when it comes to identifying specific diagnostic findings,” said Mana Moassefi, MD, an incoming radiology resident at Mayo Clinic, in Rochester, MN. Dr. Moassefi co-authored a recent study evaluating whether LLMs could identify the presence or absence of specific diagnoses or findings in radiology reports across multiple institutions.
With a focus on LLMs with strong natural language understanding and adaptability, researchers looked to see if these models—when optimized through prompt engineering—could overcome inter-institutional variability and thus serve as scalable tools for radiology report labeling and cohort generation.
“The idea was that if we can get reliable labels from the massive amount of existing radiology data, then medical data will no longer be seen as being too rare and too expensive,” said Dr. Moassefi, who made her remarks during a Monday session. “If we achieve that, then we can start building powerful and effective AI models using the data we already have.”
The study is notable for its cross-institutional design: an evaluation spanning six major academic centers, with each center contributing 500 radiology reports across five diagnostic categories (liver metastases, subarachnoid hemorrhage, pneumonia, cervical spine fracture and glioma progression).
“We purposely kept the dataset’s labels diverse to capture the unique characteristics of each label and to see how those differences might affect the results,” Dr. Moassefi explained.
A script written in a high-level programming language, paired with a human-optimized prompt, was developed and distributed to each site. The script instructed a locally hosted model to answer either ‘yes’ or ‘no’ regarding the presence of the target finding.
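The labeling protocol described above can be illustrated with a minimal sketch. The prompt wording, function names and the parsing logic here are illustrative assumptions, not the study's actual script; any locally hosted model could be substituted where the reply is supplied.

```python
# Illustrative sketch of a yes/no report-labeling step.
# Prompt text and parse rules are assumptions, not the study's script.

def build_prompt(report_text: str, finding: str) -> str:
    """Construct a constrained yes/no prompt for a locally hosted LLM."""
    return (
        f"You are labeling radiology reports. Finding of interest: {finding}.\n"
        f"Report:\n{report_text}\n"
        "Does the report indicate the presence of this finding? "
        "Answer with exactly one word: yes or no."
    )

def parse_label(model_output: str) -> bool:
    """Map the model's free-text reply to a binary label."""
    answer = model_output.strip().lower()
    if answer.startswith("yes"):
        return True
    if answer.startswith("no"):
        return False
    raise ValueError(f"Unparseable model output: {model_output!r}")

# The prompt would be sent to the locally hosted model; here the
# reply is a stand-in string.
prompt = build_prompt(
    "Multiple hepatic lesions consistent with metastases.",
    "liver metastases",
)
label = parse_label("Yes.")
```

Constraining the model to a single-word answer is what makes the output trivially machine-parseable, which is the property that lets one standardized script run unchanged at every site.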
The standardized human-optimized prompt proved highly adaptable across diverse institutional practices, illustrating the power of well-designed prompt engineering. At one site, where eight LLMs were systematically compared, the best-performing model achieved approximately 95% accuracy.
The study further found that model performance correlated with report structure quality, with well-structured reports yielding near-perfect accuracy. However, diagnostic categories such as pneumonia proved more challenging due to interpretive ambiguity in free-text reports.
“These findings demonstrate that LLMs can serve as reliable tools for labeling radiology reports, helping us scale data annotation, generate AI datasets and create retrospective research cohorts—all tasks that traditionally require extensive manual review,” Dr. Moassefi said.
By showing cross-institutional reproducibility with only prompt-based customization, the study moves radiology closer to automated, standardized information extraction—an essential step toward achieving AI-ready data pipelines and structured reporting adoption.
“Larger, more diverse datasets make models more generalizable and reduce uncertainty, which in turn leads to more accurate diagnoses and better patient outcomes,” Dr. Moassefi concluded.
Access the presentation, “Engineering Prompts, Extracting Diagnoses: A Multi-Institutional Assessment of LLMs in Radiology” (M3-SSIN02-1), on demand at RSNA.org/MeetingCentral.
© 2025 RSNA.
The RSNA 2025 Daily Bulletin is the official publication of the 110th Scientific Assembly and Annual Meeting of the Radiological Society of North America. Published online Sunday, November 30 — Thursday, December 4.
The RSNA 2025 Daily Bulletin is owned and published by the Radiological Society of North America, Inc., 820 Jorie Blvd., Suite 200, Oak Brook, IL 60523.