Using GPT-4 to Classify Glioma Types Based on Preoperative MRI Data  

Monday, December 2, 2024

By Nick Klenske

Gliomas account for about 30% of all brain tumors and are the most common primary brain tumors in adults. They vary widely in aggressiveness and prognosis.


The gold standard for grading gliomas is the 2021 World Health Organization (WHO) Classification, which uses two molecular markers, isocitrate dehydrogenase (IDH) mutation and 1p/19q co-deletion, to classify a glioma as astrocytoma, oligodendroglioma or glioblastoma.
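
As a rough illustration of how the two markers map to the three tumor types, the decision rule can be sketched in a few lines of Python. This is a simplified sketch, not the full WHO scheme: the function name and boolean inputs are hypothetical, and the actual classification also draws on histology and additional molecular findings.

```python
def classify_glioma(idh_mutant: bool, codeleted_1p19q: bool) -> str:
    """Simplified mapping of two molecular markers to a glioma type.

    Assumption: only IDH status and 1p/19q co-deletion are considered;
    the real WHO 2021 criteria include further histologic and molecular checks.
    """
    if not idh_mutant:
        # IDH-wildtype tumors fall into the glioblastoma group in this sketch
        return "glioblastoma"
    if codeleted_1p19q:
        # IDH-mutant and 1p/19q-codeleted
        return "oligodendroglioma"
    # IDH-mutant with intact 1p/19q
    return "astrocytoma"


print(classify_glioma(idh_mutant=True, codeleted_1p19q=True))
```

Because the label depends on both markers at once, a model that predicts the tumor type directly is effectively predicting IDH and 1p/19q status together.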

While the classification has been shown to have significant prognostic and therapeutic implications, detecting these mutations typically requires a biopsy. As a result, there has been an increasing effort to find a non-invasive alternative. 

“Radiomic features have recently demonstrated a convincing performance for the non-invasive detection of mutations,” said Martha Foltyn-Dumitru, MD, MSc, a radiologist at Bonn University Hospital in Germany. 

 

Simplifying and Democratizing Machine Learning in Radiology

According to Dr. Foltyn-Dumitru, radiomics-based machine learning algorithms are particularly promising for their ability to simultaneously classify IDH mutation and 1p/19q co-deletion status in glioma. During a Sunday RSNA session, she presented the results of a study on the use of the Advanced Data Analysis (ADA) feature of ChatGPT-4 (GPT-4) to autonomously create machine learning (ML) models for classifying glioma types based on preoperative MRI data.

“Our goal was to showcase how large language models can simplify and democratize machine learning in radiology,” Dr. Foltyn-Dumitru explained.  

In the study, MRI data from 615 patients with newly diagnosed glioma were labeled by IDH and 1p/19q status for multiclass classification and split 80/20 into training and testing sets. Radiomic features were then extracted from these scans.
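
For readers curious what an 80/20 split looks like in practice, here is a minimal sketch using only the Python standard library. It assumes a stratified split, in which each tumor type keeps roughly the same share in both sets; the article does not say whether the study stratified. The class counts are invented for illustration, and only the total of 615 comes from the study.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class at about test_fraction."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical class counts that sum to 615; not the study's real distribution.
labels = ["glioblastoma"] * 400 + ["astrocytoma"] * 150 + ["oligodendroglioma"] * 65
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 492 123
```

Splitting within each class keeps rare tumor types from being left out of the test set by chance.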

“We utilized ADA within GPT-4 to autonomously develop modeling strategies, construct an ML model, and benchmark its performance against an established hand-crafted model,” said Dr. Foltyn-Dumitru.

The study evaluated the models under several normalization methods, comparing the accuracy and consistency of GPT-4’s results with established benchmarks. The GPT-4 model achieved the highest accuracy (0.820), significantly outperforming the benchmark model’s accuracy of 0.678.
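
To make the two terms concrete, the sketch below shows one common normalization method (z-score scaling) and how accuracy is computed. The article does not list the normalization methods the study compared, so z-scoring is an assumption here, and the feature values and labels are toy data rather than study results.

```python
from statistics import mean, pstdev


def zscore_fit(train_column):
    """Compute mean and standard deviation from the training data only."""
    mu = mean(train_column)
    sigma = pstdev(train_column)
    return mu, (sigma if sigma > 0 else 1.0)  # avoid dividing by zero


def zscore_apply(column, mu, sigma):
    """Scale values using statistics computed on the training set."""
    return [(x - mu) / sigma for x in column]


def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label equals the true label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy radiomic feature values (hypothetical).
mu, sigma = zscore_fit([2.0, 4.0, 6.0])
print(zscore_apply([4.0], mu, sigma))  # [0.0]

# Toy labels: 3 of 4 predictions are correct.
print(accuracy(["a", "b", "a", "a"], ["a", "b", "b", "a"]))  # 0.75
```

Fitting the scaling on the training set and reusing it on the test set keeps information about the test cases from leaking into the model.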

“This demonstrates that GPT-4’s ADA capabilities can produce radiomics-based machine learning models that match a hand-crafted model’s ability to classify glioma types from MRI data,” Dr. Foltyn-Dumitru said. 

No Coding Expertise Required

Dr. Foltyn-Dumitru noted that, despite its strengths, GPT-4 struggles to handle class imbalances, which highlights the need for continued improvements in machine learning workflows and for critical evaluation of its prompts and results. Even so, she remains confident that autonomous ML model development using large language models can make machine learning more accessible in clinical radiology.
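
Why class imbalance calls for critical evaluation can be seen in a small sketch. The example below uses made-up labels, not study data, and compares plain accuracy with balanced accuracy (the average of per-class recall) for a hypothetical model that always predicts the most common tumor type.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Share of each true class that the model labeled correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy data: 8 common cases, 2 rare cases; the model ignores the rare class.
y_true = ["glioblastoma"] * 8 + ["oligodendroglioma"] * 2
y_pred = ["glioblastoma"] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain)                              # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

The plain accuracy looks respectable even though every rare case is missed, which is why a single accuracy figure deserves a second look when classes are unevenly represented.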

“By automating machine learning model creation, GPT-4 enables radiologists without any coding expertise to leverage ML tools, making precision diagnostics more accessible,” Dr. Foltyn-Dumitru concluded. “But perhaps more importantly, the model holds promise for improving glioma classification, providing a non-invasive alternative that could enhance diagnostic precision and treatment planning.”

 

Access the presentation, “The Potential of GPT-4 Advanced Data Analysis (ADA) for Radiomics-Based Machine Learning Models” (S4-SSIN01-1), on demand at RSNA.org/MeetingCentral.