A new study shows that radiology AI models fine-tuned on institutional data outperform general-purpose language models like GPT-4.1 in clinical acceptance, challenging assumptions about off-the-shelf AI solutions in healthcare.
As artificial intelligence increasingly enters clinical practice, radiologists face a choice: adopt general-purpose language models or implement custom AI systems trained on institutional data. This study compared both approaches by having 10 clinicians—including radiologists and oncologists—evaluate AI-generated impressions against human-authored reports in 200 oncologic CT cases from an academic cancer center.
These results suggest domain-specific fine-tuning is essential for clinical AI acceptance. Rather than replacing radiologists, AI impressions work best as flexible drafting tools that reduce cognitive burden while preserving clinician oversight. Implementation should match stakeholder preferences rather than assume a single objective standard for quality.
The study involved a single institutional dataset and focused on oncologic CT. Generalizability to other imaging modalities and healthcare settings remains unclear. Additionally, the inherent subjectivity in ratings limits consensus on what constitutes an optimal impression.
Original paper: Comparison of AI-generated radiology impressions: a multi-stakeholder evaluation. npj Digital Medicine. DOI: 10.1038/s41746-026-02586-6