Can AI Enhance Surgeon Decision-Making in Spine Surgery? LLMs Show Mixed Results in Predicting Post-Kyphoplasty Complications

A new study comparing large language models with traditional machine learning finds that LLMs show promise for predicting bone cement leakage after spine surgery, but fall short on new vertebral fractures and complication subtypes and are not yet ready for clinical use.

Background

Percutaneous kyphoplasty treats osteoporotic vertebral compression fractures, but postoperative complications—particularly bone cement leakage (BCL) and new vertebral fractures (NVF)—can significantly impact patient outcomes. Predictive models could help surgeons identify high-risk patients preoperatively.

Key Findings

Researchers compared two state-of-the-art LLMs (GPT-5 and DeepSeek R1) with traditional machine learning models and spine surgeon predictions:

  • Bone cement leakage: LLMs demonstrated acceptable performance (F1-score 0.857–0.871), comparable to traditional ML and slightly superior to surgeons alone. LLM explanations enhanced surgeon decision-making for BCL prediction.
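  • New vertebral fractures: Zero-shot LLM performance was poor (F1-score 0.309) but improved with few-shot learning, in which a handful of labeled example cases are included in the prompt. A support vector machine with a radial basis function kernel (RBF-SVM) outperformed both LLMs and surgeons for NVF prediction.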
  • Complication subtypes: LLMs performed poorly at predicting specific complication subtypes.
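
For readers unfamiliar with the metric quoted above, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation, not the study's evaluation code. The function name and the patient counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score from true positives, false positives and false negatives."""
    if tp == 0:
        # No correctly predicted complications: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)  # share of predicted complications that were real
    recall = tp / (tp + fn)     # share of real complications that were predicted
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (NOT from the study), for illustration only.
few_misses = f1_score(tp=40, fp=10, fn=10)   # most complications caught
many_misses = f1_score(tp=10, fp=10, fn=30)  # most complications missed

print(round(few_misses, 3))   # 0.8
print(round(many_misses, 3))  # 0.333
```

Because F1 ignores true negatives, a model that misses most actual complications scores low even if it correctly clears many uncomplicated cases, which is one reason the metric is commonly used when complications are rare.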

Why It Matters

These findings challenge assumptions that advanced LLMs universally outperform traditional approaches. LLMs appear to have selective value in clinical decision support—useful for specific tasks but not replacements for conventional ML or surgeon expertise.

Limitations

The single-center design limits generalizability. The authors emphasize that LLMs currently lack the maturity and reliability needed for clinical implementation, and that further validation is necessary before real-world deployment in surgical practice.

Original paper: Comparative performance of LLMs and machine learning in predicting complications after percutaneous kyphoplasty for osteoporotic vertebral compression fractures. npj Digital Medicine. doi:10.1038/s41746-026-02588-4
