Assessing ChatGPT in Diagnosing Degenerative Diseases

Original Title: Clinical Manifestations

Journal: Alzheimer's & dementia : the journal of the Alzheimer's Association

DOI: 10.1002/alz70857_101996

Overview

This study evaluates the clinical performance of ChatGPT version 3.5 in diagnosing neurodegenerative diseases. Earlier research found that the model answered 45.1% of neurology residency exam questions correctly; this investigation instead uses nine case reports from the journal Dementia and Neurocognitive Disorders. The methodology used a two-stage interaction to simulate the diagnostic process. First, the model received each patient's symptoms, medical history, and physical findings and was asked to generate differential diagnoses and suggest diagnostic procedures. Second, it received the specific laboratory and imaging results and was asked for a final diagnosis. This design tests how the model handles clinical information that arrives incrementally. The model included the correct diagnosis in its initial differential list in only 33.3% of cases (three of nine). It nevertheless identified appropriate diagnostic methods in 88.9% of cases (eight of nine).
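
The two-stage interaction described above can be sketched as a pair of prompt builders. This is only an illustration under stated assumptions: the study does not publish its prompts or a data schema, so the CaseReport fields, the function names, the prompt wording, and the example case are all hypothetical. The one property taken from the source is that test results are withheld in the first stage and revealed in the second.

```python
from dataclasses import dataclass


@dataclass
class CaseReport:
    # Hypothetical fields; the study does not publish a data schema.
    symptoms: str
    history: str
    physical_findings: str
    test_results: str  # laboratory and imaging results, withheld until stage 2


def stage_one_prompt(case: CaseReport) -> str:
    """Stage 1: clinical presentation only, no test results."""
    return (
        f"Symptoms: {case.symptoms}\n"
        f"Medical history: {case.history}\n"
        f"Physical findings: {case.physical_findings}\n"
        "List the differential diagnoses and the diagnostic procedures "
        "you would order."
    )


def stage_two_prompt(case: CaseReport) -> str:
    """Stage 2: reveal the objective results and ask for a final diagnosis."""
    return (
        f"Laboratory and imaging results: {case.test_results}\n"
        "Given these results, state the single most likely final diagnosis."
    )


# Made-up placeholder case, not taken from the study.
example = CaseReport(
    symptoms="progressive memory loss",
    history="hypertension",
    physical_findings="normal gait",
    test_results="placeholder imaging summary",
)
print(stage_one_prompt(example))
print(stage_two_prompt(example))
```

Keeping the test results out of the first prompt is what lets the two stages be scored separately: the first for the differential list and the suggested workup, the second for the final diagnosis.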

Novelty

This research transitions from evaluating general knowledge through standardized testing to assessing clinical reasoning using peer-reviewed case reports. Unlike studies focused on multiple-choice questions, this requires the model to synthesize clinical descriptions and suggest logical steps in a workup. The study highlights significant improvement when the model receives objective test results. Final diagnostic accuracy increased to 77.8%, with the model identifying the disease in seven out of nine cases after receiving laboratory data. This demonstrates the model's capacity to refine its output based on clinical evidence, reflecting the iterative nature of the medical diagnostic process. By focusing specifically on dementia, the research provides a specialized benchmark for performance in chronic conditions that present with complex, overlapping symptoms.
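
The reported percentages follow directly from counts out of nine cases, as the short check below shows. The counts of eight and seven are stated in the summary; the count of three for the initial differential list is inferred here from the reported 33.3% and is an assumption. The function and variable names are illustrative only.

```python
TOTAL_CASES = 9

# Correct outcomes per measure. Eight and seven are stated in the summary;
# three is inferred from the reported 33.3% of nine cases.
correct_counts = {
    "initial differential included diagnosis": 3,
    "appropriate diagnostic methods": 8,
    "final diagnosis after test results": 7,
}


def percent(correct: int, total: int) -> float:
    """Return the share of correct cases as a percentage, to one decimal."""
    return round(100 * correct / total, 1)


for measure, count in correct_counts.items():
    print(f"{measure}: {count}/{TOTAL_CASES} = {percent(count, TOTAL_CASES)}%")
```

With only nine cases, each case moves a percentage by about 11 points, which is worth keeping in mind when comparing the three figures.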

Potential Clinical / Research Applications

These findings suggest several applications in medical education and clinical support. The model can help students practice formulating diagnostic plans and selecting appropriate laboratory tests. Given its 88.9% accuracy in recommending methods, it could serve as a digital checklist to ensure standard protocol adherence. In primary care settings, it could assist practitioners in identifying necessary tests before making a specialist referral. Research could scale this methodology to evaluate how artificial intelligence handles atypical dementia cases. By automating case history analysis, researchers can identify diagnostic error patterns, leading to refined decision support systems. The model provides a consistent framework for processing clinical data in the management of degenerative diseases.
