KT-LLM: An Auditable Framework for Kidney Transplant Care

Original Title: KT-LLM: an evidence-grounded and sequence text framework for auditable kidney transplant modeling

Journal: npj Digital Medicine

DOI: 10.1038/s41746-025-02323-5

Overview

Managing kidney transplantation involves complex longitudinal data and strict regulatory policies that are often difficult to align. This study presents KT-LLM, a framework designed to bridge the gap between structured patient follow-up data and the textual rules governing clinical practice. The system uses a modular architecture of three specialized agents coordinated by a large language model. Agent-A, a Mamba-based sequence model, predicts patient survival and graft loss. Agent-B identifies distinct patient subgroups through deep embedded clustering. Agent-C translates policy documents into executable rules that check compliance with reporting deadlines and terminology. In evaluations on national registry data, the survival model achieved a C-index of 0.82 for patient death and 0.80 for graft loss, outperforming established deep survival baselines, which recorded 0.79 and 0.77, respectively. The system also attained a question-answering accuracy of 91.8% on kidney-specific pathology tasks and an evidence hit rate of 83.5%, indicating that most answers were grounded in authoritative medical sources.
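
For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration of the metric, not the paper's evaluation code. The function name and the toy data are assumptions, and pairs with tied event times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    times  -- observed follow-up time per patient
    events -- 1 if the event (death or graft loss) was observed, 0 if censored
    risks  -- predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the earlier time is an observed event.
        # Tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): four patients, one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the gap between 0.82 and 0.79 reflects a modest improvement in how often the higher-risk patient of a comparable pair is ranked first.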

Novelty

The novelty of this research lies in its verifiable orchestration layer that integrates retrieval-augmented generation with specialized sequence modeling. Unlike conventional medical AI models that focus solely on predictive metrics, this framework introduces a system where textual rules become computable checklists. It employs a selective state space model, known as Mamba, which allows for efficient processing of long-term patient histories in linear time, avoiding the high computational costs associated with standard transformer architectures. Another distinct feature is the inclusion of an evidence pointer head and a coverage gate. These components enforce multi-source grounding, meaning the model must cite specific clauses from official documents like the Banff classification or registry policies before generating an answer. This design shifts the focus from manual governance to an automated, auditable process where every output is linked to a versioned policy or terminology source. By anchoring reasoning to an external governance clock, the system ensures that clinical predictions remain synchronized with the latest regulatory updates without requiring constant retraining of the primary model.
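
The summary does not spell out how the coverage gate is implemented, so the following is only a hypothetical sketch of the idea: an answer is released only if it cites specific, versioned clauses from enough distinct authoritative sources. The `Citation` type, the `coverage_gate` function, the `min_sources` threshold, and all source names, versions, and clause labels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    source: str   # hypothetical label, e.g. "Banff classification"
    version: str  # version of the cited document
    clause: str   # specific clause or section that was cited


def coverage_gate(citations, allowed_sources, min_sources=2):
    """Return (passed, covered_sources).

    A citation counts only if it names an allowed source and carries both
    a version and a clause. The gate passes when at least `min_sources`
    distinct allowed sources are covered.
    """
    covered = {
        c.source
        for c in citations
        if c.source in allowed_sources and c.version and c.clause
    }
    return len(covered) >= min_sources, sorted(covered)


# Placeholder sources and clauses, not real document contents.
ALLOWED = {"Banff classification", "Registry policy"}
answer_citations = [
    Citation("Banff classification", "example-version", "example clause A"),
    Citation("Registry policy", "example-version", "example clause B"),
]
passed, covered = coverage_gate(answer_citations, ALLOWED)
print(passed, covered)
```

Keeping the version on every citation is what would make the output auditable under this design: when a policy is revised, earlier answers can be traced to the edition that was in force at the time.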

Potential Clinical / Research Applications

One application is automated compliance monitoring for transplant centers. The system can proactively identify missing follow-up forms or flag cases where terminology does not match the latest Banff criteria, thereby reducing reporting errors. In a research context, the framework provides a standardized method for multi-center outcome analysis, allowing investigators to compare graft survival rates while adjusting for center-specific policy variations. The predictive capabilities of the survival agent can assist clinicians in personalizing follow-up schedules based on individual risk trajectories. Additionally, the population clustering agent can be used to identify patients who may benefit from targeted interventions, supporting more equitable care delivery. Beyond kidney transplantation, the modular architecture could be adapted to other complex medical fields that rely on both longitudinal data and evolving clinical guidelines, such as oncology or chronic disease management.
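
To make the compliance-monitoring idea concrete, here is a minimal sketch of a reporting rule expressed as a computable checklist that flags overdue follow-up forms. The form names, due intervals, and grace periods are invented for illustration and do not reflect any actual registry policy. The function `overdue_forms` is likewise hypothetical.

```python
from datetime import date, timedelta

# Hypothetical rules: form name -> (days after transplant, grace period in days).
FOLLOW_UP_RULES = {
    "6-month follow-up": (183, 30),
    "1-year follow-up": (365, 60),
}


def overdue_forms(transplant_date, submitted, today, rules=FOLLOW_UP_RULES):
    """List (form, deadline) pairs for forms not submitted by their deadline."""
    overdue = []
    for form, (due_after, grace) in rules.items():
        deadline = transplant_date + timedelta(days=due_after + grace)
        if form not in submitted and today > deadline:
            overdue.append((form, deadline))
    return overdue


# Example: the 6-month form was filed, the 1-year form was not.
flags = overdue_forms(
    transplant_date=date(2023, 1, 1),
    submitted={"6-month follow-up"},
    today=date(2024, 6, 1),
)
for form, deadline in flags:
    print(f"{form} overdue since {deadline.isoformat()}")
```

Because the rules live in a plain data table rather than in model weights, a policy change in this sketch would only require editing the table, which mirrors the summary's point about staying current without retraining.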

