Robust CRC Diagnosis via Causal and Uncertainty-Aware AI

Original Title: Uncertainty-aware and causal test-time adaptive foundation model for robust colorectal cancer pathology diagnosis

Journal: npj Digital Medicine

DOI: 10.1038/s41746-025-02149-1

Overview

Colorectal cancer remains a major global health challenge, requiring precise histopathological analysis for effective treatment. While computational pathology has advanced with the use of large-scale foundation models, these systems frequently encounter obstacles when deployed in real-world clinical settings. Key issues include domain shifts caused by variations in staining protocols and scanner hardware, as well as the tendency for models to provide overconfident yet incorrect predictions. This paper introduces UAD-FM, an uncertainty-aware and causally adaptive foundation model designed to address these limitations. The framework integrates a variational Bayesian approach to decompose uncertainty into epistemic and aleatoric components, alongside a causal test-time adaptation mechanism. By evaluating the model across five diverse datasets, including TCGA-COAD/READ and DigestPath 2019, the researchers demonstrate that UAD-FM maintains high performance and reliability even when faced with unfamiliar data distributions. The system is designed to bridge the gap between experimental AI performance and the rigorous requirements of clinical diagnostic environments.

Novelty

The technical contribution of UAD-FM lies in combining three methods within a single foundation model architecture. First, it employs a variational uncertainty decomposition head that distinguishes model-related (epistemic) uncertainty from inherent data noise (aleatoric uncertainty). Second, it introduces causal test-time adaptation, using do-interventions to separate biological content from non-causal style variables such as staining artifacts. This lets the model ignore spurious correlations that often mislead standard deep learning systems. Third, the framework incorporates post-hoc clinical calibration to align prediction confidence with empirical accuracy. Quantitatively, UAD-FM achieves an AUROC of 0.945 on the TCGA dataset, outperforming established models such as UNI (0.923) and Virchow2 (0.931). It also reduces the Expected Calibration Error to 0.031, compared with 0.089 for traditional Monte Carlo dropout. Together, these components make the model accurate while also giving a reliable measure of its own limitations.
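To make the uncertainty and calibration terms concrete, here is a minimal sketch in plain Python. It uses a common entropy-based decomposition over several stochastic forward passes (total = entropy of the mean prediction, aleatoric = mean per-pass entropy, epistemic = the difference) and a standard equal-width-bin Expected Calibration Error. This is an assumption for illustration: the paper's variational head may compute these quantities differently, and the function names below are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(samples):
    """Split predictive uncertainty for ONE input into three parts.

    samples: list of class-probability vectors, one per stochastic
    forward pass (e.g. draws from an approximate posterior).
    Illustrative entropy-based decomposition, not the paper's exact head.
    """
    n_samples = len(samples)
    n_classes = len(samples[0])
    mean_probs = [
        sum(s[c] for s in samples) / n_samples for c in range(n_classes)
    ]
    total = entropy(mean_probs)                              # predictive entropy
    aleatoric = sum(entropy(s) for s in samples) / n_samples  # expected entropy
    epistemic = total - aleatoric                             # disagreement between passes
    return total, aleatoric, epistemic


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins.

    confidences: top-class probability per case, each in [0, 1].
    correct: booleans, True where the top-class prediction was right.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece
```

Under this decomposition, passes that agree on an ambiguous prediction (every pass returns 50/50) yield purely aleatoric uncertainty, while passes that confidently disagree yield purely epistemic uncertainty. An ECE near zero means that reported confidence closely tracks observed accuracy.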

Potential Clinical / Research Applications

One application is clinical triage, where the model could automatically flag the 10% most uncertain cases for expert review. In simulations, this strategy improved diagnostic accuracy from 0.881 to 0.907 and reduced high-confidence errors by 32%. In multi-institutional research networks, UAD-FM can facilitate pooling data from centers with different scanning technologies without extensive manual normalization. The model also provides fine-grained gland segmentation, achieving a Dice score of 0.872, which is useful for automated grading and prognosis modeling. In addition, the interpretable uncertainty maps generated by the system can serve as educational tools for pathology trainees, highlighting regions of diagnostic ambiguity that require closer inspection. By providing a transparent measure of confidence, the model supports a collaborative workflow in which AI handles routine cases and humans focus on complex, high-uncertainty samples.
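The triage idea can be sketched in a few lines of plain Python: rank cases by an uncertainty score, send the top fraction to a pathologist, and recompute accuracy. The function names are hypothetical, and the sketch assumes that expert review resolves every flagged case correctly. That is a simplification and not necessarily how the paper's simulation was set up.

```python
def flag_most_uncertain(uncertainties, fraction=0.10):
    """Return the indices of the most uncertain cases.

    uncertainties: one score per case (higher = less certain).
    fraction: share of cases to route to expert review.
    """
    n_cases = len(uncertainties)
    if n_cases == 0:
        return set()
    n_flag = max(1, round(fraction * n_cases))  # always review at least one case
    ranked = sorted(range(n_cases), key=lambda i: uncertainties[i], reverse=True)
    return set(ranked[:n_flag])


def accuracy_with_review(correct, flagged):
    """Accuracy after review of the flagged cases.

    correct: booleans, True where the model alone was right.
    flagged: indices sent to an expert.
    ASSUMPTION: the expert gets every flagged case right.
    """
    resolved = [ok or (i in flagged) for i, ok in enumerate(correct)]
    return sum(resolved) / len(resolved)


# Toy example with made-up numbers: ten cases, two model errors.
scores = [0.10, 0.90, 0.20, 0.30, 0.05, 0.40, 0.15, 0.25, 0.35, 0.12]
model_correct = [True, False, True, True, True, True, False, True, True, True]

flagged = flag_most_uncertain(scores, fraction=0.10)
accuracy_before = sum(model_correct) / len(model_correct)
accuracy_after = accuracy_with_review(model_correct, flagged)
```

In this toy example only one of the two errors has a high uncertainty score, so review recovers that error but not the other. Triage helps only to the extent that uncertainty and error coincide, which is why calibration matters for this workflow.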

