Aviation Lessons for Human-AI Collaboration in Medicine

Original Title: Flight rules for clinical AI: lessons from aviation for human-AI collaboration in medicine

Journal: npj Digital Medicine

DOI: 10.1038/s41746-026-02410-1

Overview

Medicine and aviation are both high-stakes fields where safety is paramount, and over recent decades healthcare has adopted aviation safety tools such as surgical checklists and incident reporting systems. As artificial intelligence (AI) becomes more deeply integrated into clinical workflows, however, new challenges are emerging that mirror aviation's earlier experience with automation. The automation paradox describes how increasing automation erodes human skills and situational awareness, leaving operators poorly prepared when the system fails. A study of AI-assisted colonoscopy found a 6.0 percentage point absolute reduction in adenoma detection after the AI assistant was withdrawn, a clear dependency effect. The paper argues that the medical community should draw on aviation's experience to optimize human-AI collaboration, moving away from treating AI as an autopilot and toward a digital copilot model in which the clinician remains pilot-in-command. The authors propose five specific considerations: benchmarking clinician performance without AI, redesigning training to ensure fundamental skill development, defining AI literacy competencies, implementing regular simulation-based training, and fostering an operational understanding of how AI systems function.
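
The dependency effect described above is simply an absolute drop in unassisted performance, which suggests a basic screening rule. Here is a minimal sketch in Python; the detection rates and the alert threshold are hypothetical examples, not values from the paper:

```python
# Illustrative sketch: flag a possible dependency effect by comparing a
# clinician's unassisted detection rate against their pre-AI baseline.
# The rates and the 5-point threshold below are hypothetical.

def dependency_effect(baseline_rate: float, unassisted_rate: float,
                      threshold_pp: float = 5.0) -> bool:
    """True if unassisted performance fell more than threshold_pp
    percentage points below the pre-AI baseline."""
    drop_pp = (baseline_rate - unassisted_rate) * 100
    return drop_pp > threshold_pp

# Hypothetical clinician: 30% detection rate before AI adoption,
# 24% once the AI assistant is withdrawn -- a 6.0-point absolute drop.
print(dependency_effect(0.30, 0.24))  # True
```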

Novelty

This perspective introduces a framework for medical AI safety grounded in the historical safety reforms that reshaped aviation from the late 1970s onward. It highlights the risk of "never-skilling" or "mis-skilling" among younger clinicians, who may develop shallower knowledge by relying on AI-based tools during their formative training years. Unlike previous discussions that focused primarily on technical AI performance, this work organizes human-AI interaction into a matrix defined by the level of automation and the level of clinician agency, identifying the high-automation/high-agency quadrant as the optimal configuration for co-intelligent systems. The paper also offers concrete regulatory suggestions, citing the European Union's mandate for pilots to spend 48 hours every 3 years in high-fidelity simulators as a model for medical simulation. Furthermore, it redefines the goal of AI explainability: explanations should be judged by their contribution to operational understanding rather than their technical depth. This shift emphasizes whether a clinician can practically override or disengage a system when parameters drift outside expected bounds, akin to the "golden rules" pilots follow.
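
To make the automation-by-agency framing concrete, here is a minimal sketch of the matrix as a lookup table. The quadrant descriptions other than the co-intelligent one are illustrative assumptions, not the paper's wording:

```python
# Illustrative sketch of the 2x2 automation-by-agency matrix.
# Only the high/high label ("co-intelligent") comes from the paper;
# the other quadrant descriptions are assumptions for illustration.

from enum import Enum

class Level(Enum):
    LOW = "low"
    HIGH = "high"

# Keys: (level of automation, level of clinician agency)
QUADRANTS = {
    (Level.LOW,  Level.LOW):  "minimal support, fully manual practice",
    (Level.HIGH, Level.LOW):  "autopilot: automation erodes clinician agency",
    (Level.LOW,  Level.HIGH): "clinician-led, little AI leverage",
    (Level.HIGH, Level.HIGH): "co-intelligent digital copilot (optimal)",
}

def classify(automation: Level, agency: Level) -> str:
    return QUADRANTS[(automation, agency)]

print(classify(Level.HIGH, Level.HIGH))
# -> co-intelligent digital copilot (optimal)
```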

Potential Clinical / Research Applications

The principles outlined here can guide the development of mandatory AI-free proficiency quotas in specialties such as radiology and pathology, ensuring that practitioners maintain diagnostic accuracy independently. Educational institutions could introduce unannounced interruptions during medical simulations, in which AI support is suddenly withdrawn to test a trainee's readiness to revert to manual practice. In governance, healthcare regulators could move beyond certifying AI as a static medical device and instead certify the human-AI dyad through longitudinal monitoring of concordance rates, allowing drift in both the algorithm and the human user to be detected over time (see the sketch below). The concept of operational understanding can also guide the design of user interfaces for clinical decision support systems: instead of dense technical data, interfaces should provide clear indicators of the model's confidence and the specific factors contributing to a risk score. Such designs would let clinicians quickly determine when to trust, question, or override the AI, improving patient safety in real-world environments.
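
As one way to picture certifying the human-AI dyad, the sketch below tracks a rolling concordance rate between clinician and AI labels and flags drift at either extreme. The window size and alert band are hypothetical parameters, not values from the paper:

```python
# Illustrative sketch: longitudinal monitoring of clinician-AI
# concordance to surface drift in either the algorithm or the human.
# Window size and alert band are hypothetical parameters.

from collections import deque

class ConcordanceMonitor:
    def __init__(self, window: int = 200,
                 low: float = 0.70, high: float = 0.98):
        self.window = deque(maxlen=window)  # rolling agreement record
        self.low, self.high = low, high     # acceptable concordance band

    def record(self, clinician_label: str, ai_label: str) -> None:
        self.window.append(clinician_label == ai_label)

    def concordance(self) -> float:
        if not self.window:
            return float("nan")
        return sum(self.window) / len(self.window)

    def status(self) -> str:
        rate = self.concordance()
        if rate < self.low:
            return "review: clinician and AI diverging (possible model drift)"
        if rate > self.high:
            return "review: near-total agreement (possible over-reliance)"
        return "ok"
```

Flagging unusually high concordance as well as low concordance matters here: near-total agreement can signal automation complacency in the human user just as divergence can signal drift in the model.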

