Enhancing Clinician Trust in Automated Diagnostic Systems: A Framework Combining Random Forest, XGBoost, and SHAP for Interpretable and Auditable Diabetes Risk Prediction
- Authors
Billy Elly
Lautech (Author)
- Keywords:
- Explainable AI, Diabetes Prediction, Random Forest, XGBoost, SHAP, Clinical Decision Support, Interpretable Machine Learning
- Abstract
Diabetes mellitus affects over 537 million adults globally, and early detection is critical for reducing long-term complications and healthcare costs. Despite advances in machine learning for disease prediction, the "black-box" nature of many high-performing models limits clinical adoption, because clinicians cannot see how a prediction was reached and therefore hesitate to trust it. This study addresses that gap by developing a hybrid predictive framework that integrates Random Forest and XGBoost ensemble classifiers with SHAP (SHapley Additive exPlanations) for interpretable diabetes risk prediction. On the Pima Indians Diabetes dataset, the proposed framework achieves an accuracy of 89.4% and an AUC of 0.91, outperforming the individual models while providing both global feature importance and patient-level explanations. SHAP analysis identified glucose, age, and BMI as the most influential predictors, consistent with the clinical literature. The framework contributes a replicable, audit-ready approach that balances predictive performance with interpretability, enabling clinicians to understand and validate model decisions. This research demonstrates that explainable AI can bridge the gap between algorithmic accuracy and clinical trust, facilitating the integration of automated diagnostic systems into routine healthcare workflows.
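The patient-level explanations described in the abstract rest on Shapley values, which split one prediction into per-feature contributions that sum to the gap between that prediction and a baseline prediction. The following is a minimal sketch of that idea only. It is not the authors' implementation and does not use the SHAP library, Random Forest, or XGBoost. The `toy_risk` function, its weights, the interaction term, and the patient and baseline values are all hypothetical and chosen purely for illustration.

```python
from itertools import permutations


def toy_risk(x):
    """Hypothetical risk score; the weights are invented for illustration."""
    score = (
        0.02 * (x["glucose"] - 100)
        + 0.03 * (x["bmi"] - 25)
        + 0.01 * (x["age"] - 30)
    )
    # Hypothetical interaction: high glucose and high BMI together add risk.
    if x["glucose"] > 140 and x["bmi"] > 30:
        score += 0.5
    return score


def shapley_values(model, patient, baseline):
    """Exact Shapley values by averaging over every feature ordering.

    A feature is treated as "absent" when it takes its baseline value.
    This brute-force approach is only feasible for a handful of features.
    """
    features = list(patient)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)      # start with every feature absent
        previous = model(current)
        for f in order:
            current[f] = patient[f]   # reveal one feature at a time
            updated = model(current)
            totals[f] += updated - previous  # marginal contribution of f
            previous = updated
    return {f: total / len(orderings) for f, total in totals.items()}


# Hypothetical patient and reference values (not taken from the dataset).
patient = {"glucose": 160, "bmi": 33, "age": 50}
baseline = {"glucose": 100, "bmi": 25, "age": 30}

phi = shapley_values(toy_risk, patient, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} contribution: {value:+.3f}")

# Additivity check: contributions sum to prediction minus baseline prediction.
gap = toy_risk(patient) - toy_risk(baseline)
print(f"sum of contributions: {sum(phi.values()):.3f}, prediction gap: {gap:.3f}")
```

The additivity check at the end is what makes such an explanation auditable: a reviewer can confirm that the listed contributions fully account for the patient's score. Tree-based explainers reach the same kind of decomposition without enumerating every ordering, which is what makes the approach practical for ensemble models.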
- Published
- 06/27/2026
- Section
- Articles
- License
Copyright (c) 2026 Billy Elly (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
