Machine learning based prediction of mortality within 30 days after surgery

by Prof. Yi-Jun Kim
Department of Environmental Medicine
yijunkim@ewha.ac.kr

Surgical volumes are rising worldwide, and postoperative complications are rising with them, so accurate prediction of postoperative mortality has become increasingly important. Traditional risk scoring systems such as ASA-PS and POSSUM have limitations, which led us to explore machine-learning models as a way to improve prognostic accuracy. Our primary objective was to develop a machine-learning model that predicts 30-day mortality after non-cardiac surgery, with a focus on objective and quantitative clinical information. We validated the model across multiple medical centers, aiming for a robust, clinically applicable tool that minimizes input errors and demands on staff. We hypothesized that a lighter model could match the performance of more complex ones, and we set out to confirm both its predictive power and its transferability across diverse hospital settings.

Following the guidelines for developing and reporting machine-learning predictive models in biomedical research, our study was approved by the institutional review boards of four medical institutions. It covered 454,404 patients who underwent non-cardiac surgery at these institutions; certain surgeries and cases with incomplete follow-up were excluded. The primary outcome was in-hospital mortality within 30 days after surgery. Variables were selected by consensus among the authors, with priority given to features available at all four hospitals. The selected variables span demographics, preoperative laboratory results, type of surgery, type of anesthesia, and emergency status.

We used both traditional machine-learning methods (logistic regression, LR; random forest, RF) and more advanced techniques (XGBoost; deep neural network, DNN), and divided each hospital's dataset into training, validation, and test sets. We handled missing values, preprocessed the data, and applied bootstrap and tenfold cross-validation to make the models more robust. Two model types were built: a conventional model using all variables and a lab model using only 12 laboratory test parameters. Each model was validated locally within its own hospital and externally on the other hospitals. Performance was measured by the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC), with statistical comparisons between models. Calibration plots were used to assess the agreement between observed and expected values, and feature importance was examined using Shapley additive explanations (SHAP) values.
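The two evaluation metrics named above can be made concrete with a small example. The following is a minimal sketch for illustration only and is not the code used in the study: it computes AUROC from pairwise comparisons, computes AUPRC as average precision, and estimates a percentile bootstrap confidence interval. The function names, the toy labels and scores, and the choice of 200 bootstrap resamples are all assumptions made for the example.

```python
import random


def auroc(y_true, y_score):
    """AUROC as the fraction of (death, survivor) pairs ranked correctly.

    Ties count as half a win. This is O(n^2), which is fine for a sketch.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """AUPRC as average precision: sum of precision times the recall gained."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    pairs = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    tp, ap, i = 0, 0.0, 0
    while i < len(pairs):
        # Treat all samples sharing one score as a single threshold step.
        j, tp_group = i, 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp_group += pairs[j][1]
            j += 1
        tp += tp_group
        ap += (tp_group / n_pos) * (tp / j)  # recall gain * precision so far
        i = j
    return ap


def bootstrap_ci(metric, y_true, y_score, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a metric (resampling patients)."""
    if len(set(y_true)) < 2:
        raise ValueError("need both classes to bootstrap")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[k] for k in idx]
        if len(set(yt)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(metric(yt, [y_score[k] for k in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Toy data (hypothetical): 1 = died within 30 days, score = predicted risk.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print("AUROC:", auroc(y_true, y_score))
print("AUPRC:", average_precision(y_true, y_score))
print("AUROC 95% CI:", bootstrap_ci(auroc, y_true, y_score))
```

Because 30-day mortality is a rare outcome, AUPRC is a useful complement to AUROC: it reflects how many of the flagged patients actually die, which AUROC alone can hide when survivors greatly outnumber deaths.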

Using data from four medical institutions, we developed a machine-learning prediction model for 30-day mortality after non-cardiac surgery. The study populations differed across hospitals in age, types of surgery, and rates of emergency procedures. Machine-learning models, particularly XGBoost, predicted mortality better than conventional methods across the hospitals. External validation supported the transferability of the model, although performance varied between institutions. Feature importance analysis identified different influential variables in each model, so each model needs to be interpreted on its own terms. Together, these findings point to the potential clinical utility of a machine-learning model for predicting postoperative mortality.
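The local and external validation scheme can be sketched as a train-on-one, test-on-all loop. The example below is a hedged illustration on synthetic data, not the study's pipeline: the hospital names "A", "B" and "C", the two made-up features, their coefficients, and the small gradient-descent logistic regression are all assumptions chosen to keep the sketch self-contained.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def auroc(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes in the test set")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_hospital(rng, n, coefs, intercept=-2.5):
    """Synthetic cohort: each hospital has its own feature-outcome relationship."""
    rows = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in coefs]
        p = sigmoid(intercept + sum(c * v for c, v in zip(coefs, x)))
        rows.append((x, 1 if rng.random() < p else 0))
    return rows


def fit_logistic(rows, lr=0.5, epochs=200):
    """Plain batch gradient descent on the logistic loss."""
    dim = len(rows[0][0])
    w, b, n = [0.0] * dim, 0.0, len(rows)
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in rows:
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for k in range(dim):
                gw[k] += err * x[k]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def cross_hospital_auroc(cohorts, train_frac=0.8):
    """Train one model per hospital, then score it on every hospital's test set."""
    splits = {}
    for name, rows in cohorts.items():
        cut = int(len(rows) * train_frac)
        splits[name] = (rows[:cut], rows[cut:])
    results = {}
    for src, (train, _) in splits.items():
        w, b = fit_logistic(train)
        for dst, (_, test) in splits.items():
            labels = [y for _, y in test]
            scores = [b + sum(wi * xi for wi, xi in zip(w, x)) for x, _ in test]
            results[(src, dst)] = auroc(labels, scores)
    return results


rng = random.Random(42)
cohorts = {
    "A": make_hospital(rng, 600, [1.5, 0.5]),
    "B": make_hospital(rng, 600, [1.0, 1.0]),
    "C": make_hospital(rng, 600, [0.5, 1.5]),
}
results = cross_hospital_auroc(cohorts)
for (src, dst), value in sorted(results.items()):
    kind = "local" if src == dst else "external"
    print(f"train {src} -> test {dst} ({kind}): AUROC = {value:.3f}")
```

Entries where the training and test hospital match correspond to local validation, and the remaining entries correspond to external validation. Comparing the two is one simple way to see how much performance a model loses when it is moved to a hospital with a different patient mix.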

Our study aims to deliver a pragmatic artificial intelligence (AI) model for predicting surgical outcomes that can be used in real-world clinical settings. The ideal model is transferable between hospitals, requires minimal data input and labor, and maintains accuracy comparable to existing models. We therefore developed a model that uses only objective and quantitative data from electronic medical records, which reduces variability and increases the volume of usable data. The results show that the model's predictive power remains robust even with a minimal number of variables and that it holds up when applied to other hospitals, an indication of its transferability. Despite its limitations, our research contributes to the development of robust, generalizable AI models with real-world applicability.

Figure. Schematic diagram of external validation of each hospital model

* Related Article
Seung Wook Lee, Hyung-Chul Lee, Jungyo Suh, Kyung Hyun Lee, Heonyi Lee, Suryang Seo, Tae Kyong Kim, Sang-Wook Lee, Yi-Jun Kim, Multi-center validation of machine learning model for preoperative prediction of postoperative mortality, NPJ Digital Medicine, Vol. 5(1), 91, July 2023