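Application of Unbalanced-Ensemble Algorithms to Prognostic Prediction of All-Cause Mortality in Patients with Coronary Heart Disease Comorbid with Hypertension
Authors Zan J, Dong X, Yang H, Yan J, He Z, Tian J, Zhang Y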
Received 5 April 2024
Accepted for publication 24 July 2024
Published 6 August 2024 Volume 2024:17 Pages 1921–1936
DOI https://doi.org/10.2147/RMHP.S472398
Checked for plagiarism Yes
Review by Single anonymous peer review
Peer reviewer comments 2
Editor who approved publication: Dr Gulsum Kubra Kaya
Jiaxin Zan,1,2 Xiaojing Dong,1,2 Hong Yang,1,2 Jingjing Yan,1,2 Zixuan He,3 Jing Tian,3 Yanbo Zhang1,2,4
1Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People’s Republic of China; 2Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Taiyuan, People’s Republic of China; 3Department of Cardiology, The First Hospital of Shanxi Medical University, Taiyuan, People’s Republic of China; 4School of Health Services and Management, Shanxi University of Chinese Medicine, Taiyuan, People’s Republic of China
Correspondence: Jing Tian; Yanbo Zhang, School of Public Health, Shanxi Medical University, 56 Xinjian Road, Taiyuan, Shanxi Province, People’s Republic of China, Tel/Fax +86 15535406059, Email 1105551933@qq.com; sxmuzyb@126.com
Purpose: This study sought to develop an unbalanced-ensemble model that could accurately predict death outcomes of patients with comorbid coronary heart disease (CHD) and hypertension and evaluate the factors contributing to death.
Patients and Methods: Medical records were collected for 1058 patients with coronary heart disease combined with hypertension; patients with acute coronary syndrome were excluded. Patients were followed up at the first, third, sixth, and twelfth months after discharge to record death events, and follow-up ended two years after discharge. Patients were divided into survival and nonsurvival groups. Gender, smoking, drinking, COPD, cerebral stroke, diabetes, hyperhomocysteinemia, heart failure, renal insufficiency, and other influencing factors were extracted from the medical records and compared between the two groups, and feature selection was then carried out to construct the models. Because the data were unbalanced, we developed four unbalanced-ensemble prediction models based on Balanced Random Forest (BRF), EasyEnsemble, RUSBoost, and SMOTEBoost, together with two base classification models based on AdaBoost and logistic regression. The hyperparameters of each model were tuned using GridSearchCV, and the models were evaluated using area under the curve (AUC), sensitivity, recall, Brier score, and geometric mean (G-mean). Additionally, to understand the influence of variables on model performance, we constructed a SHapley Additive exPlanations (SHAP) model based on the optimal model.
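As a rough illustration of how an unbalanced-ensemble method such as BRF copes with class imbalance, the sketch below draws one class-balanced bootstrap sample: each class contributes as many rows (sampled with replacement) as the minority class contains. This is a minimal sketch of the general idea, not the authors' implementation; the function name `balanced_bootstrap` and the toy labels are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return row indices for one class-balanced bootstrap sample.

    Every class is sampled with replacement down (or up) to the size
    of the smallest class, so the classes are equally represented.
    """
    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # The minority class size sets the per-class sample size.
    n_minority = min(len(rows) for rows in by_class.values())

    sample = []
    for rows in by_class.values():
        sample.extend(rng.choices(rows, k=n_minority))
    rng.shuffle(sample)
    return sample


# Hypothetical toy labels (NOT the study data): 1 = died, 0 = survived.
labels = [0] * 90 + [1] * 10
rng = random.Random(0)
sample = balanced_bootstrap(labels, rng)
counts = Counter(labels[i] for i in sample)
print(counts)
```

In a balanced random forest, a sample of this kind would be drawn separately for each tree, so the majority class is still covered across the whole ensemble even though every individual tree sees a balanced subset.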
Results: The nonsurvival and survival groups differed significantly in age, heart rate, COPD, cerebral stroke, heart failure, and renal insufficiency. Among all models, BRF yielded the highest AUC (0.810; 95% CI, 0.778–0.839), sensitivity (0.990; 95% CI, 0.981–1.000), recall (0.990; 95% CI, 0.981–1.000), and G-mean (0.806; 95% CI, 0.778–0.827), and the lowest Brier score (0.181; 95% CI, 0.178–0.185). Therefore, we identified BRF as the optimal model. Furthermore, red blood cell count (RBC), body mass index (BMI), and lactate dehydrogenase were found to be important mortality-associated risk factors.
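For readers less familiar with the imbalance-aware metrics reported here, the following minimal sketch shows how G-mean and the Brier score are conventionally defined: G-mean is the square root of sensitivity times specificity (sensitivity being the same quantity as recall), and the Brier score is the mean squared difference between the predicted probability and the observed 0/1 outcome. The toy labels, the probabilities, and the 0.5 decision threshold are illustrative assumptions, not values from the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (recall) and specificity.

    Assumes both classes are present in y_true.
    """
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy example (NOT the study data): 1 = died, 0 = survived.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.2, 0.1, 0.3, 0.2, 0.1]
# Assumed decision threshold of 0.5 for turning probabilities into labels.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

gm = g_mean(y_true, y_pred)
bs = brier_score(y_true, y_prob)
print(f"G-mean = {gm:.3f}, Brier score = {bs:.3f}")
```

Because G-mean collapses toward zero when either sensitivity or specificity is poor, it penalises a model that simply predicts the majority class, which plain accuracy does not.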
Conclusion: BRF combined with advanced machine learning methods and SHAP is highly effective and accurately predicts mortality in patients with CHD comorbid with hypertension. This model has the potential to assist clinicians in modifying treatment strategies to improve patient outcomes.
Keywords: coronary heart disease comorbid with hypertension, ensemble learning, balanced random forest, SHAP, prognosis