Published Article

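Development of an Interpretable Machine Learning Model for Early Prediction of Cardiovascular Involvement in Patients with Systemic Lupus Erythematosus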

 

Authors Deng Z, Liu H, Chen F, Liu Q, Wang X, Wang C, Lyu C, Li J, Li T

Received 12 April 2025

Accepted for publication 14 June 2025

Published 1 July 2025 Volume 2025:18 Pages 8629–8641

DOI https://doi.org/10.2147/JIR.S526608

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Dr Qing Lin

Zixian Deng,1 Huadong Liu,1 Feng Chen,2 Qiyun Liu,1 Xiaoyu Wang,3 Caiping Wang,1 Chuangye Lyu,1 Jianghua Li,1 Tangzhiming Li1,4 

1Department of Cardiology, Shenzhen People’s Hospital (The First Affiliated Hospital, Southern University of Science and Technology; The Second Clinical Medical College, Jinan University), Shenzhen, 518020, People’s Republic of China; 2Cardiology Department, Peking University First Hospital, Beijing, People’s Republic of China; 3Department of Cardiology, Long Gang Central Hospital of Shenzhen, Shenzhen, Guangdong, 518116, People’s Republic of China; 4Department of Cardiology, Heping County People’s Hospital, Heyuan, Guangdong, People’s Republic of China

Correspondence: Jianghua Li, Department of Cardiology, Shenzhen People’s Hospital (The First Affiliated Hospital, Southern University of Science and Technology; The Second Clinical Medical College, Jinan University), Shenzhen, 518020, People’s Republic of China, Email Lijianghua06@126.com; Tangzhiming Li, Department of Cardiology, Shenzhen People’s Hospital (The First Affiliated Hospital, Southern University of Science and Technology; The Second Clinical Medical College, Jinan University), Shenzhen, 518020, People’s Republic of China, Email litangzhiming@126.com

Background: Cardiovascular disease is a leading cause of death in systemic lupus erythematosus (SLE). Early prediction of cardiac involvement is critical for improving patient outcomes. This study aimed to identify key factors associated with cardiac involvement in SLE and to develop an interpretable machine learning (ML) model for risk prediction.
Methods: We conducted a retrospective analysis of 1,023 SLE patients hospitalized at Shenzhen People’s Hospital between January 2000 and December 2021. The median age at hospitalization was 31 years (IQR: 25–39 years), 92.1% of patients were female, and 18.77% developed cardiovascular involvement during a median follow-up of 3,737 days (IQR: 1,920–5,246 days). The most predictive features were selected by taking the intersection of three feature selection techniques: Random Forest, LASSO, and XGBoost. Models were trained on 70% of the dataset and tested on the remaining 30%. Among the seven algorithms evaluated, the Gradient Boosting Machine (GBM) demonstrated the best performance on the test set. Model interpretability was assessed using the DALEX package, which generated feature importance plots and instance-level breakdown profiles to visualize the model's decision-making logic.
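The selection-by-intersection and 70/30 split steps described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper names `intersect_selected` and `train_test_split_indices` are invented here, the three selector outputs are hypothetical placeholders rather than the study's results, and a simple unstratified random split with a fixed seed is assumed because the abstract does not say how the split was drawn.

```python
import random


def intersect_selected(*selections):
    """Return the features chosen by every selector, in sorted order."""
    common = set(selections[0])
    for chosen in selections[1:]:
        common &= set(chosen)
    return sorted(common)


def train_test_split_indices(n, train_frac=0.7, seed=0):
    """Shuffle row indices 0..n-1 and cut them into train and test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_frac)
    return idx[:cut], idx[cut:]


# Hypothetical selector outputs (placeholders, NOT the study's actual picks).
rf_pick = {"arthritis", "hypertension", "HDL-C", "CRP", "age"}
lasso_pick = {"arthritis", "hypertension", "HDL-C", "CRP", "ESR"}
xgb_pick = {"arthritis", "hypertension", "HDL-C", "CRP", "LDL-C"}

# Keep only the features on which all three selectors agree.
shared = intersect_selected(rf_pick, lasso_pick, xgb_pick)

# 1,023 is the cohort size reported in the abstract; the split method is assumed.
train_idx, test_idx = train_test_split_indices(1023)
```

Taking the intersection is a conservative choice: a feature survives only if all three methods keep it, which favors a small, stable predictor set over maximal coverage.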
Results: Over a median follow-up of 3,737 days, 192 (18.77%) patients developed cardiac involvement. Seven key predictors (arthritis, hypertension, HDL-C, LDL-C, total cholesterol, CRP, and ESR) were identified from 51 clinical and biological variables recorded at admission. The Gradient Boosting Machine (GBM) model performed best of the seven models (AUC: 0.748, Accuracy: 0.779, Precision: 0.605, Recall: 0.338, F1 score: 0.433).
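As a quick consistency check on the reported metrics, the F1 score can be recomputed from precision and recall using the standard harmonic-mean definition. The sketch below assumes only that definition; the function name `f1_from_precision_recall` is illustrative, and the inputs are the rounded values quoted in the Results.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded values as reported for the GBM model in the Results.
reported_precision = 0.605
reported_recall = 0.338
reported_f1 = 0.433

recomputed_f1 = f1_from_precision_recall(reported_precision, reported_recall)

# The inputs are rounded to three decimals, so allow a small tolerance.
consistent = abs(recomputed_f1 - reported_f1) < 0.002
```

The recomputed value agrees with the reported F1 to within rounding of the inputs. The gap between precision and recall suggests the model, at its reported operating point, misses a substantial share of true cases while keeping false alarms lower.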
Conclusion: This study is the first to develop an interpretable ML model to predict the risk of cardiac involvement in SLE. The GBM model performed best among the models evaluated, and its interpretability allows clinicians to visualize its decision-making process, facilitating early identification of high-risk patients.

Keywords: systemic lupus erythematosus, cardiovascular involvement, machine learning, prediction model, interpretability