A Machine Learning Model Incorporating Preoperative Blood Markers for Early Non-Invasive Detection of Endometrial Cancer
Authors Wang J, Wu H, Wang F, Wang Y, Gu J
Received 11 April 2025
Accepted for publication 28 July 2025
Published 12 August 2025 Volume 2025:18 Pages 10873–10884
DOI https://doi.org/10.2147/JIR.S530974
Checked for plagiarism Yes
Review by Single anonymous peer review
Peer reviewer comments 3
Editor who approved publication: Dr Felix Marsh-Wakefield
Jia Wang,1,* HaoTian Wu,2,* Fei Wang,3 YingXiang Wang,1 Jian Gu1
1Department of Gynecology, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, Guangdong, 510000, People’s Republic of China; 2Department of Neurology, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, Guangdong, 510000, People’s Republic of China; 3Department of Spacial AI, Li Auto Inc., Beijing, 100020, People’s Republic of China
*These authors contributed equally to this work
Correspondence: Jian Gu, Email gujian@mail.sysu.edu.cn
Background: Endometrial cancer (EC) incidence is rising globally, yet early diagnosis remains challenging. Our objective is to develop a non-invasive, preoperative tool to predict EC risk using machine learning (ML) techniques.
Methods: This retrospective analysis included clinical data from patients with endometrial lesions at the Third Affiliated Hospital of Sun Yat-sen University between January 2014 and August 2024. Six machine learning techniques, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), Logistic Regression (LR) and Multi-Layer Perceptron (MLP), were used to construct prediction models for endometrial cancer. Receiver operating characteristic (ROC) curves were used to evaluate the models. SHapley Additive exPlanations (SHAP) analysis was applied to determine the predictive role of each feature in the model with the highest predictive performance.
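As a rough illustration of the ROC-based evaluation mentioned above, the area under the ROC curve (AUC) can be computed as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below is a minimal pure-Python version of that idea; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the authors' code or data.

```python
def roc_auc(y_true, y_score):
    """AUC via pairwise comparison of positive and negative scores.

    Equivalent to the area under the ROC curve: the fraction of
    (positive, negative) pairs in which the positive case scores
    higher, with ties counted as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = cancer, 0 = benign lesion.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would scale better on larger data.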
Results: A total of 857 patients were included in the study. Eight baseline characteristics (age, BMI, gravidity, parity, family history, menopause status, diabetes, hypertension), one imaging feature (endometrial thickness) and eight peripheral blood-based markers (WBC, NLR, MLR, PLR, SII, SIRI, CA-125, HE4) were selected to develop and validate the machine learning models; all of these features were obtained noninvasively. Data from 686 patients were randomly assigned to the training group, and data from 171 patients were used for internal validation. Among the six machine learning models, GBDT showed the highest predictive performance, achieving an AUC of 0.95 (95% CI: 0.93–0.97), an accuracy of 90.0% and a Brier score of 0.06. The SHAP analysis showed that HE4, CA-125 and SIRI were the most influential contributors to the prediction.
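The abstract does not spell out how the blood-based indices or the Brier score were calculated. The sketch below uses the definitions conventionally found in the literature (for example, SII as platelets × neutrophils / lymphocytes, and the Brier score as the mean squared difference between predicted probability and outcome); whether the authors used exactly these formulas is an assumption. The function names and input values are hypothetical.

```python
def inflammatory_indices(neutrophils, lymphocytes, monocytes, platelets):
    """Derive ratio-based inflammatory indices from absolute blood counts.

    All counts are assumed to share the same unit (e.g. 10^9 cells/L).
    Formulas follow the conventional definitions, which the abstract
    itself does not state.
    """
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return {
        "NLR": neutrophils / lymphocytes,               # neutrophil-to-lymphocyte ratio
        "MLR": monocytes / lymphocytes,                 # monocyte-to-lymphocyte ratio
        "PLR": platelets / lymphocytes,                 # platelet-to-lymphocyte ratio
        "SII": platelets * neutrophils / lymphocytes,   # systemic immune-inflammation index
        "SIRI": neutrophils * monocytes / lymphocytes,  # systemic inflammation response index
    }


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical patient: counts in 10^9 cells/L.
indices = inflammatory_indices(
    neutrophils=4.0, lymphocytes=2.0, monocytes=0.5, platelets=250.0
)
print(indices)

# Hypothetical predictions for four patients (1 = cancer, 0 = benign).
score = brier_score([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3])
print(f"Brier score = {score:.4f}")
```

A lower Brier score indicates better-calibrated probabilities, with 0 being a perfect score, which is why the reported value of 0.06 is read alongside the AUC rather than instead of it.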
Conclusion: We developed and internally validated a GBDT prediction model, which showed the best performance among the six models compared in predicting endometrial cancer. This model can be applied in clinical practice to predict the risk of EC in patients with endometrial lesions.
Keywords: endometrial cancer, serum inflammatory markers, machine learning, prediction model