已发表论文

基于机器学习的成人代谢综合征4年风险预测:一项回顾性队列研究

 

Authors Zhang H, Chen D, Shao J, Zou P , Cui N, Tang L, Wang X, Wang D, Wu J, Ye Z

Received 9 July 2021

Accepted for publication 18 September 2021

Published 20 October 2021 Volume 2021:14 Pages 4361—4368

DOI https://doi.org/10.2147/RMHP.S328180

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Dr Jong Wha Chang

Purpose: Machine learning (ML) techniques have emerged as a promising tool to predict risk and make decisions in different medical domains. We aimed to compare the predictive performance of machine learning-based methods for 4-year risk of metabolic syndrome in adults with the previous model using logistic regression.
Patients and Methods: This was a retrospective cohort study that employed a temporal validation strategy. Three popular ML techniques were selected to build the prognostic models. These techniques were artificial neural networks, classification and regression tree, and support vector machine. The logistic regression algorithm and ML techniques used the same five predictors. Discrimination, calibration, Brier score, and decision curve analysis were compared for model performance.
Results: Discrimination was above 0.7 for all models except classification and regression tree model in internal validation, while the logistic regression model showed the highest discrimination in external validation (0.782) and the smallest discrimination differences. The logistic regression model had the best calibration performance, and ANN also showed satisfactory calibration in internal validation and external validation. For overall performance, logistic regression had the smallest Brier score differences in internal validation and external validation, and it also had the largest net benefit in external validation.
Conclusion: Overall, this study indicated that the logistic regression model performed as well as the flexible ML-based prediction models at internal validation, while the logistic regression model had the best performance at external validation. For clinical use, when the performance of the logistic regression model is similar to ML-based prediction models, the simplest and more interpretable model should be chosen.
Keywords: prognosis model, metabolic syndrome, calibration, discrimination, machine learning