Published Article

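Combining Resampling Strategies and Ensemble Machine Learning Methods to Enhance Prediction of Neonates with a Low Apgar Score After Induction of Labor in Northern Tanzania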

 

Authors Tarimo CS, Bhuyan SS, Li Q, Ren W, Mahande MJ, Wu J

Received 29 July 2021

Accepted for publication 26 August 2021

Published 7 September 2021 Volume 2021:14 Pages 3711–3720

DOI https://doi.org/10.2147/RMHP.S331077

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Dr Jong Wha Chang

Objective: The goal of this study was to identify the most efficient boosting method for predicting neonatal low Apgar scores following labor induction and to assess whether resampling strategies improve the predictive performance of the selected boosting algorithms.
Methods: A total of 7716 singleton births delivered from 2000 to 2015 were analyzed. Cesarean deliveries following labor induction, deliveries with abnormal presentation, and deliveries with missing Apgar score or delivery mode information were excluded. We examined the effect of resampling (data preprocessing) approaches on the prediction of low Apgar scores, specifically the synthetic minority oversampling technique (SMOTE), borderline-SMOTE, and random undersampling (RUS). Three boosting-based ensemble methods were tested: adaptive boosting (AdaBoost), gradient boosting (GB), and extreme gradient boosting (XGBoost). Their discriminative ability was evaluated using sensitivity, specificity, precision, area under the receiver operating characteristic curve (AUROC), F-score, positive predictive value (PPV), negative predictive value (NPV), and accuracy.
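The two basic resampling ideas named in the Methods can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the function names, the toy two-feature data, and the choice of k = 3 neighbours are all assumptions made for illustration. RUS discards majority-class rows at random until the classes are balanced, while basic SMOTE creates synthetic minority rows by interpolating between a minority row and one of its nearest minority neighbours. Borderline-SMOTE, which restricts the interpolation to minority rows lying near the class boundary, is not implemented here.

```python
import math
import random


def random_undersample(majority, minority, rng):
    """RUS: keep a random subset of majority rows equal in size to the minority."""
    kept = rng.sample(majority, len(minority))
    return kept, list(minority)


def smote(minority, n_new, k, rng):
    """Basic SMOTE: build n_new synthetic minority rows by interpolating
    between a minority row and one of its k nearest minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


rng = random.Random(0)
# Toy data (made up for illustration): 18 majority rows, 4 minority rows.
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(18)]
minority = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(4)]

# Undersampling shrinks the majority class to the minority size.
rus_major, rus_minor = random_undersample(majority, minority, rng)

# Oversampling grows the minority class to the majority size.
new_rows = smote(minority, n_new=len(majority) - len(minority), k=3, rng=rng)
oversampled_minority = minority + new_rows
```

In practice, resampling of this kind is normally applied to the training split only, so that evaluation reflects the original class distribution; whether the study followed that convention is not stated in the abstract.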
Results: The prevalence of low (< 7) Apgar scores was 9.5% (n = 733). The prediction models performed similarly in their baseline mode. Following the application of resampling techniques, borderline-SMOTE significantly improved the predictive performance of all the boosting-based ensemble methods studied in terms of sensitivity, F1-score, AUROC, and PPV.
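To see why sensitivity, F1-score, and PPV matter more than accuracy at this prevalence, consider the sketch below. It computes the threshold-based metrics from a confusion matrix and applies them to a hypothetical classifier that labels every birth as "normal Apgar". Only the counts 7716 and 733 come from the abstract; the function name and the all-negative classifier are assumptions for illustration. AUROC is omitted because it requires ranked prediction scores rather than a single confusion matrix.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Threshold-based metrics from confusion-matrix counts.
    A ratio with a zero denominator is undefined and returned as None."""
    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),  # identical to precision
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


n_births, n_low = 7716, 733  # figures reported in the abstract

# Hypothetical classifier that never predicts a low Apgar score.
baseline = confusion_metrics(tp=0, fp=0, fn=n_low, tn=n_births - n_low)
```

Under these assumptions the all-negative classifier reaches an accuracy of about 0.905 while its sensitivity and F1-score are 0 and its PPV is undefined, which is why accuracy alone says little about performance on the minority class.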
Conclusion: Policymakers, healthcare informaticians and neonatologists should consider implementing data preprocessing strategies when predicting a neonatal outcome with imbalanced data to enhance efficiency. The process may be more effective when the borderline-SMOTE technique is applied with the selected ensemble classifiers. Future research may focus on testing additional resampling techniques, performing feature engineering and variable selection, and further optimizing the hyperparameters of the ensemble learners.
Keywords: low Apgar score, labor induction, machine learning, ensemble learning, resampling methods, imbalanced data