Published paper

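Machine Learning Prediction of Non-Small Cell Lung Cancer Based on Inflammatory and Nutritional Indicators in Adults: A Cross-Sectional Study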

 

Authors Wang Q , Liang T , Li Y , Liu X

Received 11 January 2024

Accepted for publication 23 May 2024

Published 30 May 2024 Volume 2024:16 Pages 527–535

DOI https://doi.org/10.2147/CMAR.S454638

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Professor Seema Singh

Qiaoli Wang,1,* Tao Liang,2,* Yuexi Li,1,* Xiaoqin Liu1 

1Department of Health Screening Center, Deyang Peoples’ Hospital, Deyang, Sichuan, 618000, People’s Republic of China; 2Department of Gastroenterology, Deyang Peoples’ Hospital, Deyang, Sichuan, 618000, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Xiaoqin Liu, Email 66823572@qq.com

Purpose: This study aimed to evaluate the potential benefit of blood inflammatory markers in the diagnosis of non-small cell lung cancer (NSCLC) and to propose a machine-learning-based method for predicting NSCLC in asymptomatic adults.
Patients and Methods: This cross-sectional study used the medical records of 139 patients with NSCLC and physical examination data, collected from May 2022 to May 2023, of 198 healthy controls. The NSCLC cohort comprised 128 cases of adenocarcinoma, 3 cases of squamous cell carcinoma, and 8 cases of other NSCLC subtypes. The correlation between NSCLC and inflammatory and nutritional markers, such as monocytes, neutrophils, LMR, NLR, PLR, and PHR, was examined. Features were selected using a Python feature selection library and analyzed with five machine learning algorithms. The models' ability to predict an NSCLC diagnosis was assessed by precision, accuracy, recall, F1 score, and area under the curve (AUC).
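As an illustration of the ratio markers named above, the following minimal sketch derives LMR, NLR, and PLR from absolute blood counts. It assumes the commonly used definitions (lymphocyte-to-monocyte, neutrophil-to-lymphocyte, and platelet-to-lymphocyte ratio), which the abstract itself does not spell out. PHR is left out because its expansion is not given here. The function name and the input values are hypothetical and are not taken from the study data.

```python
def blood_ratios(lymphocytes, monocytes, neutrophils, platelets):
    """Return LMR, NLR and PLR from absolute counts (all in 10^9 cells/L).

    Assumed definitions (not stated in the abstract):
      LMR = lymphocytes / monocytes
      NLR = neutrophils / lymphocytes
      PLR = platelets / lymphocytes
    """
    if lymphocytes <= 0 or monocytes <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    return {
        "LMR": lymphocytes / monocytes,
        "NLR": neutrophils / lymphocytes,
        "PLR": platelets / lymphocytes,
    }


# Hypothetical counts for one person, for illustration only.
example = blood_ratios(lymphocytes=2.0, monocytes=0.5,
                       neutrophils=4.0, platelets=250.0)
print(example)
```

Ratios computed this way could then be passed, alongside the raw counts, to whichever feature selection step and classifiers are used.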
Results: The 14 most important features were PDW, age, TP, RBC, HGB, LYM, LYM%, RDW, PLR, LMR, PHR, MONO, MONO%, and gender. Among the five machine learning algorithms, naive Bayes (NB) showed the best overall performance in predicting adult NSCLC, with an accuracy of 0.87, a macro-average F1 score of 0.85, a weighted-average F1 score of 0.87, and an AUC of 0.84.
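The macro-average and weighted-average F1 scores reported above differ only in how the per-class F1 values are combined: the macro average weights both classes equally, while the weighted average weights each class by its number of true samples. The sketch below shows that calculation for a binary problem. The confusion-matrix counts are hypothetical and are not the study's results, and the function names are illustrative.

```python
def f1_from_counts(tp, fp, fn):
    """F1 score for one class from its true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def summarize(tp, fp, fn, tn):
    """Accuracy, macro F1 and weighted F1 for a binary task (positive class = NSCLC)."""
    f1_pos = f1_from_counts(tp, fp, fn)
    # For the negative class the roles of the error types swap:
    # a missed NSCLC case (fn) is a false positive for the control class.
    f1_neg = f1_from_counts(tn, fn, fp)
    n_pos = tp + fn  # true NSCLC samples
    n_neg = tn + fp  # true control samples
    total = n_pos + n_neg
    return {
        "accuracy": (tp + tn) / total,
        "f1_pos": f1_pos,
        "f1_neg": f1_neg,
        "macro_f1": (f1_pos + f1_neg) / 2,
        "weighted_f1": (f1_pos * n_pos + f1_neg * n_neg) / total,
    }


# Hypothetical test-set counts, for illustration only.
scores = summarize(tp=20, fp=5, fn=10, tn=65)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

When the larger class is also the one the model classifies better, as in this hypothetical split, the weighted average comes out above the macro average.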
Conclusion: In feature ranking, platelet distribution width was the most important feature, and the NB algorithm performed best in predicting adult NSCLC diagnosis.

Keywords: machine learning, non-small cell lung cancer, inflammatory indicators, nutritional indicators, ratio, diagnosis