Published Article

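Development of a Diagnostic Model for Focal Segmental Glomerulosclerosis: Integrating Machine Learning on Activated Pathways and Clinical Validation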

Authors Ge Y, Liu X, Shu J, Jiang X, Wu Y

Received 28 September 2024

Accepted for publication 18 February 2025

Published 26 February 2025 Volume 2025:18 Pages 1127–1142

DOI https://doi.org/10.2147/IJGM.S498407

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Dr Franco Musio

Yating Ge,1,2,* Xueqi Liu,1,3,* Jinlian Shu,1,2 Xiao Jiang,1,3 Yonggui Wu1,3 

1 The Department of Nephrology, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui, People’s Republic of China; 2 Department of Nephrology, The Second People’s Hospital of Hefei, Hefei Hospital Affiliated to Anhui Medical University, Hefei, Anhui, People’s Republic of China; 3 Center for Scientific Research of Anhui Medical University, Hefei, Anhui, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Yonggui Wu, The Department of Nephrology, The First Affiliated Hospital of Anhui Medical University, Hefei, People’s Republic of China, Email wuyonggui@medmail.com.cn

Background: Focal segmental glomerulosclerosis (FSGS) represents a major global health challenge, with its incidence rising in parallel with advances in diagnostic techniques and the growing prevalence of chronic diseases. This study seeks to enhance the diagnostic accuracy of FSGS by integrating machine learning approaches to identify activated pathways, complemented by robust clinical validation.
Methods: We analyzed gene expression data from 163 FSGS patients and 42 living donors across multiple GEO cohorts, applying the ComBat algorithm to correct batch effects so that expression profiles were comparable between cohorts. Gene set enrichment analysis (GSEA) identified key signaling pathways involved in FSGS pathogenesis. We then built a diagnostic model by combining nine machine learning algorithms into 101 algorithm combinations, achieving near-perfect AUC values across training, validation, and external cohorts. The model identified six genes as potential biomarkers for FSGS. We also explored immune cell infiltration patterns, particularly those involving natural killer (NK) cells, which revealed a complex interplay between genetics and the immune response in FSGS patients. Immunohistochemical analysis validated the expression of the key markers CD99 and OAZ2 and confirmed the association between NK cells and FSGS.
Results: The glmBoost+Ridge model exhibited exceptional diagnostic accuracy, achieving an AUC of 0.998 using just six genes: BANF1, TUSC2, SMAD3, TGFB1, CD99, and OAZ2. The prediction score was calculated as follows: score = (0.3997×BANF1) + (0.5543×TUSC2) + (0.5279×SMAD3) + (0.4118×TGFB1) + (0.8665×CD99) + (0.5996×OAZ2). Immunohistochemical analysis confirmed significantly elevated expression levels of CD99 and OAZ2 in the glomeruli and tubulointerstitial tissues of FSGS patients compared with those of controls.
Conclusion: This study demonstrates a highly accurate machine learning model for FSGS diagnosis. Immunohistochemical validation confirmed elevated expression of CD99 and OAZ2, offering valuable insights into FSGS pathogenesis and potential biomarkers for clinical application.

Keywords: focal segmental glomerulosclerosis (FSGS), machine learning diagnostic model, gene set enrichment analysis (GSEA), immune cell infiltration
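
The prediction score reported in the Results is a weighted sum of six gene expression values. The following is a minimal Python sketch of that arithmetic only. The coefficients are copied from the abstract. The function name `prediction_score`, the dictionary layout, and the example input are illustrative assumptions. The abstract does not state how expression values are normalized or which score cutoff separates FSGS from controls, so the sketch covers neither.

```python
# Coefficients as reported in the Results section of the abstract.
COEFFICIENTS = {
    "BANF1": 0.3997,
    "TUSC2": 0.5543,
    "SMAD3": 0.5279,
    "TGFB1": 0.4118,
    "CD99": 0.8665,
    "OAZ2": 0.5996,
}


def prediction_score(expression):
    """Return the weighted sum of the six gene expression values.

    `expression` maps gene symbol -> expression value. The preprocessing
    applied to those values is not specified in the abstract, so it is
    assumed to have been done upstream.
    """
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError("missing expression values for: " + ", ".join(missing))
    return sum(weight * expression[gene] for gene, weight in COEFFICIENTS.items())


# Hypothetical input (not patient data): every gene set to 1.0, so the score
# reduces to the sum of the six coefficients.
example = {gene: 1.0 for gene in COEFFICIENTS}
print(round(prediction_score(example), 4))
```

Because all six coefficients are positive, a higher expression value for any of the six genes raises the score. Turning the score into a diagnosis would additionally require a threshold, which would have to come from the full paper.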