Abstract
Angiogenesis-related genes (ARGs) associated with the occurrence, development and prognosis of cervical cancer (CC) were screened using bioinformatics methods, and a prognostic risk model based on them was constructed and validated. First, the expression profiles and clinical characteristics of CC patients were retrieved from the TCGA database, and differentially expressed ARGs were extracted. Next, Lasso Cox regression was used to select prognostic ARGs and build the prognostic model. External validation was then performed with the GSE52903 and GSE44001 datasets. Finally, gene set enrichment analysis (GSEA) was used to explore the mechanisms underlying CC prognosis. Fifteen prognostic ARGs were identified: EFNA1, ITGA5, EPHB4, NRP1, CDH5, PLAU, BMP6, DLL4, JUN, CA9, MMP1, BAIAP2L1, SERPINF1, F2RL1 and FGFR2. Kaplan-Meier survival curves for the GSE52903 and GSE44001 datasets showed that overall survival (OS) (P=0.005) and disease-free survival (DFS) (P<0.001) were significantly poorer in the high-risk group than in the low-risk group. Receiver operating characteristic (ROC) curve analysis showed that, in the GSE52903 validation set, the area under the curve (AUC) values at 1, 3 and 5 years were 0.84, 0.77 and 0.73, respectively, with a C-index of 0.72; in the GSE44001 validation set, the AUC values at 1, 3 and 5 years were 0.71, 0.72 and 0.70, respectively, with a C-index of 0.70. These values indicate that the model has strong predictive performance for the prognosis of CC patients. The pathways enriched in GSEA mainly involved DNA replication, extracellular matrix (ECM)-receptor interaction, and complement and coagulation cascades, processes that are closely related to the occurrence and development of CC. These results suggest that the 15 key ARGs may be potential prognostic biomarkers for CC.
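As a rough illustration of how a risk model of this kind is typically applied, the sketch below computes a linear risk score from gene expression, splits patients into high- and low-risk groups at the median score, and evaluates Harrell's C-index by direct pair counting. The coefficients, expression values, follow-up times and the median cut-off are all hypothetical assumptions chosen for the example; they are not the fitted values or necessarily the grouping rule used in the study.

```python
from statistics import median

# Hypothetical coefficients for three of the 15 ARGs (illustrative only,
# NOT the coefficients fitted in the paper).
COEFS = {"ITGA5": 0.30, "CA9": 0.15, "FGFR2": -0.20}


def risk_score(expr):
    """Linear predictor: sum of coefficient * expression over the model genes."""
    return sum(COEFS[g] * expr[g] for g in COEFS)


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def harrell_c_index(times, events, scores):
    """Harrell's C-index by pair counting.

    A pair (i, j) is comparable when patient i had an event and a strictly
    shorter follow-up time than patient j. It is concordant when patient i
    also has the higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Toy cohort (made-up expression values, months of follow-up, event flags).
patients = [
    {"ITGA5": 5.0, "CA9": 4.0, "FGFR2": 1.0},
    {"ITGA5": 4.0, "CA9": 2.0, "FGFR2": 2.0},
    {"ITGA5": 2.0, "CA9": 1.0, "FGFR2": 3.0},
    {"ITGA5": 1.0, "CA9": 1.0, "FGFR2": 4.0},
]
times = [12, 30, 48, 20]
events = [1, 1, 0, 1]  # 1 = event observed, 0 = censored

scores = [risk_score(p) for p in patients]
groups = split_by_median(scores)
c_index = harrell_c_index(times, events, scores)

print(groups)
print(round(c_index, 3))
```

In this toy cohort the fourth patient has a low score but an early event, so two of the six comparable pairs are discordant and the C-index falls below 1. A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.72 and 0.70 should be read.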
Key words
cervical cancer (CC) / bioinformatics / angiogenesis-related gene (ARG) / prognostic model
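XIA Nana, YANG Jingrui, KANG Min, YU Minmin
Screening of angiogenesis-related genes in cervical cancer and construction of a prognostic model based on bioinformatics [J].
Life Science Research, 2024, 28(2): 179-188 DOI: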