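Qiu Chun, Ma Qiaorong, Zhao Manman, Su Qiang, Zhong Meizuo. Screening of glioma-related genes and construction of a prediction model based on the CFS-mRMR feature selection method and the Adaboost algorithm[J].,2019,19(1):26-30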
Screening of Glioma-Related Genes and Construction of a Prediction Model Based on the CFS-mRMR Feature Selection Method and the Adaboost Algorithm
Study of Classification of Gliomas Prediction Based on Machine Learning Method
Received: 2018-05-14  Revised: 2018-06-12
DOI: 10.13241/j.cnki.pmb.2019.01.006
Chinese keywords: glioma; feature selection; differentially expressed genes; Adaboost
English keywords: gliomas; feature selection; Adaboost; differentially expressed genes
Funding:
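Abstract (translated from Chinese):
Objective: To identify the group of genes related to the mechanism by which glioma develops and, on that basis, to build a model that predicts the occurrence of glioma. Methods: Glioma microarray data were collected from GEO. Differentially expressed genes were screened with the Correlation-based Feature Subset (CFS) and Minimum Redundancy Maximum Relevance (mRMR) feature selection methods, and the functions of these genes were analyzed. A glioma prediction model was then built with the Adaboost algorithm, and its predictive performance was evaluated. Results: Feature selection yielded 19 genes related to glioma. Using these 19 genes as the feature subset, a glioma prediction model was built with the AdaBoost algorithm; on validation, its prediction accuracy reached 95.59%. GO and KEGG analysis of the 19 differentially expressed genes indicated that they play a role in the occurrence and development of tumors. Conclusion: The CFS-mRMR feature selection method can effectively identify genes related to glioma. The 19 selected differentially expressed genes are biologically meaningful, and the prediction model built on them can effectively predict the occurrence of glioma.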
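The abstract names CFS but does not give its scoring rule. As a hedged illustration only, the sketch below implements the commonly cited CFS merit, k·r_cf / sqrt(k + k(k-1)·r_ff), with a greedy forward search. It assumes absolute Pearson correlation for both feature-class and feature-feature association, runs on synthetic data, and uses hypothetical function names. It is not the authors' pipeline; the mRMR step and the way the two methods were combined are not stated in the abstract and are not reproduced here.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def cfs_merit(subset, relevance, redundancy):
    """CFS merit: k * mean(r_cf) / sqrt(k + k*(k-1) * mean(r_ff))."""
    k = len(subset)
    r_cf = sum(relevance[j] for j in subset) / k
    if k == 1:
        return r_cf  # no feature pairs yet, so the merit is just the relevance
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = sum(redundancy[i][j] for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, labels, max_features):
    """Greedy forward search: add the feature that raises the merit most;
    stop when no candidate improves it or max_features is reached."""
    p = len(columns)
    relevance = [abs(pearson(col, labels)) for col in columns]
    redundancy = [[abs(pearson(columns[i], columns[j])) for j in range(p)]
                  for i in range(p)]
    selected, best = [], 0.0
    while len(selected) < max_features:
        candidates = [j for j in range(p) if j not in selected]
        if not candidates:
            break
        merit, j = max((cfs_merit(selected + [c], relevance, redundancy), c)
                       for c in candidates)
        if merit <= best:
            break
        selected.append(j)
        best = merit
    return selected, best


# Synthetic stand-in for expression data (NOT the GEO data from the paper).
rng = random.Random(0)
labels = [i % 2 for i in range(400)]
f0 = [y + rng.gauss(0, 0.5) for y in labels]   # informative
f1 = [x + rng.gauss(0, 0.05) for x in f0]      # near-duplicate of f0 (redundant)
f2 = [y + rng.gauss(0, 0.5) for y in labels]   # informative, independent noise
f3 = [rng.gauss(0, 1) for _ in labels]         # pure noise
selected, merit = cfs_forward_select([f0, f1, f2, f3], labels, max_features=3)
print("selected feature indices:", selected, "merit:", round(merit, 3))
```

The denominator grows with the average correlation among selected features, so a redundant near-copy adds little merit, while an uninformative feature lowers the average relevance in the numerator.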
English abstract:
Objective: This study aims to identify genes related to the mechanisms of glioma occurrence and to build a prediction model for glioma. Methods: Data were collected from the GEO database, and a prediction model for gliomas was built by combining mRMR and correlation-based feature subset (CfsSubset) feature selection with the Adaboost method. Results: After feature selection, 19 genes related to the mechanisms of glioma occurrence were obtained. Based on these 19 genes, an Adaboost prediction model was built that could be applied to predict the occurrence of glioma. The model yielded an accuracy of 95.59% in 10-fold cross-validation. EGFR and MAD2L1 were found to be related to gliomas based on GO and KEGG analysis. Conclusion: CFS-mRMR is an efficient feature selection method for finding key genes correlated with gliomas, and it can also be used to build prediction models.
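The evaluation step (Adaboost scored by 10-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions: discrete AdaBoost with one-feature threshold stumps as the weak learner, unstratified folds, and synthetic data. The abstract does not state the weak learner, the number of boosting rounds, or the software used, so every name and parameter here is hypothetical, and the output is unrelated to the reported 95.59%.

```python
import math
import random


def stump_predict(row, j, t, pol):
    """Threshold stump: predict pol when feature j >= t, otherwise -pol."""
    return pol if row[j] >= t else -pol


def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = (float("inf"), 0, 0.0, 1)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, pol) != yi)
                if err < best[0]:
                    best = (err, j, t, pol)
    return best


def adaboost_fit(X, y, rounds):
    """Discrete AdaBoost; labels must be -1 or +1. Returns [(alpha, j, t, pol), ...]."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, pol = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, pol))
        # Misclassified samples gain weight, correct ones lose it; then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(row, j, t, pol) for a, j, t, pol in model)
    return 1 if score >= 0 else -1


def cross_val_accuracy(X, y, k=10, rounds=8, seed=0):
    """Shuffle once, split into k folds, train on k-1 folds, test on the held-out fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = adaboost_fit([X[i] for i in train], [y[i] for i in train], rounds)
        correct += sum(adaboost_predict(model, X[i]) == y[i] for i in fold)
    return correct / len(X)


# Synthetic two-class data (NOT the paper's data): feature 0 is informative,
# features 1 and 2 are noise.
rng = random.Random(0)
y = [1 if i % 2 == 0 else -1 for i in range(80)]
X = [[3.0 * yi + rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for yi in y]
acc = cross_val_accuracy(X, y, k=10, rounds=8)
print(f"10-fold CV accuracy: {acc:.3f}")
```

Every sample is held out exactly once, so the reported figure is the fraction of all samples classified correctly by a model that never saw them during training. Note that for an unbiased estimate, feature selection would also need to be repeated inside each training fold rather than done once on the full data.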