Article Summary
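QIU Chun, MA Qiao-rong, ZHAO Man-man, SU Qiang, ZHONG Mei-zuo. Screening of Glioma-Related Genes and Establishment of a Prediction Model Based on the CFS-mRMR Feature Selection Method and the Adaboost Algorithm [J]. Progress in Modern Biomedicine (English edition), 2019, 19(1): 26-30.
Screening of Glioma-Related Genes and Establishment of a Prediction Model Based on the CFS-mRMR Feature Selection Method and the Adaboost Algorithm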
Study of Classification of Gliomas Prediction Based on Machine Learning Method
Received: May 14, 2018  Revised: June 12, 2018
DOI:10.13241/j.cnki.pmb.2019.01.006
Chinese keywords: Glioma  Feature selection  Differentially expressed genes  Adaboost
English keywords: Gliomas  Feature selection  Adaboost  Differentially expressed genes
Funding:
Author Name | Affiliation | E-mail
QIU Chun | 1 XiangYa Hospital of South University, Changsha, Hunan, 410008, China; 2 Hainan Provincial People's Hospital, Haikou, Hainan, 570311, China | 139762421217@139.com
MA Qiao-rong | Clinical Laboratory, Affiliated Minzu Hospital of Guangxi Medical University, Nanning, Guangxi, 530001, China |
ZHAO Man-man | Shanghai Key Laboratory of Bio-Energy Crops, College of Life Science, Shanghai University, Shanghai, 200444, China |
SU Qiang | Shanghai Key Laboratory of Bio-Energy Crops, College of Life Science, Shanghai University, Shanghai, 200445, China |
ZHONG Mei-zuo | XiangYa Hospital of South University, Changsha, Hunan, 410008, China |
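Chinese abstract:
      Abstract Objective: To identify the group of genes related to the mechanism of glioma pathogenesis and, on that basis, to build a model for predicting the occurrence of glioma. Methods: Glioma microarray data were collected from GEO. Differentially expressed genes were selected using the Correlation-based Feature Subset (CFS) and Minimum Redundancy Maximum Relevance (mRMR) feature selection methods, and the functions of these genes were analyzed. A glioma prediction model was then built with the Adaboost algorithm, and its predictive ability was evaluated. Results: Feature selection yielded 19 genes related to glioma. Using these 19 genes as the feature subset, a glioma prediction model was built with the AdaBoost algorithm; on validation, the prediction accuracy of the model reached 95.59%. GO and KEGG analysis of the 19 differentially expressed genes indicated that these genes play a role in the occurrence and development of tumors. Conclusion: The CFS-mRMR feature selection method can effectively identify genes related to glioma. The 19 selected differentially expressed genes are biologically meaningful, and the prediction model built on them can effectively predict the occurrence of glioma.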
English abstract:
       ABSTRACT Objective: This study aims to identify genes related to the mechanisms of glioma occurrence and to build a prediction model for glioma. Methods: Data were collected from the GEO database, and a prediction model for gliomas was studied using mRMR and correlation-based feature subset (CfsSubset) selection combined with the Adaboost method. Results: After feature selection, 19 genes related to the mechanisms of glioma occurrence were obtained. Based on these 19 genes, a prediction model was built with Adaboost, which could be applied to predict the occurrence of glioma. The prediction model yielded an accuracy of 95.59% in the 10-fold cross-validation test. EGFR and MAD2L1 were found to be related to gliomas based on GO and KEGG analysis. Conclusion: CFS-mRMR is an efficient feature selection method for finding key genes correlated with gliomas, and it can also be employed to build a prediction model.
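The workflow summarized in the abstract (feature selection, then an Adaboost classifier scored by 10-fold cross validation) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the data are synthetic (`make_classification` stands in for a GEO expression matrix), the CFS step is omitted, and the hypothetical helper `mrmr_select` is a simplified mRMR-style greedy selector that uses absolute Pearson correlation as the redundancy term instead of mutual information. The number of estimators and all other parameters are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import StratifiedKFold


def mrmr_select(X, y, k, random_state=0):
    """Greedy mRMR-style selection (hypothetical helper, simplified).

    Relevance: mutual information between each feature and the label.
    Redundancy: mean absolute Pearson correlation with the features
    already selected (an assumption; classic mRMR uses mutual information).
    """
    relevance = mutual_info_classif(X, y, random_state=random_state)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        redundancy = corr[:, selected].mean(axis=1)
        score = relevance - redundancy
        score[selected] = -np.inf  # never pick the same feature twice
        selected.append(int(np.argmax(score)))
    return selected


# Synthetic stand-in for an expression matrix (samples x genes).
X, y = make_classification(n_samples=200, n_features=100, n_informative=10,
                           n_redundant=5, random_state=0)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
accs = []
for train_idx, test_idx in cv.split(X, y):
    # Select features on the training fold only, so the held-out fold
    # does not influence which features are chosen.
    genes = mrmr_select(X[train_idx], y[train_idx], k=19)
    clf = AdaBoostClassifier(n_estimators=50, random_state=0)
    clf.fit(X[train_idx][:, genes], y[train_idx])
    accs.append(clf.score(X[test_idx][:, genes], y[test_idx]))

print(f"mean 10-fold accuracy: {np.mean(accs):.3f}")
```

Selecting features inside each fold is a design choice of this sketch that avoids an optimistic bias in the accuracy estimate. The abstract does not state whether the reported 95.59% was obtained this way.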