

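1. School of Artificial Intelligence, Jilin University, Changchun, Jilin 130000
2. School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401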
Received: 27 March 2021
Revised: 29 April 2022
Published: 25 June 2023
GAO Hui-min, WANG Yun-he, BIAN Chuang, et al. Research on Feature Selection Based on Hybrid Evolutionary Algorithm[J]. ACTA ELECTRONICA SINICA, 2023, 51(06): 1619-1636. DOI: 10.12263/DZXB.20210399.
Feature selection (FS) is an effective data preprocessing method that mitigates the curse of dimensionality caused by data redundancy by selecting, from high-dimensional data, a set of features with high relevance and low redundancy. Many computational methods have been applied to the FS problem, among which feature selection models based on the Teaching and Learning-Based Optimization algorithm (TLBO) have received increasing attention because of their efficient global search capability. However, as data sizes keep growing, the limitations of these algorithms, such as model instability, low model accuracy, and poor local search ability, have gradually stalled research on them. To address these problems, this paper proposes a hybrid evolutionary wrapper algorithm, the Teaching and Learning-Based Optimization-Local Search algorithm (TLBOLS), which integrates TLBO with local search (LS) methods. First, because the conventional TLBO cannot be applied directly to feature selection, the algorithm converts the real-valued encoding to a binary encoding in the initialization phase. Then, to preserve population diversity, it introduces a worst-individual restart mechanism in the teaching phase and uses different teaching factor (TF) values for the two roles of learner and teacher as the class evolves, yielding a binary teaching-learning feature selection algorithm, the Binary Teaching and Learning-Based Optimization-Local Search algorithm (BTLBOLS). Subsequently, a local search method that combines multiple operations with variable neighborhood search is proposed to gradually strengthen the perturbation and improve the quality of individuals across the whole population. To optimize the feature selection results, BTLBOLS uses a comprehensive evaluation metric as the objective function that guides the overall evolutionary process. Forty-five high-dimensional cancer gene expression datasets are used for testing, and BTLBOLS is compared with ten feature selection algorithms. The experimental results show that BTLBOLS has certain advantages over the other algorithms in both classification accuracy and number of selected features, and that it effectively improves classification performance.
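The abstract names three mechanisms (binary encoding, a worst-individual restart, and a comprehensive objective) without giving their formulas. The Python sketch below illustrates the general ideas only and is not the authors' method. It assumes a sigmoid transfer function for binarization, a weighted sum of accuracy and feature reduction as the objective, and uniform re-initialization for the restart. The function names, the weight `alpha`, and the bounds `low` and `high` are all hypothetical.

```python
import math
import random


def binarize(position, rng):
    """Map a real-valued position to a 0/1 feature mask (assumed sigmoid transfer)."""
    mask = []
    for x in position:
        x = max(-50.0, min(50.0, x))       # clamp to avoid overflow in exp
        prob = 1.0 / (1.0 + math.exp(-x))  # probability that this feature is selected
        mask.append(1 if rng.random() < prob else 0)
    return mask


def fitness(accuracy, mask, alpha=0.9):
    """Assumed composite score: rewards high accuracy and few selected features."""
    selected_ratio = sum(mask) / len(mask)
    return alpha * accuracy + (1.0 - alpha) * (1.0 - selected_ratio)


def restart_worst(population, scores, rng, low=-4.0, high=4.0):
    """Replace the lowest-scoring individual with a random position; return its index."""
    worst = min(range(len(scores)), key=scores.__getitem__)
    population[worst] = [rng.uniform(low, high) for _ in population[worst]]
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    population = [[rng.uniform(-4.0, 4.0) for _ in range(8)] for _ in range(4)]
    masks = [binarize(p, rng) for p in population]
    # Placeholder accuracies; a real wrapper would train a classifier per mask.
    accuracies = [0.80, 0.75, 0.90, 0.60]
    scores = [fitness(a, m) for a, m in zip(accuracies, masks)]
    print("restarted individual:", restart_worst(population, scores, rng))
```

In a real wrapper method, the `accuracy` argument would come from training and validating a classifier on the features selected by `mask`. The fixed accuracies in the usage section are placeholders, not results.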