Acta Electronica Sinica ›› 2014, Vol. 42 ›› Issue (7): 1320-1326. DOI: 10.3969/j.issn.0372-2112.2014.07.012

• Academic Papers •

基于多目标微粒群优化的异质数据特征选择

巩敦卫, 胡滢, 张勇   

  1. School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116
  • Received: 2013-04-15 Revised: 2013-05-17 Published: 2014-07-25
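    • About the authors:
    • GONG Dun-wei, male, born in March 1970 in Tongshan, Jiangsu. Professor and doctoral supervisor at China University of Mining and Technology; senior member of the Chinese Institute of Electronics and of the China Coal Society. He received the B.S. degree from China University of Mining and Technology in 1992, the M.Eng. degree from Beihang University in 1995, and the Ph.D. degree in engineering from China University of Mining and Technology in 1999. Research interests: search-based software engineering, intelligent optimization and control. E-mail: dwgong@vip.163.com
    • HU Ying, female, born in April 1988 in Huangshan, Anhui. She received the B.Eng. degree from Anhui University of Science and Technology in 2011 and is currently a master's student at China University of Mining and Technology. Research interest: feature selection based on particle swarm optimization. E-mail: hy200712008@126.com
    • ZHANG Yong, male, born in September 1979 in Laiwu, Shandong. Associate professor at China University of Mining and Technology. He received the B.Eng. degree from Liaocheng University in 2003, and the M.Eng. and Ph.D. degrees in engineering from China University of Mining and Technology in 2006 and 2009, respectively. Research interests: particle swarm optimization and its applications. E-mail: yongzh401@126.com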
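    • Funding:
    • National Natural Science Foundation of China (No.61005089); Natural Science Foundation of Jiangsu Province (No.BK2011215); Specialized Research Fund for the Doctoral Program of Higher Education (No.20100095120016); China Postdoctoral Science Foundation (No.2012M521142)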

Feature Selection of Heterogeneous Data Based on Multi-Objective Particle Swarm Optimization

GONG Dun-wei, HU Ying, ZHANG Yong   

  1. School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • Received: 2013-04-15 Revised: 2013-05-17 Online: 2014-07-25 Published: 2014-07-25
    • Supported by:
    • National Natural Science Foundation of China (No.61005089); Natural Science Foundation of Jiangsu Province, China (No.BK2011215); Research Fund for the Doctoral Program of Higher Education of China (No.20100095120016); Postdoctoral Science Foundation of China (No.2012M521142)

Abstract:

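Because of the influence of the environment and the precision of measuring instruments, different features of sampled data differ in quality. Feature selection for such heterogeneous data must consider both the accuracy and the reliability of the classifier determined by a feature subset, which makes the selection harder. This paper studies the feature selection problem for heterogeneous data and proposes a feature selection method based on multi-objective particle swarm optimization. The method first takes the probability of selecting each feature as the decision variable, turning the feature selection problem with discrete variables into a multi-objective optimization problem with continuous variables. Then, when particle swarm optimization is used to solve the problem, the global guide of each particle is generated by Gaussian sampling in order to improve the distribution of the Pareto solution set. Finally, the particles to be perturbed are determined from the speed at which the elements of the archive are updated, which helps the swarm escape local optima. The proposed method is applied to classification problems on several typical data sets, and the experimental results show that it is effective.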
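
The first step of the method, treating per-feature selection probabilities as continuous decision variables, can be illustrated with a minimal Python sketch. The abstract does not say how a probability vector is turned into a feature subset or how candidate subsets are compared, so the Bernoulli sampling in `decode`, its non-empty fallback, and the two-objective `dominates` check (accuracy and reliability, both assumed to be maximized) are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def decode(prob, rng):
    """Sample a boolean feature mask: feature j is kept with probability prob[j].

    Assumption: independent Bernoulli draws per feature; the paper's actual
    decoding rule is not given in the abstract.
    """
    prob = np.clip(np.asarray(prob, dtype=float), 0.0, 1.0)
    mask = rng.random(prob.shape) < prob
    if not mask.any():
        # Assumed fallback: keep the most probable feature so the subset is non-empty.
        mask[np.argmax(prob)] = True
    return mask


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives maximized)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return bool(np.all(a >= b) and np.any(a > b))


rng = np.random.default_rng(0)
particle = np.array([0.9, 0.1, 0.5, 0.0, 1.0])  # hypothetical particle position
mask = decode(particle, rng)                    # boolean mask over the 5 features
```

A position component of 1.0 always selects the feature and a component of 0.0 never does, so the continuous search space contains every discrete subset as a corner point.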

关键词: 特征选择, 异质数据, 多目标优化, 微粒群优化, 高斯采样

Abstract:

Owing to the influence of the environment and the precision of measuring equipment, different features of sampled data have different quality. Feature selection for this kind of heterogeneous data must simultaneously consider the accuracy and the reliability of the classifier determined by a feature subset, which makes selecting features more difficult. This paper focuses on feature selection for heterogeneous data and presents a feature selection method based on multi-objective particle swarm optimization. In this method, the discrete feature selection problem is first converted to a multi-objective optimization problem with continuous variables by regarding the probability of selecting a feature as the decision variable. When particle swarm optimization (PSO) is employed to solve the converted problem, the global guide of each particle is generated by Gaussian sampling so as to improve the distribution of the Pareto solutions. In addition, the particles to be perturbed are determined according to the speed at which elements of the archive are updated, to help the swarm escape local optima. The proposed method is applied to the classification of several benchmark data sets, and the experimental results demonstrate its effectiveness.
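
The idea of drawing a global guide by Gaussian sampling can be sketched in Python as follows. The abstract gives no formulas, so everything here is an assumption made for illustration: the guide is sampled around a randomly chosen archive member with a hypothetical spread `sigma`, positions are probabilities clipped to [0, 1], and `pso_step` is the textbook PSO update with commonly used coefficient values, not necessarily the variant used in the paper.

```python
import numpy as np


def sample_global_guide(archive, sigma, rng):
    """Draw a global guide near a randomly chosen archive member.

    Assumption: a Gaussian centered on one non-dominated solution; the paper's
    actual sampling scheme is not described in the abstract.
    """
    archive = np.asarray(archive, dtype=float)
    center = archive[rng.integers(len(archive))]
    guide = rng.normal(loc=center, scale=sigma)
    return np.clip(guide, 0.0, 1.0)  # keep components valid probabilities


def pso_step(pos, vel, pbest, gbest, rng, w=0.7, c1=1.5, c2=1.5):
    """One textbook PSO update; w, c1, c2 are illustrative values."""
    r1 = rng.random(pos.shape)
    r2 = rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0.0, 1.0)
    return pos, vel


rng = np.random.default_rng(0)
archive = np.array([[0.9, 0.1, 0.8], [0.2, 0.7, 0.6]])  # hypothetical archive
pos = np.array([0.5, 0.5, 0.5])
vel = np.zeros(3)
guide = sample_global_guide(archive, sigma=0.05, rng=rng)
pos, vel = pso_step(pos, vel, pbest=pos.copy(), gbest=guide, rng=rng)
```

Sampling around archive members instead of reusing them directly lets different particles follow slightly different guides, which is one plausible way to spread the swarm along the Pareto front.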

Key words: feature selection, heterogeneous data, multi-objective optimization, particle swarm optimization, Gaussian sampling

CLC number: