Acta Electronica Sinica, 2022, Vol. 50, Issue (7): 1643-1652. DOI: 10.12263/DZXB.20210415

• Research Paper •

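Semi-Supervised Feature Selection with Adaptive Graph Learning

JIANG Bing-bing1, HE Wen-da1, WU Xing-yu2, XIANG Jun-hao1, HONG Li-bin1, SHENG Wei-guo1

  1. School of Information Science and Technology, Hangzhou Normal University, Hangzhou, Zhejiang 311121, China
  2. School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027, China
  • Received: 2021-03-30 Revised: 2021-11-25 Online: 2022-07-25 Published: 2022-07-30
  • About the authors: JIANG Bing-bing (江兵兵), male, born in 1991 in Yingshang, Anhui. He received his Ph.D. in engineering, majoring in computer application technology, from the University of Science and Technology of China in 2019. He is currently a lecturer and master's supervisor at the School of Information Science and Technology, Hangzhou Normal University. His research interests include semi-supervised learning, feature selection, multi-view learning, and Bayesian learning. E-mail: jiangbb@hznu.edu.cn
    HE Wen-da (何文达), male, born in 1996 in Haiyan, Zhejiang. He is a master's student at the School of Information Science and Technology, Hangzhou Normal University. His research interests include semi-supervised learning and feature selection. E-mail: wendahe1996@stu.hznu.edu.cn
    SHENG Wei-guo (盛伟国, corresponding author), male, born in 1977 in Rui'an, Zhejiang. He received his M.Sc. in information technology from the University of Nottingham, UK, in 2002, and his Ph.D. in computer science from Brunel University, UK, in 2005. He is currently a professor and doctoral supervisor at the School of Information Science and Technology, Hangzhou Normal University. His research interests include the theory and design of intelligent algorithms and their applications in data mining, pattern recognition, and information security. E-mail: weiguouk@hotmail.com
  • Funding:
    National Natural Science Foundation of China (62006065); Hangzhou Normal University Scientific Research Start-up Project (20204003)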

Abstract:

As feature dimensionality grows, selecting a relevant feature subset from a few labeled and many unlabeled high-dimensional samples has become a central problem in feature selection. Existing semi-supervised feature selection algorithms ignore the interaction between feature selection and local structure learning, which makes it difficult to capture the distribution structure of the samples. To address this problem, this paper proposes a semi-supervised feature selection algorithm with adaptive graph learning (Semi-supervised Feature Selection with Adaptive Graph learning, SFSAG). First, label propagation is used to couple sparse projection learning on the original feature space with the construction of the neighbor graph, so that relevant features are selected while the local structure of the samples is learned. Second, a reliable neighbor graph is constructed adaptively from the similarity of samples in the projected feature space, which reduces the interference of noisy features and helps select a more discriminative feature subset. Experiments on multiple datasets verify the effectiveness of SFSAG and its superiority over existing semi-supervised feature selection algorithms.

Key words: feature selection, adaptive graph learning, semi-supervised learning, label propagation, L2,1 sparse regularization
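
The abstract names three ingredients: label propagation, an L2,1-regularized sparse projection, and a neighbor graph rebuilt in the projected space. The Python sketch below shows one way these pieces could fit together in a simple alternating loop. It is not the SFSAG objective or the authors' optimization procedure, which this page does not give. All function names, the Gaussian edge weights, the median bandwidth, the reweighted least-squares update, and the parameters `k`, `lam`, `mu_labeled`, and `mu_unlabeled` are assumptions made for illustration only.

```python
import numpy as np


def l21_norm(W):
    """L2,1 norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def rank_features(W):
    """Rank features by the row norms of the projection W, largest first."""
    return np.argsort(-np.linalg.norm(W, axis=1))


def knn_graph(Z, k):
    """Symmetric k-nearest-neighbor graph on the rows of Z.

    Gaussian weights with a median bandwidth are an assumption of this sketch.
    """
    n = Z.shape[0]
    d2 = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor
    sigma2 = np.median(d2[np.isfinite(d2)]) + 1e-12
    idx = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    S = np.zeros((n, n))
    S[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    return (S + S.T) / 2.0


def propagate_labels(S, Y, labeled, mu_labeled=1e3, mu_unlabeled=1e-3):
    """Soft labels F = (L + U)^{-1} U Y, with L the graph Laplacian of S."""
    L = np.diag(S.sum(axis=1)) - S
    u = np.where(labeled, mu_labeled, mu_unlabeled)
    return np.linalg.solve(L + np.diag(u), u[:, None] * Y)


def select_features_sketch(X, Y, labeled, k=7, lam=1.0, n_iter=5):
    """Alternate graph construction, label propagation, and a sparse projection."""
    Xc = X - X.mean(axis=0)
    D = np.eye(Xc.shape[1])  # reweighting matrix for the L2,1 penalty
    Z = Xc                   # the first graph uses the original features
    for _ in range(n_iter):
        S = knn_graph(Z, k)
        F = propagate_labels(S, Y, labeled)
        Fc = F - F.mean(axis=0)
        # Reweighted ridge step for min ||Xc W - Fc||_F^2 + lam * ||W||_{2,1}.
        W = np.linalg.solve(Xc.T @ Xc + lam * D, Xc.T @ Fc)
        D = np.diag(1.0 / (2.0 * np.linalg.norm(W, axis=1) + 1e-8))
        Z = Xc @ W           # the next graph is built in the projected space
    return W, F


def make_toy_data(n_per_class=50, n_noise=8, n_labeled=5, seed=0):
    """Two classes; features 0 and 1 are informative, the rest are pure noise."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_per_class)
    centers = np.where(y == 0, -3.0, 3.0)
    informative = centers[:, None] + rng.normal(size=(y.size, 2))
    X = np.hstack([informative, rng.normal(size=(y.size, n_noise))])
    labeled = np.zeros(y.size, dtype=bool)
    labeled[:n_labeled] = True
    labeled[n_per_class:n_per_class + n_labeled] = True
    rows = np.flatnonzero(labeled)
    Y = np.zeros((y.size, 2))
    Y[rows, y[rows]] = 1.0   # one-hot rows for labeled samples, zeros otherwise
    return X, Y, labeled, y
```

Calling `select_features_sketch(*make_toy_data()[:3])` returns a projection `W` and soft labels `F`, and `rank_features(W)` then orders the features by row norm. Rebuilding the graph from `Xc @ W` rather than from the raw features is the design choice the abstract motivates: features with small row norms in `W` contribute little to the distances that define the neighbors.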

CLC number: