Acta Electronica Sinica, 2013, Vol. 41, Issue (1): 35-41. DOI: 10.3969/j.issn.0372-2112.2013.01.007

• Research Paper •

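    • About the authors:
    • YU Chong-chong, female, born in Chongqing in August 1971. Professor, vice dean, and master's supervisor at the School of Computer and Information Engineering, Beijing Technology and Business University. Research interests: intelligent information processing and pattern recognition; prediction and evaluation for complex real-time monitoring systems. E-mail: chongzhy@vip.sina.com
    • SHANG Li-li, female, born in Puyang, Henan, in December 1986. Master's student at the School of Computer and Information Engineering, Beijing Technology and Business University. Research interests: machine learning and pattern recognition. E-mail: shanglili2008v@126.com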

A Semi-supervised Collaboration Classification Algorithm with Enhanced Difference

YU Chong-chong1,2, SHANG Li-li2, TAN Li2, TU Xu-yan1, YANG Yang1, WANG Jing-yan2   

  1. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China;
  2. School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China
  • Received: 2012-08-06 Revised: 2012-10-21 Online: 2013-01-25 Published: 2013-01-25
    • Supported by:
    • National Natural Science Foundation of China (No.61070182); Excellent Talents Fund Project of Organization Department of Beijing Municipality (No.2010D005003000008); Beijing Discipline Construction Project (No.PXM2012_014213_0000_74); Beijing Discipline Construction Project (No.pxm_2012_014213_000023)

Abstract: The Tri-Training algorithm in semi-supervised learning removed the requirement of previous algorithms for sufficient and redundant views, and improved labeling efficiency by using three classifiers to handle label confidence estimation and sample prediction. To further improve performance by enhancing the difference among the classifiers during collaborative training, this paper builds on that theoretical basis and presents a semi-supervised collaborative classification algorithm with enhanced difference. The algorithm learns with three different classifiers. Because random sampling during model updates may degrade performance, the algorithm samples the labeled sample set by stratified sampling based on class labels, and it combines the classifiers by weighted voting based on classification accuracy, which raises prediction accuracy. Experiments compare the proposed algorithm with the Tri-Training algorithm; the results show that the proposed method performs well on classification problems and support its effectiveness and feasibility.
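
The two mechanisms named in the abstract, stratified sampling by class label and accuracy-weighted voting, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the function names `stratified_sample` and `weighted_vote`, the `fraction` parameter, and the rule of keeping at least one example per class are hypothetical choices made for this example. The abstract does not give the paper's exact weighting formula, so raw accuracy is used as the weight here.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(samples, labels, fraction, rng):
    """Draw roughly `fraction` of the labeled set from each class separately.

    Assumption for this sketch: 0 < fraction <= 1 and every class keeps
    at least one example, so no class vanishes from the sample.
    """
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    picked = []
    for y, members in by_class.items():
        k = min(len(members), max(1, round(fraction * len(members))))
        picked.extend((x, y) for x in rng.sample(members, k))
    return picked


def weighted_vote(predictions, accuracies):
    """Combine one predicted label per classifier, weighting each vote
    by that classifier's accuracy (hypothetical weighting rule)."""
    scores = defaultdict(float)
    for label, acc in zip(predictions, accuracies):
        scores[label] += acc
    return max(scores, key=scores.get)


if __name__ == "__main__":
    rng = random.Random(0)
    X = list(range(100))
    y = ["a"] * 80 + ["b"] * 20  # imbalanced labeled set

    subset = stratified_sample(X, y, 0.5, rng)
    print(Counter(label for _, label in subset))

    # One accurate classifier can outweigh two weaker ones that agree.
    print(weighted_vote(["a", "b", "b"], [0.9, 0.4, 0.4]))
```

In this sketch the sample keeps the 80:20 class ratio of the labeled set, which plain random sampling does not guarantee, and the vote differs from a simple majority whenever the classifiers' accuracies are far enough apart.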

Key words: semi-supervised collaboration classification algorithm, Tri-Training algorithm, strategy of enhancing difference, stratified sampling

CLC Number: