Acta Electronica Sinica ›› 2017, Vol. 45 ›› Issue (4): 813-819. DOI: 10.3969/j.issn.0372-2112.2017.04.007

• Research Paper •

A One-Cluster Kernel PCM Based SVDD Method for Outlier Detection

YANG Jin-hong1, DENG Ting-quan1,2

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001, China;
  2. College of Science, Harbin Engineering University, Harbin, Heilongjiang 150001, China
  • Received: 2016-08-26 Revised: 2016-10-26 Online: 2017-04-25 Published: 2017-04-25
    • About the authors:
    • YANG Jin-hong (杨金鸿), female, born in 1987 in Harbin, Heilongjiang. She is a Ph.D. candidate at the College of Computer Science and Technology, Harbin Engineering University. Her research interests include data mining, machine learning, and uncertainty theory. E-mail: yangjinhong.66@163.com
    • DENG Ting-quan (邓廷权), male, born in 1965 in Santai, Sichuan. He is a professor and doctoral supervisor at the College of Science, Harbin Engineering University. His research interests include uncertainty theory, data mining, and digital image processing.
    • Supported by:
    • National Natural Science Foundation of China (No.11471001)

Abstract:

Support vector data description (SVDD) is often trained on a dataset that contains both normal samples and outliers, all labeled as the target class. To reduce the negative influence of these outliers on the SVDD model, a one-cluster kernel possibilistic C-means (PCM) based SVDD method for outlier detection is proposed. One-cluster kernel PCM clustering yields the membership degree of each training sample in the normal class, and this membership degree is used as the sample's confidence level of belonging to the target class. The confidence levels are then incorporated into the SVDD training model to weaken the role of low-confidence samples in building the decision boundary. Experimental results show that the proposed method significantly improves outlier detection performance compared with existing SVDD-based outlier detection methods.

Key words: outlier detection, support vector data description, possibilistic C-means, confidence level
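The abstract does not give the update equations, so the following is only a minimal sketch of how one-cluster kernel PCM memberships might be computed, assuming the standard possibilistic C-means membership rule applied in a Gaussian-kernel feature space. The function names, the parameters `sigma`, `m` and `n_iter`, and the heuristic that fixes the scale `eta` after the first pass are all illustrative assumptions, not the authors' formulation.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def one_cluster_kernel_pcm(X, sigma=1.0, m=2.0, n_iter=20):
    """Return one membership in (0, 1] per sample for a single cluster.

    Assumed update rules (standard PCM, kernelised; not taken from the paper):
      w_j   = u_j^m / sum_k u_k^m                 (weights of the implicit centre)
      d_i^2 = K_ii - 2 sum_j w_j K_ij + sum_jk w_j w_k K_jk
      u_i   = 1 / (1 + (d_i^2 / eta)^(1 / (m - 1)))
    """
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], sigma) for j in range(n)] for i in range(n)]
    u = [1.0] * n          # start with every sample fully in the cluster
    eta = None             # scale parameter, fixed after the first pass
    for _ in range(n_iter):
        um = [ui ** m for ui in u]
        total = sum(um)
        w = [v / total for v in um]
        Kw = [sum(K[i][j] * w[j] for j in range(n)) for i in range(n)]
        wKw = sum(w[i] * Kw[i] for i in range(n))
        # Squared feature-space distance of each sample to the cluster centre.
        d2 = [max(K[i][i] - 2.0 * Kw[i] + wKw, 0.0) for i in range(n)]
        if eta is None:
            # Heuristic (assumption): weighted mean distance of the first pass.
            eta = max(sum(w[i] * d2[i] for i in range(n)), 1e-12)
        u = [1.0 / (1.0 + (d / eta) ** (1.0 / (m - 1.0))) for d in d2]
    return u


if __name__ == "__main__":
    # Hypothetical toy data: five points near the origin and one distant point.
    X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1), (3.0, 3.0)]
    for point, conf in zip(X, one_cluster_kernel_pcm(X)):
        print(point, round(conf, 3))
```

The returned memberships play the role of per-sample confidence levels. One plausible way to use them, again an assumption rather than something the abstract confirms, is to scale each sample's slack penalty in the SVDD objective by its confidence, so that low-confidence samples have less influence on the decision boundary.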

CLC number: