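1. College of Information and Science Technology, Qingdao University of Science and Technology, Qingdao, Shandong 266061, China
2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China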
Print publication: 2010
JIANG Feng (江峰), SUI Yue-fei (眭跃飞), CAO Cun-gen (曹存根), et al. Outlier Detection Based on Boundary and Distance (基于边界和距离的离群点检测)[J]. Acta Electronica Sinica (电子学报), 2010, 38(3): 700-705.
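In recent years, outlier detection has attracted wide attention. It plays an important role in many fields, including network intrusion detection, credit card fraud, e-commerce crime, medical diagnosis, and counter-terrorism. The goal of outlier detection is to find a small subset of objects in a data set that, compared with the large majority of the other objects, behave in a special way or have abnormal attributes. Because existing outlier detection methods cannot effectively handle uncertain and incomplete data, this paper combines the rough-set notion of a boundary with the distance-based outlier detection method proposed by Knorr et al., and proposes a new outlier definition and detection method within the rough set framework. For this method we design a corresponding outlier detection algorithm, BDOD, and verify its effectiveness through experiments on clinical diagnosis data sets. The experimental results indicate that the method offers a new way to handle uncertain and incomplete data in outlier detection.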
In recent years, outlier detection has gained considerable interest. The identification of outliers is important for many applications, such as intrusion detection, credit card fraud, criminal activities in electronic commerce, medical diagnosis, and anti-terrorism. The aim of outlier detection is to find small groups of objects that behave in an unexpected way or have abnormal properties when compared with the large remainder of the data. Since existing methods for outlier detection cannot deal with uncertain and incomplete data, in this paper we propose a new method for outlier definition and detection, which combines a basic notion of rough sets, the boundary, with Knorr's method for distance-based outliers. We also give an algorithm, BDOD, to find such outliers within the framework of rough set theory. The effectiveness of our algorithm is demonstrated on publicly available clinical diagnosis data sets. Our method offers a new approach to handling uncertain and incomplete data in outlier detection.
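The abstract names two building blocks without defining them: the rough-set boundary and Knorr's distance-based outliers. The sketch below illustrates only their standard textbook definitions; it is not the BDOD algorithm, whose definition the abstract does not give. The function names `boundary_region` and `db_outliers`, the toy patient table, and the use of Euclidean distance are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def boundary_region(objects, attrs, target):
    """Rough-set boundary of `target`: upper approximation minus lower.

    objects: dict mapping object name -> dict of attribute values
    attrs:   attributes that define the indiscernibility relation
    target:  set of object names (the concept being approximated)
    """
    # Objects with identical values on `attrs` are indiscernible.
    classes = defaultdict(set)
    for name, row in objects.items():
        classes[tuple(row[a] for a in attrs)].add(name)

    lower, upper = set(), set()
    for cls in classes.values():
        if cls <= target:   # class lies entirely inside the concept
            lower |= cls
        if cls & target:    # class overlaps the concept
            upper |= cls
    return upper - lower


def db_outliers(points, p, d):
    """Indices of DB(p, d)-outliers in the sense of Knorr and Ng: objects
    for which at least a fraction p of all objects lie farther than d away.
    Euclidean distance is an assumption of this sketch."""
    n = len(points)
    return [
        i for i, x in enumerate(points)
        if sum(1 for y in points if dist(x, y) > d) >= p * n
    ]


# Hypothetical toy data, not taken from the paper.
patients = {
    "p1": {"fever": "high", "cough": "yes"},
    "p2": {"fever": "high", "cough": "yes"},
    "p3": {"fever": "none", "cough": "no"},
    "p4": {"fever": "high", "cough": "no"},
}
diagnosed = {"p1", "p4"}
uncertain = boundary_region(patients, ["fever", "cough"], diagnosed)

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
outliers = db_outliers(points, p=0.7, d=3.0)
```

In the toy table, p1 and p2 have identical attribute values but different diagnoses, so both fall in the boundary: the available attributes cannot decide whether they belong to the concept. In the point set, only (10, 10) has at least 70% of the objects farther than 3 units away.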