LI Peng, WANG Xiao-long, LIU Yuan-chao, et al. A Classification Method for Imbalance Data Set Based on Hybrid Strategy[J]. Acta Electronica Sinica, 2007, 35(11): 2161-2165.
A Classification Method for Imbalance Data Set Based on Hybrid Strategy
This paper presents a novel and effective classification method for imbalanced data sets (IDS). The core idea of the algorithm, which is composed of three parts, is to provide a general solution for IDS classification through both sample preprocessing and classifier improvement. First, we re-sample the imbalanced data using variable SOM clustering to overcome the flaws of traditional re-sampling methods, such as heavy randomness, subjective interference, and information loss. Then we prune the sampled data sets according to the K-NN rule to resolve data confusion, which improves the generalization of the SVM. In particular, to adapt to class imbalance, class boundary alignment is introduced through a conformal transformation of the kernel function. Comparative results show the effectiveness of the three algorithms. The algorithm has also been used in our question answering system, which obtained outstanding results in the international TREC-2006 QA track.
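
The K-NN pruning step can be illustrated with a minimal sketch. The abstract does not give the exact rule, so the code below assumes one common form of K-NN editing: a sample is dropped when its label disagrees with the majority label of its k nearest neighbours. The function name `knn_edit`, the Euclidean distance, the value k = 3, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from math import dist


def knn_edit(points, labels, k=3):
    """Return indices of samples kept after K-NN editing (assumed rule).

    A sample is kept when its own label matches the most common label
    among its k nearest neighbours (Euclidean distance, itself excluded).
    """
    kept = []
    for i, p in enumerate(points):
        # Sort all other samples by distance to p and take the k closest.
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        majority_label, _ = Counter(labels[j] for j in neighbours).most_common(1)[0]
        if labels[i] == majority_label:
            kept.append(i)
    return kept


# Hypothetical toy data: two tight clusters, plus one "neg" sample
# sitting inside the "pos" cluster.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.05, 0.05),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = ["pos", "pos", "pos", "neg", "neg", "neg", "neg"]
kept = knn_edit(points, labels, k=3)
```

On this toy data the single "neg" sample embedded in the "pos" cluster is removed and every other sample is kept. With imbalanced data, a variant might prune only majority-class samples so that scarce minority examples are never discarded; whether the paper does so is not stated in the abstract.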