LU Ke-zhong, CHEN Chao-fan, CAI Huan, et al. Online Classification Algorithm for Concept Drift and Class Imbalance Data Stream[J]. ACTA ELECTRONICA SINICA, 2022, 50(03): 585-597. DOI: 10.12263/DZXB.20210094.
Online Classification Algorithm for Concept Drift and Class Imbalance Data Stream
Data stream classification is one of the most important tasks in data mining. It is widely applied in everyday life and has therefore attracted great attention from researchers. Concept drift and class imbalance are the two main issues that degrade the performance of data stream classification algorithms. However, most solutions address only one of these two issues. Even worse, most algorithms perform well only on data streams with manually configured settings and cannot be applied to real, complex data streams. To solve this problem, an ensemble algorithm of weighted online sequential extreme learning machines with an adaptive forgetting factor is proposed to handle both concept drift and class imbalance on complex data streams. The proposed algorithm is based on a weighted online sequential extreme learning machine that integrates a weighting mechanism and a forgetting mechanism. To adapt to complex data streams, an online ensemble strategy that includes an adaptive forgetting factor and a concept drift detection mechanism is designed for the classifier. Extensive simulation experiments show that the proposed algorithm achieves the best G-mean value on all data sets, can deal with both concept drift and class imbalance, and delivers stable, balanced, and accurate classification results.
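The abstract reports G-mean as its headline metric without defining it. The following is a minimal sketch, assuming the commonly used definition (the geometric mean of per-class recall); the function name `g_mean` and the toy labels are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed definition, not from the paper)."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # all samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    if not total:
        raise ValueError("y_true must not be empty")
    recalls = [correct[c] / total[c] for c in total]
    # A single class with zero recall drives the whole score to zero.
    return math.prod(recalls) ** (1.0 / len(recalls))


# Imbalanced toy example: 8 majority (0) and 2 minority (1) samples.
y_true = [0] * 8 + [1] * 2
always_majority = [0] * 10
print(g_mean(y_true, always_majority))  # 0.0, although plain accuracy is 0.8
```

Under this definition a classifier that ignores the minority class scores zero regardless of its accuracy, which is why G-mean is a natural choice for evaluating imbalanced streams.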