1. School of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China;
2. The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao, Hebei 066004, China
Abstract: Online streaming feature selection reduces the dimensionality of a streaming feature space by filtering out irrelevant and redundant features in real time. Existing methods such as Alpha-investing and Online Streaming Feature Selection (OSFS) serve this purpose, but they suffer from low prediction accuracy and long running times when the streaming features exhibit low redundancy and high relevance. We propose OSFIC, a novel classification-oriented online feature selection algorithm for streaming features. OSFIC uses a four-layer filtering framework: it filters out irrelevant new features with an unconditional independence test, filters redundant new features and redundant features in the candidate feature set with single-conditional mutual information, and finally removes the remaining redundancy in the candidate feature set with a multi-conditional independence test. The result is an approximate Markov blanket of the class label. To evaluate the algorithm, we compared its prediction accuracy, number of selected features, runtime, and AUC against state-of-the-art algorithms on datasets from NIPS 2003 and the Causality Workbench. Experiments show that the average classification accuracy of OSFIC is 4.41% higher than that of Alpha-investing. While maintaining high accuracy, it selects 41.9% fewer features on average than SAOLA and runs 91.59% faster than OSFS. Finally, the efficiency of OSFIC is verified in real-world scenarios.
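The layered filtering described above can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only the first three layers (irrelevance, redundancy of the new feature, redundancy of previously selected features) using empirical mutual information on discrete features, and the threshold `eps`, the helper names, and the data layout are all assumptions made for the example. The final multi-conditional cleanup layer is omitted for brevity.

```python
import math
from collections import Counter

def mutual_info(xs, ys):
    """Empirical mutual information I(X;Y) in nats for discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def cond_mutual_info(xs, ys, zs):
    """I(X;Y|Z) = sum_z p(z) * I(X;Y | Z=z) for a single conditioning feature."""
    n = len(zs)
    total = 0.0
    for z, cz in Counter(zs).items():
        idx = [i for i in range(n) if zs[i] == z]
        total += (cz / n) * mutual_info([xs[i] for i in idx],
                                        [ys[i] for i in idx])
    return total

def stream_select(features, label, eps=1e-3):
    """Process (name, column) pairs one at a time, as in a feature stream."""
    selected = []  # candidate feature set: list of (name, column)
    for name, col in features:
        # Layer 1: discard irrelevant new features (I(F;C) near zero).
        if mutual_info(col, label) <= eps:
            continue
        # Layer 2: discard the new feature if some selected feature
        # already renders it conditionally independent of the label.
        if any(cond_mutual_info(col, label, s) <= eps for _, s in selected):
            continue
        # Layer 3: remove selected features that the new feature subsumes.
        selected = [(n_, s) for n_, s in selected
                    if cond_mutual_info(s, label, col) > eps]
        selected.append((name, col))
    return [n_ for n_, _ in selected]
```

For instance, with a label `[0,1,0,1,0,1,0,1]`, a feature equal to the label is kept, an independent feature is dropped at layer 1, and a duplicate of the first feature is dropped at layer 2.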