YOU Dian-long, GUO Song, ZHAO Chun-hui, et al. Online Feature Selection with Streaming Features for Classification[J]. Acta Electronica Sinica, 2020, 48(2): 321-332. DOI: 10.3969/j.issn.0372-2112.2020.02.015.
Online Feature Selection with Streaming Features for Classification
Online streaming feature selection reduces the dimensionality of a streaming feature space by filtering out irrelevant and redundant features in real time. Existing methods, such as Alpha-investing and Online Streaming Feature Selection (OSFS), have been proposed for this purpose, but they suffer from low prediction accuracy and long running time when the streaming features exhibit low redundancy and high relevance. We propose a novel classification-oriented online feature selection algorithm for streaming features, named OSFIC. OSFIC uses a four-layer filtering framework: it filters irrelevant new features by null-conditional independence, filters redundant new features and redundant features in the candidate feature set by single-conditional mutual information, and finally filters the remaining redundancy in the candidate feature set by multi-conditional independence. The result is an approximate Markov blanket of the class label. To analyze the performance of the algorithm, we selected datasets from NIPS 2003 and the Causality Workbench and compared prediction accuracy, number of selected features, runtime, and AUC with existing state-of-the-art algorithms. Experiments show that the average classification accuracy of OSFIC is 4.41% higher than that of Alpha-investing. While maintaining high accuracy, the average number of selected features is 41.9% lower than that of SAOLA, and the runtime is 91.59% lower than that of OSFS. Finally, the efficiency of OSFIC is verified in real scenarios.
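
The four filtering stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' OSFIC implementation. It assumes discrete features and replaces the statistical independence tests with a fixed threshold `eps` on empirical (conditional) mutual information. The names `cond_mutual_info`, `stream_select`, `eps`, and `max_cond`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def cond_mutual_info(x, y, z=None):
    """Empirical I(X; Y | Z) in bits for discrete sequences; z=None gives I(X; Y)."""
    n = len(x)
    if z is None:
        z = [0] * n  # a constant conditioning variable reduces to plain MI
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    total = 0.0
    for (xi, yi, zi), n_xyz in c_xyz.items():
        ratio = n_xyz * c_z[zi] / (c_xz[(xi, zi)] * c_yz[(yi, zi)])
        total += (n_xyz / n) * log2(ratio)
    return total


def stream_select(features, label, eps=0.05, max_cond=2):
    """Process (name, column) pairs one at a time; return the kept feature names.

    eps and max_cond are illustrative assumptions, not values from the paper.
    """
    selected = {}  # name -> column, the candidate feature set
    for name, col in features:
        # Layer 1: drop the new feature if it looks irrelevant to the label
        # (empty conditioning set).
        if cond_mutual_info(col, label) <= eps:
            continue
        # Layer 2: drop the new feature if a single candidate makes it redundant.
        if any(cond_mutual_info(col, label, s) <= eps for s in selected.values()):
            continue
        # Layer 3: drop candidates that the new feature makes redundant.
        for other in [k for k, s in selected.items()
                      if cond_mutual_info(s, label, col) <= eps]:
            del selected[other]
        selected[name] = col
        # Layer 4: drop candidates that are redundant given a larger subset
        # of the other candidates (sizes 2..max_cond).
        for cand in list(selected):
            rest = [selected[k] for k in selected if k != cand]
            for size in range(2, min(max_cond, len(rest)) + 1):
                if any(cond_mutual_info(selected[cand], label, list(zip(*sub))) <= eps
                       for sub in combinations(rest, size)):
                    del selected[cand]
                    break
    return list(selected)


# Toy data: all combinations of three bits (a, b, c); the label equals a.
rows = list(product([0, 1], repeat=3))
label = [a for a, b, c in rows]
stream = [
    ("noise", [c for a, b, c in rows]),      # independent of the label
    ("weak", [a | b for a, b, c in rows]),   # partially informative
    ("copy", [a for a, b, c in rows]),       # fully informative
    ("dup", [a for a, b, c in rows]),        # redundant given "copy"
]
print(stream_select(stream, label))  # ['copy']
```

In this toy stream, `noise` is rejected at the first layer, `weak` is admitted and later evicted once `copy` arrives, and `dup` is rejected as redundant given `copy`. A real implementation would use proper conditional independence tests rather than a fixed threshold, since empirical mutual information is biased upward on small samples.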