HU Xiao-juan, LIU Lei, QIU Ning-jia. A Novel Spam Categorization Algorithm Based on Active Learning Method and Negative Selection Algorithm[J]. Acta Electronica Sinica, 2018, 46(1): 203-209. DOI: 10.3969/j.issn.0372-2112.2018.01.028.
A Novel Spam Categorization Algorithm Based on Active Learning Method and Negative Selection Algorithm
An active learning negative selection text categorization (ALNSTC) algorithm, based on the active learning (AL) method and the negative selection (NS) algorithm, is proposed to address the proliferation of spam. A positive user interest set and a negative user interest set are built from a small number of labeled samples, and the sampling engine (SE) of the AL method is improved with the self/non-self anomaly detection mechanism of the NS algorithm. The two user interest sets serve as detectors, and each new sample set serves as the self-set. The detectors and the self-set are matched using Hamming match rules, and classifying each sample set in turn updates the two user interest sets.

The proposed algorithm is tested on six common spam corpora and compared with five state-of-the-art spam classification methods: the quick online spam identification (QOSI) method, the semi-supervised collaboration classification algorithm with enhanced difference (DSCC), the dynamic web spam filtering (WSF2) method, the multilevel spam filtering algorithm based on artificial immunity (MSFA-AI), and the integrated multi-field learning (MFL) method. The evaluation metrics include precision, recall, ROC curve, categorization running time, and the number of spam samples that must be labeled.

The results show that the proposed method achieves better precision, recall, and classification accuracy, and reduces the number of spam samples that must be labeled manually. Converting user preferences into positive and negative user interest sets improves the classification capacity of the algorithm. In addition, the number of samples the user has to label is reduced because features of unknown categories are obtained by the anomaly detection mechanism.
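
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bit-string encoding of samples, the distance threshold, the reading of the positive set as wanted mail ("ham") and the negative set as spam, and all function names are assumptions made for illustration only.

```python
from typing import Optional, Set


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("feature strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def matches(sample: str, detectors: Set[str], threshold: int) -> bool:
    """True if the sample lies within `threshold` of any detector."""
    return any(hamming_distance(sample, d) <= threshold for d in detectors)


def classify(sample: str, positive: Set[str], negative: Set[str],
             threshold: int = 1) -> Optional[str]:
    """Label a sample from the two interest sets, or return None.

    Assumed rule: a sample matching only the positive set is "ham",
    one matching only the negative set is "spam". A confident decision
    adds the sample to the matching interest set. None means the sample
    is unknown or ambiguous and would be passed to the user for labeling.
    """
    hit_pos = matches(sample, positive, threshold)
    hit_neg = matches(sample, negative, threshold)
    if hit_pos and not hit_neg:
        positive.add(sample)
        return "ham"
    if hit_neg and not hit_pos:
        negative.add(sample)
        return "spam"
    return None


def absorb_user_label(sample: str, label: str,
                      positive: Set[str], negative: Set[str]) -> None:
    """Add a user-labeled sample to the corresponding interest set."""
    if label not in ("ham", "spam"):
        raise ValueError("label must be 'ham' or 'spam'")
    (positive if label == "ham" else negative).add(sample)
```

With `positive = {"0000"}`, `negative = {"1111"}`, and a threshold of 1, the sample `"0011"` is at distance 2 from both detectors and would be sent to the user. If `"0001"` is classified first, it joins the positive set, and `"0011"` is then within distance 1 of a positive detector and no longer needs a manual label. This is one way the interest-set updates could reduce the labeling effort that the abstract reports.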