Shanghai Educational Science Research Project (No. C17014); Computer Science and Technology Preponderant Disciplines of Shanghai DianJi University (No. 16YSXK04)
L, Pin, YU Wen-bing, et al. Stacked Generalization of Heterogeneous Classifiers and Its Application in Toxic Comments Detection[J]. Acta Electronica Sinica, 2019, 47(10): 2228-2234. DOI: 10.3969/j.issn.0372-2112.2019.10.026.
Stacked Generalization of Heterogeneous Classifiers and Its Application in Toxic Comments Detection
Toxic comment detection is important for limiting the negative impact of social media platforms on their users, and it is also one of the important fields of natural language processing. To address the unstable accuracy of an individual classifier and the low accuracy of a boosting ensemble model in detecting toxic comments, a stacked generalization of heterogeneous classifiers is proposed. In this method, a deep recurrent neural network is used to transform the multi-label classification of toxic comments into binary classification problems, which keeps the model accuracy from becoming unstable. GRU (Gated Recurrent Unit) and NB-SVM (Naïve Bayes-Support Vector Machine) serve as the individual classifiers in the stacked generalization, so that the ensemble reflects their differences in model structure and classification bias, with the goal of improving model accuracy. Experimental results on Wikipedia toxic comments show that the proposed method achieves better accuracy than the boosting ensemble, which indicates that stacked generalization of heterogeneous classifiers is feasible and effective for toxic comment detection.
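
The stacking procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two base learners below are deliberately small stand-ins (a naive Bayes log-count-ratio scorer in place of NB-SVM, and a bag-of-words logistic regression in place of the GRU), and every name, hyperparameter, and the toy data set is hypothetical. What the sketch does show is the general shape of the approach: each label is treated as a separate binary problem, the heterogeneous base learners produce out-of-fold predictions, and a logistic meta-learner is trained on those predictions.

```python
import math
from collections import Counter

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

class NBLogRatio:
    """Stand-in for NB-SVM (assumption): sums smoothed NB log-count ratios."""
    def fit(self, docs, y):
        pos, neg = Counter(), Counter()
        for doc, t in zip(docs, y):
            (pos if t else neg).update(doc.split())
        vocab = set(pos) | set(neg)
        p_tot = sum(pos.values()) + len(vocab)
        n_tot = sum(neg.values()) + len(vocab)
        self.r = {w: math.log((pos[w] + 1) / p_tot) - math.log((neg[w] + 1) / n_tot)
                  for w in vocab}
        self.b = math.log((sum(y) + 1) / (len(y) - sum(y) + 1))
        return self

    def predict_proba(self, docs):
        return [sigmoid(self.b + sum(self.r.get(w, 0.0) for w in d.split()))
                for d in docs]

class BowLogReg:
    """Stand-in for the GRU (assumption): logistic regression on word counts."""
    def fit(self, docs, y, epochs=50, lr=0.5):
        self.w, self.b = Counter(), 0.0
        for _ in range(epochs):
            for doc, t in zip(docs, y):
                words = doc.split()
                err = sigmoid(self.b + sum(self.w[x] for x in words)) - t
                self.b -= lr * err
                for x in words:
                    self.w[x] -= lr * err
        return self

    def predict_proba(self, docs):
        return [sigmoid(self.b + sum(self.w[x] for x in d.split())) for d in docs]

def stack_fit(docs, y, make_bases, k=3, epochs=500, lr=0.1):
    """Fit base learners plus a logistic meta-learner on out-of-fold predictions."""
    n = len(docs)
    oof = [[0.0] * len(make_bases) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held = [i for i in range(n) if i % k == fold]
        held_docs = [docs[i] for i in held]
        for j, make in enumerate(make_bases):
            base = make().fit([docs[i] for i in train], [y[i] for i in train])
            for i, p in zip(held, base.predict_proba(held_docs)):
                oof[i][j] = p
    # Meta-learner: logistic regression over the base learners' probabilities.
    w, b = [0.0] * len(make_bases), 0.0
    for _ in range(epochs):
        for feats, t in zip(oof, y):
            err = sigmoid(b + sum(wi * f for wi, f in zip(w, feats))) - t
            b -= lr * err
            w = [wi - lr * err * f for wi, f in zip(w, feats)]
    # Refit each base learner on all data for use at prediction time.
    return [make().fit(docs, y) for make in make_bases], w, b

def stack_predict(model, docs):
    bases, w, b = model
    cols = [base.predict_proba(docs) for base in bases]
    return [sigmoid(b + sum(wi * col[i] for wi, col in zip(w, cols)))
            for i in range(len(docs))]

# Toy data invented for illustration only; each comment carries a set of labels.
DATA = [
    ("you are an idiot", {"toxic", "insult"}), ("thanks for the great edit", set()),
    ("this is a stupid article", {"toxic"}), ("great article thanks", set()),
    ("you stupid idiot", {"toxic", "insult"}), ("thanks this is a great help", set()),
    ("what a stupid edit", {"toxic"}), ("great work thanks", set()),
    ("go away idiot", {"toxic", "insult"}), ("thanks great source", set()),
    ("stupid idiot go away", {"toxic", "insult"}), ("great edit thanks a lot", set()),
]
LABELS = ["toxic", "insult"]
DOCS = [doc for doc, _ in DATA]
# Multi-label to binary: one stacked model per label.
MODELS = {lab: stack_fit(DOCS, [int(lab in tags) for _, tags in DATA],
                         [NBLogRatio, BowLogReg]) for lab in LABELS}

def predict_labels(text):
    return {lab: stack_predict(MODELS[lab], [text])[0] for lab in LABELS}

print(predict_labels("you idiot"))
```

Training the meta-learner on out-of-fold predictions, rather than on predictions the base learners make for their own training data, is what keeps it from simply trusting whichever base learner overfits most. In a realistic setting the stand-ins would be replaced by an actual recurrent network and an NB-SVM, and the folds would be shuffled and stratified.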