Stacked Generalization of Heterogeneous Classifiers and Its Application in Toxic Comments Detection

L; Pin; YU Wen-bing; WANG Xin; JI Chun-lei

doi:10.3969/j.issn.0372-2112.2019.10.026


Last updated: 2025-07-16

- Acta Electronica Sinica Vol. 47, Issue 10, Pages: 2228-2234(2019)
- Author affiliations:
  
  1. School of Electronic Information, Shanghai Dianji University, Shanghai 201306
  2. School of Arts and Sciences, Shanghai Dianji University, Shanghai 201306
  3. Shanghai Supercomputer Center, Shanghai 201203
- Funding:
  
  Shanghai Educational Science Research Project (No. C17014); Computer Science and Technology Preponderant Disciplines of Shanghai DianJi University (No. 16YSXK04)
- DOI: 10.3969/j.issn.0372-2112.2019.10.026
  CLC: TP391
- Published Online: 25 October 2019
  
  Published: 2019
L, Pin, YU Wen-bing, et al. Stacked Generalization of Heterogeneous Classifiers and Its Application in Toxic Comments Detection[J]. Acta Electronica Sinica, 2019, 47(10): 2228-2234.

Abstract

Toxic comment detection is an important task for preventing the negative impact of social media platforms on users, and it is an important area of natural language processing. To address the unstable accuracy of individual classifiers and the relatively low accuracy of boosting ensembles in toxic comment detection, a stacked generalization method with heterogeneous classifiers is proposed. The method uses a deep recurrent neural network to transform the multi-label toxic comment classification problem into binary classification, which prevents the model accuracy from becoming unstable. During stacked generalization, it exploits the differences in model structure and classification bias between the individual classifiers, GRU (Gated Recurrent Unit) and NB-SVM (Naïve Bayes-Support Vector Machine), to improve model accuracy. Comparative experiments on the Wikipedia toxic comments dataset show that the proposed method outperforms boosting ensembles, indicating that stacked generalization of heterogeneous classifiers is feasible and effective for toxic comment detection.
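The stacking idea in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: neither a GRU nor an NB-SVM is built here. Two deliberately different stand-in base learners are assumed instead (a word-level scorer using a smoothed naive Bayes log-count ratio, and a character-trigram centroid scorer), and a small logistic regression is assumed as the meta-learner. All class names, function names, hyperparameters, and the toy comments are hypothetical.

```python
import math
import random
from collections import Counter

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

class NaiveBayesScorer:
    """Stand-in base learner 1: word-level smoothed log-count ratios."""
    def fit(self, texts, labels):
        pos, neg = Counter(), Counter()
        for text, y in zip(texts, labels):
            (pos if y == 1 else neg).update(text.lower().split())
        vocab = set(pos) | set(neg)
        p_total = sum(pos.values()) + len(vocab)
        q_total = sum(neg.values()) + len(vocab)
        # Add-one smoothed log ratio of class-conditional word frequencies.
        self.r = {w: math.log((pos[w] + 1) / p_total)
                  - math.log((neg[w] + 1) / q_total) for w in vocab}
        n_pos = sum(labels)
        self.bias = math.log((n_pos + 1) / (len(labels) - n_pos + 1))
        return self

    def predict_proba(self, text):
        words = text.lower().split()
        return sigmoid(self.bias + sum(self.r.get(w, 0.0) for w in words))

def trigrams(text):
    s = f"  {text.lower()} "
    return [s[i:i + 3] for i in range(len(s) - 2)]

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TrigramCentroid:
    """Stand-in base learner 2: character-trigram class centroids."""
    def fit(self, texts, labels):
        self.cent = {0: Counter(), 1: Counter()}
        for text, y in zip(texts, labels):
            self.cent[y].update(trigrams(text))
        return self

    def predict_proba(self, text):
        v = Counter(trigrams(text))
        # The scale factor 4.0 is an arbitrary illustrative choice.
        return sigmoid(4.0 * (cosine(v, self.cent[1]) - cosine(v, self.cent[0])))

def fit_logreg(X, y, lr=0.5, epochs=500):
    # Meta-learner: logistic regression trained by batch gradient descent.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, a in enumerate(xi):
                grad[j + 1] += err * a
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w

def stack_fit(texts, labels, base_factories, k=3, seed=0):
    idx = list(range(len(texts)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    # Out-of-fold predictions: each sample is scored only by base models
    # that did not see it during training.
    oof = [[0.0] * len(base_factories) for _ in texts]
    for fold in folds:
        held = set(fold)
        tr = [i for i in idx if i not in held]
        for j, make in enumerate(base_factories):
            model = make().fit([texts[i] for i in tr], [labels[i] for i in tr])
            for i in fold:
                oof[i][j] = model.predict_proba(texts[i])
    meta = fit_logreg(oof, labels)
    # Refit the base learners on all data for use at prediction time.
    bases = [make().fit(texts, labels) for make in base_factories]

    def predict(text):
        feats = [b.predict_proba(text) for b in bases]
        return sigmoid(meta[0] + sum(w * f for w, f in zip(meta[1:], feats)))
    return predict

# Hypothetical toy comments; 1 = toxic, 0 = non-toxic.
toxic = ["you are an idiot", "what a stupid idiot", "shut up you stupid fool",
         "you fool go away", "stupid idea from an idiot", "go away you idiot"]
clean = ["thanks for the help", "great edit thanks", "this article is great",
         "thanks for fixing the article", "great help with the edit",
         "the article looks great"]
predict = stack_fit(toxic + clean, [1] * 6 + [0] * 6,
                    [NaiveBayesScorer, TrigramCentroid])
print(predict("you stupid idiot"), predict("thanks for the great edit"))
```

The key design point is that the meta-learner is trained on out-of-fold predictions, so it learns how much to trust each base learner without seeing scores the base learners produced on their own training data. For a multi-label problem, one plausible arrangement would be to fit such a stack once per label, though the abstract does not spell out how the paper handles this.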
