WU Yu-jia, LI Jing, SONG Cheng-fang, CHANG Jun
Existing deep learning methods for text classification do not consider the importance of text features or the associations among them, although these associations may affect classification accuracy. To address this problem, this study proposes high utility neural networks (HUNN), a framework for text classification that can effectively mine the importance of text features and their associations. Mining high utility itemsets (MHUI) from databases is an emerging topic in data mining; it can mine both the importance and the co-occurrence frequency of each feature in a dataset. The co-occurrence frequency of features reflects the association between text features. HUNN uses MHUI as its mining layer to find, for each class, the text features with strong importance and association, and selects these features as input to the neural network. The convolution layer then extracts high-level features with strong categorical representation ability, which improves the classification accuracy of the model. Experimental results show that the proposed model performs significantly better on six public datasets than convolutional neural networks (CNN), recurrent neural networks (RNN), recurrent convolutional neural networks (RCNN), fast text classifier (FAST), and hierarchical attention networks (HAN).
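To make the idea of high utility itemset mining concrete, the following is a minimal sketch, not the mining layer used in HUNN. It assumes a common textbook definition of utility: the utility of an itemset in a document is the sum, over its terms, of the term count multiplied by a per-term importance weight, and it is counted only in documents that contain every term of the itemset. The function names, the toy documents, the weights, and the threshold are all hypothetical, and the exhaustive enumeration is for illustration only; practical algorithms prune the search space.

```python
from itertools import combinations


def itemset_utility(itemset, transactions, weights):
    """Total utility of `itemset` over all transactions that contain every item.

    Each transaction maps a term to its count in one document (assumed
    "internal utility"); `weights` maps a term to an importance score
    (assumed "external utility").
    """
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * weights[item] for item in itemset)
    return total


def mine_high_utility_itemsets(transactions, weights, min_util, max_size=2):
    """Enumerate itemsets up to `max_size` and keep those with utility >= min_util.

    Brute force: suitable only for tiny illustrative inputs.
    """
    items = sorted({item for tx in transactions for item in tx})
    result = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, weights)
            if utility >= min_util:
                result[itemset] = utility
    return result


# Hypothetical toy corpus for one class: term -> count per document.
docs = [
    {"goal": 2, "match": 1, "team": 1},
    {"goal": 1, "team": 2},
    {"match": 1, "ticket": 3},
]
# Hypothetical importance weights per term.
weights = {"goal": 3, "match": 2, "team": 1, "ticket": 1}

high = mine_high_utility_itemsets(docs, weights, min_util=8)
for itemset, utility in high.items():
    print(itemset, utility)
```

In this toy setting, a pair such as ("goal", "team") can pass the threshold because the two terms co-occur in several documents, which is the sense in which utility captures both importance and association. The selected terms would then be the candidates passed on to the neural network.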