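1. School of Information, Central University of Finance and Economics, Beijing 100081
2. School of Statistics, Tianjin University of Finance and Economics, Tianjin 300222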
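WANG You-wei, male, born in 1987 in Linyi, Shandong. Associate professor. His main research interests include sentiment analysis, data mining, and deep learning. E-mail: ywwang15@126.com
LIU Rui, female, born in 1998 in Huainan, Anhui. Master's student. Her main research interests include sentiment analysis and deep learning. E-mail: 1056892410@qq.com
FENG Li-zhou, female, born in 1986 in Changchun, Jilin. Associate professor. Her main research interests include sentiment analysis, content security, and deep learning. E-mail: lzfeng15@126.com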
Received: 2022-06-02
Revised: 2022-12-29
Published in print: 2024-05-25
WANG You-wei, LIU Rui, FENG Li-zhou. A Sentiment Classification Method for Text Comments Based on User Personality and Semantic-Structural Features[J]. Acta Electronica Sinica, 2024, 52(05): 1657-1669. DOI:10.12263/DZXB.20220645
Traditional sentiment classification methods for text comments usually ignore the influence of user personality on the classification result. To address this, a sentiment classification method for text comments based on user personality and semantic-structural features (User Personality and Semantic-structural Features based Sentiment Classification Method for Text Comments, BF_BiGAC) is proposed. First, because the Big Five personality model expresses user personality effectively, user personality features are obtained from the comment text by computing a score for each personality dimension. Second, because the bidirectional gated recurrent unit (BiGRU) and the convolutional neural network (CNN) effectively extract contextual semantic features and local structural features of text, a text semantic-structural feature acquisition method based on BiGRU, CNN, and a two-layer attention mechanism is proposed. Finally, to distinguish the influence of different feature types, a hybrid attention layer is introduced to fuse the user personality features with the textual semantic-structural features, yielding the final text vector representation. Comparative experiments on four comment datasets (IMDB, Yelp-2, Yelp-5, and Ekman) show that BF_BiGAC performs well in both classification accuracy (Accuracy) and weighted macro F1 (F_w). Relative to the sentiment classification method concatenating BiGRU and CNN (BiGRU_CNN), Accuracy improves by 0.020, 0.012, 0.017, and 0.011 on the four datasets, respectively. Relative to the sentiment classification method concatenating CNN and BiGRU (ConvBiLSTM), F_w improves by 0.022, 0.013, 0.028, and 0.023, respectively. Compared with the pre-trained models BERT and RoBERTa, BF_BiGAC runs more efficiently while maintaining classification accuracy.
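The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that the weighted macro F1 (F_w) is the average of per-class F1 scores weighted by each class's share of the gold labels; the abstract does not spell out the formula, so this reading is an assumption. The function names and the toy labels are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support.

    Assumption: this is the usual "weighted" F1 definition; the paper's
    exact F_w formula is not given in the abstract.
    """
    support = Counter(y_true)  # number of gold examples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        n_pred = sum(p == label for p in y_pred)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += (n_true / total) * f1
    return score


# Toy labels, made up for illustration only.
gold = ["pos", "pos", "pos", "neg", "neg", "neu"]
pred = ["pos", "pos", "neg", "neg", "neg", "neu"]
print(round(accuracy(gold, pred), 3), round(weighted_f1(gold, pred), 3))
```

Weighting by support keeps a rare class from dominating the score, which matters for imbalanced label sets such as multi-class emotion data. A gain such as 0.020 in Accuracy then corresponds to two additional correct predictions per hundred comments.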