

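1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710129, China
2. National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, Northwestern Polytechnical University, Xi'an, Shaanxi 710129, China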
Received: 12 May 2023
Revised: 12 November 2024
Published: 25 December 2024
GUO Zhe, ZHANG Zhi-bo, ZHOU Wei-jie, et al. Chinese Long Text Summarization with Guided Attention[J]. Acta Electronica Sinica, 2024, 52(12): 3914-3930. DOI:10.12263/DZXB.20230429
Current deep-learning research on Chinese long text summarization has two problems: (1) generation models lack information guidance and pay too little attention to key words and sentences, so critical information is lost across long text spans; (2) the vocabularies of existing Chinese long text summarization models are usually character-based and do not contain common Chinese words and punctuation, which hinders the extraction of multi-granularity semantic information. To address these problems, this paper proposes a Chinese Long text Summarization method with Guided Attention (CLSGA). First, for the Chinese long text summarization task, an extraction model flexibly extracts the core words and sentences of the long text to construct a guide text, which directs the generation model to concentrate its attention on the more important information during encoding. Second, a Chinese long text vocabulary is designed so that text length is counted in words rather than characters, which helps extract richer multi-granularity features; hierarchical position decomposition encoding is further introduced to extend the position encoding to long texts efficiently and accelerate network convergence. Finally, a local attention mechanism serves as the backbone and is combined with a guided attention mechanism to capture important information across long text spans and improve the accuracy of the generated summaries. Experimental results on four public Chinese summarization datasets of different lengths, LCSTS (a large-scale Chinese short text summarization dataset), CNewSum (a large-scale Chinese news summarization dataset), NLPCC2017 and SFZY2020, show that the proposed method has significant advantages for long text summarization and effectively improves the ROUGE-1, ROUGE-2 and ROUGE-L scores.
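The combination of local and guided attention described in the abstract can be pictured as a sparse attention mask. The following is a minimal sketch, not the authors' implementation: it assumes that guided attention behaves like Longformer-style global attention, that is, guide tokens are visible to every position and themselves see the whole sequence. The function name `build_attention_mask`, the `window` parameter, and the `guide_positions` set are hypothetical names introduced only for illustration.

```python
from typing import List, Set


def build_attention_mask(
    seq_len: int, window: int, guide_positions: Set[int]
) -> List[List[bool]]:
    """Return mask[i][j] == True when query position i may attend to key j.

    Assumption (not taken from the paper): guide tokens act as global
    tokens, and every index in `guide_positions` lies in [0, seq_len).
    """
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        # Local attention: position i sees neighbours within `window`
        # tokens on either side, including itself.
        lo = max(0, i - window)
        hi = min(seq_len, i + window + 1)
        for j in range(lo, hi):
            mask[i][j] = True
        # Guided attention (assumed global): every position may also
        # attend to the guide tokens, however far away they are.
        for g in guide_positions:
            mask[i][g] = True
    # Guide tokens themselves attend to the whole sequence.
    for g in guide_positions:
        for j in range(seq_len):
            mask[g][j] = True
    return mask


if __name__ == "__main__":
    # Print the mask as a grid: "x" marks an allowed query-key pair.
    demo = build_attention_mask(seq_len=8, window=1, guide_positions={0})
    for row in demo:
        print("".join("x" if allowed else "." for allowed in row))
```

With `seq_len=8`, `window=1` and a single guide token at position 0, the mask allows 34 of the 64 query-key pairs, which illustrates how a distant guide token stays reachable while most long-range pairs are pruned.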