Chinese Long Text Summarization with Guided Attention

GUO Zhe; ZHANG Zhi-bo; ZHOU Wei-jie; FAN Yang-yu; ZHANG Yan-ning

doi:10.12263/DZXB.20230429

PAPERS | Updated: 2025-12-11

- ACTA ELECTRONICA SINICA Vol. 52, Issue 12, Pages: 3914-3930 (2024)
- Author affiliations:
  1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710129
  2. National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, Northwestern Polytechnical University, Xi'an, Shaanxi 710129
- Funding:
  National Natural Science Foundation of China (62071384); Key Research and Development Project of Shaanxi Province (2023-YBGY-239)
- DOI: 10.12263/DZXB.20230429
  CLC: TP391.11
- Received: 12 May 2023; Revised: 12 November 2024; Published: 25 December 2024

GUO Zhe, ZHANG Zhi-bo, ZHOU Wei-jie, et al. Chinese Long Text Summarization with Guided Attention[J]. Acta Electronica Sinica, 2024, 52(12): 3914-3930.

Abstract

Current deep-learning research on Chinese long text summarization has two problems: (1) generation models lack information guidance and pay too little attention to key words and sentences, so critical information is lost over long text spans; (2) the vocabularies of existing Chinese long text summarization models are usually character-based and do not contain common Chinese words and punctuation, which hinders the extraction of multi-granularity semantic information. To address these problems, this paper proposes a Chinese Long text Summarization method with Guided Attention (CLSGA). First, for the Chinese long text summarization task, an extraction model flexibly extracts the core words and sentences of the long text to build a guide text, which directs the generation model to focus on the more important information during encoding. Second, a Chinese long text vocabulary is designed so that text length is counted in words and phrases rather than characters, which helps extract richer multi-granularity features; hierarchical position decomposition encoding is then introduced to extend the position encoding to long text efficiently and to accelerate network convergence. Finally, a local attention mechanism serves as the backbone and is combined with a guided attention mechanism to capture important information across long text spans and improve summarization accuracy. Experiments on four public Chinese summarization datasets of different lengths, LCSTS (a large-scale Chinese short text summarization dataset), CNewSum (a large-scale Chinese news summarization dataset), NLPCC2017 and SFZY2020, show that the proposed method has significant advantages for long text summarization and effectively improves ROUGE-1, ROUGE-2 and ROUGE-L scores.
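The abstract does not say how the local and guided attention mechanisms are combined. As a rough illustration only, the sketch below assumes a Longformer-style pattern: every token attends to a local window, and tokens flagged as guidance attend to, and are attended by, all positions. The names `build_attention_mask`, `masked_softmax`, `window` and `guided_positions` are hypothetical and are not taken from the paper.

```python
import math


def build_attention_mask(seq_len, window, guided_positions):
    """Return a seq_len x seq_len boolean mask.

    mask[i][j] is True when query position i may attend to key position j.
    Assumption: guided positions behave like global tokens.
    """
    guided = set(guided_positions)
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(seq_len):
            local = abs(i - j) <= window          # sliding local window
            global_link = i in guided or j in guided  # guided tokens see / are seen by all
            mask[i][j] = local or global_link
    return mask


def masked_softmax(scores, allowed):
    """Softmax over one row of scores, restricted to the allowed positions."""
    kept = [s for s, ok in zip(scores, allowed) if ok]
    peak = max(kept)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: 8 tokens, window of 1, token 5 flagged as guidance.
mask = build_attention_mask(seq_len=8, window=1, guided_positions=[5])
weights = masked_softmax([0.0] * 8, mask[0])
```

With uniform scores, the first token spreads its attention evenly over itself, its right neighbour and the guided token at position 5, which is how a distant key token can stay visible despite the local window. The paper's actual design may differ, for example by attending to a separately encoded guide text.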
