Acta Electronica Sinica ›› 2022, Vol. 50 ›› Issue (8): 1851-1858. DOI: 10.12263/DZXB.20201463
ZHANG Zhi-chang (张志昌), YU Pei-lin (于沛霖), PANG Ya-li (庞雅丽), ZHU Lin (朱林), ZENG Yang-yang (曾扬扬)
Received: 2020-12-18
Revised: 2021-04-15
Online: 2022-08-25
Published: 2022-09-08
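Abstract:
Dialogue state tracking is a key module of task-oriented dialogue systems. Previous work uses attention mechanisms to simulate a graph structure when incorporating dialogue history, but this approach cannot explicitly exploit the structure of the dialogue state. Generating dialogue states in complex formats poses a further challenge. To address these problems, this paper proposes the State Memory Graph Network (SMGN). The network stores historical dialogue information in a state memory graph and uses the graph structure to perform feature interaction with the current dialogue. The paper also designs a method for generating complex dialogue states based on the state memory graph. Experimental results show that the proposed method improves joint accuracy by 1.39 percentage points on the CrossWOZ dataset and by 1.86 percentage points on the MultiWOZ dataset.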
ZHANG Zhi-chang, YU Pei-lin, PANG Ya-li, ZHU Lin, ZENG Yang-yang. SMGN: A State Memory Graph Network for Dialogue State Tracking[J]. Acta Electronica Sinica, 2022, 50(8): 1851-1858.
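Dialogue text | User: Please book me a standard room at the Shangri-La Hotel (香格里拉酒店). System: Which day will you check in? User: Tomorrow. Also book me a taxi from Capital International Airport (首都国际机场) to the hotel. System: What time can you leave Capital International Airport? User: …… |
---|---|
Dialogue state | {hotel-name-Shangri-La Hotel, hotel-room type-standard room, hotel-check-in date-tomorrow, taxi-departure-Capital International Airport, taxi-destination-Shangri-La Hotel} |
Table 1 Example of the dialogue state tracking task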
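The dialogue state in Table 1 can be read as a set of domain-slot-value triples that grows turn by turn. The sketch below only illustrates that data structure; it is not the SMGN implementation. The class name `DialogueState` and its methods are hypothetical, and the slot names and values are English glosses of the table entries.

```python
from collections import defaultdict


class DialogueState:
    """Hypothetical container: domain -> slot -> value (not the paper's code)."""

    def __init__(self):
        self.slots = defaultdict(dict)

    def update(self, domain, slot, value):
        # A later turn overwrites the earlier value of the same domain-slot pair.
        self.slots[domain][slot] = value

    def triples(self):
        # Flatten to the set-of-triples form shown in Table 1.
        return {(d, s, v) for d, sv in self.slots.items() for s, v in sv.items()}


state = DialogueState()
# Turn 1: "book me a standard room at the Shangri-La Hotel"
state.update("hotel", "name", "Shangri-La Hotel")
state.update("hotel", "room type", "standard room")
# Turn 2: "tomorrow ... a taxi from Capital International Airport to the hotel"
state.update("hotel", "check-in date", "tomorrow")
state.update("taxi", "departure", "Capital International Airport")
# "the hotel" refers back to a value already stored under another domain.
state.update("taxi", "destination", state.slots["hotel"]["name"])

for triple in sorted(state.triples()):
    print("-".join(triple))
```

The last update shows why history matters: the taxi destination is never named in the second user turn and has to be recovered from the state built in the first one.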
Statistic | CrossWOZ | MultiWOZ 2.0 | MultiWOZ 2.1 |
---|---|---|---|
Language | Chinese | English | English |
Domains | 5 | 7 | 5 |
Slots | 72 | 25 | 25 |
Slot values | 7 871 | 4 510 | 4 510 |
Dialogues | 5 012 | 8 438 | 8 133 |
Total turns | 84 692 | 115 424 | 113 556 |
Table 2 Dataset statistics
Task category | TRADE joint accuracy/% | SMGN joint accuracy/% |
---|---|---|
Single-domain (S) | 71.67 | 72.63 |
Independent multi-domain (M) | 45.29 | 46.76 |
Multi-domain+traffic (M+T) | 37.98 | 39.25 |
Cross multi-domain (CM) | 30.77 | 32.92 |
Cross multi-domain+traffic (CM+T) | 25.65 | 27.74 |
Overall | 36.08 | 37.47 |
Table 3 Experimental results on the CrossWOZ dataset
Model | Ontology | BERT | MultiWOZ 2.0 joint accuracy/% | MultiWOZ 2.1 joint accuracy/% |
---|---|---|---|---|
DSTreader | | | 39.41 | 36.40 |
HyST | √ | | 42.33 | 38.10 |
TRADE | | | 48.60 | 45.60 |
DSTQA | √ | | 51.44 | 51.17 |
SOM-DST | | √ | 51.38 | 52.57 |
SUMBT | √ | √ | 48.81 | 52.75 |
SST | √ | | 51.17 | 55.23 |
SMGN | | √ | 53.03 | 54.88 |
Table 4 Experimental results on the MultiWOZ datasets
Model | MultiWOZ 2.0 joint accuracy/% |
---|---|
SMGN | 53.03 |
-Graph Update Layer | 45.97 (-7.06) |
GAT-->Transformer | 50.25 (-2.78) |
Value Node Tagger-->State prediction | 50.79 (-2.24) |
BERT-->Word2Vec+BiLSTM | 50.74 (-2.29) |
Table 5 Ablation results
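The values in parentheses in Table 5 are the ablated score minus the full SMGN score. A few lines of Python reproduce them from the table entries; the variable names are illustrative only.

```python
full = 53.03  # SMGN joint accuracy on MultiWOZ 2.0, from Table 5

# Joint accuracy of each ablated variant, copied from Table 5.
ablations = {
    "-Graph Update Layer": 45.97,
    "GAT-->Transformer": 50.25,
    "Value Node Tagger-->State prediction": 50.79,
    "BERT-->Word2Vec+BiLSTM": 50.74,
}

# Difference to the full model, rounded to the two decimals used in the table.
deltas = {name: round(score - full, 2) for name, score in ablations.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

Removing the graph update layer costs the most by a wide margin, while the other three substitutions each cost between two and three points.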
Model | MultiWOZ 2.0 joint accuracy/% | Time/ms |
---|---|---|
TRADE | 48.60 | 979 |
SOM-DST | 51.38 | 245 |
SUMBT | 48.81 | 392 |
SST | 51.17 | 79 |
SMGN | 53.03 | 165 |
Table 6 Comparison of model computational efficiency
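Dialogue text | User: Please help me find a restaurant near Yanshou Temple (延寿寺) with a per-capita cost of 100 yuan. System: I recommend Shenghexuan Roast Duck Restaurant (盛和轩烤鸭店); its per-capita cost is 93 yuan and its rating is 4. User: OK. Also, are there any high-end hotels there? |
---|---|
Correct result | (……, restaurant-location-near Yanshou Temple, hotel-location-near Yanshou Temple) |
TRADE | (……, restaurant-location-Yanshou Temple) |
SMGN | (……, restaurant-location-near Yanshou Temple, hotel-location-near Yanshou Temple) |
Table 7 Comparison of example results on the CrossWOZ dataset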
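Joint accuracy, the metric reported in Tables 3 to 6, is commonly computed by counting a turn as correct only when the predicted state matches the reference state exactly. The sketch below applies that common definition to the turn in Table 7. The function name `joint_accuracy` is hypothetical, the slot values are English glosses, the elided part of each state ("……") is left out, and the paper's own evaluation script may differ in details.

```python
def joint_accuracy(predictions, references):
    """Fraction of turns whose predicted triple set equals the reference set."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(set(p) == set(r) for p, r in zip(predictions, references))
    return correct / len(references)


# One turn, taken from Table 7 (visible triples only).
gold = [{("restaurant", "location", "near Yanshou Temple"),
         ("hotel", "location", "near Yanshou Temple")}]
trade = [{("restaurant", "location", "Yanshou Temple")}]
smgn = [{("restaurant", "location", "near Yanshou Temple"),
         ("hotel", "location", "near Yanshou Temple")}]

print(joint_accuracy(trade, gold))  # 0.0: one wrong value and one missing slot
print(joint_accuracy(smgn, gold))   # 1.0: exact match
```

Because a single wrong or missing slot makes the whole turn count as incorrect, joint accuracy falls quickly as the number of tracked slots grows, which is consistent with the lower scores on the cross-domain tasks in Table 3.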