Acta Electronica Sinica ›› 2021, Vol. 49 ›› Issue (7): 1305-1313. DOI: 10.12263/DZXB.20200654
Special topic: Natural Language Processing Technology
HUANG Ming-xuan1,2
Received: 2020-07-05
Revised: 2020-09-25
Online: 2021-07-25
Published: 2021-08-11
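Abstract:
To address query topic drift and word mismatch in natural language processing, this paper proposes an association pattern mining and rule expansion algorithm based on the CSC (Copulas-based Support and Confidence) framework. It then fuses the statistically derived association patterns with word vectors that carry contextual semantic information, yielding a pseudo-relevance feedback query expansion model that combines association pattern mining with word embedding learning. The model mines rule expansion terms from the pseudo-relevance feedback document set, trains word embeddings on the initially retrieved document set to obtain word vectors, computes the vector similarity between each rule expansion term and the original query, and keeps the rule expansion terms whose similarity is not lower than a threshold as the final expansion terms. Experimental results show that the proposed model effectively reduces query topic drift and word mismatch and improves retrieval performance. Compared with existing query expansion methods based on association patterns and on word vectors, the largest average gain in MAP (Mean Average Precision) is 17.52%, and the model is more effective for short queries. The proposed mining method can also be applied to other text mining tasks and to recommender systems to improve their performance.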
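The final filtering step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes cosine similarity, represents the original query by the mean of its term vectors, and uses a hypothetical function name (`filter_expansion_terms`), made-up toy vectors, and an arbitrary threshold. The abstract itself does not specify these details.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def filter_expansion_terms(query_terms, candidates, vectors, threshold):
    """Keep rule-mined candidate terms whose similarity to the query
    is not lower than `threshold`.

    Assumption (not from the paper): the query is represented by the
    mean of the vectors of its terms.
    """
    known = [vectors[t] for t in query_terms if t in vectors]
    if not known:
        return []
    q = np.mean(known, axis=0)
    scored = [(t, cosine(vectors[t], q)) for t in candidates if t in vectors]
    kept = [(t, s) for t, s in scored if s >= threshold]
    # Highest similarity first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional vectors, invented purely for illustration.
vectors = {
    "query": np.array([1.0, 0.0, 0.0]),
    "expansion": np.array([0.8, 0.6, 0.0]),
    "retrieval": np.array([0.9, 0.1, 0.0]),
    "banana": np.array([0.0, 0.0, 1.0]),
}
final_terms = filter_expansion_terms(
    ["query", "expansion"], ["retrieval", "banana"], vectors, threshold=0.5
)
print(final_terms)
```

In this toy run only "retrieval" survives the threshold, because "banana" is orthogonal to the query vector. In practice the candidate list would come from the association rule mining stage and the vectors from embeddings trained on the initially retrieved documents.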
HUANG Ming-xuan. Pseudo-Relevance Feedback Query Expansion Based on the Fusion of Association Pattern Mining and Word Embedding Learning[J]. Acta Electronica Sinica, 2021, 49(7): 1305-1313.
Corpus | Abbreviation | Number of documents |
---|---|---|
udn2000 | UN0 | 244038 |
udn2001 | UN1 | 222526 |
ude2000 | UE0 | 40445 |
ude2001 | UE1 | 51851 |
mhn2000 | MN0 | 84437 |
mhn2001 | MN1 | 85302 |
edn2000 | EN0 | 79380 |
edn2001 | EN1 | 93467 |
Table 1 Original experimental corpora and their document counts
Evaluation criterion | Algorithm | UN0 | UN1 | UE0 | UE1 | MN0 | MN1 | EN0 | EN1 |
---|---|---|---|---|---|---|---|---|---|
Relax | BLR | 0.2180 | 0.2679 | 0.3701 | 0.2497 | 0.3049 | 0.3144 | 0.4278 | 0.1992 |
Relax | QE_WAPM | 0.2777 | 0.2963 | 0.4130 | 0.2694 | 0.3447 | 0.3517 | 0.4777 | 0.2551 |
Relax | QE_WPNPM | 0.2699 | 0.2714 | 0.4622 | 0.2815 | 0.3481 | 0.3370 | 0.4631 | 0.2010 |
Relax | QE_WMSM | 0.2831 | 0.2993 | 0.4375 | 0.2842 | 0.3264 | 0.3783 | 0.4615 | 0.2301 |
Relax | QE_W2Vec | 0.2713 | 0.3122 | 0.5003 | 0.3006 | 0.2871 | 0.3570 | 0.4628 | 0.2122 |
Relax | QE_ARCSC | 0.2890 | 0.3077 | 0.4983 | 0.3264 | 0.3691 | 0.3742 | 0.4875 | 0.2647 |
Relax | QE_AP&SG | 0.2939 | 0.3186 | 0.5344 | 0.3208 | 0.3706 | 0.3955 | 0.5060 | 0.2735 |
Relax | QE_AP&GL | 0.2704 | 0.3079 | 0.5362 | 0.3259 | 0.3623 | 0.3675 | 0.5038 | 0.2711 |
Relax | QE_AP&BT | 0.2940 | 0.3178 | 0.5316 | 0.3228 | 0.3752 | 0.3787 | 0.5136 | 0.2728 |
Rigid | BLR | 0.1253 | 0.1839 | 0.2075 | 0.1795 | 0.2089 | 0.1850 | 0.2814 | 0.1200 |
Rigid | QE_WAPM | 0.1597 | 0.1954 | 0.2165 | 0.1661 | 0.2016 | 0.1976 | 0.3313 | 0.1690 |
Rigid | QE_WPNPM | 0.1496 | 0.1776 | 0.2452 | 0.1668 | 0.2172 | 0.1856 | 0.3359 | 0.1398 |
Rigid | QE_WMSM | 0.1596 | 0.1906 | 0.2531 | 0.1787 | 0.1987 | 0.2142 | 0.3145 | 0.1310 |
Rigid | QE_W2Vec | 0.1474 | 0.2054 | 0.3056 | 0.1997 | 0.1894 | 0.2147 | 0.3218 | 0.1383 |
Rigid | QE_ARCSC | 0.1608 | 0.2043 | 0.2712 | 0.2026 | 0.2335 | 0.2074 | 0.3338 | 0.1728 |
Rigid | QE_AP&SG | 0.1613 | 0.2155 | 0.3061 | 0.2036 | 0.2412 | 0.2038 | 0.3514 | 0.1752 |
Rigid | QE_AP&GL | 0.1507 | 0.2036 | 0.3057 | 0.2023 | 0.2281 | 0.1971 | 0.3501 | 0.1717 |
Rigid | QE_AP&BT | 0.1615 | 0.2092 | 0.3059 | 0.2004 | 0.2385 | 0.2064 | 0.3553 | 0.1749 |
Table 2 Retrieval performance (MAP) of the proposed algorithms, the baseline, and the comparison algorithms (Title queries)
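For readers unfamiliar with the metric reported in Table 2, the standard definition of MAP can be sketched as below. The function names and the toy rankings are illustrative assumptions; the page does not show the evaluation code actually used, and under the Relax and Rigid criteria the set of documents counted as relevant would differ.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """AP for one query: sum of the precision values at the ranks where a
    relevant document appears, divided by the number of relevant documents."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)

def mean_average_precision(runs):
    """MAP: mean of per-query AP. `runs` is a list of (ranking, relevant) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Two toy queries (document ids are made up).
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6", "d7"], {"d6"}),              # AP = (1/2) / 1
]
map_score = mean_average_precision(runs)
print(round(map_score, 4))
```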
Query type (criterion) | Proposed algorithm | BLR | QE_WAPM | QE_WPNPM | QE_WMSM | QE_W2Vec |
---|---|---|---|---|---|---|
Title (Relax) | QE_ARCSC | 24.96 | 8.62 | 12.28 | 8.28 | 9.59 |
Title (Relax) | QE_AP&SG | 28.69 | 11.86 | 15.63 | 11.48 | 12.75 |
Title (Relax) | QE_AP&GL | 25.49 | 9.18 | 12.78 | 8.84 | 9.95 |
Title (Relax) | QE_AP&BT | 28.36 | 11.58 | 15.31 | 11.25 | 12.51 |
Title (Rigid) | QE_ARCSC | 21.18 | 9.52 | 12.1 | 10.09 | 5.91 |
Title (Rigid) | QE_AP&SG | 25.41 | 13.46 | 15.86 | 13.87 | 9.32 |
Title (Rigid) | QE_AP&GL | 21.78 | 10.22 | 12.45 | 10.56 | 5.99 |
Title (Rigid) | QE_AP&BT | 24.93 | 12.95 | 15.32 | 13.35 | 8.85 |
Desc (Relax) | QE_ARCSC | 23.88 | 6.62 | 14.15 | 12.95 | 6.22 |
Desc (Relax) | QE_AP&SG | 25.67 | 8.26 | 15.71 | 14.7 | 7.67 |
Desc (Relax) | QE_AP&GL | 21.74 | 5.02 | 12.34 | 11.31 | 4.46 |
Desc (Relax) | QE_AP&BT | 25.99 | 8.55 | 16.13 | 15.04 | 7.99 |
Desc (Rigid) | QE_ARCSC | 22.21 | 7.02 | 14.46 | 14.99 | 6.26 |
Desc (Rigid) | QE_AP&SG | 24.93 | 9.45 | 16.9 | 17.52 | 8.43 |
Desc (Rigid) | QE_AP&GL | 19.71 | 4.96 | 12.28 | 12.86 | 4.19 |
Desc (Rigid) | QE_AP&BT | 23.36 | 8.03 | 15.51 | 16.11 | 7.13 |
Table 3 Average MAP gain (%) of the proposed algorithms over the baseline retrieval (BLR) and the comparison algorithms on the 8 datasets
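The page does not state how the average gain is computed. One reading that appears consistent with the numbers is the mean of the per-dataset relative gains. The sketch below applies that assumed formula to the Title (Relax) MAP values of BLR and QE_ARCSC from Table 2; the function name `average_gain_percent` is hypothetical.

```python
# Title (Relax) MAP values from Table 2, in dataset order
# UN0, UN1, UE0, UE1, MN0, MN1, EN0, EN1.
blr = [0.2180, 0.2679, 0.3701, 0.2497, 0.3049, 0.3144, 0.4278, 0.1992]
qe_arcsc = [0.2890, 0.3077, 0.4983, 0.3264, 0.3691, 0.3742, 0.4875, 0.2647]

def average_gain_percent(proposed, reference):
    """Mean over datasets of the relative gain (proposed - reference) / reference,
    expressed in percent. Assumed formula, not stated in the source."""
    gains = [(p - r) / r * 100.0 for p, r in zip(proposed, reference)]
    return sum(gains) / len(gains)

gain = average_gain_percent(qe_arcsc, blr)
print(f"{gain:.2f}")
```

Under this assumption the result rounds to 24.96, matching the Title (Relax) entry for QE_ARCSC against BLR in Table 3.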
Query type (criterion) | Proposed algorithm | BLR | QE_WAPM | QE_WPNPM | QE_WMSM | QE_W2Vec |
---|---|---|---|---|---|---|
Title (Relax) | QE_ARCSC | 22.21 | 4.95 | 10.41 | 9.96 | 12.73 |
Title (Relax) | QE_AP&SG | 23.58 | 6.30 | 11.62 | 11.04 | 13.53 |
Title (Relax) | QE_AP&GL | 23.48 | 6.13 | 11.78 | 11.35 | 14.16 |
Title (Relax) | QE_AP&BT | 23.54 | 6.23 | 11.90 | 11.31 | 14.12 |
Title (Rigid) | QE_ARCSC | 16.70 | 4.25 | 8.36 | 11.39 | 12.93 |
Title (Rigid) | QE_AP&SG | 19.81 | 7.29 | 10.98 | 13.66 | 14.76 |
Title (Rigid) | QE_AP&GL | 17.99 | 5.50 | 9.32 | 12.16 | 13.55 |
Title (Rigid) | QE_AP&BT | 18.08 | 5.59 | 9.66 | 12.65 | 14.05 |
Desc (Relax) | QE_ARCSC | 12.77 | 0.84 | 5.15 | 4.57 | -0.71 |
Desc (Relax) | QE_AP&SG | 14.9 | 2.67 | 7.08 | 6.38 | 1.08 |
Desc (Relax) | QE_AP&GL | 11.51 | -0.29 | 4.14 | 3.32 | -1.70 |
Desc (Relax) | QE_AP&BT | 16.38 | 4.06 | 8.50 | 7.91 | 2.49 |
Desc (Rigid) | QE_ARCSC | 14.06 | 1.10 | 4.05 | 3.81 | -0.99 |
Desc (Rigid) | QE_AP&SG | 17.61 | 4.33 | 7.32 | 7.28 | 2.02 |
Desc (Rigid) | QE_AP&GL | 15.16 | 1.86 | 4.98 | 4.66 | 0.00 |
Desc (Rigid) | QE_AP&BT | 19.11 | 5.70 | 8.63 | 8.72 | 3.20 |
Table 4 Average P@5 gain (%) of the proposed algorithms over the baseline retrieval (BLR) and the comparison algorithms on the 8 datasets
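P@5, the metric behind Table 4, is the fraction of the top five retrieved documents that are relevant. A minimal sketch of the standard definition follows; the function name and the toy ranking are illustrative assumptions rather than the evaluation code used for the paper.

```python
def precision_at_k(ranked_doc_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved documents that are relevant.
    The denominator is k even if fewer than k documents were retrieved."""
    relevant = set(relevant_ids)
    top_k = ranked_doc_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant) / k

# Toy ranking: d1 and d4 are relevant hits within the top 5; d6 is ranked too low.
p5 = precision_at_k(["d1", "d2", "d3", "d4", "d5", "d6"], {"d1", "d4", "d6"})
print(p5)
```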
Query | Algorithm | Relax | Rigid |
---|---|---|---|
No.4 | BLR | 0.1257 | 0.1051 |
No.4 | QE_WAPM | 0.2870 | 0.2316 |
No.4 | QE_WPNPM | 0.3722 | 0.2400 |
No.4 | QE_WMSM | 0.3299 | 0.2128 |
No.4 | QE_W2Vec | 0.4364 | 0.2887 |
No.4 | QE_APMCC | 0.3607 | 0.2033 |
No.4 | QE_AP&SG | 0.4426 | 0.2924 |
No.42 | BLR | 0.5000 | 0.5000 |
No.42 | QE_WAPM | 0.5833 | 0.5833 |
No.42 | QE_WPNPM | 0.5000 | 0.5000 |
No.42 | QE_WMSM | 0.5833 | 0.5833 |
No.42 | QE_W2Vec | 0.5000 | 0.5000 |
No.42 | QE_APMCC | 0.5833 | 0.5833 |
No.42 | QE_AP&SG | 0.7500 | 0.7500 |
Table 5 MAP comparison for example queries on Desc topics (UE0)