Acta Electronica Sinica, 2018, Vol. 46, Issue (3): 607-613. DOI: 10.3969/j.issn.0372-2112.2018.03.014
ZHANG Xiong, CHEN Fu-cai, HUANG Rui-yang
Received: 2016-07-11
Revised: 2016-10-24
Online: 2018-03-25
Published: 2018-03-25
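Abstract: To address topic drift in the contextual information of entities, this paper proposes an entity disambiguation method based on the biterm topic model. The method takes into account that an entity carries different topics in different semantic contexts, and that other entities appearing in the same document can, to some extent, help determine what the entity to be disambiguated refers to. Borrowing the idea of constructing biterms, it builds biterms from named entities so that co-occurring entity relations are incorporated into the topic model, and on that basis it uses the Wikipedia knowledge base to perform semi-supervised disambiguation. Experiments on web text data verify the effectiveness of the proposed algorithm and show that the method improves entity disambiguation accuracy.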
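The abstract does not spell out how biterms are built from named entities, so the following is only a minimal sketch under a stated assumption: a biterm is taken to be an unordered pair of distinct entity mentions that co-occur in the same document, by analogy with the word biterms of the original biterm topic model. The function names `entity_biterms` and `count_biterms` and the toy corpus are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def entity_biterms(entities):
    """Return the unordered pairs of distinct entity mentions in one document.

    Assumption: every pair of distinct entities in the same document forms
    one biterm; the paper's actual construction may differ.
    """
    unique = sorted(set(entities))  # sort so each pair has a canonical order
    return list(combinations(unique, 2))


def count_biterms(documents):
    """Count entity biterms over a corpus, given one mention list per document."""
    counts = Counter()
    for entities in documents:
        counts.update(entity_biterms(entities))
    return counts


# Hypothetical toy corpus: each inner list holds the entity mentions of one document.
docs = [
    ["Apple", "Steve Jobs", "iPhone"],
    ["Apple", "orchard", "Apple"],
]
biterm_counts = count_biterms(docs)
```

In a setup like this, the ambiguous mention "Apple" is paired with different co-occurring entities in each document, which is the kind of signal a topic model over biterms could use to separate the two senses.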
ZHANG Xiong, CHEN Fu-cai, HUANG Rui-yang. Semi-supervised Entity Disambiguation Method Research Based on Biterm Topic Model[J]. Acta Electronica Sinica, 2018, 46(3): 607-613.