

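1. School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China
2. Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences, Beijing 100081, China
3. School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China
4. Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences, Beijing 100081, China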
Published Online: 25 August 2018
Published: 2018
PAN Bo, YU Chong-chong, ZHANG Qing-chuan, et al. The Improved Model for word2vec Based on Part of Speech and Word Order[J]. Acta Electronica Sinica, 2018, 46(8): 1976-1982. DOI: 10.3969/j.issn.0372-2112.2018.08.024.
Part of speech (POS) is a basic element of natural language processing (NLP), and word order carries semantic and syntactic information; both are key information in natural language. How to combine the two effectively in a word embedding model is a current research focus. This paper proposes Structured word2vec on POS, which combines word order and POS information. The model is made aware of word position and order, and it also uses POS association information to establish the inherent syntactic relations between words within the context window. Structured word2vec on POS embeds words directionally according to their positions and jointly optimizes the word vectors and the POS-relevance weight matrix. Experiments on word analogy and word similarity tasks demonstrate the effectiveness of the proposed method.
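The abstract gives no equations, so the following is only an illustrative sketch of how a position-aware objective with POS weighting could be scored for one context window. It is not the authors' implementation. Every name here (`window_loss`, `out_by_offset`, `pos_weight`, `pos_tags`) is hypothetical, and three modelling choices are assumptions: one output matrix per relative offset, a full softmax over the vocabulary, and a scalar weight looked up by the (center tag, context tag) pair that scales each log-likelihood term.

```python
import math
import random


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix stored as a list of rows."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def window_loss(center, context, pos_tags, word_vecs, out_by_offset, pos_weight):
    """POS-weighted negative log-likelihood of one window (assumed form).

    center        -- id of the center word
    context       -- list of (offset, word_id) pairs, offset != 0
    pos_tags      -- pos_tags[word_id] is a tag id (assumption: one tag per word)
    word_vecs     -- input embeddings, one row per word
    out_by_offset -- dict: relative offset -> output matrix for that position
    pos_weight    -- pos_weight[center_tag][context_tag] is a scalar weight
    """
    v = word_vecs[center]
    loss = 0.0
    for offset, ctx in context:
        out = out_by_offset[offset]  # position-specific output matrix
        scores = [sum(a * b for a, b in zip(row, v)) for row in out]
        log_p = log_softmax(scores)[ctx]
        weight = pos_weight[pos_tags[center]][pos_tags[ctx]]
        loss -= weight * log_p
    return loss


# Toy usage with made-up sizes and tags.
rng = random.Random(0)
vocab_size, dim, n_tags = 5, 4, 2
word_vecs = init_matrix(vocab_size, dim, rng)
out_by_offset = {o: init_matrix(vocab_size, dim, rng) for o in (-2, -1, 1, 2)}
pos_weight = [[1.0] * n_tags for _ in range(n_tags)]
pos_tags = [0, 1, 0, 1, 1]
loss = window_loss(2, [(-1, 1), (1, 3)], pos_tags, word_vecs, out_by_offset, pos_weight)
```

Keeping a separate output matrix per offset is what makes the score depend on where the context word sits relative to the center word. In a trained model, the entries of `pos_weight` would be learned jointly with the embeddings rather than fixed at 1.0 as in this toy setup.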