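1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu 210023
2. Department of Computer Science and Technology, Nanjing University, Nanjing, Jiangsu 210023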
Print publication: 2015
YIN Cun-yan, HUANG Shu-jian, DAI Xin-yu, et al. Optimization of Chinese Word Segmentation in Named Entity Recognition and Word Alignment[J]. Acta Electronica Sinica, 2015, 43(8): 1481-1487. DOI: 10.3969/j.issn.0372-2112.2015.08.003.
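Chinese word segmentation results directly affect Chinese-English named entity recognition and alignment. This paper proposes a method for optimizing Chinese word segmentation in named entity recognition and alignment. Using the alignment information of entity words, the method first corrects the named entity recognition results, and then, based on the entity alignment results, adjusts segmentation granularity and corrects segmentation errors. The optimized segmentation lets as many bilingual named entities as possible correspond one-to-one, which in turn improves Chinese-English named entity translation extraction and statistical machine translation. Experimental results demonstrate the effectiveness of the proposed optimization method.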
Bilingual named entity recognition and alignment are important for many natural language processing tasks. Named entity translation can substantially improve the performance of systems such as statistical machine translation and cross-language information retrieval. The quality of Chinese word segmentation has a large impact on named entity (NE) recognition and bilingual NE extraction. Bilingual alignment information, in turn, provides cues for NE recognition and word segmentation. Accordingly, based on the characteristics of NE recognition, NE alignment, and word segmentation, this paper proposes an optimization algorithm for Chinese word segmentation. By correcting word segmentation errors and adjusting word segmentation granularity, the algorithm improves Chinese-English NE translation extraction and the performance of statistical machine translation. Experimental results on a Chinese-English news corpus show the effectiveness of the algorithm.
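The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way segmentation granularity could be adjusted from alignment links, not the authors' method. It assumes a hypothetical word aligner that returns `(zh_index, en_index)` pairs; the function name `merge_aligned_tokens` and the sample sentence are illustrative assumptions. Adjacent Chinese tokens that align to the same single English token are merged, so the pair becomes one-to-one.

```python
from typing import Dict, List, Set, Tuple


def merge_aligned_tokens(
    zh_tokens: List[str],
    links: List[Tuple[int, int]],
) -> List[str]:
    """Merge adjacent Chinese tokens that align only to the same English token.

    `links` holds (zh_index, en_index) pairs from a hypothetical word aligner.
    This is an illustrative heuristic, not the algorithm from the paper.
    """
    # Collect the English positions each Chinese token is aligned to.
    targets: Dict[int, Set[int]] = {}
    for zh_i, en_j in links:
        targets.setdefault(zh_i, set()).add(en_j)

    merged: List[str] = []
    prev_target = None  # single English index of the previous token, if any
    for i, tok in enumerate(zh_tokens):
        aligned = targets.get(i, set())
        # Only tokens with exactly one English counterpart are merge candidates.
        cur_target = next(iter(aligned)) if len(aligned) == 1 else None
        if merged and cur_target is not None and cur_target == prev_target:
            merged[-1] += tok  # coarsen granularity: join with previous token
        else:
            merged.append(tok)
        prev_target = cur_target
    return merged


# Over-segmented input: "大" and "学" both align to "University" (index 1).
zh = ["南京", "大", "学", "位于", "江苏"]
en = ["Nanjing", "University", "is", "located", "in", "Jiangsu"]  # for reference
links = [(0, 0), (1, 1), (2, 1), (3, 3), (4, 5)]
print(merge_aligned_tokens(zh, links))
```

In this sketch only many-to-one links trigger a merge; splitting a Chinese token that aligns to several English words, or correcting NE boundaries, would need additional rules that the abstract does not specify.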