Center for Speech Interaction Information Technology, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100080, China
Print publication: 2004
王显芳, 杜利民. 一种能够检测所有交叉歧义的汉语分词算法[J]. 电子学报, 2004,32(1):50-54.
WANG Xian-fang, DU Li-min. A Method of Sentence Segmentation That Check All Overlapping Ambiguity[J]. Acta Electronica Sinica, 2004, 32(1): 50-54.
This paper presents a Chinese word segmentation algorithm that detects every overlapping ambiguity in a sentence. The algorithm is based on the "longer word first" segmentation principle. It avoids the exponential growth in the number of segmentation paths as the sentence gets longer, and it provides a way to handle covering ambiguity and overlapping ambiguity separately. The time complexity of the algorithm is O(N), where N is the length of the sentence. Its output greatly reduces the computation required by subsequent processing.
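The idea of detecting overlapping ambiguity with a longest-word-first scan can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in the abstract; it is one plausible reading under stated assumptions. The function names `longest_word_end` and `segment_blocks`, the toy lexicon, and the example sentence are all hypothetical. The sketch takes the longest lexicon word at each position and extends the current block whenever a word that starts inside the block reaches past its end, which is the crossing pattern that defines an overlapping ambiguity. With a bounded maximum word length, the scan visits each character a bounded number of times, so the work grows linearly with sentence length.

```python
def longest_word_end(sentence, start, lexicon, max_len):
    """Return the end index of the longest lexicon word starting at `start`.

    Falls back to a single character when no multi-character word matches.
    """
    limit = min(len(sentence), start + max_len)
    for end in range(limit, start + 1, -1):
        if sentence[start:end] in lexicon:
            return end
    return start + 1


def segment_blocks(sentence, lexicon):
    """Split `sentence` into blocks using a longest-word-first scan.

    Returns a list of (text, has_overlap) pairs. `has_overlap` is True when
    some word starting inside the block crosses the block's current end,
    i.e. the block contains an overlapping ambiguity. This is an illustrative
    sketch, not the published algorithm.
    """
    max_len = max((len(w) for w in lexicon), default=1)
    blocks = []
    i = 0
    while i < len(sentence):
        reach = longest_word_end(sentence, i, lexicon, max_len)
        has_overlap = False
        k = i + 1
        while k < reach:
            end_k = longest_word_end(sentence, k, lexicon, max_len)
            if end_k > reach:
                # A word starting inside the block crosses its boundary.
                reach = end_k
                has_overlap = True
            k += 1
        blocks.append((sentence[i:reach], has_overlap))
        i = reach
    return blocks


# Toy lexicon (an assumption for illustration only).
lexicon = {"我们", "结合", "合成", "成分", "分子"}
blocks = segment_blocks("我们结合成分子", lexicon)
```

In this toy run, "我们" forms an unambiguous block, while "结合成分子" is reported as a single block flagged for overlapping ambiguity, because "结合", "合成", "成分", and "分子" chain across one another. Covering ambiguity (shorter words nested inside a longer word) stays inside a block and could be resolved in a separate pass, which is in the spirit of the separation the abstract describes.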