SUN Guang-lu, WANG Xiao-long, LIU Bing-quan, et al. Statistical Chinese Chunking Model Based on Word Clustering Features[J]. Acta Electronica Sinica, 2008, 36(12): 2450-2453.
Statistical Chinese Chunking Model Based on Word Clustering Features
An entropy-based hierarchical word clustering algorithm is proposed, and the word clusters it generates are used as features in a Chinese chunking model. Based on words' chunk tags and the theory of entropy, a binary hierarchical clustering algorithm is applied to the words in a Chinese chunking corpus. An acceleration algorithm is employed to reduce the clustering time. With the recognition of named entities and factoids, the new Chinese chunking system is built on maximum entropy Markov models, with part-of-speech features replaced by the entropy-based word clustering features. Experimental results show that the algorithm increases the efficiency of word clustering, and that the entropy-based word clustering features effectively improve the performance of Chinese chunking.
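
The abstract does not give the clustering criterion in detail, so the following is only a minimal sketch of one plausible reading: a greedy bottom-up binary clustering that repeatedly merges the two clusters whose union adds the least count-weighted chunk-tag entropy. The function names (`entropy`, `merge_cost`, `cluster`), the merge criterion, and the toy tag counts are all assumptions made for illustration, not the authors' method, and the exhaustive pair search shown here is the slow baseline that an acceleration step would be meant to avoid.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(counts):
    """Shannon entropy (in bits) of a Counter of chunk-tag counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c)


def merge_cost(a, b):
    """Increase in count-weighted tag entropy if clusters a and b are merged.

    Assumed criterion: a low cost means the two clusters have similar
    chunk-tag distributions, so merging them loses little information.
    """
    merged = a + b
    return (sum(merged.values()) * entropy(merged)
            - sum(a.values()) * entropy(a)
            - sum(b.values()) * entropy(b))


def cluster(word_tag_counts):
    """Greedy bottom-up binary clustering; returns a nested-tuple tree."""
    # Each active cluster is a pair: (subtree, aggregated tag counts).
    active = [(word, Counter(tags)) for word, tags in word_tag_counts.items()]
    while len(active) > 1:
        # Exhaustive search over all pairs (no acceleration in this sketch).
        i, j = min(combinations(range(len(active)), 2),
                   key=lambda p: merge_cost(active[p[0]][1], active[p[1]][1]))
        (tree_a, counts_a), (tree_b, counts_b) = active[i], active[j]
        active = [c for k, c in enumerate(active) if k not in (i, j)]
        active.append(((tree_a, tree_b), counts_a + counts_b))
    return active[0][0]


if __name__ == "__main__":
    # Hypothetical chunk-tag counts per word, invented for illustration.
    toy_counts = {
        "the": {"B-NP": 10},
        "a": {"B-NP": 8},
        "run": {"B-VP": 9},
        "eat": {"B-VP": 7, "B-NP": 1},
    }
    print(cluster(toy_counts))
```

In a chunker, the path from the root of the resulting binary tree to a word could serve as a bit-string feature, truncated at different depths to obtain coarser or finer clusters in place of part-of-speech tags; that usage is likewise an assumption about how such clusters might be consumed.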