South China Institute of Technology
Print publication: 1984
[1]石贵青,徐秉铮.汉字字频分布、最佳编码与输入问题[J].电子学报,1984(04):94-96.
Shi Gui-qing, Xu Bing-zheng. On the Frequency Distribution, Optimum Coding and Input Scheme for Chinese Characters[J]. Acta Electronica Sinica, 1984, (4): 94-96.
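Based on character-frequency statistics for the 3,129 distinct Chinese characters found in one million characters of scientific and technical text, this paper ranks the characters by frequency. For small rank n the frequency approximately follows a Zipf distribution, while for large n it tends toward an exponential distribution. From this distribution, the first-order entropy of Chinese characters and the average code length of their optimal code are obtained. Statistics on Hanyu Pinyin are then used to estimate the multiple-order entropy of Chinese characters and the entropy of Pinyin, and on that basis Pinyin input schemes are analyzed.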
The 3,129 distinct Chinese characters occurring in Chinese scientific publications totaling one million characters are assigned serial numbers according to their frequency of occurrence. For small serial numbers n, the frequency of occurrence approximately follows a Zipf distribution, while for large n the distribution approaches an exponential one. The first-order entropy and the average code length of the optimum code for Chinese characters are evaluated. The multiple-order entropy and the phonetic entropy of Chinese characters are also evaluated, and an input scheme is analyzed.
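The relationship between a rank-frequency distribution, its first-order entropy, and the average length of an optimum code can be illustrated with a minimal sketch. It does not use the paper's data: the probabilities below come from a purely synthetic Zipf law with exponent 1 over 3,129 symbols (an assumption that ignores the exponential tail reported for large n), Huffman coding is assumed as the optimum binary code, and all function names are hypothetical.

```python
import heapq
import math


def zipf_probabilities(num_symbols, exponent=1.0):
    """Synthetic Zipf law: p(n) proportional to 1 / n**exponent (assumption)."""
    weights = [1.0 / (n ** exponent) for n in range(1, num_symbols + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def first_order_entropy(probs):
    """First-order entropy in bits per symbol: H = -sum p * log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_average_length(probs):
    """Average code length of a binary Huffman code, in bits per symbol.

    Each merge of the two least probable nodes adds one bit to every
    symbol beneath them, so the average length equals the sum of the
    merged node probabilities over all merges.
    """
    heap = list(probs)
    heapq.heapify(heap)
    average = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        average += merged
        heapq.heappush(heap, merged)
    return average


# Synthetic stand-in for the ranked character frequencies (not real data).
probs = zipf_probabilities(3129)
entropy = first_order_entropy(probs)
avg_len = huffman_average_length(probs)
print(f"entropy = {entropy:.3f} bits, Huffman average length = {avg_len:.3f} bits")
```

For any distribution, the Huffman average length lies between the entropy and the entropy plus one bit, so the two printed values should be close. Real character statistics would replace `zipf_probabilities` with measured frequencies.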