A method of estimating an upper bound on the entropy of printed Chinese is presented. A bound of 5.17 bits per character is obtained by computing the entropy of a sample of a Chinese corpus. The perplexity of several language models, a quantitative measure of the capability of a language model, is discussed. A new method of approximating higher-order language models by lower-order ones is also presented.
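
The relationship between a per-character entropy estimate and perplexity can be illustrated with a minimal sketch. The unigram model, the add-one smoothing, and the function names `train_unigram`, `cross_entropy`, and `perplexity` are assumptions chosen for illustration; they are not the models or the estimation procedure used in this work.

```python
import math
from collections import Counter


def train_unigram(text, vocab_size):
    """Return an add-one-smoothed character probability function.

    Hypothetical helper: a unigram model stands in for whatever
    language model is being evaluated.
    """
    counts = Counter(text)
    total = len(text)

    def prob(ch):
        # Add-one smoothing so unseen characters get nonzero probability.
        return (counts[ch] + 1) / (total + vocab_size)

    return prob


def cross_entropy(prob, text):
    """Average negative log2-probability per character of `text`.

    The cross-entropy of a model on a sample is an upper-bound
    estimate of the entropy of the source, in bits per character.
    """
    return -sum(math.log2(prob(ch)) for ch in text) / len(text)


def perplexity(entropy_bits):
    """Perplexity corresponding to an entropy given in bits."""
    return 2.0 ** entropy_bits
```

Under this definition, an entropy of 5.17 bits per character corresponds to a perplexity of roughly 36, that is, the model is on average as uncertain as if it were choosing uniformly among about 36 characters.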