Data sparseness is a major problem in head-driven parsing.
Cluster-based statistical language models are an important way to address sparse data.
Based on an analysis of classical smoothing techniques, this paper proposes a word clustering algorithm that uses mutual information and semantic dependency.
It also presents an absolute weighted difference method, which is used to construct a vari-gram language model with good predictive ability.
It then proposes an improved head-driven parsing model based on word clusters and the vari-gram model.
Experiments are conducted on the refined statistical parser.
The parser achieves 84.53% precision and 82.41% recall,
and its F-measure improves by 2.02% over the head-driven parsing model introduced by Collins.
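
For reference, precision P and recall R are commonly combined into a balanced F-measure, F = 2PR / (P + R). The abstract does not say which F-measure variant was used, so the sketch below assumes the balanced (F1) form; the function name `f_measure` and its `beta` parameter are illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta).

    With beta = 1.0 this is the balanced F1 score, 2PR / (P + R).
    Assumption: the paper's "F measure" is this balanced form.
    """
    if precision <= 0.0 or recall <= 0.0:
        return 0.0  # avoid division by zero; F is defined as 0 here
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall reported in the abstract, in percent.
f1 = f_measure(84.53, 82.41)
print(f"{f1:.2f}")
```

Under that assumption, the reported precision and recall correspond to an F-measure of roughly 83.46%.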