The maximum entropy approach has proved expressive and effective for statistical language modeling, but it suffers from the computational expense of model building. An improved maximum entropy approach is proposed that uses mutual information, from information theory, to select features on the basis of a Z-test. The approach is applied to Chinese word sense disambiguation, and experiments show that it achieves higher efficiency and precision.
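The feature-selection idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes features and sense classes are observed as co-occurrence pairs, uses the standard pointwise mutual information formula, and applies a simple one-sided Z-test of the observed joint count against the count expected under independence. The threshold 1.96 and the helper names are illustrative choices.

```python
import math
from collections import Counter

def mutual_information(joint, feat_total, class_total, n):
    """Pointwise mutual information between a feature and a sense class."""
    return math.log2((joint * n) / (feat_total * class_total))

def z_score(joint, feat_total, class_total, n):
    """Z statistic comparing the observed co-occurrence count with the
    count expected if feature and class were independent."""
    expected = feat_total * class_total / n
    return (joint - expected) / math.sqrt(expected)

def select_features(pairs, z_threshold=1.96):
    """From (feature, class) observations, keep pairs whose co-occurrence
    is significantly above chance; rank survivors by mutual information."""
    n = len(pairs)
    feat, cls, joint = Counter(), Counter(), Counter()
    for f, c in pairs:
        feat[f] += 1
        cls[c] += 1
        joint[(f, c)] += 1
    selected = []
    for (f, c), jc in joint.items():
        if z_score(jc, feat[f], cls[c], n) > z_threshold:
            selected.append(((f, c), mutual_information(jc, feat[f], cls[c], n)))
    # Highest-MI features first; these would feed the maximum entropy model.
    return sorted(selected, key=lambda item: -item[1])
```

Only the features that survive the significance filter would then be handed to the maximum entropy trainer, which is where the efficiency gain over exhaustive feature induction would come from.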