Part of Speech Tagging Chinese Corpus Based on Statistics and Rules

Author: ZHANG Min, LI Sheng, ZHAO Tie-jun, ZHANG Yan-feng

Abstract:

This paper proposes an algorithm for automatically tagging the part of speech (POS) of Chinese words. The algorithm integrates statistical and rule-based techniques, giving priority to quantitative statistical analysis. Confidence intervals from parameter estimation are employed so that the high-accuracy statistical technique is applied first when tagging a corpus. The part of the corpus left untagged is then tagged by rules, and rules can also correct some of the errors made by the statistical step. Closed and open tests indicate that the algorithm achieves accuracies of 98.9% and 98.1%, respectively, when unknown words and segmentation errors are not taken into account.

Key words: Chinese, part-of-speech tagging, hidden Markov model, rule, confidence intervals
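
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the estimator, the confidence level, or the rule format, so everything below is an assumption. In particular, the sketch uses per-word tag frequencies with a Wilson score interval at the 95% level in place of the hidden Markov model named in the key words, and the names `wilson_interval`, `confident_tag`, `tag_sentence`, and `noun_after_de`, as well as the tag labels `r`, `u`, `n`, `v`, are hypothetical. It shows only the fill-in role of rules, not the correction of statistical errors.

```python
import math
from collections import Counter, defaultdict

Z_95 = 1.96  # assumed confidence level (two-sided 95%); not stated in the abstract


def wilson_interval(successes, trials, z=Z_95):
    """Wilson score interval for a binomial proportion."""
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


def train(tagged_sentences):
    """Count how often each word carries each tag in a hand-tagged corpus."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return counts


def confident_tag(word, counts):
    """Return the most frequent tag only if its interval lies above the runner-up's."""
    tag_counts = counts.get(word)
    if not tag_counts:
        return None  # unseen word: the statistical stage abstains
    total = sum(tag_counts.values())
    ranked = tag_counts.most_common(2)
    best_tag, best_n = ranked[0]
    second_n = ranked[1][1] if len(ranked) > 1 else 0
    best_low, _ = wilson_interval(best_n, total)
    _, second_high = wilson_interval(second_n, total)
    return best_tag if best_low > second_high else None


def noun_after_de(words, tags, i):
    """Hypothetical rule: a word right after the particle 的 is tagged as a noun."""
    if i > 0 and words[i - 1] == "的" and tags[i - 1] == "u":
        return "n"
    return None


def tag_sentence(words, counts, rules):
    """Stage 1: confident statistical tags. Stage 2: rules fill the remaining gaps."""
    tags = [confident_tag(w, counts) for w in words]
    for i, tag in enumerate(tags):
        if tag is None:
            for rule in rules:
                proposal = rule(words, tags, i)
                if proposal is not None:
                    tags[i] = proposal
                    break
    return tags


# Toy corpus (invented for illustration): 研究 is ambiguous between verb and noun.
corpus = ([[("我们", "r"), ("的", "u")]] * 30
          + [[("研究", "v")]] * 12
          + [[("研究", "n")]] * 10)
counts = train(corpus)
print(tag_sentence(["我们", "的", "研究"], counts, [noun_after_de]))
```

In this toy setup the statistical stage abstains on 研究 because the intervals of its two tags overlap, and the rule stage then assigns the tag from context. A word for which no rule fires is left untagged (`None`) rather than guessed.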

Get Citation

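ZHANG Min, LI Sheng, ZHAO Tie-jun, ZHANG Yan-feng. Part of Speech Tagging Chinese Corpus Based on Statistics and Rules. Journal of Software (软件学报), 1998, 9(2): 134-138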

History

Received: August 21, 1996
Revised: March 20, 1997

