A Computational Model for Chinese Syntactic Structure Induction Based on Sentence Alignment


Home > Archive>Volume 18, Issue 3, 2007 >538-546

Author:
WANG Hou-Feng, WANG Bo

Abstract:

This paper introduces an unsupervised learning framework for Chinese syntactic structure based on sentence similarity. First, all sentence pairs in a Chinese sentence corpus are aligned, and each pair is partitioned into similar segments and different segments that occur alternately. Then, either the aligned similar segments or the aligned different segments are selected as potential constituent candidates, following a similarity-priority or a difference-priority strategy respectively. Because this selection step may introduce boundary friction, disambiguation is then carried out. Finally, syntactic structures are learned by inducing sentence constituents. To reduce word sparseness during this process, some words are replaced by classes in advance. Three forms of sentence unit are examined: sequences of words, sequences of POS (part-of-speech) tags, and sequences of words with POS tags. The learned syntactic structures are evaluated separately for each form. The results show that the difference-priority strategy performs better than the similarity-priority strategy, and that the F-scores exceed 46% for all three forms, with the best reaching 49.52%, which is better than previously reported results.
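The alignment and candidate-selection steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the alignment algorithm, so this example substitutes Python's standard `difflib.SequenceMatcher` as a stand-in; the function names `align_pair` and `select_candidates`, the `"same"`/`"diff"` labels, and the toy sentences are all assumptions made for illustration, not the authors' implementation. Boundary-friction disambiguation and word-class replacement are not modeled here.

```python
from difflib import SequenceMatcher


def align_pair(sent_a, sent_b):
    """Partition two token sequences into alternating similar/different segments.

    Returns a list of (kind, span_a, span_b) tuples, where kind is "same" or
    "diff" and each span is a half-open (start, end) token range.
    NOTE: SequenceMatcher is an assumed stand-in for the paper's aligner.
    """
    matcher = SequenceMatcher(None, sent_a, sent_b, autojunk=False)
    segments = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        kind = "same" if tag == "equal" else "diff"
        segments.append((kind, (i1, i2), (j1, j2)))
    return segments


def select_candidates(segments, priority="diff"):
    """Collect constituent candidates under one priority strategy.

    priority="diff" keeps the different segments (difference priority);
    priority="same" keeps the similar segments (similarity priority).
    Empty spans (pure insertions or deletions) are skipped.
    """
    spans_a, spans_b = [], []
    for kind, (i1, i2), (j1, j2) in segments:
        if kind != priority:
            continue
        if i2 > i1:
            spans_a.append((i1, i2))
        if j2 > j1:
            spans_b.append((j1, j2))
    return spans_a, spans_b


# Toy sentence pair (hypothetical, already word-segmented).
sent_a = ["我", "喜欢", "红", "苹果"]
sent_b = ["我", "喜欢", "青", "香蕉"]

segments = align_pair(sent_a, sent_b)
diff_a, diff_b = select_candidates(segments, priority="diff")
same_a, same_b = select_candidates(segments, priority="same")
```

Under difference priority, the mismatched spans (here tokens 2 to 4 of each sentence) become the constituent candidates; under similarity priority, the shared prefix is proposed instead. The same functions would work on POS-tag sequences or word/POS pairs, since they only compare list elements for equality.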

Key words: sentence alignment; unsupervised learning; boundary friction; similarity priority; difference priority; Chinese syntactic structure induction

Citation:

WANG Hou-Feng, WANG Bo. A computational model for Chinese syntactic structure induction based on sentence alignment. Journal of Software, 2007, 18(3): 538-546


History

Received: January 26, 2004
Revised: April 12, 2006

