    Abstract:

    Locally linear embedding (LLE) depends heavily on whether the neighborhood graph faithfully reflects the underlying geometric structure of the data manifold. The neighborhoods constructed by existing approaches are topologically unstable and sensitive to noisy and sparse data sets. Based on the law of relative cognition, this paper presents the relative transformation, from which the relative space and the relative manifold are constructed. The relative transformation improves the distinguishability of data points and reduces the impact of noise and sparsity in the data. Neighborhoods determined in the relative space and on the relative manifold reflect the manifold structure more faithfully, and on this basis enhanced locally linear embedding algorithms are developed with significantly improved performance. The approach also improves speed. Experiments on challenging benchmark data sets validate the proposed approach.
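
    The abstract does not define the relative transformation, so the following is only a minimal sketch under a stated assumption: each point is re-represented by its vector of Euclidean distances to every point in the data set, and neighbors are then selected in that "relative space". The function names `pairwise_distances`, `relative_transform`, and `knn_in_relative_space` are hypothetical and are not taken from the paper.

```python
import numpy as np

def pairwise_distances(A):
    """Return the n x n matrix of Euclidean distances between rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def relative_transform(X):
    """Assumed form of the relative transformation: row i holds the
    distances from point i to all n points, giving an n-dimensional
    'relative space' representation of the data."""
    return pairwise_distances(X)

def knn_in_relative_space(X, k):
    """Pick each point's k nearest neighbors using distances measured
    between the relative-space representations, not the raw inputs."""
    R = relative_transform(X)
    D = pairwise_distances(R)
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbor
    return np.argsort(D, axis=1)[:, :k]

# Toy data: 50 random points in 3 dimensions (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
R = relative_transform(X)
neighbors = knn_in_relative_space(X, k=5)
```

    The resulting neighbor indices could be passed to a standard LLE weight-reconstruction step in place of neighbors found in the original space. Whether this matches the paper's exact construction of the relative manifold is not confirmed by the abstract.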

Get Citation

Wen Guihua, Lu Tinghui, Jiang Lijun, Wen Jun. Locally linear embedding based on relative manifold. Journal of Software, 2009,20(9):3476-2386

History
  • Received: November 06, 2007
  • Revised: March 14, 2008
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4