Domain-Specific Terms Extraction Based on Web Resource and User Behavior

    Abstract:

    Automatic domain-specific term extraction is an important task in natural language processing, with applications in domain-specific ontology construction, vertical search, text classification, class-based language modeling, etc. A Web page contains a lot of noise and irrelevant content, so extracting domain-specific terms from the original page is a challenging task. Unlike previous work, which relies on the original text of Web pages, this study focuses on the anchor text and query log history of pages. This strategy avoids the difficulty of extracting information from the original Web page and therefore improves term extraction performance. This paper proposes a novel term extraction algorithm based on the analysis of Web resources and user behavior. Different Web resources, including the page body, the anchor text of the page, and user query data, were used to extract domain-specific terms, and their performance was compared. Experimental results on large-scale Web data show that anchor text and user query data achieve better extraction results.
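
The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of the general idea of ranking candidate terms from short snippets such as anchor texts or queries. It assumes a simple smoothed log-ratio of domain frequency to background frequency; this scoring rule, the whitespace tokenization, and the names `term_counts`, `domain_scores`, and `top_terms` are illustrative assumptions, not the authors' algorithm.

```python
import math
from collections import Counter


def term_counts(snippets):
    """Count whitespace-separated, lowercased tokens across short snippets."""
    counts = Counter()
    for snippet in snippets:
        counts.update(snippet.lower().split())
    return counts


def domain_scores(domain_snippets, background_snippets, smoothing=1.0):
    """Score each domain token by a smoothed log frequency ratio.

    Assumption: a token is more domain-specific when its relative
    frequency in domain snippets exceeds that in background snippets.
    """
    dom = term_counts(domain_snippets)
    bg = term_counts(background_snippets)
    vocab_size = len(set(dom) | set(bg))
    dom_total = sum(dom.values()) + smoothing * vocab_size
    bg_total = sum(bg.values()) + smoothing * vocab_size
    return {
        term: math.log(
            ((dom[term] + smoothing) / dom_total)
            / ((bg[term] + smoothing) / bg_total)
        )
        for term in dom
    }


def top_terms(scores, k=3):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical anchor texts / queries, made up for illustration only.
    domain = ["cheap flight tickets", "flight booking", "airline flight deals"]
    background = ["cheap shoes", "weather today", "cheap deals online"]
    print(top_terms(domain_scores(domain, background)))
```

The same functions could be run separately on page body text, anchor text, and query logs to compare the resulting rankings, which mirrors the comparison the abstract describes.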

Get Citation

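Yan Xinglong, Liu Yiqun, Fang Qi, Zhang Min, Ma Shaoping, Ru Liyun. Domain-Specific Terms Extraction Based on Web Resource and User Behavior. Journal of Software, 2013, 24(9): 2089-2100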

History
  • Received: October 28, 2011
  • Revised: May 25, 2012
  • Online: September 07, 2013