Open Source Software Classification Using Cost-Sensitive Multi-Label Learning

doi:10.13328/j.cnki.jos.004639


Volume 25, Issue 9, 2014, pp. 1982-1991


Abstract:

With the explosive growth of open source software, retrieving the desired software in open source communities has become a great challenge. Tagging open source software is usually a manual process in which each project is assigned several tags describing its functions and characteristics, so that users can find software by matching keywords against those tags. Because of its simplicity and convenience, tag-based software retrieval is widely used. However, manual tagging is expensive and time-consuming, and developers are often unwilling to tag their software sufficiently when uploading projects. Automatic software tagging, which assigns tags describing functions and characteristics based on the text descriptions that users provide for their projects, is therefore key to effective software retrieval. This article formalizes the problem as a multi-label learning problem and proposes a new multi-label learning method, ML-CKNN, which remains effective when the number of distinct tags is extremely large. By incorporating misclassification costs into multi-label learning, ML-CKNN addresses the class imbalance inherent in the problem: for each tag, the instances associated with it are far fewer than those not associated with it. Experiments on datasets from three open source software communities show that ML-CKNN provides high-quality tags for newly uploaded open source software and significantly outperforms existing methods.
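The abstract does not describe how ML-CKNN computes or applies its costs, so the following is only a minimal sketch of the general idea of cost-sensitive multi-label nearest-neighbor tagging, not the authors' algorithm. All function names, the whitespace tokenizer, the smoothed negative-to-positive cost ratio, and the toy data are assumptions made for illustration. Each tag is treated as its own binary decision, and a rare tag is assigned whenever the expected cost of missing it exceeds the expected cost of assigning it wrongly.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts for a project description."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tag_costs(train_tags):
    """Assumed cost of missing each tag: smoothed negatives / positives.

    The cost of assigning a tag wrongly is fixed at 1, so rare tags
    get a proportionally higher miss cost.
    """
    n = len(train_tags)
    counts = Counter(tag for tags in train_tags for tag in tags)
    return {tag: (n - pos + 1) / (pos + 1) for tag, pos in counts.items()}


def predict_tags(description, train_texts, train_tags, k=3):
    """Cost-sensitive k-nearest-neighbor vote, one decision per tag."""
    query = bag_of_words(description)
    vectors = [bag_of_words(text) for text in train_texts]
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine(query, vectors[i]),
                    reverse=True)
    neighbors = ranked[:k]
    predicted = set()
    for tag, miss_cost in tag_costs(train_tags).items():
        # Fraction of the k neighbors that carry this tag.
        share = sum(tag in train_tags[i] for i in neighbors) / len(neighbors)
        # Minimum expected cost: assign the tag when the expected cost of
        # missing it (miss_cost * share) exceeds that of a false tag.
        if miss_cost * share > 1.0 - share:
            predicted.add(tag)
    return predicted


# Toy data, invented purely for this example.
train_texts = [
    "lightweight http web server written in c",
    "fast web server and reverse proxy",
    "relational database engine with sql support",
    "embedded sql database library",
    "web framework for building http services",
    "image editing tool with filters",
]
train_tags = [
    {"web", "server"},
    {"web", "server", "proxy"},
    {"database", "sql"},
    {"database", "sql", "embedded"},
    {"web", "framework"},
    {"graphics"},
]

tags = predict_tags("reverse proxy and web server for http traffic",
                    train_texts, train_tags, k=3)
print(sorted(tags))
```

On this toy data an unweighted majority vote over the three neighbors would keep only the frequent tags, while the cost-weighted rule also assigns tags that appear on a single neighbor. That trades some precision for recall on rare tags, which is the kind of trade-off a cost-sensitive method is meant to control.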

Citation:

韩乐, 黎铭. Open Source Software Classification Using Cost-Sensitive Multi-Label Learning. Journal of Software (软件学报), 2014, 25(9): 1982-1991


History

Received: March 24, 2014
Revised: May 14, 2014
Online: September 9, 2014

