Cross-project Defect Prediction Method Based on Feature Transfer and Instance Transfer
Author:
Affiliation:

Clc Number:

Fund Project:

National Natural Science Foundation of China (61373012, 61202006, 91218302, 61321491); Open Fund of State Key Laboratory for Novel Software Technology (Nanjing University) (KFKT2016B18, KFKT2018B17); Natural Science Foundation of Jiangsu Province (BK20180695); State Scholarship Fund of China Scholarship Council (201806190172)

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    In real software development, a project, which needs defect prediction, may be a new project or maybe has less training data. A simple solution is to use training data from other projects (i.e., source projects) to construct the model, and use the trained model to perform prediction on the current project (i.e., target project). However, datasets among different projects may have large distribution difference. To solve this problem, a novel two phase cross-project defect prediction method FeCTrA is proposed, which considers both feature transfer and instance transfer. In the feature transfer phase, FeCTrA uses cluster analysis to select features, which have high distribution similarity between the source project and the target project. In the instance transfer phase, FeCTrA utilizes TrAdaBoost, which selects relevant instances from the source project when give some labeled instances in the target project. To verify the effectiveness of FeCTrA, Relink and AEEEM datasets are choosen as the experimental subjects and F1 as the performance measure. Firstly, it is found that FeCTrA outperforms single phase methods, which only consider feature transfer or instance transfer. Then after comparing with state-of-the-art baseline methods (i.e., TCA+, Peters filter, Burak filter, and DCPDP), the performance of FeCTrA improves 23%, 7.2%, 9.8%, and 38.2% on Relink dataset and the performance of FeCTrA improves 96.5%, 108.5%, 103.6%, and 107.9% on AEEEM dataset. Finally, the influence of factors in FeCTrA is analyzed and a guideline to effectively use this method is provided.

    Reference
    Related
    Cited by
Get Citation

倪超,陈翔,刘望舒,顾庆,黄启国,李娜.基于特征迁移和实例迁移的跨项目缺陷预测方法.软件学报,2019,30(5):1308-1329

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:August 28,2018
  • Revised:October 31,2018
  • Adopted:
  • Online: May 08,2019
  • Published:
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063