Cost-sensitive Decision Tree Induction on Dirty Data
Author:
Affiliation:

Clc Number:

Fund Project:

National Natural Science Foundation of China (U1509216, 61472099); National Sci-Tech Support Plan (2015BAH10F01)

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    Cost-sensitive decision tree is a kind of decision tree which maximizes the sum of misclassification costs and test costs. Recently, with the explosive growth of data size, dirty data appears more frequently. In the process of cost-sensitive decision tree induction, dirty data in training datasets have negative impacts on selection of splitting attributes and division of decision tree nodes. Therefore, dirty data cleaning is necessary before classification tasks. Nevertheless, in practice, many users provide an acceptable threshold of data cleaning costs since time costs and expenses of data cleaning are expensive. Therefore, in addition to misclassification cost and test cost, data-cleaning cost is also an essential factor in cost-sensitive decision tree induction. However, existing researches have not considered data quality in the problem. To fill this gap, this study aims to focus on cost-sensitive decision tree induction on dirty data. Three decision tree induction methods integrated with data cleaning algorithms are presented. Experimental results demonstrate the effective of the proposed approaches.

    Reference
    Related
    Cited by
Get Citation

齐志鑫,王宏志,周雄,李建中,高宏.劣质数据上代价敏感决策树的建立.软件学报,2019,30(3):604-619

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:July 19,2018
  • Revised:September 20,2018
  • Adopted:
  • Online: March 06,2019
  • Published:
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063