Rare Category Detection Algorithm Based on k-Nearest Neighbor Graphs
Fund Project:

National Natural Science Foundation of China (61502347, 61272275, 61202033, 61070013, U1135005); Fundamental Research Funds for the Central Universities (2042015kf0038); Research Funds for Introduced Talents of Wuhan University

    Abstract:

    Rare category detection aims to find at least one data example for each class in an unlabeled data set in order to prove that the class exists. The focus is on rare classes (a.k.a. rare categories), which have only a few data examples. It has applications in fields such as financial fraud detection and network intrusion detection. However, existing approaches to this problem suffer either from high time complexity or from requiring prior information about the data set (e.g., the proportion of data examples in each class). In this paper, a prior-free and efficient algorithm, called KRED, is proposed for rare category detection. The algorithm exploits the changes in local data distribution caused by the presence of compact clusters of rare classes. To this end, it transforms a data set into a k-nearest neighbor graph and examines the variations in both edge lengths and in-degrees between nodes. Finally, the nodes with the maximal variations are selected as candidate data examples of rare classes. Experimental results show that KRED effectively improves the efficiency of discovering new classes in data sets and notably reduces the execution time.
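
    The idea of scoring nodes in a k-nearest neighbor graph by local variation can be illustrated with a minimal sketch. This is not the KRED algorithm itself: the abstract does not give its formulas, so the scoring rule below (in-degree gain plus relative shrinkage of mean edge length along each edge) and the names `knn_graph`, `variation_scores`, and `candidate_order` are assumptions made purely for illustration. The sketch uses a brute-force neighbor search and only the Python standard library.

```python
import math


def knn_graph(points, k):
    """For each point, return its k nearest neighbors as (distance, index) pairs.

    Brute-force search; ties are broken by the smaller neighbor index.
    """
    graph = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        graph.append(dists[:k])
    return graph


def variation_scores(points, k):
    """Hypothetical variation score per node (an assumption, not KRED's rule).

    For every directed edge i -> j, the target j is credited with
      (in_degree[j] - in_degree[i]) / k
      + (mean_len[i] - mean_len[j]) / (mean_len[i] + mean_len[j]),
    i.e. how much denser the neighborhood looks at j than at i.
    A node's score is the largest credit it receives, floored at 0.
    """
    graph = knn_graph(points, k)
    n = len(points)

    # In-degree: how many nodes list this node among their k nearest neighbors.
    in_degree = [0] * n
    for edges in graph:
        for _, j in edges:
            in_degree[j] += 1

    # Mean length of each node's outgoing kNN edges.
    mean_len = [sum(d for d, _ in edges) / k for edges in graph]

    scores = [0.0] * n
    for i, edges in enumerate(graph):
        for _, j in edges:
            deg_change = (in_degree[j] - in_degree[i]) / k
            denom = mean_len[i] + mean_len[j]
            len_change = (mean_len[i] - mean_len[j]) / denom if denom > 0 else 0.0
            scores[j] = max(scores[j], deg_change + len_change)
    return scores


def candidate_order(points, k):
    """Node indices sorted by descending variation score."""
    scores = variation_scores(points, k)
    return sorted(range(len(points)), key=lambda i: -scores[i])
```

    In a labeling loop, one would presumably query the label of the top-ranked node, then continue down the ranking until every class has been seen. Under this assumed scoring rule, the highest-scoring nodes tend to sit on the boundary of a compact group, where in-degree rises and edge length drops sharply relative to the surrounding points.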

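Citation:

Wang Song, Huang Hao, Yu Guo, Liang Nan, Wang Liwei, Sun Yueming. Rare Category Detection Algorithm Based on k-Nearest Neighbor Graphs. Journal of Software, 2016, 27(9): 2320-2331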

History
  • Received: December 1, 2014
  • Revised: March 10, 2015
  • Online: September 2, 2016