Rare Category Detection Algorithm Based on Weighted Boundary Degree
Authors: Huang H, He QM, Chen Q, Qian F, He JF, Ma LH

    Abstract:

    This paper proposes CATION, an efficient rare category detection algorithm based on weighted boundary degree (WBD). Using WBD as its rare-category criterion, the algorithm relies on reverse k-nearest neighbors to find the boundary points of rare categories and selects the boundary points with the maximum WBD for labeling. Extensive experiments show that, compared with existing approaches, the algorithm avoids their limitations, discovers new categories in data sets significantly more efficiently, and reduces runtime.
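    The abstract does not define WBD, so the sketch below does not attempt to reproduce it. It only illustrates the general notion of reverse k-nearest neighbors that the algorithm builds on: the reverse k-nearest neighbors of a point are the points that count it among their own k nearest neighbors. The function names, the toy data, and the choice of Euclidean distance are assumptions made for illustration, not details taken from the paper.

```python
from math import dist

def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    # Sort by Euclidean distance; break ties by index for determinism.
    others.sort(key=lambda j: (dist(points[i], points[j]), j))
    return others[:k]

def reverse_knn(points, k):
    """Map each index i to the indices that have i among their k nearest neighbors."""
    rknn = {i: [] for i in range(len(points))}
    for j in range(len(points)):
        for i in k_nearest(points, j, k):
            rknn[i].append(j)
    return rknn

# Hypothetical toy data: a tight cluster of four points plus one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 6.0)]
rknn = reverse_knn(points, k=2)
counts = {i: len(neighbors) for i, neighbors in rknn.items()}
print(counts)
```

    In this toy data the distant point has no reverse neighbors, while the cluster points nearest to it gain one extra. How such counts are weighted into a boundary degree is specific to the paper and is not shown here.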

Citation:

Huang H, He QM, Chen Q, Qian F, He JF, Ma LH. Rare category detection algorithm based on weighted boundary degree. Journal of Software, 2012,23(5):1195-1206

History
  • Received: May 11, 2011
  • Revised: July 1, 2011
  • Online: April 29, 2012