gROC Curve Analysis Method Based on Discernible Granularity
Funding:

National Natural Science Foundation of China (60863010, 61163044); National Basic Research Program of China (973 Program) (2010CB334709); Jilin Province Science and Technology Development Plan (20090704)

    Abstract:

    The ROC curve is an important tool for model selection, but the uncertainty of the ROC curve affects the accuracy of model selection. Based on discernible granularity, this paper proposes the concepts of gROC and gAUC from the perspective of reflecting the uncertainty of scores, and discusses several properties of gROC theoretically. After presenting the algorithm, the paper uses the binormal model to test the reasonableness of gROC. On this basis, it proposes two model selection measures, λAUC and ρAUC, and verifies their efficiency on UCI data sets. Experimental results show that gROC can effectively reflect the uncertainty of the ROC curve, and that model selection methods based on λAUC and ρAUC outperform those based on AUC or sAUC. In some cases, gROC has a stronger ability to compare classifier performance.
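
    As background for the quantities named in the abstract, the sketch below computes the standard empirical AUC and the closed-form AUC under a binormal model. It does not implement gROC, gAUC, λAUC, or ρAUC, whose definitions are not given on this page. The function names, sample sizes, and distribution parameters are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def empirical_auc(pos_scores, neg_scores):
    """Standard AUC as the Mann-Whitney statistic.

    Counts the fraction of (positive, negative) pairs in which the
    positive example scores higher; tied pairs count as one half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def binormal_auc(mu_pos, sd_pos, mu_neg, sd_neg):
    """AUC when both classes' scores are normally distributed."""
    z = (mu_pos - mu_neg) / math.sqrt(sd_pos ** 2 + sd_neg ** 2)
    return NormalDist().cdf(z)


# Hypothetical binormal setting: positives ~ N(1, 1), negatives ~ N(0, 1).
rng = random.Random(0)
pos = [rng.gauss(1.0, 1.0) for _ in range(500)]
neg = [rng.gauss(0.0, 1.0) for _ in range(500)]

estimate = empirical_auc(pos, neg)          # varies with the sample drawn
exact = binormal_auc(1.0, 1.0, 0.0, 1.0)    # fixed by the model parameters
print(f"empirical AUC = {estimate:.3f}, binormal AUC = {exact:.3f}")
```

    The gap between the sampled estimate and the model value is one simple view of the ROC uncertainty that the paper sets out to capture; how gROC quantifies it is described in the full text, not here.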

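Cite this article

Dong Yuanfang (董元方), Li Xiongfei (李雄飞), Li Jun (李军), Zhao Haiying (赵海英). gROC curve analysis method based on discernible granularity. Journal of Software (软件学报), 2013,24(1):109-120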

History
  • Received: 2011-05-19
  • Last revised: 2012-03-19
  • Published online: 2012-12-29