Mining Exceptional Slices Based on Singular Value Decomposition
Authors: 遇辉, 马秀莉, 谭少华, 唐世渭, 杨冬青
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60473051 and 60473072

    Abstract:

    Slicing is one of the main operations of on-line analytical processing (OLAP) and plays an important role in decision-support applications. Because slicing by hand is inefficient and tends to overlook important information, this paper proposes an automatic, intelligent method for mining exceptional slices. The method uses singular value decomposition (SVD) to extract the distribution features of each slice's data, then applies distance-based outlier detection to the extracted singular-value features to find the exceptional slices. Experiments on both synthetic data and slice data from a real application show that the method is efficient and feasible.
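
The two-stage idea described in the abstract (singular-value features per slice, then distance-based outlier detection on those features) can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names `slice_features` and `distance_based_outliers`, the choice to keep the `k` largest singular values, the Euclidean metric, and the thresholds `p` and `d` are all assumptions made here. The outlier rule is one common distance-based definition: a point is flagged when at least a fraction `p` of the other points lie farther than `d` from it.

```python
import numpy as np


def slice_features(cube, k=3):
    """Return the k largest singular values of every 2-D slice along axis 0.

    `cube` is assumed to have shape (n_slices, rows, cols).
    """
    # np.linalg.svd works on stacks of matrices; with compute_uv=False it
    # returns only the singular values, sorted in descending order.
    singular_values = np.linalg.svd(cube, compute_uv=False)
    return singular_values[:, :k]


def distance_based_outliers(features, p=0.9, d=1.0):
    """Return indices of points with at least a fraction p of the other
    points lying farther than distance d away (Euclidean metric)."""
    diffs = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    n = len(features)
    # The distance of a point to itself is 0, so it never counts as "far".
    far_fraction = (dist > d).sum(axis=1) / (n - 1)
    return np.flatnonzero(far_fraction >= p)


if __name__ == "__main__":
    # Hypothetical data: 20 slices of an 8 x 6 grid with similar values,
    # one of which (index 7) is scaled to a very different magnitude.
    rng = np.random.default_rng(0)
    cube = rng.normal(10.0, 1.0, size=(20, 8, 6))
    cube[7] *= 3.0

    feats = slice_features(cube, k=3)
    print(distance_based_outliers(feats, p=0.9, d=20.0))
```

In this toy setup the scaled slice is the one expected to be flagged, because its leading singular value is roughly three times that of the others. In practice `k`, `p`, and `d` would need to be tuned to the data, and the pairwise distance matrix used here is quadratic in the number of slices.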

Cite this article

遇辉, 马秀莉, 谭少华, 唐世渭, 杨冬青. Mining exceptional slices based on singular value decomposition. Journal of Software (软件学报), 2005, 16(7): 1282-1288

History
  • Received: 2004-07-16
  • Last revised: 2005-03-11