National Natural Science Foundation of China (61806005); University Synergy Innovation Program of Anhui Province (GXXT-2019-025, GXXT-2020-012, GXXT-2022-052); CCF-Ant Research Fund (CCF-AFSGRF20210003)
Multi-view outlier detection is a challenging topic in outlier detection because multi-view data have complex features. Multi-view data contain three types of outliers: class outliers, attribute outliers, and class-attribute outliers. Most early multi-view outlier detection methods rely on a clustering assumption, which makes outliers hard to detect when the data have no cluster structure. In recent years, many methods have replaced the clustering assumption with a multi-view consistent nearest-neighbor assumption, but they remain inefficient at detecting outliers in newly added data. In addition, most existing multi-view outlier detection methods are unsupervised, so outliers affect model learning and the methods perform poorly on datasets with high outlier rates. To address these issues, this study proposes an intra-view reconstruction and cross-view generation network (IRCGN) for efficient multi-view outlier detection. The network detects all three types of outliers and consists of two modules: intra-view reconstruction and cross-view generation. Trained on normal data, the proposed method fully captures the features of each view of the normal data and can reconstruct and generate the corresponding views well. A new outlier scoring method is also proposed, which computes an outlier score for each sample so that newly added data can be checked efficiently. Extensive experimental results show that the proposed method significantly outperforms existing methods. To the best of our knowledge, this is the first work to apply a deep model based on generative adversarial networks to multi-view outlier detection.
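The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea: fit models on normal samples, then score a new sample by how poorly each view is reconstructed from itself and generated from the other views. Everything here is an assumption for illustration. The class `NearestNeighborStandIn` is a hypothetical nearest-neighbor substitute for the trained networks (it is not the GAN-based model of the paper), and summing the unweighted Euclidean errors is an assumed way to combine the two terms.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class NearestNeighborStandIn:
    """Hypothetical stand-in for the trained networks, fit on normal data.

    `train` is a list of samples; each sample is a list holding one
    feature vector per view.
    """

    def __init__(self, train):
        self.train = train
        self.n_views = len(train[0])

    def _nearest(self, x, view):
        # Index of the normal training sample closest to x in this view.
        return min(range(len(self.train)),
                   key=lambda i: dist(self.train[i][view], x))

    def reconstruct(self, x, view):
        # Intra-view: map x to the closest normal pattern of the same view.
        return self.train[self._nearest(x, view)][view]

    def generate(self, x, src, dst):
        # Cross-view: predict view dst from view src via the matched sample.
        return self.train[self._nearest(x, src)][dst]


def outlier_score(model, sample):
    """Assumed score: intra-view reconstruction error plus cross-view
    generation error, summed over all views and ordered view pairs."""
    score = 0.0
    for v in range(model.n_views):
        score += dist(sample[v], model.reconstruct(sample[v], v))
        for u in range(model.n_views):
            if u != v:
                score += dist(sample[u], model.generate(sample[v], v, u))
    return score


# Toy normal data: two groups, each consistent across both views.
train = [
    [[0.0, 0.0], [0.0]],
    [[0.2, 0.1], [0.3]],
    [[5.0, 5.0], [10.0]],
    [[5.1, 4.9], [10.2]],
]
model = NearestNeighborStandIn(train)

samples = {
    "normal": [[0.1, 0.0], [0.1]],
    # Each view looks normal alone, but the two views disagree.
    "class_outlier": [[0.1, 0.0], [10.0]],
    # Both views are far from every normal pattern.
    "attribute_outlier": [[20.0, -20.0], [50.0]],
}
scores = {name: outlier_score(model, s) for name, s in samples.items()}
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

In this toy setup the normal sample receives the lowest score. The class outlier is penalized mainly by the cross-view generation term and the attribute outlier by both terms, which is the intuition behind combining the two modules. Because a new sample is scored with a fixed fitted model, no refitting is needed when data are added.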
Zheng Xiao, Wang Quanxin, Huang Jun. IRCGN: A generative network for efficient multi-view outlier detection. 软件学报 (Journal of Software), ,():1-16