Online Class-Incremental Contrastive Learning Based on IMIU
Authors: 刘雨薇 (Liu Yuwei), 陈松灿 (Chen Songcan)
Affiliation: Nanjing University of Aeronautics and Astronautics

About the authors:

Liu Yuwei (1999-), female, master's student; her research interests include machine learning and pattern recognition. Chen Songcan (1962-), male, PhD, professor, doctoral supervisor, CCF senior member; his research interests include machine learning, pattern recognition, and neural computing.

Corresponding author:

Chen Songcan, E-mail: s.chen@nuaa.edu.cn

CLC number:

TP18

Fund projects:

National Natural Science Foundation of China (62076124); Postgraduate Research and Practice Innovation Program of Nanjing University of Aeronautics and Astronautics (xcxjh20221601)


Online Class Incremental Contrastive Learning Based on Incremental Mixup-induced Universum

    Abstract:

    Online class-incremental continual learning aims to learn new classes effectively in data-stream scenarios while keeping the model within small-cache and small-batch constraints. However, owing to the one-pass nature of data streams, the class information within a mini-batch cannot be exploited over multiple passes as in offline learning. To alleviate this problem, current studies typically apply multiple data augmentations combined with contrastive replay for model training. Nevertheless, under the small-cache and small-batch constraints, the existing strategy of randomly selecting and storing data is not conducive to obtaining diverse negative samples, which restricts the discriminability of the model. Previous studies have shown that hard negative samples are key to improving the performance of contrastive learning, yet this has rarely been explored in online learning scenarios. The class-ambiguous data proposed in Universum learning offers a simple and intuitive way to generate hard negative samples; in particular, Universum data induced by mixup with specific interpolation coefficients (mixup-induced Universum, MIU) has previously been shown to improve the performance of offline contrastive learning effectively. Inspired by this, this study attempts to introduce MIU into the online setting. Unlike the previously statically generated Universum, however, the data-stream scenario poses additional challenges. First, as the number of classes grows dynamically, a static Universum generated from a globally given set of classes is no longer applicable and must be redefined and generated dynamically. To this end, this study proposes to recursively generate, using only the current (local) data, MIU whose entropy relative to the seen classes is maximal (termed incremental MIU, IMIU), and to maintain it in an additional small cache so that the overall memory budget is still met. Second, the generated IMIU is mixed up again with the positive samples in the mini-batch to produce diverse and high-quality hard negative samples. Finally, combining the above steps, an incrementally mixup-induced Universum based online class-incremental contrastive learning (IUCL) algorithm is developed. Comparison experiments on the standard datasets CIFAR-10, CIFAR-100, and Mini-ImageNet verify the consistent effectiveness of the proposed algorithm.
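The recursive, maximum-entropy construction described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's actual algorithm: the function names, the 1/t equal-weight update rule, and the mixing coefficients are all assumptions chosen so that the running mixture stays uniform (hence maximum-entropy) over the classes seen so far, and so that a second mixup with an in-batch positive yields a sample that is close to the positive yet class-ambiguous.

```python
import numpy as np

def incremental_miu(u_prev, x_new, t):
    """Recursive Universum update when the t-th class arrives (t >= 1).

    Keeping an equal 1/t weight on a representative of every seen class
    makes the mixture uniform over seen classes, i.e. maximum-entropy,
    so it belongs to none of them. Hypothetical update rule.
    """
    if t == 1:
        return np.array(x_new, dtype=float)
    lam = 1.0 / t
    return (1.0 - lam) * u_prev + lam * x_new

def hard_negative(u, x_pos, alpha=0.5):
    """Mix the Universum point again with an in-batch positive to obtain
    a hard negative: near the positive sample, yet class-ambiguous."""
    return alpha * u + (1.0 - alpha) * x_pos

if __name__ == "__main__":
    # Toy demo with one-hot class "prototypes" standing in for images.
    protos = np.eye(3)
    u = None
    for t, x in enumerate(protos, start=1):   # classes arrive one at a time
        u = incremental_miu(u, x, t)
    print(u)                                  # uniform over the 3 seen classes
    print(hard_negative(u, protos[0]))        # biased toward class 0, still mixed
```

With three classes the recursion yields equal weights (1/3, 1/3, 1/3), and the second mixup with a class-0 positive produces a point dominated by class 0 but contaminated by the other seen classes, which is the kind of diverse hard negative the abstract describes.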

Cite this article:

Liu YW, Chen SC. Online class-incremental contrastive learning based on IMIU. Journal of Software, 2024, 35(12): 5544-5557 (in Chinese).
History
  • Received: 2023-07-25
  • Revised: 2023-09-16
  • Online: 2024-06-12
  • Published: 2024-12-06
Copyright: Institute of Software, Chinese Academy of Sciences
Address: 4 South Fourth Street, Zhongguancun, Haidian District, Beijing 100190
Tel: 010-62562563 Fax: 010-62562533 Email: jos@iscas.ac.cn