Recency-bias-avoiding Partitioned Incremental Learning Based on Self-learning Mask
Authors: 姚红革, 邬子逸, 马姣姣, 石俊, 程嗣怡, 陈游, 喻钧, 姜虹
Author biographies:

姚红革 (1968-), male, Ph.D., associate professor; main research interests: artificial intelligence and computer vision. 邬子逸 (1996-), male, master's student; main research interests: meta reinforcement learning and few-shot class-incremental learning. 马姣姣 (1997-), female, master's student; main research interests: machine learning and computer vision. 石俊 (1972-), male, Ph.D., lecturer; main research interests: machine learning, computer vision, and UAV control. 程嗣怡 (1980-), male, Ph.D.; main research interests: machine learning and electronic countermeasures theory and technology. 陈游 (1983-), male, Ph.D., associate professor; main research interests: radar signal processing and information countermeasures theory. 喻钧 (1970-), female, M.S., professor; main research interests: image processing and pattern recognition, computer networks and information security, and wireless sensor networks. 姜虹 (1977-), female, Ph.D., associate professor; main research interests: software engineering and image processing.

Corresponding author:

邬子逸, E-mail: wuziyi@st.xatu.edu.cn

CLC number:

TP18


Abstract:

Forgetting is the biggest problem faced by artificial neural networks in incremental learning and is therefore called "catastrophic forgetting". Humans, in contrast, can continuously acquire new knowledge while retaining most of the frequently used old knowledge. This ability to keep learning incrementally with little forgetting is related to the partitioned learning structure and memory-replay capability of the human brain. To simulate this structure and capability, this study proposes a recency-bias-avoiding partitioned incremental learning method based on a self-learning mask, ASPIL for short. ASPIL comprises two stages, region isolation and region integration, which are alternately iterated to achieve continual incremental learning. First, a batch normalization (BN)-based sparse region isolation method is proposed to isolate the new learning process from existing knowledge and thereby avoid interfering with it. For region integration, a self-learning mask (SLM) and a dual-branch fusion (GBF) method are proposed: the SLM accurately extracts new knowledge and improves the network's adaptability to it, while GBF fuses the old and new knowledge to establish a unified, high-precision cognition. During training, a margin-loss regularization term is introduced to further take the old knowledge into account and avoid a preference for the new knowledge, i.e., the "recency bias". To evaluate the proposed method, systematic ablation experiments are conducted on the standard incremental learning datasets CIFAR-100 and miniImageNet, and the method is compared with a series of recent well-known methods. The experimental results show that the proposed method improves the memory ability of artificial neural networks and outperforms the latest well-known methods by more than 5.27% in average recognition rate.
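The full method is defined in the paper itself; as a rough aid to reading the abstract, the following PyTorch-style sketch illustrates what the three named mechanisms could look like in code. It is a minimal, hypothetical sketch that assumes BN denotes batch normalization and that the mask and margin terms act at the channel and logit level, respectively; the identifiers (bn_sparsity_penalty, ChannelMask, margin_regularizer) and all hyperparameters are assumptions for illustration, not the authors' implementation.

    # Illustrative sketch only (not the authors' ASPIL code): hypothetical components
    # for the three mechanisms named in the abstract.
    import torch
    import torch.nn as nn

    def bn_sparsity_penalty(model: nn.Module, lam: float = 1e-4) -> torch.Tensor:
        # L1 penalty on batch-normalization scale factors (network-slimming style).
        # Driving unused scales toward zero frees channels that later tasks can occupy,
        # which is one plausible reading of "BN sparse region isolation".
        terms = [m.weight.abs().sum() for m in model.modules()
                 if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d))]
        if not terms:
            return torch.zeros(())
        return lam * torch.stack(terms).sum()

    class ChannelMask(nn.Module):
        # A learnable soft channel gate, a simple stand-in for a "self-learning mask":
        # a steep sigmoid approximates a binary mask and can be thresholded at 0.5
        # after training to select the channels a task actually uses.
        def __init__(self, num_channels: int, steepness: float = 50.0):
            super().__init__()
            self.logits = nn.Parameter(torch.zeros(num_channels))
            self.steepness = steepness

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x is assumed to be a feature map of shape (N, C, H, W).
            gate = torch.sigmoid(self.steepness * self.logits)
            return x * gate.view(1, -1, 1, 1)

    def margin_regularizer(logits: torch.Tensor, labels: torch.Tensor,
                           num_old_classes: int, margin: float = 0.5) -> torch.Tensor:
        # For samples of old classes, penalize any new-class logit that exceeds the
        # true old-class logit by more than the margin: one simple way to discourage
        # the "recency bias" toward newly learned classes. Assumes at least one new class.
        old = labels < num_old_classes
        if not old.any():
            return logits.new_zeros(())
        old_logits = logits[old]                                    # (M, C_old + C_new)
        true_logit = old_logits.gather(1, labels[old].view(-1, 1))  # (M, 1)
        new_logits = old_logits[:, num_old_classes:]                # (M, C_new)
        return torch.clamp(new_logits - true_logit + margin, min=0.0).mean()

In a training loop, such terms would simply be added to the usual cross-entropy loss, e.g. loss = ce_loss + bn_sparsity_penalty(model) + margin_regularizer(logits, labels, num_old_classes); the paper's actual loss, mask binarization, and dual-branch fusion procedure are specified in the full text.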

Cite this article:

姚红革, 邬子逸, 马姣姣, 石俊, 程嗣怡, 陈游, 喻钧, 姜虹. Recency-bias-avoiding partitioned incremental learning based on self-learning mask. Ruan Jian Xue Bao/Journal of Software, 2024, 35(7): 3428-3453 (in Chinese with English abstract).

History
  • Received: August 31, 2022
  • Last revised: January 15, 2023
  • Published online: September 13, 2023
  • Publication date: July 6, 2024