Image Data Augmentation for Deep Learning: A Survey

Funding: National Natural Science Foundation of China (62276127)

    Abstract:

    Deep learning has yielded remarkable achievements in many computer vision tasks. However, deep neural networks typically require a large amount of training data to prevent overfitting, while in practical applications labeled data may be extremely limited. Data augmentation has therefore become an effective way to improve the sufficiency and diversity of training data, and it is a necessary step in successfully applying deep learning models to image data. This study systematically reviews image data augmentation methods and proposes a new taxonomy that offers a fresh perspective on the field. The advantages and limitations of each category of methods are discussed, along with their underlying ideas and practical value. In addition, the survey presents the public datasets and evaluation metrics commonly used in three typical computer vision tasks (semantic segmentation, image classification, and object detection) and experimentally compares augmentation methods on these tasks. Finally, the current challenges and future development trends of data augmentation are discussed.
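    One family of methods the abstract alludes to, image mixing (e.g. Mixup), trains on convex combinations of sample pairs rather than raw samples. A minimal sketch of the idea, assuming flat images as lists of floats and one-hot labels (the function name and defaults are illustrative, not taken from the survey):

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Illustrative Mixup: blend two flat images and their one-hot labels
    with a coefficient lam drawn from Beta(alpha, alpha)."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam
```

    Because the blended label is a convex combination of two one-hot vectors, the result is a soft target that still sums to one, which is what lets the standard cross-entropy loss be used unchanged.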

Cite this article:

杨锁荣, 杨洪朝, 申富饶, 赵健. Image data augmentation for deep learning: A survey. Ruan Jian Xue Bao/Journal of Software, 2025, 36(3): 1390–1412 (in Chinese with English abstract).

History
  • Received: 2023-03-14
  • Revised: 2023-09-01
  • Published online: 2024-12-09