Image Style Transfer Based on StarGAN and Class Encoder
Authors:
Author biographies:

XU Xinzheng (1980-), male, PhD, professor, CCF senior member; his research interests include machine learning, data mining, and pattern recognition.
DING Shifei (1963-), male, PhD, professor, CCF distinguished member; his research interests include machine learning, data mining, and pattern recognition.
CHANG Jianying (1996-), female, master's student; her research interests include deep learning and computer vision.

Corresponding author:

XU Xinzheng, E-mail: xuxinzh@163.com

Funding:

National Natural Science Foundation of China (61976217, 61976216)


Image Style Transfer Based on StarGAN and Class Encoder

    Keywords: semi-supervised learning
    Abstract:

    Image style transfer technology has been widely integrated into people's lives and is applied in practical scenarios such as image artistry, cartoonization, image colorization, filter processing, and occlusion removal, so image style transfer has important research significance and application value. StarGAN is a generative adversarial network framework proposed in recent years for multi-domain image style transfer. StarGAN extracts features through simple down-sampling and then generates images through up-sampling; however, the background color information and the detailed facial features in the generated images differ considerably from those of the input images. In this study, after analyzing the existing problems of StarGAN, its network structure is improved and a UE-StarGAN model for image style transfer is proposed by introducing U-Net and an edge-promoting adversarial loss function. At the same time, a class encoder is introduced into the generator of UE-StarGAN, and a small-sample image style transfer model incorporating the class encoder is designed to realize image style transfer with small sample sizes. The experimental results show that the model can extract finer features and has certain advantages in the case of small sample sizes; the images obtained after style transfer improve in both qualitative and quantitative analyses, which verifies the effectiveness of the proposed model.
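The two architectural changes described above (a U-Net-style generator with skip connections, and a class encoder injected into the generator) can be illustrated with a minimal, non-trainable data-flow sketch. All operations below are toy numpy stand-ins (average pooling for a strided convolution, nearest-neighbour repeat for a transposed convolution, a channel mean for the final 1×1 projection), and the function names are illustrative assumptions, not the UE-StarGAN implementation:

```python
import numpy as np

def down(x):
    # Toy encoder step: 2x2 average pooling stands in for a strided convolution.
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def up(x):
    # Toy decoder step: nearest-neighbour repeat stands in for a transposed convolution.
    return x.repeat(2, axis=0).repeat(2, axis=1)

def class_code(label, shape, num_classes):
    # Stand-in for a learned class encoder: broadcast a one-hot target-domain
    # label into a constant feature map at the bottleneck resolution.
    code = np.zeros(shape[:2] + (num_classes,))
    code[..., label] = 1.0
    return code

def generator(x, label, num_classes=3):
    # U-Net-style data flow: encoder activations (skips) are concatenated
    # back into the decoder so fine detail can bypass the bottleneck.
    skip1 = x
    d1 = down(x)           # H/2 x W/2
    skip2 = d1
    d2 = down(d1)          # H/4 x W/4 bottleneck
    # Inject the target-domain class code at the bottleneck.
    b = np.concatenate([d2, class_code(label, d2.shape, num_classes)], axis=-1)
    u1 = np.concatenate([up(b), skip2], axis=-1)   # H/2, with skip connection
    u2 = np.concatenate([up(u1), skip1], axis=-1)  # H, with skip connection
    # Toy 1x1 projection back to 3 RGB channels.
    return u2.mean(axis=-1, keepdims=True).repeat(3, axis=-1)

# A 3-channel input comes back at the same spatial resolution.
img = np.random.rand(8, 8, 3)
styled = generator(img, label=1)
print(styled.shape)  # (8, 8, 3)
```

The point of the sketch is only the wiring: the class code enters at the bottleneck so a single generator can target multiple style domains, while the skip connections carry the background and facial detail that plain down-/up-sampling in the original StarGAN tends to lose.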

Cite this article:

Xu XZ, Chang JY, Ding SF. Image style transfer based on StarGAN and class encoder. Journal of Software, 2022, 33(4): 1516-1526 (in Chinese).
History
  • Received: 2021-06-01
  • Revised: 2021-07-16
  • Published online: 2021-10-26
  • Published in issue: 2022-04-06