Survey on Knowledge-based Zero-shot Visual Recognition
About the authors:

冯耀功 (1992-), male, Ph.D. candidate, CCF student member. His research interests include deep learning, computer vision, and zero-shot learning.
于剑 (1969-), male, Ph.D., professor, doctoral supervisor, CCF fellow. His research interests include artificial intelligence and machine learning.
桑基韬 (1985-), male, Ph.D., professor, doctoral supervisor, CCF senior member. His research interests include multimedia computing, Web data mining, and trustworthy machine learning.
杨朋波 (1993-), male, Ph.D. candidate, CCF student member. His research interests include deep learning, computer vision, and adversarial robustness.

Corresponding author:

于剑, E-mail: jianyu@bjtu.edu.cn

Funding:

National Key Research and Development Program of China (2017YFC1703506); National Natural Science Foundation of China (61632004, 61832002, 61672518); Fundamental Research Funds for the Central Universities (2020YJS030, 2018JBZ006, 2019JBZ110)



Abstract:

Zero-shot learning aims to recognize unseen classes by applying knowledge learned from seen classes. In recent years, the "data + knowledge driven" paradigm has become a new trend, yet in zero-shot tasks within computer vision, "knowledge" itself still lacks a unified and explicit definition. To address this, this survey starts from the perspective of knowledge and organizes the scope covered by the concept of "knowledge" in this field into three categories: primary knowledge, abstract knowledge, and external knowledge. Based on this definition and classification, current zero-shot learning work (mainly models for the image classification task) is reviewed and divided into zero-shot models based on primary knowledge, zero-shot models based on abstract knowledge, and zero-shot models that introduce external knowledge. The survey also describes the domain shift and hubness problems in this field and summarizes existing work with respect to these problems. Finally, it summarizes the datasets and knowledge bases commonly used for image classification, the evaluation criteria for image classification experiments, and the experimental results of representative models, and offers an outlook on future work.


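The core mechanism summarized in the abstract, transferring seen-class knowledge (here, class-level attribute vectors, a form of primary knowledge) to recognize unseen classes, can be illustrated with a minimal linear sketch in the spirit of attribute- and embedding-based methods. All data below are synthetic, and every attribute vector, dimension, and hyperparameter is illustrative rather than taken from any surveyed model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy knowledge: 6 seen and 2 unseen classes, each described by a
# 5-dimensional binary attribute vector shared across classes.
attrs_seen = np.array([[1, 0, 0, 0, 0],
                       [0, 1, 0, 0, 0],
                       [0, 0, 1, 0, 0],
                       [0, 0, 0, 1, 0],
                       [0, 0, 0, 0, 1],
                       [1, 1, 0, 0, 1]], dtype=float)
attrs_unseen = np.array([[1, 1, 1, 0, 0],
                         [0, 0, 0, 1, 1]], dtype=float)

# Synthetic image features: each sample is its class attribute vector
# pushed through a fixed linear map into a 20-d feature space, plus noise.
W_true = rng.normal(size=(5, 20))
y_seen = rng.integers(0, 6, size=600)
X_seen = attrs_seen[y_seen] @ W_true + 0.1 * rng.normal(size=(600, 20))

# Learn a linear projection V: feature space -> attribute space on seen
# classes only (ridge regression, closed form).
lam = 1e-2
A = attrs_seen[y_seen]  # per-sample attribute targets
V = np.linalg.solve(X_seen.T @ X_seen + lam * np.eye(20), X_seen.T @ A)

def predict_unseen(x):
    """Project a feature vector into attribute space and return the
    index of the nearest unseen-class prototype (cosine similarity)."""
    a = x @ V
    sims = attrs_unseen @ a / (
        np.linalg.norm(attrs_unseen, axis=1) * np.linalg.norm(a) + 1e-12)
    return int(np.argmax(sims))

# Evaluate on samples from classes never observed during training.
y_unseen = rng.integers(0, 2, size=100)
X_unseen = attrs_unseen[y_unseen] @ W_true + 0.1 * rng.normal(size=(100, 20))
acc = float(np.mean([predict_unseen(x) == y
                     for x, y in zip(X_unseen, y_unseen)]))
print(f"unseen-class accuracy: {acc:.2f}")
```

Because the synthetic features are a clean linear function of the attributes, the projection learned from seen classes transfers to unseen ones and accuracy lands well above the 50% chance level. Real models face the complications the survey covers: with real image features the projection is imperfect, which is precisely where the domain shift and hubness problems arise.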
    [180] Sung F, Yang Y, Zhang L, et al. Learning to compare:Relation network for few-shot learning. In:Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2018. 1199-1208.
    [181] Hu RL, Xiong C, Socher R. Correction networks:Meta-learning for zero-shot learning. In:Proc. of the ICLR. 2019. 1-12.
    [182] Akata Z, Perronnin F, Harchaoui Z, et al. Label-embedding for image classification. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2015,38(7):1425-1438.
    [183] Misra I, Gupta A, Hebert M. From red wine to red tomato:Composition with context. In:Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. 2017. 1792-1801.
    [184] Belkin M, Niyogi P, Sindhwani V. Manifold regularization:A geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 2006,7(85):2399-2434.
    [185] Vilnis L, McCallum A. Word representations via Gaussian embedding. arXiv preprint arXiv:1412.6623, 2014.
    [186] Micaelli P, Storkey AJ. Zero-shot knowledge transfer via adversarial belief matching. In:Advances in Neural Information Processing Systems. 2019. 9551-9561.
    [187] Reed S, Akata Z, Yan X, et al. Generative adversarial text to image synthesis. arXiv preprint arXiv:1605.05396, 2016.
    [188] Hendricks LA, Akata Z, Rohrbach M, et al. Generating visual explanations. In:Proc. of the European Conf. on Computer Vision. 2016. 3-19.
    [189] Mitchell T, Cohen W, Hruschka E, et al. Never-ending learning. Communications of the ACM, 2018,61(5):103-115.
    [190] Chen X, Shrivastava A, Gupta A. NEIL:Extracting visual knowledge from Web data. In:Proc. of the IEEE Int'l Conf. on Computer Vision. 2013. 1409-1416.
    [191] Li Y, Tarlow D, Brockschmidt M, et al. Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493, 2015.
    [192] Marino K, Salakhutdinov R, Gupta A. The more you know:Using knowledge graphs for image classification. arXiv preprint arXiv:1612.04844, 2016.
    [193] Low T, Borgelt C, Stober S, et al. The hubness phenomenon:Fact or artifact? In:Towards Advanced Data Analysis by Combining Soft Computing and Statistics. Berlin, Heidelberg:Springer-Verlag, 2013. 267-278.
    [194] Wah C, Branson S, Welinder P, et al. The Caltech-UCSD Birds-200-2011 dataset. Technical Report, 2011.
    [195] Patterson G, Hays J. SUN attribute database:Discovering, annotating, and recognizing scene attributes. In:Proc. of the 2012 IEEE Conf. on Computer Vision and Pattern Recognition. 2012. 2751-2758.
    [196] Nilsback ME, Zisserman A. Automated flower classification over a large number of classes. In:Proc. of the 2008 6th Indian Conf. on Computer Vision, Graphics and Image Processing. 2008. 722-729.
    [197] Xiao J, Hays J, Ehinger KA, et al. SUN database:Large-scale scene recognition from abbey to zoo. In:Proc. of the 2010 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition. 2010. 3485-3492.
    [198] Elhoseiny M, Saleh B, Elgammal A. Write a classifier:Zero-shot learning using purely textual descriptions. In:Proc. of the IEEE Int'l Conf. on Computer Vision. 2013. 2584-2591.
    [199] Deng J, Dong W, Socher R, et al. ImageNet:A large-scale hierarchical image database. In:Proc. of the 2009 IEEE Conf. on Computer Vision and Pattern Recognition. 2009. 248-255.
    [200] Chao WL, Changpinyo S, Gong B, et al. An empirical study and analysis of generalized zero-shot learning for object recognition in the wild. In:Proc. of the European Conf. on Computer Vision. 2016. 52-68.
    Appendix: Chinese references
    [113] Liu H, Zheng QH, Luo MN, Zhao HK, Xiao Y, Lü YZ. Zero-shot classification based on cross-domain adversarial learning. Journal of Computer Research and Development, 2019,56(12):2521-2535 (in Chinese).
    [123] Liu ZY, Sun MS, Lin YK, Xie RB. Knowledge representation learning:A review. Journal of Computer Research and Development, 2016,53(2):247-261 (in Chinese).
    [135] Cheng YH, Qiao X, Wang XS. Zero-shot image classification based on hybrid attributes. Acta Electronica Sinica, 2017,45(6):1462-1468 (in Chinese).
Cite this article:

Feng YG, Yu J, Sang JT, Yang PB. Survey on knowledge-based zero-shot visual recognition. Journal of Software, 2021,32(2):370-405 (in Chinese).
History
  • Received: 2020-07-03
  • Revised: 2020-08-11
  • Online: 2020-10-12
  • Published: 2021-02-06