RGB-D Salient Object Detection Based on Cross-modal Interactive Fusion and Global Awareness

Authors: Sun Fuming, Hu Xihang, Wu Jingyu, Sun Jing, Wang Fasheng

Funding: National Natural Science Foundation of China (61976042, 61972068); LiaoNing Revitalization Talents Program (XLYC2007023); Innovative Talents Support Program for Universities of Liaoning Province (LR2019020)
Abstract:

In recent years, RGB-D salient object detection methods have exploited the rich geometric structure and spatial position information in depth maps to achieve better performance than RGB-only models, and they have therefore attracted considerable attention from the academic community. However, existing RGB-D detection models still need further improvement in detection performance. The recently emerged Transformer excels at modeling global information, whereas convolutional neural networks (CNNs) excel at extracting local details. Effectively combining the strengths of the two to mine both global and local information is thus expected to improve the accuracy of salient object detection. To this end, this study proposes an RGB-D salient object detection method based on cross-modal interactive fusion and global awareness, which embeds a Transformer into U-Net so that the global attention mechanism and local convolution are combined for better feature extraction. First, with the U-Net encoder-decoder structure, multi-level complementary features are efficiently extracted and decoded level by level to generate the saliency map. Then, a Transformer module is employed to learn the global dependencies among high-level features and thereby enhance the feature representation, and a progressive upsampling fusion strategy is applied to its input to reduce the introduction of noise. Furthermore, to alleviate the negative impact of low-quality depth maps, a cross-modal interactive fusion module is designed to fuse cross-modal features. Finally, experimental results on five benchmark datasets demonstrate that the proposed method significantly outperforms other state-of-the-art methods.
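To make the global-awareness idea concrete, the following is a minimal PyTorch sketch, not the authors' released implementation: the class names, layer counts, and the omitted positional encoding are all assumptions. It shows one way to flatten the highest-level CNN features into tokens, run a Transformer encoder over them to capture long-range dependencies, and upsample the result progressively (2x per step) rather than in one large jump.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalAwarenessBottleneck(nn.Module):
    """Hypothetical Transformer bottleneck between a U-Net encoder and decoder.

    The highest-level CNN feature map is flattened into a token sequence,
    passed through a standard Transformer encoder to model global (long-range)
    dependencies, and folded back into a 2-D feature map. A positional
    encoding is omitted here for brevity.
    """

    def __init__(self, channels: int, num_layers: int = 4, num_heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=num_heads,
            dim_feedforward=4 * channels, batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C): one token per spatial site
        tokens = self.encoder(tokens)          # global self-attention over all positions
        return tokens.transpose(1, 2).reshape(b, c, h, w)


def progressive_upsample(x: torch.Tensor, num_steps: int) -> torch.Tensor:
    """Enlarge by 2x per step instead of one big jump, one way to limit the
    noise introduced when upsampling low-resolution features."""
    for _ in range(num_steps):
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
    return x


if __name__ == "__main__":
    feat = torch.randn(2, 64, 14, 14)           # high-level encoder feature
    feat = GlobalAwarenessBottleneck(64)(feat)  # enhanced with global context
    print(progressive_upsample(feat, 2).shape)  # torch.Size([2, 64, 56, 56])
```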

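The cross-modal interactive fusion step can be sketched in the same spirit. The toy module below is again an assumed design rather than the paper's actual module: a squeeze-and-excitation style gate computed from the RGB stream re-weights the depth channels, so that responses from a low-quality depth map can be suppressed before the two modalities are merged.

```python
import torch
import torch.nn as nn


class CrossModalInteractiveFusion(nn.Module):
    """Hypothetical fusion of same-resolution RGB and depth features.

    A channel-attention gate derived from the RGB stream scales the depth
    features, down-weighting unreliable depth responses, and the gated
    depth is then merged with RGB by addition and a 3x3 convolution.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # global RGB context
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                  # per-channel weights in (0, 1)
        )
        self.merge = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        depth = depth * self.gate(rgb)   # RGB decides how much to trust depth
        return self.merge(rgb + depth)   # residual-style cross-modal merge


if __name__ == "__main__":
    fuse = CrossModalInteractiveFusion(channels=64)
    rgb, depth = torch.randn(2, 64, 56, 56), torch.randn(2, 64, 56, 56)
    print(fuse(rgb, depth).shape)        # torch.Size([2, 64, 56, 56])
```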
Cite this article:

Sun FM, Hu XH, Wu JY, Sun J, Wang FS. RGB-D salient object detection based on cross-modal interactive fusion and global awareness. Ruan Jian Xue Bao/Journal of Software, 2024, 35(4): 1899-1913 (in Chinese with English abstract).
History:
  • Received: 2022-06-29
  • Revised: 2022-09-01
  • Online: 2023-06-14
  • Published: 2024-04-06