Event-fusion-based Spatial Attentive and Temporal Memorable Network for Video Deraining

    Abstract:

    In recent years, digital video capture equipment has been continuously upgraded. While improvements in image-sensor latitude and shutter speed have greatly enriched the variety of scenes that can be filmed, degradation factors such as rain streaks, produced by raindrops crossing the field of view at high speed, are now also more readily recorded. Dense rain streaks in the foreground occlude useful information in the background scene and hinder effective image acquisition, making video deraining an urgent problem. Previous video deraining methods rely on the information contained in conventional frames alone. However, owing to the physical limits of conventional image sensors and the constraints of the shutter mechanism, much optical information is lost during video acquisition, which degrades subsequent deraining. Exploiting the complementarity between event data and conventional video, together with the high dynamic range and high temporal resolution of event information, this study therefore proposes a video deraining network based on event-data fusion, spatial attention, and temporal memory. Three-dimensional alignment converts the sparse event stream into a representation matched to the image size, which is stacked with the input and fed to an event-image fusion module incorporating a spatial attention mechanism, so that spatial image information is extracted effectively. In addition, when processing consecutive frames, an inter-frame memory module exploits features from previous frames, and the network is finally constrained by three-dimensional convolutions and two loss functions. The proposed method is effective on publicly available datasets and meets the requirement of real-time video processing.
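    The abstract does not specify the exact three-dimensional alignment the network uses, but a common way to turn a sparse event stream into a dense tensor matching the image size is a voxel grid: events are scatter-added into temporal bins over the sensor plane. The sketch below illustrates that idea under those assumptions; the function name, bin count, and event layout are illustrative, not taken from the paper.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate a sparse event stream into a (num_bins, H, W) tensor.

    `events` is an (N, 4) array of (t, x, y, polarity) rows with
    polarity in {-1, +1}. Each event's polarity is added to the bin
    nearest its normalized timestamp, yielding a dense representation
    that matches the image size and can be stacked with video frames.
    """
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    # Normalize timestamps to [0, num_bins - 1] and round to the nearest bin.
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9) * (num_bins - 1)
    b = np.round(t_norm).astype(int)
    # Unbuffered scatter-add: repeated (bin, pixel) hits accumulate correctly.
    np.add.at(voxel, (b, y, x), p)
    return voxel

# Example: two positive events and one negative event on a 4x4 sensor.
events = np.array([
    [0.00, 1, 2, +1.0],
    [0.05, 1, 2, +1.0],
    [0.10, 3, 0, -1.0],
])
grid = events_to_voxel_grid(events, num_bins=3, height=4, width=4)
```

The resulting `(num_bins, H, W)` tensor can simply be concatenated with the rainy frame along the channel axis before entering a fusion module such as the event-image fusion block described above.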

Get Citation

Sun SQ, Ren WQ, Cao XC. Event-fusion-based spatial attentive and temporal memorable network for video deraining. Ruan Jian Xue Bao/Journal of Software, 2024, 35(5): 2220-2234 (in Chinese with English abstract).
History
  • Received: April 07, 2023
  • Revised: June 08, 2023
  • Online: September 11, 2023
  • Published: May 06, 2024
Copyright: Institute of Software, Chinese Academy of Sciences