





The Target Detection Method of Aerial Photography Images with Improved SSD
Fund Project:

National Natural Science Foundation of China (61001158, 61272369, 61370070); Liaoning Provincial Natural Science Foundation of China (2014025003); Scientific Research Fund of Liaoning Provincial Education Department (L2012270); Science and Technology Innovation Foundation of Dalian (2018J12GX043); Key Research and Development Plan Program of Liaoning Province


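    In recent years, the rapid development of unmanned aerial vehicle (UAV) technology has made UAV ground target detection an important research direction in computer vision, and UAVs are widely applicable in scenarios such as military reconnaissance and traffic control. To address the low target resolution, large scale variation, fast camera motion, target occlusion, and illumination changes found in UAV scenes, an aerial target detection algorithm based on a residual network is proposed. Building on the SSD (single shot multibox detector) target detection algorithm, the base network is replaced with a residual network that has stronger representational ability; residual learning lowers the difficulty of training the network and improves detection accuracy, and a skip connection mechanism reduces the redundancy of the extracted features and resolves the performance degradation that appears as layers are added. In addition, to address the repeated detection of targets and the missed detection of small samples in the SSD algorithm, an aerial target detection algorithm based on feature fusion is proposed. The algorithm introduces a feature fusion mechanism across different classification layers, combining the low-level visual features and high-level semantic features of the network structure. Experimental results show that the algorithm performs well in both detection accuracy and real-time performance.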


    In recent years, the rapid development of unmanned aerial vehicle (UAV) technology has made UAV-based ground target detection an important research direction in computer vision. UAVs are widely used in military reconnaissance, traffic control, and other scenarios. However, UAV imagery poses several difficulties, including low target resolution, scale changes, environmental changes, multi-target interference, and complex backgrounds. To address these difficulties, this study builds on the original SSD (single shot multibox detector) target detection algorithm: it replaces the base network with a residual network that has stronger representational ability, and it uses residual learning to make the network easier to train and to improve detection accuracy. Introducing a skip connection mechanism reduces the redundancy of the extracted features and addresses the performance degradation that arises as the number of layers increases. The effectiveness of the algorithm is verified through comparative experiments. To address the repeated detection of targets and the missed detection of small samples in the original SSD algorithm, this study further proposes an aerial target detection algorithm based on feature fusion. By fusing information from different feature layers, the algorithm bridges the gap between low-level visual features and high-level semantic features in the network. Results show that the algorithm performs well in both detection accuracy and real-time performance.
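
    The two mechanisms named in the abstract, residual learning with a skip connection and fusion of low-level with high-level feature maps, can be illustrated with a toy sketch in plain Python. This is not the authors' network: the function names, the tiny list-based "layers", the nearest-neighbour upsampling, and the choice of channel-wise concatenation for fusion are all assumptions made for illustration, since the abstract does not specify how the fusion is carried out.

```python
# Toy illustration only. Every name and size below is hypothetical and is
# not taken from the paper; real detectors use convolutional layers.

def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]

def linear(weights, v):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def residual_block(v, w1, w2):
    """Compute relu(F(v) + v), where F is linear -> ReLU -> linear.

    The '+ v' term is the skip connection: the layers only have to
    learn the residual F(v) rather than the full mapping.
    """
    f = linear(w2, relu(linear(w1, v)))
    return relu([a + b for a, b in zip(f, v)])

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D map (list of rows)."""
    out = []
    for row in fmap:
        wide = [x for x in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out

def fuse(low, high):
    """Fuse a fine low-level map with a coarse high-level map.

    Assumption: the coarse map is upsampled to the fine resolution and
    the two are stacked as channels (concatenation).
    """
    up = upsample2x(high)
    if len(up) != len(low) or len(up[0]) != len(low[0]):
        raise ValueError("low-level map must be exactly 2x the high-level map")
    return [low, up]

# With all-zero weights F(v) = 0, so the block passes its input through.
zeros = [[0.0, 0.0], [0.0, 0.0]]
print(residual_block([1.0, 2.0], zeros, zeros))

low_level = [[0.0] * 4 for _ in range(4)]   # fine 4x4 map
high_level = [[1.0, 2.0], [3.0, 4.0]]       # coarse 2x2 map
print(len(fuse(low_level, high_level)))     # number of fused channels
```

    The zero-weight case shows why skip connections ease training of deeper networks: a block that has learned nothing still behaves as an identity mapping instead of degrading the signal. An element-wise sum could replace the concatenation in `fuse`; which variant the paper uses is not stated in the abstract.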

  • Received: 2018-07-20
  • Last revised: 2018-09-20
  • Published online: 2019-03-06
Copyright: Institute of Software, Chinese Academy of Sciences
Tel: 010-62562563  Fax: 010-62562533  Email: jos@iscas.ac.cn