Survey on Security of Deep Code Models
CLC number: TP311

Fund project: National Natural Science Foundation of China (61932012, 62372228)



Abstract:

With the remarkable success of deep learning in fields such as computer vision and natural language processing, software engineering researchers have begun to apply it to software engineering tasks. Existing studies show that deep learning outperforms traditional and classical machine learning approaches on a variety of code-related tasks, such as code retrieval and code summarization. Deep learning models trained for code-related tasks are collectively referred to as deep code models. However, like natural language processing and image processing models, deep code models face numerous security challenges due to the vulnerability and lack of interpretability of neural networks, and their security has become a research focus in software engineering. In recent years, researchers have proposed numerous attack and defense methods targeting deep code models, yet a systematic review of deep code model security is still lacking, which makes it difficult for newcomers to gain a quick overview of the field. To summarize the current research status, the open challenges, and the latest findings, this survey collects 32 relevant papers and divides existing work into two main categories: backdoor attack and defense techniques, and adversarial attack and defense techniques. The collected papers are systematically organized and summarized according to these two categories. The survey then summarizes the experimental datasets and evaluation metrics commonly used in this field. Finally, it analyzes the key challenges facing the field and feasible future research directions, aiming to provide useful guidance for researchers to further advance the security of deep code models.
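To make the adversarial category of this taxonomy concrete, the minimal sketch below shows a semantics-preserving identifier rename, one typical perturbation operator that adversarial attacks on deep code models search over. It is an illustrative assumption, not code from any of the surveyed papers; the function names and the example snippet are hypothetical.

# Minimal sketch (illustrative only): rename one identifier in a Python snippet
# without changing its behavior. An adversarial attack would choose the new name
# (e.g., by greedy or genetic search) so that a victim code model changes its
# prediction while the program's semantics stay identical.
import ast

class RenameIdentifier(ast.NodeTransformer):
    """Rename every occurrence of one local identifier in a parsed snippet."""
    def __init__(self, old_name: str, new_name: str):
        self.old_name = old_name
        self.new_name = new_name

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old_name:
            node.id = self.new_name
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        if node.arg == self.old_name:
            node.arg = self.new_name
        return node

def rename_identifier(source: str, old_name: str, new_name: str) -> str:
    """Return the snippet with old_name renamed; semantics are unchanged."""
    tree = ast.parse(source)
    tree = RenameIdentifier(old_name, new_name).visit(tree)
    return ast.unparse(tree)  # requires Python 3.9+

if __name__ == "__main__":
    snippet = "def add(a, b):\n    total = a + b\n    return total\n"
    print(rename_identifier(snippet, "total", "result_buffer"))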

Cite this article:

孙伟松, 陈宇琛, 赵梓含, 陈宏, 葛一飞, 韩廷旭, 黄胜寒, 李佳讯, 房春荣, 陈振宇. Survey on security of deep code models (深度代码模型安全综述). 软件学报 (Journal of Software): 1-28.

History
  • Received: 2023-12-18
  • Revised: 2024-02-12
  • Published online: 2024-12-09