针对黑盒智能语音软件的对抗样本生成方法
CSTR:
作者:
作者单位:

作者简介:

通讯作者:

吉顺慧,E-mail:shunhuiji@hhu.edu.cn

中图分类号:

TP311

基金项目:

国家自然科学基金(U21B2016, 61702159); 江苏省自然科学基金(BK20191297); 中央高校基本科研业务费


Adversarial Example Generation Method for Black Box Intelligent Speech Software
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    随着深度学习技术的成熟, 智能语音识别软件获得了广泛的应用, 存在于智能软件内部的各种深度神经网络发挥了关键性的作用. 然而, 最近的研究表明: 含有微小扰动的对抗样本会对深度神经网络的安全性和鲁棒性构成极大威胁. 研究人员通常将生成的对抗样本作为测试用例输入到智能语音识别软件中, 观察对抗样本是否会让软件产生错误判断, 从而采取防御方法来提高智能软件安全性和鲁棒性. 在对抗样本的生成中, 黑盒智能语音软件在生活中较为常见, 具有实际的研究价值, 而现有的生成方法却存在一定的局限性. 为此, 针对黑盒智能语音软件, 提出了一种基于萤火虫算法和梯度评估方法的目标对抗样本生成方法, 即萤火虫-梯度对抗样本生成方法. 针对设定的目标文本, 在原始的音频样本中不断加入干扰, 根据当前对抗样本的文本内容与目标文本之间的编辑距离, 选择使用萤火虫算法或梯度评估方法来优化对抗样本, 最终生成目标对抗样本. 为了验证方法的效果, 在常用的语音识别软件上, 使用公共语音数据集、谷歌命令数据集和LibriSpeech数据集这3种不同类型的语音数据集进行了实验评估, 并寻找志愿者进行对抗样本的质量评估. 实验表明, 提出的方法能有效提高目标对抗样本生成的成功率, 例如针对DeepSpeech语音识别软件, 在公共语音数据集上生成对抗样本的成功率相比对比方法提升了13%.

    Abstract:

    With the maturity of deep learning technology, intelligent speech recognition software has been widely used. Various deep neural networks in the intelligent software play a crucial role. Recent studies have shown that minor disturbances in adversarial examples significantly threaten the security and robustness of deep neural networks. Researchers usually take the generated adversarial examples as the test cases and input them into the intelligent speech recognition software to test whether the adversarial examples will make the software misjudge. And then defense methods are adopted to improve the security and robustness of intelligent software. For the adversarial example generation, black box intelligent speech software is more common in life and has practical research value. However, the existing generation methods have some limitations. Therefore, this study proposes a target adversarial example generation method for the black box speech software based on the firefly algorithm and gradient evaluation method, namely the firefly-gradient adversarial example generation method. With the set target text, disturbances are added to the original speech example. The firefly algorithm or gradient evaluation method is chosen to optimize the adversarial example according to the edit distance between the text of the current generated adversarial example and the target text so that the target adversarial example is generated finally. To verify the effectiveness of the method, this study conducts an experimental evaluation on common speech recognition software, using three different types of speech datasets: Common Speech dataset, Google Command dataset and LibriSpeech dataset, and looks for volunteers to evaluate the generated adversarial examples. Experimental results show that the proposed method can effectively improve the success rate of target adversarial example generation. For example, for the DeepSpeech speech recognition software, the success rate of generating adversarial examples on Common Speech datasets is 13% higher than that of the compared method.

    参考文献
    相似文献
    引证文献
引用本文

袁天昊,吉顺慧,张鹏程,蔡涵博,戴启印,叶仕俊,任彬.针对黑盒智能语音软件的对抗样本生成方法.软件学报,2022,33(5):1569-1586

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2021-08-08
  • 最后修改日期:2021-10-09
  • 录用日期:
  • 在线发布日期: 2022-01-28
  • 出版日期: 2022-05-06
文章二维码
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号