Abstract:With machine learning widely applied to the natural language processing (NLP) domain in recent years, the security of NLP tasks receives growing natural concerns. Existing studies found that small modifications in examples might lead to wrong machine learning predictions, which was also called adversarial attack. The textual adversarial attack can effectively reveal the vulnerability of NLP models for improvement. Nevertheless, existing textual adversarial attack methods all focus on designing complex adversarial example generation strategies with a limited improvement of success rate, and the highly invasive modifications bring the decline of textual quality. Thus, a simple and effective method with high adversarial example quality is in demand. To solve this problem, the sememe-level sentence dilution algorithm (SSDA) and the dilution pool construction algorithm (DPCA) are proposed from a new perspective of improving the process of adversarial attack. SSDA is a new process that can be freely embedded into the classical adversarial attack workflow. SSDA first uses dilution pools constructed by DPCA to dilute the original examples, then generates adversarial examples through those diluted examples. It can not only improve the success rate of any adversarial attack methods without any limit of datasets or victim models but also obtain higher adversarial example quality compared with the original method. Through the experiments of different datasets, dilution pools, victim models, and textual adversarial attack methods, it is successfully verified the improvement of SSDA on the success rate and proved that dilution pools constructed by DPCA can further enhance the dilution ability of SSDA. The experiment results demonstrate that SSDA reveals more vulnerabilities of models than classical methods, and DPCA can help SSDA to improve success rate with higher adversarial example quality.