Stealthy Backdoor Attack for Code Models
IEEE Transactions on Software Engineering ( IF 7.4 ) Pub Date : 2024-02-09 , DOI: 10.1109/tse.2024.3361661
Zhou Yang 1 , Bowen Xu 2 , Jie M. Zhang 3 , Hong Jin Kang 4 , Jieke Shi 1 , Junda He 1 , David Lo 1
Code models, such as CodeBERT and CodeT5, offer general-purpose representations of code and play a vital role in supporting downstream automated software engineering tasks. Most recently, code models were revealed to be vulnerable to backdoor attacks. A backdoored code model behaves normally on clean examples but produces pre-defined malicious outputs on examples injected with triggers that activate the backdoors. Existing backdoor attacks on code models use unstealthy and easy-to-detect triggers. This paper aims to investigate the vulnerability of code models to stealthy backdoor attacks. To this end, we propose Afraidoor (Adversarial Feature as Adaptive Backdoor). Afraidoor achieves stealthiness by leveraging adversarial perturbations to inject adaptive triggers into different inputs. We apply Afraidoor to three widely adopted code models (CodeBERT, PLBART, and CodeT5) and two downstream tasks (code summarization and method name prediction). We evaluate three widely used defense methods and find that Afraidoor is less likely to be detected by them than the baseline attacks are. More specifically, when the spectral signature is used as the defense, around 85% of Afraidoor's adaptive triggers bypass detection, whereas less than 12% of the triggers from previous work do. When no defense is applied, both Afraidoor and the baselines achieve almost perfect attack success rates. However, once a defense is applied, the attack success rates of the baselines decrease dramatically, while the success rate of Afraidoor remains high. Our findings expose security weaknesses in code models under stealthy backdoor attacks and show that state-of-the-art defense methods cannot provide sufficient protection. We call for more research efforts in understanding security threats to code models and developing more effective countermeasures.
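The abstract cites the spectral signature as one of the evaluated defenses. As background, this detector (due to Tran et al.) scores each training example by the squared projection of its (centered) learned representation onto the top singular direction of the representation matrix; poisoned examples tend to receive large scores. The sketch below is a minimal generic illustration of that scoring rule on synthetic vectors, not the paper's implementation or its code-model representations.

```python
import numpy as np

def spectral_signature_scores(reps):
    """Spectral-signature outlier scores: center the representation matrix,
    take its top right singular vector, and score each example by its
    squared projection onto that direction."""
    centered = reps - reps.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top_direction = vt[0]                     # top singular direction
    return (centered @ top_direction) ** 2    # one score per example

# Toy demo: 95 clean points plus 5 shifted "poisoned" points.
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(95, 8))
poisoned = rng.normal(0.0, 1.0, size=(5, 8)) + 6.0  # common shift = "trigger"
reps = np.vstack([clean, poisoned])

scores = spectral_signature_scores(reps)
flagged = np.argsort(scores)[-5:]  # the defense removes top-scoring examples
```

A stealthy attack like Afraidoor succeeds precisely when its triggered examples do not stand out along this top singular direction, so their scores mix with those of clean examples and the removal step misses them.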
