BIRD: Generalizable Backdoor Detection and Removal for Deep Reinforcement Learning.

This is a preliminary implementation for NeurIPS 2023 paper - BIRD: Generalizable Backdoor Detection and Removal for Deep Reinforcement Learning.

The code is based on and inspired by PA_AD attack and TrojDRL.

requirements

pip install -r requirements.txt

packages

pip install -e .

Usage example

Use the following commands to train an untargeted backdoored agent in PongNoFrameskip-v4:

python trainer_victim/train_trojan.py --attack_method untargeted --env-name PongNoFrameskip-v4 --save-dir path_to_saved_model --poison --color 100 --pixels_to_poison_h 10 --pixels_to_poison_v 10 --start_position 0,0 --budget 20000 --when_to_poison uniformly

The trainer_victim/train_trojan.py provides different arguments:

--attack_method: the backdoor attack method, targeted or untargeted
--color: pixel value for the path trigger
--pixels_to_poison_h: size of the trigger
--start_position: starting position of the trigger
--move: whether the trigger is moving or not

Use the following commands to perform trigger restoration on a victim agent in PongNoFrameskip-v4:

python trainer_victim/restore.py --use-value --max-adv --beta --reg_l1 1e-3 --smooth 1e-5 --init-alpha 100 --load --env-name PongNoFrameskip-v4 --victim-dir path_to_victim_agent

The trainer_victim/restore.py provides different arguments including:

--use-value: use the value network of the victim agent to restore the trigger
--beta: use Beta distribution as the generative model
--reg_l1: weights for the first regulazation $\mathcal{R}_1$
--smooth: weights for the second regulazation $\mathcal{R}_2$
--init-alpha: value for $\alpha$ of the Beta distribution

Use the following commands to perform backdoor removal on a backdoored agent in PongNoFrameskip-v4:

python trainer_victim/train_repair.py --topk 30 --kl --budget 500000 --save-interval 5000 --load --env-name PongNoFrameskip-v4 --victim-dir path_to_victim_agent --trigger-path path_to_trigger --save-dir path_to_repaired_agent

The trainer_victim/train_repair.py provides different arguments including:

--topk: the number of re-initialized neurons
--kl: whether add the constraint on KL divergence
--budget: total time steps for adding restored trigger
--victim-dir: the path to the backdoored agent to be repaired

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
code_bird		code_bird
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

BIRD: Generalizable Backdoor Detection and Removal for Deep Reinforcement Learning.

requirements

packages

Usage example

About

Uh oh!

Releases

Packages

Uh oh!

Languages

ucsb-mlsec/bird

Folders and files

Latest commit

History

Repository files navigation

BIRD: Generalizable Backdoor Detection and Removal for Deep Reinforcement Learning.

requirements

packages

Usage example

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages