Attackers API
ClassificationAttacker
BAEAttacker
class BAEAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(mlm_path='bert-base-uncased', k=50, threshold_pred_score=0.3, max_length=512, batch_size=32, replace_rate=1.0, insert_rate=0.0, device=None, sentence_encoder=None, filter_words=None)
BAE: BERT-based Adversarial Examples for Text Classification. Siddhant Garg, Goutham Ramakrishnan. EMNLP 2020.
This script is adapted from <https://github.com/LinyangLee/BERT-Attack> given the high similarity between the two attack methods.
This attacker supports the 4 attack methods (BAE-R, BAE-I, BAE-R/I, BAE-R+I) in the paper.
- Parameters
mlm_path (str) – The path to the masked language model. Default: ‘bert-base-uncased’
k (int) – The k most important words / sub-words to substitute for. Default: 50
threshold_pred_score (float) – Threshold used in substitute module. Default: 0.3
max_length (int) – The maximum length of an input sentence for BERT. Default: 512
batch_size (int) – The size of a batch of input sentences for BERT. Default: 32
replace_rate (float) – Replace rate. Default: 1.0
insert_rate (float) – Insert rate. Default: 0.0
device (Optional[torch.device]) – A computing device for BERT.
sentence_encoder – A sentence encoder used to calculate the semantic similarity of two sentences. Default: UniversalSentenceEncoder
filter_words (Optional[List[str]]) – A list of words that will be preserved during the attack procedure.
- Data Requirements
- Classifier Capacity
get_pred
get_prob
- Language
english
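The attack methods named above differ in where the masked language model is asked to predict a token: BAE-R replaces an existing token with a mask, while BAE-I inserts a mask to the left or right of a token. The following is a minimal sketch of those two masking steps only, not the library's implementation; the helper names and the `MASK` constant are hypothetical, and how `replace_rate` and `insert_rate` select between the methods is not shown.

```python
MASK = "[MASK]"  # assumed mask token, as used by BERT-style models


def mask_for_replace(tokens, i):
    """BAE-R style: replace the token at position i with a mask."""
    return tokens[:i] + [MASK] + tokens[i + 1:]


def mask_for_insert(tokens, i, side="left"):
    """BAE-I style: insert a mask to the left or right of position i."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    j = i if side == "left" else i + 1
    return tokens[:j] + [MASK] + tokens[j:]


tokens = ["the", "food", "was", "good"]
print(mask_for_replace(tokens, 3))         # ['the', 'food', 'was', '[MASK]']
print(mask_for_insert(tokens, 3, "left"))  # ['the', 'food', 'was', '[MASK]', 'good']
```

In a full attack, the masked sequence would be passed to the masked language model at `mlm_path`, and its top predictions for the mask position would serve as candidate perturbations.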
BERTAttacker
class BERTAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(mlm_path='bert-base-uncased', k=36, use_bpe=True, sim_mat=None, threshold_pred_score=0.3, max_length=512, device=None, filter_words=None)
BERT-ATTACK: Adversarial Attack Against BERT Using BERT. Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue, Xipeng Qiu. EMNLP 2020.
- Parameters
mlm_path (str) – The path to the masked language model. Default: ‘bert-base-uncased’
k (int) – The k most important words / sub-words to substitute for. Default: 36
use_bpe (bool) – Whether to use BPE. Default: True
sim_mat (Union[None, bool, OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – Whether to use cosine similarity to filter out antonyms. Keep None to not use a sim_mat.
threshold_pred_score (float) – Threshold used in substitute module. Default: 0.3
max_length (int) – The maximum length of an input sentence for BERT. Default: 512
device (Optional[torch.device]) – A computing device for BERT.
filter_words (Optional[List[str]]) – A list of words that will be preserved during the attack procedure.
- Classifier Capacity
get_pred
get_prob
DeepWordBugAttacker
class DeepWordBugAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(token_unk='<UNK>', scoring='replaceone', transform='homoglyph', power=5, tokenizer=None)
Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. Ji Gao, Jack Lanchantin, Mary Lou Soffa, Yanjun Qi. IEEE SPW 2018.
- Parameters
token_unk – The token id or the token name for out-of-vocabulary words in victim model. Default: "<UNK>"
scoring – Scoring function used to compute word importance, must be one of the following: ["replaceone", "temporal", "tail", "combined"]. Default: replaceone
transform – Transform function to modify a word, must be one of the following: ["homoglyph", "swap"]. Default: homoglyph
power – Max words to replace. Default: 5
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
- Classifier Capacity
get_pred
get_prob
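As a rough illustration of two of the options above, the sketch below scores each word by how much the classifier's probability drops when that word alone is replaced by the unknown token, then swaps two adjacent characters in the highest-scoring word. This reading of "replaceone" and "swap" follows the cited paper and is an assumption about the library; `replace_one_scores`, `swap_adjacent`, and `toy_prob` are hypothetical names, and `toy_prob` stands in for a real victim model.

```python
def replace_one_scores(tokens, prob_fn, token_unk="<UNK>"):
    """Score each word by the probability drop when it is replaced by token_unk."""
    base = prob_fn(tokens)
    scores = []
    for i in range(len(tokens)):
        perturbed = tokens[:i] + [token_unk] + tokens[i + 1:]
        scores.append(base - prob_fn(perturbed))
    return scores


def swap_adjacent(word, i):
    """Swap the characters at positions i and i + 1; return word unchanged if out of range."""
    if not 0 <= i < len(word) - 1:
        return word
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def toy_prob(tokens):
    """Toy stand-in for a victim model: probability of the positive class."""
    return 0.9 if "good" in tokens else 0.2


tokens = ["a", "good", "movie"]
scores = replace_one_scores(tokens, toy_prob)
target = max(range(len(tokens)), key=scores.__getitem__)  # most important word
tokens[target] = swap_adjacent(tokens[target], 2)
print(tokens)  # ['a', 'godo', 'movie']
```

The `power` parameter would bound how many words are modified this way; the sketch modifies only one.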
FDAttacker
class FDAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(substitute=None, tokenizer=None, token_unk='<UNK>', max_iter=100, lang=None, filter_words=None)
Crafting Adversarial Input Sequences For Recurrent Neural Networks. Nicolas Papernot, Patrick McDaniel, Ananthram Swami, Richard Harang. MILCOM 2016.
- Parameters
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
token_unk (str) – The token id or the token name for out-of-vocabulary words in the victim model. Default: "<UNK>"
max_iter (int) – Maximum number of iterations in the attack procedure. Default: 100
lang (Optional[str]) – The language used in the attacker. If None, the attacker will automatically select the language based on other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved during the attack procedure.
- Classifier Capacity
get_pred
get_grad
get_embedding
GANAttacker
class GANAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(gan_dataset='sst')
Generating Natural Adversarial Examples. Zhengli Zhao, Dheeru Dua, Sameer Singh. ICLR 2018.
- Parameters
gan_dataset (str) – The name of the dataset on which the GAN model is trained. Must be one of the following: ["sst", "snli"]. Default: sst
- Language
english
- Classifier Capacity
get_pred
GEOAttacker
GeneticAttacker
class GeneticAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(pop_size=20, max_iters=20, tokenizer=None, substitute=None, lang=None, filter_words=None)
Generating Natural Language Adversarial Examples. Moustafa Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani Srivastava, Kai-Wei Chang. EMNLP 2018.
- Parameters
pop_size (int) – Genetic algorithm population size. Default: 20
max_iters (int) – Maximum generations of genetic algorithm. Default: 20
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute
lang – The language used in the attacker. If None, the attacker will automatically select the language based on other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved during the attack procedure.
- Classifier Capacity
get_pred
get_prob
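To make the roles of `pop_size` and `max_iters` concrete, here is a small, generic genetic search over word substitutions. It is a sketch under simplifying assumptions (truncation selection, uniform crossover, one-word mutation, a fixed random seed) and is not the selection scheme of the paper or of the library. The names `genetic_search`, `substitutes`, and `changed_words` are hypothetical, and the toy fitness stands in for a score derived from the victim model.

```python
import random


def genetic_search(tokens, candidates, fitness, pop_size=20, max_iters=20, seed=0):
    """Search for a high-fitness rewrite of tokens using substitutions from candidates."""
    rng = random.Random(seed)
    positions = [i for i, tok in enumerate(tokens) if candidates.get(tok)]

    def mutate(sentence):
        # Replace one substitutable word with a randomly chosen candidate.
        child = list(sentence)
        if positions:
            i = rng.choice(positions)
            child[i] = rng.choice(candidates[tokens[i]])
        return child

    def crossover(a, b):
        # Uniform crossover: take each word from either parent.
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    population = [mutate(tokens) for _ in range(pop_size)]
    best = max(population + [list(tokens)], key=fitness)
    for _ in range(max_iters):
        scored = sorted(population, key=fitness, reverse=True)
        if fitness(scored[0]) > fitness(best):
            best = scored[0]
        parents = scored[: max(2, pop_size // 2)]  # keep the fitter half
        population = [best] + [
            mutate(crossover(rng.choice(parents), rng.choice(parents)))
            for _ in range(pop_size - 1)
        ]
    return max(population + [best], key=fitness)


original = ["the", "movie", "was", "good"]
substitutes = {"movie": ["film"], "good": ["fine", "decent"]}


def changed_words(sentence):
    """Toy fitness: number of words that differ from the original."""
    return sum(a != b for a, b in zip(sentence, original))


best = genetic_search(original, substitutes, changed_words, pop_size=6, max_iters=5)
print(best)
```

A larger `pop_size` explores more candidates per generation, and a larger `max_iters` allows more generations, both at the cost of more queries to the victim model.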
HotFlipAttacker
class HotFlipAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(substitute=None, tokenizer=None, filter_words=None, lang=None)
HotFlip: White-Box Adversarial Examples for Text Classification. Javid Ebrahimi, Anyi Rao, Daniel Lowd, Dejing Dou. ACL 2018.
- Parameters
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute
filter_words (Optional[List[str]]) – A list of words that will be preserved during the attack procedure.
lang – The language used in the attacker. If None, the attacker will automatically select the language based on other parameters.
- Classifier Capacity
get_pred
get_prob
PSOAttacker
class PSOAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(pop_size=20, max_iters=20, tokenizer=None, substitute=None, filter_words=None, lang=None)
Word-level Textual Adversarial Attacking as Combinatorial Optimization. Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu and Maosong Sun. ACL 2020.
- Parameters
pop_size (int) – Population size of the PSO algorithm. Default: 20
max_iters (int) – Maximum generations of the PSO algorithm. Default: 20
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute
lang – The language used in the attacker. If None, the attacker will automatically select the language based on other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved during the attack procedure.
- Classifier Capacity
get_pred
get_prob
PWWSAttacker
class PWWSAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(tokenizer=None, substitute=None, token_unk='<UNK>', filter_words=None, lang=None)
Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency. Shuhuai Ren, Yihe Deng, Kun He, Wanxiang Che. ACL 2019.
- Parameters
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute
token_unk (str) – The token id or the token name for out-of-vocabulary words in the victim model. Default: "<UNK>"
lang – The language used in the attacker. If None, the attacker will automatically select the language based on other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved during the attack procedure.
- Classifier Capacity
get_pred
get_prob
SCPNAttacker
class SCPNAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(templates=['( ROOT ( S ( NP ) ( VP ) ( . ) ) ) EOP', '( ROOT ( S ( VP ) ( . ) ) ) EOP', '( ROOT ( NP ( NP ) ( . ) ) ) EOP', '( ROOT ( FRAG ( SBAR ) ( . ) ) ) EOP', '( ROOT ( S ( S ) ( , ) ( CC ) ( S ) ( . ) ) ) EOP', '( ROOT ( S ( LST ) ( VP ) ( . ) ) ) EOP', '( ROOT ( SBARQ ( WHADVP ) ( SQ ) ( . ) ) ) EOP', '( ROOT ( S ( PP ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP', '( ROOT ( S ( ADVP ) ( NP ) ( VP ) ( . ) ) ) EOP', '( ROOT ( S ( SBAR ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP'], device=None, tokenizer=None, parser=None)
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks. Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer. NAACL-HLT 2018.
- Parameters
templates (List[str]) – A list of templates used in SCPNAttacker. Default: ten manually selected templates.
device (Optional[torch.device]) – The device to load SCPN models (pytorch). Default: Use “cpu” if cuda is not available else “cuda”.
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
parser (Optional[OpenAttack.text_process.constituency_parser.base.ConstituencyParser]) – A constituency parser.
- Language
english
- Classifier Capacity
get_pred
The default templates are:
DEFAULT_TEMPLATES = [
    '( ROOT ( S ( NP ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( S ( VP ) ( . ) ) ) EOP',
    '( ROOT ( NP ( NP ) ( . ) ) ) EOP',
    '( ROOT ( FRAG ( SBAR ) ( . ) ) ) EOP',
    '( ROOT ( S ( S ) ( , ) ( CC ) ( S ) ( . ) ) ) EOP',
    '( ROOT ( S ( LST ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( SBARQ ( WHADVP ) ( SQ ) ( . ) ) ) EOP',
    '( ROOT ( S ( PP ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( S ( ADVP ) ( NP ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( S ( SBAR ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP'
]
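When supplying custom templates, it may help to check them before constructing the attacker. The sketch below assumes, based only on the default templates listed here, that a template is a space-separated bracketed parse skeleton followed by an `EOP` marker; `is_well_formed` and `labels_at_depth` are hypothetical helpers and not part of the library.

```python
def is_well_formed(template):
    """Check for balanced parentheses and a trailing EOP marker."""
    parts = template.split()
    if not parts or parts[-1] != "EOP":
        return False
    depth = 0
    for part in parts[:-1]:
        if part == "(":
            depth += 1
        elif part == ")":
            depth -= 1
            if depth < 0:  # closing bracket without a matching opener
                return False
    return depth == 0


def labels_at_depth(template, depth):
    """Return the constituent labels that open at the given nesting depth."""
    parts = template.split()
    labels, current = [], 0
    for i, part in enumerate(parts):
        if part == "(":
            current += 1
            if current == depth and i + 1 < len(parts):
                labels.append(parts[i + 1])
        elif part == ")":
            current -= 1
    return labels


template = "( ROOT ( S ( NP ) ( VP ) ( . ) ) ) EOP"
print(is_well_formed(template))      # True
print(labels_at_depth(template, 3))  # ['NP', 'VP', '.']
```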
TextBuggerAttacker
class TextBuggerAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(blackbox=True, tokenizer=None, substitute=None, filter_words=None, lang=None)
TEXTBUGGER: Generating Adversarial Text Against Real-world Applications. Jinfeng Li, Shouling Ji, Tianyu Du, Bo Li, Ting Wang. NDSS 2019.
- Parameters
blackbox – If True, the attacker will perform a black-box attack. Default: True
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute
lang – The language used in the attacker. If None, the attacker will automatically select the language based on other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved during the attack procedure.
- Classifier Capacity
get_pred
get_prob if blackbox = True
get_grad if blackbox = False
TextFoolerAttacker
class TextFoolerAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(import_score_threshold=-1, sim_score_threshold=0.5, sim_score_window=15, tokenizer=None, substitute=None, filter_words=None, token_unk='<UNK>', lang=None)
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment. Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits. AAAI 2020.
- Parameters
import_score_threshold (float) – Threshold used to choose important words. Default: -1
sim_score_threshold (float) – Threshold used to choose sentences of high semantic similarity. Default: 0.5
sim_score_window (int) – Length used in score module. Default: 15
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute
lang – The language used in the attacker. If None, the attacker will automatically select the language based on other parameters.
token_unk – The token id or the token name for out-of-vocabulary words in the victim model. Default: "<UNK>"
filter_words (Optional[List[str]]) – A list of words that will be preserved during the attack procedure.
- Classifier Capacity
get_pred
get_prob
UATAttacker
class UATAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(triggers=['the', 'the', 'the'], tokenizer=None, lang=None)
Universal Adversarial Triggers for Attacking and Analyzing NLP. Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh. EMNLP-IJCNLP 2019.
- Parameters
triggers (List[str]) – A list of trigger words.
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
lang – The language used in the attacker. If None, the attacker will automatically select the language based on other parameters.
- Classifier Capacity
get_pred
classmethod get_triggers(victim, dataset, tokenizer, epoch=5, batch_size=5, trigger_len=3, beam_size=5, lang=None)
This method is used to get the trigger words of a victim model on a dataset.
- Parameters
victim (OpenAttack.victim.classifiers.base.Classifier) – The classifier that you want to attack.
dataset (datasets.Dataset) – A datasets.Dataset.
tokenizer (OpenAttack.text_process.tokenizer.base.Tokenizer) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer
epoch (int) – Maximum epochs to get the universal adversarial triggers. Default: 5
batch_size (int) – Batch size. Default: 5
trigger_len (int) – The number of triggers. Default: 3
beam_size (int) – Beam search size used in this attacker. Default: 5
- Returns
A list of trigger words.
- Return type
List[str]
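A universal trigger is input-agnostic: the same trigger words are attached to every sentence. The sketch below shows that idea with plain whitespace tokenization, assuming the triggers are placed in front of the input as in the cited paper's classification experiments. `prepend_triggers` is a hypothetical helper; the library would use its own tokenizer to split and rejoin the text.

```python
def prepend_triggers(sentence, triggers):
    """Return the sentence with the trigger words placed in front of it."""
    return " ".join(list(triggers) + sentence.split())


print(prepend_triggers("a touching and funny film", ["the", "the", "the"]))
# prints: the the the a touching and funny film
```

The default `triggers=['the', 'the', 'the']` is only a placeholder sequence; trigger words returned by `get_triggers` for a specific victim model would be passed in its place.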
VIPERAttacker
class VIPERAttacker(OpenAttack.attackers.ClassificationAttacker)
__init__(prob=0.3, topn=12, generations=120, method='eces')
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems. Steffen Eger, Gözde Gül Şahin, Andreas Rücklé, Ji-Ung Lee, Claudia Schulz, Mohsen Mesgar, Krishnkant Swarnkar, Edwin Simpson, Iryna Gurevych. NAACL-HLT 2019.
- Parameters
prob (float) – The probability of changing a char in a sentence. Default: 0.3
topn (int) – Number of substitutes while using DCES substitute. Default: 12
generations (int) – Maximum number of sentences generated per attack. Default: 120
method (str) – The method of this attack. Must be one of the following: ["eces", "dces"]. Default: eces
- Classifier Capacity
get_pred
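The role of `prob` can be illustrated with a character-level perturbation that replaces each eligible character with a look-alike with probability `prob`. This is a sketch only: the `LOOKALIKES` table is a tiny hand-written assumption and does not reproduce the ECES or DCES character spaces, and `visual_perturb` is a hypothetical name.

```python
import random

# Hand-written look-alike table, for illustration only.
LOOKALIKES = {"a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú"}


def visual_perturb(text, prob=0.3, seed=None):
    """Replace each character that has a look-alike with probability prob."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch in LOOKALIKES and rng.random() < prob:
            out.append(LOOKALIKES[ch])
        else:
            out.append(ch)
    return "".join(out)


print(visual_perturb("idea", prob=0.0))  # idea
print(visual_perturb("idea", prob=1.0))  # ídéá
```

An attacker built on this idea would generate many such variants (bounded by something like `generations`) and keep one that changes the victim model's prediction.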
-