Attackers API

Attacker

class Attacker: The base class of all attackers.

ClassificationAttacker

class ClassificationAttacker: The base class of all classification attackers.

BAEAttacker

class BAEAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(mlm_path='bert-base-uncased', k=50, threshold_pred_score=0.3, max_length=512, batch_size=32, replace_rate=1.0, insert_rate=0.0, device=None, sentence_encoder=None, filter_words=None)

BAE: BERT-based Adversarial Examples for Text Classification. Siddhant Garg, Goutham Ramakrishnan. EMNLP 2020. [pdf] [code]

This script is adapted from <https://github.com/LinyangLee/BERT-Attack> given the high similarity between the two attack methods.

This attacker supports the 4 attack methods (BAE-R, BAE-I, BAE-R/I, BAE-R+I) in the paper.

Parameters

mlm_path (str) – The path to the masked language model. Default: 'bert-base-uncased'
k (int) – The k most important words / sub-words to substitute for. Default: 50
threshold_pred_score (float) – Threshold used in the substitute module. Default: 0.3
max_length (int) – The maximum length of an input sentence for BERT. Default: 512
batch_size (int) – The number of input sentences per batch for BERT. Default: 32
replace_rate (float) – Replace rate. Default: 1.0
insert_rate (float) – Insert rate. Default: 0.0
device (Optional[torch.device]) – The computing device for BERT.
sentence_encoder – A sentence encoder used to calculate the semantic similarity of two sentences. Default: UniversalSentenceEncoder
filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Data Requirements

TProcess.NLTKPerceptronPosTagger

Classifier Capacity

get_pred
get_prob

Language

english
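
The replace and insert operations behind BAE-R and BAE-I can be illustrated without a language model. The sketch below only builds the masked inputs that a masked language model would then be asked to fill. The function name masked_variants and the "[MASK]" placeholder are illustrative assumptions, not part of OpenAttack.

```python
from typing import List

MASK = "[MASK]"  # placeholder token; the real value depends on the tokenizer


def masked_variants(tokens: List[str], index: int, mode: str) -> List[List[str]]:
    """Build the masked inputs for one target position.

    mode="replace" mimics BAE-R (the token itself is masked);
    mode="insert" mimics BAE-I (a mask is added to the left or right of it).
    """
    if mode == "replace":
        return [tokens[:index] + [MASK] + tokens[index + 1:]]
    if mode == "insert":
        return [
            tokens[:index] + [MASK] + tokens[index:],          # mask on the left
            tokens[:index + 1] + [MASK] + tokens[index + 1:],  # mask on the right
        ]
    raise ValueError("mode must be 'replace' or 'insert'")


sentence = ["the", "movie", "was", "good"]
print(masked_variants(sentence, 3, "replace"))
print(masked_variants(sentence, 3, "insert"))
```

How replace_rate and insert_rate select between these operations is not documented here, so the sketch leaves that choice to the caller.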

BERTAttacker

class BERTAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(mlm_path='bert-base-uncased', k=36, use_bpe=True, sim_mat=None, threshold_pred_score=0.3, max_length=512, device=None, filter_words=None)

BERT-ATTACK: Adversarial Attack Against BERT Using BERT. Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue, Xipeng Qiu. EMNLP 2020. [pdf] [code]

Parameters

mlm_path (str) – The path to the masked language model. Default: 'bert-base-uncased'
k (int) – The k most important words / sub-words to substitute for. Default: 36
use_bpe (bool) – Whether to use BPE (byte-pair encoding). Default: True
sim_mat (Union[None, bool, OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – Whether to use cosine similarity to filter out antonyms. Leave as None to skip this filtering.
threshold_pred_score (float) – Threshold used in the substitute module. Default: 0.3
max_length (int) – The maximum length of an input sentence for BERT. Default: 512
device (Optional[torch.device]) – The computing device for BERT.
filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity

get_pred
get_prob
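
One plausible reading of how k and threshold_pred_score interact is sketched below: keep the k highest-scoring predictions of the masked language model, then drop any whose score falls below the threshold. This is an assumption made for illustration; filter_candidates and the toy scores are hypothetical, and the actual substitute module may differ in detail.

```python
from typing import List, Tuple


def filter_candidates(
    predictions: List[Tuple[str, float]],
    k: int = 36,
    threshold_pred_score: float = 0.3,
) -> List[str]:
    """Keep the top-k scored candidates, then drop those below the threshold."""
    ranked = sorted(predictions, key=lambda item: item[1], reverse=True)[:k]
    return [word for word, score in ranked if score >= threshold_pred_score]


# Toy (word, score) pairs standing in for masked-language-model output.
toy_predictions = [("fine", 0.9), ("nice", 0.5), ("bad", 0.2), ("ok", 0.35)]
print(filter_candidates(toy_predictions))
print(filter_candidates(toy_predictions, k=2))
```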

DeepWordBugAttacker

class DeepWordBugAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(token_unk='<UNK>', scoring='replaceone', transform='homoglyph', power=5, tokenizer=None)

Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. Ji Gao, Jack Lanchantin, Mary Lou Soffa, Yanjun Qi. IEEE SPW 2018. [pdf] [code]

Parameters

token_unk – The token id or the token name for out-of-vocabulary words in the victim model. Default: "<UNK>"
scoring – Scoring function used to compute word importance. Must be one of the following: ["replaceone", "temporal", "tail", "combined"]. Default: "replaceone"
transform – Transform function used to modify a word. Must be one of the following: ["homoglyph", "swap"]. Default: "homoglyph"
power – Maximum number of words to replace. Default: 5
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.

Classifier Capacity

get_pred
get_prob
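
As a rough illustration of the "replaceone" scoring and the "swap" transform, the sketch below follows the description in the DeepWordBug paper: a word's importance is the drop in the victim's probability when that word is replaced by the unknown token, and a swap exchanges two adjacent characters. The helper names and the toy probability function are assumptions for illustration and do not mirror OpenAttack's internal code.

```python
from typing import Callable, List


def replace_one_scores(
    tokens: List[str],
    prob_fn: Callable[[List[str]], float],
    token_unk: str = "<UNK>",
) -> List[float]:
    """Score each word by the probability drop when it is replaced by token_unk."""
    base = prob_fn(tokens)
    scores = []
    for i in range(len(tokens)):
        perturbed = tokens[:i] + [token_unk] + tokens[i + 1:]
        scores.append(base - prob_fn(perturbed))
    return scores


def swap_adjacent(word: str, position: int) -> str:
    """Swap the characters at position and position + 1 (position is clamped)."""
    if len(word) < 2:
        return word
    position = max(0, min(position, len(word) - 2))
    chars = list(word)
    chars[position], chars[position + 1] = chars[position + 1], chars[position]
    return "".join(chars)


def toy_prob(tokens: List[str]) -> float:
    # Stand-in for a victim model: confident only when "great" is present.
    return 0.75 if "great" in tokens else 0.25


print(replace_one_scores(["a", "great", "film"], toy_prob))
print(swap_adjacent("movie", 1))
```

The words with the highest scores would be transformed first, up to the limit set by power.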

FDAttacker

class FDAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(substitute=None, tokenizer=None, token_unk='<UNK>', max_iter=100, lang=None, filter_words=None)

Crafting Adversarial Input Sequences For Recurrent Neural Networks. Nicolas Papernot, Patrick McDaniel, Ananthram Swami, Richard Harang. MILCOM 2016. [pdf]

Parameters

substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute.
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
token_unk (str) – The token id or the token name for out-of-vocabulary words in the victim model. Default: "<UNK>"
max_iter (int) – Maximum number of iterations in the attack procedure. Default: 100
lang (Optional[str]) – The language used in the attacker. If None, the attacker selects the language based on the other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity

get_pred
get_grad
get_embedding
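
The need for get_grad and get_embedding can be motivated with a small sketch of a sign-based substitution rule in the spirit of the cited paper: among the candidate words, pick the one whose embedding difference from the original word best matches the sign of the gradient. The function fd_substitute, the toy embeddings, and the gradient values are illustrative assumptions, not OpenAttack's implementation.

```python
from typing import Dict, List


def sign(x: float) -> int:
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def fd_substitute(
    original: str,
    candidates: List[str],
    embedding: Dict[str, List[float]],
    grad: List[float],
) -> str:
    """Pick the candidate whose embedding shift best matches sign(grad)."""
    target = [sign(g) for g in grad]

    def mismatch(word: str) -> int:
        shift = [sign(n - o) for n, o in zip(embedding[word], embedding[original])]
        return sum(abs(s - t) for s, t in zip(shift, target))

    return min(candidates, key=mismatch)


toy_embedding = {"good": [1.0, 0.0], "fine": [0.5, 0.5], "bad": [-1.0, 0.0]}
toy_grad = [-2.0, 0.0]  # direction the attack wants to move the embedding in
print(fd_substitute("good", ["fine", "bad"], toy_embedding, toy_grad))
```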

GANAttacker

class GANAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(gan_dataset='sst')

Generating Natural Adversarial Examples. Zhengli Zhao, Dheeru Dua, Sameer Singh. ICLR 2018. [pdf] [code]

Parameters

gan_dataset (str) – The name of the dataset on which the GAN model was trained. Must be one of the following: ["sst", "snli"]. Default: "sst"

Language

english

Classifier Capacity

get_pred

GEOAttacker

GeneticAttacker

class GeneticAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(pop_size=20, max_iters=20, tokenizer=None, substitute=None, lang=None, filter_words=None)

Generating Natural Language Adversarial Examples. Moustafa Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani Srivastava, Kai-Wei Chang. EMNLP 2018. [pdf] [code]

Parameters

pop_size (int) – Genetic algorithm population size. Default: 20
max_iters (int) – Maximum number of generations of the genetic algorithm. Default: 20
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute.
lang (Optional[str]) – The language used in the attacker. If None, the attacker selects the language based on the other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity

get_pred
get_prob
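
To show what pop_size and max_iters control, here is a minimal genetic search over word substitutions. It is a toy sketch under stated assumptions: the substitute table, the fitness function, and the name genetic_search are hypothetical, and a real attack would use the victim's get_prob output as fitness and stop as soon as the prediction flips.

```python
import random
from typing import Callable, Dict, List


def genetic_search(
    tokens: List[str],
    substitutes: Dict[str, List[str]],
    fitness: Callable[[List[str]], float],
    pop_size: int = 20,
    max_iters: int = 20,
    seed: int = 0,
) -> List[str]:
    """Evolve substituted sentences; fitness must be non-negative."""
    rng = random.Random(seed)
    positions = [i for i, word in enumerate(tokens) if substitutes.get(word)]
    if not positions:
        return list(tokens)  # nothing can be substituted

    def mutate(sentence: List[str]) -> List[str]:
        child = list(sentence)
        i = rng.choice(positions)
        child[i] = rng.choice(substitutes[tokens[i]])
        return child

    population = [mutate(tokens) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(max_iters):
        # Small offset keeps the sampling weights strictly positive.
        weights = [fitness(member) + 1e-6 for member in population]
        children = [best]  # elitism: the best member always survives
        while len(children) < pop_size:
            parent_a, parent_b = rng.choices(population, weights=weights, k=2)
            child = [rng.choice(pair) for pair in zip(parent_a, parent_b)]
            children.append(mutate(child))
        population = children
        best = max(population, key=fitness)
    return best


original = ["a", "good", "movie"]
toy_substitutes = {"good": ["fine", "decent"], "movie": ["film"]}


def words_changed(sentence: List[str]) -> float:
    # Toy fitness: number of positions that differ from the original.
    return float(sum(a != b for a, b in zip(sentence, original)))


print(genetic_search(original, toy_substitutes, words_changed, pop_size=6, max_iters=5))
```

A larger pop_size explores more candidates per generation, while max_iters bounds the number of generations and therefore the number of victim queries.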

HotFlipAttacker

class HotFlipAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(substitute=None, tokenizer=None, filter_words=None, lang=None)

HotFlip: White-Box Adversarial Examples for Text Classification. Javid Ebrahimi, Anyi Rao, Daniel Lowd, Dejing Dou. ACL 2018. [pdf] [code]

Parameters

tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute.
filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.
lang (Optional[str]) – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

Classifier Capacity

get_pred
get_prob

PSOAttacker

class PSOAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(pop_size=20, max_iters=20, tokenizer=None, substitute=None, filter_words=None, lang=None)

Word-level Textual Adversarial Attacking as Combinatorial Optimization. Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu and Maosong Sun. ACL 2020. [pdf] [code]

Parameters

pop_size (int) – Population size of the PSO algorithm. Default: 20
max_iters (int) – Maximum number of generations of the PSO algorithm. Default: 20
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute.
lang (Optional[str]) – The language used in the attacker. If None, the attacker selects the language based on the other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity

get_pred
get_prob

PWWSAttacker

class PWWSAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(tokenizer=None, substitute=None, token_unk='<UNK>', filter_words=None, lang=None)

Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency. Shuhuai Ren, Yihe Deng, Kun He, Wanxiang Che. ACL 2019. [pdf] [code]

Parameters

tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute.
token_unk (str) – The token id or the token name for out-of-vocabulary words in the victim model. Default: "<UNK>"
lang (Optional[str]) – The language used in the attacker. If None, the attacker selects the language based on the other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity

get_pred
get_prob
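
The ordering rule described in the PWWS paper can be sketched in a few lines: each position is scored by its softmax-normalised word saliency multiplied by the largest probability drop any substitute achieves there, and positions are attacked in descending score order. The inputs below are toy numbers and pwws_order is a hypothetical helper; in practice the saliency would come from replacing a word with token_unk and querying get_prob.

```python
import math
from typing import List


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pwws_order(saliency: List[float], best_delta: List[float]) -> List[int]:
    """Return word positions sorted by softmax(saliency) * best probability drop."""
    weights = softmax(saliency)
    scores = [w * d for w, d in zip(weights, best_delta)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Equal saliency: the order follows the probability drops alone.
print(pwws_order([0.0, 0.0, 0.0], [0.1, 0.3, 0.2]))
# A highly salient word can outrank one with a larger raw drop.
print(pwws_order([0.0, 2.0], [0.3, 0.1]))
```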

SCPNAttacker

class SCPNAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(templates=DEFAULT_TEMPLATES, device=None, tokenizer=None, parser=None)

Adversarial Example Generation with Syntactically Controlled Paraphrase Networks. Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer. NAACL-HLT 2018. [pdf] [code]

Parameters

templates (List[str]) – A list of templates used in SCPNAttacker. Default: ten manually selected templates.
device (Optional[torch.device]) – The device on which to load the SCPN models (PyTorch). Default: "cuda" if CUDA is available, otherwise "cpu".
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
parser (Optional[OpenAttack.text_process.constituency_parser.base.ConstituencyParser]) – A constituency parser.

Language

english

Classifier Capacity

get_pred

The default templates are:

DEFAULT_TEMPLATES = [
    '( ROOT ( S ( NP ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( S ( VP ) ( . ) ) ) EOP',
    '( ROOT ( NP ( NP ) ( . ) ) ) EOP',
    '( ROOT ( FRAG ( SBAR ) ( . ) ) ) EOP',
    '( ROOT ( S ( S ) ( , ) ( CC ) ( S ) ( . ) ) ) EOP',
    '( ROOT ( S ( LST ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( SBARQ ( WHADVP ) ( SQ ) ( . ) ) ) EOP',
    '( ROOT ( S ( PP ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( S ( ADVP ) ( NP ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( S ( SBAR ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP'
]

TextBuggerAttacker

class TextBuggerAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(blackbox=True, tokenizer=None, substitute=None, filter_words=None, lang=None)

TEXTBUGGER: Generating Adversarial Text Against Real-world Applications. Jinfeng Li, Shouling Ji, Tianyu Du, Bo Li, Ting Wang. NDSS 2019. [pdf]

Parameters

blackbox (bool) – If True, the attacker performs a black-box attack; otherwise it performs a white-box attack. Default: True
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute.
lang (Optional[str]) – The language used in the attacker. If None, the attacker selects the language based on the other parameters.
filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity

get_pred
get_prob if blackbox = True
get_grad if blackbox = False
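
Because the required capabilities depend on blackbox, it can help to check a victim before running the attack. The sketch below encodes the list above in a small helper; required_capabilities, missing_capabilities, and ToyVictim are hypothetical names for illustration and are not part of OpenAttack.

```python
from typing import List


def required_capabilities(blackbox: bool = True) -> List[str]:
    """Capabilities listed above for TextBuggerAttacker in each mode."""
    return ["get_pred", "get_prob" if blackbox else "get_grad"]


def missing_capabilities(victim: object, blackbox: bool = True) -> List[str]:
    """Return the required method names that the victim does not provide."""
    return [
        name
        for name in required_capabilities(blackbox)
        if not callable(getattr(victim, name, None))
    ]


class ToyVictim:
    """Stand-in classifier that exposes predictions and probabilities only."""

    def get_pred(self, sentences):
        return [0 for _ in sentences]

    def get_prob(self, sentences):
        return [[1.0, 0.0] for _ in sentences]


print(missing_capabilities(ToyVictim(), blackbox=True))
print(missing_capabilities(ToyVictim(), blackbox=False))
```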

TextFoolerAttacker

class TextFoolerAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(import_score_threshold=-1, sim_score_threshold=0.5, sim_score_window=15, tokenizer=None, substitute=None, filter_words=None, token_unk='<UNK>', lang=None)

Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment. Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits. AAAI 2020. [pdf] [code]

Parameters

import_score_threshold (float) – Threshold used to choose important words. Default: -1
sim_score_threshold (float) – Threshold used to choose sentences of high semantic similarity. Default: 0.5
sim_score_window (int) – Window length used in the score module. Default: 15
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute.
lang (Optional[str]) – The language used in the attacker. If None, the attacker selects the language based on the other parameters.
token_unk – The token id or the token name for out-of-vocabulary words in the victim model. Default: "<UNK>"
filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity

get_pred
get_prob
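
The role of the window parameter can be pictured as follows, assuming (as in the reference TextFooler code, and not confirmed by this page) that semantic similarity is computed on a window of tokens centred on the replaced word rather than on the whole sentence. The helper similarity_window is hypothetical.

```python
from typing import Tuple


def similarity_window(length: int, index: int, window: int = 15) -> Tuple[int, int]:
    """Return (start, end) of the token span compared for similarity.

    The span is centred on index where possible and shifted inward
    near the sentence boundaries; short sentences are used in full.
    """
    if length <= window:
        return 0, length
    half = (window - 1) // 2
    start = min(max(index - half, 0), length - window)
    return start, start + window


print(similarity_window(40, 20))  # middle of a long sentence
print(similarity_window(40, 2))   # near the start
print(similarity_window(40, 38))  # near the end
print(similarity_window(10, 5))   # sentence shorter than the window
```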

UATAttacker

class UATAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(triggers=['the', 'the', 'the'], tokenizer=None, lang=None)

Universal Adversarial Triggers for Attacking and Analyzing NLP. Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh. EMNLP-IJCNLP 2019. [pdf] [code]

Parameters

triggers (List[str]) – A list of trigger words. Default: ['the', 'the', 'the']
tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
lang (Optional[str]) – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

Classifier Capacity

get_pred

classmethod get_triggers(victim, dataset, tokenizer, epoch=5, batch_size=5, trigger_len=3, beam_size=5, lang=None)

This method finds trigger words for the victim model on the given dataset.

Parameters

victim (OpenAttack.victim.classifiers.base.Classifier) – The classifier that you want to attack.
dataset (datasets.Dataset) – A datasets.Dataset.
tokenizer (OpenAttack.text_process.tokenizer.base.Tokenizer) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer.
epoch (int) – Maximum number of epochs used to find the universal adversarial triggers. Default: 5
batch_size (int) – Batch size. Default: 5
trigger_len (int) – The number of trigger words. Default: 3
beam_size (int) – Beam search size used in this attacker. Default: 5

Returns

A list of trigger words.

Return type

List[str]
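
Once a list of trigger words is available, applying it is simple. The sketch below assumes the triggers are prepended to the tokenized input, as described in the cited paper; apply_triggers is a hypothetical helper and the exact placement used by UATAttacker is not stated on this page.

```python
from typing import List


def apply_triggers(sentence_tokens: List[str], triggers: List[str]) -> List[str]:
    """Prepend the same trigger words to any input sentence."""
    return list(triggers) + list(sentence_tokens)


default_triggers = ["the", "the", "the"]  # the constructor default shown above
print(apply_triggers(["a", "fine", "film"], default_triggers))
```

The same trigger list is reused for every input, which is what makes the attack universal.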

VIPERAttacker

class VIPERAttacker(OpenAttack.attackers.ClassificationAttacker)

__init__(prob=0.3, topn=12, generations=120, method='eces')

Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems. Steffen Eger, Gözde Gül Şahin, Andreas Rücklé, Ji-Ung Lee, Claudia Schulz, Mohsen Mesgar, Krishnkant Swarnkar, Edwin Simpson, Iryna Gurevych. NAACL-HLT 2019. [pdf] [code]

Parameters

prob (float) – The probability of changing a char in a sentence. Default: 0.3
topn (int) – Number of substitutes while using DCES substitute. Default: 12
generations (int) – Maximum number of sentences generated per attack. Default: 120
method (str) – The method of this attack. Must be one of the following: ["eces", "dces"]. Default: eces

Classifier Capacity

get_pred
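
The effect of prob can be seen with a toy character-level perturbation: each character that has visually similar alternatives is replaced with probability prob. The neighbour table below is a tiny hand-made stand-in for illustration only; it is not the ECES or DCES substitute data, and visual_perturb is a hypothetical helper.

```python
import random
from typing import Dict, List, Optional

# Hand-made lookalikes for illustration; not the ECES/DCES tables.
TOY_NEIGHBOURS: Dict[str, List[str]] = {
    "a": ["à", "á"],
    "e": ["è", "é"],
    "o": ["ò", "ó"],
}


def visual_perturb(
    text: str,
    prob: float = 0.3,
    neighbours: Dict[str, List[str]] = TOY_NEIGHBOURS,
    seed: Optional[int] = None,
) -> str:
    """Replace each eligible character by a lookalike with probability prob."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        options = neighbours.get(ch)
        if options and rng.random() < prob:
            out.append(rng.choice(options))
        else:
            out.append(ch)
    return "".join(out)


print(visual_perturb("hello world", prob=0.3, seed=0))
```

In an attack loop, perturbations like this would be generated repeatedly (bounded by generations) until the victim's get_pred output changes.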