Attackers API

Attacker

class Attacker[source]

The base class of all attackers.


ClassificationAttacker

class ClassificationAttacker[source]

The base class of all classification attackers.


BAEAttacker

class BAEAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(mlm_path='bert-base-uncased', k=50, threshold_pred_score=0.3, max_length=512, batch_size=32, replace_rate=1.0, insert_rate=0.0, device=None, sentence_encoder=None, filter_words=None)[source]

BAE: BERT-based Adversarial Examples for Text Classification. Siddhant Garg, Goutham Ramakrishnan. EMNLP 2020. [pdf] [code]

This script is adapted from <https://github.com/LinyangLee/BERT-Attack> given the high similarity between the two attack methods.

This attacker supports the four attack methods described in the paper: BAE-R, BAE-I, BAE-R/I, and BAE-R+I.

Parameters
  • mlm_path (str) – The path to the masked language model. Default: ‘bert-base-uncased’

  • k (int) – The k most important words / sub-words to substitute for. Default: 50

  • threshold_pred_score (float) – Threshold used in substitute module. Default: 0.3

  • max_length (int) – The maximum length of an input sentence for bert. Default: 512

  • batch_size (int) – The size of a batch of input sentences for bert. Default: 32

  • replace_rate (float) – Replace rate. Default: 1.0

  • insert_rate (float) – Insert rate. Default: 0.0

  • device (Optional[torch.device]) – A computing device for bert.

  • sentence_encoder – A sentence encoder to calculate the semantic similarity of two sentences. Default: UniversalSentenceEncoder

  • filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Data Requirements

TProcess.NLTKPerceptronPosTagger

Classifier Capacity
  • get_pred

  • get_prob

Language

english
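
The replace (BAE-R) and insert (BAE-I) operations named above differ only in where the mask token is placed before the masked language model proposes candidates. The following is a minimal, library-independent sketch of that masking step. The helper names mask_replace and mask_insert are hypothetical and are not part of OpenAttack; how OpenAttack maps replace_rate and insert_rate onto these operations is not shown here.

```python
from typing import List

MASK = "[MASK]"  # mask token used by BERT-style masked language models


def mask_replace(tokens: List[str], i: int) -> List[str]:
    """BAE-R style: replace the token at position i with a mask."""
    return tokens[:i] + [MASK] + tokens[i + 1:]


def mask_insert(tokens: List[str], i: int, left: bool = True) -> List[str]:
    """BAE-I style: insert a mask to the left or right of the token at position i."""
    pos = i if left else i + 1
    return tokens[:pos] + [MASK] + tokens[pos:]


tokens = ["the", "movie", "was", "great"]
replaced = mask_replace(tokens, 3)            # "great" is masked out
inserted = mask_insert(tokens, 3, left=True)  # a mask is added before "great"
```

In both cases the masked sequence would then be passed to the masked language model, and the top predictions for the mask position become substitution or insertion candidates.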

BERTAttacker

class BERTAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(mlm_path='bert-base-uncased', k=36, use_bpe=True, sim_mat=None, threshold_pred_score=0.3, max_length=512, device=None, filter_words=None)[source]

BERT-ATTACK: Adversarial Attack Against BERT Using BERT, Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue, Xipeng Qiu, EMNLP2020 [pdf] [code]

Parameters
  • mlm_path (str) – The path to the masked language model. Default: ‘bert-base-uncased’

  • k (int) – The k most important words / sub-words to substitute for. Default: 36

  • use_bpe (bool) – Whether to use BPE. Default: True

  • sim_mat (Union[None, bool, OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – Whether to use cosine similarity to filter out antonyms. Keep None to disable the similarity matrix.

  • threshold_pred_score (float) – Threshold used in substitute module. Default: 0.3

  • max_length (int) – The maximum length of an input sentence for bert. Default: 512

  • device (Optional[torch.device]) – A computing device for bert.

  • filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity
  • get_pred

  • get_prob

DeepWordBugAttacker

class DeepWordBugAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(token_unk='<UNK>', scoring='replaceone', transform='homoglyph', power=5, tokenizer=None)[source]

Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. Ji Gao, Jack Lanchantin, Mary Lou Soffa, Yanjun Qi. IEEE SPW 2018. [pdf] [code]

Parameters
  • token_unk – The token id or the token name for out-of-vocabulary words in victim model. Default: "<UNK>"

  • scoring – Scoring function used to compute word importance, must be one of the following: ["replaceone", "temporal", "tail", "combined"]. Default: replaceone

  • transform – Transform function to modify a word, must be one of the following: ["homoglyph", "swap"]. Default: homoglyph

  • power – Max words to replace. Default: 5

  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

Classifier Capacity
  • get_pred

  • get_prob
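
As a rough illustration of the "replaceone" scoring and "swap" transform options, the sketch below scores each word by how much a victim's score drops when that word is replaced with the unknown token, then swaps two adjacent characters in the highest-scoring words. It is a simplified stand-in, not OpenAttack's implementation: replaceone_scores, swap_adjacent, and toy_score are hypothetical names, and toy_score is a made-up victim that returns an integer score.

```python
from typing import Callable, List


def replaceone_scores(tokens: List[str],
                      score_fn: Callable[[List[str]], int],
                      token_unk: str = "<UNK>") -> List[int]:
    """Importance of word i = drop in the victim's score when word i becomes token_unk."""
    base = score_fn(tokens)
    return [base - score_fn(tokens[:i] + [token_unk] + tokens[i + 1:])
            for i in range(len(tokens))]


def swap_adjacent(word: str, pos: int) -> str:
    """Swap the characters at pos and pos + 1; very short words are returned unchanged."""
    if len(word) < 2:
        return word
    pos = max(0, min(pos, len(word) - 2))
    chars = list(word)
    chars[pos], chars[pos + 1] = chars[pos + 1], chars[pos]
    return "".join(chars)


def toy_score(tokens: List[str]) -> int:
    """Hypothetical victim: a score for the 'positive' class based on keyword counts."""
    return 2 * tokens.count("great") + tokens.count("fun")


tokens = ["a", "great", "fun", "film"]
scores = replaceone_scores(tokens, toy_score)

power = 1  # maximum number of words to modify
targets = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:power]
attacked = [swap_adjacent(t, 1) if i in targets else t for i, t in enumerate(tokens)]
```

With power set to 1, only the single most important word ("great") is perturbed.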

FDAttacker

class FDAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(substitute=None, tokenizer=None, token_unk='<UNK>', max_iter=100, lang=None, filter_words=None)[source]

Crafting Adversarial Input Sequences For Recurrent Neural Networks. Nicolas Papernot, Patrick McDaniel, Ananthram Swami, Richard Harang. MILCOM 2016. [pdf]

Parameters
  • substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute

  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • token_unk (str) – The token id or the token name for out-of-vocabulary words in victim model. Default: "<UNK>"

  • max_iter (int) – Maximum number of iterations in the attack procedure. Default: 100

  • lang (Optional[str]) – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

  • filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity
  • get_pred

  • get_grad

  • get_embedding

GANAttacker

class GANAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(gan_dataset='sst')[source]

Generating Natural Adversarial Examples. Zhengli Zhao, Dheeru Dua, Sameer Singh. ICLR 2018. [pdf] [code]

Parameters

  • gan_dataset (str) – The name of the dataset on which the GAN model is trained. Must be one of the following: ["sst", "snli"]. Default: sst

Language

english

Classifier Capacity
  • get_pred

GEOAttacker

GeneticAttacker

class GeneticAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(pop_size=20, max_iters=20, tokenizer=None, substitute=None, lang=None, filter_words=None)[source]

Generating Natural Language Adversarial Examples. Moustafa Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani Srivastava, Kai-Wei Chang. EMNLP 2018. [pdf] [code]

Parameters
  • pop_size (int) – Genetic algorithm population size. Default: 20

  • max_iters (int) – Maximum number of generations of the genetic algorithm. Default: 20

  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute

  • lang – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

  • filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity
  • get_pred

  • get_prob

HotFlipAttacker

class HotFlipAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(substitute=None, tokenizer=None, filter_words=None, lang=None)[source]

HotFlip: White-Box Adversarial Examples for Text Classification. Javid Ebrahimi, Anyi Rao, Daniel Lowd, Dejing Dou. ACL 2018. [pdf] [code]

Parameters
  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute

  • filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

  • lang – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

Classifier Capacity
  • get_pred

  • get_prob

PSOAttacker

class PSOAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(pop_size=20, max_iters=20, tokenizer=None, substitute=None, filter_words=None, lang=None)[source]

Word-level Textual Adversarial Attacking as Combinatorial Optimization. Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu and Maosong Sun. ACL 2020. [pdf] [code]

Parameters
  • pop_size (int) – Population size of the PSO algorithm. Default: 20

  • max_iters (int) – Maximum number of generations of the PSO algorithm. Default: 20

  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute

  • lang – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

  • filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity
  • get_pred

  • get_prob

PWWSAttacker

class PWWSAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(tokenizer=None, substitute=None, token_unk='<UNK>', filter_words=None, lang=None)[source]

Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency. Shuhuai Ren, Yihe Deng, Kun He, Wanxiang Che. ACL 2019. [pdf] [code]

Parameters
  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute

  • token_unk (str) – The token id or the token name for out-of-vocabulary words in victim model. Default: "<UNK>"

  • lang – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

  • filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity
  • get_pred

  • get_prob
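
The idea behind "probability weighted word saliency" can be sketched as follows, under the assumption that each word's replacement priority is its softmax-normalized saliency multiplied by the probability drop achieved by its best substitute. The function pwws_order and the input numbers are hypothetical, and OpenAttack's implementation may differ in detail.

```python
import math
from typing import List


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pwws_order(saliency: List[float], delta_p: List[float]) -> List[int]:
    """Return word indices sorted by softmax(saliency)[i] * delta_p[i], highest first.

    saliency[i]: drop in the true-class probability when word i is replaced by token_unk.
    delta_p[i]:  drop in the true-class probability for the best substitute of word i.
    """
    weights = softmax(saliency)
    scores = [w * d for w, d in zip(weights, delta_p)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Made-up values for a three-word sentence.
saliency = [0.05, 0.40, 0.10]
delta_p = [0.10, 0.20, 0.30]
order = pwws_order(saliency, delta_p)
```

Words would then be substituted in this order until the victim's prediction changes.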

SCPNAttacker

class SCPNAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(templates=['( ROOT ( S ( NP ) ( VP ) ( . ) ) ) EOP', '( ROOT ( S ( VP ) ( . ) ) ) EOP', '( ROOT ( NP ( NP ) ( . ) ) ) EOP', '( ROOT ( FRAG ( SBAR ) ( . ) ) ) EOP', '( ROOT ( S ( S ) ( , ) ( CC ) ( S ) ( . ) ) ) EOP', '( ROOT ( S ( LST ) ( VP ) ( . ) ) ) EOP', '( ROOT ( SBARQ ( WHADVP ) ( SQ ) ( . ) ) ) EOP', '( ROOT ( S ( PP ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP', '( ROOT ( S ( ADVP ) ( NP ) ( VP ) ( . ) ) ) EOP', '( ROOT ( S ( SBAR ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP'], device=None, tokenizer=None, parser=None)[source]

Adversarial Example Generation with Syntactically Controlled Paraphrase Networks. Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer. NAACL-HLT 2018. [pdf] [code]

Parameters
  • templates (List[str]) – A list of templates used in SCPNAttacker. Default: ten manually selected templates.

  • device (Optional[torch.device]) – The device to load SCPN models (pytorch). Default: Use “cpu” if cuda is not available else “cuda”.

  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • parser (Optional[OpenAttack.text_process.constituency_parser.base.ConstituencyParser]) – A constituency parser.

Language

english

Classifier Capacity

  • get_pred

The default templates are:

DEFAULT_TEMPLATES = [
    '( ROOT ( S ( NP ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( S ( VP ) ( . ) ) ) EOP',
    '( ROOT ( NP ( NP ) ( . ) ) ) EOP',
    '( ROOT ( FRAG ( SBAR ) ( . ) ) ) EOP',
    '( ROOT ( S ( S ) ( , ) ( CC ) ( S ) ( . ) ) ) EOP',
    '( ROOT ( S ( LST ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( SBARQ ( WHADVP ) ( SQ ) ( . ) ) ) EOP',
    '( ROOT ( S ( PP ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( S ( ADVP ) ( NP ) ( VP ) ( . ) ) ) EOP',
    '( ROOT ( S ( SBAR ) ( , ) ( NP ) ( VP ) ( . ) ) ) EOP'
]
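
When supplying custom templates, it can help to check that a template string is well formed before passing it to the attacker. The sketch below is a hypothetical helper (template_labels is not part of OpenAttack) that assumes templates are whitespace-separated bracketed parses with a trailing EOP marker, as in the defaults above. It verifies that the parentheses balance and returns the constituent labels.

```python
from typing import List


def template_labels(template: str) -> List[str]:
    """Return the constituent labels of a bracketed template, checking parenthesis balance."""
    tokens = template.split()
    if tokens and tokens[-1] == "EOP":  # drop the trailing marker used by the defaults
        tokens = tokens[:-1]
    depth = 0
    labels = []
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in template")
        else:
            labels.append(tok)
    if depth != 0:
        raise ValueError("unbalanced '(' in template")
    return labels


labels = template_labels("( ROOT ( S ( NP ) ( VP ) ( . ) ) ) EOP")
```

A malformed template such as "( ROOT ( S ) EOP" raises ValueError instead of silently producing a partial parse.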

TextBuggerAttacker

class TextBuggerAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(blackbox=True, tokenizer=None, substitute=None, filter_words=None, lang=None)[source]

TEXTBUGGER: Generating Adversarial Text Against Real-world Applications. Jinfeng Li, Shouling Ji, Tianyu Du, Bo Li, Ting Wang. NDSS 2019. [pdf]

Parameters
  • blackbox (bool) – If True, the attacker performs a black-box attack. Default: True

  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute

  • lang – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

  • filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity
  • get_pred

  • get_prob if blackbox = True

  • get_grad if blackbox = False

TextFoolerAttacker

class TextFoolerAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(import_score_threshold=-1, sim_score_threshold=0.5, sim_score_window=15, tokenizer=None, substitute=None, filter_words=None, token_unk='<UNK>', lang=None)[source]

Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment. Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits. AAAI 2020. [pdf] [code]

Parameters
  • import_score_threshold (float) – Threshold used to choose important words. Default: -1

  • sim_score_threshold (float) – Threshold used to choose sentences of high semantic similarity. Default: 0.5

  • sim_score_window (int) – Window length used in the similarity score module. Default: 15

  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • substitute (Optional[OpenAttack.attack_assist.substitute.word.base.WordSubstitute]) – A substitute that will be used during the attack procedure. Must be an instance of WordSubstitute

  • lang – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

  • token_unk – The token id or the token name for out-of-vocabulary words in victim model. Default: "<UNK>"

  • filter_words (Optional[List[str]]) – A list of words that will be preserved in the attack procedure.

Classifier Capacity
  • get_pred

  • get_prob
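
One plausible reading of sim_score_window is that semantic similarity is computed on a fixed-size window of tokens centred on the substituted word rather than on the whole sentence. The sketch below shows how such a window could be chosen. This is an assumption about the parameter's role, and similarity_window is a hypothetical helper, not an OpenAttack function.

```python
from typing import Tuple


def similarity_window(idx: int, length: int, window: int = 15) -> Tuple[int, int]:
    """Return (start, end) of a window of up to `window` tokens around position idx.

    The window is centred on idx where possible, shifted when idx is near either
    end of the sentence, and clamped to the sentence when the sentence is short.
    """
    half = (window - 1) // 2
    start, end = idx - half, idx + half + 1
    if start < 0:        # too close to the beginning: shift right
        end -= start
        start = 0
    if end > length:     # too close to the end: shift left
        start -= end - length
        end = length
    return max(0, start), end


middle = similarity_window(20, 40)  # centred window
head = similarity_window(2, 40)     # shifted right
tail = similarity_window(38, 40)    # shifted left
short = similarity_window(8, 10)    # sentence shorter than the window
```

The original and perturbed sentences would both be sliced with the same (start, end) pair before being passed to the sentence encoder.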

UATAttacker

class UATAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(triggers=['the', 'the', 'the'], tokenizer=None, lang=None)[source]

Universal Adversarial Triggers for Attacking and Analyzing NLP. Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh. EMNLP-IJCNLP 2019. [pdf] [code]

Parameters
  • triggers (List[str]) – A list of trigger words.

  • tokenizer (Optional[OpenAttack.text_process.tokenizer.base.Tokenizer]) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • lang – The language used in the attacker. If None, the attacker selects the language based on the other parameters.

Classifier Capacity
  • get_pred

classmethod get_triggers(victim, dataset, tokenizer, epoch=5, batch_size=5, trigger_len=3, beam_size=5, lang=None)[source]

This method is used to get the trigger words of the victim model on the dataset.

Parameters
  • victim (OpenAttack.victim.classifiers.base.Classifier) – The classifier that you want to attack.

  • dataset (datasets.Dataset) – A datasets.Dataset instance.

  • tokenizer (OpenAttack.text_process.tokenizer.base.Tokenizer) – A tokenizer that will be used during the attack procedure. Must be an instance of Tokenizer

  • epoch (int) – Maximum epochs to get the universal adversarial triggers.

  • batch_size (int) – Batch size.

  • trigger_len (int) – The number of triggers.

  • beam_size (int) – Beam search size used in this attacker.

Returns

A list of trigger words.

Return type

List[str]
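
Once a list of trigger words is available, the attack itself is input-agnostic: the same triggers are attached to every input. A minimal sketch of that step follows, assuming the triggers are prepended to the tokenized sentence. Whitespace splitting stands in for the tokenizer, and apply_triggers is a hypothetical helper rather than an OpenAttack function.

```python
from typing import List


def apply_triggers(sentence: str, triggers: List[str]) -> str:
    """Prepend the trigger words to the sentence (whitespace tokenization assumed)."""
    return " ".join(list(triggers) + sentence.split())


triggers = ["the", "the", "the"]  # default value of the triggers parameter
adversarial = apply_triggers("this film is great", triggers)
```

Because the triggers do not depend on the input, the attacker only needs get_pred to check whether the victim's prediction changed.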

VIPERAttacker

class VIPERAttacker(OpenAttack.attackers.ClassificationAttacker)[source]
__init__(prob=0.3, topn=12, generations=120, method='eces')[source]

Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems. Steffen Eger, Gözde Gül Şahin, Andreas Rücklé, Ji-Ung Lee, Claudia Schulz, Mohsen Mesgar, Krishnkant Swarnkar, Edwin Simpson, Iryna Gurevych. NAACL-HLT 2019. [pdf] [code]

Parameters
  • prob (float) – The probability of changing a char in a sentence. Default: 0.3

  • topn (int) – Number of substitutes while using DCES substitute. Default: 12

  • generations (int) – Maximum number of sentences generated per attack. Default: 120

  • method (str) – The method of this attack. Must be one of the following: ["eces", "dces"]. Default: eces

Classifier Capacity
  • get_pred
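
The role of the prob parameter can be illustrated with a small sketch in which each character is independently replaced by a visually similar one with probability prob. The LOOKALIKES table below is a tiny made-up mapping for illustration only; it is not the ECES or DCES data used by VIPER, and perturb is a hypothetical helper.

```python
import random

# Hypothetical look-alike table: one visually similar character per letter.
LOOKALIKES = {"a": "à", "e": "é", "i": "í", "o": "ö"}


def perturb(text: str, prob: float, rng: random.Random) -> str:
    """Replace each character that has a look-alike with probability prob."""
    out = []
    for ch in text:
        if ch in LOOKALIKES and rng.random() < prob:
            out.append(LOOKALIKES[ch])
        else:
            out.append(ch)
    return "".join(out)


unchanged = perturb("adversarial", 0.0, random.Random(0))  # prob 0: nothing changes
all_changed = perturb("idea", 1.0, random.Random(0))       # prob 1: every mapped char changes
candidate = perturb("adversarial", 0.3, random.Random(0))  # one random candidate sentence
```

An attacker in this style would generate several such candidates (compare the generations parameter) and keep one that changes the victim's prediction.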