Substitutes API

Abstract Classes

class WordSubstitute
__call__(word, pos=None)

Returns a list of words that are semantically similar to the input word.

Parameters
  • word (str) – A single word.

  • pos (Optional[str]) – POS tag of input word. Must be one of the following: ["adv", "adj", "noun", "verb", "other", None]

Returns

A list of words and their distances to the original word (each distance is a number between 0 and 1; smaller means more similar).

Raises
  • WordNotInDictionaryException – the input word is not in the dictionary of the substitute algorithm

  • UnknownPOSException – pos is not one of the accepted POS tags

Return type

List[Tuple[str, float]]
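
The contract above can be illustrated with a small stand-in. The sketch below does not import OpenAttack: `ToyWordSubstitute`, its hard-coded table, and the two locally defined exception classes are hypothetical and only mirror the documented signature, return type, and error conditions. Merging the candidates of every tag when pos is None is an assumption made for the example.

```python
from typing import Dict, List, Optional, Tuple


# Local stand-ins for the exceptions named above (hypothetical definitions;
# the real classes are provided by OpenAttack).
class WordNotInDictionaryException(Exception):
    pass


class UnknownPOSException(Exception):
    pass


VALID_POS = ("adv", "adj", "noun", "verb", "other", None)


class ToyWordSubstitute:
    """Toy object following the documented __call__(word, pos=None) contract."""

    def __init__(self, table: Dict[Tuple[str, str], List[Tuple[str, float]]]):
        # table maps (word, pos) to candidates with distances in [0, 1]
        self.table = table

    def __call__(self, word: str, pos: Optional[str] = None) -> List[Tuple[str, float]]:
        if pos not in VALID_POS:
            raise UnknownPOSException(pos)
        if pos is None:
            # Assumption: with no tag, merge the candidates of every tag.
            found = [c for (w, _), cands in self.table.items() if w == word for c in cands]
        else:
            found = list(self.table.get((word, pos), []))
        if not found:
            raise WordNotInDictionaryException(word)
        # Smaller distance means more similar, so sort ascending.
        return sorted(found, key=lambda pair: pair[1])


toy = ToyWordSubstitute({
    ("happy", "adj"): [("glad", 0.2), ("cheerful", 0.1)],
    ("run", "verb"): [("sprint", 0.3)],
    ("run", "noun"): [("jog", 0.4)],
})
result = toy("happy", "adj")
```

Callers would typically wrap the call in a try/except for both exceptions and treat a failure as "no substitutes available" for that word.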

class CharSubstitute
__call__(char)

Char-level substitute algorithm.

In CharSubstitute, we return a list of chars that are visually similar to the original char.

Parameters

char (str) – A single char.

Returns

A list of chars and their distances to the original char (each distance is a number between 0 and 1; smaller means more similar).

Return type

List[Tuple[str, float]]
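
As a rough illustration of the char-level contract, the sketch below uses a hand-written look-alike table rather than any OpenAttack data. `ToyCharSubstitute` and `closest_variants` are hypothetical names, and returning an empty list for an unknown character is an assumption of the example, not documented behaviour.

```python
from typing import Dict, List, Tuple


class ToyCharSubstitute:
    """Toy object following the documented __call__(char) contract."""

    def __init__(self, lookalikes: Dict[str, List[Tuple[str, float]]]):
        # Hypothetical table: char -> [(look-alike char, distance in [0, 1])]
        self.lookalikes = lookalikes

    def __call__(self, char: str) -> List[Tuple[str, float]]:
        if len(char) != 1:
            raise ValueError("expected a single character")
        # Assumption: an unknown character simply has no candidates.
        candidates = self.lookalikes.get(char, [])
        return sorted(candidates, key=lambda pair: pair[1])


def closest_variants(text: str, substitute: ToyCharSubstitute) -> List[str]:
    """Replace each position in turn with its closest look-alike, if any."""
    variants = []
    for i, ch in enumerate(text):
        candidates = substitute(ch)
        if candidates:
            variants.append(text[:i] + candidates[0][0] + text[i + 1:])
    return variants


# "\u03bf" is the Greek small letter omicron, which resembles Latin "o".
toy_chars = ToyCharSubstitute({"o": [("0", 0.3), ("\u03bf", 0.1)], "l": [("1", 0.2)]})
variants = closest_variants("lol", toy_chars)
```

Each entry of `variants` differs from the input in exactly one position, which is the usual way a char-level attack would consume the returned list.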


EmbedBasedSubstitute

class EmbedBasedSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(word2id, embedding, cosine=False, k=50, threshold=0.5, device=None)

Embedding based word substitute.

Parameters
  • word2id (Dict[str, int]) – A dict that maps words to indices.

  • embedding (torch.Tensor) – A word embedding matrix.

  • cosine – If true, the cosine distance is used; otherwise the Euclidean distance is used.

  • threshold – Distance threshold. Default: 0.5

  • k – Top-k results to return. If k is None, all results will be returned. Default: 50

  • device – A PyTorch device for computing distances. Default: “cpu”
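
To make the roles of cosine, threshold and k concrete, here is a minimal pure-Python sketch of an embedding-based neighbour lookup. It is not OpenAttack's implementation: `nearest_words` is a hypothetical helper, plain lists stand in for the torch.Tensor, and it assumes that threshold is an upper bound on the distance and that k truncates the sorted result. The sketch also does not rescale distances into the 0 to 1 range.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_words(
    word: str,
    word2id: Dict[str, int],
    embedding: Sequence[Sequence[float]],
    cosine: bool = False,
    k: Optional[int] = 50,
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return (word, distance) pairs sorted by increasing distance."""
    if word not in word2id:
        raise KeyError(word)
    dist = cosine_distance if cosine else euclidean_distance
    query = embedding[word2id[word]]
    scored = [
        (other, dist(query, embedding[idx]))
        for other, idx in word2id.items()
        if other != word
    ]
    # Assumption: keep only neighbours whose distance is within the threshold.
    kept = sorted((p for p in scored if p[1] <= threshold), key=lambda p: p[1])
    # k=None means "return everything that passed the threshold".
    return kept if k is None else kept[:k]


word2id = {"good": 0, "great": 1, "bad": 2}
embedding = [[1.0, 0.0], [0.8, 0.6], [-1.0, 0.0]]
by_cosine = nearest_words("good", word2id, embedding, cosine=True)
by_euclid = nearest_words("good", word2id, embedding, cosine=False)
```

With these toy vectors the cosine variant keeps only "great", while the Euclidean variant returns nothing because even the nearest vector lies farther than 0.5 away. The same threshold therefore means different things under the two metrics.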

ChineseHowNetSubstitute

class ChineseHowNetSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(k=None)

Chinese Sememe-based word substitute based on OpenHowNet. [pdf]

Parameters

k (Optional[int]) – Top-k results to return. If k is None, all results will be returned.

Package Requirements

OpenHowNet

Data Requirements

AttackAssist.HowNet

Language

chinese

ChineseWord2VecSubstitute

class ChineseWord2VecSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(cosine=False, threshold=0.5, k=50, device=None)

Chinese word substitute based on word2vec.

Parameters
  • cosine (bool) – If true, the cosine distance is used; otherwise the Euclidean distance is used.

  • threshold (float) – Distance threshold. Default: 0.5

  • k (int) – Top-k results to return. If k is None, all results will be returned. Default: 50

  • device (Optional[Union[str, torch.device]]) – A PyTorch device for computing distances. Default: “cpu”

Data Requirements

AttackAssist.ChineseWord2Vec

Language

chinese

ChineseWordNetSubstitute

class ChineseWordNetSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(k=None)

Chinese word substitute based on wordnet.

Parameters

k – Top-k results to return. If k is None, all results will be returned. Default: None

Language

chinese

CounterFittedSubstitute

class CounterFittedSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(cosine=False, k=50, threshold=0.5, device=None)

English word substitute based on Counter-fitting word vectors. [pdf]

Parameters
  • cosine (bool) – If true, the cosine distance is used; otherwise the Euclidean distance is used.

  • threshold (float) – Distance threshold. Default: 0.5

  • k (int) – Top-k results to return. If k is None, all results will be returned. Default: 50

  • device – A PyTorch device for computing distances. Default: “cpu”

Data Requirements

AttackAssist.CounterFit

Language

english

ChineseCiLinSubstitute

class ChineseCiLinSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(k=None)

Chinese Sememe-based word substitute based on CiLin.

Parameters

k (Optional[int]) – Top-k results to return. If k is None, all results will be returned.

Data Requirements

AttackAssist.CiLin

Language

chinese

GloveSubstitute

class GloveSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(cosine=False, k=50, threshold=0.5, device=None)

English word substitute based on GloVe word vectors. [pdf]

Parameters
  • cosine – If true, the cosine distance is used; otherwise the Euclidean distance is used.

  • threshold – Distance threshold. Default: 0.5

  • k – Top-k results to return. If k is None, all results will be returned. Default: 50

  • device – A PyTorch device for computing distances. Default: “cpu”

Data Requirements

AttackAssist.GloVe

Language

english

HowNetSubstitute

class HowNetSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(k=None)

English Sememe-based word substitute based on OpenHowNet. [pdf]

Parameters

k – Top-k results to return. If k is None, all results will be returned.

Data Requirements

AttackAssist.HownetSubstituteDict

Language

english

Word2VecSubstitute

class Word2VecSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(cosine=False, k=50, threshold=0.5, device=None)

English word substitute based on word2vec.

Parameters
  • cosine – If true, the cosine distance is used; otherwise the Euclidean distance is used.

  • threshold – Distance threshold. Default: 0.5

  • k – Top-k results to return. If k is None, all results will be returned. Default: 50

  • device – A PyTorch device for computing distances. Default: “cpu”

Data Requirements

AttackAssist.GloVe

Language

english

WordNetSubstitute

class WordNetSubstitute(OpenAttack.attack_assist.substitute.word.WordSubstitute)
__init__(k=None)

English word substitute based on wordnet.

Parameters

k – Top-k results to return. If k is None, all results will be returned. Default: None

Data Requirements

TProcess.NLTKWordNet

Language

english

ChineseFYHCharSubstitute

class ChineseFYHCharSubstitute(OpenAttack.attack_assist.substitute.char.CharSubstitute)
__init__(k=None)

Returns traditional, variant and Martian characters of the input character.

Parameters

k (Optional[int]) – Top-k results to return. If k is None, all results will be returned.

Data Requirements

AttackAssist.FYH

Language

chinese

ChineseSimCharSubstitute

class ChineseSimCharSubstitute(OpenAttack.attack_assist.substitute.char.CharSubstitute)
__init__(k=None)

Returns the chars that are visually similar to the input.

Parameters

k (Optional[int]) – Top-k results to return. If k is None, all results will be returned.

Data Requirements

AttackAssist.SIM

Language

chinese

DCESSubstitute

class DCESSubstitute(OpenAttack.attack_assist.substitute.char.CharSubstitute)
__init__(k=12)

Returns the chars that are visually similar to the input.

DCES substitute used in VIPERAttacker.

Parameters

k (int) – Top-k results to return. Default: 12

Data Requirements

AttackAssist.SIM

Language

english

Package Requirements
  • sklearn

ECESSubstitute

class ECESSubstitute(OpenAttack.attack_assist.substitute.char.CharSubstitute)
__init__()

Returns the chars that are visually similar to the input.

ECES substitute used in VIPERAttacker.

Data Requirements

AttackAssist.SIM

Language

english