New metric module to improve flexibility and intuitiveness - moved from #475 #514
Merged
19 commits
c0e2993 [CODE] Transfer from #475 + new structure
da4ca68 [CODE] Changing metric to resemble constraints structure
46781b1 [CODE] Add code for perplexity
5a88253 [CODE] Fixing init file
4a94be9 [CODE] Add command-line option for quality metrics
5e929f2 [FIX] Import order+metrics import (sanchit97)
a1b2c5b [CODE] New USE metric WIP (sanchit97)
0baa502 [FIX] Working USE (sanchit97)
10ee24b [FIX] Change structure of Metric mdl (sanchit97)
32c3e43 [CODE] Fix metrics, add tests (sanchit97)
559057e [CODE] Fix black on use (sanchit97)
aab7eec [CODE] Fix isort on use (sanchit97)
b5a1209 [CODE] Add new help msg (sanchit97)
aa1ad15 [CODE] Add new docs (sanchit97)
d44a54c [CODE] Fix print (sanchit97)
eab1cd0 [CODE] Fix black (sanchit97)
3e2b16f [FIX] Fix perplexity precision (sanchit97)
f1ef471 [FIX] Fix perplexity escape seq (sanchit97)
fa9817a fix docstring issues.. (qiyanjun)
74 changes: 74 additions & 0 deletions
tests/sample_outputs/run_attack_hotflip_lstm_mr_4_adv_metrics.txt
```
@@ -0,0 +1,74 @@
/.*/Attack(
  (search_method): BeamSearch(
    (beam_width): 10
  )
  (goal_function): UntargetedClassification
  (transformation): WordSwapGradientBased(
    (top_n): 1
  )
  (constraints):
    (0): MaxWordsPerturbed(
      (max_num_words): 2
      (compare_against_original): True
    )
    (1): WordEmbeddingDistance(
      (embedding): WordEmbedding
      (min_cos_sim): 0.8
      (cased): False
      (include_unknown_words): True
      (compare_against_original): True
    )
    (2): PartOfSpeech(
      (tagger_type): nltk
      (tagset): universal
      (allow_verb_noun_swap): True
      (compare_against_original): True
    )
    (3): RepeatModification
    (4): StopwordModification
  (is_black_box): False
)

--------------------------------------------- Result 1 ---------------------------------------------
[[Positive (96%)]] --> [[Negative (77%)]]

the story gives ample opportunity for large-scale action and suspense , which director shekhar kapur [[supplies]] with tremendous skill .

the story gives ample opportunity for large-scale action and suspense , which director shekhar kapur [[stagnated]] with tremendous skill .


--------------------------------------------- Result 2 ---------------------------------------------
[[Negative (57%)]] --> [[[SKIPPED]]]

red dragon " never cuts corners .


--------------------------------------------- Result 3 ---------------------------------------------
[[Positive (51%)]] --> [[[FAILED]]]

fresnadillo has something serious to say about the ways in which extravagant chance can distort our perspective and throw us off the path of good sense .


--------------------------------------------- Result 4 ---------------------------------------------
[[Positive (89%)]] --> [[[FAILED]]]

throws in enough clever and unexpected twists to make the formula feel fresh .



+-------------------------------+--------+
| Attack Results                |        |
+-------------------------------+--------+
| Number of successful attacks: | 1      |
| Number of failed attacks:     | 2      |
| Number of skipped attacks:    | 1      |
| Original accuracy:            | 75.0%  |
| Accuracy under attack:        | 50.0%  |
| Attack success rate:          | 33.33% |
| Average perturbed word %:     | 5.56%  |
| Average num. words per input: | 15.5   |
| Avg num queries:              | 1.33   |
| Average Original Perplexity:  | 291.47 |
| Average Attack Perplexity:    | 320.33 |
| Average Attack USE Score:     | 0.91   |
+-------------------------------+--------+
```
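The percentages in the summary table above are consistent with simple ratios of the three attack counts. The sketch below is an assumption about how those counts combine, not code taken from this PR; the function name `attack_summary` and its dictionary keys are hypothetical.

```python
def attack_summary(successful: int, failed: int, skipped: int) -> dict:
    """Derive summary percentages from attack counts (hypothetical helper).

    Assumes a skipped attack means the model already misclassified the
    original input, so it counts against original accuracy.
    """
    total = successful + failed + skipped
    attempted = successful + failed  # inputs the model originally got right
    if total == 0 or attempted == 0:
        raise ValueError("need at least one attempted attack")
    return {
        # share of inputs classified correctly before any attack
        "original_accuracy": round(100.0 * attempted / total, 2),
        # share of inputs still classified correctly after the attack
        "accuracy_under_attack": round(100.0 * failed / total, 2),
        # share of attempted attacks that flipped the prediction
        "attack_success_rate": round(100.0 * successful / attempted, 2),
    }


# Counts from the sample output above: 1 success, 2 failures, 1 skipped.
print(attack_summary(1, 2, 1))
```

With the counts 1, 2 and 1 this reproduces the 75.0%, 50.0% and 33.33% rows of the table.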
68 changes: 68 additions & 0 deletions
tests/sample_outputs/run_attack_transformers_datasets_adv_metrics.txt
```
@@ -0,0 +1,68 @@
/.*/Attack(
  (search_method): GreedyWordSwapWIR(
    (wir_method): unk
  )
  (goal_function): UntargetedClassification
  (transformation): CompositeTransformation(
    (0): WordSwapNeighboringCharacterSwap(
      (random_one): True
    )
    (1): WordSwapRandomCharacterSubstitution(
      (random_one): True
    )
    (2): WordSwapRandomCharacterDeletion(
      (random_one): True
    )
    (3): WordSwapRandomCharacterInsertion(
      (random_one): True
    )
  )
  (constraints):
    (0): LevenshteinEditDistance(
      (max_edit_distance): 30
      (compare_against_original): True
    )
    (1): RepeatModification
    (2): StopwordModification
  (is_black_box): True
)

--------------------------------------------- Result 1 ---------------------------------------------
[[Negative (100%)]] --> [[Positive (71%)]]

[[hide]] [[new]] secretions from the parental units

[[Ehide]] [[enw]] secretions from the parental units


--------------------------------------------- Result 2 ---------------------------------------------
[[Negative (100%)]] --> [[[FAILED]]]

contains no wit , only labored gags


--------------------------------------------- Result 3 ---------------------------------------------
[[Positive (100%)]] --> [[Negative (96%)]]

that [[loves]] its characters and communicates [[something]] [[rather]] [[beautiful]] about human nature

that [[lodes]] its characters and communicates [[somethNng]] [[rathrer]] [[beautifdul]] about human nature



+-------------------------------+---------+
| Attack Results                |         |
+-------------------------------+---------+
| Number of successful attacks: | 2       |
| Number of failed attacks:     | 1       |
| Number of skipped attacks:    | 0       |
| Original accuracy:            | 100.0%  |
| Accuracy under attack:        | 33.33%  |
| Attack success rate:          | 66.67%  |
| Average perturbed word %:     | 30.95%  |
| Average num. words per input: | 8.33    |
| Avg num queries:              | 22.67   |
| Average Original Perplexity:  | 1126.57 |
| Average Attack Perplexity:    | 2823/.*/|
| Average Attack USE Score:     | 0.76    |
+-------------------------------+---------+
```
The perplexity numbers here seem abnormally high (should be less than 100).
I have doubts about this too.
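For reference while checking these numbers, here is a minimal sketch of the textbook definition of perplexity: the exponential of the average negative log-likelihood per token. It is not the implementation from this PR, and the function name `perplexity` and its input format (a list of natural-log token probabilities) are assumptions made for illustration. Whether values in the hundreds or thousands point to a bug depends on the language model used and on how the log-probabilities are aggregated, neither of which is shown on this page.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity from per-token natural-log probabilities (hypothetical helper).

    PPL = exp(-(1/N) * sum(log p_i)), where N is the number of tokens.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Sanity check: if every token has probability 1/4, perplexity is 4.
uniform = [math.log(0.25)] * 5
print(perplexity(uniform))
```

A quick diagnostic along these lines is to confirm that the loss is averaged over tokens rather than summed, since exponentiating a summed loss inflates the result sharply.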