Closed
36 commits
f8287f0
custom data loader
ArshdeepSekhon Oct 29, 2020
bb1e021
custom textattack dataset from local files or in memory using hugging…
ArshdeepSekhon Oct 30, 2020
9195e1e
load user dataset from local files and convert to TextAttack dataset …
ArshdeepSekhon Nov 4, 2020
c1bd607
load user dataset from local files and convert to TextAttack dataset …
ArshdeepSekhon Nov 4, 2020
157bd21
load user dataset from local files and convert to textattack dataset …
ArshdeepSekhon Nov 4, 2020
3edd74b
load user dataset from local files and convert to textattack dataset …
ArshdeepSekhon Nov 4, 2020
29b0d9a
custom dataset: add attribute error
ArshdeepSekhon Nov 4, 2020
ea15f9a
custom dataset: remove stray prints
ArshdeepSekhon Nov 4, 2020
34b02ec
fix output column for custom dataset
ArshdeepSekhon Nov 4, 2020
af379af
custom dataset: add support for dict
ArshdeepSekhon Nov 4, 2020
6e07bd5
custom dataset: checks
ArshdeepSekhon Nov 4, 2020
2105de2
option to test on entire dataset
ArshdeepSekhon Oct 22, 2020
5f9a4c2
eval on entire dataset, checks
ArshdeepSekhon Oct 22, 2020
f238449
fix failed checks
ArshdeepSekhon Oct 22, 2020
2f00e33
custom data loader
ArshdeepSekhon Oct 29, 2020
793dbe0
custom textattack dataset from local files or in memory using hugging…
ArshdeepSekhon Oct 30, 2020
ae1c1f0
load user dataset from local files and convert to TextAttack dataset …
ArshdeepSekhon Nov 4, 2020
799f29e
load user dataset from local files and convert to TextAttack dataset …
ArshdeepSekhon Nov 4, 2020
97ea615
load user dataset from local files and convert to textattack dataset …
ArshdeepSekhon Nov 4, 2020
6172e24
load user dataset from local files and convert to textattack dataset …
ArshdeepSekhon Nov 4, 2020
d3e4269
custom dataset: add attribute error
ArshdeepSekhon Nov 4, 2020
92a54a5
custom dataset: remove stray prints
ArshdeepSekhon Nov 4, 2020
7b167ca
fix output column for custom dataset
ArshdeepSekhon Nov 4, 2020
601371d
custom dataset: add support for dict
ArshdeepSekhon Nov 4, 2020
9d0ed54
custom dataset: checks
ArshdeepSekhon Nov 4, 2020
12aab83
skeleton code for custom dataset
ArshdeepSekhon Nov 24, 2020
474bfa7
Merge branch 'custom_dataset' of https://github.com/ArshdeepSekhon/Te…
ArshdeepSekhon Nov 24, 2020
7f746d1
add utils for reading from files
ArshdeepSekhon Nov 25, 2020
7d91be2
add support for reading from csv, df, txt
ArshdeepSekhon Nov 25, 2020
7d2f976
fix format errors
ArshdeepSekhon Dec 4, 2020
9222066
update the confusing word "Successes" to "True Positive/Positive"
qiyanjun Dec 4, 2020
5c172b2
update the confusing uses of "Successes" to "True Positive/Positive"
qiyanjun Dec 4, 2020
11d2930
Merge branch 'master' into custom_dataset
ArshdeepSekhon Dec 4, 2020
36c83b3
black,isort formatting
ArshdeepSekhon Dec 4, 2020
f6fb8c5
Update dataset.py
qiyanjun Dec 5, 2020
41c5ef5
fix a wrong typo
qiyanjun Dec 5, 2020
134 changes: 67 additions & 67 deletions docs/3recipes/models.md
@@ -50,26 +50,26 @@ All evaluations shown are on the full validation or test set up to 1000 examples

- AG News (`lstm-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- - Successes: 914/1000
+ - True Positive/Positive: 914/1000
- Accuracy: 91.4%
- IMDB (`lstm-imdb`)
- `datasets` dataset `imdb`, split `test`
- - Successes: 883/1000
+ - True Positive/Positive: 883/1000
- Accuracy: 88.30%
- Movie Reviews [Rotten Tomatoes] (`lstm-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- - Successes: 807/1000
+ - True Positive/Positive: 807/1000
- Accuracy: 80.70%
- `datasets` dataset `rotten_tomatoes`, split `test`
- - Successes: 781/1000
+ - True Positive/Positive: 781/1000
- Accuracy: 78.10%
- SST-2 (`lstm-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- - Successes: 737/872
+ - True Positive/Positive: 737/872
- Accuracy: 84.52%
- Yelp Polarity (`lstm-yelp`)
- `datasets` dataset `yelp_polarity`, split `test`
- - Successes: 922/1000
+ - True Positive/Positive: 922/1000
- Accuracy: 92.20%

</section>
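Each `Accuracy` line in these entries appears to be the count on the renamed line divided by the number of evaluated examples, expressed as a percentage. The sketch below illustrates that arithmetic on three of the LSTM entries. The helper name `accuracy_percent` is hypothetical and is not part of TextAttack.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return correct / total as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 2)


# (correct, total) counts copied from the LSTM entries above.
lstm_counts = {
    "lstm-ag-news": (914, 1000),
    "lstm-imdb": (883, 1000),
    "lstm-sst2": (737, 872),
}

for name, (correct, total) in lstm_counts.items():
    print(f"{name}: {correct}/{total} = {accuracy_percent(correct, total):.2f}%")
```

Running the same check over every entry is a quick way to spot a count and a percentage that disagree.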
@@ -81,26 +81,26 @@ All evaluations shown are on the full validation or test set up to 1000 examples

- AG News (`cnn-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- - Successes: 910/1000
+ - True Positive/Positive: 910/1000
- Accuracy: 91.00%
- IMDB (`cnn-imdb`)
- `datasets` dataset `imdb`, split `test`
- - Successes: 863/1000
+ - True Positive/Positive: 863/1000
- Accuracy: 86.30%
- Movie Reviews [Rotten Tomatoes] (`cnn-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- - Successes: 794/1000
+ - True Positive/Positive: 794/1000
- Accuracy: 79.40%
- `datasets` dataset `rotten_tomatoes`, split `test`
- - Successes: 768/1000
+ - True Positive/Positive: 768/1000
- Accuracy: 76.80%
- SST-2 (`cnn-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- - Successes: 721/872
+ - True Positive/Positive: 721/872
- Accuracy: 82.68%
- Yelp Polarity (`cnn-yelp`)
- `datasets` dataset `yelp_polarity`, split `test`
- - Successes: 913/1000
+ - True Positive/Positive: 913/1000
- Accuracy: 91.30%

</section>
@@ -112,50 +112,50 @@ All evaluations shown are on the full validation or test set up to 1000 examples

- AG News (`albert-base-v2-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- - Successes: 943/1000
+ - True Positive/Positive: 943/1000
- Accuracy: 94.30%
- CoLA (`albert-base-v2-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- - Successes: 829/1000
+ - True Positive/Positive: 829/1000
- Accuracy: 82.90%
- IMDB (`albert-base-v2-imdb`)
- `datasets` dataset `imdb`, split `test`
- - Successes: 913/1000
+ - True Positive/Positive: 913/1000
- Accuracy: 91.30%
- Movie Reviews [Rotten Tomatoes] (`albert-base-v2-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- - Successes: 882/1000
+ - True Positive/Positive: 882/1000
- Accuracy: 88.20%
- `datasets` dataset `rotten_tomatoes`, split `test`
- - Successes: 851/1000
+ - True Positive/Positive: 851/1000
- Accuracy: 85.10%
- Quora Question Pairs (`albert-base-v2-qqp`)
- `datasets` dataset `glue`, subset `qqp`, split `validation`
- - Successes: 914/1000
+ - True Positive/Positive: 914/1000
- Accuracy: 91.40%
- Recognizing Textual Entailment (`albert-base-v2-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- - Successes: 211/277
+ - True Positive/Positive: 211/277
- Accuracy: 76.17%
- SNLI (`albert-base-v2-snli`)
- `datasets` dataset `snli`, split `test`
- - Successes: 883/1000
+ - True Positive/Positive: 883/1000
- Accuracy: 88.30%
- SST-2 (`albert-base-v2-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- - Successes: 807/872
+ - True Positive/Positive: 807/872
- Accuracy: 92.55%)
- STS-b (`albert-base-v2-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.9041359738552746
- Spearman correlation: 0.8995912861209745
- WNLI (`albert-base-v2-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- - Successes: 42/71
+ - True Positive/Positive: 42/71
- Accuracy: 59.15%
- Yelp Polarity (`albert-base-v2-yelp`)
- `datasets` dataset `yelp_polarity`, split `test`
- - Successes: 963/1000
+ - True Positive/Positive: 963/1000
- Accuracy: 96.30%

</section>
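The STS-b entries report Pearson and Spearman correlations instead of a count, because STS-b is a regression task. As a rough illustration of what those two numbers measure, here is a dependency-free sketch. The function names and the toy scores are hypothetical, and this is not the evaluation code that produced the figures above.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy similarity scores (made up for illustration only).
gold = [0.0, 1.5, 3.0, 4.5, 5.0]
predicted = [0.2, 1.0, 3.5, 4.0, 4.9]
print(pearson(gold, predicted), spearman(gold, predicted))
```

Pearson rewards predictions that are linearly related to the gold scores, while Spearman only looks at whether the ordering is preserved, which is why the two values in each entry are close but not identical.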
@@ -166,62 +166,62 @@ All evaluations shown are on the full validation or test set up to 1000 examples

- AG News (`bert-base-uncased-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- - Successes: 942/1000
+ - True Positive/Positive: 942/1000
- Accuracy: 94.20%
- CoLA (`bert-base-uncased-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- - Successes: 812/1000
+ - True Positive/Positive: 812/1000
- Accuracy: 81.20%
- IMDB (`bert-base-uncased-imdb`)
- `datasets` dataset `imdb`, split `test`
- - Successes: 919/1000
+ - True Positive/Positive: 919/1000
- Accuracy: 91.90%
- MNLI matched (`bert-base-uncased-mnli`)
- `datasets` dataset `glue`, subset `mnli`, split `validation_matched`
- - Successes: 840/1000
+ - True Positive/Positive: 840/1000
- Accuracy: 84.00%
- Movie Reviews [Rotten Tomatoes] (`bert-base-uncased-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- - Successes: 876/1000
+ - True Positive/Positive: 876/1000
- Accuracy: 87.60%
- `datasets` dataset `rotten_tomatoes`, split `test`
- - Successes: 838/1000
+ - True Positive/Positive: 838/1000
- Accuracy: 83.80%
- MRPC (`bert-base-uncased-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- - Successes: 358/408
+ - True Positive/Positive: 358/408
- Accuracy: 87.75%
- QNLI (`bert-base-uncased-qnli`)
- `datasets` dataset `glue`, subset `qnli`, split `validation`
- - Successes: 904/1000
+ - True Positive/Positive: 904/1000
- Accuracy: 90.40%
- Quora Question Pairs (`bert-base-uncased-qqp`)
- `datasets` dataset `glue`, subset `qqp`, split `validation`
- - Successes: 924/1000
+ - True Positive/Positive: 924/1000
- Accuracy: 92.40%
- Recognizing Textual Entailment (`bert-base-uncased-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- - Successes: 201/277
+ - True Positive/Positive: 201/277
- Accuracy: 72.56%
- SNLI (`bert-base-uncased-snli`)
- `datasets` dataset `snli`, split `test`
- - Successes: 894/1000
+ - True Positive/Positive: 894/1000
- Accuracy: 89.40%
- SST-2 (`bert-base-uncased-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- - Successes: 806/872
+ - True Positive/Positive: 806/872
- Accuracy: 92.43%)
- STS-b (`bert-base-uncased-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.8775458937815515
- Spearman correlation: 0.8773251339980935
- WNLI (`bert-base-uncased-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- - Successes: 40/71
+ - True Positive/Positive: 40/71
- Accuracy: 56.34%
- Yelp Polarity (`bert-base-uncased-yelp`)
- `datasets` dataset `yelp_polarity`, split `test`
- - Successes: 963/1000
+ - True Positive/Positive: 963/1000
- Accuracy: 96.30%

</section>
@@ -233,23 +233,23 @@ All evaluations shown are on the full validation or test set up to 1000 examples

- CoLA (`distilbert-base-cased-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- - Successes: 786/1000
+ - True Positive/Positive: 786/1000
- Accuracy: 78.60%
- MRPC (`distilbert-base-cased-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- - Successes: 320/408
+ - True Positive/Positive: 320/408
- Accuracy: 78.43%
- Quora Question Pairs (`distilbert-base-cased-qqp`)
- `datasets` dataset `glue`, subset `qqp`, split `validation`
- - Successes: 908/1000
+ - True Positive/Positive: 908/1000
- Accuracy: 90.80%
- SNLI (`distilbert-base-cased-snli`)
- `datasets` dataset `snli`, split `test`
- - Successes: 861/1000
+ - True Positive/Positive: 861/1000
- Accuracy: 86.10%
- SST-2 (`distilbert-base-cased-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- - Successes: 785/872
+ - True Positive/Positive: 785/872
- Accuracy: 90.02%)
- STS-b (`distilbert-base-cased-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
@@ -264,39 +264,39 @@ All evaluations shown are on the full validation or test set up to 1000 examples

- AG News (`distilbert-base-uncased-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- - Successes: 944/1000
+ - True Positive/Positive: 944/1000
- Accuracy: 94.40%
- CoLA (`distilbert-base-uncased-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- - Successes: 786/1000
+ - True Positive/Positive: 786/1000
- Accuracy: 78.60%
- IMDB (`distilbert-base-uncased-imdb`)
- `datasets` dataset `imdb`, split `test`
- - Successes: 903/1000
+ - True Positive/Positive: 903/1000
- Accuracy: 90.30%
- MNLI matched (`distilbert-base-uncased-mnli`)
- `datasets` dataset `glue`, subset `mnli`, split `validation_matched`
- - Successes: 817/1000
+ - True Positive/Positive: 817/1000
- Accuracy: 81.70%
- MRPC (`distilbert-base-uncased-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- - Successes: 350/408
+ - True Positive/Positive: 350/408
- Accuracy: 85.78%
- QNLI (`distilbert-base-uncased-qnli`)
- `datasets` dataset `glue`, subset `qnli`, split `validation`
- - Successes: 860/1000
+ - True Positive/Positive: 860/1000
- Accuracy: 86.00%
- Recognizing Textual Entailment (`distilbert-base-uncased-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- - Successes: 180/277
+ - True Positive/Positive: 180/277
- Accuracy: 64.98%
- STS-b (`distilbert-base-uncased-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.8421540899520146
- Spearman correlation: 0.8407155030382939
- WNLI (`distilbert-base-uncased-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- - Successes: 40/71
+ - True Positive/Positive: 40/71
- Accuracy: 56.34%

</section>
@@ -307,46 +307,46 @@ All evaluations shown are on the full validation or test set up to 1000 examples

- AG News (`roberta-base-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- - Successes: 947/1000
+ - True Positive/Positive: 947/1000
- Accuracy: 94.70%
- CoLA (`roberta-base-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- - Successes: 857/1000
+ - True Positive/Positive: 857/1000
- Accuracy: 85.70%
- IMDB (`roberta-base-imdb`)
- `datasets` dataset `imdb`, split `test`
- - Successes: 941/1000
+ - True Positive/Positive: 941/1000
- Accuracy: 94.10%
- Movie Reviews [Rotten Tomatoes] (`roberta-base-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- - Successes: 899/1000
+ - True Positive/Positive: 899/1000
- Accuracy: 89.90%
- `datasets` dataset `rotten_tomatoes`, split `test`
- - Successes: 883/1000
+ - True Positive/Positive: 883/1000
- Accuracy: 88.30%
- MRPC (`roberta-base-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- - Successes: 371/408
+ - True Positive/Positive: 371/408
- Accuracy: 91.18%
- QNLI (`roberta-base-qnli`)
- `datasets` dataset `glue`, subset `qnli`, split `validation`
- - Successes: 917/1000
+ - True Positive/Positive: 917/1000
- Accuracy: 91.70%
- Recognizing Textual Entailment (`roberta-base-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- - Successes: 217/277
+ - True Positive/Positive: 217/277
- Accuracy: 78.34%
- SST-2 (`roberta-base-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- - Successes: 820/872
+ - True Positive/Positive: 820/872
- Accuracy: 94.04%)
- STS-b (`roberta-base-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.906067852162708
- Spearman correlation: 0.9025045272903051
- WNLI (`roberta-base-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- - Successes: 40/71
+ - True Positive/Positive: 40/71
- Accuracy: 56.34%

</section>
@@ -357,34 +357,34 @@ All evaluations shown are on the full validation or test set up to 1000 examples

- CoLA (`xlnet-base-cased-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- - Successes: 800/1000
+ - True Positive/Positive: 800/1000
- Accuracy: 80.00%
- IMDB (`xlnet-base-cased-imdb`)
- `datasets` dataset `imdb`, split `test`
- - Successes: 957/1000
+ - True Positive/Positive: 957/1000
- Accuracy: 95.70%
- Movie Reviews [Rotten Tomatoes] (`xlnet-base-cased-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- - Successes: 908/1000
+ - True Positive/Positive: 908/1000
- Accuracy: 90.80%
- `datasets` dataset `rotten_tomatoes`, split `test`
- - Successes: 876/1000
+ - True Positive/Positive: 876/1000
- Accuracy: 87.60%
- MRPC (`xlnet-base-cased-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- - Successes: 363/408
+ - True Positive/Positive: 363/408
- Accuracy: 88.97%
- Recognizing Textual Entailment (`xlnet-base-cased-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- - Successes: 196/277
+ - True Positive/Positive: 196/277
- Accuracy: 70.76%
- STS-b (`xlnet-base-cased-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.883111673280641
- Spearman correlation: 0.8773439961182335
- WNLI (`xlnet-base-cased-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- - Successes: 41/71
+ - True Positive/Positive: 41/71
- Accuracy: 57.75%

</section>
3 changes: 3 additions & 0 deletions sample.txt
@@ -0,0 +1,3 @@
+0 Hi there
+1 Nope
+2 Whatever
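The commit list mentions reading custom datasets from txt files, and `sample.txt` looks like a small fixture for that. The sketch below shows one way such lines could be turned into `(text, label)` pairs. It assumes the leading integer is a label separated from the text by whitespace; the diff does not confirm this, and the integer could just as well be a row index. The function name `parse_labeled_lines` is hypothetical and is not the loader added in this pull request.

```python
from typing import Iterable, List, Tuple


def parse_labeled_lines(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Parse '<int> <text>' lines into (text, label) pairs.

    Assumption: the first whitespace-separated token is an integer label.
    """
    examples = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"expected '<label> <text>', got: {raw!r}")
        label, text = int(parts[0]), parts[1]
        examples.append((text, label))
    return examples


# The three lines added to sample.txt in this diff.
sample = ["0 Hi there", "1 Nope", "2 Whatever"]
print(parse_labeled_lines(sample))
```

Returning `(text, label)` tuples keeps the result easy to wrap in whatever dataset class the caller uses; a real loader would also need to handle multi-column inputs and non-integer labels.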