@Hanyu-Liu-123 Hanyu-Liu-123 commented Dec 3, 2021

What does this PR do?

Summary

This PR fixes #589, in which the sentence encoder raises StopIteration when the newly_modified_indices of the attacked text is an empty set. This happens when the new words swapped, inserted, or merged by the CLARE transformations do not contain any letters. Our current implementation records an index in newly_modified_indices only when the new word at that index is an actual word, not a symbol. An example is shown below:

Suppose we have the sentence

lovingly photographed in the manner of a golden book sprung to life , stuart little 2 manages sweetness largely without stickiness.

When we apply WordMergeMaskedLM to it, one of the transformations returned is

lovingly photographed in the manner of a golden book sprung to life , stuart little 2), largely without stickiness .

in which the phrase "manages sweetness" is replaced by the symbols "),".

Because we omit symbols, this change does not appear in the newly_modified_indices of the transformed sentence, which is therefore empty. The sentence encoder, however, requires newly_modified_indices to contain at least one element, so it raises StopIteration.
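The failure mode can be reproduced in isolation. The sketch below assumes the encoder picks one modified index with `next(iter(...))`, which is an assumption made for illustration and not something stated in this description. `pick_modified_index` is a hypothetical helper and the index 12 is an arbitrary value.

```python
def pick_modified_index(newly_modified_indices):
    """Hypothetical helper: take any one element from a set of indices.

    Calling next() on an iterator over an empty set raises StopIteration.
    """
    return next(iter(newly_modified_indices))


# A transformation that inserted a real word records its index.
print(pick_modified_index({12}))

# A transformation that inserted only symbols records nothing.
try:
    pick_modified_index(set())
except StopIteration:
    print("StopIteration: no modified index was recorded")
```

Passing a default, as in `next(iter(s), None)`, would avoid the exception, but the caller would still have no index to work with, so the empty set itself is the problem.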

There are two ways to fix the issue. One is to change how newly_modified_indices is recorded, which is also linked to the deletion index issue #558. That would likely be a big change, since our current implementation relies on not counting symbols as modifications. The other, as a temporary fix, is for the CLARE transformations to keep only transformations whose substituted or added words contain at least one letter. The modified indices are then recorded as normal.

Additions

Check replacement words before making the transformation. If the replacement words do not contain any letter, skip that transformation.
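A minimal sketch of such a check is below. It uses `re.search`, as the original PR title suggests, but the pattern `[a-zA-Z]`, the helper names, and the candidate list are assumptions made for illustration and are not taken from the actual diff.

```python
import re

# Assumed pattern: at least one ASCII letter anywhere in the word.
LETTER_PATTERN = re.compile(r"[a-zA-Z]")


def contains_letter(word):
    """Return True if the candidate word has at least one ASCII letter."""
    return LETTER_PATTERN.search(word) is not None


def filter_replacements(candidates):
    """Keep only the candidate replacement words that contain a letter."""
    return [word for word in candidates if contains_letter(word)]


# Hypothetical candidates proposed by a masked language model.
candidates = ["),", "achieves", "...", "2", "sweetness"]
print(filter_replacements(candidates))  # ['achieves', 'sweetness']
```

Note that an ASCII-only pattern also rejects purely numeric tokens such as "2" and words written entirely in non-Latin scripts, so a different pattern may be needed for other languages.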

@Hanyu-Liu-123 Hanyu-Liu-123 changed the title from "add re.search" to "Fix CLARE StopIteration Bug" Dec 3, 2021
@Hanyu-Liu-123 Hanyu-Liu-123 self-assigned this Dec 3, 2021
@qiyanjun qiyanjun merged commit 5f825f6 into master Dec 3, 2021
@qiyanjun qiyanjun deleted the fix-StopIteration-bug branch February 21, 2022 02:35
Development

Successfully merging this pull request may close these issues.

CLARE implementation not working

3 participants