Proper handling of output values in Masking layer #449
The following - rather artificial - example shows a potential flaw in the current masking approach:
Produces the following output:
Explanation: In all recurrent layers, the input is multiplied by some matrix (usually `W`) before the time dimension is evaluated (Theano `scan`). The mask is currently applied only inside the Theano `scan` loop, whereas this first multiplication is unmasked. This works fine as long as `0` is chosen as the `mask_value`. For other values, however, it results in unwanted behavior, as shown here for the extreme case of `mask_value=1e+30`.

Fix: By replacing all values that are to be masked with `0` in the output of the `Masking` layer, everything works as expected again.
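The idea behind the fix can be sketched in plain NumPy. This is an illustration only, not the code changed in this pull request: the helper name `masking_output`, the rule that a timestep is masked only when every feature equals `mask_value`, and the toy matrix `W` are all assumptions made for the example.

```python
import numpy as np

def masking_output(X, mask_value=0.0):
    """Hypothetical helper. X has shape (samples, timesteps, features)."""
    # Assumption: a timestep is masked only if all of its features equal mask_value.
    mask = np.any(X != mask_value, axis=-1)  # boolean, shape (samples, timesteps)
    # Overwrite masked timesteps with 0 so that any matrix product computed
    # before the time loop sees zeros instead of mask_value.
    X_zeroed = np.where(mask[..., np.newaxis], X, 0.0)
    return X_zeroed, mask

# One sample, three timesteps, two features; the middle timestep is padding.
X = np.array([[[1.0, 2.0], [1e30, 1e30], [3.0, 4.0]]])
W = np.array([[0.5], [0.25]])  # toy input weights, shape (features, units)

X_zeroed, mask = masking_output(X, mask_value=1e30)
projected = X_zeroed @ W  # stands in for the unmasked input projection
```

With the masked timestep zeroed, its projection is `0` regardless of which `mask_value` was chosen, so the later per-step masking no longer has to undo a contribution of order `1e30`.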