@tkipf tkipf commented Jul 27, 2015

The following (admittedly artificial) example shows a potential flaw in the current masking approach:

import numpy as np
from keras.models import Sequential
from keras.layers.core import Masking
from keras.layers.recurrent import SimpleRNN

X = np.asarray([[[1e+30]]])
y = np.ones((1, 1, 1))

model = Sequential()
model.add(Masking(mask_value=1e+30))
model.add(SimpleRNN(1, 1, init='one', activation='relu', return_sequences=True))
model.compile(loss='mse', optimizer='sgd')

logs = model.fit(X, y, nb_epoch=3)

This produces the following output:

Epoch 0
1/1 [==============================] - 0s - loss: inf
Epoch 1
1/1 [==============================] - 0s - loss: nan
Epoch 2
1/1 [==============================] - 0s - loss: nan

Explanation: In all recurrent layers, the input is multiplied by a matrix (usually W) before the time dimension is evaluated (Theano scan). The mask is currently only applied within the Theano scan loop, so this first multiplication is unmasked. This works fine as long as 0 is chosen as the mask_value, but for other values it results in unwanted behavior, as shown here for the extreme case of mask_value=1e+30.
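
A minimal NumPy sketch of the problem (the variable names here are illustrative, not the actual Keras internals): the supposedly masked input still participates in the input-to-hidden product before the mask can zero its contribution.

```python
import numpy as np

mask_value = 1e30
x = np.array([[mask_value]])   # a "masked" timestep
W = np.ones((1, 1))            # corresponds to init='one'

# The input-to-hidden product happens BEFORE the scan loop applies the mask:
pre_activation = x @ W         # ~1e30, the mask comes too late to help
h = np.maximum(pre_activation, 0.0)   # relu

# MSE against the target of 1.0 is astronomically large; in float32
# arithmetic it overflows to inf, matching the logged loss above.
loss = np.mean((h - 1.0) ** 2)
```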

Fix: By replacing all values that are to be masked with 0 in the output of the Masking layer, everything works as expected again.
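
The fix can be sketched as follows (a simplified stand-in for the Masking layer's output step, not the actual Keras code): timesteps whose features all equal mask_value are zeroed out, so the later unmasked matrix product contributes nothing for them.

```python
import numpy as np

def masked_output(x, mask_value):
    # A timestep is masked when every feature equals mask_value.
    keep = np.any(x != mask_value, axis=-1, keepdims=True)
    # Masked timesteps are replaced by 0, so x @ W is 0 for them
    # regardless of which mask_value the user chose.
    return x * keep

x = np.asarray([[[1e30], [2.0]]])   # first timestep masked, second kept
out = masked_output(x, 1e30)
```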

fchollet added a commit that referenced this pull request Jul 31, 2015
Proper handling of output values in Masking layer
@fchollet fchollet merged commit 3bf5340 into keras-team:master Jul 31, 2015
hubingallin pushed a commit to hubingallin/keras that referenced this pull request Sep 22, 2023