Commit 85f8071

dbonadiman authored and fchollet committed
Max Over Time in imdb_cnn.py (keras-team#2320)

* Max Over Time in imdb_cnn.py

Following issue keras-team#2296, I propose this PR. The major optimisations, apart from the Max Over Time pooling, are:

- Dropout in the Embedding layer.
- Longer input sequences (400 instead of 100), made possible by the speedup from the Max Over Time pooling.
- The Adam optimizer.

Overall it takes 90 to 100 seconds per epoch on my laptop CPU, and in two epochs it reaches 0.885 accuracy, a 5-point improvement over the previous implementation. Moreover, it requires less memory (300k parameters vs 3M+), since the number of parameters no longer depends on the length of the input sequence.

* Update imdb_cnn.py
1 parent 2cc9ebf commit 85f8071
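The "300k parameters vs 3M+" claim can be checked with back-of-the-envelope arithmetic. The sketch below is not part of the commit; it simply recomputes the parameter counts of both architectures from the hyperparameters quoted in the diff (old: maxlen=100, embedding_dims=100, MaxPooling1D + Flatten; new: maxlen=400, embedding_dims=50, max-over-time pooling).

```python
def conv1d_params(filter_length, in_dim, nb_filter):
    # weights + biases of a 1D convolution
    return filter_length * in_dim * nb_filter + nb_filter

def dense_params(in_dim, out_dim):
    # weights + biases of a fully connected layer
    return in_dim * out_dim + out_dim

max_features = 5000

# old model: conv output length is maxlen - filter_length + 1 ('valid'),
# halved by MaxPooling1D(pool_length=2), then flattened into the Dense layer
old_pooled = (100 - 3 + 1) // 2
old_total = (max_features * 100                     # Embedding
             + conv1d_params(3, 100, 250)           # Convolution1D
             + dense_params(old_pooled * 250, 250)  # Dense after Flatten
             + dense_params(250, 1))                # output layer

# new model: max-over-time collapses the time axis, so the Dense layer
# sees only nb_filter inputs, regardless of maxlen
new_total = (max_features * 50
             + conv1d_params(3, 50, 250)
             + dense_params(250, 250)
             + dense_params(250, 1))

print(old_total, new_total)  # 3638251 350751
```

The flattened Dense layer alone accounts for over 3M of the old model's parameters, which is why removing the dependence on sequence length shrinks the model by roughly an order of magnitude.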

File tree

1 file changed: +21 −14 lines changed

1 file changed

+21
-14
lines changed

examples/imdb_cnn.py

Lines changed: 21 additions & 14 deletions
@@ -1,6 +1,9 @@
 '''This example demonstrates the use of Convolution1D for text classification.
 
-Gets to 0.835 test accuracy after 2 epochs. 100s/epoch on K520 GPU.
+Gets to 0.88 test accuracy after 2 epochs.
+90s/epoch on Intel i5 2.4Ghz CPU.
+10s/epoch on Tesla K40 GPU.
+
 '''
 
 from __future__ import print_function
@@ -9,17 +12,18 @@
 
 from keras.preprocessing import sequence
 from keras.models import Sequential
-from keras.layers.core import Dense, Dropout, Activation, Flatten
+from keras.layers.core import Dense, Dropout, Activation, Lambda
 from keras.layers.embeddings import Embedding
-from keras.layers.convolutional import Convolution1D, MaxPooling1D
+from keras.layers.convolutional import Convolution1D
 from keras.datasets import imdb
+from keras import backend as K
 
 
 # set parameters:
 max_features = 5000
-maxlen = 100
+maxlen = 400
 batch_size = 32
-embedding_dims = 100
+embedding_dims = 50
 nb_filter = 250
 filter_length = 3
 hidden_dims = 250
@@ -42,8 +46,10 @@
 
 # we start off with an efficient embedding layer which maps
 # our vocab indices into embedding_dims dimensions
-model.add(Embedding(max_features, embedding_dims, input_length=maxlen))
-model.add(Dropout(0.25))
+model.add(Embedding(max_features,
+                    embedding_dims,
+                    input_length=maxlen,
+                    dropout=0.2))
 
 # we add a Convolution1D, which will learn nb_filter
 # word group filters of size filter_length:
@@ -52,24 +58,25 @@
                         border_mode='valid',
                         activation='relu',
                         subsample_length=1))
-# we use standard max pooling (halving the output of the previous layer):
-model.add(MaxPooling1D(pool_length=2))
 
-# We flatten the output of the conv layer,
-# so that we can add a vanilla dense layer:
-model.add(Flatten())
+# we use max over time pooling by defining a python function to use
+# in a Lambda layer
+def max_1d(X):
+    return K.max(X, axis=1)
+
+model.add(Lambda(max_1d, output_shape=(nb_filter,)))
 
 # We add a vanilla hidden layer:
 model.add(Dense(hidden_dims))
-model.add(Dropout(0.25))
+model.add(Dropout(0.2))
 model.add(Activation('relu'))
 
 # We project onto a single unit output layer, and squash it with a sigmoid:
 model.add(Dense(1))
 model.add(Activation('sigmoid'))
 
 model.compile(loss='binary_crossentropy',
-              optimizer='rmsprop',
+              optimizer='adam',
               metrics=['accuracy'])
 model.fit(X_train, y_train,
           batch_size=batch_size,
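The core of the change is the max_1d Lambda. A minimal NumPy sketch (mine, not part of the commit) shows why the downstream Dense layer no longer depends on maxlen: taking the maximum over the time axis, which is what K.max(X, axis=1) does, yields a fixed-size (nb_filter,) vector for any input sequence length.

```python
import numpy as np

def max_over_time(x):
    # x: (batch, timesteps, nb_filter) -> (batch, nb_filter)
    # keeps, per filter, its strongest activation across all time steps
    return x.max(axis=1)

# same batch size and filter count, different sequence lengths
x_short = np.random.rand(2, 100, 250)  # maxlen = 100
x_long = np.random.rand(2, 400, 250)   # maxlen = 400

print(max_over_time(x_short).shape)  # (2, 250)
print(max_over_time(x_long).shape)   # (2, 250) -- same, regardless of length
```

Later Keras versions expose this operation directly as the built-in GlobalMaxPooling1D layer, making the Lambda workaround unnecessary.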

Comments (0)