Convolutional layers for 3D #1623
Conversation
For the Travis test to pass, you could add:
@pytest.mark.skipif(K._BACKEND != 'theano', reason="Requires Theano backend")
Done, thanks.
|
By running the test code, I noticed that the loss value from each epoch is different when I run on GPU versus CPU. I think it could be because of a precision difference. In my test, I tried to fix the random seed.
Any idea about this difference? |
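The GPU/CPU discrepancy described above is consistent with float32 vs. float64 accumulation effects. As a rough illustration outside Keras (plain NumPy here, not the actual Theano kernels), naively accumulating the same constant at single precision drifts visibly from the double-precision result:

```python
import numpy as np

# The same literal stored at two precisions already differs slightly:
a32 = np.float32(0.1)   # ~0.10000000149
a64 = np.float64(0.1)

# Simulate naive float32 accumulation, as a GPU kernel running in
# single precision effectively does:
n = 100000
s32 = np.float32(0.0)
for _ in range(n):
    s32 = np.float32(s32 + a32)

s64 = a64 * n  # double-precision reference, 10000.0

# The accumulated float32 sum drifts noticeably from the reference.
print(float(s32), float(s64))
```

This is only a sketch of the effect; the exact loss differences seen between GPU and CPU runs also depend on the reduction order used by each backend.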
|
I think 3D operators will be interesting to have in Keras. I will do a detailed code review in the near future. For now, please:
|
|
Thanks for the response. In the real world, and in the context of CNNs, 3D data usually means a 3D volume or a 2D movie. A 3D volume is a spatial signal, and a movie is a spatio-temporal signal. Names such as "first_spatial_dim", "second_spatial_dim", etc. would not apply in the case of a movie, because the time dimension of a movie is not the "first spatial dim"; it's a temporal dim. Maybe other people have better ideas about how to define it, but until we find a good one, it would still be reasonable to use "time":
So, I would suggest we use a more specific word to make it easier to understand. If other people have better ideas on this, please discuss here. |
|
It's precise because "space" does not mean physical 3D space; it means a vector space.
On 2 February 2016 at 15:15, Will Ouyang [email protected] wrote:
|
|
+1 for @fchollet's notation. I use 3D CNNs on 3D renderings where the 3rd dimension isn't time. |
|
+1 for the "vector space" notation. This notation would be compatible with hypothetical N-d convolutions. |
|
OK, let's use the vector space notation then. Thanks, everyone. |
|
Don't you guys think "first_spatial_dim" is too long? Among the arguments we already have "nb_row" and "nb_filter", and now we would need to add "nb_first_spatial_dim" or "len_first_spatial_dim"? Alternatives are to use:
How about keeping "rows" and "cols" and adding another notation for the extra dimension? That would be more consistent with the 1D and 2D cases. An example is "z_dim": if we think of "rows" and "cols" as spanning the XY plane, then the extra dimension is Z. |
There is a reddit thread on exactly this topic: If a 2d matrix has rows and columns, what is the third dimension in a 3d matrix?. I like |
|
+1 for |
|
+1 for |
|
What distinguishes the row, column, and new dimension from the batch and filter dimensions is that the convolution filter slides across these dimensions, so wouldn't it make sense to call these dimensions |
+1 |
|
I have changed it to "conv_dim1" and "len_conv_dim1"; if you guys have a better idea, or more people vote for "slices, rows, cols", we can change it again. @fchollet I think I am done, could you please do a code review when you have time, and merge it? Otherwise, I would need to keep tracking your master repository all the time. Thanks. |
|
I'm using this PR as part of a project and it doesn't seem to be working with a fairly straightforward model design. Are you sure it's working as intended? Specifically, it seems that when I added a stride size to the 3D CNN layer, it caused dimensions to be miscalculated. Removing subsample=(2,2,2) seems to fix the problem. |
|
Thanks for reporting the problem. I just checked: strides are not implemented in conv3d2d.conv3d currently. So, for now, the only supported stride is (1,1,1), and you can subsample the result yourself (maybe with a Lambda layer). You can see the documentation here: http://deeplearning.net/software/theano/library/tensor/nnet/conv.html#theano.tensor.nnet.conv3d2d.conv3d I will put this in the docstring; sorry for the misleading behavior. |
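The workaround suggested here, computing the full stride-1 convolution and then subsampling, can be sketched with plain NumPy (a stand-in for the conv3d output tensor; an actual Lambda layer would perform the same slicing on a symbolic Theano variable):

```python
import numpy as np

# Pretend this is the output of a stride-1 3D convolution:
# shape is (samples, filters, dim1, dim2, dim3)
full_output = np.random.rand(2, 8, 10, 10, 10)

strides = (2, 2, 2)  # the subsample we actually wanted

# Emulate the stride by keeping every strides[i]-th element
# along each of the three convolved dimensions.
strided = full_output[:, :, ::strides[0], ::strides[1], ::strides[2]]

print(strided.shape)  # (2, 8, 5, 5, 5)
```

Note that this does not save any computation: the full convolution is still evaluated, and the unwanted positions are simply discarded afterwards.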
|
Right, that makes sense. You should probably note that this parameter is ignored and add a check so that this doesn't happen to others. |
|
@mdering I just added support for strides by slicing the output of conv3d. It won't change the real computation, but you should get the result you want. Tell me if you have any problems, thank you. |
|
Will that new statement be optimized away when strides are (1, 1, 1)? Otherwise it might be good to add an if statement for that case. conv3d2d uses 2D convs to create a 3D conv, so I wonder if it would be possible to implement strides efficiently inside conv3d2d by using strides in the conv2d calls. That would be a Theano PR, though. |
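The guard proposed here is just a conditional around the slicing, so the common stride-(1,1,1) case pays no extra cost. A rough sketch (NumPy stand-in with a hypothetical helper name, not the actual backend code):

```python
import numpy as np

def subsample_conv3d_output(output, strides=(1, 1, 1)):
    """Slice a (samples, filters, d1, d2, d3) tensor to emulate strides.

    Skips the slicing entirely for the stride-(1,1,1) case, so no
    extra op is added to the graph when no subsampling is needed.
    """
    if strides == (1, 1, 1):
        return output  # nothing to do; avoid a useless slicing op
    return output[:, :, ::strides[0], ::strides[1], ::strides[2]]

out = np.random.rand(1, 4, 8, 8, 8)
# Identity for the default stride, subsampled shape otherwise:
print(subsample_conv3d_output(out, (1, 1, 1)).shape)  # (1, 4, 8, 8, 8)
print(subsample_conv3d_output(out, (2, 2, 2)).shape)  # (1, 4, 4, 4, 4)
```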
|
@oeway I've tried it and it seems to be working correctly. Thanks. It would be nice to get a better fix for this upstream, since I'm running on some not-so-fantastic hardware, but in the meantime this is working great. Thanks again. |
|
Hi, me again. These layers are not currently serializable to either JSON or YAML, because the relevant part of the get_config method is missing. Specifically, you'll need to add base_config = super(Convolution3D, self).get_config() at the end of the method for this to work properly. EDIT: there's actually an additional error in the deserialization process whose cause I wasn't really able to determine. When using a deserialized model architecture, I got an error. Removing the border_mode from the parameters passed to the Theano backend seems to have done the trick. I don't anticipate this being a problem, since this parameter is always 'valid' anyway, and that is the default value. I should also add that even after all this, serialization and deserialization are still buggy. Maybe someone else can help more. |
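The get_config fix described above follows the usual Keras pattern of merging a layer's own config into its base class's config. A minimal self-contained sketch (BaseLayer and Conv3DLike are hypothetical stand-ins, not the real Keras classes):

```python
class BaseLayer(object):
    """Stand-in for the Keras base Layer class."""
    def get_config(self):
        return {'name': self.__class__.__name__}

class Conv3DLike(BaseLayer):
    """Stand-in for Convolution3D, showing only the config logic."""
    def __init__(self, nb_filter, border_mode='valid'):
        self.nb_filter = nb_filter
        self.border_mode = border_mode

    def get_config(self):
        config = {'nb_filter': self.nb_filter,
                  'border_mode': self.border_mode}
        # This is the piece that was missing: without merging in the
        # base config, serialization to JSON/YAML loses base attributes.
        base_config = super(Conv3DLike, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

print(Conv3DLike(32).get_config())
# the dict contains 'name', 'nb_filter' and 'border_mode' keys
```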
|
@mdering Thanks for reporting the problem, I just fixed it according to your suggestion. I think I found the cause: after deserializing, strings from the JSON become unicode strings:
json_string = model.to_json()
model2 = model_from_json(json_string)
print model2.layers[0].border_mode, model.layers[0].border_mode
In Python 2, isinstance(u'valid', str) is False while isinstance('valid', str) is True, which is why the deserialized border_mode is no longer recognized. However, I didn't see any problem with model_from_yaml(); do you have any problem with yaml? I have fixed the problem by passing a tuple as border_mode. Maybe we should fix the model_from_json() function in Keras to prevent further problems; for now, using yaml avoids that. In case you want to track this issue, follow #1702. Tell me if it works for you, thank you. |
|
For verification purposes, I have run the exact same experiment without the regularizer. As you can see, the weights of the filter seem a bit random and are not identical slice-wise. |
|
I did a quick check and didn't find any reason for this; I used exactly the same implementation as Convolution2D in terms of regularization. |
|
@MathiasPolfliet make sure that your input_shape is 5-dimensional |
|
@jruales You mean here?
model_p1.add(Convolution3D(nb_filter=c_Filter[0],
                           len_conv_dim1=c_Kernel[0],
                           len_conv_dim2=c_Kernel[0],
                           len_conv_dim3=c_Kernel[0],
                           init='normal',
                           W_regularizer=l2(0.4),
                           border_mode='valid',
                           input_shape=(1, c_Patch[0], c_Patch[0], c_Patch[0])))
According to the comments in the code, it should only be 4-D: |
…ng3D, UpSampling3D and ZeroPadding3D, working with theano backend)
---------------------------------------
Squashed from the following commits:
- add Convolution3D and MaxPooling3D layers
- fix 5D tensor in theano, add examples
- update conv3d, pool3d, add resize_volumes and spatial_3d_padding
- update Convolution3D, MaxPooling3D and AveragePooling3D, add UpSampling3D and ZeroPadding3D
- add test functions for Convolution3D, MaxPooling3D, AveragePooling3D, ZeroPadding3D and UpSampling3D
- small fix by changing pad_z to pad_t
- update comment
- skip some tests for tensorflow, @pytest.mark.skipif(K._BACKEND != theano, reason="Requires Theano backend")
- use autopep8 to fix the code to match pep8 coding style
- small fix (caused by autopep8)
- small fix (caused by autopep8)
- small fix (caused by autopep8)
- fixed the document string for all newly added layers
- remove the example and the dataset for 3d
- add error message for tensorflow backend
- support stride in pool3d
- rename "params" to "trainable_weights"
- change notations and docstrings for 3D layers
- fix pep8 error
- change variable name in test code
- small fix for pep8
- add error message and docstring for strides in conv3d
- fix test error caused by wrong strides in conv3d
- support strides in conv3d by slicing the output
- add if statement for stride (1,1,1)
- fix get_config according to mdering, and other small fixes
- fix model_from_json issue by passing a 3d border_mode
- fix according to jruales' review
- change docstring in Convolution3D
- delete docstring about TensorFlow
- change docstring in Convolution3D and theano_backend
---------------------------------------
Author: Wei OUYANG <[email protected]>
|
Sorry, my bad. I was thinking of the actual shape of the input tensor to the network and forgot that the sample dimension is excluded from the constructor argument.
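The point about the sample dimension can be stated concretely: the tensor fed to the network is 5-D, while the input_shape passed to the first layer drops the leading samples axis. A quick NumPy illustration (assuming 'th' dim ordering):

```python
import numpy as np

# A batch of 16 one-channel 32x32x32 volumes ('th' dim ordering):
# (samples, channels, dim1, dim2, dim3) -> 5-D
batch = np.zeros((16, 1, 32, 32, 32))

# The constructor's input_shape excludes the samples axis,
# so it is the 4-D per-sample shape only:
input_shape = batch.shape[1:]
print(input_shape)  # (1, 32, 32, 32)
```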
|
As a sanity check, could you please check the following:
Also please make sure you are using the latest version of the pull request's code |
|
@fchollet I think it's now ready to merge. The ongoing discussion shouldn't be a blocker for Convolution3D. |
|
Agreed. LGTM. |
Is someone able to reproduce these findings? If not, it'll just be a bug in my internal code... |
|
Okay, when using two layers of convolution, this problem arises in the second layer and not in the first. Can it have something to do with Flatten()? |
|
I have determined what causes the behaviour. When convolving an nxnxn patch with an nxnxn filter, the output is 1x1x1, which leads to the strange behaviour. Using larger patches or smaller kernels 'fixes' it. |
|
Convolving an nxnxn patch with an nxnxn kernel is equivalent to using a Dense layer with output size I can't think of why the strange behavior you see would appear only when using regularization. I'm wondering if it has to do with the fact that you regularize your convolutional weights but not your dense weights? Or maybe somehow your data is repeated along the third convolutional dimension? I haven't had the chance to experiment with this in code, so I'm just throwing out ideas. |
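The equivalence claimed above is easy to verify numerically: a 'valid' cross-correlation (which is what CNN layers actually compute) of an nxnxn patch with an nxnxn kernel fits at only one position, so it produces a single value, equal to the dot product of the flattened patch and kernel, i.e. one Dense unit without bias:

```python
import numpy as np

n = 4
rng = np.random.RandomState(0)
patch = rng.rand(n, n, n)
kernel = rng.rand(n, n, n)

# 'valid' cross-correlation with a same-size kernel: only one
# position fits, so the output is a single 1x1x1 value.
conv_out = np.sum(patch * kernel)

# A Dense unit (no bias) on the flattened patch computes the same thing.
dense_out = patch.ravel().dot(kernel.ravel())

print(np.allclose(conv_out, dense_out))  # True
```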
|
I tried to insert a 3D Conv layer into a Graph model but met some errors. Is it because something is not compatible with Graph? The error occurs when compiling the graph. |
|
Hi @DeeperCS, you are using the wrong argument names; you should use:
Convolution3D(nb_filter=32, kernel_dim1=3, kernel_dim2=3,
For simplicity, I would just write:
Convolution3D(32, 3, 3, 3, border_mode='same')
On Wed, Mar 9, 2016 at 2:29 PM, DeeperCS [email protected] wrote:
|
|
@oeway For the code below: I got the response below: For the other code below: I got the response below (seems to be the same error as the one I met a few minutes ago): Is there something else that I did wrong? |
|
Not sure what the problem is, but I just tried your code and it worked for me. Sorry, I am quite busy at the moment. Good luck.
On Wed, Mar 9, 2016 at 3:09 PM, DeeperCS [email protected] wrote:
|
|
@oeway Thank you so much. I used the MinhazPalasara/keras fork at the beginning, and I just checked that the 3D convolution code has been merged into the master version, so I installed the latest Keras 0.3.2 and tried again; unfortunately, these errors are still around - -! |
|
Hi, I am using Ubuntu 14.04 LTS; here is some other information:
import sys
2.7.6 (default, Jun 22 2015, 17:58:13)
On Thu, Mar 10, 2016 at 6:07 AM, DeeperCS [email protected] wrote:
|
|
@oeway Thanks a million~! |
|
3D convolutions are arriving in TF with tensorflow/tensorflow@784778e |
|
3D convolution and pooling are available now in TensorFlow with tensorflow/tensorflow@6a187cc. Do you want to update this? |
|
Thanks for the reminder, I will look into that later.
On Thu, May 5, 2016 at 8:05 AM, bhack [email protected] wrote:
|
This PR includes the following convolution layers:
And also the corresponding test functions for these layers.
You can use it like this:
Convolution3D(nb_filters, nb_time, nb_row, nb_col)
See the convolution 3D example in examples/shapes_3d_cnn.py by @MinhazPalasara.
The shape of the underlying 5D tensor:
(samples, channels, time, rows, cols) if dim_ordering='th'
(samples, time, rows, cols, channels) if dim_ordering='tf'
Here I used "time" for the case of movies; of course it can be the depth of a 3D volume. I didn't use the notion "depth" because some people use "depth" to stand for filter or channel; the word "depth" sometimes appears in the docstrings of Keras when it means channel.
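The two orderings are just a permutation of the same axes; converting a 'th'-ordered tensor to 'tf' ordering is a single transpose. A NumPy illustration:

```python
import numpy as np

# 'th' ordering: (samples, channels, time, rows, cols)
x_th = np.zeros((8, 3, 16, 32, 32))

# 'tf' ordering: (samples, time, rows, cols, channels)
# Move the channels axis (index 1) to the end.
x_tf = np.transpose(x_th, (0, 2, 3, 4, 1))

print(x_tf.shape)  # (8, 16, 32, 32, 3)
```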
The following functions are added into theano_backend.py:
But notice that, currently, these layers won't work with the tensorflow backend, due to the fact that there are no 3D convolution ops implemented in tensorflow. It would be great if anyone could implement them.
Details about convolutional 3D in theano:
There are two implementations available in theano; I used conv3d2d.conv3d for both GPU and CPU, because it's faster.
Notice that there are precision differences with conv3d2d.conv3d between GPU and CPU. The testing code can be found in this discussion: https://groups.google.com/d/msg/theano-users/1S9_bZgHxVw/0cQR9a4riFUJ
Another PR (#718) was created by @MinhazPalasara but it's outdated. Many functions, the test example, and the test dataset in this PR are taken from #718; thanks @MinhazPalasara, @rbharath, @fchollet, and others for improving that code and giving suggestions.
Be careful: this code has not been rigorously tested, so please report any bugs you encounter.