Fixes for float16 problems in the DLT #172
Merged
5 commits
93c9a36 Use MRG_RandomStreams instead for shared_randomstreams for GPU compat. (abergeron)
d403591 Compute mean in higher precision to avoid overflow. (abergeron)
5a13d98 Fix import of sandbox. (abergeron)
93837e0 Fix printout in lstm.py. (abergeron)
780cecc Adjust mean dtypes for scores in SdA too. (abergeron)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -7,6 +7,7 @@ |
| | | from theano import function, shared |
| | | from theano import tensor as TT |
| | | import theano |
| | | + import theano.sandbox.rng_mrg |
| | | sharedX = (lambda X, name: |
| | | shared(numpy.asarray(X, dtype=theano.config.floatX), name=name)) |
| | | @@ -275,14 +276,14 @@ def hmc_updates(positions, stepsize, avg_acceptance_rate, final_pos, accept, |
| | | """ |
| | | - ## POSITION UPDATES ## |
| | | + # POSITION UPDATES # |
| | | # broadcast `accept` scalar to tensor with the same dimensions as |
| | | # final_pos. |
| | | accept_matrix = accept.dimshuffle(0, *(('x',) * (final_pos.ndim - 1))) |
| | | # if accept is True, update to `final_pos` else stay put |
| | | new_positions = TT.switch(accept_matrix, final_pos, positions) |
| | | # end-snippet-5 start-snippet-7 |
| | | - ## STEPSIZE UPDATES ## |
| | | + # STEPSIZE UPDATES # |
| | | # if acceptance rate is too low, our sampler is too "noisy" and we reduce |
| | | # the stepsize. If it is too high, our sampler is too conservative, we can |
| | | # get away with a larger stepsize (resulting in better mixing). |
| | | @@ -292,7 +293,7 @@ def hmc_updates(positions, stepsize, avg_acceptance_rate, final_pos, accept, |
| | | new_stepsize = TT.clip(_new_stepsize, stepsize_min, stepsize_max) |
| | | # end-snippet-7 start-snippet-6 |
| | | - ## ACCEPT RATE UPDATES ## |
| | | + # ACCEPT RATE UPDATES # |
| | | # perform exponential moving average |
| | | mean_dtype = theano.scalar.upcast(accept.dtype, avg_acceptance_rate.dtype) |
| | | new_acceptance_rate = TT.add( |
| | | @@ -358,7 +359,7 @@ def new_from_shared_positions( |
| | | stepsize = sharedX(initial_stepsize, 'hmc_stepsize') |
| | | avg_acceptance_rate = sharedX(target_acceptance_rate, |
| | | 'avg_acceptance_rate') |
| | | - s_rng = TT.shared_randomstreams.RandomStreams(seed) |
| | | + s_rng = theano.sandbox.rng_mrg.MRG_RandomStreams(seed) |
| | | # define graph for an `n_steps` HMC simulation |
| | | accept, final_pos = hmc_move( |
Do you know what is going on with numpy.mean? When c is float16, I checked that numpy's output is float16. In my tests, it seems the accumulator is float16 or something like that. Do you know?
We would need to document this float16 behavior in Theano, at least in this issue: Theano/Theano#2908. I'll let you modify it, in case you can add more information.
Should we special-case float16 and make Theano always return at least float32 to help prevent this type of problem?
Yes, the accumulator is float16 internally and overflows.
Do we have a page about float16 gotchas in Theano? That is the only place where I would expect to see this sort of information.
I would oppose special-casing outputs in Theano, because the problem is easily resolved by the user and very visible.
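For illustration, here is a minimal NumPy sketch of the overflow discussed in this thread. The scores are made-up values of 100.0, not data from the tutorials. Recent NumPy versions document that numpy.mean uses float32 intermediates for float16 input, so the float16 accumulation is written out as an explicit sum to make the overflow visible regardless of version. Requesting a wider dtype is one way a user can resolve it; this is a sketch under those assumptions, not the exact change made in the commits.

```python
import numpy as np

# Hypothetical per-example scores stored as float16 (illustrative values only).
scores = np.full(1000, 100.0, dtype=np.float16)

# Accumulate in float16: the total (100 * 1000 = 100000) exceeds the largest
# finite float16 value (65504), so the sum overflows to inf.
with np.errstate(over="ignore"):
    naive_mean = scores.sum(dtype=np.float16) / np.float16(scores.size)

# Accumulate in float32 instead: the total stays finite and the mean is exact.
safe_mean = scores.mean(dtype=np.float32)

print("float16 accumulator:", naive_mean)
print("float32 accumulator:", safe_mean)
```

The float16 inputs are untouched in the second call; only the accumulator and the result use the wider type.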