NonlinearMinimizer using Projection and Proximal Operators #364
Merged
31 commits
fb74645
Initial commit and skeleton for NonlinearMinimizer
a7ee059
Merge branch 'qp' of https://github.com/debasish83/breeze into nlqp
679bb5f
Skeleton for approximate eigen value calculation
3d80b31
Copyright message for NonlinearMinimizer
3781e37
merge with qp branch; NOTICE file updated
536886d
Initial checkin for PowerMethod and PowerMethodTest;Eigen value extra…
f98bd80
Compilation fixes to LBFGS eigenMin and eigenMax
ae795b6
Power Method merged; NonlinearMinimizer now supports preserving histo…
51b4224
Generating PQN.CompactHessian from BFGS.ApproximateInverseHessian not…
ee697bf
Linear Regression formulation added for comparisons
ce8638f
Fixed LBFGS.maxEigen using power law on CompactHessian
bbc3edd
Merge branch 'qp' of https://github.com/debasish83/breeze into nlqp
f85ff86
Merge branch 'qp' of https://github.com/debasish83/breeze into nlqp
e3a61a9
Added a proximal interface to ProjectQuasiNewton solver; Added projec…
928de32
probability simplex benchmark
91f2e17
After experimentation NonlinearMinimizer now users PQN/OWLQN and supp…
33d28ff
Add testcases for Least square variants
6cba897
merge with upstream
9bef354
I dunno.
dlwh 18c7789
PQN fixes from David's fix_pqn branch; added strong wolfe line search…
43794c0
Unused import from FirstOrderMinimizer; PQN migrated to Strong Wolfe …
e2c1db8
Used BacktrackingLineSearch in SPG and PQN; Updated NonlinearMinimize…
defaff5
NonlinearMinimizer println changed to nl from pqn
610027f
Updated with cforRange in proximal operations
8c6a6c8
BacktrackingLineSearch takes an initfval;OWLQN, PQN and SPG updated t…
b4d86e8
Merge branch 'master' of https://github.com/scalanlp/breeze into nlqp
3a6fc97
infiniteIteration API in FirstOrderMinimizer takes initialState;PQN b…
8533ada
migrate LBFGS Eigen calculation to https://github.com/debasish83/bree…
a0bbd33
cleaned up minEigen call from QuadraticMinimizer
40a45a8
NonlinearMinimizer inner iterations through BFGS cleaned
7308c7a
Updated contributions in README.md
FWIW, "all rights reserved" has no legal meaning in modern copyright law.
PQN can do simplex projection, but it does not separate f(x) and g(z). What we really want is to solve f(x) through a blocked cyclic coordinate descent using BFGS and to satisfy g(z) through a proximal operator. That is our version of a parameter server (distributed solver). I am still thinking about whether we can replace coordinate descent with distributed BFGS by using some tricks. If I use OWLQN or PQN, I have to change code in all of them.
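The f(x) / g(z) split described above is the usual ADMM consensus form. A minimal sketch of that loop follows, assuming a toy smooth term f(x) = 0.5 * ||x - a||^2 whose x-update has a closed form; a real driver would run an iterative solver such as BFGS in that step. The names `AdmmSketch`, `solve`, and `projectNonNegative` are hypothetical and are not part of breeze.

```scala
object AdmmSketch {
  type Vec = Array[Double]

  /** Example proximal step for g: projection onto the nonnegative orthant (an assumed choice of g). */
  def projectNonNegative(v: Vec): Vec = v.map(vi => math.max(vi, 0.0))

  /**
   * ADMM for: minimize 0.5 * ||x - a||^2 + g(z)  subject to  x = z.
   * The x-update is closed-form only because f is this toy quadratic.
   */
  def solve(a: Vec, rho: Double, iters: Int, proxG: Vec => Vec): Vec = {
    val n = a.length
    var x = Array.fill(n)(0.0)
    var z = Array.fill(n)(0.0)
    var u = Array.fill(n)(0.0) // scaled dual variable
    for (_ <- 0 until iters) {
      // x-update: argmin_x 0.5 * ||x - a||^2 + (rho / 2) * ||x - z + u||^2
      x = Array.tabulate(n)(i => (a(i) + rho * (z(i) - u(i))) / (1.0 + rho))
      // z-update: proximal operator of g evaluated at x + u
      z = proxG(Array.tabulate(n)(i => x(i) + u(i)))
      // dual update: accumulate the consensus residual x - z
      u = Array.tabulate(n)(i => u(i) + x(i) - z(i))
    }
    z
  }
}
```

Calling `AdmmSketch.solve(Array(1.0, -2.0), 1.0, 200, AdmmSketch.projectNonNegative)` drives z toward the nonnegative point closest to a. Only the z-update touches g, which is why swapping the constraint means swapping one function.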
Also, I looked at the paper. It is the same algorithm I am implementing: a proximal algorithm. Maybe we can plug it into PQN as well. I don't know PQN that well.
I read the PQN paper, and it is possible to put all the proximal operators inside PQN. I have already tested bounds with PQN, and it works well compared to the default ADMM.
If the same holds for ProximalL1() compared to OWLQN, then I am fine with using PQN as the core of NonlinearMinimizer. PQN currently accepts a closure of the form DenseVector[Double] => DenseVector[Double]. Would it be fine if I changed the signature to (x: DenseVector[Double], rho: Double) => DenseVector[Double]?
This is in line with generic proximal algorithms:
minimize g(x) + rho||x - v||_{2}^{2}
Here g(x) can encode constraints: x \in C
g(x) can also be L1, handled through soft-thresholding, I think.
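To make the proposed (x, rho) signature concrete, here is a rough sketch of two operators in that shape, written against plain `Array[Double]` rather than `DenseVector[Double]` so it runs without breeze. The names `ProximalSketch`, `Proximal`, `proximalL1`, and `projectBox` are hypothetical. The threshold lambda / (2 * rho) assumes the objective is exactly lambda * ||x||_1 + rho * ||x - v||_2^2 as written above; with a rho / 2 penalty it would be lambda / rho instead.

```scala
object ProximalSketch {
  type Vec = Array[Double]
  // Assumed shape of the proposed closure: (v, rho) => argmin_x g(x) + rho * ||x - v||^2
  type Proximal = (Vec, Double) => Vec

  /** g(x) = lambda * ||x||_1: soft-thresholding with threshold lambda / (2 * rho). */
  def proximalL1(lambda: Double): Proximal = (v, rho) => {
    val t = lambda / (2.0 * rho)
    v.map(vi => math.signum(vi) * math.max(math.abs(vi) - t, 0.0))
  }

  /** g(x) = indicator of lo <= x <= hi: the prox is a projection, so rho is ignored. */
  def projectBox(lo: Double, hi: Double): Proximal = (v, _) =>
    v.map(vi => math.min(math.max(vi, lo), hi))
}
```

For constraint sets the second argument goes unused, which is one reading of the question of whether PQN itself needs to see rho at all.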
I'm totally open to that, but I'm not quite sure how it will be used. PQN
won't manipulate it, right? Why not have your driver specify it in a new
instance of PQN?
-- David
In reply to Debasish Das, Tue, Feb 17, 2015 at 12:01 PM.
Why not an interface like:
trait SeparableDiffFunction[T] {
  def apply(x: T): IndexedSeq[(Double, T)]
}
?
Maybe with a method to turn it into a normal DiffFunction given an OpAdd for T?
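One way the suggested trait and its "turn it into a normal function" method could look, as a sketch only: the addition is passed as an explicit `(T, T) => T` argument standing in for breeze's OpAdd, and `calculate`, `addVec`, and `RowwiseLeastSquares` are hypothetical names introduced for illustration.

```scala
object SeparableSketch {
  trait SeparableDiffFunction[T] {
    /** One (value, gradient) pair per separable term, all evaluated at x. */
    def apply(x: T): IndexedSeq[(Double, T)]

    /** Collapse the terms into a single (value, gradient); assumes at least one term. */
    def calculate(x: T)(add: (T, T) => T): (Double, T) =
      apply(x).reduce((l, r) => (l._1 + r._1, add(l._2, r._2)))
  }

  /** Element-wise sum, used here in place of an OpAdd instance. */
  def addVec(a: Array[Double], b: Array[Double]): Array[Double] =
    a.zip(b).map { case (p, q) => p + q }

  /** Least squares with one term per row: 0.5 * (a . x - b)^2. */
  final class RowwiseLeastSquares(rows: IndexedSeq[(Array[Double], Double)])
      extends SeparableDiffFunction[Array[Double]] {
    def apply(x: Array[Double]): IndexedSeq[(Double, Array[Double])] =
      rows.map { case (a, b) =>
        val r = a.zip(x).map { case (p, q) => p * q }.sum - b
        (0.5 * r * r, a.map(_ * r))
      }
  }
}
```

Because each term carries its own value and gradient, the terms can be evaluated independently and reduced later, which is the property a distributed driver would rely on.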
In reply to Debasish Das, Sat, Feb 21, 2015 at 7:53 AM.
Let me add the experimental version with RDD in it, and you can help define the clean interface. I feel we will need different interfaces for separable and non-separable functions.
I added the PQN driver and the projection operators for L1(x) <= c and ProbabilitySimplex. Somehow, in my experiments I am unable to reproduce the results from Figure 1 of the PQN paper. OWLQN is run first to extract the parameter lambda_L1(x_) to generate the constraint L1(x) <= c for PQN.
runMain breeze.optimize.proximal.NonlinearMinimizer 500 0.4 0.99
Elastic Net with L1 and L2 regularization.
Linear Regression:
Issues:
owlqn 678.072 ms iters 173
pqnSparseTime 30600.365 ms iters 500
owlqnObj -145.38669700980395 pqnSparseObj -135.15057488775597
Logistic Regression:
Cons:
owlqn 187.894 ms iters 74 pqn 758.073 ms iters 28
objective owlqnLogistic 52.32713379333781 pqnLogistic 81.37098398138012
I am debugging the code further, but any pointers would be great. I don't think this is the expected behavior from PQN on the L1/ProbabilitySimplex constraint, per the paper.
Random question: does it seem to be line searching a lot (relative to
OWLQN, or to runs with a box constraint)?
In reply to Debasish Das, Mon, Feb 23, 2015 at 2:09 PM.
I think my L1 projection has bugs in it. L1(x) is built using ProbabilitySimplex, and for ProbabilitySimplex PQN is beating naive ADMM and is on par with QuadraticMinimizer.
I will take a closer look at the L1 projection for bugs, but for ADMM-based multicore/distributed consensus I will most likely choose OWLQN for elastic net and PQN for other constraints.
Here are the results with ProbabilitySimplex:
minimize f(x) s.t 1'x = c, x >= 0, c = 1
Linear Regression with ProbabilitySimplex, f(x) = ||Ax - b||^2
Objective pqn 85.34613832959954 nl 84.95320179604967 qp 85.33114863196366
Constraint pqn 0.9999999999999997 nl 1.001707509961072 qp 1.000000000000004
time pqn 150.552 ms nl 15847.125 ms qp 96.105 ms
Logistic Regression with ProbabilitySimplex, f(x) logistic loss from mllib
Objective pqn 257.6058563777358 nl 257.4025971846134
Constraint pqn 0.9999999999999998 nl 1.0007230450802203
time pqn 94.19 ms nl 25160.317 ms
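For reference, the constraint set in these runs, {x : 1'x = c, x >= 0}, admits a simple sort-based Euclidean projection, and an L1-ball projection can be layered on top of it by projecting the absolute values and restoring signs. The sketch below shows that standard construction; it is not taken from this PR's code, and `ProjectionSketch`, `projectSimplex`, and `projectL1Ball` are hypothetical names.

```scala
object ProjectionSketch {
  type Vec = Array[Double]

  /** Euclidean projection onto {x : sum(x) = c, x >= 0} for c > 0, via sorting. */
  def projectSimplex(v: Vec, c: Double = 1.0): Vec = {
    require(c > 0.0, "simplex size c must be positive")
    val u = v.sortWith(_ > _) // descending order
    var cumSum = 0.0
    var theta = 0.0
    var j = 0
    while (j < u.length) {
      cumSum += u(j)
      val candidate = (cumSum - c) / (j + 1)
      // the condition holds on a prefix of u, so the last hit gives the shift
      if (u(j) - candidate > 0.0) theta = candidate
      j += 1
    }
    v.map(vi => math.max(vi - theta, 0.0))
  }

  /** Euclidean projection onto {x : ||x||_1 <= c}, built on the simplex projection. */
  def projectL1Ball(v: Vec, c: Double = 1.0): Vec = {
    val absV = v.map(vi => math.abs(vi))
    if (absV.sum <= c) v.clone() // already feasible
    else {
      val w = projectSimplex(absV, c)
      Array.tabulate(v.length)(i => math.signum(v(i)) * w(i))
    }
  }
}
```

A quick sanity check in the spirit of the "Constraint" rows above is that the projected vector sums to c up to rounding, for example `ProjectionSketch.projectSimplex(Array(1.0, 1.0, 1.0)).sum`.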