Skip to content

Conversation

@jkbradley
Copy link
Member

What changes were proposed in this pull request?

pyspark.ml API for LDA

  • LDA, LDAModel, LocalLDAModel, DistributedLDAModel
  • includes persistence

This replaces [https://github.com//pull/10242]

How was this patch tested?

  • doc test for LDA, including Param setters
  • unit test for persistence

@jkbradley
Copy link
Member Author

CC: @yanboliang Would you have time to take a look at this PR? Thanks!

@SparkQA
Copy link

SparkQA commented Apr 27, 2016

Test build #57081 has finished for PR 12723 at commit 4f807e8.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DistributedLDAModel(LDAModel, JavaMLReadable, JavaMLWritable):
    • class LocalLDAModel(LDAModel, JavaMLReadable, JavaMLWritable):
    • class LDA(JavaEstimator, HasFeaturesCol, HasMaxIter, HasSeed, HasCheckpointInterval,

pass


class LDA(JavaEstimator, HasFeaturesCol, HasMaxIter, HasSeed, HasCheckpointInterval,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@inherit_doc

@yanboliang
Copy link
Contributor

Made one pass, thanks.

@jkbradley
Copy link
Member Author

@yanboliang Thanks! I think I addressed everything so far.

@SparkQA
Copy link

SparkQA commented Apr 27, 2016

Test build #57163 has finished for PR 12723 at commit 46979e8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yanboliang
Copy link
Contributor

I saw you put save/load tests at tests.py rather than in doctest. I think the doctest is not only unit tests but also used for illustrating how to use this Estimator/Model. It's better to have simple test for save/load in doc test. LGTM otherwise. Thanks!

@jkbradley
Copy link
Member Author

I've wondered about the save/load in doc tests. On the one hand, it's nice to have that example, but on the other hand, it's going to be an extra save/load test for every run of the Jenkins tests (once we beef up the actual unit tests). I'll add save/load for now but we should re-evaluate in the future.

@jkbradley
Copy link
Member Author

Updated!

@SparkQA
Copy link

SparkQA commented Apr 28, 2016

Test build #57268 has finished for PR 12723 at commit f37c1c1.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Copy link
Member Author

test this please

@SparkQA
Copy link

SparkQA commented Apr 28, 2016

Test build #57281 has finished for PR 12723 at commit f37c1c1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Copy link
Member Author

Ready now?

@yanboliang
Copy link
Contributor

LGTM, thanks!

@jkbradley
Copy link
Member Author

Thanks @yanboliang and @zjffdu !
Merging with master

@asfgit asfgit closed this in 775772d Apr 29, 2016
@jkbradley jkbradley deleted the zjffdu-SPARK-11940 branch April 29, 2016 18:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants