Skip to content

Conversation

@cloud-fan
Copy link
Contributor

create java version of constructorFor and extractorFor in JavaTypeInference

@SparkQA
Copy link

SparkQA commented Nov 24, 2015

Test build #46608 has finished for PR 9937 at commit d7a4bf9.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class InitializeJavaBean(n: NewInstance, fields: Seq[Expression], setters: Seq[String])\n

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this take Map[String, Expression]? zipping always seems awkward when we can avoid it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I would not make it take a NewInstance as it would be nice to use a Literal sometimes for object reuse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we can not use Map here, as the fields: Seq[Expression] needs resolution and transformation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably just extend this logic to recurse into maps similar to how we handle traversables.

@rxin
Copy link
Contributor

rxin commented Nov 24, 2015

I think Java beans is different from pojo. Beans only care about fields that have getters and setters, while pojo should care about all the fiedls.

@cloud-fan
Copy link
Contributor Author

A POJO is not rigorously defined, and I'm not sure how to detect it and re-construct it...

@rxin
Copy link
Contributor

rxin commented Nov 25, 2015

Use reflection to find all the public and private fields?

@marmbrus
Copy link
Contributor

Perhaps we should start with Java Beans. They are well defined, we have existing reflection code, and I've never heard a java user complain that was not sufficient (this is what we support for creating java data frames).

@rxin
Copy link
Contributor

rxin commented Nov 25, 2015

Sounds good (supporting only beans for now).

@cloud-fan cloud-fan changed the title [SPARK-11954][SQL][WIP] Encoder for JavaBeans / POJOs [SPARK-11954][SQL] Encoder for JavaBeans Nov 25, 2015
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I came up with this by intuition and verified it by experiment, cannot find any doc for it, very hacky...

@SparkQA
Copy link

SparkQA commented Nov 25, 2015

Test build #46691 has finished for PR 9937 at commit a31ea94.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * trait ScalaReflection\n * case class InitializeJavaBean(n: NewInstance, fields: Seq[Expression], setters: Seq[String])\n

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not right, we may have attribute loopVar in MapObjects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I depend on name to skip the attribute loopVar in MapObjects, need a better way...
cc @marmbrus

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this does feel kind of hacky, but I'm not sure if its really a problem in practice. This assertion is only to catch our own bugs, not user mistakes.

@SparkQA
Copy link

SparkQA commented Nov 25, 2015

Test build #46693 has finished for PR 9937 at commit 58d1cc1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * trait ScalaReflection\n * case class InitializeJavaBean(n: NewInstance, fields: Seq[Expression], setters: Seq[String])\n

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the schemaFor stuff to object ScalaReflection, because of the need of mirror.runtimeClass in getJavaBeanFields

@SparkQA
Copy link

SparkQA commented Nov 26, 2015

Test build #46763 has finished for PR 9937 at commit 40433ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * trait ScalaReflection\n * case class InitializeJavaBean(n: NewInstance, fields: Seq[Expression], setters: Seq[String])\n

@marmbrus
Copy link
Contributor

Thanks for working on this. I'm not sure that this is the right tactic though. I don't think that we want to make java bean the default fall back in our scala reflection code path. Instead I think that we only want to create a java bean encoder when the user explicitly asks for it and thus this should be a separate code path. The only common code should be ExpressionEncoder probably.

Also, instead of hacking this into scala's reflection, I think we should use the builtin java support for introspecting on beans. I would probably start with the code in JavaTypeInference and use that to construct expressions.

@cloud-fan
Copy link
Contributor Author

@marmbrus do you mean we should write java version of extractorsFor and constructorFor in JavaTypeInference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that, for java map, if we get the keys and values by keySet and values, we can not guarantee they have same iteration order(which is different from scala map). A possible solution is creating a new MapObjects that can iterate a map directly.

cc @marmbrus

@SparkQA
Copy link

SparkQA commented Nov 30, 2015

Test build #46871 has finished for PR 9937 at commit c634aa4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class InitializeJavaBean(n: NewInstance, setters: Map[String, Expression])\n

@SparkQA
Copy link

SparkQA commented Nov 30, 2015

Test build #46880 has finished for PR 9937 at commit e852710.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class InitializeJavaBean(n: NewInstance, setters: Map[String, Expression])\n

@SparkQA
Copy link

SparkQA commented Nov 30, 2015

Test build #46882 has finished for PR 9937 at commit 7940ec7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class InitializeJavaBean(n: NewInstance, setters: Map[String, Expression])\n

@SparkQA
Copy link

SparkQA commented Nov 30, 2015

Test build #46883 has finished for PR 9937 at commit fb9a898.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class InitializeJavaBean(n: NewInstance, setters: Map[String, Expression])\n

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.view.force seems a little odd. toList?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to return a Map here, maybe toList.toMap?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh... hmmm, what happens if we don't have this? is it lazy? I guess view.force is fine. It was just a little confusing when reading.

Also, can you add a test to TreeNodeSuite?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, mapValues is lazy, I'll add a comment to explain it, also add test in TreeNodeSuite

@marmbrus
Copy link
Contributor

marmbrus commented Dec 1, 2015

Some minor comments, but this is looking really good!

@SparkQA
Copy link

SparkQA commented Dec 1, 2015

Test build #46939 has finished for PR 9937 at commit 0d95daf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * case class InitializeJavaBean(beanInstance: Expression, setters: Map[String, Expression])\n

@marmbrus
Copy link
Contributor

marmbrus commented Dec 1, 2015

Thanks! I'm going to try and fix conflicts while merging to master and 1.6.

asfgit pushed a commit that referenced this pull request Dec 1, 2015
create java version of `constructorFor` and `extractorFor` in `JavaTypeInference`

Author: Wenchen Fan <[email protected]>

This patch had conflicts when merged, resolved by
Committer: Michael Armbrust <[email protected]>

Closes #9937 from cloud-fan/pojo.

(cherry picked from commit fd95eea)
Signed-off-by: Michael Armbrust <[email protected]>
@asfgit asfgit closed this in fd95eea Dec 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants