Skip to content

Conversation

@davies
Copy link
Contributor

@davies davies commented May 31, 2015

No description provided.

@davies
Copy link
Contributor Author

davies commented May 31, 2015

cc @JoshRosen @rxin

@rxin
Copy link
Contributor

rxin commented May 31, 2015

Does this need to block 1.4?

@davies
Copy link
Contributor Author

davies commented May 31, 2015

@rxin I think so, it's a regression.

@SparkQA
Copy link

SparkQA commented May 31, 2015

Test build #33836 has finished for PR 6532 at commit 1425359.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant SPARK-7978

@JoshRosen
Copy link
Contributor

I guess the root cause of this bug was that the DecimalType class's singleton-ness was inherited due to our use of metaclasses. In Scala, we didn't have this problem because the singleton-ness was implemented at the leaves of the class hierarchy by using object.

To prevent this sort of issue in the future, I wonder whether it would be clearer to apply a decorator on the leaf class declarations themselves (as in http://elbenshira.com/blog/singleton-pattern-in-python/) rather than using metaclasses. It would be good to understand whether the difference between these approaches has any performance / correctness impacts.

In the meantime, the fix here makes sense to me: we're just pushing the singleton-ness a bit deeper into the class hierarchy so that it isn't inappropriately inherited by DecimalType.

From a test coverage perspective, what tests would have caught this? It seems that something as simple as just manually constructing an instance of each type with all of its parameters specified would have caught this. Even though it seems like an obvious / dumb test, maybe we should add a test that just checks that we're able to at least instantiate each of the types that take parameters.

@JoshRosen
Copy link
Contributor

Also, I wonder whether cov.py would have shown this as missing in its coverage report, since it seems like some sorts of branch coverage tests would notice that this path wasn't exercised.

@davies
Copy link
Contributor Author

davies commented May 31, 2015

@JoshRosen I think the new added test could coverage that DecimalType is not singleton, by

  self.assertTrue(t2 is not t1)

It may make sense to switch other singleton patterns, but I'd like to fix this first without introducing too much other things.

@SparkQA
Copy link

SparkQA commented May 31, 2015

Test build #33848 has finished for PR 6532 at commit c7fcbce.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@airhorns
Copy link

Maybe another systematic fix is to not allow the singleton-ing of any Type that takes constructor args? At subclass time can you check that?

@JoshRosen
Copy link
Contributor

@davies, I agree that the test you added here acts as a proper regression test. My comment was more to suggest that we could have prevented this regression in the first place with a relatively simple test that just tries to instantiate each data type with all of its constructor arguments. The fact that this bug evaded unit tests implies that our existing unit tests didn't create DecimalTypes with any constructor arguments, implying that our test coverage of decimal-related code might be insufficient. I think that this patch is fine, but for 1.5 we should make a dedicated effort to improve Python's test coverage.

@airhorns, do you mean that the single metaclass would act as a no-op when applied to Types that take constructor arguments or that it would throw an exception if applied to those types? This is purely academic at this point, but I can imagine some contrived scenarios where the no-op behavior might be confusing: what if I had a class which accepted constructor parameters, then created a subclass which called its superclass constructor with constant values for those parameters? In this case, the subclass can be a singleton but the superclass can't. To avoid having to reason about these corner-cases, maybe it's better to just accept a bit of verbosity and use decorators instead. We shouldn't do that for this patch, though; we can leave it as a followup for 1.5.

@airhorns
Copy link

@JoshRosen sounds good to me!

@JoshRosen
Copy link
Contributor

As an experiment, I put together some code to run the PySpark test suite through coverage.py (https://gist.github.com/JoshRosen/60d590b1cdc271d332e5) and it turns out that line + branch coverage for the DecimalType class itself doesn't uncover this bug since it won't catch the fact that the constructor is only ever called with defaults for both of its arguments. The closest thing to a red flag in the coverage report was the fact that the only call to DecimalType with arguments, inside of _parse_datatype_json_value, wasn't hit by the tests.

@asfgit asfgit closed this in 91777a1 Jun 1, 2015
asfgit pushed a commit that referenced this pull request Jun 1, 2015
Author: Davies Liu <[email protected]>

Closes #6532 from davies/decimal and squashes the following commits:

c7fcbce [Davies Liu] Update tests.py
1425359 [Davies Liu] DecimalType should not be singleton

(cherry picked from commit 91777a1)
Signed-off-by: Reynold Xin <[email protected]>
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
Author: Davies Liu <[email protected]>

Closes apache#6532 from davies/decimal and squashes the following commits:

c7fcbce [Davies Liu] Update tests.py
1425359 [Davies Liu] DecimalType should not be singleton
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
Author: Davies Liu <[email protected]>

Closes apache#6532 from davies/decimal and squashes the following commits:

c7fcbce [Davies Liu] Update tests.py
1425359 [Davies Liu] DecimalType should not be singleton
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants