update the None equal comparison
DjvuLee committed Jan 17, 2017
commit 43602b56d6099213a103a0c0389ac37ebb2c326b
4 changes: 2 additions & 2 deletions python/pyspark/sql/readwriter.py
@@ -432,8 +432,8 @@ def jdbc(self, url, table, column=None, lowerBound=None, upperBound=None, numPar
         if column is not None:
             if numPartitions is None:
                 numPartitions = self._spark._sc.defaultParallelism
Member:

This contradicts the Scala version. Could you also change it to the following code?

    assert numPartitions is not None, "numPartitions can not be None when ``column`` is specified"

Author (DjvuLee):

I am a little worried that this change may break the API. If a user specifies only column, lowerBound, and upperBound in some Spark version, their program will fail after upgrading, even though very few people rely on just the default parallelism.

In my personal opinion, I prefer to make the change and keep the API consistent.

If your opinion is to add the assert on numPartitions, I will update the PR soon.
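
For illustration, a minimal sketch of the compatibility concern being discussed (the connection URL and table name are hypothetical):

    # Today this call succeeds: numPartitions silently falls back to
    # the cluster's defaultParallelism, per the diff above.
    df = spark.read.jdbc("jdbc:postgresql://localhost/testdb", "people",
                         column="id", lowerBound=1, upperBound=1000)

    # Under the proposed assert, the same call would instead raise:
    #   AssertionError: numPartitions can not be None when ``column`` is specified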

Member (@gatorsmile, Jan 17, 2017):

I think we should make the Scala API and the Python API consistent. The existing Python API does not follow the documentation, which says:

"These options must all be specified if any of them is specified. They describe how to partition the table when reading in parallel from multiple workers. partitionColumn must be a numeric column from the table in question. Notice that lowerBound and upperBound are just used to decide the partition stride, not for filtering the rows in the table. So all rows in the table will be partitioned and returned. This option applies only to reading."
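
As an aside, a hedged usage sketch of the fully specified call that the quoted documentation describes (the URL and table name are again hypothetical):

    # All four partitioning options supplied together: rows are read in
    # numPartitions parallel range scans over the numeric column ``id``.
    # lowerBound/upperBound only set the partition stride; they do not
    # filter rows, so every row in the table is returned.
    df = spark.read.jdbc(
        "jdbc:postgresql://localhost/testdb",  # hypothetical URL
        "people",                              # hypothetical table
        column="id",
        lowerBound=1,
        upperBound=10000,
        numPartitions=8)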

-            assert lowerBound != None, "lowerBound can not be None when ``column`` is specified"
-            assert upperBound != None, "upperBound can not be None when ``column`` is specified"
+            assert lowerBound is not None, "lowerBound can not be None when ``column`` is specified"
+            assert upperBound is not None, "upperBound can not be None when ``column`` is specified"
             return self._df(self._jreader.jdbc(url, table, column, int(lowerBound), int(upperBound),
                                                int(numPartitions), jprop))
         if predicates is not None:
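
Beyond style (PEP 8 recommends identity comparison with None), ``!= None`` can give a wrong answer whenever an object overrides ``__eq__``; a minimal sketch of the difference:

    class AlwaysEqual:
        """Toy class whose __eq__ claims equality with everything."""
        def __eq__(self, other):
            return True

    x = AlwaysEqual()
    # In Python 3, != defaults to the negation of __eq__:
    print(x != None)      # False -- the overridden __eq__ lies here
    print(x is not None)  # True  -- identity cannot be overridden

For plain ints such as lowerBound and upperBound the two spellings behave the same, so this commit is primarily an idiom fix, but ``is not None`` is the safe, PEP 8-conformant form.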