Added Sorting Functions #233
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##             main     #233      +/-   ##
==========================================
+ Coverage   96.61%   96.64%   +0.03%
==========================================
  Files          53       54       +1
  Lines         943      952       +9
  Branches       18       12       -6
==========================================
+ Hits          911      920       +9
  Misses         32       32
```
Flags with carried forward coverage won't be shown.
Sorry, on my phone it showed me that the changes from the last PR were reverted 😵💫
LGTM!
Just rebase your branch, because Codecov is acting weird: it says the coverage will drop because of your PR, and that's not true.
I updated issue #70 as well, because we were only talking about column functions (we forgot about DataFrame order).
Force-pushed from 0cc01c6 to 1cfe321
Apologies for the delay on this! The past few days have been a little busy for me. Looking at the updated #70, it seems all that's left on this issue is to add
No problem at all! We really appreciate what you are doing. About #70: yeah, that's the only function left. I updated the issue to track more related functions, but that doesn't mean you have to implement them (you can, though; if you don't, we will do it as soon as we can). Anyway, whenever you rebase your branch and mark your PR as "ready for review", we will merge it. Thank you again for your help and effort!
The Spark 2.4 tests are failing due to Spark's minus-months returning a different result from the LocalDate equivalent... It would appear this is some flakiness that will occur on certain days of the year. One option to get around this and prevent it from happening again would be to hardcode the date to a certain value instead of using the current date.
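For illustration, a minimal sketch of the hardcoded-date idea (the date and names below are made up, not taken from the actual test):

```scala
import java.sql.Date
import java.time.LocalDate

// Pin the input to a fixed mid-month date so the expected value no longer
// depends on the day the test runs (month-end dates are where Spark 2.4's
// add_months and java.time's minusMonths can disagree).
val fixedDate    = LocalDate.of(2021, 1, 15)
val inputDate    = Date.valueOf(fixedDate)
val expectedDate = Option(Date.valueOf(fixedDate.minusMonths(3))) // 2020-10-15
```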
As a follow-up to the above comment about the LocalDate flakiness, I found the PR that updated the logic to use the LocalDate API: apache/spark#25153. I'll move the existing logic to the Spark 3.x tests and try to come up with a solution for the 2.4 tests.
Hi! I'm back, sorry for the absence. I see you got it fixed, but to test it for all versions we can set the expected value according to the Spark version. Something like:

```scala
val expectedResult =
  if (spark.version.take(1).toInt > 2)
    List(Date.valueOf(LocalDate.now.minusMonths(3)), null).map(Option(_))
  else
    ??? // spark 2 result

df.testColumns2("dateCol", -3)(
  (d, m) => colDate(d).addMonths(m.lit),
  (d, m) => f.add_months(f.col(d), m),
  expectedResult
)
```

Thanks for all the work 😄
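Purely as an aside, a sketch of a slightly more general way to get the major version (the helper name is made up and not part of the codebase; `spark.version.take(1)` is fine for 2.x and 3.x but would break on a two-digit major):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper: "3.2.1" -> 3, "2.4.8" -> 2.
def sparkMajorVersion(spark: SparkSession): Int =
  spark.version.split("\\.").head.toInt
```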
Welcome back! No worries at all!! Sounds good! Just for my understanding, when should tests be placed into the Spark 3 folders (like I did in b2b4c38), compared to when logic like this should be implemented?
The folder names contain the versions that apply to them. We use the folders when it's more of a compile problem (different interfaces, methods, or classes that don't exist in a given Spark version, etc.). In your case the API exists and is compatible; the only problem is that, depending on the version, it returns one value or another.
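A rough sketch of the distinction, with made-up names just to show the shape of each case:

```scala
import org.apache.spark.sql.SparkSession

// Case handled with a runtime check in shared test sources: the code compiles
// on every supported Spark version, only the produced value differs.
def expectedValue(spark: SparkSession): String =
  if (spark.version.startsWith("2.")) "result on Spark 2.x"
  else "result on Spark 3.x"

// Case handled with version-specific source folders: the code would not even
// compile on some Spark version (e.g. a method that only exists in Spark 3),
// so it cannot live in shared sources and no runtime check can help.
```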
Codecov error, I'm going to launch it again. 😓 |
Description
This PR implements the `asc` and `desc` functions alongside the `orderBy` and `sort` functions currently present in Spark.
Related Issue
Resolves #70
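For illustration, a sketch of how the described API might be used (the doric column constructors and the exact method names here are assumptions based on this description, not taken from the diff):

```scala
import doric._
import org.apache.spark.sql.DataFrame

// Hypothetical usage: order a DataFrame with doric columns instead of Spark
// columns, using the asc/desc and orderBy described above.
def byAgeThenName(df: DataFrame): DataFrame =
  df.orderBy(colInt("age").desc, colString("name").asc)
```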
Motivation and Context
Currently, when performing `orderBy` or `sort` operations on a DataFrame, you have to use Spark columns and functions. This PR aims to allow the use of doric columns when sorting.
How Has This Been Tested?