[KYUUBI #7180][LINEAGE] Subquery in the project should always drill down to get the lineage relationships #7181
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@          Coverage Diff          @@
##          master   #7181   +/-   ##
======================================
  Coverage   0.00%   0.00%
======================================
  Files        698     698
  Lines      43646   43648    +2
  Branches    5894    5894
======================================
- Misses     43646   43648    +2
Thanks for the PR! This PR is being closed due to inactivity. This isn't a judgement on the merit of the PR in any way. If this is still an issue with the latest version of Kyuubi, please reopen it and ask a committer to remove the Stale tag! Thank you for using Kyuubi!
@pan3793 Hi, could you help review this when you have time? We have tested many other SQL statements and did not find any further problems.
@yabola can you rebase it, since master has moved forward a lot?
...eage/src/main/scala/org/apache/kyuubi/plugin/lineage/helper/SparkSQLLineageParseHelper.scala
…wn to get the column lineage relationships
4afb6a6 to e9ded00
Thank you for the review. I rebased onto master.
wForget left a comment
Thanks @yabola, LGTM
Why are the changes needed?
The following SQL statements produce an incorrect column lineage result:
create table table0(a int, b string, c string)
create table table1(a int, b string, c string)
select (select sum(a) from table0 where table1.b = table0.b) as aa, b from table1
The root cause:
According to apache/spark#32687, the references of a subquery expression are defined as its outer attribute references. Reading those references alone therefore does not reveal the columns that the subquery plan itself reads, so we should always drill down into the subquery plan to get the corresponding column lineage relationships.
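The root cause can be illustrated with a small toy model. This is only a sketch in Python, not the Kyuubi or Spark code: the classes `ColumnRef` and `ScalarSubquery`, the fields `output_inputs` and `outer_refs`, and both lineage functions are hypothetical names invented for this illustration. The choice to count only the aggregated column as the subquery's lineage is also a simplification of the sketch, not a statement of what the plugin reports.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple, Union

# Hypothetical representation of a column as (table, column).
Column = Tuple[str, str]


@dataclass(frozen=True)
class ColumnRef:
    """A plain column reference in the select list, e.g. `b`."""
    column: Column

    @property
    def references(self) -> FrozenSet[Column]:
        return frozenset({self.column})


@dataclass(frozen=True)
class ScalarSubquery:
    """A toy scalar subquery appearing in the select list."""
    output_inputs: FrozenSet[Column]  # columns feeding the subquery's output
    outer_refs: FrozenSet[Column]     # correlated columns of the outer query

    @property
    def references(self) -> FrozenSet[Column]:
        # Assumption taken from the PR description: a subquery expression
        # reports its outer attribute references, not its inner columns.
        return self.outer_refs


Expr = Union[ColumnRef, ScalarSubquery]


def naive_lineage(project: Dict[str, Expr]) -> Dict[str, FrozenSet[Column]]:
    """Use whatever each expression reports as its references."""
    return {name: expr.references for name, expr in project.items()}


def drill_down_lineage(project: Dict[str, Expr]) -> Dict[str, FrozenSet[Column]]:
    """Always descend into a subquery instead of trusting its references."""
    result = {}
    for name, expr in project.items():
        if isinstance(expr, ScalarSubquery):
            result[name] = expr.output_inputs
        else:
            result[name] = expr.references
    return result


# Toy version of:
#   select (select sum(a) from table0 where table1.b = table0.b) as aa, b
#   from table1
project = {
    "aa": ScalarSubquery(
        output_inputs=frozenset({("table0", "a")}),
        outer_refs=frozenset({("table1", "b")}),
    ),
    "b": ColumnRef(("table1", "b")),
}

naive = naive_lineage(project)
fixed = drill_down_lineage(project)
```

In this toy model `naive["aa"]` contains only `table1.b`, the correlated outer column, while `fixed["aa"]` contains `table0.a`, the column the subquery actually aggregates. The plain column `b` maps to `table1.b` in both cases.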
How was this patch tested?
Added a new unit test.
Was this patch authored or co-authored using generative AI tooling?
no