[SPARK-45265][SQL] Supporting Hive 4.0 Metastore #45801
|
Hi, @attilapiros . I revised your closed PR, #43064, with your authorship. SPARK-45265 has been assigned to you, too. If you want, you can take over from here also. |
|
Hi @dongjoon-hyun, Thanks! I am fine either way. By the way should not we need to extend the condition with The rest looks good to me but we should wait for the tests to finish. |
f3f1ee3 to d358bb4
|
Thank you, @attilapiros . I addressed your comment and fixed the patch according to HIVE-21078 and HIVE-21164. |
|
Sorry guys, @HyukjinKwon and @attilapiros . |
|
Hi, @attilapiros, just a question. When you last tried this, the PR description covered partitioned tables. I'm wondering how you tested partitioned tables with your previous PR? HIVE-21703 seems to be in |
|
For the testing, I ran Hive in a Docker image like: Checking my command history, it was probably 4.0.0-beta-1. |
|
Got it. Thank you for the info. |
### What changes were proposed in this pull request?
This PR continues the work from #43064 and #45801 to support Hive Metastore Server 4.0. CHAR/VARCHAR type partition filter pushdown is not included in this PR, as it requires further investment.
### Why are the changes needed?
Enhance the multiple hive metastore server support feature
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
Passing HiveClient*Suites w/ 4.0
### Was this patch authored or co-authored using generative AI tooling?
no
Closes #48823 from yaooqinn/SPARK-45265.
Authored-by: Kent Yao <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This PR aims to support Apache Hive 4.0 Metastore, where partition filters can be pushed down even for CHAR and VARCHAR types.
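To illustrate what pushing down a partition filter means, here is a minimal Python sketch that turns simple predicates on partition columns into a single filter string of the kind a metastore call such as get_partitions_by_filter receives. This is not Spark's implementation: the names `Predicate`, `quote_literal`, and `to_metastore_filter` are hypothetical, and the exact filter grammar accepted by a given Hive version is an assumption that should be checked against Hive itself.

```python
from typing import List, Tuple

# Hypothetical representation of a predicate on a partition column:
# (column name, comparison operator, literal value).
Predicate = Tuple[str, str, object]

# Operators this sketch is willing to emit (an assumption, not Hive's full grammar).
SUPPORTED_OPS = {"=", "<", "<=", ">", ">=", "!="}


def quote_literal(value: object) -> str:
    """Render a literal for the filter string; strings are double-quoted."""
    if isinstance(value, bool):
        raise ValueError("boolean literals are not handled in this sketch")
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        if '"' in value:
            raise ValueError("embedded double quotes are not handled in this sketch")
        return f'"{value}"'
    raise ValueError(f"unsupported literal type: {type(value).__name__}")


def to_metastore_filter(predicates: List[Predicate]) -> str:
    """Join all predicates with 'and' into one filter string."""
    parts = []
    for column, op, value in predicates:
        if op not in SUPPORTED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        parts.append(f"{column} {op} {quote_literal(value)}")
    return " and ".join(parts)


# Example: a string-typed partition column and an integer-typed one.
example = to_metastore_filter([("dt", "=", "2024-01-01"), ("hour", ">=", 10)])
print(example)
```

When a filter like this can be sent to the metastore, only the matching partitions are returned, instead of the client listing every partition and filtering locally.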
Why are the changes needed?
This is blocked by SPARK-47679 due to the incompatible change of HIVE-27925.
HiveConf.getConfVars or Hive conf names directly #45804
Supporting more Hive versions (with extra performance improvement) is good for our users.
Does this PR introduce any user-facing change?
Yes. The documentation is updated to reflect Hive 4.0 metastore support.
How was this patch tested?
Manually
I used the docker image of apache/hive:4.0.0-beta-1 for starting a metastore and a hiveserver2 (along with a hadoop3 docker image).
Created a table:
Inserted some values in beeline:
Started my spark in the hiveserver2 container as:
Ran the query as:
And checked the HMS calls in the metastore container in the file /tmp/hive/hive.log:
Which contains the expected get_partitions_by_filter.
Was this patch authored or co-authored using generative AI tooling?
No.
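The log check from the manual test can be automated with a small script. The sketch below counts lines that mention a metastore method name using a plain substring match; it assumes nothing about the log line format, and the helper name `count_hms_calls` is hypothetical.

```python
from typing import Iterable


def count_hms_calls(log_lines: Iterable[str],
                    method: str = "get_partitions_by_filter") -> int:
    """Count log lines that mention the given metastore method name.

    Plain substring matching: a line mentioning a longer name that
    contains `method` is counted as well.
    """
    return sum(1 for line in log_lines if method in line)


def count_hms_calls_in_file(path: str,
                            method: str = "get_partitions_by_filter") -> int:
    """Same check, reading the log from a file such as /tmp/hive/hive.log."""
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return count_hms_calls(log_file, method)
```

For example, `count_hms_calls_in_file("/tmp/hive/hive.log") > 0` after running the query would indicate that the filter reached the metastore, while a count of zero would suggest that partitions were fetched another way.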