Skip to content

Conversation

@Yohahaha
Copy link
Contributor

@Yohahaha Yohahaha commented Dec 7, 2023

What changes were proposed in this pull request?

Respect parsed attribute name from Spark, sum(a) is a valid attr name in Spark.

Remove column name validate logic safely since SparkTokenizer already added.

(Fixes: #3962)

How was this patch tested?

added UT.

@github-actions
Copy link

github-actions bot commented Dec 7, 2023

#3962

@github-actions
Copy link

github-actions bot commented Dec 7, 2023

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Contributor Author

Yohahaha commented Dec 7, 2023

Spark33 allow read parquet column with special characters but Spark32 does not.... details see apache/spark#35229

@github-actions
Copy link

github-actions bot commented Dec 7, 2023

Run Gluten Clickhouse CI

@github-actions
Copy link

github-actions bot commented Dec 7, 2023

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Contributor Author

Yohahaha commented Dec 8, 2023

@PHILO-HE @rui-mo @ulysses-you please help review, thank you!

Copy link
Member

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work! Just few comments.

@github-actions
Copy link

github-actions bot commented Dec 8, 2023

Run Gluten Clickhouse CI

1 similar comment
@github-actions
Copy link

github-actions bot commented Dec 8, 2023

Run Gluten Clickhouse CI

@github-actions
Copy link

Run Gluten Clickhouse CI

@github-actions
Copy link

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Contributor Author

cc @waitinfuture please help rerun failed ci job, thank you!

@github-actions
Copy link

Run Gluten Clickhouse CI

fix

fix format

fix ut

fix
@github-actions
Copy link

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Contributor Author

@PHILO-HE @ulysses-you @rui-mo any more comments?

Copy link
Member

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks!

Copy link
Contributor

@waitinfuture waitinfuture left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@waitinfuture waitinfuture merged commit 8edcb72 into apache:main Dec 13, 2023
@Yohahaha Yohahaha deleted the attr-name branch December 13, 2023 05:45
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_3963_time.csv log/native_master_12_12_2023_8c6f1643e_time.csv difference percentage
q1 33.89 33.75 -0.140 99.59%
q2 27.48 24.88 -2.595 90.56%
q3 37.58 37.71 0.124 100.33%
q4 39.51 37.86 -1.656 95.81%
q5 72.23 72.83 0.599 100.83%
q6 7.02 7.05 0.029 100.41%
q7 83.96 86.50 2.538 103.02%
q8 87.97 86.24 -1.726 98.04%
q9 126.84 125.50 -1.339 98.94%
q10 45.71 44.68 -1.026 97.76%
q11 20.94 20.21 -0.726 96.53%
q12 27.39 25.06 -2.326 91.51%
q13 46.81 46.58 -0.234 99.50%
q14 15.61 19.99 4.382 128.08%
q15 29.11 27.55 -1.566 94.62%
q16 15.49 15.86 0.367 102.37%
q17 103.03 103.27 0.238 100.23%
q18 150.83 151.33 0.502 100.33%
q19 12.83 13.89 1.060 108.26%
q20 26.96 28.68 1.729 106.41%
q21 226.18 229.12 2.941 101.30%
q22 14.05 13.95 -0.093 99.34%
total 1251.41 1252.49 1.083 100.09%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CORE] Do not modify attribute name in transform scan operator

7 participants