@ulysses-you
Contributor

This PR backports #34365 to branch-3.0.

What changes were proposed in this pull request?

Invalidate the table cache after altering table properties (both SET TBLPROPERTIES and UNSET TBLPROPERTIES).

Why are the changes needed?

Table properties can change write behavior, e.g. parquet.compression on a Parquet table.

If you execute the following SQL, the written file uses snappy compression rather than zstd, because the INSERT still sees the table metadata that was cached before the ALTER TABLE:

CREATE TABLE t (c int) STORED AS PARQUET;
-- cache table metadata
SELECT * FROM t;
ALTER TABLE t SET TBLPROPERTIES('parquet.compression'='zstd');
INSERT INTO TABLE t values(1);

So we should invalidate the table cache after table properties are altered.
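For readers who want to see the failure mode in isolation, here is a toy Python model of it. This is not Spark code: `Metastore`, `SessionCatalog`, `invalidate_on_alter`, and `codec_after_alter` are hypothetical names made up for illustration, and the `snappy` fallback stands in for the default codec used when no table property is set.

```python
class Metastore:
    """Hypothetical stand-in for the external catalog (the source of truth)."""

    def __init__(self):
        self._tables = {}

    def create_table(self, name):
        self._tables[name] = {}

    def get_properties(self, name):
        # Return a copy so a cached snapshot cannot alias the live entry.
        return dict(self._tables[name])

    def set_properties(self, name, props):
        self._tables[name].update(props)


class SessionCatalog:
    """Hypothetical session-level cache of table metadata."""

    def __init__(self, metastore, invalidate_on_alter):
        self._metastore = metastore
        self._invalidate_on_alter = invalidate_on_alter
        self._cache = {}

    def lookup(self, name):
        # The first lookup caches a snapshot; later lookups reuse it.
        if name not in self._cache:
            self._cache[name] = self._metastore.get_properties(name)
        return self._cache[name]

    def alter_table_properties(self, name, props):
        self._metastore.set_properties(name, props)
        if self._invalidate_on_alter:
            # Drop the stale snapshot so the next lookup reloads it.
            self._cache.pop(name, None)

    def write_codec(self, name):
        # Assumed default: snappy when the property is not set.
        return self.lookup(name).get("parquet.compression", "snappy")


def codec_after_alter(invalidate_on_alter):
    """Replay the SQL sequence above against the toy catalog."""
    metastore = Metastore()
    metastore.create_table("t")                    # CREATE TABLE t ...
    catalog = SessionCatalog(metastore, invalidate_on_alter)
    catalog.lookup("t")                            # SELECT * FROM t (caches metadata)
    catalog.alter_table_properties(
        "t", {"parquet.compression": "zstd"})      # ALTER TABLE t SET TBLPROPERTIES
    return catalog.write_codec("t")                # INSERT INTO TABLE t ...
```

Calling `codec_after_alter(False)` models the behavior before this patch (the write still uses the stale snapshot), while `codec_after_alter(True)` models the behavior once the cache entry is dropped on alter.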

Does this PR introduce any user-facing change?

Yes, this is a bug fix.

How was this patch tested?

Added a test.

@SparkQA

SparkQA commented Oct 25, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49053/

@SparkQA

SparkQA commented Oct 25, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49053/

@SparkQA

SparkQA commented Oct 25, 2021

Test build #144582 has finished for PR 34379 at commit c195021.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 25, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49058/

@SparkQA

SparkQA commented Oct 25, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49058/

@SparkQA

SparkQA commented Oct 25, 2021

Test build #144587 has finished for PR 34379 at commit 6c3781c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Oct 26, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49071/

@SparkQA

SparkQA commented Oct 26, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49071/

@SparkQA

SparkQA commented Oct 26, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49075/

@SparkQA

SparkQA commented Oct 26, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49075/

@SparkQA

SparkQA commented Oct 26, 2021

Test build #144601 has finished for PR 34379 at commit 6c3781c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 26, 2021

Test build #144604 has finished for PR 34379 at commit 2ae0a54.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

@dongjoon-hyun
Member

Merged to branch-3.0.

dongjoon-hyun pushed a commit that referenced this pull request Oct 26, 2021
Closes #34379 from ulysses-you/SPARK-37098-3.0.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@ulysses-you
Contributor Author

thank you @dongjoon-hyun

@ulysses-you ulysses-you deleted the SPARK-37098-3.0 branch October 27, 2021 01:21