@ulysses-you
Contributor

What changes were proposed in this pull request?

Invalidate the table cache after alter table properties (set and unset).

Why are the changes needed?

Table properties can change write behavior, e.g. parquet.compression on a Parquet table.

If you execute the following SQL, the written file uses snappy compression rather than zstd, because the INSERT still sees the table metadata that was cached before the ALTER TABLE.

CREATE TABLE t (c int) STORED AS PARQUET;
-- cache table metadata
SELECT * FROM t;
ALTER TABLE t SET TBLPROPERTIES('parquet.compression'='zstd');
INSERT INTO TABLE t values(1);

So we should invalidate the table cache after altering table properties.
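
To illustrate why the invalidation matters, here is a minimal Scala sketch of a catalog with a metadata cache. It is a toy model, not Spark's implementation: `TableMeta`, `ToyCatalog`, `StaleCacheDemo` and the `invalidate` flag are hypothetical names introduced for illustration, and snappy is assumed to be the default codec. Only the name `invalidateCachedTable` is borrowed from the fix.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for table metadata.
final case class TableMeta(name: String, properties: Map[String, String])

// Toy catalog: `store` is the source of truth, `cache` holds resolved metadata.
final class ToyCatalog {
  private val store = mutable.Map.empty[String, TableMeta]
  private val cache = mutable.Map.empty[String, TableMeta]

  def createTable(name: String): Unit = {
    store.update(name, TableMeta(name, Map.empty))
  }

  // Reads go through the cache, so the first lookup pins the metadata.
  def lookup(name: String): TableMeta =
    cache.getOrElseUpdate(name, store(name))

  def invalidateCachedTable(name: String): Unit = {
    cache.remove(name)
  }

  // `invalidate = false` models the behavior before the fix.
  def setProperties(name: String, props: Map[String, String], invalidate: Boolean): Unit = {
    val old = store(name)
    store.update(name, old.copy(properties = old.properties ++ props))
    if (invalidate) invalidateCachedTable(name)
  }
}

object StaleCacheDemo {
  // Codec a write would pick; "snappy" is the default assumed in this sketch.
  def codecForWrite(catalog: ToyCatalog, name: String): String =
    catalog.lookup(name).properties.getOrElse("parquet.compression", "snappy")

  // Replays the SQL above against the toy catalog.
  def run(invalidate: Boolean): String = {
    val catalog = new ToyCatalog
    catalog.createTable("t")                  // CREATE TABLE t ...
    catalog.lookup("t")                       // SELECT * FROM t (fills the cache)
    catalog.setProperties("t", Map("parquet.compression" -> "zstd"), invalidate)
    codecForWrite(catalog, "t")               // INSERT INTO TABLE t ...
  }
}
```

In this model, `StaleCacheDemo.run(invalidate = false)` returns `"snappy"` because the write reads the stale cached entry, while `StaleCacheDemo.run(invalidate = true)` returns `"zstd"`.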

Does this PR introduce any user-facing change?

yes, bug fix

How was this patch tested?

Add test

@github-actions github-actions bot added the SQL label Oct 22, 2021
Member

@yaooqinn yaooqinn left a comment

LGTM

@ulysses-you
Contributor Author

cc @yaooqinn @cloud-fan @viirya

@SparkQA

SparkQA commented Oct 22, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49006/

@SparkQA

SparkQA commented Oct 22, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49006/

@SparkQA

SparkQA commented Oct 22, 2021

Test build #144535 has finished for PR 34365 at commit 587a285.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

properties = table.properties ++ properties,
comment = properties.get(TableCatalog.PROP_COMMENT).orElse(table.comment))
catalog.alterTable(newTable)
catalog.invalidateCachedTable(tableName)
Member

Would it be better to do this inside SessionCatalog.alterTable?

Contributor Author

Thank you, @sunchao. I think that would be a code improvement, and if we decide to do it, we should also consider other methods such as alterTableDataSchema, alterPartition, etc. So for this PR, I prefer the simple, straightforward fix.

Member

Thanks. This looks good to me. I'm just curious whether other alter table variants have the same issue.

@yaooqinn yaooqinn closed this in 02d3b3b Oct 25, 2021
yaooqinn pushed a commit that referenced this pull request Oct 25, 2021
Closes #34365 from ulysses-you/SPARK-37098.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
(cherry picked from commit 02d3b3b)
Signed-off-by: Kent Yao <[email protected]>
yaooqinn pushed a commit that referenced this pull request Oct 25, 2021
Closes #34365 from ulysses-you/SPARK-37098.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
(cherry picked from commit 02d3b3b)
Signed-off-by: Kent Yao <[email protected]>
@yaooqinn
Member

thanks, merged to master/3.2/3.1.

@yaooqinn
Member

It conflicts with 3.0. Would you please backport it, @ulysses-you?

@ulysses-you
Contributor Author

thank you all, created #34379

@ulysses-you ulysses-you deleted the SPARK-37098 branch October 25, 2021 08:39
@dongjoon-hyun
Member

+1, LGTM. Thank you, @ulysses-you and all.

@zzcclp
Contributor

zzcclp commented Oct 26, 2021

It conflicts with 3.0. Would you please backport it, @ulysses-you?

This PR can't be merged into branch-3.1 directly either; it fails with the error below:

type mismatch;
 found   : org.apache.spark.sql.catalyst.TableIdentifier
 required: org.apache.spark.sql.catalyst.QualifiedTableName
    catalog.invalidateCachedTable(tableName)

The error is in AlterTableUnsetPropertiesCommand: on branch-3.1, SessionCatalog has no def invalidateCachedTable(name: TableIdentifier): Unit method.
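
One way to bridge a mismatch like this is to build the qualified name from the identifier before calling the method. The Scala sketch below is only a guess at the shape of such a conversion, not the code from any backport: the two case classes are simplified stand-ins that mirror the names in the error message, and `qualify` and `currentDatabase` are hypothetical.

```scala
// Simplified stand-ins that only mirror the names in the compiler error;
// they are assumptions for illustration, not the real Spark classes.
final case class TableIdentifier(table: String, database: Option[String] = None)
final case class QualifiedTableName(database: String, name: String)

object QualifiedNames {
  // Hypothetical helper: use the identifier's database when present,
  // otherwise fall back to the session's current database.
  def qualify(id: TableIdentifier, currentDatabase: String): QualifiedTableName =
    QualifiedTableName(id.database.getOrElse(currentDatabase), id.table)
}
```

For example, `QualifiedNames.qualify(TableIdentifier("t"), "default")` yields `QualifiedTableName("default", "t")`, which has the type the branch-3.1 method requires according to the error.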

@yaooqinn
Member

@zzcclp Thanks for reporting this. I created 05aea97 to revert it from branch-3.1.

@ulysses-you, would you please also send a PR targeting 3.1?

@ulysses-you
Contributor Author

Created #34390 for branch-3.1.

dongjoon-hyun pushed a commit that referenced this pull request Oct 26, 2021
This PR backports #34365 to branch-3.1.

Closes #34390 from ulysses-you/SOARK-37098-3.1.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Oct 26, 2021
This PR backports #34365 to branch-3.0.

Closes #34379 from ulysses-you/SPARK-37098-3.0.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
sunchao pushed a commit to sunchao/spark that referenced this pull request Dec 8, 2021
Closes apache#34365 from ulysses-you/SPARK-37098.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
(cherry picked from commit 02d3b3b)
Signed-off-by: Kent Yao <[email protected]>
(cherry picked from commit b7948ee)
fishcus pushed a commit to fishcus/spark that referenced this pull request Jan 12, 2022
This PR backports apache#34365 to branch-3.1.

Closes apache#34390 from ulysses-you/SOARK-37098-3.1.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
catalinii pushed a commit to lyft/spark that referenced this pull request Feb 22, 2022
Closes apache#34365 from ulysses-you/SPARK-37098.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
(cherry picked from commit 02d3b3b)
Signed-off-by: Kent Yao <[email protected]>
catalinii pushed a commit to lyft/spark that referenced this pull request Mar 4, 2022
Closes apache#34365 from ulysses-you/SPARK-37098.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
(cherry picked from commit 02d3b3b)
Signed-off-by: Kent Yao <[email protected]>