@ulysses-you ulysses-you commented Nov 15, 2022

What changes were proposed in this pull request?

Skip `UnresolvedHint` in the rule `AddMetadataColumns` to avoid calling `exprId` on an `UnresolvedAttribute`.
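
A minimal sketch of why skipping the hint node avoids the exception. It uses a toy plan model rather than Spark's classes: `HintGuardSketch`, `Relation`, `Hint`, `ResolvedAttr`, `UnresolvedAttr`, and `needsMetadataGuarded` are hypothetical names, and this `hasMetadataCol` is a simplified stand-in, not the code in `Analyzer.scala`.

```scala
import scala.util.Try

object HintGuardSketch {
  // Toy expressions: only resolved attributes carry an expression ID.
  sealed trait Expr { def exprId: Long }
  final case class ResolvedAttr(name: String, exprId: Long) extends Expr
  final case class UnresolvedAttr(name: String) extends Expr {
    // Models the failure in the stack trace: asking for the ID throws.
    def exprId: Long =
      throw new IllegalStateException("Invalid call to exprId on unresolved object")
  }

  // Toy plan nodes; `metadataOutput` stands in for hidden metadata columns.
  sealed trait Plan {
    def children: Seq[Plan]
    def expressions: Seq[Expr]
    def metadataOutput: Seq[ResolvedAttr] = Nil
  }
  final case class Relation(name: String, meta: Seq[ResolvedAttr]) extends Plan {
    val children: Seq[Plan] = Nil
    val expressions: Seq[Expr] = Nil
    override def metadataOutput: Seq[ResolvedAttr] = meta
  }
  // A hint keeps its parameters (here, the table name) as unresolved attributes.
  final case class Hint(name: String, parameters: Seq[Expr], child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
    val expressions: Seq[Expr] = parameters
  }

  // Compares expression IDs, so it throws when it meets an unresolved attribute.
  def hasMetadataCol(plan: Plan): Boolean =
    plan.expressions.exists { e =>
      plan.children.exists(_.metadataOutput.exists(_.exprId == e.exprId))
    }

  // Guarded variant: hint nodes are skipped before any expression is inspected.
  def needsMetadataGuarded(plan: Plan): Boolean = plan match {
    case _: Hint => false
    case other   => hasMetadataCol(other)
  }

  def main(args: Array[String]): Unit = {
    val t2 = Relation("t2", Seq(ResolvedAttr("_metadata", 1L)))
    val hinted = Hint("hash", Seq(UnresolvedAttr("t2")), t2)
    println(Try(hasMetadataCol(hinted)).isFailure) // true: the unguarded check throws
    println(needsMetadataGuarded(hinted))          // false: the hint node is skipped
  }
}
```

The guard sits in front of the ID comparison, so the hint's parameters are never touched; every other node type goes through the same check as before.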

Why are the changes needed?

CREATE TABLE t1(c1 bigint) USING PARQUET;
CREATE TABLE t2(c2 bigint) USING PARQUET;
SELECT /*+ hash(t2) */ * FROM t1 join t2 on c1 = c2;

The query above fails with:

org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to exprId on unresolved object
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute.exprId(unresolved.scala:147)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$AddMetadataColumns$.$anonfun$hasMetadataCol$4(Analyzer.scala:1005)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$AddMetadataColumns$.$anonfun$hasMetadataCol$4$adapted(Analyzer.scala:1005)
  at scala.collection.Iterator.exists(Iterator.scala:969)
  at scala.collection.Iterator.exists$(Iterator.scala:967)
  at scala.collection.AbstractIterator.exists(Iterator.scala:1431)
  at scala.collection.IterableLike.exists(IterableLike.scala:79)
  at scala.collection.IterableLike.exists$(IterableLike.scala:78)
  at scala.collection.AbstractIterable.exists(Iterable.scala:56)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$AddMetadataColumns$.$anonfun$hasMetadataCol$3(Analyzer.scala:1005)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$AddMetadataColumns$.$anonfun$hasMetadataCol$3$adapted(Analyzer.scala:1005) 

Previously, the same query only logged a warning: `WARN HintErrorLogger: Unrecognized hint: hash(t2)`

Does this PR introduce any user-facing change?

Yes, this fixes a regression from 3.3.1.

Note: the root cause is that, since #32841, `UnresolvedHint` is marked as resolved whenever its child is resolved; #37758 then triggered this bug.
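
To illustrate what "resolved whenever its child is resolved" means, here is a small sketch under the assumption of a toy node model. `ResolvedFlagSketch`, `StrictHint`, and `LenientHint` are hypothetical names, not Spark classes; the point is only that such a node can report `resolved` while still holding unresolved parameters.

```scala
object ResolvedFlagSketch {
  sealed trait Node {
    def children: Seq[Node]
    def expressionsResolved: Boolean
    // Conventional definition: own expressions and all children must be resolved.
    def resolved: Boolean = expressionsResolved && children.forall(_.resolved)
  }

  final case class Table(name: String) extends Node {
    val children: Seq[Node] = Nil
    val expressionsResolved: Boolean = true
  }

  // A hint that follows the conventional definition.
  final case class StrictHint(child: Node) extends Node {
    val children: Seq[Node] = Seq(child)
    val expressionsResolved: Boolean = false // a parameter such as `t2` stays unresolved
  }

  // A hint that reports itself resolved as soon as its child is resolved.
  final case class LenientHint(child: Node) extends Node {
    val children: Seq[Node] = Seq(child)
    val expressionsResolved: Boolean = false
    override def resolved: Boolean = child.resolved
  }

  def main(args: Array[String]): Unit = {
    val t2 = Table("t2")
    println(StrictHint(t2).resolved)  // false
    println(LenientHint(t2).resolved) // true, although its parameters are unresolved
  }
}
```

With the second definition, a `resolved` flag on the node says nothing about its parameters, so code that reads `exprId` from them needs its own guard.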

How was this patch tested?

Added a test.

@github-actions github-actions bot added the SQL label Nov 15, 2022
@ulysses-you
Contributor Author

cc @cloud-fan @cfmcgrady @viirya

@cloud-fan
Contributor

Thanks, merging to master/3.3!

@cloud-fan cloud-fan closed this in a9bf5d2 Nov 15, 2022
cloud-fan pushed a commit that referenced this pull request Nov 15, 2022
Closes #38662 from ulysses-you/hint.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit a9bf5d2)
Signed-off-by: Wenchen Fan <[email protected]>
@ulysses-you ulysses-you deleted the hint branch November 15, 2022 09:14
SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022