Commit 6dd4001
[SPARK-53490][CONNECT][SQL] Fix Protobuf conversion in observed metrics
### What changes were proposed in this pull request?
This PR fixes a critical issue in the protobuf conversion of observed metrics in Spark Connect, specifically when dealing with complex data types like structs, arrays, and maps. The main changes include:
1. **Modified Observation class to store Row objects instead of Map[String, Any]**: Changed the internal promise type from `Promise[Map[String, Any]]` to `Promise[Row]` to preserve type information during protobuf serialization/deserialization.
2. **Enhanced protobuf conversion for complex types**:
- Added proper handling for struct types by creating `GenericRowWithSchema` objects instead of tuples
- Added support for map type conversion in `LiteralValueProtoConverter`
- Improved data type inference with a new `getDataType` method that properly handles all literal types
3. **Fixed observed metrics**: Modified the observed metrics processing to include data type information in the protobuf conversion, ensuring that complex types are properly serialized and deserialized.
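To illustrate why the internal type moved from a map to a `Row`, the sketch below contrasts the two representations for a struct-valued metric. It assumes Spark's `sql/api` classes are on the classpath. The field names `rows` and `maxid` and the values are illustrative only, and this is not the code from this PR.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Schema of the struct metric: struct<rows: bigint, maxid: bigint>.
val structSchema = StructType(Seq(
  StructField("rows", LongType, nullable = false),
  StructField("maxid", LongType, nullable = true)))

// Schema-less representation: a plain Row created with Row(...) carries no
// field names or types, so a converter only sees an opaque value like [10,9].
val untyped: Map[String, Any] = Map("struct" -> Row(10L, 9L))

// Schema-carrying representation: the Row knows its own field names and types,
// which is the information needed to emit a typed Protobuf literal.
val typed: Row = new GenericRowWithSchema(Array[Any](10L, 9L), structSchema)

// Lookup by field name only works when the Row has a schema.
val rows = typed.getAs[Long]("rows")
val fieldTypes = typed.schema.fields.map(_.dataType).toSeq
```

The design point is that a `Row` with a schema is self-describing, whereas a `Map[String, Any]` holding a schema-less value leaves the converter to guess the type.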
### Why are the changes needed?
The previous implementation had two issues:
1. **Data type loss**: Observed metrics lost their original data types during Protobuf conversion, which caused errors when the values were converted back.
2. **Struct handling problems**: The conversion logic did not properly handle `Row` objects and struct types.
### Does this PR introduce _any_ user-facing change?
Yes, this PR fixes a bug that was preventing users from successfully using observed metrics with complex data types (structs, arrays, maps) in Spark Connect. Users can now:
- Use `struct()` expressions in observed metrics and receive properly typed `GenericRowWithSchema` objects
- Use `array()` expressions in observed metrics and receive properly typed arrays
- Use `map()` expressions in observed metrics and receive properly typed maps
Previously, the code below would fail.
```scala
import org.apache.spark.sql.Observation
import org.apache.spark.sql.functions.{count, lit, max, struct}

// `spark` is an existing Spark Connect SparkSession.
val observation = Observation("struct")
spark
  .range(10)
  .observe(observation, struct(count(lit(1)).as("rows"), max("id").as("maxid")).as("struct"))
  .collect()
observation.get
// Below is the error message:
// org.apache.spark.SparkUnsupportedOperationException: literal [10,9] not supported (yet).
//   org.apache.spark.sql.connect.common.LiteralValueProtoConverter$.toLiteralProtoBuilder(LiteralValueProtoConverter.scala:104)
//   org.apache.spark.sql.connect.common.LiteralValueProtoConverter$.toLiteralProto(LiteralValueProtoConverter.scala:203)
//   org.apache.spark.sql.connect.execution.SparkConnectPlanExecution$.$anonfun$createObservedMetricsResponse$2(SparkConnectPlanExecution.scala:571)
//   org.apache.spark.sql.connect.execution.SparkConnectPlanExecution$.$anonfun$createObservedMetricsResponse$2$adapted(SparkConnectPlanExecution.scala:570)
```
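With the fix in place, the same query is expected to complete, and the struct metric should come back as a `Row` that carries its schema. The helper below is a hypothetical sketch (the name `readStructMetric` is not part of Spark) of how a caller might read the fields from the map returned by `observation.get`. The hand-built map stands in for a real result and assumes the metric is stored under the key `struct`.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Hypothetical helper: reads (rows, maxid) from an observed-metrics map,
// assuming the "struct" entry is a Row with a schema attached.
def readStructMetric(metrics: Map[String, Any]): (Long, Long) = {
  val row = metrics("struct").asInstanceOf[Row]
  (row.getAs[Long]("rows"), row.getAs[Long]("maxid"))
}

// Stand-in for the result of observation.get on spark.range(10):
// 10 rows counted, largest id is 9.
val metricSchema = StructType(Seq(
  StructField("rows", LongType),
  StructField("maxid", LongType)))
val sample: Map[String, Any] =
  Map("struct" -> new GenericRowWithSchema(Array[Any](10L, 9L), metricSchema))

val (rowCount, maxId) = readStructMetric(sample)
```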
### How was this patch tested?
`build/sbt "connect-client-jvm/testOnly *ClientE2ETestSuite -- -z SPARK-53490"`
`build/sbt "connect/testOnly *LiteralExpressionProtoConverterSuite"`
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor 1.5.9
Closes #52236 from heyihong/SPARK-53490.
Authored-by: Yihong He <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
1 parent feaf659 commit 6dd4001
File tree
9 files changed, +245 -137 lines changed
- sql
  - api/src/main/scala/org/apache/spark/sql
  - connect
    - client/jvm/src/test/scala/org/apache/spark/sql/connect
    - common/src/main/scala/org/apache/spark/sql/connect
      - client
      - common
    - server/src
      - main/scala/org/apache/spark/sql/connect
        - execution
        - planner
      - test/scala/org/apache/spark/sql/connect/planner
Lines changed: 19 additions & 6 deletions
Lines changed: 36 additions & 0 deletions
Lines changed: 1 addition & 1 deletion
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/common/DataTypeProtoConverter.scala
Lines changed: 1 addition & 1 deletion
Lines changed: 120 additions & 58 deletions
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/execution/ExecuteThreadRunner.scala
Lines changed: 8 additions & 6 deletions