-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-20254][SQL] Remove unnecessary data conversion for Dataset with primitive array #17568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
1ece803
c042ff2
264a18b
82a1f2b
cc3c7bd
7d43d27
3bcee3f
1b24be0
5174b5e
e608c45
2ee4030
1d6ab36
0fd8c25
791aad9
f695e50
ce6927d
a6bedee
c0dca2b
8de6915
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
- Loading branch information
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -110,11 +110,18 @@ object EliminateMapObjects extends Rule[LogicalPlan] { | |
| def apply(plan: LogicalPlan): LogicalPlan = plan transform { | ||
|
||
| case _ @ DeserializeToObject(Invoke( | ||
| MapObjects(_, _, _, Cast(LambdaVariable(_, _, dataType, _), castDataType, _), | ||
|
||
| inputData, None, _), | ||
| inputData, None), | ||
| funcName, returnType: ObjectType, arguments, propagateNull, returnNullable), | ||
| outputObjAttr, child) if dataType == castDataType => | ||
|
||
| DeserializeToObject(Invoke( | ||
| inputData, funcName, returnType, arguments, propagateNull, returnNullable), | ||
| outputObjAttr, child) | ||
| case _ @ DeserializeToObject(Invoke( | ||
|
||
| MapObjects(_, _, _, LambdaVariable(_, _, dataType, _), inputData, None), | ||
|
||
| funcName, returnType: ObjectType, arguments, propagateNull, returnNullable), | ||
| outputObjAttr, child) => | ||
| DeserializeToObject(Invoke( | ||
| inputData, funcName, returnType, arguments, propagateNull, returnNullable), | ||
| outputObjAttr, child) | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this safe to do? According to the description of
AssertNotNull, evencis non-nullable, we still need to add this assertion for some cases.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this is what @cloud-fan suggested in his comment.
Is there better alternative implementation to remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if @cloud-fan's no-op
AssertNotNullis as the same as the case inAssertNotNull's description.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah good catch! sorry it was my mistake, but then seems we can not remove
MapObjects, as the null check have to be done.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, I checked all the usage of
AssertNotNull, we never useAssertNotNullto check a not nullable column/field, seems the document ofAssertNotNullis wrong. Can you double check?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @cloud-fan. I have also checked the usages of
AssertNotNull. IIUC, all of them are used for throwing a runtime exception.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the purpose of
AssertNotNullis used to give proper exception in runtime when an expression (note: it can be nullable or non-nullable expression) evaluates to null value.Maybe for
MapObjects, we can safely remove it. But I am not sure other cases it is okay too.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as long as we trust the
nullableproperty, I think it's safe to remove it. We don't useAssertNullto validate the input data.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Sounds reasonable to me.