-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-23873][SQL] Use accessors in interpreted LambdaVariable #20981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @hvanhovell |
|
Test build #88926 has finished for PR 20981 at commit
|
|
retest this please. |
| dataType: DataType, | ||
| nullable: Boolean = true) extends LeafExpression with NonSQLExpression { | ||
|
|
||
| private lazy val accessor: InternalRow => Any = dataType match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make this a bit more generic and also use this for BoundReference.
| } | ||
|
|
||
| val acceptedTypes = elementTypes ++ arrayTypes ++ mapTypes ++ structTypes | ||
| val random = new Random() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please set a seed or use withClue() otherwise it will be very annoying to debug these when something goes south.
|
A few minor comments. Looks good otherwise. |
|
Test build #88930 has finished for PR 20981 at commit
|
|
@hvanhovell Addressed your comment. Thanks. |
|
Test build #89042 has finished for PR 20981 at commit
|
|
Test build #89040 has finished for PR 20981 at commit
|
|
retest this please. |
|
|
||
| override def toString: String = s"input[$ordinal, ${dataType.simpleString}, $nullable]" | ||
|
|
||
| private lazy val accessor: InternalRow => Any = InternalRow.getAccessor(dataType, ordinal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to be lazy?
| case t: StructType => (input) => input.getStruct(ordinal, t.size) | ||
| case _: ArrayType => (input) => input.getArray(ordinal) | ||
| case _: MapType => (input) => input.getMap(ordinal) | ||
| case _ => (input) => input.get(ordinal, dataType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle UDT?
|
Test build #89048 has finished for PR 20981 at commit
|
| /** | ||
| * Returns an accessor for an InternalRow with given data type and ordinal. | ||
| */ | ||
| def getAccessor(dataType: DataType, ordinal: Int): (InternalRow) => Any = dataType match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can make the accessor accept SpecializedGetters, so this can be reused in #20984 too...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no. The accessor in #20984 takes ordinal as input parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well you could generalize to make it take an ordinal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
|
Test build #89101 has finished for PR 20981 at commit
|
| } | ||
|
|
||
|
|
||
| private def getActualDataType(dt: DataType): DataType = dt match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a similar method to the UserDefinedType companion. Which one shall we add?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can use the method you added.
| * actually takes a `SpecializedGetters` input because it can be generalized to other classes | ||
| * that implements `SpecializedGetters` (e.g., `ArrayData`) too. | ||
| */ | ||
| def getAccessor(dataType: DataType): (SpecializedGetters, Int) => Any = dataType match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should move this to the companion object of SpecializedGetters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As SpecializedGetters is a java interface, seems we can't create a companion object for it? Or I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hehe - good point. Let's leave it here for now.
|
Looks good. Two more comments. |
|
Test build #89123 has finished for PR 20981 at commit
|
|
LGTM |
|
Test build #89186 has finished for PR 20981 at commit
|
|
retest this please. |
|
Test build #89199 has finished for PR 20981 at commit
|
|
retest this please. |
|
Test build #89262 has finished for PR 20981 at commit
|
|
ping @hvanhovell |
|
Merging to master. Sorry for the long wait. |
What changes were proposed in this pull request?
Currently, interpreted execution of
LambdaVariablejust usesInternalRow.getto access element. We should use specified accessors if possible.How was this patch tested?
Added test.