-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-18016][SQL][CATALYST] Code Generation: Constant Pool Limit - Class Splitting #18075
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from 1 commit
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
76e291b
class_splitting_only adding class splitting
b6bf6db
class_splitting_only addressing code review comments: typo, case stat…
28fc548
class_splitting_only refactoring classFunctions to use a simpler list
d30d097
class_splitting_only fixing error in pom scalatest-maven-plugin
442332b
class_splitting_only more consistent use of assoc notation
a1c93fb
class_splitting_only fixing classFunctions buffer append
1086bb3
class_splitting_only removing instances of this, adding nested class …
7fe5e4a
class_splitting_only using a map for functions since some functions m…
678b4ad
class_splitting_only addressing review comments
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
class_splitting_only fixing classFunctions buffer append
- Loading branch information
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -28,7 +28,6 @@ import scala.util.control.NonFatal | |
|
|
||
| import com.google.common.cache.{CacheBuilder, CacheLoader} | ||
| import com.google.common.util.concurrent.{ExecutionError, UncheckedExecutionException} | ||
| import org.apache.commons.lang3.exception.ExceptionUtils | ||
| import org.codehaus.commons.compiler.CompileException | ||
| import org.codehaus.janino.{ByteArrayClassLoader, ClassBodyEvaluator, JaninoRuntimeException, SimpleCompiler} | ||
| import org.codehaus.janino.util.ClassFile | ||
|
|
@@ -261,6 +260,10 @@ class CodegenContext { | |
| * | ||
| * @param funcName the class-unqualified name of the function | ||
| * @param funcCode the body of the function | ||
| * @param inlineToOuterClass whether the given code must be inlined to the `OuterClass`. This | ||
| * can be necessary when a function is declared outside of the context | ||
| * it is eventually referenced and a returned qualified function name | ||
| * cannot otherwise be accessed. | ||
| * @return the name of the function, qualified by class if it will be inlined to a private, | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ditto. |
||
| * nested sub-class | ||
| */ | ||
|
|
@@ -287,7 +290,7 @@ class CodegenContext { | |
| val name = classInfo._1 | ||
|
|
||
| classSize.update(name, classSize(name) + funcCode.length) | ||
| classFunctions.update(name, classFunctions(name) += funcCode) | ||
| classFunctions(name).append(funcCode) | ||
|
|
||
| if (name.equals("OuterClass")) { | ||
| funcName | ||
|
|
||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you give an example? I'm not very clear when we need this
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see the portion of
doConsumein the Limit class where thestopEarlyfunction is registered, https://github.com/apache/spark/pull/18075/files#diff-379cccace8699ca00b76ff5631222adeR73In this section of code, the registration of the function is separate from the caller code, so unlike other changes in this patch, we have no way of informing the caller code what the potentially class-qualified name of the function would be if it were inlined to a nested class. Instead, the caller code for the function (in WholeStageCodegenExec), makes a hard assumption that
stopEarlywill be visible globally, that is, in the outer class. The caller is divorced from the function producer across classes, so it's not clear how to make a generated function name visible, but the hint to inline to just inline the function to the outer class fixes that issue.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me, as the
stopEarlyinLimitis going to override thestopEarlyinBufferedRowIterator, we can only put it in outer class.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, whole stage codegen is really tricky...