class_splitting_only fixing classFunctions buffer append
Aleksander Eskilson committed May 31, 2017
commit a1c93fbc52a873fd8acdb7607388d56bc183a9bc
@@ -28,7 +28,6 @@ import scala.util.control.NonFatal

import com.google.common.cache.{CacheBuilder, CacheLoader}
import com.google.common.util.concurrent.{ExecutionError, UncheckedExecutionException}
import org.apache.commons.lang3.exception.ExceptionUtils
import org.codehaus.commons.compiler.CompileException
import org.codehaus.janino.{ByteArrayClassLoader, ClassBodyEvaluator, JaninoRuntimeException, SimpleCompiler}
import org.codehaus.janino.util.ClassFile
@@ -261,6 +260,10 @@ class CodegenContext {
*
* @param funcName the class-unqualified name of the function
* @param funcCode the body of the function
+ * @param inlineToOuterClass whether the given code must be inlined to the `OuterClass`. This
Contributor

Can you give an example? It's not clear to me when we need this.

Author

@bdrillard Jun 14, 2017

Yes, see the portion of doConsume in the Limit class where the stopEarly function is registered, https://github.com/apache/spark/pull/18075/files#diff-379cccace8699ca00b76ff5631222adeR73

In this section of code, the registration of the function is separate from the caller code, so unlike other changes in this patch, we have no way of telling the caller what the potentially class-qualified name of the function would be if it were inlined to a nested class. Instead, the caller code for the function (in WholeStageCodegenExec) makes a hard assumption that stopEarly will be visible globally, that is, in the outer class. The caller is divorced from the function producer across classes, so it's not clear how to make a generated function name visible to it, but the hint to inline the function to the outer class fixes that issue.

Member

It seems to me that, since the stopEarly in Limit is going to override the stopEarly in BufferedRowIterator, we can only put it in the outer class.

Contributor

yup, whole stage codegen is really tricky...

+ * can be necessary when a function is declared outside of the context
+ * it is eventually referenced and a returned qualified function name
+ * cannot otherwise be accessed.
* @return the name of the function, qualified by class if it will be inlined to a private,
Member

ditto.

* nested sub-class
*/
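
Editor's sketch of the contract this doc comment describes, not the code in this patch: a registration function that returns a plain name for functions placed in `OuterClass` and an instance-qualified name otherwise, with `inlineToOuterClass` forcing the former. The `Registry` class, the `maxClassSize` threshold, and the `NestedClass`/`nestedClassInstance` naming are assumptions made for illustration; only `funcName`, `funcCode`, `inlineToOuterClass`, `classSize`, `classFunctions`, and `"OuterClass"` appear in the diff.

```scala
import scala.collection.mutable

// Hypothetical, simplified model of the registration logic. Not Spark's implementation.
object InlineToOuterClassSketch {
  final class Registry(maxClassSize: Int) {
    val classSize = mutable.Map("OuterClass" -> 0)
    val classFunctions =
      mutable.Map("OuterClass" -> mutable.ListBuffer.empty[String])

    // (class name, instance name) of the class currently receiving functions.
    private var current: (String, String) = ("OuterClass", "")
    private var nestedCount = 0

    def addNewFunction(
        funcName: String,
        funcCode: String,
        inlineToOuterClass: Boolean = false): String = {
      val classInfo =
        if (inlineToOuterClass) {
          // The caller cannot learn a qualified name, so keep the function global.
          ("OuterClass", "")
        } else {
          // Assumed policy: start a new nested class once the current one is too large.
          if (classSize(current._1) > maxClassSize) {
            nestedCount += 1
            val className = s"NestedClass$nestedCount"
            classSize(className) = 0
            classFunctions(className) = mutable.ListBuffer.empty[String]
            current = (className, s"nestedClassInstance$nestedCount")
          }
          current
        }
      val name = classInfo._1
      classSize.update(name, classSize(name) + funcCode.length)
      classFunctions(name).append(funcCode)
      // Unqualified in the outer class, instance-qualified in a nested class.
      if (name.equals("OuterClass")) funcName else s"${classInfo._2}.$funcName"
    }
  }
}
```

Under these assumptions, a caller that hard-codes the unqualified name (as the thread describes for `stopEarly`) keeps working only if the function was registered with `inlineToOuterClass = true`; every other caller has to use the returned name.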
@@ -287,7 +290,7 @@ class CodegenContext {
val name = classInfo._1

classSize.update(name, classSize(name) + funcCode.length)
- classFunctions.update(name, classFunctions(name) += funcCode)
+ classFunctions(name).append(funcCode)

if (name.equals("OuterClass")) {
funcName
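
Editor's sketch comparing the two forms of the buffer update in this hunk. The declaration of `classFunctions` is not shown here, so the sketch assumes it is a `mutable.Map[String, mutable.ListBuffer[String]]`; the helper names `registerOld` and `registerNew` are hypothetical. Under that assumption, `+=` already mutates the buffer in place and returns it, so wrapping it in `update` only re-stores a reference the map already holds.

```scala
import scala.collection.mutable

// Assumes classFunctions is a Map[String, ListBuffer[String]]; the real type is not in this hunk.
object BufferAppendSketch {
  type Functions = mutable.Map[String, mutable.ListBuffer[String]]

  // Removed form: `+=` appends in place and returns the same buffer,
  // which `update` then writes back under the key it came from.
  def registerOld(fns: Functions, name: String, funcCode: String): Unit =
    fns.update(name, fns(name) += funcCode)

  // Added form: append directly to the buffer held by the map.
  def registerNew(fns: Functions, name: String, funcCode: String): Unit =
    fns(name).append(funcCode)
}
```

With a `ListBuffer` value type both helpers leave the map in the same state, so the added line reads as a simplification; whether the original form misbehaved with the buffer type actually used in the patch cannot be determined from this hunk.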