Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Don't access SQLConf on task execution.
  • Loading branch information
viirya committed May 7, 2018
commit 1f5cc17741ef0feed5894fbafd21c36a44bc5373
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ package org.apache.spark.sql.catalyst.expressions
import org.codehaus.commons.compiler.CompileException
import org.codehaus.janino.InternalCompilerException

import org.apache.spark.TaskContext
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.util.Utils

Expand All @@ -34,23 +35,36 @@ object CodegenError {
}
}

/**
* Defines values for `SQLConf` config of fallback mode. Use for test only.
*/
object CodegenObjectFactoryMode extends Enumeration {
val AUTO, CODEGEN_ONLY, NO_CODEGEN = Value

def currentMode: CodegenObjectFactoryMode.Value = {
// If we weren't on task execution, accesses that config.
if (TaskContext.get == null) {
val config = SQLConf.get.getConf(SQLConf.CODEGEN_OBJECT_FALLBACK)
CodegenObjectFactoryMode.withName(config)
Copy link
Member Author

@viirya viirya May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This SQL config can only be useful for unit test against codegen and interpreted unsafe projection.

} else {
CodegenObjectFactoryMode.AUTO
}
}
}

/**
* A factory which can be used to create objects that have both codegen and interpreted
* implementations. This tries to create codegen object first, if any compile error happens,
* it fallbacks to interpreted version.
*/
abstract class CodegenObjectFactory[IN, OUT] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any better name? How about CodeGeneratorWithInterpretedFallback and make it extends CodeGenerator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it needs CodeGenerator? I think this CodeGeneratorWithInterpretedFallback just delegate to various CodeGenerator (e.g., GenerateUnsafeProjection) to produce codegen object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateUnsafeProjection also extends CodeGenerator.

Is there any class we want it to extend CodegenObjectFactory but not CodeGenerator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, no doubt GenerateUnsafeProjection extends CodeGenerator. It's the class doing generators of byte codes.

Previously UnsafeProjection has its own interface UnsafeProjectionCreator, does not extends CodeGenerator. So currently I let it follow previous UnsafeProjection's API.

We can change it to implement CodeGenerator and also change the places using UnsafeProjection, if you think it is necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok let's keep the previous way to minimize code changes. How about just rename it to CodeGeneratorWithInterpretedFallback?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Reamed.


// Creates wanted object. First trying codegen implementation. If any compile error happens,
// fallbacks to interpreted version.
def createObject(in: IN): OUT = {
val fallbackMode = SQLConf.get.getConf(SQLConf.CODEGEN_OBJECT_FALLBACK)
// Only in tests, we can use `SQLConf.CODEGEN_OBJECT_FALLBACK` to choose codegen/interpreted
// only path.
if (Utils.isTesting && fallbackMode != "fallback") {
fallbackMode match {
case "codegen-only" => createCodeGeneratedObject(in)
case "interpreted-only" => createInterpretedObject(in)
// We are allowed to choose codegen-only or no-codegen modes if under tests.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm rethinking about it. Do you think this conf is useful if there is a perf problem in codegen and users want to use interpreted version?

Copy link
Member Author

@viirya viirya May 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, possible. If users are aware of codegen objects like unsafe projection can be controlled too. Currently when we fallback from wholestage codegen to interpreted mode, we still use codegen unsafe projection if it doesn't fail compile. If it has perf problem, users can't do anything.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do we want to have this conf for not only test but also production?

if (Utils.isTesting && CodegenObjectFactoryMode.currentMode != CodegenObjectFactoryMode.AUTO) {
CodegenObjectFactoryMode.currentMode match {
case CodegenObjectFactoryMode.CODEGEN_ONLY => createCodeGeneratedObject(in)
case CodegenObjectFactoryMode.NO_CODEGEN => createInterpretedObject(in)
}
} else {
try {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ import org.apache.spark.internal.Logging
import org.apache.spark.internal.config._
import org.apache.spark.network.util.ByteUnit
import org.apache.spark.sql.catalyst.analysis.Resolver
import org.apache.spark.sql.catalyst.expressions.CodegenObjectFactoryMode
import org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator
import org.apache.spark.util.Utils

Expand Down Expand Up @@ -688,14 +689,14 @@ object SQLConf {

val CODEGEN_OBJECT_FALLBACK = buildConf("spark.sql.test.codegenObject.fallback")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CODEGEN_FACTORY_MODE and spark.sql.codegen.factoryMode?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, shall we use CODEGEN_FACTORY_MODE or CODEGEN_FALLBACK_MODE?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CODEGEN_FACTORY_MODE seems more accurate to me. This doesn't control all codegen fallback.

.doc("Determines the behavior of any factories extending `CodegenObjectFactory`" +
" during tests. `fallback` means trying codegen first and then fallbacking to" +
"interpreted if any compile error happens. Disabling fallback if `codegen-only`." +
"`interpreted-only` skips codegen and goes interpreted path always. Note that" +
"this config works only for tests. In production it always runs with `fallback` mode")
" during tests. `AUTO` means trying codegen first and then fallbacking to" +
"interpreted if any compile error happens. Disabling fallback if `CODEGEN_ONLY`." +
"`NO_CODEGEN` skips codegen and goes interpreted path always. Note that" +
"this config works only for tests.")
.internal()
.stringConf
.checkValues(Set("fallback", "codegen-only", "interpreted-only"))
.createWithDefault("fallback")
.checkValues(CodegenObjectFactoryMode.values.map(_.toString))
.createWithDefault(CodegenObjectFactoryMode.AUTO.toString)

val CODEGEN_FALLBACK = buildConf("spark.sql.codegen.fallback")
.internal()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ class CodegenObjectFactorySuite extends SparkFunSuite with PlanTestBase {
private def testCodegenFactory[IN, OUT](factory: CodegenObjectFactory[IN, OUT],
input: IN, checkerForCodegen: OUT => Unit, checkerForInterpreted: OUT => Unit) = {

val modes = Seq("codegen-only", "interpreted-only")
val modes = Seq("CODEGEN_ONLY", "NO_CODEGEN")
.zip(Seq(checkerForCodegen, checkerForInterpreted))

for ((fallbackMode, checker) <- modes) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -197,7 +197,7 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks with PlanTestBa
expression: Expression,
expected: Any,
inputRow: InternalRow = EmptyRow): Unit = {
for (fallbackMode <- Seq("codegen-only", "interpreted-only")) {
for (fallbackMode <- Seq("CODEGEN_ONLY", "NO_CODEGEN")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not hardcode it, call CodegenObjectFactoryMode.xxx instead

withSQLConf(SQLConf.CODEGEN_OBJECT_FALLBACK.key -> fallbackMode) {
val factory = UnsafeProjection
val unsafeRow = evaluateWithUnsafeProjection(expression, inputRow, factory)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ class UnsafeRowConverterSuite extends SparkFunSuite with Matchers with PlanTestB
f: UnsafeProjection.type => Unit): Unit = {
val factory = UnsafeProjection
test(name) {
for (fallbackMode <- Seq("codegen-only", "interpreted-only")) {
for (fallbackMode <- Seq("CODEGEN_ONLY", "NO_CODEGEN")) {
withSQLConf(SQLConf.CODEGEN_OBJECT_FALLBACK.key -> fallbackMode) {
f(factory)
}
Expand Down