Changes from 1 commit
22 commits
2a2df72  Palantir Build infrastructure, Palantir Hadoop, Palantir Parquet (rshkv, Feb 26, 2021)
5075e0a  [SPARK-33770][SQL][TESTS][3.1][3.0] Fix the `ALTER TABLE .. DROP PART… (MaxGekk, Dec 14, 2020)
397184c  [SPARK-25200][YARN] Allow specifying HADOOP_CONF_DIR as spark property (rshkv, Feb 23, 2021)
bbd8d9b  [SPARK-26626][SQL] Maximum size for repeatedly substituted aliases in… (rshkv, Feb 23, 2021)
4ba513b  [SPARK-20952] ParquetFileFormat should forward TaskContext to its for… (rshkv, Feb 23, 2021)
ce81128  [SPARK-18079] [SQL] CollectLimitExec.executeToIterator should perform… (rshkv, Feb 23, 2021)
edd1779  [SPARK-33089][CORE] Enhance ExecutorPlugin API to include callbacks o… (fsamuel-bs, Oct 16, 2020)
2996136  Allow custom external catalogs (#127) (rshkv, Feb 26, 2021)
89c98b3  Custom CatalogFileIndex (#364) (rshkv, Feb 26, 2021)
0c99eb0  Implement a Docker image generator gradle plugin for Spark applications (jdcasale, Feb 16, 2021)
f830c35  Palantir SafeLogging (jdcasale, Feb 16, 2021)
36d6e37  K8s local file mounting (rshkv, Feb 23, 2021)
1209fb7  K8s local deploy mode (rshkv, Feb 23, 2021)
bfa8d60  Infer Pandas string columns in Arrow conversion on Python 2 (rshkv, Feb 25, 2021)
16cb2a4  Palantir Conda Runner & R-Test infrastructure (rshkv, Feb 25, 2021)
1879255  [SPARK-33984][PYTHON] Upgrade to Py4J 0.10.9.1 (HyukjinKwon, Jan 4, 2021)
02e67e0  [SPARK-21195][CORE] MetricSystem should pick up dynamically registere… (rshkv, Feb 19, 2021)
efe361f  Update contributing-to-spark.md (jdcasale, Mar 4, 2021)
f395b15  Default spark.sql.parquet.outputTimestampType to TIMESTAMP_MICROS (rshkv, Mar 8, 2021)
6d71135  [SPARK-33504][CORE][3.0] The application log in the Spark history ser… (viirya, Feb 24, 2021)
e46dbe7  [SPARK-34232][CORE][3.0] Redact SparkListenerEnvironmentUpdate event … (viirya, Feb 24, 2021)
a2bd672  [SPARK-34534] Fix blockIds order when use FetchShuffleBlocks to fetch… (Mar 3, 2021)
Allow custom external catalogs (#127)
This adds a config that allows us to inject a custom session state builder.
Internally we use it to build SparkSessions that are highly configured
beyond what Spark's built-in configs allow. Most importantly, that
includes building and registering our own session catalog (v1)
implementation with the SparkSession.

You can find how we use this config here [1] and our own
SessionStateBuilder here [2].

[1]: https://pl.ntr/1UU
[2]: https://pl.ntr/1UT

Co-authored-by: Robert Kruszewski <robertk@palantir.com>
Co-authored-by: Josh Casale <jcasale@palantir.com>
Co-authored-by: Will Raschkowski <wraschkowski@palantir.com>
3 people committed Mar 4, 2021
commit 2996136e30cbd7b421eb6112ef11b05c321f0a51
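
In practice, the new key lets a downstream build point Spark at its own session state builder. A hedged usage sketch — the builder class name `com.example.MySessionStateBuilder` is a placeholder, assumed to be a `BaseSessionStateBuilder` subclass available on the classpath, not something from this PR:

```scala
import org.apache.spark.sql.SparkSession

// "com.example.MySessionStateBuilder" is a hypothetical custom
// BaseSessionStateBuilder subclass; Spark instantiates the configured
// class by reflection when building the session state.
val spark = SparkSession.builder()
  .master("local[*]")
  .config("spark.sql.sessionStateImplementation", "com.example.MySessionStateBuilder")
  .getOrCreate()
```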
@@ -40,9 +40,13 @@ object StaticSQLConf {
     .internal()
     .version("2.0.0")
     .stringConf
-    .checkValues(Set("hive", "in-memory"))
     .createWithDefault("in-memory")

+  val SESSION_STATE_IMPLEMENTATION = buildStaticConf("spark.sql.sessionStateImplementation")
+    .internal()
+    .stringConf
+    .createWithDefault(CATALOG_IMPLEMENTATION.defaultValueString)
+
   val GLOBAL_TEMP_DATABASE = buildStaticConf("spark.sql.globalTempDatabase")
     .internal()
     .version("2.1.0")
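
The new entry takes its default from `CATALOG_IMPLEMENTATION`, so an unset `spark.sql.sessionStateImplementation` behaves exactly as before. A minimal stand-in sketch of that default chaining (`ConfEntry` is illustrative, not Spark's `ConfigBuilder` API):

```scala
// Stand-in for a static SQL config entry; illustrative only.
case class ConfEntry(key: String, defaultValueString: String)

val CATALOG_IMPLEMENTATION =
  ConfEntry("spark.sql.catalogImplementation", "in-memory")

// Mirrors the commit: the session-state entry inherits the catalog
// entry's default rather than hard-coding its own.
val SESSION_STATE_IMPLEMENTATION =
  ConfEntry("spark.sql.sessionStateImplementation",
    CATALOG_IMPLEMENTATION.defaultValueString)
```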
@@ -43,7 +43,7 @@ import org.apache.spark.sql.execution._
 import org.apache.spark.sql.execution.command.ExternalCommandExecutor
 import org.apache.spark.sql.execution.datasources.{DataSource, LogicalRelation}
 import org.apache.spark.sql.internal._
-import org.apache.spark.sql.internal.StaticSQLConf.CATALOG_IMPLEMENTATION
+import org.apache.spark.sql.internal.StaticSQLConf.{CATALOG_IMPLEMENTATION, SESSION_STATE_IMPLEMENTATION}
 import org.apache.spark.sql.sources.BaseRelation
 import org.apache.spark.sql.streaming._
 import org.apache.spark.sql.types.{DataType, StructType}
@@ -866,6 +866,7 @@ object SparkSession extends Logging {
    */
   def enableHiveSupport(): Builder = synchronized {
     if (hiveClassesArePresent) {
+      config(SESSION_STATE_IMPLEMENTATION.key, "hive")
       config(CATALOG_IMPLEMENTATION.key, "hive")
     } else {
       throw new IllegalArgumentException(
@@ -1083,9 +1084,10 @@ object SparkSession extends Logging {
     "org.apache.spark.sql.hive.HiveSessionStateBuilder"

   private def sessionStateClassName(conf: SparkConf): String = {
-    conf.get(CATALOG_IMPLEMENTATION) match {
+    conf.get(SESSION_STATE_IMPLEMENTATION) match {
       case "hive" => HIVE_SESSION_STATE_BUILDER_CLASS_NAME
       case "in-memory" => classOf[SessionStateBuilder].getCanonicalName
+      case builder => builder
     }
   }
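
The rewritten `sessionStateClassName` keeps the two built-in keywords and otherwise treats the configured value as a fully-qualified builder class name. A self-contained sketch of that dispatch (the object and constant names here are stand-ins; the two built-in class names are the ones the diff refers to):

```scala
object SessionStateResolution {
  val HiveSessionStateBuilderClassName =
    "org.apache.spark.sql.hive.HiveSessionStateBuilder"
  val InMemorySessionStateBuilderClassName =
    "org.apache.spark.sql.internal.SessionStateBuilder"

  // "hive" and "in-memory" map to the built-in builders; any other
  // value is passed through as a custom builder class name.
  def sessionStateClassName(value: String): String = value match {
    case "hive" => HiveSessionStateBuilderClassName
    case "in-memory" => InMemorySessionStateBuilderClassName
    case builder => builder
  }
}
```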
@@ -197,6 +197,7 @@ object SharedState extends Logging {
     conf.get(CATALOG_IMPLEMENTATION) match {
       case "hive" => HIVE_EXTERNAL_CATALOG_CLASS_NAME
       case "in-memory" => classOf[InMemoryCatalog].getCanonicalName
+      case name => name
     }
   }
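
`SharedState` gets the same pass-through for the external catalog: with the `checkValues` restriction removed from `spark.sql.catalogImplementation`, any other value is used verbatim as the catalog class name. A matching self-contained sketch (stand-in object; the built-in class names are the ones the constants in the diff resolve to, to my best knowledge):

```scala
object ExternalCatalogResolution {
  val HiveExternalCatalogClassName =
    "org.apache.spark.sql.hive.HiveExternalCatalog"
  val InMemoryCatalogClassName =
    "org.apache.spark.sql.catalyst.catalog.InMemoryCatalog"

  // Mirrors SharedState's resolution after this commit: unknown values
  // fall through as custom external catalog class names.
  def externalCatalogClassName(value: String): String = value match {
    case "hive" => HiveExternalCatalogClassName
    case "in-memory" => InMemoryCatalogClassName
    case name => name
  }
}
```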