
Commit 8700297

Author: vatsal mevada
[SNAP-3165] Instantiate SnappySession only when catalogImplementation is in-memory when running the pyspark shell (#191)

## What changes were proposed in this pull request?

We initialize a `SparkSession` as well as a `SnappySession` when starting the pyspark shell. Previously, `SparkSession` and `SparkContext` were always initialized with Hive support enabled, irrespective of the value of the `spark.sql.catalogImplementation` config. With these changes, we check the value of `spark.sql.catalogImplementation` and do not enable Hive support when this property is explicitly set to `in-memory`. A `SnappySession` is initialized only when the catalog implementation is `in-memory`, to avoid the failure reported in SNAP-3165. Later we can add support for the hive catalog implementation for Python with `SnappySession`.
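In effect, the shell bootstrap now follows the pattern below. This is a condensed sketch of the logic in the diff that follows, not the full shell.py; the `'hive'` default passed to `conf.get` mirrors Spark's fallback when the property is unset.

```python
from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = SparkConf()
# Spark falls back to 'hive' when spark.sql.catalogImplementation is unset
catalog = conf.get('spark.sql.catalogImplementation', 'hive').lower()

if catalog == 'hive':
    # previous behaviour: always build the session with Hive support
    spark = SparkSession.builder.enableHiveSupport().getOrCreate()
else:
    # explicit in-memory catalog: plain SparkSession; shell.py then
    # layers a SnappySession on top of its SparkContext
    spark = SparkSession.builder.getOrCreate()
```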
1 parent 840a4b3 commit 8700297

File tree

1 file changed (+23 -9 lines)

python/pyspark/shell.py

Lines changed: 23 additions & 9 deletions
```diff
@@ -47,6 +47,8 @@
 import py4j
 
 import pyspark
+
+from pyspark import SparkConf
 from pyspark.context import SparkContext
 from pyspark.sql import SparkSession, SQLContext
 from pyspark.sql.snappy import SnappySession
@@ -57,25 +59,36 @@
 
 SparkContext._ensure_initialized()
 
+conf = SparkConf()
+catalogImplementation = conf.get('spark.sql.catalogImplementation', 'hive').lower()
 try:
-    # Try to access HiveConf, it will raise exception if Hive is not added
-    SparkContext._jvm.org.apache.hadoop.hive.conf.HiveConf()
-    spark = SparkSession.builder\
-        .enableHiveSupport()\
-        .getOrCreate()
+    if catalogImplementation == 'hive':
+        # Try to access HiveConf, it will raise exception if Hive is not added
+        SparkContext._jvm.org.apache.hadoop.hive.conf.HiveConf()
+        spark = SparkSession.builder\
+            .enableHiveSupport()\
+            .getOrCreate()
+    else:
+        spark = SparkSession.builder.getOrCreate()
 except py4j.protocol.Py4JError:
     spark = SparkSession.builder.getOrCreate()
 except TypeError:
     spark = SparkSession.builder.getOrCreate()
 
 
 sc = spark.sparkContext
-snappy = SnappySession(sc)
-sql = snappy.sql
+if catalogImplementation == 'in-memory':
+    snappy = SnappySession(sc)
+    sql = snappy.sql
+else:
+    sql = spark.sql
 atexit.register(lambda: sc.stop())
 
 # for compatibility
-sqlContext = snappy._wrapped
+if catalogImplementation == 'in-memory':
+    sqlContext = snappy._wrapped
+else:
+    sqlContext = spark._wrapped
 sqlCtx = sqlContext
 
 print("""Welcome to
@@ -90,7 +103,8 @@
     platform.python_build()[0],
     platform.python_build()[1]))
 print("SparkSession available as 'spark'.")
-print("SnappySession available as 'snappy'.")
+if catalogImplementation == 'in-memory':
+    print("SnappySession available as 'snappy'.")
 
 # The ./bin/pyspark script stores the old PYTHONSTARTUP value in OLD_PYTHONSTARTUP,
 # which allows us to execute the user's PYTHONSTARTUP file:
```
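For example, in a shell started with the property set explicitly (`--conf` is the standard Spark launch option; the query itself is purely illustrative), both bindings come from the SnappySession:

```python
# Launched as: bin/pyspark --conf spark.sql.catalogImplementation=in-memory
# shell.py then binds 'snappy' to a SnappySession and 'sql' to snappy.sql.
snappy.sql("SELECT 1 AS id").show()
sql("SELECT 1 AS id").show()   # same as snappy.sql
```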
