[SPARK-20360][PYTHON] Cleaner SparkContext HTML
rgbkrk committed Apr 18, 2017
commit a2acd97a608c688c3f55606ac1afa8c1c89a2886
20 changes: 13 additions & 7 deletions python/pyspark/context.py
@@ -249,15 +249,21 @@ def __repr__(self):
     def _repr_html_(self):
         return """
         <div>
-            <p><b>Spark Context</b></p>
-            <ul>
-              <li>Spark <code>v{spark_version}</code></li>
-              <li><a href="{spark_ui_url}">Spark UI</a></li>
-            </ul>
+            <p><b>SparkContext</b></p>
+
+            <p><a href="{sc.uiWebUrl}">Spark UI</a></p>
+
+            <dl>
+              <dt>Version</dt>
+                <dd><code>v{sc.version}</code></dd>
+              <dt>Master</dt>
+                <dd><code>{sc.master}</code></dd>
+              <dt>AppName</dt>
+                <dd><code>{sc.appName}</code></dd>
+            </dl>
         </div>
         """.format(
-            spark_version=self.version,
-            spark_ui_url=self.uiWebUrl,
+            sc=self
         )
 
     def _initialize_context(self, jconf):
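The new template in context.py relies on `str.format` attribute lookup (`{sc.version}`, `{sc.master}`, and so on), so a single `sc=self` argument replaces one keyword per field. A minimal sketch of that mechanism, runnable without Spark; `FakeContext` and its values are hypothetical stand-ins, not part of the PR:

```python
class FakeContext:
    """Hypothetical stand-in exposing the attributes the template reads."""

    def __init__(self, version, master, appName, uiWebUrl):
        self.version = version
        self.master = master
        self.appName = appName
        self.uiWebUrl = uiWebUrl


# Same shape as the template in the diff; "{sc.attr}" is resolved by
# str.format via attribute access on the object passed as sc.
TEMPLATE = """
<div>
    <p><b>SparkContext</b></p>
    <p><a href="{sc.uiWebUrl}">Spark UI</a></p>
    <dl>
      <dt>Version</dt>
        <dd><code>v{sc.version}</code></dd>
      <dt>Master</dt>
        <dd><code>{sc.master}</code></dd>
      <dt>AppName</dt>
        <dd><code>{sc.appName}</code></dd>
    </dl>
</div>
"""


def render(sc):
    # One argument covers every field referenced in the template.
    return TEMPLATE.format(sc=sc)


# Illustrative values only.
html = render(FakeContext("2.2.0", "local[*]", "demo", "http://localhost:4040"))
print(html)
```

Adding another field this way only touches the template, at the cost of coupling the template text to the attribute names on the context object.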
3 changes: 3 additions & 0 deletions python/pyspark/sql/session.py
@@ -221,6 +221,9 @@ def __init__(self, sparkContext, jsparkSession=None):
                 or SparkSession._instantiatedSession._sc._jsc is None:
             SparkSession._instantiatedSession = self
 
+    def _repr_html_(self):
+        return self.sparkContext._repr_html_()
Contributor

@holdenk holdenk Apr 17, 2017

As @felixcheung suggested, I think it might make sense to include some extra Spark SQL specific things.

I think the catalog implementation type would be good to show here (it can be very useful for understanding why Hive UDFs are not working); you can get it from the session config with the spark.sql.catalogImplementation key. The current database would also help; you can get it from the session's catalog object. The catalog implementation is especially useful since we currently do a "fallback" from Hive-supported to non-Hive-supported, and the user might not have noticed if they launched in Jupyter, where the log messages are a bit more obscure (something I've been meaning to work on in #17298, but I've gotten a bit distracted).

It might also make sense to return a different URL link (e.g. to the SQL page rather than the default page which takes people to the Jobs section) but this is minor and likely less useful than the other things.
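A rough sketch of the two lookups described above: reading `spark.sql.catalogImplementation` from the session config and the current database from the catalog object. The `Fake*` classes are hypothetical stubs so the snippet runs without Spark; they mirror `conf.get(key)` and `catalog.currentDatabase()` as exposed on a PySpark session. The helper names and the HTML layout are assumptions, not what the PR ended up with.

```python
class FakeConf:
    """Hypothetical stub for the session's runtime config."""

    def __init__(self, values):
        self._values = values

    def get(self, key, default=None):
        return self._values.get(key, default)


class FakeCatalog:
    """Hypothetical stub for the session's catalog object."""

    def currentDatabase(self):
        return "default"


class FakeSession:
    """Hypothetical stub exposing .conf and .catalog like a session."""

    def __init__(self):
        self.conf = FakeConf({"spark.sql.catalogImplementation": "in-memory"})
        self.catalog = FakeCatalog()


def sql_details(session):
    # Collect the SQL-specific values suggested in the review comment.
    return {
        "Catalog implementation": session.conf.get(
            "spark.sql.catalogImplementation"
        ),
        "Current database": session.catalog.currentDatabase(),
    }


def render_sql_details(session):
    # Emit <dt>/<dd> pairs in the same style as the SparkContext repr.
    rows = "".join(
        "<dt>{0}</dt><dd><code>{1}</code></dd>".format(label, value)
        for label, value in sql_details(session).items()
    )
    return "<dl>{0}</dl>".format(rows)


details = sql_details(FakeSession())
fragment = render_sql_details(FakeSession())
print(fragment)
```

With a real session, `FakeSession()` would be replaced by the session itself; whether the static catalog setting is readable through the runtime config on a given Spark version should be verified.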

Contributor Author

Ok, I've set the catalogImplementation here now. This seems like something we could put in the SparkContext HTML repr as well.

Contributor Author

> It might also make sense to return a different URL link (e.g. to the SQL page rather than the default page which takes people to the Jobs section) but this is minor and likely less useful than the other things.

I don't know the internals well enough to tell which URL(s) you'd want here. Which properties or calls should I use for the SQL page?
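One possible approach, sketched under an assumption the thread does not confirm: that the SQL tab is served under a `SQL` path segment beneath the address returned by `uiWebUrl`. The helper name `sql_tab_url` is hypothetical, and the path should be checked against a running UI before relying on it.

```python
def sql_tab_url(ui_web_url, tab="SQL"):
    """Build a link to a UI tab from the base UI address.

    The "SQL" path segment is an assumption, not confirmed in this thread.
    Returns None when no UI address is available (for example, UI disabled).
    """
    if not ui_web_url:
        return None
    # Normalise the trailing slash so the result never contains "//".
    return "{0}/{1}/".format(ui_web_url.rstrip("/"), tab)


print(sql_tab_url("http://localhost:4040"))
print(sql_tab_url(None))
```

Returning `None` for a missing address lets the HTML repr skip the link instead of rendering a broken one.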


     @since(2.0)
     def newSession(self):
         """