
Conversation

@BruceXu1991 BruceXu1991 commented Dec 20, 2017

What changes were proposed in this pull request?

Fix the table owner being null when creating a new table through Spark SQL.

How was this patch tested?

Manual test:
1. First, create a table.
2. Then, select the table properties from the MySQL database that backs the Hive metastore.

Please review http://spark.apache.org/contributing.html before opening a pull request.

@BruceXu1991
Author

@cloud-fan @gatorsmile could you review this issue?

/** Returns the configuration for the current session. */
def conf: HiveConf = state.getConf

private val userName = state.getAuthenticator.getUserName
Contributor

Why does this return null?

@cloud-fan
Contributor

ok to test

@cloud-fan
Contributor

can you add a test?

  def conf: HiveConf = state.getConf

- private val userName = state.getAuthenticator.getUserName
+ private val userName = conf.getUser
Member

@BruceXu1991, I want to reproduce your problem here. Could you describe your environment more specifically? For me, 2.2.1 works like the following.

scala> spark.version
res0: String = 2.2.1

scala> sql("CREATE TABLE spark_22846(a INT)")

scala> sql("DESCRIBE FORMATTED spark_22846").show
+--------------------+--------------------+-------+
|            col_name|           data_type|comment|
+--------------------+--------------------+-------+
|                   a|                 int|   null|
|                    |                    |       |
|# Detailed Table ...|                    |       |
|            Database|             default|       |
|               Table|         spark_22846|       |
|               Owner|            dongjoon|       |

Member

So, does this happen when MySQL is used as the Hive metastore?

@BruceXu1991 BruceXu1991 (Author) Dec 22, 2017

Yes, I hit this problem using MySQL as the Hive metastore.
What's more, when I execute DESCRIBE FORMATTED spark_22846, a NullPointerException occurs:

DESCRIBE FORMATTED offline.spark_22846;
Error: java.lang.NullPointerException (state=,code=0)

Here is the detailed stack trace:

17/12/22 18:18:10 ERROR SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
java.lang.NullPointerException
        at scala.collection.immutable.StringOps$.length$extension(StringOps.scala:47)
        at scala.collection.immutable.StringOps.length(StringOps.scala:47)
        at scala.collection.IndexedSeqOptimized$class.isEmpty(IndexedSeqOptimized.scala:27)
        at scala.collection.immutable.StringOps.isEmpty(StringOps.scala:29)
        at scala.collection.TraversableOnce$class.nonEmpty(TraversableOnce.scala:111)
        at scala.collection.immutable.StringOps.nonEmpty(StringOps.scala:29)
        at org.apache.spark.sql.catalyst.catalog.CatalogTable.toLinkedHashMap(interface.scala:301)
        at org.apache.spark.sql.execution.command.DescribeTableCommand.describeFormattedTableInfo(tables.scala:559)
        at org.apache.spark.sql.execution.command.DescribeTableCommand.run(tables.scala:537)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
        at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
        at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
        at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:767)
        at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691)

The cause of the NPE is that owner is null. The relevant source code is below:

def toLinkedHashMap: mutable.LinkedHashMap[String, String] = {
  ...
  // line 301:
  if (owner.nonEmpty) map.put("Owner", owner)
  ...
}
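
To make the failure mode concrete, here is a minimal sketch (not code from the PR) that reproduces the same StringOps frames seen in the stack trace above:

object NpeSketch extends App {
  val owner: String = null
  // The implicit augmentString conversion wraps the null in StringOps;
  // nonEmpty calls length, which dereferences the underlying null String
  // and throws a NullPointerException, matching the top frames above.
  if (owner.nonEmpty) println(s"Owner: $owner")
}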

Contributor

Do you know how Hive gets the username internally?

@BruceXu1991 BruceXu1991 (Author) Dec 23, 2017

thanks for your response, Fan.

The current implementation in Spark 2.2.1 is:

private val userName = state.getAuthenticator.getUserName

When the implementation of state.getAuthenticator is HadoopDefaultAuthenticator, which is the default in the Hive conf, the username is obtained correctly.

However, when the implementation of state.getAuthenticator is SessionStateUserAuthenticator, which is what is used in my case, the username will be null.

The simplified code below explains the reason:

  1. HadoopDefaultAuthenticator
public class HadoopDefaultAuthenticator implements HiveAuthenticationProvider {
  // field declarations added here for completeness
  private String userName;
  private List<String> groupNames;
  private Configuration conf;

  @Override
  public String getUserName() {
    return userName;
  }

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
    UserGroupInformation ugi = null;
    try {
      ugi = Utils.getUGI();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
    this.userName = ugi.getShortUserName();
    if (ugi.getGroupNames() != null) {
      this.groupNames = Arrays.asList(ugi.getGroupNames());
    }
  }
}

public class Utils {
  public static UserGroupInformation getUGI() throws LoginException, IOException {
    String doAs = System.getenv("HADOOP_USER_NAME");
    if(doAs != null && doAs.length() > 0) {
      return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser());
    }
    return UserGroupInformation.getCurrentUser();
  }
}

This shows that HadoopDefaultAuthenticator gets the username through Utils.getUGI(), so the username is HADOOP_USER_NAME or the login user.
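
As a minimal illustration of the same lookup (a sketch assuming a Hadoop client on the classpath; UgiSketch is a hypothetical name, not code from the PR or from Hive):

import org.apache.hadoop.security.UserGroupInformation

object UgiSketch extends App {
  // Mirrors Utils.getUGI(): HADOOP_USER_NAME, if set, wins as a proxy user;
  // otherwise the current (login) user is used.
  val ugi = sys.env.get("HADOOP_USER_NAME") match {
    case Some(doAs) if doAs.nonEmpty =>
      UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser)
    case _ =>
      UserGroupInformation.getCurrentUser
  }
  println(ugi.getShortUserName) // resolves to a real user, never null here
}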

  2. SessionStateUserAuthenticator
public class SessionStateUserAuthenticator implements HiveAuthenticationProvider {
  // field and constructor added here for completeness
  private final SessionState sessionState;

  public SessionStateUserAuthenticator(SessionState sessionState) {
    this.sessionState = sessionState;
  }

  @Override
  public void setConf(Configuration arg0) {
  }

  @Override
  public String getUserName() {
    return sessionState.getUserName();
  }
}

This shows that SessionStateUserAuthenticator gets the username through sessionState.getUserName(), which is null because no username is supplied when the SessionState is instantiated in HiveClientImpl.
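
Roughly, the construction looks like this (a simplified sketch paraphrased from HiveClientImpl, not a verbatim quote):

import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.ql.session.SessionState

// Simplified: the SessionState is built from the HiveConf alone, so
// SessionState.userName stays null, and SessionStateUserAuthenticator
// then returns null from getUserName().
def newStateSketch(hiveConf: HiveConf): SessionState = {
  val state = new SessionState(hiveConf) // the two-arg constructor
                                         // SessionState(conf, userName) is not used
  SessionState.start(state)
  state
}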

So getting the username through conf.getUser may be more compatible with the various use cases.

The related code in HiveConf:

public class HiveConf extends Configuration {

  public String getUser() throws IOException {
    try {
      UserGroupInformation ugi = Utils.getUGI();
      return ugi.getUserName();
    } catch (LoginException le) {
      throw new IOException(le);
    }
  }
}
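
Putting it together, the patched line in context would look roughly like this (an illustrative Scala sketch; HiveClientSketch and tableOwner are hypothetical names, not Spark's actual class):

import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.ql.session.SessionState

// Illustrative wrapper, not the real HiveClientImpl: the owner is resolved
// from the HiveConf (via UGI) instead of from the configured
// HiveAuthenticationProvider, which may return null.
class HiveClientSketch(state: SessionState) {
  def conf: HiveConf = state.getConf
  private val userName: String = conf.getUser // was: state.getAuthenticator.getUserName
  def tableOwner: String = userName           // now non-null for both authenticators
}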

@SparkQA SparkQA commented Dec 21, 2017

Test build #85270 has finished for PR 20034 at commit e8c3035.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master!
