@@ -186,7 +186,7 @@ private[hive] class HiveClientImpl(
/** Returns the configuration for the current session. */
def conf: HiveConf = state.getConf

-  private val userName = state.getAuthenticator.getUserName
Contributor:

Why does this return null?

+  private val userName = conf.getUser
Member:

@BruceXu1991, I want to reproduce your problem. Could you describe your environment more specifically? For me, Spark 2.2.1 works as follows:

scala> spark.version
res0: String = 2.2.1

scala> sql("CREATE TABLE spark_22846(a INT)")

scala> sql("DESCRIBE FORMATTED spark_22846").show
+--------------------+--------------------+-------+
|            col_name|           data_type|comment|
+--------------------+--------------------+-------+
|                   a|                 int|   null|
|                    |                    |       |
|# Detailed Table ...|                    |       |
|            Database|             default|       |
|               Table|         spark_22846|       |
|               Owner|            dongjoon|       |

Member:

So, does this happen in case of MySQL as Hive metastore?

Author (@BruceXu1991, Dec 22, 2017):

Yes, I hit this problem using MySQL as the Hive metastore. What's more, when I execute DESCRIBE FORMATTED spark_22846, a NullPointerException occurs:

DESCRIBE FORMATTED offline.spark_22846;
Error: java.lang.NullPointerException (state=,code=0)

And the detailed stack trace:

17/12/22 18:18:10 ERROR SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
java.lang.NullPointerException
        at scala.collection.immutable.StringOps$.length$extension(StringOps.scala:47)
        at scala.collection.immutable.StringOps.length(StringOps.scala:47)
        at scala.collection.IndexedSeqOptimized$class.isEmpty(IndexedSeqOptimized.scala:27)
        at scala.collection.immutable.StringOps.isEmpty(StringOps.scala:29)
        at scala.collection.TraversableOnce$class.nonEmpty(TraversableOnce.scala:111)
        at scala.collection.immutable.StringOps.nonEmpty(StringOps.scala:29)
        at org.apache.spark.sql.catalyst.catalog.CatalogTable.toLinkedHashMap(interface.scala:301)
        at org.apache.spark.sql.execution.command.DescribeTableCommand.describeFormattedTableInfo(tables.scala:559)
        at org.apache.spark.sql.execution.command.DescribeTableCommand.run(tables.scala:537)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
        at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
        at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
        at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:767)
        at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691)

The NPE happens because owner is null. The relevant source code is below:

def toLinkedHashMap: mutable.LinkedHashMap[String, String] = {
  ...
  if (owner.nonEmpty) map.put("Owner", owner)  // line 301
  ...
}
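The same failure mode is easy to reproduce outside Spark. A minimal Java sketch, analogous to the unguarded `owner.nonEmpty` call (the `hasOwner` helper here is illustrative, not Spark code): invoking any instance method on a null String throws NullPointerException, while a null-check guard does not.

```java
public class OwnerNpeDemo {
    // Null-safe replacement for the unguarded owner.nonEmpty-style check.
    static boolean hasOwner(String owner) {
        return owner != null && !owner.isEmpty();
    }

    public static void main(String[] args) {
        String owner = null;
        boolean threw = false;
        try {
            owner.isEmpty();  // mirrors StringOps.length on a null String
        } catch (NullPointerException e) {
            threw = true;
        }
        System.out.println("threw NPE: " + threw);          // threw NPE: true
        System.out.println("hasOwner: " + hasOwner(owner)); // hasOwner: false
    }
}
```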

Contributor:

Do you know how Hive gets the username internally?

Author (@BruceXu1991, Dec 23, 2017):

Thanks for your response, Fan.

The current implementation in Spark 2.2.1 is:

private val userName = state.getAuthenticator.getUserName

When state.getAuthenticator is a HadoopDefaultAuthenticator, the default in the Hive conf, the username is resolved. However, when it is a SessionStateUserAuthenticator, as in my case, the username is null.

The simplified code below explains why:

  1. HadoopDefaultAuthenticator

public class HadoopDefaultAuthenticator implements HiveAuthenticationProvider {
  private Configuration conf;
  private String userName;
  private List<String> groupNames;

  @Override
  public String getUserName() {
    return userName;
  }

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
    UserGroupInformation ugi = null;
    try {
      ugi = Utils.getUGI();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
    this.userName = ugi.getShortUserName();
    if (ugi.getGroupNames() != null) {
      this.groupNames = Arrays.asList(ugi.getGroupNames());
    }
  }
}

public class Utils {
  public static UserGroupInformation getUGI() throws LoginException, IOException {
    String doAs = System.getenv("HADOOP_USER_NAME");
    if(doAs != null && doAs.length() > 0) {
      return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser());
    }
    return UserGroupInformation.getCurrentUser();
  }
}

This shows that HadoopDefaultAuthenticator resolves the username through Utils.getUGI(), so the username is HADOOP_USER_NAME or the login user.
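That precedence can be sketched without any Hadoop dependency (resolveUser is a hypothetical helper mirroring Utils.getUGI, not real Hive API): a non-empty HADOOP_USER_NAME wins, otherwise the login user is used.

```java
public class UserResolutionDemo {
    // Hypothetical helper mirroring Utils.getUGI's precedence:
    // a non-empty HADOOP_USER_NAME wins; otherwise fall back to the login user.
    static String resolveUser(String hadoopUserName, String loginUser) {
        if (hadoopUserName != null && hadoopUserName.length() > 0) {
            return hadoopUserName;
        }
        return loginUser;
    }

    public static void main(String[] args) {
        System.out.println(resolveUser("bruce", "spark")); // bruce
        System.out.println(resolveUser(null, "spark"));    // spark
        System.out.println(resolveUser("", "spark"));      // spark
    }
}
```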

  2. SessionStateUserAuthenticator

public class SessionStateUserAuthenticator implements HiveAuthenticationProvider {
  private final SessionState sessionState;

  @Override
  public void setConf(Configuration arg0) {
  }

  @Override
  public String getUserName() {
    return sessionState.getUserName();
  }
}

This shows that SessionStateUserAuthenticator gets the username from sessionState.getUserName(), which is null because no username is passed when the SessionState is instantiated in HiveClientImpl.
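A stripped-down simulation of that path (stand-ins for the Hive classes, not the real implementations) shows why the session-state route yields null when no username is ever set:

```java
public class NullUserNameDemo {
    // Stand-in for Hive's SessionState: the username field is
    // simply never populated in this scenario.
    static class SessionState {
        private String userName;
        String getUserName() { return userName; }
    }

    // Stand-in for SessionStateUserAuthenticator: it just delegates
    // to the session state, so it inherits the null.
    static class SessionStateUserAuthenticator {
        private final SessionState state;
        SessionStateUserAuthenticator(SessionState state) { this.state = state; }
        String getUserName() { return state.getUserName(); }
    }

    public static void main(String[] args) {
        SessionState state = new SessionState();  // no username provided
        SessionStateUserAuthenticator auth = new SessionStateUserAuthenticator(state);
        System.out.println("userName is null: " + (auth.getUserName() == null));
    }
}
```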

So getting the username through conf.getUser may be more compatible across these use cases.

The related code in HiveConf:

public class HiveConf extends Configuration {

    public String getUser() throws IOException {
        try {
            UserGroupInformation ugi = Utils.getUGI();
            return ugi.getUserName();
        } catch (LoginException e) {
            throw new IOException(e);
        }
    }

}


override def getConf(key: String, defaultValue: String): String = {
conf.get(key, defaultValue)