-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-48668][SQL] Support ALTER NAMESPACE ... UNSET PROPERTIES in v2 #47038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| case class UnsetNamespaceProperties( | ||
| namespace: LogicalPlan, | ||
| propertyKeys: Seq[String], | ||
| ifExists: Boolean) extends UnaryCommand { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably use RunnableCommand to simplify the v2 command implementation. We can add a UnaryRunnableCommand, and then do
case class UnsetNamespacePropertiesCommand(...) extends UnaryRunnableCommand {
...
def run(...) {
val ResolvedIdentifier(catalog, ident) = child
...
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let me give it a try, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan I have improved it in the way you suggested.
| } | ||
|
|
||
| trait LeafRunnableCommand extends RunnableCommand with LeafLike[LogicalPlan] | ||
| trait UnaryRunnableCommand extends RunnableCommand with UnaryLike[LogicalPlan] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because RunnableCommand is defined in the module sql/core, it seems that adding UnaryRunnableCommand here is more appropriate.
|
|
||
| def unsetNonExistentPropertiesError( | ||
| properties: Seq[String], table: TableIdentifier): Throwable = { | ||
| properties: Seq[String], nameParts: Seq[String]): Throwable = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reuse UNSET_NONEXISTENT_PROPERTIES
| } | ||
|
|
||
| object ResolveCatalogs { | ||
| def resolveNamespace( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extract method resolveNamespace from class ResolveCatalogs to object ResolveCatalogs and reuse it in UnsetNamespacePropertiesCommand
| "UNSET_NONEXISTENT_PROPERTIES" : { | ||
| "message" : [ | ||
| "Attempted to unset non-existent properties [<properties>] in table <table>." | ||
| "Attempted to unset non-existent properties [<properties>] in relation <relationId>." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to reuse it in the namespace scene
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but namespace is not a relation... maybe just table or namespace <name>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, okay.
| * ALTER (DATABASE|SCHEMA|NAMESPACE) ... UNSET (DBPROPERTIES|PROPERTIES) [IF EXISTS] ...; | ||
| * }}} | ||
| */ | ||
| case class UnsetNamespacePropertiesCommand( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to define UnsetNamespacePropertiesCommand in module sql/core because it inherits UnaryRunnableCommand extends RunnableCommand and cannot be accessed in the module sql/catalyst (sql/core depends on sql/catalyst)
|
|
||
| (ctx.LOCAL != null, finalStorage, Some(DDLUtils.HIVE_PROVIDER)) | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to put the parse process of UnsetNamespacePropertiesCommand here, because UnsetNamespacePropertiesCommand can only be defined within the module sql/core, and AstBuilder cannot access the class UnsetNamespacePropertiesCommand within the module
| override def run(sparkSession: SparkSession): Seq[Row] = { | ||
| val ResolvedIdentifier(catalog, ident) = child | ||
| val ns = ResolveCatalogs.resolveNamespace( | ||
| catalog, ident.namespace.toSeq :+ ident.name, fetchMetadata = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, we should leave this to the analyzer. We should put UnresolvedNamespace inside UnsetNamespacePropertiesCommand, instead of UnresolvedIdentifier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let me fix it.
Also, I found an issue during the migration of Unify v1 and v2 ALTER TABLE .. UNSET TBLPROPERTIES IF EXISTS tests.
The command (AlterTableExec) for ALTER TABLE ... UNSET TBLPROPERTIES ... in v2 seems to have ignored whether the IF EXISTS parameter.
I think we should fix it. WDYH?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, we should leave this to the analyzer. We should put
UnresolvedNamespaceinsideUnsetNamespacePropertiesCommand, instead ofUnresolvedIdentifier
Thanks, updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should fix it. WDYH?
Yea we should fix, but in a separated PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let me do it in another separate PR.
| "<functionName> with AES-<mode> does not support initialization vectors (IVs)." | ||
| ] | ||
| }, | ||
| "ALTER_NAMESPACE_PROPERTY" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename SET_NAMESPACE_PROPERTY to ALTER_NAMESPACE_PROPERTY, We will reuse it in SparkSqlParser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In SQL, SET also means update, I think it's fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because in the following two cases, it is UNSUPPORTED_FEATURE.SET_NAMESPACE_PROPERTY, and I am concerned that it may cause misunderstandings.
test reserved properties
sql(s"CREATE NAMESPACE $ns")
val sqlText = s"ALTER NAMESPACE $ns SET PROPERTIES ('$key'='dummyVal')"
sql(s"CREATE NAMESPACE $ns")
val sqlText = s"ALTER NAMESPACE $ns UNSET PROPERTIES ('$key')"
Okay, let me restore it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are the corresponding errors for table properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The corresponding errors for table properties: UNSUPPORTED_FEATURE.SET_TABLE_PROPERTY, eg:
ALTER TABLE ... SET PROPERTIES ('$reservedKey'='...')"
ALTER TABLE ... UNSET PROPERTIES ('$reservedKey')"
| sql(s"ALTER NAMESPACE $ns SET PROPERTIES ('$key'='dummyVal')") | ||
| } | ||
| assert(exception.getMessage.contains(s"$key is a reserved namespace property")) | ||
| val sqlText = s"ALTER NAMESPACE $ns SET PROPERTIES ('$key'='dummyVal')" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this PR,
Only let's use checkErrorMatchPVals to check for error-conditions
| import org.apache.spark.sql.errors.QueryCompilationErrors | ||
|
|
||
| /** | ||
| * A command that ALTER NAMESPACE UNSET PROPERTIES command. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fix the grammar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
.../src/main/scala/org/apache/spark/sql/execution/command/UnsetNamespacePropertiesCommand.scala
Show resolved
Hide resolved
| is issued if the database is not found in the system. | ||
|
|
||
| ### ALTER PROPERTIES | ||
| ### ALTER SET PROPERTIES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since the entire page is for ALTER DATABASE, I think SET PROPERTIES is a better section title.
|
|
||
| Specifies the name of the database to be altered. | ||
|
|
||
| ### ALTER UNSET PROPERTIES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also add examples for this new syntax, in the ### Examples section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
|
||
| Specifies the name of the database to be altered. | ||
|
|
||
| ### ALTER LOCATION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
| /** | ||
| * Create a [[UnsetNamespacePropertiesCommand]] command. | ||
| * | ||
| * For example: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * For example: | |
| * Expected format: |
| * For example: | ||
| * {{{ | ||
| * ALTER (DATABASE|SCHEMA|NAMESPACE) database | ||
| * UNSET (DBPROPERTIES | PROPERTIES) ('comment', 'key'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * UNSET (DBPROPERTIES | PROPERTIES) ('comment', 'key'); | |
| * UNSET (DBPROPERTIES | PROPERTIES) ('key1', 'key2'); |
|
@cloud-fan |
|
thanks, merging to master! |
| } | ||
| checkError(e, | ||
| errorClass = "SCHEMA_NOT_FOUND", | ||
| parameters = Map("schemaName" -> s"`$ns`")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just noticed one thing: why does the error message omit the catalog name? For table not found, the error message contains the user-given table name in the SQL statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me investigate it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
v1 catalog
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
Line 316 in 4c99c4d
requireDbExists(dbName)
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
Line 253 in 4c99c4d
throw new NoSuchDatabaseException(db) -
v2 catalog
spark/sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryTableCatalog.scala
Lines 206 to 213 in 4c99c4d
override def loadNamespaceMetadata(namespace: Array[String]): util.Map[String, String] = { Option(namespaces.get(namespace.toSeq)) match { case Some(metadata) => metadata.asJava case _ if namespaceExists(namespace) => util.Collections.emptyMap[String, String] case _ => throw new NoSuchNamespaceException(namespace)
From the user's perspective (when he uses multiple catalogs for federated queries and encounters some tables that do not exist), we should fix it, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, do we also need to fix similar scenarios (when table not exist) below together?
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
Lines 257 to 267 in 4c99c4d
| private def requireTableExists(name: TableIdentifier): Unit = { | |
| if (!tableExists(name)) { | |
| throw new NoSuchTableException(db = name.database.get, table = name.table) | |
| } | |
| } | |
| private def requireTableNotExists(name: TableIdentifier): Unit = { | |
| if (tableExists(name)) { | |
| throw new TableAlreadyExistsException(db = name.database.get, table = name.table) | |
| } | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan Initial PR, just attach catalog name before the namespace. If this way is ok, I will continue to add table related as well.
#47276
Thanks!
…pace` to `catalog.namespace` ### What changes were proposed in this pull request? The pr aims to change the value of `SCHEMA_NOT_FOUND` from `namespace` to `catalog.namespace`. ### Why are the changes needed? As discussing #47038 (comment), we should provide more friendly and clear prompt error message. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Update existed UT & Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #47276 from panbingkun/db_with_catalog. Authored-by: panbingkun <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
…pace` to `catalog.namespace` ### What changes were proposed in this pull request? The pr aims to change the value of `SCHEMA_NOT_FOUND` from `namespace` to `catalog.namespace`. ### Why are the changes needed? As discussing apache#47038 (comment), we should provide more friendly and clear prompt error message. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Update existed UT & Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47276 from panbingkun/db_with_catalog. Authored-by: panbingkun <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
The pr aims to support
ALTER NAMESPACE ... UNSET PROPERTIESinv2.Why are the changes needed?
tableandview, we canadd,update, ordeletetable'spropertiesusing the following SQL:spark/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
Lines 148 to 151 in 3469ec6
SETsyntax fornamespace, notUNSETsyntaxspark/sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
Lines 106 to 107 in 3469ec6
SupportsNamespacesinterface supports deleting properties. I propose adding SQL syntax to facilitate users to use SQL instead of relying solely on APIs to manipulate the properties ofnamespacespark/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsNamespaces.java
Lines 127 to 137 in 3469ec6
spark/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/NamespaceChange.java
Lines 59 to 61 in 3469ec6
Does this PR introduce any user-facing change?
Yes, end users can delete the properties of
namespacethrough SQL.How was this patch tested?
Add new UT.
Was this patch authored or co-authored using generative AI tooling?
No.