RFC-0051: EXCLUDE Clause #51
johnedquinn marked this conversation as resolved.
Contributor
Top-level, let's format these lines to be 80 or 120 characters wide.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -88,12 +88,12 @@ e.g. tableFoo.a[1].*[*].b['c'] | |
| * We restrict tuple attribute exclude steps to use string literals and collection index exclude steps to use int literals. Thus `<exclude paths>` are statically known. We can decide whether to add other exclude paths (e.g. expressions) if a use case arises. | ||
| * If sufficient schema is present and the path can be resolved, we assume the root of an `EXCLUDE` path can be omitted. The variable resolution rules follow what is already included in the PartiQL specification. | ||
|
Contributor
I think we might want to have an example of attribute as a variable. |
||
| * We require that every fully-qualified `<exclude path>` contain a root and at least one step. If a use case arises to exclude a binding tuple variable, then this functionality can be added. | ||
|
Contributor
What is the rationale for this limitation? We should put that here. |
||
| - * S-expressions are part of the Ion type system. footnote:[https://amazon-ion.github.io/ion-docs/docs/spec.html#sexp]. | ||
| + * S-expressions are part of the Ion type system.footnote:[https://amazon-ion.github.io/ion-docs/docs/spec.html#sexp] | ||
| PartiQL should support s-expression types and values since PartiQL's type system is a superset over the Ion types. Because the current PartiQL specification does not formally define s-expressions operations, we consider the definition of collection index and wildcard steps on s-expressions as out-of-scope for this RFC. | ||
|
Contributor
Perhaps the statement can be less assertive; I know this is one of those hotly debated topics. The spec says:
So the text can just convey that the semantics of s-expressions as a collection type are not fully defined yet and are hence out of scope.

Agreed. This statement makes more assertions about the PartiQL value system than the spec does. |
||
|
|
||
| === Rewrite Procedure | ||
| ==== Step 1: subsumption of `EXCLUDE` paths | ||
| - We perform the following step to ensure that there are no redundant `EXCLUDE` paths. That is, there is no path such that all of its excluded binding tuple values are excluded by another exclude path. footnote:[This subsumption step is included to make the subsequent rewrite steps easier to reason about. In a query without redundant exclude paths, this step is not necessary.] | ||
| + We perform the following step to ensure that there are no redundant `EXCLUDE` paths. That is, there is no path such that all of its excluded binding tuple values are excluded by another exclude path.footnote:[This subsumption step is included to make the subsequent rewrite steps easier to reason about. In a query without redundant exclude paths, this step is not necessary.] | ||
|
|
||
| For each `<exclude path>` `p=root~p~s~1~...s~x~`, we compare it with all other ``<exclude path>``s. `<exclude path>` `p` is said to be subsumed by another path `q=root~q~t~1~...t~y~` and not included in the rewritten `EXCLUDE` clause if any of the following rules apply: | ||
|
|
||
|
|
@@ -103,7 +103,7 @@ NOTE: The following rules assume `root~p~=root~q~`. | |
| [[anchor-1a]] Rule 1.a:: | ||
| If `y = 0` (i.e. `q` has no steps), `q` subsumes `p`. | ||
| [[anchor-1b]] Rule 1.b:: | ||
| - If `y ≥ x` and `s~1~...s~x~=t~1~...t~x~`, `q` subsumes `p`. Put another way if `p` has at least as many steps as `q` and the steps up to ``q``'s length are equivalent, `q` subsumes `p`. | ||
| + If `x ≥ y` and `s~1~...s~x~=t~1~...t~x~`, `q` subsumes `p`. Put another way if `p` has at least as many steps as `q` and the steps up to ``q``'s length are equivalent, `q` subsumes `p`. | ||
|
|
||
| Otherwise, there must be some step at which `p` and `q` diverge. Let's call this step's index `i`. | ||
|
|
||
|
|
@@ -180,7 +180,7 @@ SELECT VALUE { | |
| 'r': | ||
| CASE | ||
| WHEN ... -- branch(es) dependent on ``s~1~``'s rewrite rule | ||
| - ... -- nested `CASE` expressions for `s~2~...s~n~` | ||
| + ... -- nested `CASE` expressions for `s~2~...s~n-1~` | ||
| CASE | ||
| WHEN ... -- branch(es) dependent on ``s~n~``'s rewrite rule | ||
| ELSE <v~n-1~> | ||
|
|
@@ -340,14 +340,14 @@ For multiple `EXCLUDE` paths, we employ a similar idea as the rewrite for a sing | |
| [source,partiql,subs="+{markup-in-source}"] | ||
| ---- | ||
| -- Let `M` represent the number of `EXCLUDE` paths | ||
| -- Let `R` represent the number of unique `EXCLUDE` path roots | ||
|
|
||
| -- Original query: | ||
| <select clause> | ||
| EXCLUDE p~1~,...,p~M~ | ||
| <from clause> | ||
| <other clauses> | ||
|
|
||
| -- Let `R` represent the number of unique `EXCLUDE` path roots | ||
| -- Rewritten to: | ||
| <select clause> | ||
| FROM ( | ||
|
|
||
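The prefix-based subsumption rules in the diff above (Rule 1.a and Rule 1.b) can be sketched in Python as follows. This is only an illustrative model, not the RFC's or any implementation's code: `ExcludePath`, `subsumes_by_prefix`, and `drop_prefix_subsumed` are hypothetical names, steps are modeled as plain string or int literals, and Rule 1.b is read according to its prose gloss, i.e. `p` agrees with `q` on all `y` steps of `q`. The rules that apply once `p` and `q` diverge at some step `i` are not covered here.

```python
from typing import List, NamedTuple, Tuple, Union

# A step is a tuple attribute (string literal) or a collection index (int literal).
Step = Union[str, int]


class ExcludePath(NamedTuple):
    """Hypothetical model of an exclude path: a root plus statically known steps."""
    root: str
    steps: Tuple[Step, ...]


def subsumes_by_prefix(q: ExcludePath, p: ExcludePath) -> bool:
    """Return True if q subsumes p under Rule 1.a or Rule 1.b only."""
    if q.root != p.root:  # the rules assume root_p = root_q
        return False
    x, y = len(p.steps), len(q.steps)
    if y == 0:  # Rule 1.a: q has no steps
        return True
    # Rule 1.b (prose reading): p has at least as many steps as q and
    # matches q on the first y steps.
    return x >= y and p.steps[:y] == q.steps


def drop_prefix_subsumed(paths: List[ExcludePath]) -> List[ExcludePath]:
    """Keep only the paths that no other distinct path subsumes by prefix."""
    unique = list(dict.fromkeys(paths))  # drop exact duplicates, keep order
    return [
        p for p in unique
        if not any(q != p and subsumes_by_prefix(q, p) for q in unique)
    ]
```

For example, with the hypothetical paths `t.a`, `t.a.b`, and `t.a[0]`, only `t.a` would survive, since it is a prefix of the other two.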
A discussion from the original issue revolves around replacing items rather than just excluding them. A major use-case of PartiQL is using PartiQL as a means of performing transformations on semi-structured, open-schema data. Mentioned in the issue are also customers who have 1000+ columns in their source tables.
From how I've been reading this RFC, we might be able to provide a useful work-around -- at least for top-level values. We can take advantage of the fact that `LET` evaluates before `EXCLUDE`. See below:

For nested attributes, however, I couldn't immediately find an intuitive solution.

With this RFC, do you expect that a future RFC will be necessary to add support for `REPLACE`? If so, in your opinion, does this RFC impede or allow for the addition of `REPLACE`?

That was my assumption, and my intent was to leave `REPLACE` out of scope for this PR. `REPLACE` is included in the "Future possibilities" section of the RFC.

I need to think more about the relationship between `EXCLUDE` and `REPLACE`. I think the syntactic rewrite included in the RFC could be adapted to support `REPLACE`, so I don't believe this RFC impedes an addition of `REPLACE`. After I get back from the Thanksgiving holiday, I'll look more into whether the syntactic rewrite approach could be applied to nested attributes of `REPLACE`.

Playing around a bit with the rewrite rules from the RFC, we could do something similar in the nested `CASE` branches for `REPLACE` of nested attributes. For example, using the query from example-tuple-attribute-as-final-step, if we had added the `REPLACE` clause `REPLACE t.b.field_x AS t.b.field_x * 42`, the rewrite could add a `WHEN` branch like

The full query could look something like:

which the Kotlin implementation will output as:
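
Separately from the query snippets in this thread, the idea that `REPLACE` on a nested attribute could reuse the same walk as `EXCLUDE` can be modeled in a few lines of Python. This is a rough sketch under stated assumptions, not PartiQL semantics and not the Kotlin implementation: `rewrite_attribute` is a hypothetical helper, tuples are modeled as dicts, only tuple attribute steps are handled, an unresolved path is assumed to leave the value unchanged, and the binding for `t` is made-up data.

```python
import copy


def rewrite_attribute(value, steps, replace_with=None):
    """Return a copy of `value` with the attribute at `steps` dropped or replaced.

    `steps` is a non-empty list of tuple attribute names (string literals).
    If `replace_with` is None, the final attribute is excluded. Otherwise
    `replace_with` is a function applied to the old value, standing in for
    a REPLACE expression. Assumption: a path that does not resolve is a no-op.
    """
    result = copy.deepcopy(value)  # never mutate the input binding
    parent = result
    for name in steps[:-1]:
        if not isinstance(parent, dict) or name not in parent:
            return result  # path does not resolve
        parent = parent[name]
    last = steps[-1]
    if isinstance(parent, dict) and last in parent:
        if replace_with is None:
            del parent[last]  # EXCLUDE-like: drop the attribute
        else:
            parent[last] = replace_with(parent[last])  # REPLACE-like
    return result


# Hypothetical binding for `t`; the values are invented for illustration.
t = {"a": 1, "b": {"field_x": 2, "field_y": 3}}

# Models EXCLUDE t.b.field_x
excluded = rewrite_attribute(t, ["b", "field_x"])

# Models REPLACE t.b.field_x AS t.b.field_x * 42
replaced = rewrite_attribute(t, ["b", "field_x"], lambda old: old * 42)
```

Both operations share the traversal and differ only in what happens at the final step, which is the same shape as adding one extra `WHEN` branch to the innermost `CASE` of the rewrite.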