Update existential subquery CIP #493
Changes from 49 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,102 +4,188 @@ | |
| :toc-placement: macro | ||
| :source-highlighter: codemirror | ||
|
|
||
| *Authors:* Andrés Taylor <[email protected]> | ||
| *Authors:* Andrés Taylor <[email protected]>, Hannes Voigt <[email protected]> | ||
|
|
||
|
|
||
| [abstract] | ||
| .Abstract | ||
| -- | ||
| This CIP introduces `EXISTS`, a function and a keyword for checking the existence of properties, simple patterns and full subqueries. | ||
| This CIP introduces existential subqueries and cleans up previously existing language constructs for existence testing. | ||
| -- | ||
|
|
||
| toc::[] | ||
|
|
||
| == Background | ||
|
|
||
| In a way, Cypher already has existential subqueries - pattern predicates checking for the existence of subgraphs. | ||
| This form is helpful, but not exhaustive in the types of subqueries that users want to be able to express. | ||
| openCypher has two language constructs that allow testing of existence: | ||
|
|
||
| * property existence predicate: `exists( n.prop )`, which is grammatically a https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=FunctionInvocation[<FunctionInvocation>] | ||
| * pattern predicates for the existence of subgraphs, which is grammatically a https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>] | ||
|
|
||
| Both are allowed as https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=Atom[<Atom>]s of expressions. | ||
|
|
||
| Note that a <RelationshipsPattern> appearing in an expression is evaluated to a list of paths containing the matches of the pattern. | ||
|
||
| In a predicate context this list gets coerced to a boolean (empty list -> [underline]#*_False_*#, non-empty list -> [underline]#*_True_*#), letting this <RelationshipsPattern> behave like a predicate although it actually is a list expression. | ||
| As a result, https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>s] appear to behave differently depending on the context in which they appear. | ||
| Further, https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=PatternComprehension[<PatternComprehension>s] offer a more powerful and syntactically better separated means to get lists of matches for a pattern within expressions. | ||
|
|
||
| The understanding of subqueries in Cypher has evolved and Cypher gained a proper https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=NullOperatorExpression[<NullOperatorExpression>] (`IS [ NOT ] NULL`), cf. https://github.com/opencypher/openCypher/blob/master/cip/1.accepted/CIP2018-10-29-EXISTS-and-IS-NOT-NULL.adoc[CIP2018-10-29 EXISTS and IS NOT NULL]. | ||
| Both developments and the issues around <RelationshipsPattern>s suggest the need for a refinement of the language constructs that allow testing of existence. | ||
|
||
|
|
||
| == Proposal | ||
|
|
||
| To make this feature more powerful, this CIP suggests the addition of a new function `exists()`, and a keyword `EXISTS {}`, allowing for two different predicates: | ||
| This CIP proposes: | ||
|
|
||
| * Property existence checking | ||
| * Subquery existence checking | ||
| * Restricting https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>] to predicate contexts and boolean evaluation, when used in expression contexts | ||
| * Adding two forms of existential subqueries denoted with `EXISTS { ... }`: | ||
| ** Full existential subquery based on a https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RegularQuery[<RegularQuery>] | ||
| ** Simple existential subquery based on a single https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=Pattern[<Pattern>] and an optional https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=Where[<Where>] clause | ||
| * Removing `exists( n.prop )` in favor of the https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=NullOperatorExpression[<NullOperatorExpression>], which is a restatement of what has been accepted with https://github.com/opencypher/openCypher/blob/master/cip/1.accepted/CIP2018-10-29-EXISTS-and-IS-NOT-NULL.adoc[CIP2018-10-29 EXISTS and IS NOT NULL] | ||
|
|
||
| === Syntax | ||
|
|
||
| ---- | ||
| expression = <current definition of expression> | ||
| | property exists | ||
| | subquery exists | ||
| | simple subquery exists | ||
| ; | ||
| ==== Grammar | ||
|
|
||
| property exists = "exists", "(", expression ")" | ||
| ; | ||
| [source,bnf] | ||
| ---- | ||
| <Atom> ::= | ||
| ... | ||
| | <RelationshipsPattern> | ||
| | <ExistentialSubquery> | ||
|
|
||
| subquery exists = "EXISTS", "{", read only clause, { read only clause }, "}" ; | ||
| <ExistentialSubquery> ::= | ||
| <SimpleExistentialSubquery> | ||
| | <FullExistentialSubquery> | ||
|
|
||
| simple subquery exists = "EXISTS", "{", simple match, "}" ; | ||
| <SimpleExistentialSubquery> ::= | ||
| "EXISTS", "{", <SimpleMatch>, "}" | ||
|
|
||
| simple match = pattern, { ",", pattern }, [ "WHERE", predicate ] ; | ||
| <FullExistentialSubquery> ::= | ||
| "EXISTS", "{", <RegularQuery>, "}" | ||
|
|
||
| read only clause = match | ||
| | unwind | ||
| | with | ||
| ; | ||
| <SimpleMatch> ::= | ||
| <Pattern>, [ <Where> ] | ||
| ---- | ||
|
|
||
| Note that the openCypher grammar does not list <SimpleExistentialSubquery>, <SimpleMatch>, and <FullExistentialSubquery> as separate productions but inlines them into https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=ExistentialSubquery[<ExistentialSubquery>]. | ||
|
|
||
| ==== Syntax Rules | ||
|
|
||
| * A https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>] shall only be contained in sites whose expected type is exactly boolean. Specifically, a https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>] shall only be simply contained in | ||
|
||
| ** <Where>, used in | ||
| *** https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=Match[<Match>] | ||
| *** https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=With[<With>] | ||
| *** https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=YieldItems[<YieldItems>] | ||
| *** https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=FilterExpression[<FilterExpression>], used in | ||
| **** `ALL`, `ANY`, `NONE`, `SINGLE` | ||
| **** https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=ListComprehension[<ListComprehension>] | ||
| *** https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=PatternComprehension[<PatternComprehension>] | ||
| ** https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=CaseAlternative[<CaseAlternative>] | ||
| * The https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RegularQuery[<RegularQuery>] contained in a <FullExistentialSubquery> shall not contain any https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=UpdatingClause[<UpdatingClause>] nor procedure or function calls that are not known to be free of side effects. | ||
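A minimal sketch of how the first syntax rule might play out, assuming the `Actor` and `Award` labels and the `WON` relationship type used in this CIP's examples. The rejected query is left commented out. That `size()` counts as a non-boolean site is an interpretation of the rule, not something the CIP spells out.

```cypher
// Allowed under the proposed rule: the pattern is directly contained in a WHERE.
MATCH (actor:Actor)
WHERE (actor)-[:WON]->(:Award)
RETURN actor;

// Rejected under the proposed rule (as read here): size() expects a list,
// not a boolean, so the pattern would be used as a list of paths.
// MATCH (actor:Actor)
// RETURN actor, size((actor)-[:WON]->(:Award)) AS awards;

// List-valued alternative: a pattern comprehension, which the rule does not restrict.
MATCH (actor:Actor)
RETURN actor, size([(actor)-[:WON]->(award:Award) | award]) AS awards;
```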
|
|
||
| === Semantics | ||
|
|
||
| All forms of `EXISTS {}` accomplish the same task: checking whether a particular pattern exists in the graph. | ||
| They are expressions, and as such must be side-effect free; that is, the subqueries in `EXISTS {}` must not be updating queries. | ||
| All forms of `EXISTS {}` are scalar expressions - they return a single boolean value, and this value is never `null`. | ||
| https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>], <SimpleExistentialSubquery>, and <FullExistentialSubquery> accomplish the same task: checking whether the result of a subquery is not empty. | ||
|
|
||
| All three forms, | ||
|
|
||
| * Are boolean expressions, i.e. return a single boolean value | ||
| * Never return `null` | ||
| * Are side-effect free, i.e. <FullExistentialSubquery> shall not contain any <UpdatingClause> or other sources of side effects | ||
| * Can contain variables and parameters from the outer queries | ||
|
||
|
|
||
| ==== Property exists | ||
| https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>] shall not introduce new variables. | ||
|
|
||
| In the property checking form, any expression is accepted as the input. | ||
| If the input expression evaluates to `null`, `exists()` will evaluate to `false`. For any other value, it will evaluate to `true`. | ||
| This form of `exists()` is equivalent to checking for `null` using `IS NOT NULL`: `exists(expr)` is equivalent to `expr IS NOT NULL`. | ||
| Both forms of https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=ExistentialSubquery[<ExistentialSubquery>] (<SimpleExistentialSubquery> and <FullExistentialSubquery>) are allowed to introduce new variables. | ||
| Each such variable shall have a name different from the names of all variables available from the outer queries. | ||
| Any variables introduced in an <ExistentialSubquery> are not available outside the subquery context. | ||
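The scoping rules above can be illustrated with a short sketch. The `year` property is a hypothetical name chosen for illustration; the labels and relationship type are taken from this CIP's examples.

```cypher
MATCH (actor:Actor)
WHERE EXISTS {
  // `actor` comes from the outer query and may be referenced here.
  // `award` is introduced inside the subquery; its name must differ
  // from every variable available from the outer query.
  (actor)-[:WON]->(award:Award)
  WHERE award.year > 2000
}
// `award` is out of scope here; only `actor` is available.
RETURN actor
```

Returning `award` in the final `RETURN` would be an error under these rules, because the variable exists only inside the braces.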
|
|
||
| This form is there to make property existence checking idiomatic in Cypher - we do not want the user thinking of missing properties as checking for `null` values. | ||
| https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>] and <SimpleExistentialSubquery> are syntactically simpler and semantically less powerful forms of <FullExistentialSubquery>. | ||
| The semantics of https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>] and <SimpleExistentialSubquery> can be defined as syntax transformations to <FullExistentialSubquery>. | ||
|
||
|
|
||
| ==== Subquery exists | ||
| ==== <RelationshipsPattern> | ||
|
|
||
| When checking for the existence of subqueries, variables and parameters from the outer query are available in the subquery. | ||
| Variables introduced in the subquery are not available on the outside. | ||
| A https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RelationshipsPattern[<RelationshipsPattern>] _RP_ is effectively replaced by the expression | ||
|
|
||
| For each matching subgraph evaluated with `EXISTS {}`, the result value must be `true` if the subquery finds at least one matching row. | ||
| If no matches are found, `false` should be returned. | ||
| `EXISTS { MATCH _RP_ RETURN 1 }` | ||
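As a sketch of this rewrite, here is the pattern predicate from Example 2A next to the form it is effectively replaced by. The second query is an illustration of the transformation, not text taken from the CIP.

```cypher
// Pattern predicate as written by the user.
MATCH (actor:Actor)
WHERE (actor)-[:WON]->(:Award)
RETURN actor;

// The same query with the pattern replaced as described above.
MATCH (actor:Actor)
WHERE EXISTS { MATCH (actor)-[:WON]->(:Award) RETURN 1 }
RETURN actor;
```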
|
||
|
|
||
| The subquery used in `EXISTS {}` has to follow the following rules: | ||
| ==== <SimpleExistentialSubquery> | ||
|
|
||
| * It has to be a read-only query. Updating the graph as part of a predicate is not allowed. | ||
| * It cannot end in a `RETURN` clause, nor may it contain `UNION`. | ||
| * When `WITH` ends the subquery, it is able to change the cardinality of the subquery. | ||
| In other words, since the `WITH` clause may include `WHERE`, `SKIP`, and/or `LIMIT`, the clause can with these sub-clauses turn a query that produces rows into one that does not. | ||
| * It may use nested `EXISTS {}` predicates. | ||
| A <SimpleExistentialSubquery> containing a <SimpleMatch> _SM_ is effectively replaced by the expression | ||
|
|
||
| ==== Simple subquery exists | ||
| `EXISTS { MATCH _SM_ RETURN 1 }` | ||
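A hedged illustration of this transformation, assuming a hypothetical `released` property on `Movie` nodes. The simple form omits `MATCH` and `RETURN`; the full form spells them out.

```cypher
// Simple form: a pattern with an optional WHERE.
MATCH (actor:Actor)
WHERE EXISTS {
  (actor)-[:ACTED_IN]->(movie:Movie)
  WHERE movie.released > 2010
}
RETURN actor;

// Equivalent full form after the rewrite described above.
MATCH (actor:Actor)
WHERE EXISTS {
  MATCH (actor)-[:ACTED_IN]->(movie:Movie)
  WHERE movie.released > 2010
  RETURN 1
}
RETURN actor;
```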
|
|
||
| When the subquery can be described with a single `MATCH-WHERE` clause, the `MATCH` keyword can be omitted, as in example 2A and B. | ||
| The difference between this form and a simple pattern predicate, which is already available, is that this form allows for introducing new variables inside the `EXISTS {}` scope. | ||
| ==== <FullExistentialSubquery> | ||
|
|
||
| A <FullExistentialSubquery> _FES_ is effectively evaluated as follows: | ||
|
|
||
| * Let _OUTER_VARIABLES_ be the current working record for which the expression containing _FES_ is evaluated. | ||
| * Let _NESTED_QUERY_ be the https://raw.githack.com/openCypher/openCypher/master/tools/grammar-production-links/grammarLink.html?p=RegularQuery[<RegularQuery>] immediately contained in _FES_. | ||
| * Let _RESULT_TABLE_ be the table resulting from evaluating _NESTED_QUERY_ on a driving table comprising _OUTER_VARIABLES_. | ||
| * Case: | ||
| ** If _RESULT_TABLE_ is an empty table (cardinality is zero), then the result of _FES_ is [underline]#*_False_*#. | ||
| ** Otherwise, the result of _FES_ is [underline]#*_True_*#. | ||
|
||
|
|
||
| Note that all fields in _RESULT_TABLE_ are ignored and only the number of rows in _RESULT_TABLE_ is relevant for the result of _FES_. | ||
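Because only the row count matters, a condition on an aggregate has to remove rows rather than return a boolean. The sketch below applies the evaluation steps above; the threshold of two awards is an arbitrary choice for illustration.

```cypher
MATCH (actor:Actor)
WHERE EXISTS {
  MATCH (actor)-[:WON]->(award:Award)
  // An aggregation without grouping keys always yields exactly one row ...
  WITH count(award) AS awards
  // ... so the filter must drop that row when the condition fails.
  WHERE awards >= 2
  RETURN awards
}
RETURN actor
```

If the subquery instead ended in `RETURN count(award) >= 2`, the result table would always hold one row, containing either true or false. Under the rules as stated, the `EXISTS { ... }` expression would then evaluate to true for every actor.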
|
|
||
| === Examples | ||
|
|
||
| _Example 1A:_ | ||
| ==== Property existence test | ||
|
|
||
| _Example 1:_ | ||
|
|
||
| Return all nodes that have a property named `slogan`. | ||
| [source, cypher] | ||
| ---- | ||
| MATCH (actor) | ||
| WHERE exists(actor.slogan) | ||
| WHERE actor.slogan IS NOT NULL | ||
| RETURN actor | ||
| ---- | ||
|
|
||
| ==== Pattern predicates in boolean expression context | ||
|
|
||
| _Example 2A:_ | ||
|
|
||
| Find all actors who won an award. | ||
|
|
||
| [source, cypher] | ||
| ---- | ||
| MATCH (actor:Actor) WHERE (actor)-[:WON]->(:Award) | ||
| RETURN actor | ||
| ---- | ||
|
|
||
| _Example 2B:_ | ||
|
|
||
| Find all actors with their major accolade. | ||
|
|
||
| [source, cypher] | ||
| ---- | ||
| MATCH (actor:Actor) | ||
| RETURN actor, | ||
| CASE | ||
|
||
| WHEN (actor)-[:WON]->(:Oscar) THEN 'Oscar winner' | ||
| WHEN (actor)-[:WON]->(:GoldenGlobe) THEN 'Golden Globe winner' | ||
| ELSE 'None' | ||
| END AS accolade | ||
| ---- | ||
|
|
||
| _Example 2C:_ | ||
|
|
||
| Find all movies that have at least one award-winning actor in their cast. | ||
|
|
||
| [source, cypher] | ||
| ---- | ||
| MATCH (movie:Movie)<-[:ACTED_IN]-(actor:Actor) | ||
| WITH movie, collect(actor) AS cast | ||
| WHERE ANY(actor IN cast WHERE (actor)-[:WON]->(:Award)) | ||
| RETURN movie | ||
| ---- | ||
|
|
||
|
||
| ==== Existential subqueries | ||
|
|
||
| _Example 3A:_ | ||
|
|
||
| Find all actors who have acted together with another actor with the same name. | ||
|
|
||
| [source, cypher] | ||
|
|
@@ -112,7 +198,7 @@ WHERE EXISTS { | |
| RETURN actor | ||
| ---- | ||
|
|
||
| _Example 2B:_ | ||
| _Example 3B:_ | ||
|
|
||
| Find all actors who have acted together with another actor with the same name on at least two movies. | ||
|
|
||
|
|
@@ -128,11 +214,6 @@ WHERE EXISTS { | |
| RETURN actor | ||
| ---- | ||
|
|
||
| === Interaction with existing features | ||
|
|
||
| The `EXISTS {}` subquery clause renders obsolete the current pattern predicate syntax. | ||
| This allows the pattern predicates to be deprecated and/or removed in favour of `EXISTS {}`. | ||
|
|
||
| == What others do | ||
|
|
||
| This is very similar to what SQL does with its `EXISTS` functionality. | ||
|
|
@@ -156,7 +237,7 @@ RETURN person | |
|
|
||
| This proposal also allows for powerful subqueries, for example using aggregation inside the `EXISTS {}` query. | ||
|
|
||
| .Find all teams that have at least two members who have worked on successful projects. | ||
| Find all teams that have at least two members who have worked on successful projects. | ||
|
||
| [source, cypher] | ||
| ---- | ||
| MATCH (team:Team) | ||
|
|
@@ -171,6 +252,9 @@ WHERE EXISTS { | |
| RETURN team | ||
| ---- | ||
|
|
||
| However, pattern predicates have a readability advantage in narrow cases. | ||
| Hence, this proposal retains them while removing their confusing meaning outside boolean expression context. | ||
|
||
|
|
||
| == Caveats to this proposal | ||
|
|
||
| Subqueries are powerful constructs. As such they can be difficult to understand, and difficult for a query planner to get right. | ||