5c6f9d7
Added the nested subqueries CIP
Jun 22, 2016
d7d0a83
Sundry content edits to the subquery CIP
Jun 23, 2016
bf71712
Clarified the syntax wrt `OPTIONAL`
Jun 23, 2016
f6245fd
Added the notion of write subqueries, with `UNWIND` + `DO {...}` repl…
Jul 20, 2016
ede1334
Clarified the way in which variable bindings work (based on comments …
Jul 20, 2016
a1c6442
Addressed some feedback
boggle Sep 26, 2016
4caeb54
Addressing comments; making clarifications
Nov 17, 2016
5b5b9cc
Sketched out additional forms of nested subqueries.
boggle Mar 27, 2017
fe21475
Homogeneous syntax for OPTIONAL, MANDATORY, MATCH, DO WHEN
boggle Mar 30, 2017
80a1ce4
Address feedback and introduce new syntactic short forms
boggle Apr 13, 2017
b8f49d6
Add chained subqueries with `THEN` and overhaul document
boggle Apr 19, 2017
bf53252
Reflect discussion; add new conditional form of DO and WHERE shorthand
boggle Apr 20, 2017
70a91cd
Textual improvements
Apr 21, 2017
1f02e2b
Refer to Query Combinator CIP
Apr 21, 2017
cc176e8
Wording
boggle May 1, 2017
2921112
Rework CIP
boggle Oct 16, 2017
3ed1ca9
Clarify precedence rules
boggle Oct 16, 2017
2d2435f
Add ammending nested subqueries and fix query combinator precedence
boggle Oct 16, 2017
0156bc3
Fix definition of chained queries and move to right directory
boggle Oct 16, 2017
acfac59
Textual edits
Oct 17, 2017
7554cc9
Clarified query combinator semantics
boggle Oct 17, 2017
10fa182
Fixed erroneous queries
Oct 19, 2017
1ca70bf
Reformatted title
Jan 17, 2018
cfc2a43
Reworking/incorporating alternative CIP
boggle May 6, 2018
6199430
Fused with nested subqueries CIP from multigraph work
boggle May 6, 2018
5b6a333
Added stand-alone nested calls and some clarifications/fix-ups
boggle May 7, 2018
077fb18
Grammar fix
boggle May 7, 2018
Homogeneous syntax for OPTIONAL, MANDATORY, MATCH, DO WHEN
boggle committed Oct 16, 2017
commit fe214758fa27958144112bef41eeb995c43332a1
119 changes: 58 additions & 61 deletions cip/CIP2016-06-22-nested-subqueries.adoc
@@ -31,107 +31,92 @@ This CIP may be viewed in light of the EXISTS CIP and the forthcoming Pattern Co

== Proposal

This proposal suggests the introduction of two new subquery constructs to Cypher.
Nested subqueries are self-contained Cypher queries that are run within the scope of an outer Cypher query.

**1. Read-only match subqueries**
This proposal suggests the introduction of new nested subquery constructs to Cypher.

We propose the addition of new syntax to the `MATCH` clause for expressing nested read-only subqueries.
* Read-only nested match subqueries of the form `MATCH { [(a)->(b) [WHERE ...]] ... RETURN * }`
* Read-only nested optional match subqueries of the form `OPTIONAL { [(a)->(b) [WHERE ...]] ... RETURN * }`
* Read-only nested mandatory match subqueries of the form `MANDATORY { [(a)->(b) [WHERE ...]] ... [RETURN *] }`
* Read/Write nested subqueries of the form `DO WHEN ... { ... }` (not ending with `RETURN`)

Nested subqueries are self-contained, read-only Cypher queries.
Each form is introduced by a keyword, optionally followed by sub-clauses, and then by an inner query in curly braces.

A nested read-only subquery is denoted using the following syntax: `MATCH { <subquery> }`.
Nested subqueries may be correlated - i.e. the inner query may use variables from the outer query - or uncorrelated.

Nested subqueries may be correlated - i.e. the subquery has a dependency on the outer query - or uncorrelated.
Nested subqueries can be contained within other nested subqueries at an arbitrary (but finite) depth.

As this proposal extends the `MATCH` clause, nested subqueries can be contained within other nested subqueries at arbitrary depth.

**2. Read-only optional match subqueries**
**1. Read-only nested match subqueries**

We propose the addition of new, abbreviated syntax for expressing nested read-only optional match subqueries.
We propose the addition of new syntax to the `MATCH` clause for expressing read-only nested match subqueries.

A nested read-only optional match subquery takes the form: `OPTIONAL { <subquery> }`.
A nested read-only match subquery is denoted using the following syntax: `MATCH { <inner-match-query> }`.

Nested optional match subqueries may be correlated - i.e. the subquery has a dependency on the outer query - or uncorrelated.
The inner match query is a full read-only Cypher query.

**3. Read-only mandatory match subqueries**
Moreover, any valid read-only Cypher query from which the leading `MATCH` keyword has been omitted may also be used as an inner match query.

We propose the addition of new, abbreviated syntax for expressing nested read-only mandatory match subqueries.
This rule only applies if the leading `MATCH` clause is the root clause of the inner query (i.e. is not the first clause inside a nested query or a `UNION`).

A nested read-only mandatory match subquery takes the form: `MANDATORY { <subquery> }`.

Nested mandatory match subqueries may be correlated - i.e. the subquery has a dependency on the outer query - or uncorrelated.
**2. Read-only nested optional match subqueries**

**4. Write-only/read-write subqueries**
We propose the addition of a new `OPTIONAL` clause for expressing read-only nested optional match subqueries.

We further propose the addition of a new syntax - the `DO` clause - for expressing nested write-only/read-write subqueries that _do not return any data_.
A nested read-only optional match subquery is denoted using the following syntax: `OPTIONAL { <inner-match-query> }`.

A nested write-only/read-write subquery is denoted using the following syntax: `DO { <subquery> }`.

We additionally propose removing the `FOREACH` clause from the current language as it is rendered obsolete by the introduction of `DO`.
**3. Read-only nested mandatory match subqueries**

=== Read-only subquery syntax
We propose the addition of a new `MANDATORY` clause for expressing read-only nested mandatory match subqueries.

All kinds of read-only subqueries support the following syntactical forms:
A nested read-only mandatory match subquery is denoted using the following syntax: `MANDATORY { <inner-mandatory-query> }`.

* `KEYWORD { <subquery> }`
* `KEYWORD { <pattern> [WHERE <predicate>] <subquery> }` which is syntactic sugar for `KEYWORD { MATCH <pattern> [WHERE <predicate>] WITH * <subquery> }`
* `KEYWORD MATCH <pattern> [WHERE <predicate>]` which is syntactic sugar for `KEYWORD { MATCH <pattern> [WHERE <predicate>] RETURN * }`
The inner mandatory query is any inner match query.

Here keyword is
Moreover, any inner match query from which the final `RETURN` clause has been omitted may also be used as an inner mandatory query.

I think this needs examples to demonstrate the use of these. As well as the difference between having a RETURN clause and leaving it out.

In my understanding MANDATORY without RETURN would be like EXISTS, but as a clause with the capability of failing the query rather than a boolean expression. Interestingly that use of MANDATORY would not affect the cardinality of the query, whereas MANDATORY with RETURN could potentially increase the cardinality of the query (although never decrease it, since that would mean that no matches were found).


We're actually thinking about making RETURN mandatory on MANDATORY match based on the reason that this will be the 95% use case and having to name in the remaining 5% isn't a big deal.


The benefit would be less special casing. Then again, maybe someone might find it very natural to leave off RETURN.


* `OPTIONAL` for read-only optional match subqueries
* `MANDATORY` for read-only mandatory match subqueries
* `MATCH` for read-only match subqueries except for the last form which is just written as `MATCH <pattern> [WHERE <predicate>]`

All read-only subqueries may end in `RETURN`.
If they do not end with a `RETURN` clause, `RETURN *` is added implicitly.
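
The three read-only forms differ mainly in what happens to an incoming record for which the inner query yields no rows. The following is a minimal Python sketch of that difference, not part of the CIP. It assumes that `MATCH` drops such a record, that `OPTIONAL` keeps it with null bindings, and that `MANDATORY` fails the whole query (as suggested in the review discussion above). The names `run_read_only_subquery`, `MandatoryMatchError` and `customers_of` are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
# Hypothetical stand-in for an inner query: maps one incoming record to
# zero or more dicts holding the bindings of its final RETURN clause.
InnerQuery = Callable[[Record], List[Record]]


class MandatoryMatchError(Exception):
    """Assumed failure mode of MANDATORY when the inner query finds nothing."""


def run_read_only_subquery(keyword: str, records: List[Record],
                           inner: InnerQuery,
                           returned_names: List[str]) -> List[Record]:
    if keyword not in ("MATCH", "OPTIONAL", "MANDATORY"):
        raise ValueError(f"unknown keyword: {keyword}")
    out: List[Record] = []
    for rec in records:
        rows = inner(rec)
        if rows:
            # Every form behaves the same when there are matches:
            # one output record per inner row.
            out.extend({**rec, **row} for row in rows)
        elif keyword == "OPTIONAL":
            # Keep the incoming record, with nulls for the returned names.
            out.append({**rec, **{name: None for name in returned_names}})
        elif keyword == "MANDATORY":
            raise MandatoryMatchError(f"no match for record {rec!r}")
        # keyword == "MATCH": the record is simply dropped.
    return out


farms = [{"f": "A"}, {"f": "B"}]


def customers_of(rec: Record) -> List[Record]:
    # Only farm "A" has customers in this toy data set.
    return [{"c": "x"}, {"c": "y"}] if rec["f"] == "A" else []


matched = run_read_only_subquery("MATCH", farms, customers_of, ["c"])
optional = run_read_only_subquery("OPTIONAL", farms, customers_of, ["c"])
```

In this toy data set `matched` contains only rows for farm `A`, `optional` additionally keeps farm `B` with `c` set to `None`, and the `MANDATORY` form would raise for farm `B`.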
**4. Read/Write nested subqueries**

=== Grammar changes
We propose the addition of a new `DO` clause for expressing read/write nested subqueries that _do not return any data_.

We extend the https://github.com/opencypher/openCypher/blob/master/grammar/cypher.xml[grammar] as follows:
A nested read/write subquery is denoted using the following syntax: `DO [WHEN predicate] { <inner-update-query> }`.

**1. Read-only subqueries**
Any updating Cypher query from which the final `RETURN` clause has been omitted may be used as an inner update query.

[source, ebnf]
----
Match = [ "OPTIONAL" ], ( MatchPattern | NestedReadOnlySubquery ), [ "WHERE", Predicate ] ;
MatchPattern = "MATCH", Pattern ;
NestedReadOnlySubquery = "MATCH", "{", RegularQuery, "}" ;
----

**2. Write-only/read-write subqueries**
Using a
We additionally propose removing the `FOREACH` clause from the current language as it is rendered obsolete by the introduction of `DO`.

[source, ebnf]
----
Match = [ "OPTIONAL" ], "MATCH", Pattern, [ "WHERE", Predicate ], NestedWriteSubquery ;
NestedWriteSubquery = Unwind, "DO", "{", WriteSubquery, "}" ;
WriteSubquery = WriteOnlyClauseWithNoReturn, [ NestedWriteSubquery ] |
ReadWriteClauseWithNoReturn, [ NestedWriteSubquery ] ;
----

=== Semantic clarification

**1. Read-only subqueries**
**1. Read-only nested subqueries**

Conceptually, a nested subquery is evaluated for each incoming record and may produce an arbitrary number of result records.

All incoming variables remain in scope.

Any new variable bindings introduced by the final `RETURN` clause when evaluating the subquery will augment the variable bindings of the initial record. Therefore, nested subqueries cannot shadow variables present in the outer scope, and thus behave in the same way as `UNWIND` and `CALL` with regard to the introduction of new variable bindings. Any other variable bindings introduced in the subquery will not be visible to the outer scope.
Any new variable bindings introduced by the final `RETURN` clause when evaluating the subquery will augment the variable bindings of the initial record.
Therefore, nested subqueries cannot shadow variables present in the outer scope, and thus behave in the same way as `UNWIND` and `CALL` with regard to the introduction of new variable bindings.
Any other variable bindings that are introduced temporarily in the subquery will not be visible to the outer scope.

Subqueries interact with write clauses in the same way as `MATCH` does.

It is an error for a nested subquery to try to rebind (shadow) a pre-existing outer variable binding.
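
The binding rules above can be modelled in a few lines of Python. This is only an illustrative sketch under stated assumptions: records are plain dicts, the hypothetical `inner` callable stands for the inner query and returns just the bindings of its final `RETURN`, and re-returning an outer variable with an unchanged value is treated as harmless (an interpretation, since the examples in this CIP return outer variables from inner queries). The names `apply_nested_subquery`, `ShadowingError` and `names_for` are invented for the sketch.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
InnerQuery = Callable[[Record], List[Record]]


class ShadowingError(Exception):
    """Raised when the subquery tries to rebind an outer variable."""


def apply_nested_subquery(records: List[Record],
                          inner: InnerQuery) -> List[Record]:
    result: List[Record] = []
    for rec in records:
        # The inner query is evaluated once per incoming record and sees a
        # copy of the outer bindings (correlated use is therefore possible).
        for returned in inner(dict(rec)):
            for name, value in returned.items():
                if name in rec and rec[name] != value:
                    raise ShadowingError(f"cannot rebind outer variable {name!r}")
            # Incoming variables stay in scope; returned ones augment them.
            result.append({**rec, **returned})
    return result


def names_for(rec: Record) -> List[Record]:
    # 'upper' is a temporary inner binding; it is not returned and hence
    # never becomes visible to the outer scope.
    upper = str(rec["f"]).upper()
    return [{"name": upper}, {"name": upper + "!"}]


outer = [{"f": "farm1"}]
augmented = apply_nested_subquery(outer, names_for)
```

Here one incoming record produces two result records, each carrying the original `f` plus the new `name` binding.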

**2. Write-only/read-write subqueries**
**2. Read/Write subqueries**

Execution of a `DO` subquery does not change the cardinality; i.e. the inner update query is run for each incoming record (optionally filtered by the given predicate if a `WHEN` sub-clause is present).

Any input record is always passed on to the clause succeeding the `DO` subquery, irrespective of whether it was eligible for processing by the inner update query.

Execution of a `DO` subquery does not change the cardinality; i.e. the full subquery is run for each incoming record and then the record is passed on to the remainder of the outer query.
A `DO` clause that uses a `WHEN` sub-clause is called _conditional DO_.

A query may end with a `DO` subquery in the same way that a query can currently end with any update clause.
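
The cardinality rule for `DO` can be sketched in Python as follows. This is a simplified model rather than an implementation: `do_when`, `update` and `when` are hypothetical names, the inner update query is represented by a callback with side effects, and records are plain dicts.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]


def do_when(records: List[Record],
            update: Callable[[Record], None],
            when: Optional[Callable[[Record], bool]] = None) -> List[Record]:
    passed_on: List[Record] = []
    for rec in records:
        # Run the inner update query only for eligible records.
        if when is None or when(rec):
            update(dict(rec))  # the update sees the bindings, returns nothing
        # Every input record is passed on unchanged, eligible or not.
        passed_on.append(rec)
    return passed_on


# Mirrors the shape of a conditional DO over UNWIND range(1, 10) AS x,
# where only odd values of x trigger the update.
created: List[object] = []
rows = [{"x": x} for x in range(1, 11)]  # Cypher's range(1, 10) is inclusive
result = do_when(rows,
                 update=lambda rec: created.append(rec["x"]),
                 when=lambda rec: rec["x"] % 2 == 1)
```

After the call, `result` still holds all ten records, while `created` records the side effect for the five odd values only.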

=== Examples

**1. Read-only subqueries**
**1. Read-only nested match subqueries**

Post-UNION processing:
[source, cypher]
@@ -153,7 +138,7 @@ ORDER BY time DESC
LIMIT 10
----

Uncorrelated nested subquery:
Uncorrelated nested match subquery:
[source, cypher]
----
MATCH (f:Farm {id: $farmId})
@@ -170,7 +155,7 @@ MATCH {
RETURN f, name, code
----

Correlated nested subquery:
Correlated nested match subquery:
[source, cypher]
----
MATCH (f:Farm {id: $farmId})-[:IS_IN]->(country:Country)
@@ -188,7 +173,7 @@ MATCH {
RETURN f, name, code
----

Filtered and correlated nested subquery:
Filtered and correlated nested match subquery:
[source, cypher]
----
MATCH (f:Farm)-[:IS_IN]->(country:Country)
@@ -209,12 +194,12 @@ WHERE f.type = 'organic'
RETURN f, brand.name AS name, code
----

Doubly-nested subquery:
Doubly-nested match subquery:
[source, cypher]
----
MATCH (f:Farm {id: $farmId})
MATCH {
MATCH (c:Customer)-[:BUYS_FOOD_AT]->(f)
(c:Customer)-[:BUYS_FOOD_AT]->(f)
MATCH {
MATCH (c)-[:RETWEETS]->(t:Tweet)<-[:TWEETED_BY]-(f)
RETURN c, count(*) AS count
@@ -231,7 +216,7 @@ MATCH {
RETURN f.name AS name, type, sum(endorsement) AS endorsement
----

**2. Write-only/read-write subqueries**
**2. Read/Write nested subqueries**

We illustrate these by means of an 'old' version of the query, in which `FOREACH` is used, followed by the 'new' version, using `DO`.

@@ -293,6 +278,18 @@ DO {
}
----

Conditional `DO`:
[source, cypher]
----
MATCH (r:Root)
UNWIND range(1, 10) AS x
DO WHEN x % 2 = 1 {
MERGE (c:Odd:Child {id: x})
MERGE (r)-[:PARENT]->(c)
}
----


=== Interaction with existing features

Apart from the suggested deprecation of the `FOREACH` clause, nested read-only, write-only and read-write subqueries do not interact directly with any existing features.