-
Notifications
You must be signed in to change notification settings - Fork 966
Feat: Adds subgraph retriever to graph based completion searches #874
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: Adds subgraph retriever to graph based completion searches #874
Conversation
Please make sure all the checkboxes are checked:
|
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
…-the-data-based-on-nodesets
|
Cypher Injection in Graph Database Query Filter Construction @@ -652,8 +652,24 @@
+ "}"
)
return relationship_types_undirected_str
+ @staticmethod
+ def _escape_cypher_identifier(identifier: str) -> str:
+ # Allowed unescaped: simple names with [a-zA-Z0-9_] only, not starting with digit.
+ # Otherwise: surround with backticks and escape inner backticks with double-backtick.
+ import re
+ if not isinstance(identifier, str):
+ raise ValueError("Identifier must be a string")
+ # Disallow control, newline, and quote chars
+ if any(ord(c) < 32 or c in '\'"\\' for c in identifier):
+ raise ValueError("Graph name contains illegal control or quote characters")
+ if re.fullmatch(r'[A-Za-z_][A-Za-z0-9_]*', identifier):
+ return identifier
+ # If already surrounded by backticks, don't double-escape
+ escaped = identifier.replace("`", "``")
+ return f"`{escaped}`"
+
async def project_entire_graph(self, graph_name="myGraph"):
"""
Projects all node labels and all relationship types into an undirected in-memory GDS graph.
"""
@@ -661,12 +677,13 @@
return
node_labels_str = await self.get_node_labels_string()
relationship_types_undirected_str = await self.get_relationship_labels_string()
+ graph_name_escaped = self._escape_cypher_identifier(graph_name)
query = f"""
CALL gds.graph.project(
- '{graph_name}',
+ {graph_name_escaped},
{node_labels_str},
{relationship_types_undirected_str}
) YIELD graphName;
"""
@@ -674,9 +691,10 @@
await self.query(query)
async def drop_graph(self, graph_name="myGraph"):
if await self.graph_exists(graph_name):
- drop_query = f"CALL gds.graph.drop('{graph_name}');"
+ graph_name_escaped = self._escape_cypher_identifier(graph_name)
+ drop_query = f"CALL gds.graph.drop({graph_name_escaped});"
await self.query(drop_query)
async def get_graph_metrics(self, include_optional=False):
"""For the definition of these metrics, please refer to
@@ -761,5 +779,5 @@
WHERE COUNT {{ MATCH (n)--() }} = 1
RETURN n
"""
result = await self.query(query)
- return [record["n"] for record in result] if result else []
+ return [record["n"] for record in result] if result else []
\ No newline at end of file
Explanation of FixThe vulnerability is due to direct f-string interpolation of the To fix this, I added a Potential impact: If callers provide graph names with truly invalid Cypher identifier characters, these will now be escaped (with backticks), which is harmless in most real-world scenarios but may surprise callers expecting punctuation to be included literally in graph names as interpreted by Neo4j/GDS. Legitimate graph names composed of safe characters are unaffected. There is no new dependency or breaking change to existing users (unless they relied on ability to inject Cypher code, which was a security risk). Issues
|
lxobr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, nice work!
Description
Adds subgraph retriever to graph based completion searches
DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.