-
Notifications
You must be signed in to change notification settings - Fork 62
Fix: Escape special characters in entity descriptions #137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Prevents syntax errors when descriptions contain quotes or backslashes. The escaping only affects query generation stored data remains unchanged
WalkthroughThe Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| # Escape special characters to prevent Cypher syntax errors | ||
| escaped_description = self.description.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"') | ||
| non_unique_attributes += f"{', ' if len(non_unique_attributes) > 0 else ''} {descriptionKey}: '{escaped_description}'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Good fix for the immediate problem, but consider additional special characters and consistency.
The escaping approach is correct for apostrophes, and the order of replacements (backslash first, then quotes) properly prevents double-escaping. However, there are several areas that need verification:
-
Other special characters: Newlines (
\n), tabs (\t), carriage returns (\r), and other control characters can also break Cypher syntax if present in descriptions. Consider escaping these as well. -
Consistency with attributes: Lines 141 and 144 convert attributes using
str(attr). If the Attribute class doesn't perform similar escaping, attributes containing special characters will cause the same syntax errors. -
More robust approach: Parameterized queries would handle all escaping automatically and be more secure. If FalkorDB supports this, it would be preferable.
Verify that the escaping is sufficient for Cypher by checking the FalkorDB documentation:
Run this script to check if the Attribute class handles escaping:
Consider adding a test to verify special character handling:
# In your test file
def test_entity_description_with_special_characters():
"""Test that special characters in descriptions are properly escaped."""
entity = Entity(
label="Test",
attributes=[],
description="Text with 'quotes' and \"double quotes\" and backslash: \\ and newline:\nNext line"
)
query = entity.to_graph_query()
# Verify the query is valid Cypher and doesn't cause syntax errors🏁 Script executed:
#!/bin/bash
# Check how Attribute class converts to string
rg -nP -A 10 'class Attribute' graphrag_sdk/attribute.py
rg -nP -A 10 'def __str__' graphrag_sdk/attribute.pyLength of output: 1183
🌐 Web query:
FalkorDB Cypher string escaping special characters
💡 Result:
Short answer: FalkorDB follows OpenCypher conventions — use backslashes inside string literals to escape special characters (", ', \ , \n, \t, \uXXXX for Unicode), and use backticks (...) to escape identifiers (labels, property names, variables) that contain spaces, punctuation or reserved words. Examples:
- String literal with quotes and backslash:
- "He said "hello" and left\home"
- 'It's fine\nNext line'
- Unicode:
- "\u00B0" (degree sign)
- Identifier with special chars:
- MATCH (n:
Customer-Account) RETURN n.first-name
- MATCH (n:
Sources: FalkorDB Cypher docs and Cypher parsing / string-escape reference. [1][2]
References
[1] FalkorDB — Cypher docs.
[2] Cypher parsing / string-literal escaping (Neo4j Cypher manual).
Add missing escape sequences and fix attribute name escaping.
The verification confirms the original review concerns are valid. FalkorDB requires escaping for \n, \t, \r, and other control characters per OpenCypher conventions, but the current implementation only handles \\, ', and ".
Additionally, the Attribute class __str__ method (lines 161-170 in graphrag_sdk/attribute.py) returns a string without escaping the attribute name, which means attribute names containing special characters will cause syntax errors.
Required fixes:
-
Line 148 (graphrag_sdk/entity.py): Expand escaping to include:
escaped_description = (self.description .replace("\\", "\\\\") .replace("'", "\\'") .replace('"', '\\"') .replace("\n", "\\n") .replace("\t", "\\t") .replace("\r", "\\r") )
-
Lines 141, 144 (graphrag_sdk/entity.py): Attribute names must be escaped before embedding in the query. Either:
- Escape attribute names when building the query, or
- Update Attribute's
__str__method to escape the name field
🤖 Prompt for AI Agents
In graphrag_sdk/entity.py around lines 147 to 149 and referenced
Attribute.__str__ (graphrag_sdk/attribute.py lines ~161-170), the description
escaping only handles backslash, single and double quotes and misses control
characters, and attribute names are not escaped causing Cypher syntax errors;
update the description escaping to also replace "\n", "\t", and "\r" with their
escaped sequences (\\n, \\t, \\r) as shown in the review, and fix attribute-name
escaping by either escaping the attribute name where it is interpolated in
entity.py or, preferably, updating Attribute.__str__ to return the name properly
escaped for Cypher (escape backslashes, quotes and control characters the same
way as description) so all attribute names are safe when used in generated
queries.
Problem
Entity descriptions with apostrophes (e.g., "The team's first quarter goal") cause Cypher syntax errors and prevent know knowledge graph creation.
Solution
Added character escaping in 'Entity.to_graph_query()`:
Escape
\,',"characters before building Cypher queryEscaping is only for query syntax data stored in database is unchanged
Summary by CodeRabbit