@ChenZiHong-Gavin
Collaborator

This PR enhances error handling and input validation in the Kuzu graph database storage layer.

@gemini-code-assist
Contributor

Summary of Changes

Hello @ChenZiHong-Gavin, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness and error handling within the Kuzu graph database storage layer. It focuses on making data retrieval more resilient to malformed or unexpected data formats by implementing comprehensive JSON decoding error checks. Additionally, the changes enhance the database cleanup process, ensuring that the drop operation reliably removes database artifacts while properly managing active connections.

Highlights

  • Robust JSON Decoding: Enhanced error handling for JSON data retrieved from nodes and edges, ensuring that empty, non-string, or malformed JSON strings are gracefully handled and logged, preventing crashes during data retrieval.
  • Improved Database Drop Mechanism: The drop method for KuzuDB now correctly handles various types of db_path (directory, file, or other existing path) and ensures that database connections are properly closed before attempting to remove the database files.
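
A minimal sketch of the defensive decoding described in the first highlight, assuming the stored payload may arrive empty, as a non-string, or as malformed JSON. The helper name `safe_json_loads` and its `label` parameter are hypothetical and are not taken from the PR; falling back to an empty dict mirrors the `else {}` pattern visible in the reviewed snippets, and rejecting non-object JSON is an extra assumption.

```python
import json
import logging
from typing import Any

logger = logging.getLogger(__name__)


def safe_json_loads(raw: Any, label: str) -> dict:
    """Decode a JSON object from raw, returning {} when it cannot be decoded.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    # Non-string values (None, numbers, ...) and blank strings carry no JSON.
    if not isinstance(raw, str) or not raw.strip():
        return {}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        logger.error("Error decoding JSON for %s: %s", label, e)
        return {}
    # Assumption: valid JSON that is not an object (e.g. a list) is unusable.
    return data if isinstance(data, dict) else {}
```

A caller would pass something like `safe_json_loads(data_str, f"node {node_id}")` and can then treat every result as a plain dict.
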
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request significantly improves the robustness of KuzuDB operations by adding comprehensive error handling for JSON decoding when retrieving node and edge data. It also enhances the drop method to handle various file system object types more gracefully. The changes are well-structured and address potential issues with malformed data or unexpected file paths. A consistent improvement would be to replace print statements with proper logging calls for better error tracking and system monitoring.

try:
    return json.loads(data_str)
except json.JSONDecodeError as e:
    print(f"Error decoding JSON for node {node_id}: {e}")

medium

Using print for error messages is generally not recommended in production code. It's better to use a logging framework (e.g., Python's logging module) to allow for configurable log levels, output destinations, and structured logging. This improves maintainability and debugging capabilities.

Suggested change
- print(f"Error decoding JSON for node {node_id}: {e}")
+ logging.error(f"Error decoding JSON for node {node_id}: {e}")
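
To illustrate what configurable levels and output destinations can look like in practice, here is a hedged sketch using only the standard `logging` module. The logger name `kuzu_storage` and the function `decode_node` are made up for the example, and returning `{}` on failure is an assumption. It uses a named logger with %-style arguments instead of the root-level `logging.error(f"...")` call in the suggestion; both work, and this is a stylistic alternative rather than what the PR merged.

```python
import io
import json
import logging

logger = logging.getLogger("kuzu_storage")  # hypothetical logger name


def decode_node(node_id: str, data_str: str) -> dict:
    try:
        return json.loads(data_str)
    except json.JSONDecodeError as e:
        # %-style args defer string formatting until a handler emits the record.
        logger.error("Error decoding JSON for node %s: %s", node_id, e)
        return {}  # assumed fallback for this sketch


# The caller, not the storage code, decides where messages go and how they look.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)

result = decode_node("n1", "{not valid json")
captured = stream.getvalue()
```

Swapping the `StreamHandler` for a `logging.FileHandler` would redirect the same message to a file without touching `decode_node`.
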

try:
    data = json.loads(data_str) if data_str and data_str.strip() else {}
except json.JSONDecodeError as e:
    print(f"Error decoding JSON for node {node_id}: {e}")

medium

For better error tracking and system monitoring, print statements should be replaced with a proper logging mechanism. This allows for more flexible handling of error messages.

Suggested change
- print(f"Error decoding JSON for node {node_id}: {e}")
+ logging.error(f"Error decoding JSON for node {node_id}: {e}")

Comment on lines 177 to 179
print(
    f"Error decoding JSON for edge {source_node_id}->{target_node_id}: {e}"
)

medium

Replacing print with a logging call is a best practice for handling errors in a way that can be configured and monitored. This ensures that error messages are handled consistently across the application.

Suggested change
- print(
-     f"Error decoding JSON for edge {source_node_id}->{target_node_id}: {e}"
- )
+ logging.error(
+     f"Error decoding JSON for edge {source_node_id}->{target_node_id}: {e}"
+ )

try:
    data = json.loads(data_str) if data_str and data_str.strip() else {}
except json.JSONDecodeError as e:
    print(f"Error decoding JSON for edge {src}->{dst}: {e}")

medium

It's recommended to use a logging framework instead of print for error messages. This provides more control over log output and severity, which is crucial for debugging and operational insights.

Suggested change
- print(f"Error decoding JSON for edge {src}->{dst}: {e}")
+ logging.error(f"Error decoding JSON for edge {src}->{dst}: {e}")

try:
    data = json.loads(data_str) if data_str and data_str.strip() else {}
except json.JSONDecodeError as e:
    print(f"Error decoding JSON for edge {src}->{dst}: {e}")

medium

For consistent error handling and better observability, print statements should be replaced with calls to a logging utility. This allows for centralized management of log messages.

Suggested change
- print(f"Error decoding JSON for edge {src}->{dst}: {e}")
+ logging.error(f"Error decoding JSON for edge {src}->{dst}: {e}")

  if os.path.isdir(self.db_path):
      shutil.rmtree(self.db_path)
-     print(f"Dropped KuzuDB at {self.db_path}")
+     print(f"Dropped KuzuDB directory at {self.db_path}")

medium

Informational messages like this should ideally use logging.info instead of print. This allows for better control over log verbosity and integration with monitoring systems.

Suggested change
- print(f"Dropped KuzuDB directory at {self.db_path}")
+ logging.info(f"Dropped KuzuDB directory at {self.db_path}")
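
One practical consequence of this suggestion: with Python's default configuration the root logger only emits WARNING and above, so an INFO message stays silent until the application lowers the level. A small sketch of that behaviour, assuming a hypothetical logger name `kuzu_drop` and a made-up path:

```python
import io
import logging

logger = logging.getLogger("kuzu_drop")  # hypothetical logger name
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False  # keep the example independent of root logger config

db_path = "/tmp/example_kuzu_db"  # made-up path for illustration

logger.setLevel(logging.WARNING)
logger.info("Dropped KuzuDB directory at %s", db_path)  # filtered out
quiet_output = stream.getvalue()

logger.setLevel(logging.INFO)
logger.info("Dropped KuzuDB directory at %s", db_path)  # now emitted
verbose_output = stream.getvalue()
```

This is the "control over log verbosity" the comment refers to: the call site stays the same, and only the configured level decides whether the message appears.
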

    print(f"Dropped KuzuDB directory at {self.db_path}")
elif os.path.isfile(self.db_path):
    os.remove(self.db_path)
    print(f"Dropped KuzuDB file at {self.db_path}")

medium

For consistency and better log management, consider using logging.info instead of print for informational messages.

Suggested change
- print(f"Dropped KuzuDB file at {self.db_path}")
+ logging.info(f"Dropped KuzuDB file at {self.db_path}")

    print(f"Dropped KuzuDB file at {self.db_path}")
elif os.path.exists(self.db_path):
    os.unlink(self.db_path)
    print(f"Dropped KuzuDB path at {self.db_path}")

medium

Using logging.info for successful operations provides a more robust logging solution than print, enabling easier debugging and operational oversight.

Suggested change
- print(f"Dropped KuzuDB path at {self.db_path}")
+ logging.info(f"Dropped KuzuDB path at {self.db_path}")

    os.unlink(self.db_path)
    print(f"Dropped KuzuDB path at {self.db_path}")
else:
    print(f"Database path {self.db_path} does not exist")

medium

For situations where a path does not exist, logging.warning is more appropriate than print. This highlights a potential issue without being a critical error, and integrates well with logging systems.

Suggested change
- print(f"Database path {self.db_path} does not exist")
+ logging.warning(f"Database path {self.db_path} does not exist")
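
Taken together, the suggestions on the drop branches could look roughly like the sketch below. It is a standalone function, not the PR's method: `drop_db_path` is a hypothetical name, the connection-closing step mentioned in the summary is omitted because the connection API is not shown in this conversation, and the boolean return value is an addition for illustration.

```python
import logging
import os
import shutil

logger = logging.getLogger(__name__)


def drop_db_path(db_path: str) -> bool:
    """Remove db_path if it exists; return True when something was removed."""
    if os.path.isdir(db_path):
        shutil.rmtree(db_path)
        logger.info("Dropped KuzuDB directory at %s", db_path)
    elif os.path.isfile(db_path):
        os.remove(db_path)
        logger.info("Dropped KuzuDB file at %s", db_path)
    elif os.path.exists(db_path):
        # Existing path that is neither a regular file nor a directory.
        os.unlink(db_path)
        logger.info("Dropped KuzuDB path at %s", db_path)
    else:
        # A missing path is worth flagging but is not treated as a failure.
        logger.warning("Database path %s does not exist", db_path)
        return False
    return True
```

Successful removals log at INFO and the missing-path case at WARNING, matching the severity levels proposed in the review comments.
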

@ChenZiHong-Gavin ChenZiHong-Gavin merged commit 714e959 into main Dec 22, 2025
4 checks passed
@ChenZiHong-Gavin ChenZiHong-Gavin deleted the fix/fix-kuzudb-error branch December 22, 2025 09:29