
Conversation

web4akash (Contributor) commented on Jan 19, 2026

Description

This PR improves observability by adding logging at the debug and error levels.

Fixes #71

Type of Change

  • Bug fix (non-breaking change which fixes an issue)

Changes Made

  • Added debug-level logging for generated prompts in Llama and OpenAI generators
  • Logged raw LLM responses before JSON parsing
  • Logged JSON parsing errors with traceback information for easier debugging (a sketch of the overall pattern follows below)
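
For reference, the three additions follow this overall pattern (a simplified sketch with hypothetical function and variable names; the actual call sites are in src/agentunit/generators/llm_generator.py and may differ in wording):

```python
import json
import logging

logger = logging.getLogger(__name__)


def parse_generation(prompt: str, raw_response: str) -> dict:
    """Hypothetical helper showing where each new log line sits."""
    # Debug-level: record the prompt sent to the LLM.
    logger.debug("Generated prompt:\n%s", prompt)
    # Debug-level: record the raw response before attempting to parse it.
    logger.debug("Raw LLM response:\n%s", raw_response)
    try:
        return json.loads(raw_response)
    except json.JSONDecodeError:
        # Error-level: keep the raw payload and traceback, then re-raise.
        logger.error(
            "Failed to parse LLM response as JSON:\n%s",
            raw_response,
            exc_info=True,
        )
        raise
```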

Testing

  • Manual testing performed

Test Configuration

  • Python version: 3.11
  • Operating System: Linux (Fedora)
  • Relevant adapters tested:
    • LlamaDatasetGenerator
    • OpenAIDatasetGenerator (error path)

Test Results

DEBUG | agentunit.generators.llm_generator | Generated prompt logged successfully
DEBUG | agentunit.generators.llm_generator | Raw LLM response logged before parsing
ERROR | agentunit.generators.llm_generator | JSON parsing failure logged with traceback
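
The lines above summarize the observed output. To surface DEBUG-level records in this pipe-separated style during manual testing, one option is to configure the root logger along these lines:

```python
import logging

# Show DEBUG and above, formatted as "LEVEL | logger.name | message".
logging.basicConfig(
    level=logging.DEBUG,
    format="%(levelname)s | %(name)s | %(message)s",
)
```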

Code Quality

  • My code follows the project's style guidelines (Ruff, Black)
  • I have performed a self-review of my own code
  • My changes generate no new warnings or errors

Documentation

  • No documentation updates required

Breaking Changes

  • None

Dependencies

  • No new dependencies added

Performance Impact

  • No performance impact

Additional Context

  • This change improves debuggability without modifying runtime behavior or LLM interaction logic. Logging is emitted at DEBUG/ERROR levels only and does not affect normal execution.

Checklist

  • I have read the CONTRIBUTING.md guide
  • My branch name follows the convention (feature/, fix/, docs/, etc.)
  • My commit messages follow the conventional commit format
  • I have tested my changes locally

Reviewer Notes

Please pay special attention to:

  • Logging levels and message clarity
  • Placement of logs before parsing and on error paths

Summary by CodeRabbit

  • Chores
    • Enhanced logging across generation workflows (sync and async) to capture prompts and raw responses for improved observability.
    • Improved error logging for JSON parsing failures, recording raw responses before re-raising to aid diagnostics.
    • No changes to user-facing behavior or public APIs.



coderabbitai bot commented Jan 19, 2026

Walkthrough

Added module-level logging to the LLM generator module; debug logs now record constructed prompts and raw responses for both Llama and OpenAI flows, and JSON decode failures are logged at error level before being re-raised.

Changes

Cohort / File(s): Logging instrumentation in LLM generators (src/agentunit/generators/llm_generator.py)
Summary: Added a logging import and logger = logging.getLogger(__name__); instrumented the synchronous and asynchronous Llama and OpenAI generation flows to log prompts and raw responses at debug level; JSON decode failures now log the raw response at error level before re-raising.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks: ✅ 5 of 5 passed

  • Title check ✅ Passed: The title clearly and concisely summarizes the main change: logging for prompts, raw responses, and parsing errors in the LLM generator.
  • Description check ✅ Passed: The description addresses most template sections with details on changes, testing setup, and context, though some checkboxes remain unchecked and some sections are abbreviated.
  • Linked Issues check ✅ Passed: The PR implements all four coding requirements from issue #71: a logger module, debug-level logging for prompts, debug-level logging for raw responses, and error-level logging for parsing errors.
  • Out of Scope Changes check ✅ Passed: All changes are directly aligned with the scope of issue #71, focusing exclusively on adding logging without modifying functional behavior or LLM interaction logic.
  • Docstring Coverage ✅ Passed: Docstring coverage is 100.00%, above the required threshold of 80.00%.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/agentunit/generators/llm_generator.py (1)

120-131: Guard prompt/response logging to avoid sensitive data leakage.
These logs can include PII or secrets from prompts/responses; even at debug level, this can end up in centralized logs. Consider making payload logging opt‑in and/or truncating/redacting.

🔒 Suggested hardening (opt‑in + truncation)
```diff
 @dataclass
 class GeneratorConfig:
     """Configuration for dataset generation."""
@@
     include_edge_cases: bool = True
     edge_case_ratio: float = 0.3
+    log_payloads: bool = False
+    log_payload_max_chars: int = 2000
@@
-        logger.debug("Llama generated prompt:\n%s", prompt)
+        if self.config.log_payloads:
+            logger.debug(
+                "Llama generated prompt:\n%s",
+                prompt[: self.config.log_payload_max_chars],
+            )
@@
-        logger.debug("Llama raw response:\n%s", response)
+        if self.config.log_payloads:
+            logger.debug(
+                "Llama raw response:\n%s",
+                response[: self.config.log_payload_max_chars],
+            )
@@
-        logger.debug(
-            "OpenAI generated prompt (messages):\n%s",
-            json.dumps(messages, indent=2)
-        )
+        if self.config.log_payloads:
+            logger.debug(
+                "OpenAI generated prompt (messages):\n%s",
+                json.dumps(messages, indent=2)[: self.config.log_payload_max_chars],
+            )
@@
-        logger.debug("OpenAI raw response text:\n%s", response_text)
+        if self.config.log_payloads:
+            logger.debug(
+                "OpenAI raw response text:\n%s",
+                response_text[: self.config.log_payload_max_chars],
+            )
```

Also applies to: 265-279

🤖 Fix all issues with AI agents
In `@src/agentunit/generators/llm_generator.py`:
- Around lines 163-169: the error handler for json.JSONDecodeError logs "OpenAI", which is misleading. Update the logger.error message in that except block to reference "Llama" (or "Llama/LLM") instead, preserving the raw response argument and exc_info=True so the same contextual data is logged; also update the commented-out msg string, if present, to match the corrected Llama wording.
- Around lines 313-319: the except block catching json.JSONDecodeError places the `raise` outside the except block, which fails with "No active exception to reraise", and its log message mislabels the source as "Llama". Move the `raise` into the except block so the original JSONDecodeError is re-raised, and change the logger.error label to "OpenAI" (keep using `response_text` and `exc_info=True` to preserve details).

codecov-commenter commented

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 16.66667% with 10 lines in your changes missing coverage. Please review.

Files with missing lines: src/agentunit/generators/llm_generator.py (patch coverage 16.66%, 10 lines missing ⚠️)


aviralgarg05 (Owner) left a comment


Also fix the lint issue; the rest is good to go.

aviralgarg05 (Owner) left a comment


LGTM! Thanks for your contribution.

aviralgarg05 merged commit b2f79e2 into aviralgarg05:main on Jan 20, 2026
12 checks passed
web4akash (Contributor, Author) commented

> LGTM! Thanks for your contribution.

Thank you for the review and for merging the PR!

web4akash deleted the logging_for_llm_dataset_generator branch on January 20, 2026 at 13:03.


Development

Successfully merging this pull request may close these issues.

Add logging to LLM dataset generators
