Contributor

@yohannj yohannj commented Jul 11, 2025

Summary

Implements issue 630.

Without clear direction, I went with "Proposal 1: Choose logger behavior using environment variables".

We could propose three modes:

FILE_ONLY: Current behavior, the default
CONSOLE_ONLY: Write logs to stdout/stderr
FILE_AND_CONSOLE: Keep the current behavior and also write all logs to stdout/stderr
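
To make the three modes concrete, here is a minimal sketch of how an environment variable could be mapped to them. The enum values and the `AIRBYTE_LOGGING_BEHAVIOR` variable name come from this PR; the function name `parse_logging_behavior`, the whitespace stripping, and the exact fallback logic are assumptions for illustration, not the code in `airbyte/logs.py`.

```python
import enum
import os
from typing import Optional


class LoggingBehavior(enum.Enum):
    """The three modes listed above."""

    FILE_ONLY = "FILE_ONLY"
    CONSOLE_ONLY = "CONSOLE_ONLY"
    FILE_AND_CONSOLE = "FILE_AND_CONSOLE"


def parse_logging_behavior(value: Optional[str]) -> LoggingBehavior:
    """Map a raw environment value to a mode (hypothetical helper).

    Matching is case-insensitive; unknown, empty, or missing values fall
    back to FILE_ONLY, which the PR describes as the default.
    """
    try:
        return LoggingBehavior((value or "").strip().upper())
    except ValueError:
        return LoggingBehavior.FILE_ONLY


# Read the mode once from the environment.
behavior = parse_logging_behavior(os.environ.get("AIRBYTE_LOGGING_BEHAVIOR"))
print(behavior.name)
```

With this sketch, running a script with `AIRBYTE_LOGGING_BEHAVIOR=console_only` set in the environment would select `CONSOLE_ONLY`, while leaving the variable unset keeps the file-only default.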

Implementation details

Code

The code is split into separate commits, each with a (hopefully) clear responsibility and set of modifications.

In particular, the first commit splits the code of get_global_file_logger into several functions. This prepares the ground for supporting multiple handlers and makes it easier to review the actual feature being introduced.

Testing

The last commit adds unit tests for the logs.py file. I believe they help clarify how everything behaves.

It also highlights a behavior that feels odd to me.
The passthrough file logger raises an exception when the folder that should contain its logs cannot be created, whereas PyAirbyte does not raise an exception when airbyte.logs.AIRBYTE_LOGGING_ROOT does not exist.

I believe both cases should behave the same way.
In contrast, the "global stats logger" and "global file logger" share the same behavior.
If we want to change that, I can quickly create a separate PR.
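
The inconsistency described above can be illustrated with a small sketch. Both helper names (`strict_file_handler`, `lenient_file_handler`) and the `connector.log` file name are hypothetical; they only contrast "raise when the folder cannot be created" with "warn and carry on without a file handler", and are not the functions in `airbyte/logs.py`.

```python
import logging
import tempfile
from pathlib import Path
from typing import Optional


def strict_file_handler(folder: Path) -> logging.FileHandler:
    """Raise OSError if the log folder cannot be created."""
    folder.mkdir(parents=True, exist_ok=True)
    # delay=True postpones opening the file until the first record is emitted.
    return logging.FileHandler(folder / "connector.log", delay=True)


def lenient_file_handler(folder: Path) -> Optional[logging.FileHandler]:
    """Return None (after printing a warning) instead of raising."""
    try:
        return strict_file_handler(folder)
    except OSError:
        print(f"Warning: could not create log folder: {folder}")
        return None


with tempfile.TemporaryDirectory() as tmp:
    # A regular file used as a parent makes the folder impossible to create.
    blocker = Path(tmp) / "not_a_dir"
    blocker.write_text("x")
    print(lenient_file_handler(blocker / "logs"))
```

Aligning the two behaviors would amount to choosing one of these two styles for every logger.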

Summary by CodeRabbit

  • New Features
    • Logging output can now be configured via an environment variable to write logs to files, console, or both.
  • Bug Fixes
    • Improved error handling and fallback behavior for logging when log directories are missing or cannot be created.
  • Tests
    • Added comprehensive unit tests to verify logging configuration, output destinations, and error handling under various scenarios.

Contributor

coderabbitai bot commented Jul 11, 2025

📝 Walkthrough

The logging system in airbyte/logs.py was refactored to support configurable logging outputs via a new environment variable, allowing logs to be directed to files, the console, or both. Helper functions and an enum were introduced to modularize handler setup. New unit tests comprehensively validate the new logging behaviors and error handling.

Changes

File(s) | Change Summary
airbyte/logs.py | Refactored logging system to support configurable logging behaviors via an environment variable; added helper functions, a new enum, and modular handler setup. Existing logger creation functions were updated to use the new logic.
tests/unit_tests/test_logs.py | Added comprehensive unit tests for parsing logging behavior, logger creation, handler setup, and error handling for the new logging configuration system.

Sequence Diagram(s)

sequenceDiagram
    participant Env as Environment
    participant UserCode as User Code
    participant Logs as airbyte/logs.py
    participant Logger as logging.Logger
    participant Handlers as Logging Handlers

    Env->>Logs: Provides AIRBYTE_LOGGING_BEHAVIOR
    UserCode->>Logs: get_global_file_logger() / get_global_stats_logger() / new_passthrough_file_logger()
    Logs->>Logs: Parse AIRBYTE_LOGGING_BEHAVIOR
    Logs->>Handlers: Select and create appropriate handlers (File, Console, or both)
    Logs->>Logger: Remove existing handlers
    Logs->>Logger: Attach new handlers
    Logger-->>UserCode: Configured logger instance
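
The flow in the diagram above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PR's implementation: `select_handlers` and `configure_logger` are hypothetical names, and the `log_file` parameter stands in for whatever path resolution `airbyte/logs.py` actually performs. The `INFO` level, `propagate = False`, and `handlers.clear()` calls mirror what the review comments in this thread quote from the PR.

```python
import enum
import logging
import sys
from pathlib import Path
from typing import Optional


class LoggingBehavior(enum.Enum):
    FILE_ONLY = "FILE_ONLY"
    CONSOLE_ONLY = "CONSOLE_ONLY"
    FILE_AND_CONSOLE = "FILE_AND_CONSOLE"


def select_handlers(
    behavior: LoggingBehavior, log_file: Optional[Path]
) -> list:
    """Pick file and/or console handlers for the given behavior."""
    handlers = []
    wants_file = behavior in (LoggingBehavior.FILE_ONLY, LoggingBehavior.FILE_AND_CONSOLE)
    wants_console = behavior in (LoggingBehavior.CONSOLE_ONLY, LoggingBehavior.FILE_AND_CONSOLE)
    if wants_file and log_file is not None:
        # delay=True avoids creating the file until something is logged.
        handlers.append(logging.FileHandler(log_file, delay=True))
    if wants_console:
        handlers.append(logging.StreamHandler(sys.stdout))
    return handlers


def configure_logger(
    name: str, behavior: LoggingBehavior, log_file: Optional[Path] = None
) -> logging.Logger:
    """Remove existing handlers, then attach the newly selected ones."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    for handler in select_handlers(behavior, log_file):
        logger.addHandler(handler)
    return logger


logger = configure_logger("sketch.demo", LoggingBehavior.CONSOLE_ONLY)
logger.info("written to stdout only")
```

Clearing the handlers before attaching new ones keeps repeated calls from stacking duplicate handlers on the same named logger.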

Would you like to see a more detailed breakdown of the new helper functions or perhaps a comparison between the old and new logging flows, wdyt?


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 074ca7f and 53c7ce7.

📒 Files selected for processing (1)
  • airbyte/logs.py (8 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: yohannj
PR: airbytehq/PyAirbyte#716
File: airbyte/logs.py:384-402
Timestamp: 2025-07-11T19:53:44.427Z
Learning: In the PyAirbyte project, when reviewing PRs, maintain clear separation of concerns. Don't suggest changes that are outside the scope of the PR's main objective, even if they would improve consistency or fix other issues. This helps with reviewing changes and potential reverts.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
airbyte/logs.py (15)
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: yohannj
PR: airbytehq/PyAirbyte#716
File: airbyte/logs.py:384-402
Timestamp: 2025-07-11T19:53:44.427Z
Learning: In the PyAirbyte project, when reviewing PRs, maintain clear separation of concerns. Don't suggest changes that are outside the scope of the PR's main objective, even if they would improve consistency or fix other issues. This helps with reviewing changes and potential reverts.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:26-26
Timestamp: 2024-10-11T22:05:15.550Z
Learning: In the PyAirbyte project, when reviewing Python code and encountering a TODO comment without an issue link, I should post a friendly reminder to resolve it before merging, instead of suggesting to add an issue link.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:26-26
Timestamp: 2024-10-06T17:43:09.319Z
Learning: In the PyAirbyte project, when reviewing Python code and encountering a TODO comment without an issue link, I should post a friendly reminder to resolve it before merging, instead of suggesting to add an issue link.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:47-47
Timestamp: 2024-10-08T15:34:31.026Z
Learning: When reviewing changes in test fixtures, especially renaming, consider that they might be due to fixing copy-paste errors and may not impact core codepaths.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:47-47
Timestamp: 2024-08-31T00:58:32.484Z
Learning: When reviewing changes in test fixtures, especially renaming, consider that they might be due to fixing copy-paste errors and may not impact core codepaths.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-10-08T15:34:31.026Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-08-31T05:02:20.192Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
🧬 Code Graph Analysis (1)
airbyte/logs.py (1)
airbyte/constants.py (1)
  • _str_to_bool (63-65)
🔇 Additional comments (8)
airbyte/logs.py (8)

36-56: Well-designed enum and parsing logic!

The LoggingBehavior enum is clearly defined and the parsing function handles invalid values gracefully with a sensible default. The case-insensitive parsing is a nice touch for user experience.


70-84: Excellent constant definition and documentation!

The AIRBYTE_LOGGING_BEHAVIOR constant follows the established pattern in the codebase and includes clear documentation of all available options.


166-217: Nice refactoring of the global logger setup!

The refactored get_global_file_logger function maintains all existing functionality while adding flexibility through the new handler system. The separation of concerns with _get_global_handlers() and _remove_all_handlers() makes the code more maintainable.


221-254: Consistent refactoring pattern for stats logger!

The get_global_stats_logger function follows the same clean refactoring pattern as the global logger, maintaining functionality while adding configurability.


257-307: Excellent refactoring maintaining API compatibility!

The new_passthrough_file_logger function maintains its original signature while gaining the new configurability through the handler system. The refactoring is consistent with the other logger functions.


310-344: Clean handler selection logic with match statements!

The three handler selection functions (_get_global_handlers, _get_global_stats_handlers, _get_passthrough_handlers) use match statements effectively and properly filter out None handlers. The separation of concerns is excellent.


347-402: Well-structured file handler creation functions!

The file handler creation functions (_get_global_file_handler, _get_global_stats_file_handler, _get_passthrough_file_handler) are well-designed with proper error handling and consistent use of print statements for user feedback, which aligns with PyAirbyte's preferences.


405-429: Simple and effective utility functions!

The utility functions are well-designed: _get_console_handler() correctly uses stdout for console output, get_global_stats_log_path() maintains its original functionality with proper error handling, and _remove_all_handlers() is a clean utility for handler management.


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (7)
airbyte/logs.py (5)

11-11: Fix line length to comply with the 100-character limit.

The line exceeds the maximum allowed length. Would you consider breaking it into multiple lines, wdyt?

-PyAirbyte supports different logging behaviors controlled by the `AIRBYTE_LOGGING_BEHAVIOR` environment
-variable:
+PyAirbyte supports different logging behaviors controlled by the `AIRBYTE_LOGGING_BEHAVIOR`
+environment variable:

309-320: Use handlers.extend() instead of repeated append() calls.

For better code style and efficiency, would you consider using extend() to add multiple handlers at once, wdyt?

 def _get_global_handlers() -> list[logging.Handler]:
     handlers: list[logging.Handler] = []
     match AIRBYTE_LOGGING_BEHAVIOR:
         case LoggingBehavior.FILE_ONLY:
             handlers.append(_get_global_file_handler())
         case LoggingBehavior.CONSOLE_ONLY:
             handlers.append(_get_console_handler())
         case LoggingBehavior.FILE_AND_CONSOLE:
-            handlers.append(_get_global_file_handler())
-            handlers.append(_get_console_handler())
+            handlers.extend([_get_global_file_handler(), _get_console_handler()])
 
     return [h for h in handlers if h is not None]

323-333: Use handlers.extend() for consistency.

Similar to the previous function, using extend() would be more concise here too, wdyt?

 def _get_global_stats_handlers() -> list[logging.Handler]:
     handlers: list[logging.Handler] = []
     match AIRBYTE_LOGGING_BEHAVIOR:
         case LoggingBehavior.FILE_ONLY:
             handlers.append(_get_global_stats_file_handler())
         case LoggingBehavior.CONSOLE_ONLY:
             handlers.append(_get_console_handler())
         case LoggingBehavior.FILE_AND_CONSOLE:
-            handlers.append(_get_global_stats_file_handler())
-            handlers.append(_get_console_handler())
+            handlers.extend([_get_global_stats_file_handler(), _get_console_handler()])
     return [h for h in handlers if h is not None]

336-346: Use handlers.extend() for consistency.

Let's maintain the same pattern here as well, wdyt?

 def _get_passthrough_handlers(connector_name: str) -> list[logging.Handler]:
     handlers: list[logging.Handler] = []
     match AIRBYTE_LOGGING_BEHAVIOR:
         case LoggingBehavior.FILE_ONLY:
             handlers.append(_get_passthrough_file_handler(connector_name))
         case LoggingBehavior.CONSOLE_ONLY:
             handlers.append(_get_console_handler())
         case LoggingBehavior.FILE_AND_CONSOLE:
-            handlers.append(_get_passthrough_file_handler(connector_name))
-            handlers.append(_get_console_handler())
+            handlers.extend([_get_passthrough_file_handler(connector_name), _get_console_handler()])
     return [h for h in handlers if h is not None]

428-430: Add blank line before function definition.

The linter expects 2 blank lines before a function definition at the module level.

     return folder / "airbyte-stats.log"
 
+
 def _remove_all_handlers(logger: logging.Logger) -> None:
     """Remove all handlers from a logger."""
     logger.handlers.clear()
tests/unit_tests/test_logs.py (2)

17-17: Consider making connector_name a constant.

Since this is a test constant used throughout the file, would it be clearer to follow Python naming conventions and make it uppercase, wdyt?

-connector_name = "test_connector"
+CONNECTOR_NAME = "test_connector"

Note: You'll also need to update all references to connector_name to CONNECTOR_NAME throughout the file.


21-23: Remove unnecessary f-string prefixes.

These strings don't contain any placeholders, so the f prefix is not needed.

-    logging.getLogger(f"airbyte").handlers.clear()
-    logging.getLogger(f"airbyte.stats").handlers.clear()
+    logging.getLogger("airbyte").handlers.clear()
+    logging.getLogger("airbyte.stats").handlers.clear()
     logging.getLogger(f"airbyte.{connector_name}").handlers.clear()
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 816fa44 and 012282d.

📒 Files selected for processing (2)
  • airbyte/logs.py (8 hunks)
  • tests/unit_tests/test_logs.py (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
tests/unit_tests/test_logs.py (2)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-08-31T01:20:08.405Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-10-08T15:34:31.026Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
airbyte/logs.py (10)
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-10-08T15:34:31.026Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-08-31T05:02:20.192Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
🧬 Code Graph Analysis (2)
tests/unit_tests/test_logs.py (1)
airbyte/logs.py (5)
  • LoggingBehavior (36-40)
  • _parse_logging_behavior (48-54)
  • get_global_file_logger (165-216)
  • get_global_stats_logger (220-253)
  • new_passthrough_file_logger (256-306)
airbyte/logs.py (1)
airbyte/constants.py (1)
  • _str_to_bool (63-65)
🪛 GitHub Actions: Run Linters
tests/unit_tests/test_logs.py

[error] 21-21: f-string without any placeholders (F541). Remove extraneous f prefix.


[error] 22-22: f-string without any placeholders (F541). Remove extraneous f prefix.

airbyte/logs.py

[warning] 11-11: Line too long (103 > 100) (E501)


[error] 317-318: Use handlers.extend(...) instead of repeatedly calling handlers.append() (FURB113)


[error] 331-332: Use handlers.extend(...) instead of repeatedly calling handlers.append() (FURB113)


[error] 344-345: Use handlers.extend(...) instead of repeatedly calling handlers.append() (FURB113)


[error] 428-428: Expected 2 blank lines, found 1 (E302)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte/logs.py (1)

43-45: Import _str_to_bool from airbyte.constants instead of duplicating it.

This function already exists in airbyte/constants.py. To avoid code duplication and maintain consistency, would you consider importing it instead, wdyt?

+from airbyte.constants import _str_to_bool
 from airbyte_cdk.utils.datetime_helpers import ab_datetime_now
 
 
 class LoggingBehavior(enum.Enum):
     """Enumeration for PyAirbyte logging behavior."""
     FILE_ONLY = "FILE_ONLY"
     CONSOLE_ONLY = "CONSOLE_ONLY"
     FILE_AND_CONSOLE = "FILE_AND_CONSOLE"
 
 
-def _str_to_bool(value: str) -> bool:
-    """Convert a string value of an environment values to a boolean value."""
-    return bool(value) and value.lower() not in {"", "0", "false", "f", "no", "n", "off"}
-
-
 def _parse_logging_behavior(value: str) -> LoggingBehavior:
⛔ Skipped due to learnings
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-08-31T05:02:20.192Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-10-08T15:34:31.026Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
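
To show why the two helpers may not be interchangeable, here is a small comparison. `str_to_bool_logs` copies the body shown in the diff above; `str_to_bool_narrow` is a hypothetical stand-in written only from the learning's description ("only 'false' and '0' are falsey") and is an assumption, not the actual code in `airbyte/constants.py`. In particular, how the real function treats an empty string is not stated in this thread.

```python
def str_to_bool_logs(value: str) -> bool:
    """Same body as the helper shown in the airbyte/logs.py diff above."""
    return bool(value) and value.lower() not in {"", "0", "false", "f", "no", "n", "off"}


def str_to_bool_narrow(value: str) -> bool:
    """Hypothetical: treats only 'false' and '0' as falsey, per the learning."""
    return value.lower() not in {"false", "0"}


# Print where the two interpretations agree and where they diverge.
for raw in ["0", "false", "no", "off", "yes"]:
    print(f"{raw!r}: logs={str_to_bool_logs(raw)} narrow={str_to_bool_narrow(raw)}")
```

Under these assumptions the helpers agree on "0" and "false" but disagree on values such as "no" and "off", so swapping one for the other could change behavior.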
tests/unit_tests/test_logs.py (2)

26-49: Comprehensive test coverage for logging behavior parsing!

Great job on the thorough parameterized tests covering various input cases including edge cases like empty strings, invalid values, and case variations.


282-312: Well-structured test for passthrough logger with all behavior modes!

The parameterized test effectively validates logger creation across all logging behaviors and structured logging states. Excellent coverage!

@yohannj yohannj force-pushed the stream_log_to_console branch from 012282d to 46901e1 on July 11, 2025 18:48
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/unit_tests/test_logs.py (2)

76-134: Consider adding assertions for logger configuration

The parametrized test structure is excellent, but I notice it's only checking handler types and formatter formats. Would it be valuable to also verify the logger level and propagation settings? The actual implementation sets logger.setLevel(logging.INFO) and logger.propagate = False - wdyt about adding assertions for these configurations?

             assert logger is not None
             assert logger.name == "airbyte"
+            assert logger.level == logging.INFO
+            assert logger.propagate is False
             assert len(logger.handlers) == len(handlers)
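
The extra assertions suggested above could be grouped into one reusable check. The following is a self-contained sketch: `assert_logger_configured` is a hypothetical helper, and the `demo` logger is a stand-in built by hand rather than the logger returned by `get_global_file_logger()`.

```python
import logging
import sys


def assert_logger_configured(
    logger: logging.Logger, name: str, expected_handler_types: list
) -> None:
    """Check name, level, propagation, and the exact handler types."""
    assert logger.name == name
    assert logger.level == logging.INFO
    assert logger.propagate is False
    assert [type(h) for h in logger.handlers] == expected_handler_types


# Stand-in logger configured the way the review comment describes.
demo = logging.getLogger("airbyte.sketch")
demo.setLevel(logging.INFO)
demo.propagate = False
demo.handlers.clear()
demo.addHandler(logging.StreamHandler(sys.stdout))

assert_logger_configured(demo, "airbyte.sketch", [logging.StreamHandler])
```

Comparing exact types rather than using `isinstance` matters here because `logging.FileHandler` is a subclass of `logging.StreamHandler`.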

20-26: Consider extracting the connector name constant

The clear_logger_caches() function uses the global connector_name variable, but it would be cleaner to parameterize this or make it more explicit. Since this is just a test utility function, the current approach is acceptable, but wdyt about making the dependency more explicit?

-def clear_logger_caches():
+def clear_logger_caches(connector_name: str = "test_connector"):
     get_global_file_logger.cache_clear()
     get_global_stats_logger.cache_clear()
-    logging.getLogger(f"airbyte").handlers.clear()
-    logging.getLogger(f"airbyte.stats").handlers.clear()
-    logging.getLogger(f"airbyte.{connector_name}").handlers.clear()
+    logging.getLogger("airbyte").handlers.clear()
+    logging.getLogger("airbyte.stats").handlers.clear()
+    logging.getLogger(f"airbyte.{connector_name}").handlers.clear()
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 012282d and 46901e1.

📒 Files selected for processing (2)
  • airbyte/logs.py (8 hunks)
  • tests/unit_tests/test_logs.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte/logs.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
tests/unit_tests/test_logs.py (2)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-08-31T01:20:08.405Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
🧬 Code Graph Analysis (1)
tests/unit_tests/test_logs.py (1)
airbyte/logs.py (5)
  • LoggingBehavior (36-41)
  • _parse_logging_behavior (49-55)
  • get_global_file_logger (166-217)
  • get_global_stats_logger (221-254)
  • new_passthrough_file_logger (257-307)
🪛 GitHub Actions: Run Linters
tests/unit_tests/test_logs.py

[error] 23-23: f-string without any placeholders (F541) reported by ruff. Remove extraneous f prefix.


[error] 24-24: f-string without any placeholders (F541) reported by ruff. Remove extraneous f prefix.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (4)
tests/unit_tests/test_logs.py (4)

29-50: Excellent test coverage for the parsing function

The parametrized tests thoroughly cover case insensitivity, invalid inputs, and edge cases for the _parse_logging_behavior function. The test cases align well with the implementation's behavior of defaulting to FILE_ONLY for invalid inputs.


147-167: Verify the directory creation failure test logic

The test expects different behaviors based on the logging configuration when directory creation fails. For FILE_ONLY scenarios, it expects None when no handlers are available, but for scenarios with console handlers, it expects a valid logger. This matches the implementation's fallback behavior - nice attention to detail! The warning suppression is also appropriate since the implementation likely emits warnings for directory creation failures.
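
The fallback described here can be sketched roughly as follows. The helper names (`make_file_handler`, `collect_handlers`) and the log file name are hypothetical, and the real implementation may differ; the point is only that a failed `mkdir` yields no file handler while a console handler, if requested, survives.

```python
import logging
import sys
import warnings
from pathlib import Path
from typing import List, Optional
from unittest import mock


def make_file_handler(log_dir: Path) -> Optional[logging.Handler]:
    """Hypothetical helper: return None instead of raising if the folder cannot be created."""
    try:
        log_dir.mkdir(parents=True, exist_ok=True)
    except OSError:
        warnings.warn(f"Could not create log directory: {log_dir}", stacklevel=2)
        return None
    return logging.FileHandler(log_dir / "example.log")


def collect_handlers(log_dir: Path, with_console: bool) -> List[logging.Handler]:
    """Gather the handlers that could actually be created."""
    candidates = [make_file_handler(log_dir)]
    if with_console:
        candidates.append(logging.StreamHandler(sys.stdout))
    return [h for h in candidates if h is not None]


# Simulate a filesystem on which the log directory cannot be created.
with mock.patch.object(Path, "mkdir", side_effect=OSError("read-only filesystem")):
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # the warning is expected here
        file_only = collect_handlers(Path("unwritable"), with_console=False)
        file_and_console = collect_handlers(Path("unwritable"), with_console=True)
```

With no handlers left in the file-only case, a caller can return `None` rather than a logger that writes nowhere.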


225-262: Stats logger always uses structured format

I notice that the stats logger test always expects %(message)s format regardless of the airbyte_structured_logging parameter. Looking at the implementation in airbyte/logs.py, the stats logger indeed always uses structured logging format. The test correctly reflects this behavior.
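
For readers unfamiliar with the format string, a small sketch of what `%(message)s` implies: the formatter emits the message untouched, so a JSON payload stays parseable. The payload and the second format string are made up for illustration.

```python
import json
import logging

payload = json.dumps({"event": "sync_complete", "records": 3})
record = logging.LogRecord(
    name="example.stats",
    level=logging.INFO,
    pathname="example.py",
    lineno=1,
    msg=payload,
    args=None,
    exc_info=None,
)

# "%(message)s" passes the message through unchanged.
structured = logging.Formatter("%(message)s").format(record)
# A human-oriented format adds a prefix, so the line is no longer pure JSON.
plain = logging.Formatter("%(levelname)s - %(message)s").format(record)
```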


1-469: Outstanding test coverage for the logging system

This test file provides comprehensive coverage of the new logging behaviors introduced in the PR. The use of parametrized tests, proper mocking of filesystem operations, and testing of edge cases (directory creation failures, missing logging root) demonstrates excellent testing practices. The tests align well with the implementation's behavior and the PR objectives.

@yohannj yohannj force-pushed the stream_log_to_console branch 2 times, most recently from da54dbb to 4fea0b7 Compare July 11, 2025 19:07
@yohannj yohannj force-pushed the stream_log_to_console branch from 4fea0b7 to 752671c Compare July 11, 2025 19:10

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
airbyte/logs.py (1)

175-180: Consider adding a debug log for handler selection

The handler selection logic is clean, but it might be helpful to add some debug logging to indicate when no handlers are available, wdyt? This could help with troubleshooting configuration issues.

 handlers = _get_global_handlers()
 if len(handlers) == 0:
+    # Consider adding debug logging here for troubleshooting
     return None
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4fea0b7 and 074ca7f.

📒 Files selected for processing (3)
  • airbyte/logs.py (8 hunks)
  • tests/integration_tests/test_lowcode_connectors.py (1 hunks)
  • tests/unit_tests/test_logs.py (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
tests/unit_tests/test_logs.py (12)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:26-26
Timestamp: 2024-10-11T22:05:15.550Z
Learning: In the PyAirbyte project, when reviewing Python code and encountering a TODO comment without an issue link, I should post a friendly reminder to resolve it before merging, instead of suggesting to add an issue link.
airbyte/logs.py (14)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:47-47
Timestamp: 2024-10-08T15:34:31.026Z
Learning: When reviewing changes in test fixtures, especially renaming, consider that they might be due to fixing copy-paste errors and may not impact core codepaths.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-10-08T15:34:31.026Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
🧬 Code Graph Analysis (2)
tests/unit_tests/test_logs.py (1)
airbyte/logs.py (5)
  • LoggingBehavior (36-41)
  • _parse_logging_behavior (49-55)
  • get_global_file_logger (166-217)
  • get_global_stats_logger (221-254)
  • new_passthrough_file_logger (257-307)
airbyte/logs.py (1)
airbyte/constants.py (1)
  • _str_to_bool (63-65)
🪛 GitHub Actions: Run Linters
tests/integration_tests/test_lowcode_connectors.py

[error] 1-1: Ruff formatting check failed. File would be reformatted. Run 'ruff format --fix' to fix code style issues.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (13)
airbyte/logs.py (7)

36-41: LGTM on the LoggingBehavior enum!

Clean enum definition that clearly defines the three logging modes. The string values match the enum names, which makes the environment variable configuration intuitive.


49-55: Solid parsing logic with good fallback behavior!

The case-insensitive parsing using value.upper() is user-friendly, and the graceful fallback to FILE_ONLY on invalid values ensures robustness. The try-except pattern handles the ValueError appropriately.
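
A minimal sketch of the parsing pattern described above. The three mode names come from the PR description; the enum base class and the function body are assumptions, and `parse_logging_behavior` is only a stand-in for `_parse_logging_behavior`, whose actual body is not shown in this thread.

```python
from enum import Enum


class LoggingBehavior(Enum):
    FILE_ONLY = "FILE_ONLY"
    CONSOLE_ONLY = "CONSOLE_ONLY"
    FILE_AND_CONSOLE = "FILE_AND_CONSOLE"


def parse_logging_behavior(value: str) -> LoggingBehavior:
    """Map a raw string to a mode, falling back to FILE_ONLY when it is not recognized."""
    try:
        return LoggingBehavior(value.upper())
    except ValueError:
        # Unknown or empty values keep the backward-compatible default.
        return LoggingBehavior.FILE_ONLY
```

Under these assumptions, `parse_logging_behavior("console_only")` gives `LoggingBehavior.CONSOLE_ONLY`, while an empty or unknown string gives `LoggingBehavior.FILE_ONLY`.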


70-84: Well-documented configuration with clear defaults!

The environment variable configuration is well-documented with comprehensive docstring explaining all valid values and fallback behavior. The default of FILE_ONLY maintains backward compatibility.


312-320: Excellent use of match statement for handler selection!

The match statement provides clean, readable logic for selecting handlers based on logging behavior. The list comprehension at the end efficiently filters out None handlers.


405-406: Simple and effective console handler implementation!

Using sys.stdout for console output is a reasonable choice: it sends logs to stdout rather than stderr (the default stream for logging.StreamHandler), which suits environments that collect application output from stdout.
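
To illustrate the stream choice: `logging.StreamHandler()` with no argument writes to `sys.stderr`, so stdout has to be passed explicitly. The sketch below captures stdout to show where the record ends up; the logger name and format string are made up.

```python
import io
import logging
import sys
from contextlib import redirect_stdout

buffer = io.StringIO()
with redirect_stdout(buffer):
    # sys.stdout is resolved here, so the handler writes into the buffer.
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    demo_logger = logging.getLogger("example_console")
    demo_logger.setLevel(logging.INFO)
    demo_logger.propagate = False
    demo_logger.addHandler(handler)
    demo_logger.info("sync started")

captured = buffer.getvalue()
```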


427-429: Clean utility function for handler management!

The _remove_all_handlers function provides a clean way to reset logger handlers before configuring new ones. Using handlers.clear() is more efficient than removing handlers individually.
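
One thing worth keeping in mind: `handlers.clear()` detaches handlers but does not close them. A hypothetical variant that also closes each handler could look like the sketch below; whether the PR needs this is a judgment call, not something this thread confirms.

```python
import logging


def remove_all_handlers(logger: logging.Logger) -> None:
    """Hypothetical variant: close each handler, then drop them all."""
    for handler in list(logger.handlers):
        handler.close()  # release any file held by the handler
    logger.handlers.clear()


example = logging.getLogger("example_reset")
example.addHandler(logging.NullHandler())
example.addHandler(logging.NullHandler())
remove_all_handlers(example)
```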


384-402: Confirm passthrough handler OSError propagation

The call to new_passthrough_file_logger in ConnectorBase (airbyte/_connector_base.py:78) is not wrapped in a try/except, so an OSError from mkdir will bubble up and abort initialization. Since the unit tests in tests/unit_tests/test_logs.py explicitly expect and assert this error, it looks intentional and consistent with the design. No changes required here, wdyt?
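
A rough sketch of the propagation being discussed, with a hypothetical `new_file_logger` standing in for the passthrough logger setup: because `mkdir` is not guarded, the error reaches the caller.

```python
import logging
from pathlib import Path
from unittest import mock


def new_file_logger(name: str, log_dir: Path) -> logging.Logger:
    """Hypothetical passthrough-style setup: mkdir errors are not caught."""
    log_dir.mkdir(parents=True, exist_ok=True)  # may raise OSError
    logger = logging.getLogger(name)
    logger.addHandler(logging.FileHandler(log_dir / f"{name}.log"))
    return logger


error = None
with mock.patch.object(Path, "mkdir", side_effect=OSError("permission denied")):
    try:
        new_file_logger("example_connector", Path("unwritable"))
    except OSError as exc:
        error = exc  # the caller decides how to recover
```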

tests/unit_tests/test_logs.py (6)

20-26: Well-organized test setup with proper cache clearing!

The clear_logger_caches function properly resets both the LRU caches and logger handlers between tests, preventing test pollution. This is crucial for isolated test execution.
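
Why the cache reset matters can be shown with a small sketch. `get_cached_logger` is a hypothetical cached factory; the real getters are only known from this thread to expose `cache_clear()`.

```python
import functools
import logging

build_count = 0


@functools.lru_cache
def get_cached_logger() -> logging.Logger:
    """Hypothetical cached factory, similar in spirit to a global logger getter."""
    global build_count
    build_count += 1
    return logging.getLogger("example_cached")


def clear_logger_caches() -> None:
    """Reset both the cache and the handlers so the next test starts clean."""
    get_cached_logger.cache_clear()
    logging.getLogger("example_cached").handlers.clear()


get_cached_logger()
get_cached_logger()  # served from the cache, no rebuild
calls_before_clear = build_count
clear_logger_caches()
get_cached_logger()  # rebuilt after the cache was cleared
calls_after_clear = build_count
```

Without the reset, a test that changes the logging mode would still receive the logger built by an earlier test.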


29-50: Comprehensive test coverage for parsing behavior!

The parametrized tests cover all valid logging behavior values with different case variations, plus edge cases for invalid inputs. The test confirms that invalid values correctly fall back to FILE_ONLY.


76-134: Excellent parametrized testing for different logging configurations!

The comprehensive matrix testing covers all combinations of logging behavior, structured logging settings, and expected outcomes. The assertions verify both handler types and formatter formats, ensuring the configuration works correctly across all scenarios.


147-167: Thorough error handling validation!

The directory creation failure tests properly simulate OSError conditions and verify that the system gracefully handles failures by falling back to console-only logging when file logging fails. The use of warnings.catch_warnings() prevents test noise from expected warning messages.


430-438: Good coverage of passthrough logger error handling!

The test correctly expects OSError for directory creation failures in the passthrough logger, which aligns with the intentional design choice to preserve the existing error handling behavior mentioned in the PR description.


1-469: Outstanding test coverage overall!

This test suite provides excellent coverage of the new logging system:

  • All logging behavior modes are tested
  • Both structured and plain text logging are covered
  • Error conditions are properly tested
  • Edge cases like missing logging root are handled
  • Proper mocking isolates the tests from filesystem dependencies

The test organization with separate classes for each logger type makes the suite maintainable and easy to understand. Great work on ensuring the logging refactoring is well-tested!

@yohannj yohannj force-pushed the stream_log_to_console branch from 074ca7f to 752671c Compare July 11, 2025 20:01
@yohannj yohannj marked this pull request as ready for review July 11, 2025 20:06
@yohannj yohannj changed the title from "Stream log to console" to "feat(logging): Provide options to write logs to stdout" Jul 11, 2025
@aaronsteers

aaronsteers commented Jul 14, 2025

/test-pr

PR test job started... Check job output.

❌ Tests failed.

@aaronsteers

aaronsteers commented Jul 16, 2025

/test-pr

PR test job started... Check job output.

✅ Tests passed.

@yohannj

yohannj commented Jul 18, 2025

/test-pr

Edit:

Command 'test-pr' is not configured for the user's permission level 'none'.

@aaronsteers

aaronsteers commented Jul 28, 2025

/test-pr

PR test job started... Check job output.

❌ Tests failed.
