-
Notifications
You must be signed in to change notification settings - Fork 60
feat(logging): Provide options to write logs to stdout #716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThe logging system in Changes
Sequence Diagram(s)sequenceDiagram
participant Env as Environment
participant UserCode as User Code
participant Logs as airbyte/logs.py
participant Logger as logging.Logger
participant Handlers as Logging Handlers
Env->>Logs: Provides AIRBYTE_LOGGING_BEHAVIOR
UserCode->>Logs: get_global_file_logger() / get_global_stats_logger() / new_passthrough_file_logger()
Logs->>Logs: Parse AIRBYTE_LOGGING_BEHAVIOR
Logs->>Handlers: Select and create appropriate handlers (File, Console, or both)
Logs->>Logger: Remove existing handlers
Logs->>Logger: Attach new handlers
Logger-->>UserCode: Configured logger instance
Would you like to see a more detailed breakdown of the new helper functions or perhaps a comparison between the old and new logging flows, wdyt? 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (2)📓 Common learnings
airbyte/logs.py (15)
🧬 Code Graph Analysis (1)airbyte/logs.py (1)
🔇 Additional comments (8)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (7)
airbyte/logs.py (5)
11-11
: Fix line length to comply with the 100-character limit.The line exceeds the maximum allowed length. Would you consider breaking it into multiple lines, wdyt?
-PyAirbyte supports different logging behaviors controlled by the `AIRBYTE_LOGGING_BEHAVIOR` environment -variable: +PyAirbyte supports different logging behaviors controlled by the `AIRBYTE_LOGGING_BEHAVIOR` +environment variable:
309-320
: Usehandlers.extend()
instead of repeatedappend()
calls.For better code style and efficiency, would you consider using
extend()
to add multiple handlers at once, wdyt?def _get_global_handlers() -> list[logging.Handler]: handlers: list[logging.Handler] = [] match AIRBYTE_LOGGING_BEHAVIOR: case LoggingBehavior.FILE_ONLY: handlers.append(_get_global_file_handler()) case LoggingBehavior.CONSOLE_ONLY: handlers.append(_get_console_handler()) case LoggingBehavior.FILE_AND_CONSOLE: - handlers.append(_get_global_file_handler()) - handlers.append(_get_console_handler()) + handlers.extend([_get_global_file_handler(), _get_console_handler()]) return [h for h in handlers if h is not None]
323-333
: Usehandlers.extend()
for consistency.Similar to the previous function, using
extend()
would be more concise here too, wdyt?def _get_global_stats_handlers() -> list[logging.Handler]: handlers: list[logging.Handler] = [] match AIRBYTE_LOGGING_BEHAVIOR: case LoggingBehavior.FILE_ONLY: handlers.append(_get_global_stats_file_handler()) case LoggingBehavior.CONSOLE_ONLY: handlers.append(_get_console_handler()) case LoggingBehavior.FILE_AND_CONSOLE: - handlers.append(_get_global_stats_file_handler()) - handlers.append(_get_console_handler()) + handlers.extend([_get_global_stats_file_handler(), _get_console_handler()]) return [h for h in handlers if h is not None]
336-346
: Usehandlers.extend()
for consistency.Let's maintain the same pattern here as well, wdyt?
def _get_passthrough_handlers(connector_name: str) -> list[logging.Handler]: handlers: list[logging.Handler] = [] match AIRBYTE_LOGGING_BEHAVIOR: case LoggingBehavior.FILE_ONLY: handlers.append(_get_passthrough_file_handler(connector_name)) case LoggingBehavior.CONSOLE_ONLY: handlers.append(_get_console_handler()) case LoggingBehavior.FILE_AND_CONSOLE: - handlers.append(_get_passthrough_file_handler(connector_name)) - handlers.append(_get_console_handler()) + handlers.extend([_get_passthrough_file_handler(connector_name), _get_console_handler()]) return [h for h in handlers if h is not None]
428-430
: Add blank line before function definition.The linter expects 2 blank lines before a function definition at the module level.
return folder / "airbyte-stats.log" + def _remove_all_handlers(logger: logging.Logger) -> None: """Remove all handlers from a logger.""" logger.handlers.clear()
tests/unit_tests/test_logs.py (2)
17-17
: Consider makingconnector_name
a constant.Since this is a test constant used throughout the file, would it be clearer to follow Python naming conventions and make it uppercase, wdyt?
-connector_name = "test_connector" +CONNECTOR_NAME = "test_connector"Note: You'll also need to update all references to
connector_name
toCONNECTOR_NAME
throughout the file.
21-23
: Remove unnecessary f-string prefixes.These strings don't contain any placeholders, so the
f
prefix is not needed.logging.getLogger(f"airbyte").handlers.clear() logging.getLogger(f"airbyte.stats").handlers.clear() - logging.getLogger(f"airbyte.{connector_name}").handlers.clear() + logging.getLogger("airbyte").handlers.clear() + logging.getLogger("airbyte.stats").handlers.clear() + logging.getLogger(f"airbyte.{connector_name}").handlers.clear()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/logs.py
(8 hunks)tests/unit_tests/test_logs.py
(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
tests/unit_tests/test_logs.py (2)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-08-31T01:20:08.405Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-10-08T15:34:31.026Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
airbyte/logs.py (10)
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-10-08T15:34:31.026Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-08-31T05:02:20.192Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
🧬 Code Graph Analysis (2)
tests/unit_tests/test_logs.py (1)
airbyte/logs.py (5)
LoggingBehavior
(36-40)_parse_logging_behavior
(48-54)get_global_file_logger
(165-216)get_global_stats_logger
(220-253)new_passthrough_file_logger
(256-306)
airbyte/logs.py (1)
airbyte/constants.py (1)
_str_to_bool
(63-65)
🪛 GitHub Actions: Run Linters
tests/unit_tests/test_logs.py
[error] 21-21: f-string without any placeholders (F541). Remove extraneous f
prefix.
[error] 22-22: f-string without any placeholders (F541). Remove extraneous f
prefix.
airbyte/logs.py
[warning] 11-11: Line too long (103 > 100) (E501)
[error] 317-318: Use handlers.extend(...)
instead of repeatedly calling handlers.append()
(FURB113)
[error] 331-332: Use handlers.extend(...)
instead of repeatedly calling handlers.append()
(FURB113)
[error] 344-345: Use handlers.extend(...)
instead of repeatedly calling handlers.append()
(FURB113)
[error] 428-428: Expected 2 blank lines, found 1 (E302)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte/logs.py (1)
43-45
: Import_str_to_bool
fromairbyte.constants
instead of duplicating it.This function already exists in
airbyte/constants.py
. To avoid code duplication and maintain consistency, would you consider importing it instead, wdyt?+from airbyte.constants import _str_to_bool from airbyte_cdk.utils.datetime_helpers import ab_datetime_now class LoggingBehavior(enum.Enum): """Enumeration for PyAirbyte logging behavior.""" FILE_ONLY = "FILE_ONLY" CONSOLE_ONLY = "CONSOLE_ONLY" FILE_AND_CONSOLE = "FILE_AND_CONSOLE" -def _str_to_bool(value: str) -> bool: - """Convert a string value of an environment values to a boolean value.""" - return bool(value) and value.lower() not in {"", "0", "false", "f", "no", "n", "off"} - - def _parse_logging_behavior(value: str) -> LoggingBehavior:⛔ Skipped due to learnings
Learnt from: aaronsteers PR: airbytehq/PyAirbyte#349 File: airbyte/constants.py:48-50 Timestamp: 2024-08-31T05:02:20.192Z Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
Learnt from: aaronsteers PR: airbytehq/PyAirbyte#349 File: airbyte/constants.py:48-50 Timestamp: 2024-10-08T15:34:31.026Z Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
tests/unit_tests/test_logs.py (2)
26-49
: Comprehensive test coverage for logging behavior parsing!Great job on the thorough parameterized tests covering various input cases including edge cases like empty strings, invalid values, and case variations.
282-312
: Well-structured test for passthrough logger with all behavior modes!The parameterized test effectively validates logger creation across all logging behaviors and structured logging states. Excellent coverage!
012282d
to
46901e1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
tests/unit_tests/test_logs.py (2)
76-134
: Consider adding assertions for logger configurationThe parametrized test structure is excellent, but I notice it's only checking handler types and formatter formats. Would it be valuable to also verify the logger level and propagation settings? The actual implementation sets
logger.setLevel(logging.INFO)
andlogger.propagate = False
- wdyt about adding assertions for these configurations?assert logger is not None assert logger.name == "airbyte" + assert logger.level == logging.INFO + assert logger.propagate is False assert len(logger.handlers) == len(handlers)
20-26
: Consider extracting the connector name constantThe
clear_logger_caches()
function uses the globalconnector_name
variable, but it would be cleaner to parameterize this or make it more explicit. Since this is just a test utility function, the current approach is acceptable, but wdyt about making the dependency more explicit?-def clear_logger_caches(): +def clear_logger_caches(connector_name: str = "test_connector"): get_global_file_logger.cache_clear() get_global_stats_logger.cache_clear() - logging.getLogger(f"airbyte").handlers.clear() - logging.getLogger(f"airbyte.stats").handlers.clear() - logging.getLogger(f"airbyte.{connector_name}").handlers.clear() + logging.getLogger("airbyte").handlers.clear() + logging.getLogger("airbyte.stats").handlers.clear() + logging.getLogger(f"airbyte.{connector_name}").handlers.clear()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/logs.py
(8 hunks)tests/unit_tests/test_logs.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte/logs.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
tests/unit_tests/test_logs.py (2)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-08-31T01:20:08.405Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-10-08T15:34:31.026Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
🧬 Code Graph Analysis (1)
tests/unit_tests/test_logs.py (1)
airbyte/logs.py (5)
LoggingBehavior
(36-41)_parse_logging_behavior
(49-55)get_global_file_logger
(166-217)get_global_stats_logger
(221-254)new_passthrough_file_logger
(257-307)
🪛 GitHub Actions: Run Linters
tests/unit_tests/test_logs.py
[error] 23-23: f-string without any placeholders (F541) reported by ruff. Remove extraneous f
prefix.
[error] 24-24: f-string without any placeholders (F541) reported by ruff. Remove extraneous f
prefix.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (4)
tests/unit_tests/test_logs.py (4)
29-50
: Excellent test coverage for the parsing functionThe parametrized tests thoroughly cover case insensitivity, invalid inputs, and edge cases for the
_parse_logging_behavior
function. The test cases align well with the implementation's behavior of defaulting toFILE_ONLY
for invalid inputs.
147-167
: Verify the directory creation failure test logicThe test expects different behaviors based on the logging configuration when directory creation fails. For
FILE_ONLY
scenarios, it expectsNone
when no handlers are available, but for scenarios with console handlers, it expects a valid logger. This matches the implementation's fallback behavior - nice attention to detail! The warning suppression is also appropriate since the implementation likely emits warnings for directory creation failures.
225-262
: Stats logger always uses structured formatI notice that the stats logger test always expects
%(message)s
format regardless of theairbyte_structured_logging
parameter. Looking at the implementation inairbyte/logs.py
, the stats logger indeed always uses structured logging format. The test correctly reflects this behavior.
1-469
: Outstanding test coverage for the logging systemThis test file provides comprehensive coverage of the new logging behaviors introduced in the PR. The use of parametrized tests, proper mocking of filesystem operations, and testing of edge cases (directory creation failures, missing logging root) demonstrates excellent testing practices. The tests align well with the implementation's behavior and the PR objectives.
da54dbb
to
4fea0b7
Compare
4fea0b7
to
752671c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
airbyte/logs.py (1)
175-180
: Consider adding a debug log for handler selectionThe handler selection logic is clean, but it might be helpful to add some debug logging to indicate when no handlers are available, wdyt? This could help with troubleshooting configuration issues.
handlers = _get_global_handlers() if len(handlers) == 0: + # Consider adding debug logging here for troubleshooting return None
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
airbyte/logs.py
(8 hunks)tests/integration_tests/test_lowcode_connectors.py
(1 hunks)tests/unit_tests/test_logs.py
(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
tests/unit_tests/test_logs.py (12)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:26-26
Timestamp: 2024-10-11T22:05:15.550Z
Learning: In the PyAirbyte project, when reviewing Python code and encountering a TODO comment without an issue link, I should post a friendly reminder to resolve it before merging, instead of suggesting to add an issue link.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:26-26
Timestamp: 2024-10-06T17:43:09.319Z
Learning: In the PyAirbyte project, when reviewing Python code and encountering a TODO comment without an issue link, I should post a friendly reminder to resolve it before merging, instead of suggesting to add an issue link.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-08-31T01:20:08.405Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:48-48
Timestamp: 2024-10-08T15:34:31.026Z
Learning: Test fixtures in the PyAirbyte project do not need to align with real Docker repositories.
airbyte/logs.py (14)
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:29.133Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-07-09T06:38:54.843Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-07-09T06:37:48.088Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:0-0
Timestamp: 2024-10-18T07:00:43.413Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: Suraj-Vishwakarma70
PR: airbytehq/PyAirbyte#285
File: airbyte/sources/base.py:104-113
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In the PyAirbyte project, print statements are preferred over logging for consistency.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:26-26
Timestamp: 2024-10-06T17:43:09.319Z
Learning: In the PyAirbyte project, when reviewing Python code and encountering a TODO comment without an issue link, I should post a friendly reminder to resolve it before merging, instead of suggesting to add an issue link.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:26-26
Timestamp: 2024-10-11T22:05:15.550Z
Learning: In the PyAirbyte project, when reviewing Python code and encountering a TODO comment without an issue link, I should post a friendly reminder to resolve it before merging, instead of suggesting to add an issue link.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:47-47
Timestamp: 2024-10-08T15:34:31.026Z
Learning: When reviewing changes in test fixtures, especially renaming, consider that they might be due to fixing copy-paste errors and may not impact core codepaths.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#347
File: tests/integration_tests/fixtures/registry.json:47-47
Timestamp: 2024-08-31T00:58:32.484Z
Learning: When reviewing changes in test fixtures, especially renaming, consider that they might be due to fixing copy-paste errors and may not impact core codepaths.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-10-08T15:34:31.026Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#349
File: airbyte/constants.py:48-50
Timestamp: 2024-08-31T05:02:20.192Z
Learning: The `_str_to_bool` function in `airbyte/constants.py` is intended to handle only `'false'` and `'0'` as falsey values in environment variables.
🧬 Code Graph Analysis (2)
tests/unit_tests/test_logs.py (1)
airbyte/logs.py (5)
LoggingBehavior
(36-41)_parse_logging_behavior
(49-55)get_global_file_logger
(166-217)get_global_stats_logger
(221-254)new_passthrough_file_logger
(257-307)
airbyte/logs.py (1)
airbyte/constants.py (1)
_str_to_bool
(63-65)
🪛 GitHub Actions: Run Linters
tests/integration_tests/test_lowcode_connectors.py
[error] 1-1: Ruff formatting check failed. File would be reformatted. Run 'ruff format --fix' to fix code style issues.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (13)
airbyte/logs.py (7)
36-41
: LGTM on the LoggingBehavior enum!Clean enum definition that clearly defines the three logging modes. The string values match the enum names, which makes the environment variable configuration intuitive.
49-55
: Solid parsing logic with good fallback behavior!The case-insensitive parsing using
value.upper()
is user-friendly, and the graceful fallback toFILE_ONLY
on invalid values ensures robustness. The try-except pattern handles the ValueError appropriately.
70-84
: Well-documented configuration with clear defaults!The environment variable configuration is well-documented with comprehensive docstring explaining all valid values and fallback behavior. The default of
FILE_ONLY
maintains backward compatibility.
312-320
: Excellent use of match statement for handler selection!The match statement provides clean, readable logic for selecting handlers based on logging behavior. The list comprehension at the end efficiently filters out None handlers.
405-406
: Simple and effective console handler implementation!Using
sys.stdout
for console output is the right choice - it ensures logs go to stdout rather than stderr, which aligns with typical logging practices for application output.
427-429
: Clean utility function for handler management!The
_remove_all_handlers
function provides a clean way to reset logger handlers before configuring new ones. Usinghandlers.clear()
is more efficient than removing handlers individually.
384-402
: Confirm passthrough handler OSError propagation
The call tonew_passthrough_file_logger
in ConnectorBase (airbyte/_connector_base.py:78
) is not wrapped in a try/except, so anOSError
frommkdir
will bubble up and abort initialization. Since the unit tests intests/unit_tests/test_logs.py
explicitly expect and assert this error, it looks intentional and consistent with the design. No changes required here—wdyt?tests/unit_tests/test_logs.py (6)
20-26
: Well-organized test setup with proper cache clearing!The
clear_logger_caches
function properly resets both the LRU caches and logger handlers between tests, preventing test pollution. This is crucial for isolated test execution.
29-50
: Comprehensive test coverage for parsing behavior!The parametrized tests cover all valid logging behavior values with different case variations, plus edge cases for invalid inputs. The test confirms that invalid values correctly fall back to
FILE_ONLY
.
76-134
: Excellent parametrized testing for different logging configurations!The comprehensive matrix testing covers all combinations of logging behavior, structured logging settings, and expected outcomes. The assertions verify both handler types and formatter formats, ensuring the configuration works correctly across all scenarios.
147-167
: Thorough error handling validation!The directory creation failure tests properly simulate OSError conditions and verify that the system gracefully handles failures by falling back to console-only logging when file logging fails. The use of
warnings.catch_warnings()
prevents test noise from expected warning messages.
430-438
: Good coverage of passthrough logger error handling!The test correctly expects
OSError
for directory creation failures in the passthrough logger, which aligns with the intentional design choice to preserve the existing error handling behavior mentioned in the PR description.
1-469
: Outstanding test coverage overall!This test suite provides excellent coverage of the new logging system:
- All logging behavior modes are tested
- Both structured and plain text logging are covered
- Error conditions are properly tested
- Edge cases like missing logging root are handled
- Proper mocking isolates the tests from filesystem dependencies
The test organization with separate classes for each logger type makes the suite maintainable and easy to understand. Great work on ensuring the logging refactoring is well-tested!
074ca7f
to
752671c
Compare
/test-pr
|
/test-pr
|
/test-pr Edit:
|
/test-pr
|
Summary
Implements issue 630.
Without clear direction, I went with "Proposal 1: Choose logger behavior using environment variables".
Implementation details
Code
The code is separated in different commit with clear (hopefully) responsibility and modification.
Especially the first commit splits the code of
get_global_file_logger
in different functions. This is to prepare the work on have multiple handlers and make it easier to review the actual feature being introduced.Testing
The last commit is adding unit tests over the
logs.py
file. I believe it helps understand how everything behaves.It also highlight a behavior that feels weird to me.
The passthrough file logger throws an exception when the folder that should contain it logs cannot be created, while PyAirbyte doesn't throw an exception when
airbyte.logs.AIRBYTE_LOGGING_ROOT
does not exists.I believe it should behaves the same.
In contrast, the "global stats logger" and "global file logger" have a unique behavior.
If we want to change that, I can quickly create a separate PR.
Summary by CodeRabbit