-
Notifications
You must be signed in to change notification settings - Fork 30
feat: share API budget with parent streams #828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/have_parent_streams_use_api_budget#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/have_parent_streams_use_api_budgetHelpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
Fixed
Show fixed
Hide fixed
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
Fixed
Show fixed
Hide fixed
📝 WalkthroughWalkthroughThe ModelToComponentFactory now accepts and stores an optional Changes
Sequence DiagramsequenceDiagram
participant Test as Test / Caller
participant Factory as ModelToComponentFactory
participant SubFactory as Nested ModelToComponentFactory
participant Retriever as HTTP Retriever
Test->>Factory: __init__(..., api_budget=budget) / set_api_budget(...)
activate Factory
Factory->>Factory: store self._api_budget = api_budget
deactivate Factory
Test->>Factory: build components / create_substreams()
activate Factory
Factory->>SubFactory: __init__(..., api_budget=self._api_budget)
activate SubFactory
SubFactory->>SubFactory: store self._api_budget = api_budget
deactivate SubFactory
Factory->>Retriever: instantiate retriever with budget policies from self._api_budget
deactivate Factory
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
684-684: Nice addition of theapi_budgetparameter!The implementation looks good. Would it be helpful to add a brief docstring explaining when and why this parameter should be set? For instance, documenting that it's used to enforce rate limits across parent and child streams. Wdyt?
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (2)
11-11: Verify whether the Mock import is needed.I notice
Mockis imported fromunittest.mockbut don't see it used in the changed test code. Was this import added for future test cases, or could it be removed? wdyt?Run this script to check if Mock is used elsewhere in the test file:
#!/bin/bash # Check for Mock usage in the test file rg -n '\bMock\b' unit_tests/sources/declarative/parsers/test_model_to_component_factory.py | grep -v "^11:"
780-786: Consider whether accessing private attributes in assertions is the best approach.The assertions verify API budget propagation by accessing
._policies, which is a private attribute (indicated by the underscore prefix). While this is common in tests, it couples the test to internal implementation details. Would it be better to verify the behavior through public APIs or check for the existence of_api_budgetitself rather than drilling into_policies? For example:- # ensure api budget - assert get_retriever( - parent_stream_configs[0].stream - ).requester._http_client._api_budget._policies - assert get_retriever( - parent_stream_configs[1].stream - ).requester._http_client._api_budget._policies + # ensure api budget is set on parent streams + parent_0_budget = get_retriever(parent_stream_configs[0].stream).requester._http_client._api_budget + assert parent_0_budget is not None + assert isinstance(parent_0_budget, APIBudget) + + parent_1_budget = get_retriever(parent_stream_configs[1].stream).requester._http_client._api_budget + assert parent_1_budget is not None + assert isinstance(parent_1_budget, APIBudget)This would be slightly more resilient to internal refactoring while still verifying the PR objective. wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py(3 hunks)unit_tests/sources/declarative/parsers/test_model_to_component_factory.py(4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Applied to files:
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
📚 Learning: 2024-11-18T23:40:06.391Z
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Applied to files:
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
📚 Learning: 2025-01-14T00:20:32.310Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
Applied to files:
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
🧬 Code graph analysis (2)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (3)
unit_tests/connector_builder/test_connector_builder_handler.py (2)
streams(834-835)get_retriever(442-443)airbyte_cdk/sources/streams/call_rate.py (1)
APIBudget(513-627)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
set_api_budget(4256-4259)create_component(827-860)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
airbyte_cdk/sources/streams/call_rate.py (1)
APIBudget(513-627)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-intercom
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-shopify
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: SDM Docker Image Build
- GitHub Check: Manifest Server Docker Image Build
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Analyze (python)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
699-699: Clean type simplification!The type change looks good—removing the single-element
Unionmakes the type hint cleaner. The assignment correctly initializes the budget from the constructor parameter.
3891-3891: Perfect propagation of the API budget!This correctly passes the
api_budgetto the nested factory when creating parent streams, which achieves the PR objective of sharing API budgets between parent and child streams to avoid rate limits. The implementation ensures that parent streams inherit the same rate limiting configuration.unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (1)
748-772: Nice job creating a new factory instance for test isolation!Creating a dedicated
ModelToComponentFactoryinstance and configuring it withset_api_budgetbefore creating the partition router ensures this test doesn't affect others. The budget configuration looks comprehensive with the MovingWindowCallRatePolicy setup.
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
Show resolved
Hide resolved
PyTest Results (Fast)282 tests - 3 535 272 ✅ - 3 533 1m 34s ⏱️ - 5m 9s For more details on these failures, see this check. Results for commit d0ca8c6. ± Comparison against base commit 6504148. This pull request removes 3535 tests.♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-01-13T23:39:15.457Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Applied to files:
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
🧬 Code graph analysis (1)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
set_api_budget(4256-4259)create_component(827-860)unit_tests/connector_builder/test_connector_builder_handler.py (1)
get_retriever(442-443)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-shopify
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Check: source-intercom
- GitHub Check: Check: destination-motherduck
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
- GitHub Check: Analyze (python)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Manifest Server Docker Image Build
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
🔇 Additional comments (1)
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py (1)
747-765: API budget setup looks good! Quick question about the empty matchers list.The API budget configuration is set up correctly. I notice the
matcherslist is empty (line 760), which means this policy will apply to all HTTP requests made by the parent streams. Is this intentional for this test, or would it be more realistic to include a matcher? Looking at other tests liketest_api_budget()(line 4220-4226), they specify matchers to target specific endpoints. Wdyt?
unit_tests/sources/declarative/parsers/test_model_to_component_factory.py
Show resolved
Hide resolved
PyTest Results (Full)3 820 tests 3 808 ✅ 11m 22s ⏱️ Results for commit d0ca8c6. |
What
We thought it would address the CDK part of https://github.com/airbytehq/oncall/issues/9874 but in the end it was just a misconfiguration of the PEP stream having 403 needed to be consider as RATE_LIMITED.
Today, API budgets are only applied to the child streams. It is important to avoid rate limit to have this also shared on the parent stream.
How
Add
api_budgetin the constructor of ModelToComponentFactory and pass it when creating the factory for the parent stream.Summary by CodeRabbit
New Features
Tests