-
Notifications
You must be signed in to change notification settings - Fork 69
feat(integrations): Add Sonar connector support with git submodule #873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Add new Integration class that inherits from ConnectorBase - Create SonarExecutor for YAML-based connector execution - Add get_integration() factory function - Support Sonar verbs (get, list, create, update, delete, retrieve, search) - Add connector-sdk as optional dependency in integrations group - Include comprehensive unit tests for Integration and SonarExecutor - Update ConnectorBase to support 'integration' connector type - Add asyncio pytest marker to configuration This enables PyAirbyte to run Sonar YAML-only connectors as a new connector type called 'integrations', which use different verbs than traditional sources/destinations and leverage the Sonar connector-sdk for dynamic execution. Co-Authored-By: AJ Steers <[email protected]>
Original prompt from AJ Steers |
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This PyAirbyte VersionYou can test this version of PyAirbyte using the following: # Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1763667751-add-sonar-integration-support' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1763667751-add-sonar-integration-support'Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
Community SupportQuestions? Join the #pyairbyte channel in our Slack workspace. |
PyTest Results (Fast Tests Only, No Creds)30 tests - 290 29 ✅ - 291 3s ⏱️ - 5m 54s For more details on these failures, see this check. Results for commit 1361ab1. ± Comparison against base commit 2981b3d. This pull request removes 320 and adds 30 tests. Note that renamed tests count towards both.♻️ This comment has been updated with latest results. |
PyTest Results (Full)422 tests +33 401 ✅ +28 28m 24s ⏱️ + 1m 18s For more details on these failures, see this check. Results for commit 1361ab1. ± Comparison against base commit 2981b3d. ♻️ This comment has been updated with latest results. |
📝 WalkthroughWalkthroughAdds Sonar-based YAML integrations: new SonarExecutor, Integration wrapper, and get_integration factory; exposes integrations and types at the package top-level; extends ConnectorBase.connector_type to include "integration"; adds unit tests, a sonar submodule, project deps, and CI/workflow updates for submodule checkout. Changes
Sequence Diagram(s)sequenceDiagram
actor Dev
participant Factory as get_integration
participant Int as Integration
participant SE as SonarExecutor
participant SDK as connector-sdk
rect rgb(235,245,235)
Note over Dev,Factory: Factory creates integration
Dev->>Factory: get_integration(name, yaml_path, ...)
Factory->>SE: SonarExecutor(...)
Factory-->>Int: Integration(SE, name, yaml_path)
end
rect rgb(235,245,255)
Note over Dev,SDK: Lazy YAML load & resource discovery
Dev->>Int: list_resources()
Int->>SE: _get_executor() (lazy) / request YAML
SE->>SDK: config_loader.load_connector_config()
SDK-->>SE: config
SE-->>Int: resources
Int-->>Dev: resources
end
rect rgb(255,245,235)
Note over Dev,SDK: Async execution flow
Dev->>Int: aexecute(resource,verb,params)
Int->>SE: aexecute(...)
SE->>SDK: executor.aexecute(...)
SDK-->>SE: result
SE-->>Int: result
Int-->>Dev: result
end
rect rgb(255,245,245)
Note over Dev,Int: Sync-to-async bridge
Dev->>Int: execute(...)
alt no running event loop
Int->>Int: asyncio.run(aexecute(...))
Int-->>Dev: result
else running event loop
Int-->>Dev: raises PyAirbyteInputError
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45–60 minutes
Possibly related PRs
Suggested reviewers
Would you like me to flag specific test cases or source lines for deeper inspection, wdyt? Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
🔇 Additional comments (5)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
pyproject.toml (1)
58-60: Consider long‑term ergonomics of the localconnector-sdkpath dependencyUsing
connector-sdk = {path = "../sonar/connector-sdk", develop = true}under anintegrationsgroup is nice for local development, but it assumes the Sonar repo is checked out as a sibling directory. For contributors or CI setups that don’t mirror that layout,poetry install --with integrationswill fail until they manually arrange the directory. Would it be worth either (a) documenting this expectation explicitly, or (b) planning a follow‑up to switch to a VCS/PyPI requirement once connector-sdk is published and versioned? wdyt?airbyte/_executors/sonar.py (1)
100-127: Mark intentionally unused parameters to keep linters quietIn
execute()andensure_installation(), several parameters (args,stdin,suppress_stderr,auto_fix) are intentionally unused, which may triggerARG002warnings under ruff. Would you be open to making that intent explicit, e.g.:def execute( self, args: list[str], *, - stdin: Any = None, # noqa: ANN401 - suppress_stderr: bool = False, + stdin: Any = None, # noqa: ANN401 + suppress_stderr: bool = False, ) -> Iterator[str]: """Execute is not supported for Sonar connectors.""" - """Execute is not supported for Sonar connectors. + _ = (args, stdin, suppress_stderr) + """Execute is not supported for Sonar connectors. @@ def ensure_installation(self, *, auto_fix: bool = True) -> None: """Ensure connector is available (no-op for YAML connectors).""" - pass + _ = auto_fix + pass(or alternatively adding
# noqa: ARG002on those parameters)? That would keep static checks clean without changing behavior. wdyt?Also applies to: 163-173
airbyte/integrations/base.py (1)
69-147: Maybe centralizeload_connector_configinto a module-level helper to simplify imports and testing?Right now
_validate_yaml,list_resources, andlist_verbseach do their own inline:from connector_sdk.config_loader import load_connector_configand then call it. That keeps the dependency optional, but it also (a) repeats the import logic, and (b) makes it hard for tests to patch
airbyte.integrations.base.load_connector_configas they’re currently written, since that symbol doesn’t actually exist at module scope.Would you consider introducing a single module-level helper or alias (e.g. a
load_connector_configvariable set inside atry/except ImportErrorthat’sNonewhen connector-sdk isn’t installed) and having these methods call through it? That would both de-duplicate the import/error-handling logic and give tests a stable patch target without changing the external Integration API, wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (10)
airbyte/__init__.py(5 hunks)airbyte/_connector_base.py(1 hunks)airbyte/_executors/sonar.py(1 hunks)airbyte/integrations/__init__.py(1 hunks)airbyte/integrations/base.py(1 hunks)airbyte/integrations/util.py(1 hunks)pyproject.toml(2 hunks)tests/unit_tests/integrations/__init__.py(1 hunks)tests/unit_tests/integrations/test_integration.py(1 hunks)tests/unit_tests/integrations/test_sonar_executor.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
airbyte/integrations/__init__.py (2)
airbyte/integrations/base.py (1)
Integration(18-288)airbyte/integrations/util.py (1)
get_integration(16-64)
tests/unit_tests/integrations/test_sonar_executor.py (3)
airbyte/_executors/sonar.py (10)
SonarExecutor(20-195)_get_executor(63-98)_cli(101-107)execute(109-127)aexecute(129-146)aexecute_batch(148-161)ensure_installation(163-165)install(167-169)uninstall(171-173)get_installed_version(175-195)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)airbyte/integrations/base.py (4)
execute(177-210)aexecute(149-175)execute_batch(242-264)aexecute_batch(212-240)
airbyte/integrations/util.py (2)
airbyte/_executors/sonar.py (1)
SonarExecutor(20-195)airbyte/integrations/base.py (1)
Integration(18-288)
tests/unit_tests/integrations/test_integration.py (4)
airbyte/_executors/sonar.py (4)
SonarExecutor(20-195)aexecute(129-146)execute(109-127)aexecute_batch(148-161)airbyte/integrations/base.py (9)
Integration(18-288)_validate_yaml(69-91)list_resources(93-111)list_verbs(113-147)aexecute(149-175)execute(177-210)aexecute_batch(212-240)execute_batch(242-264)check(266-288)airbyte/integrations/util.py (1)
get_integration(16-64)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)
airbyte/_executors/sonar.py (5)
airbyte/_executors/base.py (1)
Executor(159-248)airbyte/registry.py (1)
ConnectorMetadata(65-96)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)airbyte/integrations/base.py (4)
execute(177-210)aexecute(149-175)aexecute_batch(212-240)execute_batch(242-264)tests/unit_tests/integrations/test_sonar_executor.py (1)
executor(31-37)
airbyte/__init__.py (2)
airbyte/integrations/base.py (1)
Integration(18-288)airbyte/integrations/util.py (1)
get_integration(16-64)
airbyte/integrations/base.py (2)
airbyte/_connector_base.py (3)
ConnectorBase(54-508)name(84-86)check(327-373)airbyte/_executors/sonar.py (4)
SonarExecutor(20-195)aexecute(129-146)execute(109-127)aexecute_batch(148-161)
🔇 Additional comments (8)
tests/unit_tests/integrations/__init__.py (1)
1-2: Test package header and docstring look solidThe minimal module header and focused docstring are exactly what we need here; no further changes from my side.
pyproject.toml (1)
95-102: Async pytest marker looks good; just ensure plugin wiring is in placeThe
"asyncio"marker description matches howpytest-asyncioworks (“automatically applied by pytest-asyncio”), so the config reads cleanly. Can you double‑check thatpytest-asyncio(or an equivalent plugin providing this marker) is always installed in environments running these tests so we don’t hit unknown‑marker warnings if someone runs pytest without extras? wdyt?airbyte/_executors/sonar.py (2)
27-62: Constructor and YAML existence check read cleanlyThe constructor wiring and early
yaml_pathexistence guard make the failure mode clear and user‑friendly, especially withinput_valuepopulated. I don’t see any changes needed here.
63-97: Runtime coupling toconnector-sdkAPI merits a quick real‑world verificationThe lazy creation of
ConnectorExecutorand the optional dependency behavior look well thought out, and delegatingaexecute/aexecute_batchplus version detection to connector-sdk keeps this thin. Since the type relationship toconnector-sdkis only duck‑typed, could we explicitly smoke‑test against a real Sonar YAML once available to confirm that:
ConnectorExecutor.execute(resource, verb, params)and.execute_batch(operations)are indeed awaitable with these exact signatures, andload_connector_config(...).connector.versionis the right field to read for versioning?That would catch any upstream API drift early, given this isn’t enforced at type‑check time. Does that fit with your planned manual verification for this PR? wdyt?
Also applies to: 129-162, 175-195
airbyte/_connector_base.py (1)
57-57:connector_typeliteral widening aligns with the newIntegrationAllowing
"integration"in theconnector_typeliteral makes sense for the new Integration subclass and keeps type hints accurate. Givendocs_urlbuildshttps://docs.airbyte.com/integrations/{self.connector_type}s/..., that will now resolve to an.../integrations/...path for integrations—can you confirm that’s the intended docs location for this new connector type, or should we special‑case integrations later if the docs hierarchy differs? wdyt?airbyte/integrations/__init__.py (1)
1-11: Clean public facade for integrationsRe‑exporting
Integrationandget_integrationhere with a concise module docstring makes the integrations surface discoverable and tidy; this looks good to me as‑is.airbyte/__init__.py (1)
135-136: Top-level Integration exports look consistentThe new imports and
__all__entries cleanly exposeIntegrationandget_integrationat the root package while still keeping the connector-sdk dependency lazy through the underlying modules, so I don’t see any issues here.Also applies to: 159-159, 181-182, 194-195, 203-204
airbyte/integrations/util.py (1)
1-69:get_integrationfactory wiring looks goodThe factory cleanly instantiates
SonarExecutorwith the logging flags and wraps it inIntegration, matching the executor’s constructor and the Integration API without forcing connector-sdk at import time, so this part looks solid to me.
- Add sonar repo as git submodule at ./sonar - Update connector-sdk dependency to use submodule path - Add pytest-asyncio to dev dependencies for async test execution - Configure Deptry to ignore connector-sdk in DEP002 rule - Exclude sonar submodule from ruff checks and pytest collection - Move ruff select rules to [lint] section per deprecation warning This is a temporary solution to make connector-sdk available locally. User will add GitHub secret to CI for submodule access. Co-Authored-By: AJ Steers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
.gitmodules (1)
1-3: Consider documenting submodule clone instructions.Adding a git submodule is solid for bringing in the Sonar connector-sdk. However, users cloning the repo may not realize they need to use
git clone --recursiveor rungit submodule update --init --recursiveto populate the submodule.Wdyt about adding a note to the README or CONTRIBUTING guide on cloning with submodules? This will help contributors get up and running smoothly.
.ruff.toml (1)
35-40: Remove duplicate "ICN" rule in the select list.Line 39 duplicates the "ICN" rule already present on line 35. While duplicates don't break linting, they're a minor code smell that can be cleaned up.
Apply this diff:
"I", # isort "ICN", # flake8-import-conventions "INP", # flake8-no-pep420 "INT", # flake8-gettext "ISC", # flake8-implicit-str-concat - "ICN", # flake8-import-conventions "LOG", # flake8-logging
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
.gitmodules(1 hunks).ruff.toml(1 hunks)pyproject.toml(4 hunks)sonar(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- sonar
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-10T16:17:57.989Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 417
File: airbyte/cli.py:503-504
Timestamp: 2024-10-10T16:17:57.989Z
Learning: In the PyAirbyte project, support for Python versions earlier than 3.10 is not necessary, as the project requires Python 3.10 or newer.
Applied to files:
.ruff.toml
🔇 Additional comments (1)
.ruff.toml (1)
6-8: Correctly excluding the sonar submodule from linting.Makes sense! Since the sonar submodule is an external project with its own code standards, excluding it from PyAirbyte's Ruff linting keeps the toolchain clean.
- Add GitHub App authentication to all pytest jobs - Configure checkout to use Octavia Bot token with access to sonar repo - Enable recursive submodule checkout - This allows CI to clone the private sonar submodule for connector-sdk Co-Authored-By: AJ Steers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/python_pytest.yml (1)
28-35: Consider using YAML anchors to reduce duplication of GitHub App authentication.The GitHub App authentication pattern is repeated identically across 4 jobs. This creates maintenance burden if credentials or permissions need to change later.
Would you consider refactoring using YAML anchors (e.g.,
&auth-stepand*auth-step) to DRY this up? For example:# At the top level (outside jobs): .github-app-auth: &github-app-auth - name: Authenticate as GitHub App uses: actions/create-github-app-token@67018539274d69449ef7c02e8e71183d1719ab42 id: get-app-token with: owner: "airbytehq" repositories: "PyAirbyte,sonar" app-id: ${{ secrets.OCTAVIA_BOT_APP_ID }} private-key: ${{ secrets.OCTAVIA_BOT_PRIVATE_KEY }} # Then in each job: steps: - *github-app-auth(Not critical for this PR, but something for a follow-up refactor, wdyt?)
Also applies to: 104-111, 193-200, 277-284
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/python_pytest.yml(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (1)
.github/workflows/python_pytest.yml (1)
38-40: Verify GitHub App has permission to recursively clone the sonar submodule.The
submodules: recursivecheckout is correctly configured across all jobs. However, ensure that the GitHub App's permissions include access to the sonar repository (which appears to be private based on the PR objectives). If the app's credentials lack the necessary permissions, submodule cloning will fail silently or with a cryptic error.Can you confirm that
OCTAVIA_BOT_*secrets are configured in the organization and the app has access to bothPyAirbyteandsonarrepositories?Also applies to: 114-116, 203-205, 287-289
| - name: Authenticate as GitHub App | ||
| uses: actions/create-github-app-token@67018539274d69449ef7c02e8e71183d1719ab42 # v2.1.4 | ||
| id: get-app-token | ||
| with: | ||
| owner: "airbytehq" | ||
| repositories: "PyAirbyte,sonar" | ||
| app-id: ${{ secrets.OCTAVIA_BOT_APP_ID }} | ||
| private-key: ${{ secrets.OCTAVIA_BOT_PRIVATE_KEY }} | ||
| - name: Checkout code | ||
| uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0 | ||
| with: | ||
| token: ${{ steps.get-app-token.outputs.token }} | ||
| submodules: recursive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify GitHub App token is available and properly scoped for fork PRs.
The GitHub App authentication pattern looks correct for accessing the private sonar submodule, but the pytest-no-creds job runs on fork PRs where the OCTAVIA_BOT_APP_ID and OCTAVIA_BOT_PRIVATE_KEY secrets won't be available. This will cause the create-github-app-token step to fail on fork PRs.
Do you want to add a conditional check (similar to the pytest job's fork logic) to skip the GitHub App token creation on forks, or does the action gracefully handle missing secrets?
I'll verify this concern by examining the workflow file and checking how the GitHub App token action handles missing secrets.
🏁 Script executed:
#!/bin/bash
# Inspect the full workflow file to understand pytest-no-creds job config
# and check for existing fork protection conditionals
cat .github/workflows/python_pytest.yml | head -n 350Length of output: 10302
🏁 Script executed:
#!/bin/bash
# Also check if there are any conditional checks (if:) in the workflow
rg -n "if:" .github/workflows/python_pytest.yml -C2Length of output: 2982
🏁 Script executed:
#!/bin/bash
# Search for pytest-no-creds job definition
rg -n "pytest-no-creds" .github/workflows/python_pytest.yml -A15Length of output: 1924
🌐 Web query:
actions/create-github-app-token missing secrets behavior fork PR
💡 Result:
Short answer: actions/create-github-app-token needs the app private key (a secret) to create an installation token — GitHub will NOT provide repository secrets to workflows triggered by a pull request from a fork, so the action will be unable to read the PRIVATE_KEY (and will fail or not produce a usable token) when the workflow runs on a fork PR. [1][2]
What to do (brief):
- If you must run with secrets for fork PRs, use pull_request_target (runs in base repo and can access secrets) — but avoid checking out or running untrusted PR code in that job because of security risks. [3]
- Alternatively require contributors to open branches in the same repo (not a fork) or have a maintainer re-run/approve the workflow so secrets are available.
- Or design the workflow to run a safe, unprivileged check for fork PRs and only run secret-requiring jobs after merge or via a maintainer-triggered workflow_run. [3][1]
Sources:
- actions/create-github-app-token repo (usage/inputs/notes). [1]
- GitHub docs: secrets are not passed to workflows triggered by a pull request from a fork. [2]
- GitHub security docs on pull_request vs pull_request_target and associated risks. [3]
References:
[1] actions/create-github-app-token (GitHub). (github.com)
[2] Understanding GitHub secret types / Secrets limitations (GitHub Docs). (docs.github.com)
[3] GitHub Security Lab / docs on pull_request_target and secrets (GitHub). (securitylab.github.com)
Citations:
- 1: https://github.com/actions/create-github-app-token?utm_source=openai
- 2: https://docs.github.com/en/code-security/getting-started/understanding-github-secret-types?utm_source=openai
- 3: https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/?utm_source=openai
Add fork protection conditionals to pytest-fast, pytest-no-creds, and dependency-analysis jobs.
Your concern is spot-on—but the issue affects more than pytest-no-creds. GitHub does NOT provide repository secrets to workflows triggered by a pull request from a fork, so all three jobs that use create-github-app-token will fail when triggered on fork PRs.
This is a design contradiction: the workflow comments explicitly state that pytest-no-creds exists to run tests "on forks, where secrets are not available," yet it now requires secrets for GitHub App authentication. The pytest job correctly protects against this with a fork conditional at lines 170–172, but pytest-fast, pytest-no-creds, and dependency-analysis lack this protection.
You'll want to add the same conditional to those three jobs:
if: >
(github.event_name == 'push' && github.ref == 'refs/heads/main') ||
(github.event.pull_request.head.repo.fork == false)Wdyt? Should pytest-no-creds perhaps have different logic if it's meant to be fork-friendly, or is the intent to require these jobs to only run on the main repo going forward?
🤖 Prompt for AI Agents
.github/workflows/python_pytest.yml lines 28-40: the jobs pytest-fast,
pytest-no-creds, and dependency-analysis call create-github-app-token and will
fail on forked PRs because secrets are not provided; add the same
fork-protection conditional used by the pytest job (the check that allows only
pushes to main or pull_request where github.event.pull_request.head.repo.fork ==
false) to each of those three job definitions so they are skipped for forked PR
runs.
- Add GitHub App authentication to all jobs that run poetry install - Configure checkout to use Octavia Bot token with access to sonar repo - Enable recursive submodule checkout in all workflows - Update repositories list to include both PyAirbyte and sonar - This allows CI to clone the private sonar submodule for connector-sdk Updated workflows: - python_lint.yml (3 jobs) - pydoc_preview.yml - pydoc_publish.yml - autofix.yml - test-pr-command.yml - fix-pr-command.yml Co-Authored-By: AJ Steers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📝 [actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:2:27: Double quote to prevent globbing and word splitting [shellcheck]
| run: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📝 [actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:4:26: Double quote to prevent globbing and word splitting [shellcheck]
| run: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📝 [actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:2:27: Double quote to prevent globbing and word splitting [shellcheck]
| run: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📝 [actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:4:26: Double quote to prevent globbing and word splitting [shellcheck]
| run: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/test-pr-command.yml (1)
79-93: Update checkout step to use the app token for consistent private submodule access.The original review comment is correct. I've verified that this is a bug: all 6 other workflows in
.github/workflows/that usesubmodules: recursiveconsistently usetoken: ${{ steps.get-app-token.outputs.token }}, buttest-pr-command.ymlis the only exception at line 91.Interestingly, the same workflow creates the app token at line 81 and uses it successfully at lines 108 and 158 for other steps—so the pattern is clear within this very file.
Should line 91 be updated to use
token: ${{ steps.get-app-token.outputs.token }}? This ensures the sonar submodule checkout has proper authentication to any private repositories, matching the established pattern across all other workflows. Wdyt?
🧹 Nitpick comments (1)
.github/workflows/pydoc_publish.yml (1)
44-46: Recursive submodule checkout may impact workflow performance and reliability.Using
submodules: recursivewill fetch all nested submodules. While this ensures complete checkout, it could:
- Slow down the workflow if there are many nested submodules
- Increase the likelihood of transient failures if any nested submodule is temporarily unavailable
Should we consider using a non-recursive checkout if only the top-level
sonarsubmodule is needed?- name: Checkout code uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0 with: token: ${{ steps.get-app-token.outputs.token }} - submodules: recursive + submodules: true
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.github/workflows/autofix.yml(1 hunks).github/workflows/fix-pr-command.yml(2 hunks).github/workflows/pydoc_preview.yml(1 hunks).github/workflows/pydoc_publish.yml(1 hunks).github/workflows/python_lint.yml(3 hunks).github/workflows/test-pr-command.yml(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (9)
.github/workflows/pydoc_preview.yml (1)
17-29: Verify that required GitHub App secrets are configured and have sufficient permissions.The changes correctly implement GitHub App authentication and recursive submodule checkout to support the new sonar submodule. However, this requires:
- Repository secrets
OCTAVIA_BOT_APP_IDandOCTAVIA_BOT_PRIVATE_KEYto be configured in the repository settings- The GitHub App to have read access to both "PyAirbyte" and "sonar" repositories
- The app token to have sufficient permissions (at minimum, read access to repository contents and metadata)
Could you verify that these prerequisites are in place, especially if this is the first workflow to use this GitHub App? Also, have you confirmed that similar configurations are applied consistently across other workflows (per the PR description, other workflow files receive similar updates)?
.github/workflows/python_lint.yml (2)
20-32: Verify GitHub App secrets are configured with appropriate permissions.The three jobs (ruff-lint-check, ruff-format-check, python-type-checks) now require
OCTAVIA_BOT_APP_IDandOCTAVIA_BOT_PRIVATE_KEYsecrets to authenticate and checkout the private sonar submodule.Before merging, confirm:
- These secrets exist in the repository settings
- The GitHub App has permissions to access both PyAirbyte and sonar repos
- The app token scope in the "repositories" field covers both repos (✓ already present:
"PyAirbyte,sonar")Without proper secrets, all linting jobs will fail during the GitHub App authentication step.
Wdyt on documenting these secret setup steps somewhere in the repo (e.g., CONTRIBUTING.md or in PR comments) for future maintainers?
30-32: Assess CI performance impact of recursive submodule checkout.The
submodules: recursiveflag will recursively clone all nested submodules. If the sonar repo itself has submodules, this could add non-trivial time to checkout across three jobs.Could you verify that:
- The sonar submodule doesn't have nested submodules that would bloat CI time?
- Or, if needed, consider
submodules: true(non-recursive) if only the top-level sonar submodule is required?Also applies to: 67-69, 100-102
.github/workflows/autofix.yml (2)
15-22: GitHub App authentication approach looks solid.The app token strategy with explicit repository access (both PyAirbyte and sonar) is the right way to handle private submodule checkout. The action version is properly pinned, and the secrets reference is clean.
23-27: Potential credential mismatch between checkout and push—wdyt?The checkout step uses the app token (from
steps.get-app-token), but the push step still uses the defaultsecrets.GITHUB_TOKEN. While this should work for the main PyAirbyte repo, I'm wondering if you should pass the app token to the push action for consistency, especially if the git configuration from the authenticated checkout expects the same credentials downstream.- name: Push changes uses: ad-m/github-push-action@77c5b412c50b723d2a4fbc6d71fb5723bcd439aa # v1.0.0 with: - github_token: ${{ secrets.GITHUB_TOKEN }} + github_token: ${{ steps.get-app-token.outputs.token }}This is particularly relevant if any git operations later read from the sonar submodule or if the workflow ever needs to modify submodule references. Have you tested this with the submodule in place, or is this a case where GITHUB_TOKEN is expected to be sufficient?
Also applies to: 50-53
.github/workflows/fix-pr-command.yml (2)
43-43: Adding "sonar" repository to token scope looks good.This change correctly grants the GitHub App permission to authenticate when cloning the private sonar submodule during checkout. Makes sense for the new dependency management approach.
Wdyt about verifying that the corresponding
.gitmodulesconfiguration is correctly set up elsewhere in the PR? That would confirm the end-to-end submodule setup is complete.
52-52: Recursive submodule checkout is consistent across CI workflows.Verification confirms that
submodules: recursiveand the sonar repository scope are already established in all major workflows (python_pytest.yml,test-pr-command.yml,python_lint.yml,pydoc_publish.yml,pydoc_preview.yml, andautofix.yml). This change aligns perfectly with the existing pattern, ensuring the sonar submodule and any nested dependencies are properly fetched during CI jobs..github/workflows/pydoc_publish.yml (2)
39-39: Scoping verified: repository names in token match submodule configuration.The verification confirms that the GitHub App token scoping is correct. The
sonarrepository name in the token scope matches the submodule configuration in.gitmodulesexactly (sourced fromhttps://github.com/airbytehq/sonar.git). The configuration is consistent and ready to go.
34-46: GitHub App authentication pattern is consistent with existing workflows; no action needed.The
OCTAVIA_BOT_APP_IDandOCTAVIA_BOT_PRIVATE_KEYsecrets are already configured and in active use across nine other workflows (python_pytest.yml, python_lint.yml, autofix.yml, pydoc_preview.yml, and others). The pydoc_publish.yml changes follow the same established pattern, so there's no risk of CI failures due to missing secrets.
- Add deptry exclusion for sonar submodule directory - Fix test patches to mock connector_sdk.config_loader.load_connector_config - This resolves CI failures in dependency-analysis and pytest-fast jobs Co-Authored-By: AJ Steers <[email protected]>
- Add .venv, sonar, and tests to deptry exclusions - Add connector-sdk to DEP004 ignore list (imported but declared as dev dependency) - This resolves the 19,214 deptry errors from scanning .venv and sonar submodule Co-Authored-By: AJ Steers <[email protected]>
- Add 'check-deps' poe task that runs 'deptry airbyte' to scan only the package - Update python_pytest.yml workflow to use 'poetry run poe check-deps' - This ensures deptry respects the exclude configuration in pyproject.toml Co-Authored-By: AJ Steers <[email protected]>
- Add DEP003 ignore for 'airbyte' to suppress false-positives for self-imports - Add DEP004 ignore for both 'connector-sdk' and 'connector_sdk' to handle module vs distribution name mismatch - This resolves all deptry errors: 350 DEP003 errors (pre-existing) and 6 DEP004 errors (connector-sdk) - Deptry now reports 'Success! No dependency issues found.' Co-Authored-By: AJ Steers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.github/workflows/python_pytest.yml(5 hunks)pyproject.toml(6 hunks)tests/unit_tests/integrations/test_integration.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- .github/workflows/python_pytest.yml
- tests/unit_tests/integrations/test_integration.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
pyproject.toml (3)
58-59: Integrations group with path dependency looks good—how will users install it?The setup is structurally correct:
connector-sdkas a path dependency withdevelop = trueis the right pattern for a temporary local/submodule dependency. However, users won't automatically get this group unless they explicitly install with--with integrationsor similar.Wdyt about clarifying in docs/README that users interested in Sonar integrations need to install with the integrations group? This could prevent confusion.
174-175: Sonar directory properly excluded from deptry analysis.The deptry configuration correctly excludes the sonar submodule, which aligns with the PR objective to keep connector-sdk (still in submodule form) out of dependency checks. ✓
183-184: Connector-SDK properly ignored in optional-dependency rules.Adding
connector-sdkto both DEP004 (misplaced) and DEP002 (unused) ignore lists is sensible: it's an optional, lazily-imported integration dependency. The comments clearly explain why it's ignored, which helps maintainability.Also applies to: 194-195
| pdoc = "^16.0.0" | ||
| poethepoet = ">=0.26.1,<0.32.0" | ||
| pytest = "^8.2.0" | ||
| pytest-asyncio = "^0.23.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify pytest-asyncio 0.23.0 compatibility with pytest 8.2.0.
You've added pytest-asyncio 0.23.0 and the corresponding asyncio marker, plus the --ignore=sonar flag. However, the PR notes mention two async tests are failing locally due to pytest-asyncio configuration issues. Please confirm that this version pairing is compatible and that the test failures are resolved, wdyt?
Also applies to: 95-95, 102-102
🌐 Web query:
Is pytest-asyncio 0.23.0 compatible with pytest 8.2.0? What configuration is needed?
💡 Result:
Short answer: No — pytest-asyncio 0.23.0 is not guaranteed fully compatible with pytest 8.2.0. The project fixed compatibility with pytest 8.2 in later 0.23.x releases (0.23.6) and bumped minimum pytest to 8.2.0 in 0.24.0, so you should upgrade pytest-asyncio to >=0.23.6 (better: >=0.24.0) when using pytest 8.2.0. [1][2]
Required/configuration notes:
- 0.23.0 removed the old asyncio_event_loop mark — use pytest.mark.asyncio(scope="module"|"class"|...) to request non-function loop scopes. You can also use the event_loop_policy fixture introduced in 0.23.0. [3][1]
- From 0.24.0 onward the API prefers loop_scope (instead of scope) for pytest.mark.asyncio and adds the asyncio_default_fixture_loop_scope config option; upgrade if you want the newer config options and explicit support for pytest ≥8.2.0. [2][1]
References:
- pytest-asyncio changelog (0.23.x / fixes for pytest 8.2 / 0.24.0 notes). [1][2]
- pytest-asyncio markers/docs (usage of pytest.mark.asyncio with scope). [3]
🏁 Script executed:
#!/bin/bash
# Examine pyproject.toml at relevant lines and surrounding context
cat pyproject.toml | head -110 | tail -40Length of output: 46
---
🏁 Script executed:
```shell
#!/bin/bash
# Search for pyproject.toml in repository
find . -name "pyproject.toml" -type f 2>/dev/null | head -5
Length of output: 65
---
🏁 Script executed:
```shell
#!/bin/bash
# Examine pyproject.toml to see pytest and pytest-asyncio versions and configuration
cat ./pyproject.toml | sed -n '60,110p'
Length of output: 2085
---
🏁 Script executed:
```shell
#!/bin/bash
# Search for pytest.mark.asyncio usage in test files
rg -n "pytest\.mark\.asyncio|@asyncio|async def test_" --type py tests/ | head -20
Length of output: 829
Upgrade pytest-asyncio to at least 0.23.6 or 0.24.0 to resolve pytest 8.2.0 compatibility issues.
The pytest-asyncio 0.23.0 you've specified is not guaranteed fully compatible with pytest 8.2.0—compatibility was fixed in later 0.23.x releases (0.23.6) and in 0.24.0. This aligns with your note that async tests are failing locally.
Your tests are using the correct @pytest.mark.asyncio marker (the proper API for 0.23.0+), so the marker configuration is right. However, the version pairing itself is the issue.
Consider updating your constraint from "^0.23.0" to "^0.23.6" or "^0.24.0" in pyproject.toml (line 71). From 0.24.0 onward, the API prefers loop_scope instead of scope for pytest.mark.asyncio and adds the asyncio_default_fixture_loop_scope config option, which gives you more explicit control, wdyt?
🤖 Prompt for AI Agents
In pyproject.toml around line 72, the pytest-asyncio dependency is pinned to
"^0.23.0" which is incompatible with pytest 8.2.0; update the constraint to a
compatible release such as "^0.23.6" or "^0.24.0" to resolve async test
failures, then run dependency install and tests to verify the issue is fixed.
Co-Authored-By: AJ Steers <[email protected]>
feat(integrations): Add Sonar connector support with git submodule
Summary
This PR adds support for running Sonar connectors in PyAirbyte as a new connector type called "integrations". Sonar connectors are YAML-based connectors that provide API access through resource and verb operations (e.g.,
customers.list(),customers.get(id="123")), rather than the stream-based model used by sources and destinations.Key changes:
Integrationclass that inherits fromConnectorBasebut not fromSourceorDestinationSonarExecutorto run Sonar connectors using the connector-sdk'sConnectorExecutorget_integration()factory function for easy integration creationcheck-depspoe task for running deptry on the airbyte package onlyArchitecture:
Review & Testing Checklist for Human
Test with a real Sonar connector: All tests are mocked. Verify end-to-end functionality with an actual Sonar connector YAML file:
Verify Octavia Bot secrets are configured: The CI workflows require
OCTAVIA_BOT_APP_IDandOCTAVIA_BOT_PRIVATE_KEYsecrets to access the private sonar submodule. Verify these are set up in the repo settings.Review fork PR behavior: Fork PRs will fail CI because they won't have access to Octavia Bot secrets. Decide if this is acceptable during the temporary submodule phase or if we need to add fork-skip logic to workflows.
Decide on long-term strategy: The git submodule approach is explicitly temporary. Decide whether to:
Review async test configuration: Two async tests have pytest-asyncio configuration issues (
test_aexecuteandtest_aexecute_batchin both test files). The sync wrapper tests pass, so functionality is verified, but decide whether to fix pytest-asyncio configuration or skip async tests temporarily.Notes
submodules: recursive.check-depspoe task that runsdeptry airbyteto scan only the package, ensuring deptry respects the exclude configuration.Requested by: AJ Steers (@aaronsteers)
Session: https://app.devin.ai/sessions/bf60b6038a32425b8d87fdd83198f52a
Summary by CodeRabbit
New Features
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.