@aaronsteers aaronsteers commented Nov 13, 2025

feat: Add bearer token authentication support (do not merge)

Summary

This PR adds support for bearer token authentication to PyAirbyte's Cloud integration as an alternative to the OAuth2 client credentials flow.

Status: DRAFT - Implementation in progress

This is an early draft PR created to get visibility and early feedback. No code changes have been implemented yet.

Planned Changes:

  1. Add bearer_token parameter to CloudWorkspace class
  2. Add validation to ensure only one auth method is used (client_id+secret OR bearer_token, mutually exclusive)
  3. Add environment variable support for bearer token (AIRBYTE_CLOUD_BEARER_TOKEN)
  4. Update get_airbyte_server_instance() to support bearer token authentication
  5. Update _make_config_api_request() to use bearer token if provided
  6. Add public create_oauth_token() method to CloudWorkspace to generate and return bearer tokens
  7. Update documentation with bearer token usage examples
  8. Add tests for bearer token authentication
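
As a rough illustration of item 2, the mutual-exclusion check could look something like the sketch below. This is not the code in this PR: `validate_auth` and `AuthConfigError` are hypothetical names used only here, and plain `str` stands in for whatever secret type the real implementation wraps values in.

```python
from __future__ import annotations


class AuthConfigError(ValueError):
    """Hypothetical error type for this sketch (not a PyAirbyte class)."""


def validate_auth(
    client_id: str | None = None,
    client_secret: str | None = None,
    bearer_token: str | None = None,
) -> str:
    """Return the auth method in use, or raise if the inputs are ambiguous or incomplete."""
    has_any_client_cred = client_id is not None or client_secret is not None
    if bearer_token is not None and has_any_client_cred:
        raise AuthConfigError(
            "Provide either bearer_token or client_id/client_secret, not both."
        )
    if bearer_token is not None:
        return "bearer_token"
    if client_id is not None and client_secret is not None:
        return "client_credentials"
    raise AuthConfigError(
        "Provide bearer_token, or both client_id and client_secret."
    )
```

Rejecting a partial client credential pair (only `client_id`, say) alongside the "both supplied" case keeps the rule at "exactly one complete method", which is the behavior the checklist below asks reviewers to verify.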

Review & Testing Checklist for Human

  • No code to review yet - this is an early draft PR
  • Once implemented, verify that bearer token and client credentials are mutually exclusive
  • Test that bearer token authentication works end-to-end with Cloud API
  • Verify environment variable resolution works correctly
  • Test the new create_oauth_token() method
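
For the environment variable item, a minimal sketch of the resolution order one might expect (explicit value first, then `AIRBYTE_CLOUD_BEARER_TOKEN`) is shown below. The function name `resolve_bearer_token_sketch` is hypothetical, and the real resolver in this PR may go through PyAirbyte's secrets utilities rather than reading `os.environ` directly.

```python
from __future__ import annotations

import os

# Environment variable name taken from the planned changes in this PR.
BEARER_TOKEN_ENV_VAR = "AIRBYTE_CLOUD_BEARER_TOKEN"


def resolve_bearer_token_sketch(input_value: str | None = None) -> str | None:
    """Prefer an explicit value; otherwise fall back to the environment variable."""
    if input_value:
        return input_value
    # Treat an unset or empty variable as "no token configured".
    return os.environ.get(BEARER_TOKEN_ENV_VAR) or None
```

A manual check along these lines would set the variable, construct a workspace without any explicit credentials, and confirm the token is picked up.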

Notes

Summary by CodeRabbit

  • New Features
    • Bearer token authentication added as an alternative to OAuth2 client credentials.
    • Client ID/secret now optional across API functions; supply either bearer token or client credentials.
    • Workspace/client initialization validates and enforces a single authentication method.
    • New public method to generate an OAuth2 bearer token from client credentials.
    • Added environment variable support for bearer token configuration.

@devin-ai-integration

Original prompt from AJ Steers
@Devin - Does PyAirbyte still have the ability to authenticate to Cloud using a bearer token directly rather than passing client ID & secret?
Thread URL: https://airbytehq-team.slack.com/archives/D089P0UPVT4/p1763009638868299

@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1763010134-bearer-token-auth' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1763010134-bearer-token-auth'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.


@github-actions

github-actions bot commented Nov 13, 2025

PyTest Results (Fast Tests Only, No Creds)

320 tests  ±0   320 ✅ ±0   5m 48s ⏱️ -6s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 97392bf. ± Comparison against base commit 7eb746b.

♻️ This comment has been updated with latest results.

@aaronsteers aaronsteers marked this pull request as ready for review November 13, 2025 05:11
- Add bearer_token parameter to CloudWorkspace class as alternative to client credentials
- Add validation to ensure only one auth method is used (client_id+secret OR bearer_token)
- Add CLOUD_BEARER_TOKEN_ENV_VAR constant and resolve_cloud_bearer_token() function
- Update get_airbyte_server_instance() to support bearer token authentication
- Update _make_config_api_request() to use bearer token if provided
- Add create_oauth_token() public method to CloudWorkspace
- Update all api_util functions to accept optional bearer_token parameter
- Update CloudWorkspace docstring with bearer token usage examples
- Add type: ignore comments for optional client_id/client_secret parameters

Co-Authored-By: AJ Steers <[email protected]>
@coderabbitai

coderabbitai bot commented Nov 13, 2025

📝 Walkthrough

Walkthrough

This change adds optional bearer-token authentication alongside OAuth2 client credentials across Airbyte cloud utilities. Public API functions and CloudWorkspace now accept a bearer_token (and make client credentials optional), enforce exactly-one-auth validation, propagate bearer tokens to API calls, and add a helper to resolve bearer tokens from the environment plus CloudWorkspace.create_oauth_token().

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core API Authentication Refactor: airbyte/_util/api_util.py | Make client_id/client_secret optional; add bearer_token param across public API functions; enforce mutual-exclusion (exactly one auth method); get_airbyte_server_instance() selects security model based on inputs; _make_config_api_request() and related flows accept/resolve bearer tokens. |
| CloudWorkspace Integration: airbyte/cloud/workspaces.py | CloudWorkspace fields client_id/client_secret become optional, new bearer_token field added; validation ensures one auth method; credential wrapping deferred until after validation; create_oauth_token() added; bearer_token threaded to API calls. |
| Cloud Bearer Token Resolver & Constants: airbyte/cloud/auth.py, airbyte/constants.py | New `resolve_cloud_bearer_token(input_value: str |
| Cloud Callsites (Propagate bearer_token): airbyte/cloud/connections.py, airbyte/cloud/connectors.py, airbyte/cloud/sync_results.py | Thread bearer_token=self.workspace.bearer_token into many internal api_util calls; add type: ignore[arg-type] where optional client creds are passed to satisfy typing. No control-flow changes. |
| Public API Signatures Updated: ... (list_connections, list_workspaces, get_workspace, create_connection, get_connection_by_name, delete_connection, patch_connection, _make_config_api_request, check_connector, custom YAML source definition functions, etc.) | Numerous public function signatures now accept `client_id: SecretString |
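
Since the same three auth arguments are threaded into many call sites, one possible way to reduce the repetition is to bundle them once and splat them into each call. The sketch below is only an illustration of that idea: `WorkspaceAuthSketch`, `as_kwargs`, and `fake_get_connection` are hypothetical names and are not part of this PR or of PyAirbyte.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class WorkspaceAuthSketch:
    """Hypothetical holder for the three auth fields carried by a workspace."""

    client_id: str | None = None
    client_secret: str | None = None
    bearer_token: str | None = None

    def as_kwargs(self) -> dict[str, Any]:
        """Return the auth fields as keyword arguments for an API helper."""
        return {
            "client_id": self.client_id,
            "client_secret": self.client_secret,
            "bearer_token": self.bearer_token,
        }


def fake_get_connection(
    *,
    connection_id: str,
    client_id: str | None = None,
    client_secret: str | None = None,
    bearer_token: str | None = None,
) -> str:
    """Stand-in for an api_util function; reports which auth inputs it received."""
    return "bearer_token" if bearer_token is not None else "client_credentials"


auth = WorkspaceAuthSketch(bearer_token="example-token")
used = fake_get_connection(connection_id="conn-1", **auth.as_kwargs())
```

With a helper like this, a call site that forgets one of the three arguments becomes harder to write, which is the class of bug the review comments further down point out.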

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CloudWorkspace
    participant APIUtil as api_util
    participant AirbyteAPI

    rect rgb(240, 245, 250)
    Note over User,CloudWorkspace: Initialization
    User->>CloudWorkspace: instantiate (client_id/client_secret) OR (bearer_token)
    activate CloudWorkspace
    CloudWorkspace->>CloudWorkspace: validate exactly one auth method
    CloudWorkspace->>CloudWorkspace: wrap provided secret(s) into SecretString
    deactivate CloudWorkspace
    end

    rect rgb(235, 250, 235)
    Note over CloudWorkspace,AirbyteAPI: API call flow
    CloudWorkspace->>APIUtil: call API with (client_id, client_secret, bearer_token)
    activate APIUtil
    APIUtil->>APIUtil: if bearer_token provided -> use bearer_auth Security
    alt bearer_token
        APIUtil->>AirbyteAPI: init AirbyteAPI with bearer_auth (token supplied)
    else no bearer_token
        APIUtil->>APIUtil: fetch token via get_bearer_token(client_id, client_secret)
        APIUtil->>AirbyteAPI: init AirbyteAPI with client_credentials Security (token obtained)
    end
    APIUtil->>AirbyteAPI: perform request
    deactivate APIUtil
    end
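
The branch in the diagram can be sketched in a few lines. This is a simplification under stated assumptions: the real code builds a Security object for the Airbyte API client, whereas this sketch only produces the standard `Authorization: Bearer` header, and `build_auth_header` and the injected `fetch_token` callable are hypothetical names.

```python
from __future__ import annotations

from typing import Callable


def build_auth_header(
    *,
    client_id: str | None,
    client_secret: str | None,
    bearer_token: str | None,
    fetch_token: Callable[[str, str], str],
) -> dict[str, str]:
    """Use the supplied bearer token if present; otherwise exchange client credentials."""
    if bearer_token is not None:
        token = bearer_token  # token supplied directly, no exchange needed
    elif client_id is not None and client_secret is not None:
        token = fetch_token(client_id, client_secret)  # token obtained via credentials
    else:
        raise ValueError("No usable credentials were provided.")
    return {"Authorization": f"Bearer {token}"}
```

Passing `fetch_token` in as a parameter keeps the sketch free of network calls; in the PR, that role is played by `get_bearer_token(client_id, client_secret)`.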

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Areas to focus review on:
    • Mutual-exclusion auth validation across airbyte/_util/api_util.py and CloudWorkspace init — ensure every code path enforces exactly-one-auth.
    • Correct propagation of bearer_token into all api_util callsites (spot-check connections.py, connectors.py, sync_results.py).
    • create_oauth_token() behavior when bearer_token is already configured — confirm it raises as intended.
    • The use of type: ignore[arg-type] — should we instead adjust types so ignores aren't needed, or are these acceptable temporary workarounds? wdyt?
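
For the `create_oauth_token()` point, the behavior under review could be exercised with something like the sketch below. `WorkspaceSketch` is a hypothetical, simplified stand-in for `CloudWorkspace`, the token fetcher is injected to avoid a network call, and `ValueError` is used here only as a placeholder for whatever exception the real method raises.

```python
from __future__ import annotations

from typing import Callable


class WorkspaceSketch:
    """Hypothetical, simplified stand-in for CloudWorkspace."""

    def __init__(
        self,
        *,
        token_fetcher: Callable[[str, str], str],
        client_id: str | None = None,
        client_secret: str | None = None,
        bearer_token: str | None = None,
    ) -> None:
        self.client_id = client_id
        self.client_secret = client_secret
        self.bearer_token = bearer_token
        self._token_fetcher = token_fetcher  # injected so no network call is needed

    def create_oauth_token(self) -> str:
        """Exchange client credentials for a token; refuse if a token is already set."""
        if self.bearer_token is not None:
            raise ValueError("A bearer token is already configured; nothing to generate.")
        if self.client_id is None or self.client_secret is None:
            raise ValueError("client_id and client_secret are required to create a token.")
        return self._token_fetcher(self.client_id, self.client_secret)
```

A reviewer could confirm both paths: a workspace built from client credentials returns a token, and one built from a bearer token raises instead of silently returning the existing token.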

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the primary change: adding bearer token authentication support as an alternative to OAuth2 client credentials. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 95.89% which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49a96aa and 97392bf.

📒 Files selected for processing (3)
  • airbyte/cloud/connections.py (6 hunks)
  • airbyte/cloud/connectors.py (12 hunks)
  • airbyte/cloud/sync_results.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte/cloud/connectors.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (11)
airbyte/cloud/connections.py (6)

63-65: Past review comment addressed!

The addition of bearer_token=self.workspace.bearer_token resolves the issue flagged in the previous review where bearer-token-only workspaces couldn't fetch connection info. This change enables the new authentication path to work correctly, wdyt?


182-184: LGTM!

Bearer token authentication support added consistently for connection sync operations.


221-223: LGTM!

Bearer token support added for job log retrieval operations.


274-276: LGTM!

Bearer token authentication now supported for connection rename operations.


294-296: LGTM!

Bearer token support added for table prefix updates.


319-321: LGTM!

Bearer token authentication now supported for stream selection updates. All connection API operations are consistently updated to support the new authentication flow.

airbyte/cloud/sync_results.py (5)

256-258: Bearer token propagation looks good, but type comment inconsistency noted

The bearer_token addition correctly enables the new authentication flow. However, I noticed that connections.py includes # type: ignore[arg-type] comments on the client_id and client_secret parameters (lines 63-64), but they're absent here. Should we add them for consistency, or does the type checker behave differently in this context, wdyt?
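
One way to make the ignores unnecessary, assuming the callee still declares non-optional parameters, is to narrow the optional values before the call. The sketch below uses hypothetical names (`require_client_credentials`, `exchange_for_token`, `token_for`) and plain `str` in place of the project's secret type; it is not taken from this PR.

```python
from __future__ import annotations


def require_client_credentials(
    client_id: str | None,
    client_secret: str | None,
) -> tuple[str, str]:
    """Narrow optional credentials to plain strings, or raise if either is missing."""
    if client_id is None or client_secret is None:
        raise ValueError("client_id and client_secret must both be set.")
    return client_id, client_secret


def exchange_for_token(client_id: str, client_secret: str) -> str:
    """Stand-in for a function whose parameters are typed as non-optional."""
    return f"token-for-{client_id}"


def token_for(client_id: str | None, client_secret: str | None) -> str:
    # After narrowing, the call below type-checks without `# type: ignore[arg-type]`.
    narrowed_id, narrowed_secret = require_client_credentials(client_id, client_secret)
    return exchange_for_token(narrowed_id, narrowed_secret)
```

If the callee signatures already accept optional values, the ignores may simply be redundant, in which case removing them consistently across `connections.py` and `sync_results.py` would settle the question.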


268-270: LGTM!

Bearer token authentication support added for destination configuration retrieval.


290-292: LGTM!

Bearer token support added for job info retrieval operations.


317-319: LGTM!

Bearer token correctly propagated through the fallback API call in the exception handling path. This ensures the new authentication flow works even when datetime parsing requires the raw API response.


337-339: LGTM!

Bearer token authentication support added for fetching job attempts. All sync result API operations now consistently support the new authentication flow.




@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
airbyte/cloud/connectors.py (1)

162-168: Bearer-token auth path breaks here
When a workspace is instantiated with only bearer_token, both client_id and client_secret stay None, so this call reaches api_util.check_connector with no usable credentials at all. The new bearer-token flow then 401s instead of succeeding. Could we thread bearer_token=self.workspace.bearer_token (as you did for the workspace methods) through this and the other connector API calls so the new auth mode actually works, wdyt?

         result = api_util.check_connector(
             workspace_id=self.workspace.workspace_id,
             connector_type=self.connector_type,
             actor_id=self.connector_id,
             api_root=self.workspace.api_root,
             client_id=self.workspace.client_id,  # type: ignore[arg-type]
             client_secret=self.workspace.client_secret,  # type: ignore[arg-type]
+            bearer_token=self.workspace.bearer_token,
         )
airbyte/cloud/workspaces.py (1)

348-378: Bearer-token deletions 401
permanently_delete_source and permanently_delete_destination never forward self.bearer_token, so bearer-token-only workspaces can’t delete resources—they send neither client creds nor a token to the Config API. Could we add bearer_token=self.bearer_token to these api_util.delete_* calls to keep the new auth mode consistent, wdyt?

         api_util.delete_source(
             source_id=source.connector_id if isinstance(source, CloudSource) else source,
             api_root=self.api_root,
             client_id=self.client_id,  # type: ignore[arg-type]
             client_secret=self.client_secret,  # type: ignore[arg-type]
+            bearer_token=self.bearer_token,
         )
...
         api_util.delete_destination(
             destination_id=(destination if isinstance(destination, str) else destination.destination_id),
             api_root=self.api_root,
             client_id=self.client_id,  # type: ignore[arg-type]
             client_secret=self.client_secret,  # type: ignore[arg-type]
+            bearer_token=self.bearer_token,
         )
airbyte/_util/api_util.py (1)

1389-1395: Don't drop bearer_token during safe-mode checks

Line 1391: If callers supply only bearer_token (with client_id/client_secret unset), this safe-mode lookup hits get_custom_yaml_source_definition without any auth, so get_airbyte_server_instance raises PyAirbyteInputError and the delete path breaks for bearer-token users. Could we forward the provided bearer token here? wdyt?

         definition_info = get_custom_yaml_source_definition(
             workspace_id=workspace_id,
             definition_id=definition_id,
             api_root=api_root,
             client_id=client_id,
             client_secret=client_secret,
+            bearer_token=bearer_token,
         )
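
A regression test for this kind of omission could assert that the token reaches both the lookup and the delete call. The sketch below uses `unittest.mock` with injected callables; `delete_if_safe`, its `lookup`/`delete` parameters, and the `"delete-"` name-prefix rule are all hypothetical and only stand in for the real safe-mode logic.

```python
from __future__ import annotations

from typing import Any, Callable
from unittest import mock


def delete_if_safe(
    *,
    definition_id: str,
    lookup: Callable[..., dict[str, Any]],
    delete: Callable[..., None],
    client_id: str | None = None,
    client_secret: str | None = None,
    bearer_token: str | None = None,
) -> bool:
    """Look up a definition, then delete it only if a (hypothetical) safety rule passes."""
    auth = {
        "client_id": client_id,
        "client_secret": client_secret,
        "bearer_token": bearer_token,  # forwarded to BOTH calls below
    }
    info = lookup(definition_id=definition_id, **auth)
    if not info["name"].startswith("delete-"):  # hypothetical safe-mode rule
        return False
    delete(definition_id=definition_id, **auth)
    return True


def check_token_is_forwarded() -> None:
    """Fail if the bearer token is dropped on either the lookup or the delete call."""
    lookup = mock.Mock(return_value={"name": "delete-me"})
    delete = mock.Mock()
    assert delete_if_safe(
        definition_id="d1", lookup=lookup, delete=delete, bearer_token="tok"
    )
    expected = dict(
        definition_id="d1", client_id=None, client_secret=None, bearer_token="tok"
    )
    lookup.assert_called_once_with(**expected)
    delete.assert_called_once_with(**expected)
```

The same pattern would apply to the `connectors.py` and `workspaces.py` call sites flagged above: patch the `api_util` function and assert on the keyword arguments it receives.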
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7eb746b and 49a96aa.

📒 Files selected for processing (6)
  • airbyte/_util/api_util.py (43 hunks)
  • airbyte/cloud/auth.py (1 hunks)
  • airbyte/cloud/connections.py (6 hunks)
  • airbyte/cloud/connectors.py (12 hunks)
  • airbyte/cloud/workspaces.py (16 hunks)
  • airbyte/constants.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
airbyte/cloud/auth.py (2)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/secrets/util.py (1)
  • try_get_secret (33-60)
airbyte/cloud/workspaces.py (3)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
airbyte/_util/api_util.py (1)
  • get_bearer_token (1041-1066)
airbyte/_util/api_util.py (3)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
airbyte/cloud/workspaces.py (1)
  • list_connections (466-492)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)

@devin-ai-integration devin-ai-integration bot changed the title feat: Add bearer token authentication support (do not merge) feat: Add bearer token authentication support Nov 13, 2025
…ules

- Add bearer_token parameter to all api_util calls in connectors.py (13 calls)
- Add bearer_token parameter to all api_util calls in connections.py (7 calls)
- Add bearer_token parameter to all api_util calls in sync_results.py (6 calls)
- Ensures bearer token authentication works throughout cloud integration

Co-Authored-By: AJ Steers <[email protected]>
@github-actions

PyTest Results (Full)

389 tests  ±0   373 ✅ +1   23m 27s ⏱️ - 6m 57s
  1 suites ±0    16 💤 ±0 
  1 files   ±0     0 ❌  - 1 

Results for commit 97392bf. ± Comparison against base commit 7eb746b.
