…sumption breaks when needing to push to a bucket we incorrectly assumed to be public in the worker
This commit resolves a series of cascading failures in the Hugging Face ingestion integration test. The root cause was a design flaw where the
background worker lacked the security context (tenant_id, region) of the original requester, forcing it to incorrectly assume the target bucket was
public.
The fix involved re-architecting the ingestion flow to securely propagate the necessary context from the initial gRPC request to the worker.
Key Changes:
- **Schema:** The `hf_ingestions` table has been updated to store the `tenant_id`, `requester_app_id`, and `target_region`, providing the worker with
the information it needs to act on the user's behalf.
- **Services:**
- The `start_ingestion` service now correctly captures the `tenant_id` and `app_id` from the caller's JWT claims and persists them to the database.
- Fixed a bug where the JWT `sub` claim (the app ID) was being incorrectly used as an app name. The service now correctly looks up the app by its
ID.
- **Worker:**
- The `handle_hf_ingestion` worker has been refactored to query and use the `tenant_id` and `target_region` when looking up the target bucket,
removing the flawed "public bucket" assumption.
- All `println!` macros have been replaced with structured `tracing` logs (`info!`, `debug!`, `error!`).
- A debugging `panic!` has been removed in favor of proper error logging and returning a `Result`.
- **Tests:**
- The `hf_ingestion_integration_test` has been fixed and made more robust. It no longer fails with a `403 Forbidden` during verification.
- The test now correctly verifies that the private object is inaccessible to anonymous requests first, then uses a gRPC call to make the bucket
public, and finally confirms that the object is accessible.
- Corrected a bug where the initial `create_bucket` gRPC call was missing its authorization token.
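The context-propagation fix above can be sketched in miniature: the ingestion record persisted by `start_ingestion` carries the tenant and region, and the worker resolves the bucket within that scope instead of assuming it is public, returning an error rather than panicking when the lookup fails. This is an illustrative, in-memory sketch; the names (`IngestionRecord`, `BucketStore`, `resolve_target_bucket`) are hypothetical, not Anvil's actual API.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct Bucket {
    name: String,
    public: bool,
}

/// Row persisted by `start_ingestion` and read back by the worker
/// (fields taken from the schema change described above).
struct IngestionRecord {
    tenant_id: String,
    requester_app_id: String,
    target_region: String,
    bucket_name: String,
}

/// In-memory stand-in for the bucket store, keyed by (tenant, region, name).
struct BucketStore {
    buckets: HashMap<(String, String, String), Bucket>,
}

impl BucketStore {
    fn lookup(&self, tenant_id: &str, region: &str, name: &str) -> Option<&Bucket> {
        self.buckets
            .get(&(tenant_id.to_string(), region.to_string(), name.to_string()))
    }
}

/// Worker-side resolution: a tenant- and region-scoped lookup, with a
/// `Result` instead of the old debugging `panic!`.
fn resolve_target_bucket<'a>(
    store: &'a BucketStore,
    rec: &IngestionRecord,
) -> Result<&'a Bucket, String> {
    store
        .lookup(&rec.tenant_id, &rec.target_region, &rec.bucket_name)
        .ok_or_else(|| {
            format!(
                "bucket {} not found for tenant {} in region {}",
                rec.bucket_name, rec.tenant_id, rec.target_region
            )
        })
}
```

The key point is that the worker never has to guess: everything it needs to act on the requester's behalf is in the record it dequeues.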
The Rust tooling is wrong about all of these: the items are in use, but only in tests, so the compiler emits dead-code warnings for them. The only legitimate warning is the use of deprecated functions in the crypto code, which we should really upgrade, but that will be done in a future commit.
This commit completes a major architectural refactoring to prepare the Anvil workspace for an open-core model, separating the foundational components from future enterprise
extensions.
The key changes include:
- **Crate Separation:** The original `anvil` crate has been split into `anvil-core` (a pure library containing the fundamental structs, traits, and managers) and `anvil` (the main
binary application that depends on `anvil-core`).
- **Enterprise Feature Flag:** An `enterprise` feature flag has been added to the `anvil` crate. When enabled, it activates an optional dependency on the `anvil-enterprise` crate,
allowing for the seamless addition of enterprise-specific services and logic.
- **Test Harness Migration:** The test utilities have been extracted into a dedicated `anvil-test-utils` crate, which is now used by all integration tests across the workspace.
- **Build Fixes:** Resolved numerous compilation, dependency, and routing issues that arose during the refactoring, resulting in a stable build where all original OSS tests now pass
successfully on the new architecture.
This new structure provides a clean and maintainable foundation for building and releasing both open-source and commercial versions of Anvil from a unified codebase.
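The feature-flag wiring described above could look roughly like this in the `anvil` crate's `Cargo.toml` (crate names are from the text; the paths are illustrative):

```toml
# anvil/Cargo.toml (sketch; paths are illustrative)
[dependencies]
anvil-core = { path = "../anvil-core" }
# Optional dependency, only compiled when the feature is enabled.
anvil-enterprise = { path = "../anvil-enterprise", optional = true }

[features]
# `cargo build --features enterprise` pulls in the enterprise crate.
enterprise = ["dep:anvil-enterprise"]
```

Enterprise-only code paths in the binary would then be gated with `#[cfg(feature = "enterprise")]`, so the default OSS build contains no trace of them.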
…enterprise in tests
- anvil-core: export cloneable AuthInterceptorFn and return it from create_grpc_router
- anvil: pass core-provided interceptor into enterprise extender and serve merged Routes
- test-utils: enable anvil crate’s enterprise feature so TestCluster includes enterprise services
- Rationale: stable, scalable extension point for enterprise gRPC services and consistent middleware
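The cloneable-interceptor idea above can be sketched with plain `std`: wrapping the auth closure in an `Arc` makes the handle cheap to clone, so the core router and the enterprise extender can share the same middleware. The `AuthInterceptorFn` name comes from the commit message; the request/status types and the signature here are hypothetical stand-ins for the real gRPC types.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for the real gRPC request and status types.
#[derive(Debug)]
struct Request {
    token: Option<String>,
}

#[derive(Debug, PartialEq)]
struct Status(String);

/// Cloneable interceptor handle: the `Arc` is what lets `create_grpc_router`
/// return it while the enterprise extender keeps its own copy.
type AuthInterceptorFn = Arc<dyn Fn(&Request) -> Result<(), Status> + Send + Sync>;

fn make_auth_interceptor() -> AuthInterceptorFn {
    Arc::new(|req: &Request| match &req.token {
        // Accept any non-empty token in this sketch; the real check
        // would validate JWT claims.
        Some(t) if !t.is_empty() => Ok(()),
        _ => Err(Status("unauthenticated".to_string())),
    })
}
```

Because both routers hold clones of one handle, the OSS and enterprise services apply identical auth behavior, which is the "consistent middleware" rationale above.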
This commit introduces a significant enhancement to the Anvil streaming core by implementing a new FFI (Foreign Function Interface) layer with caching and
metrics. This FFI is consumed by a new Python SDK, `anvil-torch`, which provides a lazy-loading mechanism for PyTorch tensors, enabling more efficient memory
usage and faster model loading for large models.
Key changes include:
- **FFI Layer (`anvil-ffi`):**
- Implemented a new FFI with an `AnvilTensor` struct for binary-safe data transfer.
- Added an LRU cache for tensors to reduce redundant data fetching.
- Introduced metrics for cache hits, misses, and bytes fetched.
- Improved error handling with `last_error_message`.
- **Python SDK (`anvil-sdk-py`):**
- Created the `anvil-torch` package with an `AnvilLoaderWrapper` to interface with the new FFI.
- Implemented `enable`, `metrics`, and `load_from_anvil` functions for seamless PyTorch integration.
- Added end-to-end tests for streaming inference with PyTorch models.
- **Build and Test Infrastructure (`Justfile`):**
- Added a comprehensive set of `just` commands for end-to-end testing using Docker Compose.
- New commands streamline the process of bootstrapping Anvil, managing Hugging Face model ingestion, and running integration tests.
- **Enterprise Features:**
- Implemented pagination for the `list_tensors` service in `anvil-enterprise`.
- **Bug Fixes and Refinements:**
- Updated `ObjectRef` to use `Option<String>` for `version_id` to avoid empty strings.
- Corrected linker arguments for macOS in `anvil-sdk-py-bindings`.
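The caching-and-metrics layer described for `anvil-ffi` boils down to an LRU cache that counts hits, misses, and bytes fetched. A minimal std-only sketch of that idea follows; the FFI specifics (binary-safe `AnvilTensor`, `last_error_message`) are omitted, and all names here are illustrative rather than the crate's actual API.

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal LRU cache with hit/miss/byte counters.
struct TensorCache {
    capacity: usize,
    map: HashMap<String, Vec<u8>>,
    order: VecDeque<String>, // front = least recently used
    hits: u64,
    misses: u64,
    bytes_fetched: u64,
}

impl TensorCache {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            map: HashMap::new(),
            order: VecDeque::new(),
            hits: 0,
            misses: 0,
            bytes_fetched: 0,
        }
    }

    /// Mark `key` as most recently used.
    fn touch(&mut self, key: &str) {
        if let Some(pos) = self.order.iter().position(|k| k == key) {
            self.order.remove(pos);
        }
        self.order.push_back(key.to_string());
    }

    /// Fetch through the cache; `fetch` stands in for the real data source
    /// (the remote object store in the actual FFI).
    fn get_or_fetch(&mut self, key: &str, fetch: impl FnOnce() -> Vec<u8>) -> Vec<u8> {
        if let Some(data) = self.map.get(key).cloned() {
            self.hits += 1;
            self.touch(key);
            return data;
        }
        self.misses += 1;
        let data = fetch();
        self.bytes_fetched += data.len() as u64;
        if self.map.len() >= self.capacity {
            // Evict the least recently used entry to make room.
            if let Some(evicted) = self.order.pop_front() {
                self.map.remove(&evicted);
            }
        }
        self.map.insert(key.to_string(), data.clone());
        self.order.push_back(key.to_string());
        data
    }
}
```

The counters are what the `metrics` function on the Python side would surface, letting callers verify that lazy tensor loads are actually being served from cache.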
… denied masked as a 404
… to be read as well
…the same DB in the test...the current theory why the created bucket is not found
…sed by depending on HOME instead of using --config explicitly