Huggingface based ingestion #4

Merged
zcourts merged 46 commits into main from feature/hf
Nov 13, 2025

@zcourts zcourts commented Oct 30, 2025

No description provided.

…sumption breaks when needing to push to a bucket we incorrectly assumed to be public in the worker
This commit resolves a series of cascading failures in the Hugging Face ingestion integration test. The root cause was a design flaw where the background worker lacked the security context (tenant_id, region) of the original requester, forcing it to incorrectly assume the target bucket was public.

The fix involved re-architecting the ingestion flow to securely propagate the necessary context from the initial gRPC request to the worker.
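The propagation described above can be sketched roughly as follows. This is a minimal illustration, not the actual Anvil code: the `Claims` and `IngestionRecord` types, their field types, and the `start_ingestion` signature are assumptions; only the names `tenant_id`, `requester_app_id`, `target_region`, and the use of the JWT `sub` claim as the app ID come from the commit message.

```rust
/// Claims extracted from the caller's JWT (hypothetical shape).
#[derive(Debug, Clone)]
pub struct Claims {
    /// The `sub` claim carries the app ID (not the app name).
    pub sub: String,
    /// Tenant the caller belongs to; the integer type is an assumption.
    pub tenant_id: i64,
}

/// Context columns persisted with each ingestion (hypothetical row type;
/// the real `hf_ingestions` table has other columns that are not shown).
#[derive(Debug, Clone, PartialEq)]
pub struct IngestionRecord {
    pub tenant_id: i64,
    pub requester_app_id: String,
    pub target_region: String,
}

/// Capture the requester's context at request time so that the background
/// worker can later read it back instead of guessing it.
pub fn start_ingestion(claims: &Claims, target_region: &str) -> IngestionRecord {
    IngestionRecord {
        tenant_id: claims.tenant_id,
        requester_app_id: claims.sub.clone(),
        target_region: target_region.to_string(),
    }
}
```

The point of the design is that the worker reads this record back later, so it never needs access to the original request or its token.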

Key Changes:

- **Schema:** The `hf_ingestions` table has been updated to store the `tenant_id`, `requester_app_id`, and `target_region`, providing the worker with the information it needs to act on the user's behalf.

- **Services:**
    - The `start_ingestion` service now correctly captures the `tenant_id` and `app_id` from the caller's JWT claims and persists them to the database.
    - Fixed a bug where the JWT `sub` claim (the app ID) was being incorrectly used as an app name. The service now correctly looks up the app by its ID.

- **Worker:**
    - The `handle_hf_ingestion` worker has been refactored to query and use the `tenant_id` and `target_region` when looking up the target bucket, removing the flawed "public bucket" assumption.
    - All `println!` macros have been replaced with structured `tracing` logs (`info!`, `debug!`, `error!`).
    - A debugging `panic!` has been removed in favor of proper error logging and returning a `Result`.

- **Tests:**
    - The `hf_ingestion_integration_test` has been fixed and made more robust. It no longer fails with a `403 Forbidden` during verification.
    - The test now correctly verifies that the private object is inaccessible to anonymous requests first, then uses a gRPC call to make the bucket public, and finally confirms that the object is accessible.
    - Corrected a bug where the initial `create_bucket` gRPC call was missing its authorization token.
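
As a rough sketch of the worker-side change, the lookup below is scoped by tenant and region and reports a missing bucket through a `Result` instead of panicking. Everything here is hypothetical and std-only: `BucketStore`, `IngestionError`, and `resolve_target_bucket` are illustrative names, and the in-memory map stands in for the real database query. A real worker would also log the failure with `tracing::error!`, which is omitted to keep the example dependency-free.

```rust
use std::collections::HashMap;
use std::fmt;

/// Errors the worker returns instead of panicking (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum IngestionError {
    BucketNotFound { tenant_id: i64, region: String, bucket: String },
}

impl fmt::Display for IngestionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IngestionError::BucketNotFound { tenant_id, region, bucket } => write!(
                f,
                "bucket '{}' not found for tenant {} in region '{}'",
                bucket, tenant_id, region
            ),
        }
    }
}

impl std::error::Error for IngestionError {}

/// In-memory stand-in for the bucket catalogue, keyed by (tenant, region, name).
#[derive(Default)]
pub struct BucketStore {
    buckets: HashMap<(i64, String, String), u64>,
}

impl BucketStore {
    pub fn insert(&mut self, tenant_id: i64, region: &str, name: &str, bucket_id: u64) {
        self.buckets
            .insert((tenant_id, region.to_string(), name.to_string()), bucket_id);
    }
}

/// Resolve the target bucket using the stored tenant and region rather than
/// assuming the bucket is public. A miss is an ordinary error value.
pub fn resolve_target_bucket(
    store: &BucketStore,
    tenant_id: i64,
    target_region: &str,
    bucket: &str,
) -> Result<u64, IngestionError> {
    let key = (tenant_id, target_region.to_string(), bucket.to_string());
    store
        .buckets
        .get(&key)
        .copied()
        .ok_or_else(|| IngestionError::BucketNotFound {
            tenant_id,
            region: target_region.to_string(),
            bucket: bucket.to_string(),
        })
}
```

A bucket that exists for one tenant or region is simply not visible under another key, so the worker cannot reach it by accident.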
The Rust tooling is wrong about all of these: they are used, but only in the tests, so it emits warnings for them.
The only legitimate warning is for the deprecated functions used in crypto, which we should upgrade, but that will be done in a future commit.
This commit completes a major architectural refactoring to prepare the Anvil workspace for an open-core model, separating the foundational components from future enterprise extensions.

The key changes include:

- **Crate Separation:** The original `anvil` crate has been split into `anvil-core` (a pure library containing the fundamental structs, traits, and managers) and `anvil` (the main binary application that depends on `anvil-core`).

- **Enterprise Feature Flag:** An `enterprise` feature flag has been added to the `anvil` crate. When enabled, it activates an optional dependency on the `anvil-enterprise` crate, allowing for the seamless addition of enterprise-specific services and logic.

- **Test Harness Migration:** The test utilities have been extracted into a dedicated `anvil-test-utils` crate, which is now used by all integration tests across the workspace.

- **Build Fixes:** Resolved numerous compilation, dependency, and routing issues that arose during the refactoring, resulting in a stable build where all original OSS tests now pass successfully on the new architecture.

This new structure provides a clean and maintainable foundation for building and releasing both open-source and commercial versions of Anvil from a unified codebase.
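
To illustrate how a Cargo feature such as `enterprise` can gate optional code, here is a minimal sketch. The function names and service names are made up for illustration, and the real crate presumably wires in actual gRPC services from `anvil-enterprise` rather than string labels.

```rust
/// Services that ship in the open-source build (names are illustrative).
pub fn core_services() -> Vec<&'static str> {
    vec!["object", "bucket", "auth"]
}

/// Only compiled when the crate is built with the `enterprise` feature.
#[cfg(feature = "enterprise")]
pub fn enterprise_services() -> Vec<&'static str> {
    vec!["tensor-index"]
}

/// Core services, plus the enterprise ones when the feature is enabled.
pub fn all_services() -> Vec<&'static str> {
    // Without the feature nothing mutates the list, hence the allow.
    #[allow(unused_mut)]
    let mut services = core_services();
    #[cfg(feature = "enterprise")]
    services.extend(enterprise_services());
    services
}
```

With the feature disabled, the gated function and the `extend` call are removed at compile time, so an open-source build of this sketch contains no reference to enterprise code.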
…enterprise in tests

- anvil-core: export cloneable AuthInterceptorFn and return it from create_grpc_router
- anvil: pass core-provided interceptor into enterprise extender and serve merged Routes
- test-utils: enable anvil crate’s enterprise feature so TestCluster includes enterprise services
- Rationale: stable, scalable extension point for enterprise gRPC services and consistent middleware
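
The extension point described in this commit can be sketched without any gRPC library. In the example below, the names `AuthInterceptorFn` and `create_grpc_router` come from the commit message, but their signatures, the `Request` and `Route` types, and `extend_with_enterprise` are assumptions made for illustration; the real code presumably builds on its gRPC framework's own interceptor and router types.

```rust
use std::sync::Arc;

/// Hypothetical request type: just the bearer token, if any.
pub struct Request {
    pub token: Option<String>,
}

/// Cloneable interceptor. Wrapping the closure in `Arc` makes clones cheap,
/// so the same check can guard both core and enterprise services.
pub type AuthInterceptorFn = Arc<dyn Fn(&Request) -> Result<(), String> + Send + Sync>;

/// A route pairs a service name with the interceptor that guards it.
pub struct Route {
    pub service: &'static str,
    pub interceptor: AuthInterceptorFn,
}

/// Core builds its routes and also returns the interceptor it used.
pub fn create_grpc_router(expected_token: &str) -> (Vec<Route>, AuthInterceptorFn) {
    let expected = expected_token.to_string();
    let interceptor: AuthInterceptorFn = Arc::new(move |req: &Request| match &req.token {
        Some(token) if *token == expected => Ok(()),
        _ => Err("unauthenticated".to_string()),
    });
    let routes = vec![Route {
        service: "object",
        interceptor: interceptor.clone(),
    }];
    (routes, interceptor)
}

/// Stand-in for the enterprise extender: it adds its services behind the
/// core-provided interceptor and returns the merged routes.
pub fn extend_with_enterprise(mut routes: Vec<Route>, interceptor: AuthInterceptorFn) -> Vec<Route> {
    routes.push(Route {
        service: "tensor-index",
        interceptor,
    });
    routes
}
```

Because the enterprise side receives the interceptor rather than building its own, every merged route enforces the same authentication rule.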
This commit introduces a significant enhancement to the Anvil streaming core by implementing a new FFI (Foreign Function Interface) layer with caching and metrics. This FFI is consumed by a new Python SDK, `anvil-torch`, which provides a lazy-loading mechanism for PyTorch tensors, enabling more efficient memory usage and faster model loading for large models.
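
A tensor cache with hit, miss, and bytes-fetched counters might look roughly like the sketch below. It is a std-only illustration, not the `anvil-ffi` implementation: `TensorCache`, `CacheMetrics`, and `get_or_fetch` are hypothetical names, and the `VecDeque`-based recency list is O(n) per hit, which a production LRU would avoid.

```rust
use std::collections::{HashMap, VecDeque};

/// Counters exposed to callers (hypothetical shape).
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct CacheMetrics {
    pub hits: u64,
    pub misses: u64,
    pub bytes_fetched: u64,
}

/// Small LRU cache for tensor bytes, keyed by tensor name.
pub struct TensorCache {
    capacity: usize,
    entries: HashMap<String, Vec<u8>>,
    /// Recency order: front is least recently used, back is most recent.
    order: VecDeque<String>,
    metrics: CacheMetrics,
}

impl TensorCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        TensorCache {
            capacity,
            entries: HashMap::new(),
            order: VecDeque::new(),
            metrics: CacheMetrics::default(),
        }
    }

    /// Return cached bytes, or call `fetch` on a miss and cache the result.
    pub fn get_or_fetch<F>(&mut self, name: &str, fetch: F) -> Vec<u8>
    where
        F: FnOnce() -> Vec<u8>,
    {
        if let Some(bytes) = self.entries.get(name).cloned() {
            self.metrics.hits += 1;
            self.touch(name);
            return bytes;
        }
        self.metrics.misses += 1;
        let bytes = fetch();
        self.metrics.bytes_fetched += bytes.len() as u64;
        // Evict the least recently used entry when the cache is full.
        if self.entries.len() >= self.capacity {
            if let Some(evicted) = self.order.pop_front() {
                self.entries.remove(&evicted);
            }
        }
        self.entries.insert(name.to_string(), bytes.clone());
        self.order.push_back(name.to_string());
        bytes
    }

    /// Move `name` to the most-recently-used position.
    fn touch(&mut self, name: &str) {
        if let Some(pos) = self.order.iter().position(|k| k == name) {
            if let Some(key) = self.order.remove(pos) {
                self.order.push_back(key);
            }
        }
    }

    pub fn metrics(&self) -> CacheMetrics {
        self.metrics
    }
}
```

Only misses add to `bytes_fetched`, so the counter reflects data actually pulled from storage rather than data served to callers.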

Key changes include:

- **FFI Layer (`anvil-ffi`):**
    - Implemented a new FFI with an `AnvilTensor` struct for binary-safe data transfer.
    - Added an LRU cache for tensors to reduce redundant data fetching.
    - Introduced metrics for cache hits, misses, and bytes fetched.
    - Improved error handling with `last_error_message`.

- **Python SDK (`anvil-sdk-py`):**
    - Created the `anvil-torch` package with an `AnvilLoaderWrapper` to interface with the new FFI.
    - Implemented `enable`, `metrics`, and `load_from_anvil` functions for seamless PyTorch integration.
    - Added end-to-end tests for streaming inference with PyTorch models.

- **Build and Test Infrastructure (`Justfile`):**
    - Added a comprehensive set of `just` commands for end-to-end testing using Docker Compose.
    - New commands streamline the process of bootstrapping Anvil, managing Hugging Face model ingestion, and running integration tests.

- **Enterprise Features:**
    - Implemented pagination for the `list_tensors` service in `anvil-enterprise`.

- **Bug Fixes and Refinements:**
    - Updated `ObjectRef` to use `Option<String>` for `version_id` to avoid empty strings.
    - Corrected linker arguments for macOS in `anvil-sdk-py-bindings`.
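
For readers unfamiliar with the "last error" pattern that an FFI function like `last_error_message` usually implies, here is a hedged sketch. Only the name `last_error_message` comes from the commit message; its signature, the thread-local storage, `tensor_byte_len`, and `last_error_string` are assumptions. A real export would also need a `no_mangle` attribute, which is left out to keep the example edition-independent.

```rust
use std::cell::RefCell;
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

thread_local! {
    /// Most recent error message recorded on this thread, if any.
    static LAST_ERROR: RefCell<Option<CString>> = RefCell::new(None);
}

/// Record an error so the foreign caller can fetch it afterwards.
fn set_last_error(msg: &str) {
    // Interior NUL bytes would make `CString::new` fail, so replace them.
    let c_msg = CString::new(msg.replace('\0', " ")).expect("NUL bytes were removed");
    LAST_ERROR.with(|slot| *slot.borrow_mut() = Some(c_msg));
}

/// Return a pointer to the last error message, or null if there is none.
/// The pointer stays valid until the next error is recorded on this thread.
pub extern "C" fn last_error_message() -> *const c_char {
    LAST_ERROR.with(|slot| match slot.borrow().as_ref() {
        Some(msg) => msg.as_ptr(),
        None => std::ptr::null(),
    })
}

/// Hypothetical fallible FFI call: returns -1 and records a message on error.
pub extern "C" fn tensor_byte_len(index: u32) -> i64 {
    const TENSOR_SIZES: [i64; 3] = [1024, 4096, 16];
    match TENSOR_SIZES.get(index as usize) {
        Some(len) => *len,
        None => {
            set_last_error(&format!("tensor index {} is out of range", index));
            -1
        }
    }
}

/// How a caller might decode the message (shown in Rust for brevity).
pub fn last_error_string() -> Option<String> {
    let ptr = last_error_message();
    if ptr.is_null() {
        return None;
    }
    // SAFETY: the pointer refers to a live, NUL-terminated CString owned by
    // this thread's slot, and no new error is recorded before we copy it.
    let msg = unsafe { CStr::from_ptr(ptr) };
    Some(msg.to_string_lossy().into_owned())
}
```

As with `errno`, the message in this sketch is only meaningful immediately after a call that signalled failure, since a successful call does not clear it.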
@zcourts zcourts merged commit c8ab349 into main Nov 13, 2025
1 check passed
@zcourts zcourts deleted the feature/hf branch November 13, 2025 23:58