Skip to content

Merge backend fixes, API integration, and cleanup#82

Merged
Dinghye merged 6 commits into
mainfrom
backend
May 14, 2026
Merged

Merge backend fixes, API integration, and cleanup#82
Dinghye merged 6 commits into
mainfrom
backend

Conversation

@Dinghye
Copy link
Copy Markdown
Collaborator

@Dinghye Dinghye commented May 13, 2026

Summary

This PR merges the backend branch and consolidates a set of backend-focused fixes and cleanup changes.

The branch primarily:

  • fixes provider behavior in the backend flow
  • re-merges the API file to restore the intended backend integration
  • removes obsolete embedder utility code
  • adds fetch-data example/docs updates to improve backend usage guidance

What Changed

  • Fixed backend provider behavior to make request/response handling more consistent.
  • Re-integrated the API file after the previous merge left the backend path in an inconsistent state.
  • Removed outdated embedder utility code that is no longer needed by the current backend implementation.
  • Ran formatting/cleanup changes to align the codebase with Ruff and reduce lint noise.
  • Added documentation/examples for fetch-data usage to make backend workflows easier to follow.

Testing

How did you verify the change?

  • I ran the relevant tests locally.
  • I updated docs for user-facing behavior changes.
  • I updated CHANGELOG.md for user-facing API, model, semantic, or installation changes.
  • This PR does not need a changelog entry.

Notes

Anything reviewers should know about scope, tradeoffs, or follow-up work.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR merges backend-focused fixes and refactors provider-related responsibilities by splitting backend resolution, provider fetch helpers, and runtime/device logic into clearer modules, while updating embedders, pipelines, tests, and documentation to match the new structure.

Changes:

  • Introduces providers.resolution (backend selection/provider lifecycle) and providers.fetch (shared fetch helpers + inspection helpers), removing the legacy embedders/runtime_utils.py.
  • Removes NormalizationSpec from ModelInputSpec and standardizes fetch helpers to return raw provider values (normalization now happens inside each embedder’s get_embedding() path).
  • Moves/renames embedder utilities (meta_utilsmeta, config_utilsconfig) and updates tests/docs accordingly.

Reviewed changes

Copilot reviewed 52 out of 54 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/test_wildsat_variant_config.py Adjusts mocked S2 values to raw-DN expectations.
tests/test_tile_fetch_overrides.py Updates monkeypatch targets to providers.fetch.
tests/test_satmaepp_preprocess_alignment.py Updates S2 RGB fetch mocking to new provider fetch helper.
tests/test_runtime_utils_fetch.py Repoints tests from removed runtime utils to providers.fetch.
tests/test_meta_utils.py Updates imports from embedders.meta_utils to embedders.meta.
tests/test_inspect.py Updates internal patch points for inspection flow (but still imports rs_embed.inspect).
tests/test_export_batch.py Updates export tests to patch providers.fetch helpers; removes NormalizationSpec usage.
tests/test_export_batch_terrafm_s1_fetch.py Updates patch points to providers.fetch.
tests/test_embedding.py Updates imports from embedders.meta_utils to embedders.meta.
tests/test_device_resolution.py Moves resolve_device_auto_torch import to tools.runtime.
tests/test_batch_overrides_all_models.py Updates raw input expectations in batch prefetch tests.
tests/test_backend_resolution.py Updates patch points to tools.runtime / providers.resolution.
src/rs_embed/tools/runtime.py Hosts device resolution helpers; uses new provider fetch helper for API-side input prep.
src/rs_embed/tools/normalization.py Adds coerce_input_to_tchw / coerce_single_input_chw helpers and updates provider-default logic import.
src/rs_embed/providers/resolution.py New: centralized provider backend resolution and provider creation/caching.
src/rs_embed/providers/gee.py Refactors GEE provider to use shared gee_utils helpers and adds fetch_sensor_patch_chw.
src/rs_embed/providers/gee_utils.py Extracts/organizes GEE helpers; adds generic bbox-splitting fallback utility.
src/rs_embed/providers/fetch.py New: shared fetch helpers (S2 RGB, S1 VV/VH, multiframe S2) + fetch-result inspection helper.
src/rs_embed/providers/base.py Removes default S1 fetch stub and removes GEE-coupled default fetch_sensor_patch_chw implementation.
src/rs_embed/pipelines/prefetch.py Switches prefetch manager to provider-agnostic fetch/inspect helpers and plan builder rename.
src/rs_embed/pipelines/point_payload.py Switches payload builder to provider-agnostic fetch/inspect helpers.
src/rs_embed/pipelines/exporter.py Switches exporter to provider-agnostic fetch/inspect helpers.
src/rs_embed/inspect.py Removed; inspection API moved into api.py.
src/rs_embed/export.py Removed; export_npz wrapper removed from package.
src/rs_embed/embedders/runtime_utils.py Removed; functionality split into providers.* and tools.*.
src/rs_embed/embedders/precomputed_tessera.py Updates meta helper import to embedders.meta.
src/rs_embed/embedders/precomputed_gse_annual.py Updates provider helper imports (providers.fetch/providers.resolution) and meta import.
src/rs_embed/embedders/precomputed_copernicus_embed.py Updates meta helper import to embedders.meta.
src/rs_embed/embedders/onthefly_wildsat.py Uses providers.fetch for S2 RGB raw fetch; localizes helper functions; removes normalization spec.
src/rs_embed/embedders/onthefly_thor.py Uses providers.fetch + tools.runtime; localizes helper functions; removes normalization spec.
src/rs_embed/embedders/onthefly_terramind.py Uses providers.fetch + tools.*; localizes helper functions; removes normalization spec.
src/rs_embed/embedders/onthefly_terrafm.py Uses providers.fetch + tools.*; removes normalization spec.
src/rs_embed/embedders/onthefly_scalemae.py Removes _vit_mae_utils dependency; localizes fetch/preprocess/meta helpers.
src/rs_embed/embedders/onthefly_satvision_toa.py Removes _vit_mae_utils dependency; localizes token/meta helpers; uses providers.fetch.
src/rs_embed/embedders/onthefly_satmaepp.py Removes _vit_mae_utils dependency; localizes fetch/token/meta helpers.
src/rs_embed/embedders/onthefly_satmaepp_s2.py Removes _vit_mae_utils dependency; localizes meta helper; uses providers.fetch.
src/rs_embed/embedders/onthefly_satmae.py Removes _vit_mae_utils dependency; localizes preprocess/meta/token helpers.
src/rs_embed/embedders/onthefly_remoteclip.py Switches S2 RGB fetch to raw DN; localizes CHW→u8 helper; updates meta import.
src/rs_embed/embedders/onthefly_prithvi.py Switches fetch/runtime/meta helpers to new modules; localizes token/meta helpers.
src/rs_embed/embedders/onthefly_galileo.py Switches to providers.fetch and moves input coercion to tools.normalization.
src/rs_embed/embedders/onthefly_fomo.py Updates runtime/meta imports and ensures torch helper localization.
src/rs_embed/embedders/onthefly_dofa.py Switches to providers.fetch/tools.* and updates meta import.
src/rs_embed/embedders/onthefly_anysat.py Switches to providers.fetch/tools.* and updates meta import.
src/rs_embed/embedders/onthefly_agrifm.py Switches to providers.fetch/tools.runtime and updates meta import.
src/rs_embed/embedders/meta.py New: consolidated temporal/meta helpers used across embedders.
src/rs_embed/embedders/config.py New: consolidated model_config parsing helpers.
src/rs_embed/embedders/base.py Routes provider caching through providers.resolution; clarifies raw-fetch contract.
src/rs_embed/embedders/_vit_mae_utils.py Removed; functionality moved into individual embedder modules.
src/rs_embed/core/specs.py Removes NormalizationSpec and updates ModelInputSpec documentation accordingly.
src/rs_embed/api.py Adds inspect_provider_patch / inspect_gee_patch to public API; uses providers.fetch + providers.resolution.
src/rs_embed/init.py Re-exports new inspection entrypoints; drops export_npz.
docs/extending.md Updates guidance for provider-backed fetching and model_config flow; adds examples.
docs/architecture.md Documents new provider submodules (resolution.py, fetch.py).
CHANGELOG.md Documents normalization and module refactor changes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/test_inspect.py
Comment on lines 1 to +3
import numpy as np

import rs_embed.inspect as inspect_mod

Comment thread src/rs_embed/__init__.py
Comment on lines 11 to 20
from .api import (
describe_model,
export_batch,
get_embedding,
get_embeddings_batch,
inspect_gee_patch,
inspect_provider_patch,
list_models,
reset_runtime,
)
Comment thread CHANGELOG.md
Comment on lines +13 to +20
- **Normalization responsibility moved entirely to embedders.** `NormalizationSpec` has been removed from `ModelInputSpec` and from `rs_embed.core.specs`. The `apply_normalization()` helper in `rs_embed.providers.fetch` has been removed. `fetch_input()` and all fetch helpers (`fetch_collection_patch_chw`, `fetch_s2_rgb_chw`, `fetch_s2_multiframe_raw_tchw`, `fetch_s1_vvvh_raw_chw`) now consistently return raw provider values (S2 DN in [0, 10000], S1 linear float, etc.); each embedder applies its own normalization inside `get_embedding()`. This eliminates a misleading contract where `ModelInputSpec.normalization` was declared but never automatically applied, and removes a normalize→denormalize round-trip that existed in some batch paths (remoteclip, wildsat).

- **Internal refactor: provider and embedder layer separation.** The `embedders/runtime_utils.py` grab-bag module has been removed and its contents redistributed by responsibility:
- Provider selection and lifecycle management → `providers/resolution.py` (`default_provider_backend_name`, `resolve_provider_backend_name`, `is_provider_backend`, `get_cached_provider`, `create_provider_for_backend`)
- Provider fetch helpers and satellite normalization → `providers/fetch.py` (`fetch_sensor_patch_chw`, `fetch_collection_patch_chw`, `fetch_s2_rgb_chw`, `fetch_s1_vvvh_raw_chw`, `fetch_s2_multiframe_raw_tchw`, `normalize_s1_vvvh_chw`, `apply_normalization`, etc.)
- Device resolution and embedder instance loading → `tools/runtime.py` (`resolve_device_auto_torch`, `load_cached_with_device`)
- Input coercion → `tools/normalization.py` (`coerce_input_to_tchw`, `coerce_single_input_chw`)
- This fixes an inverted dependency where `tools/` was importing from `embedders/`.
Comment thread docs/extending.md
relax_iw_on_empty=True, # fall back if no IW images found
)
# raw: float32 [2, H, W] — channel 0 = VV, channel 1 = VH
# meta: {"iw_used": bool, "orbit": str | None, ...}
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 54 out of 56 changed files in this pull request and generated 3 comments.

Comment thread tests/test_cli_parsers.py Outdated
Comment on lines +7 to +15
import rs_embed.api as api

_export_module = types.ModuleType("rs_embed.export")
_export_module.export_npz = api.export_batch
sys.modules.setdefault("rs_embed.export", _export_module)

_inspect_module = types.ModuleType("rs_embed.inspect")
_inspect_module.inspect_gee_patch = api.inspect_gee_patch
sys.modules.setdefault("rs_embed.inspect", _inspect_module)
Comment thread src/rs_embed/api.py
Comment on lines +589 to +594
x_chw = _fetch_sensor_patch_chw(
provider,
spatial=spatial,
temporal=temporal,
sensor=sensor,
)
Comment on lines +414 to 428
def _fetch_multiframe_impl(
self,
*,
spatial: SpatialSpec,
temporal: TemporalSpec,
collection: str,
bands: Sequence[str],
n_frames: int = 8,
scale_m: int = 10,
cloudy_pct: int | None = 30,
composite: str = "median",
fill_value: float = 0.0,
) -> np.ndarray:
import ee

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 55 out of 57 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

src/rs_embed/embedders/onthefly_remoteclip.py:62

  • _fetch_s2_rgb_chw now delegates to providers.fetch.fetch_s2_rgb_chw, which returns raw S2 DN in [0,10000]. However, RemoteCLIP’s get_embedding path treats the returned array as normalized [0,1] (it runs value_range=(0,1) checks and generates quicklooks with vmin/vmax=0..1). Either normalize here (divide by 10000 and clip to [0,1]) or update downstream code to consistently treat s2_rgb_chw as raw DN.
def _fetch_s2_rgb_chw(
    provider: ProviderBase,
    spatial: SpatialSpec,
    temporal: TemporalSpec,
    *,
    scale_m: int = 10,
    cloudy_pct: int = 30,
    composite: str = "median",
) -> np.ndarray:
    return _fetch_s2_rgb_chw_shared(
        provider,
        spatial=spatial,
        temporal=temporal,
        scale_m=int(scale_m),
        cloudy_pct=int(cloudy_pct),
        composite=str(composite),
    )



def _s2_rgb_u8_from_chw(s2_chw):
x = np.clip(np.asarray(s2_chw, dtype=np.float32) / 10000.0, 0.0, 1.0)
Comment thread src/rs_embed/api.py
Comment on lines +589 to +594
x_chw = _fetch_sensor_patch_chw(
provider,
spatial=spatial,
temporal=temporal,
sensor=sensor,
)
Comment thread docs/extending.md
relax_iw_on_empty=True, # fall back if no IW images found
)
# raw: float32 [2, H, W] — channel 0 = VV, channel 1 = VH
# meta: {"iw_used": bool, "orbit": str | None, ...}
Comment on lines +214 to +219
scale_m=int(scale_m),
cloudy_pct=(int(cloudy_pct) if cloudy_pct is not None else None),
composite=str(composite),
fill_value=float(fill_value),
# fetch_fn=_fetch,
)
@Dinghye Dinghye merged commit b9df0f7 into main May 14, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants