Conversation
There was a problem hiding this comment.
Pull request overview
This PR merges backend-focused fixes and refactors provider-related responsibilities by splitting backend resolution, provider fetch helpers, and runtime/device logic into clearer modules, while updating embedders, pipelines, tests, and documentation to match the new structure.
Changes:
- Introduces
providers.resolution(backend selection/provider lifecycle) andproviders.fetch(shared fetch helpers + inspection helpers), removing the legacyembedders/runtime_utils.py. - Removes
NormalizationSpecfromModelInputSpecand standardizes fetch helpers to return raw provider values (normalization now happens inside each embedder’sget_embedding()path). - Moves/renames embedder utilities (
meta_utils→meta,config_utils→config) and updates tests/docs accordingly.
Reviewed changes
Copilot reviewed 52 out of 54 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_wildsat_variant_config.py | Adjusts mocked S2 values to raw-DN expectations. |
| tests/test_tile_fetch_overrides.py | Updates monkeypatch targets to providers.fetch. |
| tests/test_satmaepp_preprocess_alignment.py | Updates S2 RGB fetch mocking to new provider fetch helper. |
| tests/test_runtime_utils_fetch.py | Repoints tests from removed runtime utils to providers.fetch. |
| tests/test_meta_utils.py | Updates imports from embedders.meta_utils to embedders.meta. |
| tests/test_inspect.py | Updates internal patch points for inspection flow (but still imports rs_embed.inspect). |
| tests/test_export_batch.py | Updates export tests to patch providers.fetch helpers; removes NormalizationSpec usage. |
| tests/test_export_batch_terrafm_s1_fetch.py | Updates patch points to providers.fetch. |
| tests/test_embedding.py | Updates imports from embedders.meta_utils to embedders.meta. |
| tests/test_device_resolution.py | Moves resolve_device_auto_torch import to tools.runtime. |
| tests/test_batch_overrides_all_models.py | Updates raw input expectations in batch prefetch tests. |
| tests/test_backend_resolution.py | Updates patch points to tools.runtime / providers.resolution. |
| src/rs_embed/tools/runtime.py | Hosts device resolution helpers; uses new provider fetch helper for API-side input prep. |
| src/rs_embed/tools/normalization.py | Adds coerce_input_to_tchw / coerce_single_input_chw helpers and updates provider-default logic import. |
| src/rs_embed/providers/resolution.py | New: centralized provider backend resolution and provider creation/caching. |
| src/rs_embed/providers/gee.py | Refactors GEE provider to use shared gee_utils helpers and adds fetch_sensor_patch_chw. |
| src/rs_embed/providers/gee_utils.py | Extracts/organizes GEE helpers; adds generic bbox-splitting fallback utility. |
| src/rs_embed/providers/fetch.py | New: shared fetch helpers (S2 RGB, S1 VV/VH, multiframe S2) + fetch-result inspection helper. |
| src/rs_embed/providers/base.py | Removes default S1 fetch stub and removes GEE-coupled default fetch_sensor_patch_chw implementation. |
| src/rs_embed/pipelines/prefetch.py | Switches prefetch manager to provider-agnostic fetch/inspect helpers and plan builder rename. |
| src/rs_embed/pipelines/point_payload.py | Switches payload builder to provider-agnostic fetch/inspect helpers. |
| src/rs_embed/pipelines/exporter.py | Switches exporter to provider-agnostic fetch/inspect helpers. |
| src/rs_embed/inspect.py | Removed; inspection API moved into api.py. |
| src/rs_embed/export.py | Removed; export_npz wrapper removed from package. |
| src/rs_embed/embedders/runtime_utils.py | Removed; functionality split into providers.* and tools.*. |
| src/rs_embed/embedders/precomputed_tessera.py | Updates meta helper import to embedders.meta. |
| src/rs_embed/embedders/precomputed_gse_annual.py | Updates provider helper imports (providers.fetch/providers.resolution) and meta import. |
| src/rs_embed/embedders/precomputed_copernicus_embed.py | Updates meta helper import to embedders.meta. |
| src/rs_embed/embedders/onthefly_wildsat.py | Uses providers.fetch for S2 RGB raw fetch; localizes helper functions; removes normalization spec. |
| src/rs_embed/embedders/onthefly_thor.py | Uses providers.fetch + tools.runtime; localizes helper functions; removes normalization spec. |
| src/rs_embed/embedders/onthefly_terramind.py | Uses providers.fetch + tools.*; localizes helper functions; removes normalization spec. |
| src/rs_embed/embedders/onthefly_terrafm.py | Uses providers.fetch + tools.*; removes normalization spec. |
| src/rs_embed/embedders/onthefly_scalemae.py | Removes _vit_mae_utils dependency; localizes fetch/preprocess/meta helpers. |
| src/rs_embed/embedders/onthefly_satvision_toa.py | Removes _vit_mae_utils dependency; localizes token/meta helpers; uses providers.fetch. |
| src/rs_embed/embedders/onthefly_satmaepp.py | Removes _vit_mae_utils dependency; localizes fetch/token/meta helpers. |
| src/rs_embed/embedders/onthefly_satmaepp_s2.py | Removes _vit_mae_utils dependency; localizes meta helper; uses providers.fetch. |
| src/rs_embed/embedders/onthefly_satmae.py | Removes _vit_mae_utils dependency; localizes preprocess/meta/token helpers. |
| src/rs_embed/embedders/onthefly_remoteclip.py | Switches S2 RGB fetch to raw DN; localizes CHW→u8 helper; updates meta import. |
| src/rs_embed/embedders/onthefly_prithvi.py | Switches fetch/runtime/meta helpers to new modules; localizes token/meta helpers. |
| src/rs_embed/embedders/onthefly_galileo.py | Switches to providers.fetch and moves input coercion to tools.normalization. |
| src/rs_embed/embedders/onthefly_fomo.py | Updates runtime/meta imports and ensures torch helper localization. |
| src/rs_embed/embedders/onthefly_dofa.py | Switches to providers.fetch/tools.* and updates meta import. |
| src/rs_embed/embedders/onthefly_anysat.py | Switches to providers.fetch/tools.* and updates meta import. |
| src/rs_embed/embedders/onthefly_agrifm.py | Switches to providers.fetch/tools.runtime and updates meta import. |
| src/rs_embed/embedders/meta.py | New: consolidated temporal/meta helpers used across embedders. |
| src/rs_embed/embedders/config.py | New: consolidated model_config parsing helpers. |
| src/rs_embed/embedders/base.py | Routes provider caching through providers.resolution; clarifies raw-fetch contract. |
| src/rs_embed/embedders/_vit_mae_utils.py | Removed; functionality moved into individual embedder modules. |
| src/rs_embed/core/specs.py | Removes NormalizationSpec and updates ModelInputSpec documentation accordingly. |
| src/rs_embed/api.py | Adds inspect_provider_patch / inspect_gee_patch to public API; uses providers.fetch + providers.resolution. |
| src/rs_embed/init.py | Re-exports new inspection entrypoints; drops export_npz. |
| docs/extending.md | Updates guidance for provider-backed fetching and model_config flow; adds examples. |
| docs/architecture.md | Documents new provider submodules (resolution.py, fetch.py). |
| CHANGELOG.md | Documents normalization and module refactor changes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
1
to
+3
| import numpy as np | ||
|
|
||
| import rs_embed.inspect as inspect_mod | ||
|
|
Comment on lines
11
to
20
| from .api import ( | ||
| describe_model, | ||
| export_batch, | ||
| get_embedding, | ||
| get_embeddings_batch, | ||
| inspect_gee_patch, | ||
| inspect_provider_patch, | ||
| list_models, | ||
| reset_runtime, | ||
| ) |
Comment on lines
+13
to
+20
| - **Normalization responsibility moved entirely to embedders.** `NormalizationSpec` has been removed from `ModelInputSpec` and from `rs_embed.core.specs`. The `apply_normalization()` helper in `rs_embed.providers.fetch` has been removed. `fetch_input()` and all fetch helpers (`fetch_collection_patch_chw`, `fetch_s2_rgb_chw`, `fetch_s2_multiframe_raw_tchw`, `fetch_s1_vvvh_raw_chw`) now consistently return raw provider values (S2 DN in [0, 10000], S1 linear float, etc.); each embedder applies its own normalization inside `get_embedding()`. This eliminates a misleading contract where `ModelInputSpec.normalization` was declared but never automatically applied, and removes a normalize→denormalize round-trip that existed in some batch paths (remoteclip, wildsat). | ||
|
|
||
| - **Internal refactor: provider and embedder layer separation.** The `embedders/runtime_utils.py` grab-bag module has been removed and its contents redistributed by responsibility: | ||
| - Provider selection and lifecycle management → `providers/resolution.py` (`default_provider_backend_name`, `resolve_provider_backend_name`, `is_provider_backend`, `get_cached_provider`, `create_provider_for_backend`) | ||
| - Provider fetch helpers and satellite normalization → `providers/fetch.py` (`fetch_sensor_patch_chw`, `fetch_collection_patch_chw`, `fetch_s2_rgb_chw`, `fetch_s1_vvvh_raw_chw`, `fetch_s2_multiframe_raw_tchw`, `normalize_s1_vvvh_chw`, `apply_normalization`, etc.) | ||
| - Device resolution and embedder instance loading → `tools/runtime.py` (`resolve_device_auto_torch`, `load_cached_with_device`) | ||
| - Input coercion → `tools/normalization.py` (`coerce_input_to_tchw`, `coerce_single_input_chw`) | ||
| - This fixes an inverted dependency where `tools/` was importing from `embedders/`. |
| relax_iw_on_empty=True, # fall back if no IW images found | ||
| ) | ||
| # raw: float32 [2, H, W] — channel 0 = VV, channel 1 = VH | ||
| # meta: {"iw_used": bool, "orbit": str | None, ...} |
Comment on lines
+7
to
+15
| import rs_embed.api as api | ||
|
|
||
| _export_module = types.ModuleType("rs_embed.export") | ||
| _export_module.export_npz = api.export_batch | ||
| sys.modules.setdefault("rs_embed.export", _export_module) | ||
|
|
||
| _inspect_module = types.ModuleType("rs_embed.inspect") | ||
| _inspect_module.inspect_gee_patch = api.inspect_gee_patch | ||
| sys.modules.setdefault("rs_embed.inspect", _inspect_module) |
Comment on lines
+589
to
+594
| x_chw = _fetch_sensor_patch_chw( | ||
| provider, | ||
| spatial=spatial, | ||
| temporal=temporal, | ||
| sensor=sensor, | ||
| ) |
Comment on lines
+414
to
428
| def _fetch_multiframe_impl( | ||
| self, | ||
| *, | ||
| spatial: SpatialSpec, | ||
| temporal: TemporalSpec, | ||
| collection: str, | ||
| bands: Sequence[str], | ||
| n_frames: int = 8, | ||
| scale_m: int = 10, | ||
| cloudy_pct: int | None = 30, | ||
| composite: str = "median", | ||
| fill_value: float = 0.0, | ||
| ) -> np.ndarray: | ||
| import ee | ||
|
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 55 out of 57 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
src/rs_embed/embedders/onthefly_remoteclip.py:62
- _fetch_s2_rgb_chw now delegates to providers.fetch.fetch_s2_rgb_chw, which returns raw S2 DN in [0,10000]. However, RemoteCLIP’s get_embedding path treats the returned array as normalized [0,1] (it runs value_range=(0,1) checks and generates quicklooks with vmin/vmax=0..1). Either normalize here (divide by 10000 and clip to [0,1]) or update downstream code to consistently treat s2_rgb_chw as raw DN.
def _fetch_s2_rgb_chw(
provider: ProviderBase,
spatial: SpatialSpec,
temporal: TemporalSpec,
*,
scale_m: int = 10,
cloudy_pct: int = 30,
composite: str = "median",
) -> np.ndarray:
return _fetch_s2_rgb_chw_shared(
provider,
spatial=spatial,
temporal=temporal,
scale_m=int(scale_m),
cloudy_pct=int(cloudy_pct),
composite=str(composite),
)
|
|
||
|
|
||
| def _s2_rgb_u8_from_chw(s2_chw): | ||
| x = np.clip(np.asarray(s2_chw, dtype=np.float32) / 10000.0, 0.0, 1.0) |
Comment on lines
+589
to
+594
| x_chw = _fetch_sensor_patch_chw( | ||
| provider, | ||
| spatial=spatial, | ||
| temporal=temporal, | ||
| sensor=sensor, | ||
| ) |
| relax_iw_on_empty=True, # fall back if no IW images found | ||
| ) | ||
| # raw: float32 [2, H, W] — channel 0 = VV, channel 1 = VH | ||
| # meta: {"iw_used": bool, "orbit": str | None, ...} |
Comment on lines
+214
to
+219
| scale_m=int(scale_m), | ||
| cloudy_pct=(int(cloudy_pct) if cloudy_pct is not None else None), | ||
| composite=str(composite), | ||
| fill_value=float(fill_value), | ||
| # fetch_fn=_fetch, | ||
| ) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR merges the
backendbranch and consolidates a set of backend-focused fixes and cleanup changes.The branch primarily:
What Changed
Testing
How did you verify the change?
CHANGELOG.mdfor user-facing API, model, semantic, or installation changes.Notes
Anything reviewers should know about scope, tradeoffs, or follow-up work.