feat: add load_export() reader API #68

Dinghye merged 2 commits into cybergis:main from amrit110:feat/load-export on Apr 3, 2026
amrit110 (Contributor) commented Apr 3, 2026

Summary

Closes #17.

Adds load_export(path) as the read-side counterpart to export_batch(), closing the write/read asymmetry where users had to manually parse NPZ array keys and JSON manifests to recover their embeddings.

The loop is now symmetric:

# Write
export_batch(
    spatials=[PointBuffer(121.5, 31.2, 2048)],
    temporal=TemporalSpec.range("2022-06-01", "2022-09-01"),
    models=["remoteclip"],
    target=ExportTarget.combined("exports/run"),
)

# Read
result = load_export("exports/run.npz")
emb = result.embedding("remoteclip")   # shape (n_items, dim)
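
Since failed spatial points are NaN-filled in partial runs (see Changes), a caller may want to drop those rows before downstream use. A minimal sketch, assuming the embedding has been converted to a plain list of rows (for example via `emb.tolist()` if it is a NumPy array); the helper name `valid_rows` is hypothetical and not part of rs_embed:

```python
import math


def valid_rows(rows):
    """Return (index, row) pairs whose values are all non-NaN.

    Hypothetical helper for illustration; not part of rs_embed.
    """
    return [
        (i, row)
        for i, row in enumerate(rows)
        if not any(math.isnan(v) for v in row)
    ]


# Stand-in data: three items with dim=2; item 1 failed and was NaN-filled.
rows = [[0.1, 0.2], [float("nan"), float("nan")], [0.5, 0.6]]
kept = valid_rows(rows)
kept_indices = [i for i, _ in kept]  # indices of the items that succeeded
```

Keeping the original indices alongside the rows lets the caller map surviving embeddings back to their spatial inputs.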

Changes

  • src/rs_embed/load.py — new module implementing load_export, ExportResult, and ModelResult
    • Auto-detects combined (single .npz/.nc file) vs per-item (directory) layout
    • NaN-fills failed spatial points in partial runs; surfaces status="partial"
    • result.embedding(model) typed shortcut; ok_models / failed_models convenience properties
    • NetCDF support via optional xarray import
  • src/rs_embed/__init__.py — exports load_export, ExportResult, ModelResult at top level
  • src/rs_embed/api.py — removes legacy private re-exports (_default_provider_backend_for_api etc.) that were kept with # noqa: F401
  • tests/test_load_export.py — 45 unit tests covering both layouts, path routing, partial failures, NaN fill, two-model runs, and error cases
  • tests/test_backend_resolution.py — updated to import from canonical location (tools/normalization) now that the api.py re-exports are gone
  • docs/api_load.md — new docs page with full API reference, data structure docs, and usage examples
  • docs/api.md, docs/api_export.md, mkdocs.yml — nav and cross-links updated
  • CHANGELOG.md: [Unreleased] entry added
  • .pre-commit-config.yaml — fixes hook id ruff -> ruff-check (the non-legacy name)
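
For readers skimming the list above, the layout auto-detection and NaN-fill behaviour can be sketched roughly as follows. This is an illustrative stdlib-only sketch, not the code in src/rs_embed/load.py: the helper names `detect_layout` and `nan_fill` are hypothetical, and the status values "ok" and "failed" are assumptions (only "partial" is named in this PR).

```python
import tempfile
from pathlib import Path

# Suffixes named in the PR description for the combined layout.
COMBINED_SUFFIXES = {".npz", ".nc"}


def detect_layout(path):
    """Classify an export path as 'combined' or 'per_item' (hypothetical helper)."""
    p = Path(path)
    if p.is_dir():
        return "per_item"  # a directory is treated as the per-item layout
    if p.is_file() and p.suffix.lower() in COMBINED_SUFFIXES:
        return "combined"  # a single .npz/.nc file
    raise FileNotFoundError(f"no export found at {p}")


def nan_fill(rows_by_index, n_items, dim):
    """Build an n_items x dim table, NaN-filling indices that have no embedding.

    Returns (table, status). "partial" comes from the PR text; "ok" and
    "failed" are assumed names for the other two cases.
    """
    nan_row = [float("nan")] * dim
    table = [list(rows_by_index.get(i, nan_row)) for i in range(n_items)]
    n_ok = sum(1 for i in range(n_items) if i in rows_by_index)
    if n_ok == n_items:
        status = "ok"
    elif n_ok > 0:
        status = "partial"
    else:
        status = "failed"
    return table, status


# Usage: an empty placeholder file is enough, since only the suffix is inspected.
with tempfile.TemporaryDirectory() as tmp:
    combined = Path(tmp) / "run.npz"
    combined.write_bytes(b"")
    layout_file = detect_layout(combined)
    layout_dir = detect_layout(tmp)

# Item 1 has no embedding, so its row is NaN-filled and the run is partial.
table, status = nan_fill({0: [0.1, 0.2], 2: [0.5, 0.6]}, n_items=3, dim=2)
```

Filling with NaN rather than dropping failed rows keeps row order aligned with the input spatial list, so row `i` always corresponds to input `i`.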

Test plan

  • python -m pytest tests/test_load_export.py -v — all 45 tests pass
  • python -m pytest — full suite (570 tests) passes
  • pre-commit run --all-files — prettier, ruff-check, ruff-format all pass
  • mkdocs build --strict — docs build is clean apart from a pre-existing extending.md error unrelated to this PR

amrit110 added 2 commits April 2, 2026 23:08
Adds load_export(path) as the read-side counterpart to export_batch(),
closing the write/read asymmetry where users had to manually parse NPZ
keys and manifest JSON to recover their embeddings.

- Auto-detects combined (single file) vs per-item (directory) layout
- Returns ExportResult with per-model ModelResult entries
- NaN-fills failed points in partial runs; surfaces status="partial"
- result.embedding(model) typed shortcut; ok_models / failed_models
- ExportResult and ModelResult exported from rs_embed top-level
- 45 unit tests covering both layouts, error cases, partial failures
- New docs page at docs/api_load.md with full API reference and examples
- Removes legacy private re-exports from api.py
- Fixes pre-commit hook name: ruff -> ruff-check

Dinghye merged commit 0341579 into cybergis:main on Apr 3, 2026
9 checks passed

Development

Successfully merging this pull request may close these issues.

Create helper function for reading/loading downloaded files

2 participants