@hrodmn (Contributor) commented Oct 16, 2025

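One thing I have struggled with when starting out with a new concept_id is finding good min/max values for the rescale parameter. This PR adds per-variable summary statistics (min, max, mean, and the 1st, 5th, 95th, and 99th percentiles) to the /compatibility response, which gives helpful context for parametrizing image requests with sensible rescale values.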

I wonder if there is a simpler way to extract these values, though. The valid_min and valid_max attributes basically describe the data type, not the actual data.

I tested it on MUR SST and quickly found that loading the entire array into memory to compute the statistics is not feasible, hence the sampling logic, which computes them on a subsample of grid cells.
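For illustration, here is a minimal sketch of one way such subsampling could work. It is not necessarily the logic in this PR: it assumes a simple uniform stride over the grid, and the names `sample_stride`, `percentile`, `summarize`, and the `max_cells` budget are all hypothetical. It uses only the standard library so the idea is visible without any array dependency.

```python
import math
import statistics


def sample_stride(shape, max_cells=1_000_000):
    """Smallest uniform stride whose strided grid has at most max_cells cells."""
    if max_cells < 1:
        raise ValueError("max_cells must be >= 1")
    stride = 1
    # ceil(n / stride) cells remain along each dimension after striding.
    while math.prod((n + stride - 1) // stride for n in shape) > max_cells:
        stride += 1
    return stride


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a pre-sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def summarize(grid, stride):
    """Statistics from every `stride`-th cell of a 2D grid, skipping NaNs."""
    sample = sorted(
        v for row in grid[::stride] for v in row[::stride] if not math.isnan(v)
    )
    return {
        "min": sample[0],
        "max": sample[-1],
        "mean": statistics.fmean(sample),
        "p01": percentile(sample, 1),
        "p99": percentile(sample, 99),
    }


# Grid shape of the MUR SST variables: 1 x 17999 x 36000 cells.
stride = sample_stride((1, 17999, 36000))

# Tiny stand-in grid (values 0..99) to show the statistics step.
grid = [[float(r * 10 + c) for c in range(10)] for r in range(10)]
stats = summarize(grid, stride=5)
```

With an xarray `DataArray`, the same stride could be applied with something like `da.isel(lat=slice(None, None, stride), lon=slice(None, None, stride))` before computing the statistics. Because only a subsample is read, the resulting min and max are approximations and can miss extreme values that fall between sampled cells.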

Here is what the /compatibility response for MUR SST looks like with the statistics added:

{
  "concept_id": "C1996881146-POCLOUD",
  "backend": "xarray",
  "datetime": [
    {
      "RangeDateTimes": [
        {
          "BeginningDateTime": "2002-05-31T21:00:00.000Z"
        }
      ]
    }
  ],
  "variables": {
    "analysed_sst": {
      "shape": [
        1,
        17999,
        36000
      ],
      "dtype": "float64",
      "min": 271.34999999999997,
      "max": 305.777,
      "mean": 285.94320627067367,
      "p01": 271.34999999999997,
      "p05": 271.34999999999997,
      "p95": 302.435,
      "p99": 303.42058999999995
    },
    "analysis_error": {
      "shape": [
        1,
        17999,
        36000
      ],
      "dtype": "float64",
      "min": 0.34,
      "max": 0.42,
      "mean": 0.3795658071202146,
      "p01": 0.35000000000000003,
      "p05": 0.35000000000000003,
      "p95": 0.4,
      "p99": 0.41000000000000003
    },
    "mask": {
      "shape": [
        1,
        17999,
        36000
      ],
      "dtype": "float32",
      "min": 1.0,
      "max": 13.0,
      "mean": 2.6793582439422607,
      "p01": 1.0,
      "p05": 1.0,
      "p95": 9.0,
      "p99": 9.0
    },
    "sea_ice_fraction": {
      "shape": [
        1,
        17999,
        36000
      ],
      "dtype": "float64",
      "min": 0.0,
      "max": 1.0,
      "mean": 0.4074759072428851,
      "p01": 0.0,
      "p05": 0.0,
      "p95": 1.0,
      "p99": 1.0
    }
  },
  "dimensions": {
    "time": 1,
    "lat": 17999,
    "lon": 36000
  },
  "coordinates": {
    "time": {
      "size": 1,
      "dtype": "datetime64[ns]"
    },
    "lat": {
      "size": 17999,
      "dtype": "float32",
      "min": -89.98999786376953,
      "max": 89.98999786376953
    },
    "lon": {
      "size": 36000,
      "dtype": "float32",
      "min": -179.99000549316406,
      "max": 180.0
    }
  },
  "example_assets": "s3://podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/20020601090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc"
}
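As a rough sketch of how these statistics might be used, the snippet below turns a variable's p01/p99 values into a `rescale` query parameter. The helper name `rescale_from_stats` is hypothetical, the `variable` query parameter name is an assumption, and the choice of p01/p99 over min/max is just one reasonable option for trimming outliers.

```python
from urllib.parse import urlencode

# Trimmed excerpt of the /compatibility response shown above.
compatibility = {
    "concept_id": "C1996881146-POCLOUD",
    "backend": "xarray",
    "variables": {
        "analysed_sst": {
            "min": 271.34999999999997,
            "max": 305.777,
            "p01": 271.34999999999997,
            "p99": 303.42058999999995,
        }
    },
}


def rescale_from_stats(response, variable, low="p01", high="p99"):
    """Build a 'min,max' rescale string from a variable's statistics."""
    var_stats = response["variables"][variable]
    return f"{var_stats[low]:.2f},{var_stats[high]:.2f}"


rescale = rescale_from_stats(compatibility, "analysed_sst")

# Query string for an image request. Parameter names other than
# concept_id and rescale are assumptions for illustration.
query = urlencode(
    {
        "concept_id": compatibility["concept_id"],
        "backend": compatibility["backend"],
        "variable": "analysed_sst",
        "rescale": rescale,
    }
)
```

Passing `low="min", high="max"` would stretch the color ramp over the full sampled range instead.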

github-actions bot commented Oct 16, 2025

📚 Documentation preview will be available at: https://developmentseed.github.io/titiler-cmr/pr-previews/pr-82/

Status: ✅ Preview is ready!

github-actions bot pushed a commit that referenced this pull request Oct 16, 2025
github-actions bot pushed a commit that referenced this pull request Oct 17, 2025
@hrodmn hrodmn marked this pull request as ready for review October 21, 2025 18:56
@hrodmn hrodmn force-pushed the feat/compatibility-stats branch from a4807d7 to 0a2f21a Compare October 21, 2025 19:20
github-actions bot pushed a commit that referenced this pull request Oct 21, 2025
@abarciauskas-bgse (Contributor) commented

Thanks @hrodmn, this looks super useful. Given the memory/compute load, perhaps we could make it optional with a compute_stats boolean parameter?
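A minimal sketch of what such an opt-in could look like, assuming a hypothetical `describe_variable` helper (not a function from this codebase) that receives the statistics computation as a callable, so no data is read unless `compute_stats` is true:

```python
from typing import Callable, Optional


def describe_variable(
    shape,
    dtype: str,
    stats_fn: Optional[Callable[[], dict]] = None,
    compute_stats: bool = False,
) -> dict:
    """Describe a variable; run the expensive stats_fn only when asked."""
    info = {"shape": list(shape), "dtype": dtype}
    if compute_stats and stats_fn is not None:
        info.update(stats_fn())
    return info


calls = []


def fake_stats() -> dict:
    """Stand-in for the real (expensive) statistics computation."""
    calls.append(1)
    return {"min": 0.0, "max": 1.0}


# Default: metadata only, the statistics callable is never invoked.
cheap = describe_variable((1, 17999, 36000), "float64", fake_stats)

# Opt-in: statistics are computed and merged into the description.
full = describe_variable(
    (1, 17999, 36000), "float64", fake_stats, compute_stats=True
)
```

Defaulting to `False` keeps the existing response shape and cost for callers who do not need the statistics.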
