@maxrjones
Member

Using Claude to try out various caching mechanisms with the ObstoreReader.

@maxrjones maxrjones changed the title Experiment with different obstore reading cache methods [Not for merging] Experiment with different obstore reading cache methods Dec 19, 2025
@maxrjones maxrjones marked this pull request as draft December 19, 2025 23:15
@TomNicholas

This seems cool - would be great to know how it performs.

@maxrjones
Member Author

> This seems cool - would be great to know how it performs.

[Image: output from the scripts in this PR]

@TomNicholas

TomNicholas commented Jan 9, 2026

Okay, so IIUC we should expect parsing an HDF file to extract virtual references to be roughly 4-5x faster using obstore_hybrid than naively downloading the whole file, which is the workaround I've used previously. That's great.

(I feel like fetching data with Icechunk should be faster than 9.85s, but that's a separate problem for another time.)

@maxrjones
Member Author

> Okay, so IIUC we should expect parsing an HDF file to extract virtual references to be roughly 4-5x faster using obstore_hybrid than naively downloading the whole file, which is the workaround I've used previously. That's great.

Yeah, it'll depend on the file structure and such, but you should be able to find a reasonable choice based on a single file and apply it to all of them.

> (I feel like fetching data with Icechunk should be faster than 9.85s, but that's a separate problem for another time.)

Yeah, the slowness of loading a spatial subset is weird. There could be a bug.
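
For readers unfamiliar with the idea behind the caching readers compared in this thread, here is a minimal, generic sketch of a block-caching file-like reader built on range requests. It is not the PR's implementation and does not use the obstore API: `BlockCachedReader`, `fetch_range`, and `fake_fetch` are hypothetical names, and the in-memory `payload` stands in for a remote object. It only illustrates why a parser that makes many small reads can avoid both downloading the whole file and issuing one request per read.

```python
from collections import OrderedDict
from typing import Callable


class BlockCachedReader:
    """Hypothetical file-like reader that caches fixed-size blocks (LRU).

    `fetch_range(start, end)` is an assumed callable returning bytes for the
    half-open range [start, end); in practice it would wrap a range request.
    """

    def __init__(self, fetch_range: Callable[[int, int], bytes], size: int,
                 block_size: int = 1024 * 1024, max_blocks: int = 32) -> None:
        self._fetch = fetch_range
        self._size = size
        self._block_size = block_size
        self._max_blocks = max_blocks
        self._blocks: "OrderedDict[int, bytes]" = OrderedDict()
        self._pos = 0

    def _block(self, index: int) -> bytes:
        # Serve from cache when possible, marking the block as recently used.
        if index in self._blocks:
            self._blocks.move_to_end(index)
            return self._blocks[index]
        start = index * self._block_size
        end = min(start + self._block_size, self._size)
        data = self._fetch(start, end)
        self._blocks[index] = data
        # Evict the least recently used block once the cache is full.
        if len(self._blocks) > self._max_blocks:
            self._blocks.popitem(last=False)
        return data

    def seek(self, offset: int, whence: int = 0) -> int:
        base = {0: 0, 1: self._pos, 2: self._size}[whence]
        self._pos = max(0, min(base + offset, self._size))
        return self._pos

    def tell(self) -> int:
        return self._pos

    def read(self, n: int = -1) -> bytes:
        end = self._size if n < 0 else min(self._pos + n, self._size)
        chunks = []
        pos = self._pos
        while pos < end:
            index, offset = divmod(pos, self._block_size)
            block = self._block(index)
            take = min(end - pos, len(block) - offset)
            if take <= 0:  # short fetch; stop rather than loop forever
                break
            chunks.append(block[offset:offset + take])
            pos += take
        self._pos = pos
        return b"".join(chunks)


# Demo against an in-memory payload standing in for a remote file.
payload = bytes(range(256)) * 64  # 16 KiB
calls = []


def fake_fetch(start: int, end: int) -> bytes:
    calls.append((start, end))
    return payload[start:end]


reader = BlockCachedReader(fake_fetch, len(payload), block_size=4096)
reader.seek(10)
header = reader.read(100)
reader.seek(50)
again = reader.read(20)
```

In this demo both reads fall inside the first 4096-byte block, so only one range request is recorded in `calls`. The block size and cache capacity are the kind of settings that would need tuning per file layout.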
