Add sanitization functions and netcdf export (with cleanup) to FileIO #173

Merged
pbeaucage merged 3 commits into main from 97-sanitize-attrs-prior-export
Apr 1, 2025

Conversation

@pbeaucage
Collaborator

The current FileIO module is not super useful for two reasons:

  • it fails on attributes that are not JSON serializable
  • netCDF export doesn't work with multiindexes

This PR attempts to resolve these by adding attrs sanitization functions to the FileIO module and by adding a netcdf export that automatically unstacks multiindexes.
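As a rough illustration of the first problem, the sketch below shows one way attrs could be made JSON serializable. It is not the code from this PR: `to_jsonable` and `sanitize_attrs_sketch` are hypothetical names, and the fallback rules (ISO strings for datetimes, `tolist()` for array-like objects, `str()` for everything else) are assumptions.

```python
import datetime
import json


def to_jsonable(value):
    """Hypothetical helper: return a JSON-serializable stand-in for value."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    if isinstance(value, dict):
        # JSON object keys must be strings.
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [to_jsonable(v) for v in value]
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    if hasattr(value, "tolist"):
        # Array-like objects (for example numpy arrays and scalars) expose tolist().
        return to_jsonable(value.tolist())
    # Assumption: anything else is stored as its string representation.
    return str(value)


def sanitize_attrs_sketch(attrs):
    """Hypothetical helper: sanitize every entry of an attrs dict."""
    return {str(k): to_jsonable(v) for k, v in attrs.items()}


attrs = {
    "energy": 285.1,
    "start": datetime.datetime(2025, 1, 23, 12, 0),
    "shape": (2, 3),
    "meta": {"sample": "A1", 1: {7}},
    "z": 1 + 2j,
}
clean = sanitize_attrs_sketch(attrs)
encoded = json.dumps(clean)  # no TypeError once the attrs are sanitized
```

Calling `json.dumps(attrs)` directly on the unsanitized dict would raise a `TypeError` because of the datetime, set, and complex values.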

@pbeaucage pbeaucage linked an issue Jan 23, 2025 that may be closed by this pull request
@pbeaucage pbeaucage requested a review from martintb April 1, 2025 16:21
Collaborator

@martintb martintb left a comment

See my comments. They're style-focused, so I'm going to click approve and let you decide whether you want to address them.

Returns:
xarray.DataArray or xarray.Dataset: A copy of the input object with sanitized attributes.
"""
def sanitize_value(value):
Collaborator

I dislike the nested function, but Copilot seems to think it's justified, so I'm not sure what to say. A general utility for jsonifying Python objects doesn't seem like a reasonable thing to put in a utils.py or even in this module.


"""
da = self._obj
# sanitize attrs and make netcdf safe by converting dicts to json strings
Collaborator

Why are there two functions that clean up attrs? Is there a case where they would be used separately?

Collaborator Author

NetCDF-safe is a stricter standard than JSON-serializable.
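A minimal sketch of the difference, assuming netCDF attribute values are limited to strings, numbers, and flat sequences of numbers: a nested dict is already valid JSON, but it has to be flattened (here, to a JSON string, as the code comment under review describes) before it can be written as an attribute. `make_netcdf_safe` is a hypothetical name, and the handling of `None` and booleans is an assumption, not this PR's behavior.

```python
import json
from numbers import Number


def make_netcdf_safe(attrs):
    """Hypothetical helper: flatten JSON-serializable attrs into netCDF-friendly values."""
    safe = {}
    for key, value in attrs.items():
        if isinstance(value, bool):
            # Assumption: store booleans as integers.
            safe[key] = int(value)
        elif isinstance(value, (str, Number)):
            safe[key] = value
        elif value is None:
            # Assumption: represent a missing value as an empty string.
            safe[key] = ""
        elif isinstance(value, (list, tuple)) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in value
        ):
            # Flat numeric sequences can be stored as 1-D arrays.
            safe[key] = list(value)
        else:
            # Dicts and nested or mixed sequences become JSON strings.
            safe[key] = json.dumps(value)
    return safe


attrs = {
    "energy": 285.1,
    "meta": {"sample": "A1", "tags": ["x", "y"]},
    "flag": True,
    "note": None,
    "shape": [2, 3],
}
as_json = json.dumps(attrs)  # works: these attrs are already JSON serializable
safe = make_netcdf_safe(attrs)  # but "meta", "flag", and "note" still need conversion
```

A reader of the exported file can recover the nested structure with `json.loads(safe["meta"])`.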

@pbeaucage pbeaucage merged commit 7000f63 into main Apr 1, 2025
16 checks passed
@pbeaucage pbeaucage deleted the 97-sanitize-attrs-prior-export branch April 21, 2025 14:11

Development

Successfully merging this pull request may close these issues.

Export Function for (Integrated) Datasets?

2 participants