🐛[BUG]: NetCDF4 backend synchronization issue #546

@jleinonen

Version

Latest main

On which installation method(s) does this occur?

source

Describe the issue

There seems to be a subtle issue with NetCDF4 files not being fully flushed to disk, which causes problems when moving or renaming the file immediately after writing.

On my system, writing directly to the final location works fine (output_file is on a Lustre file system):

results = NetCDF4Backend(output_file)
run.deterministic([start_time], num_steps, model, source, results, device=device)

ds = xr.open_dataset(output_file)
print(float(ds["t2m"].max()))

outputs 304.91934.

Meanwhile this leads to an output file that consists of fill values:

results = NetCDF4Backend("/tmp/tmp_file.nc")
run.deterministic([start_time], num_steps, model, source, results, device=device)
shutil.move("/tmp/tmp_file.nc", output_file)

ds = xr.open_dataset(output_file)
print(float(ds["t2m"].max()))

outputs 9.969209968386869e+36, which is the default netCDF fill value for floats. Note that this involves a move from the local /tmp file system to Lustre.

Calling .sync() on the underlying NetCDF4 Dataset before the move fixes the issue:

results = NetCDF4Backend("/tmp/tmp_file.nc")
run.deterministic([start_time], num_steps, model, source, results, device=device)
results.root.sync()  # <-- ADDED
shutil.move("/tmp/tmp_file.nc", output_file)

ds = xr.open_dataset(output_file)
print(float(ds["t2m"].max()))

outputs 305.44061 (the model is probabilistic, so it's normal to have a different result).

This is a quick fix, but I don't think it's a great solution because it breaks the interchangeability of IO backends: results.root.sync() will fail if results is another backend that doesn't have this attribute. Would it be reasonable to call self.root.sync() at the end of the write method of NetCDF4Backend? I'm not sure whether this would have adverse performance effects.
