FATES write file error, Invalid dimension ID or name, nf_mod.F90 #1333

Closed
danielletijerina opened this issue Feb 13, 2025 · 3 comments

@danielletijerina

Describe the issue

I am encountering an issue when running the coupled ELM-FATES model for a gridded domain. When the model tries to write an output file, it fails with an error saying the file has an invalid dimension ID or name:

1:  Opened file ./testYF_FATESonly1.IELMFATES.pm-cpu.gnu..2025-02-10.elm.h0.2011-01.nc to write           5
1:  pio_support::pio_die:: myrank=   -1 : ERROR: nf_mod.F90:   1293 : NetCDF: Invalid dimension ID or name

I have tested the same configuration with only ELM (changing only export COMPSET=IELMFATES to export COMPSET=IELM), and this produced viable output for multiple months of simulation. As a result, I am more confident that this is a FATES issue and not an ELM one.

Things I have tried and/or checked which have all resulted in the same file-write error:

  • Changed all NetCDF files to a consistent type - “netcdf-4 classic model”
  • Changed the PIO version from 1 to 2 (this produced a different error)
  • Checked the dimensions for domain and surface files
  • Checked the dimensions for the forcing files
  • Replaced the FATES parameter file with one that is compatible with the FATES version
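
One quick way to cross-check the "consistent type" item is to read the leading bytes of each input file. The following is a minimal sketch in Python, not part of the E3SM or FATES tooling; the function names `netcdf_container` and `group_by_container` are hypothetical. It assumes the HDF5 signature sits at byte 0 of the file, and it cannot distinguish netCDF-4 from "netCDF-4 classic model", since both use the HDF5 container.

```python
from pathlib import Path

# Leading bytes that identify the on-disk netCDF container format.
_MAGIC = {
    b"CDF\x01": "classic (CDF-1)",
    b"CDF\x02": "64-bit offset (CDF-2)",
    b"CDF\x05": "64-bit data (CDF-5)",
}
_HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def netcdf_container(path):
    """Return a label for the container format of the file at `path`."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header == _HDF5_SIGNATURE:
        # netCDF-4 and "netCDF-4 classic model" share the HDF5 container,
        # so the signature alone cannot tell them apart.
        return "HDF5-based (netCDF-4 or netCDF-4 classic model)"
    return _MAGIC.get(header[:4], "unknown")


def group_by_container(paths):
    """Map each container label to the sorted list of file names that use it."""
    groups = {}
    for path in paths:
        groups.setdefault(netcdf_container(path), []).append(Path(path).name)
    return {label: sorted(names) for label, names in groups.items()}
```

If `group_by_container` returns more than one label for the domain, surface, forcing, and parameter files, the inputs are not all in the same container format. A single label only confirms the container, not that the dimensions inside the files agree.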

Additionally, I need to use a specific version of ELM-FATES that can be coupled with the ParFlow hydrologic model. The versions I am using are:
E3SM - v2.0.0-alpha.2-22950-g6ece06802d
FATES - sci.1.61.0_api.25.0.0-2-g2105ed18

I have attached the E3SM and Land logs, as well as my create case script. Thanks!

lnd.log.35707822.250210-202354.txt
e3sm.log.35707822.250210-202354.txt

er_dtk_scratch_FATES_ELM.sh.txt

Relevant log output

1:  Opened file ./testYF_FATESonly1.IELMFATES.pm-cpu.gnu..2025-02-10.elm.h0.2011-01.nc to write           5
 1:  pio_support::pio_die:: myrank=          -1 : ERROR: nf_mod.F90:        1293 : NetCDF: Invalid dimension ID or name
21:  pio_support::pio_die:: myrank=          -1 : ERROR: nf_mod.F90:        1293 : NetCDF: Invalid dimension ID or name
31:  pio_support::pio_die:: myrank=          -1 : ERROR: nf_mod.F90:        1293 : NetCDF: Invalid dimension ID or name
 1: MPICH ERROR [Rank 1] [job id 35707822.0] [Mon Feb 10 20:39:47 2025] [nid006584] - Abort(1) (rank 1 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
 1: 
11:  pio_support::pio_die:: myrank=          -1 : ERROR: nf_mod.F90:        1293 : NetCDF: Invalid dimension ID or name
 1: aborting job:
 1: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
11: MPICH ERROR [Rank 11] [job id 35707822.0] [Mon Feb 10 20:39:47 2025] [nid006584] - Abort(1) (rank 11 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 11
11: 
11: aborting job:
11: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 11
21: MPICH ERROR [Rank 21] [job id 35707822.0] [Mon Feb 10 20:39:47 2025] [nid006584] - Abort(1) (rank 21 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 21
21: 
21: aborting job:
21: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 21
31: MPICH ERROR [Rank 31] [job id 35707822.0] [Mon Feb 10 20:39:47 2025] [nid006584] - Abort(1) (rank 31 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 31
31: 
31: aborting job:
31: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 31
srun: error: nid006584: tasks 1,11,21,31: Exited with exit code 255
srun: Terminating StepId=35707822.0
 0: slurmstepd: error: *** STEP 35707822.0 ON nid006584 CANCELLED AT 2025-02-11T04:39:49 ***
srun: error: nid006584: tasks 0,2-10,12-20,22-30,32-39: Terminated
srun: Force Terminated StepId=35707822.0
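
The repeated pio_die lines in the log above can be condensed into one entry per distinct error. The following is a minimal sketch in Python, assuming the log keeps the `task: message` prefix shown above; `summarize_pio_errors` is a hypothetical helper, not part of E3SM, CIME, or PIO.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   " 1:  pio_support::pio_die:: myrank=  -1 : ERROR: nf_mod.F90:  1293 : NetCDF: ..."
_PIO_DIE = re.compile(
    r"^\s*(?P<task>\d+):\s+pio_support::pio_die::.*?ERROR:\s*"
    r"(?P<source>\S+):\s*(?P<line>\d+)\s*:\s*(?P<message>.+?)\s*$"
)


def summarize_pio_errors(log_text):
    """Group pio_die errors as (source file, line, message) -> sorted task ids."""
    tasks = defaultdict(set)
    for raw in log_text.splitlines():
        match = _PIO_DIE.match(raw)
        if match:
            key = (match["source"], int(match["line"]), match["message"])
            tasks[key].add(int(match["task"]))
    return {key: sorted(ids) for key, ids in tasks.items()}
```

Applied to the excerpt above, this yields a single entry for `nf_mod.F90` line 1293 with tasks 1, 11, 21, and 31, which matches the tasks that srun reports as exiting with code 255.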

FATES tag

sci.1.61.0_api.25.0.0-2-g2105ed18

Host land model tag

ELM - v2.0.0-alpha.2-22950-g6ece06802d

Machine

perlmutter

Other supported machine name

No response

Additional context

No response

@glemieux
Contributor

glemieux commented Feb 24, 2025

@JessicaNeedham suggests checking whether the climate forcing data is missing a variable.

@glemieux
Contributor

Using PIO_DEBUG_LEVEL=6, @danielletijerina and I were able to track this down. It appears to be caused by this line missing from the htape_create subroutine: https://github.com/E3SM-Project/E3SM/blob/9cfde3a73bbd26e50dced09ecb0c3fbf54ea86c6/components/elm/src/main/histFileMod.F90#L1941C1-L1942C1

We're trying to assess why it is missing.
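
For readers unfamiliar with this class of error: a history file is created by defining dimensions first and then defining variables that refer to those dimensions. If one dimension definition is skipped, every variable that uses it fails at definition time. The following is a toy sketch in Python of that check only; it is not ELM or PIO code, and all dimension and variable names are hypothetical placeholders.

```python
def find_undefined_dimensions(defined_dims, variables):
    """Return {variable name: [dimension names it uses that were never defined]}.

    `defined_dims` is an iterable of dimension names already defined in the file.
    `variables` maps each variable name to the ordered list of dimensions it uses.
    """
    defined = set(defined_dims)
    missing = {}
    for name, dims in variables.items():
        absent = [dim for dim in dims if dim not in defined]
        if absent:
            missing[name] = absent
    return missing


# Hypothetical example: "dim_c" was never defined, so "var_b" cannot be created.
defined = ["time", "dim_a", "dim_b"]
variables = {
    "var_a": ["time", "dim_a", "dim_b"],
    "var_b": ["time", "dim_c"],
}
problems = find_undefined_dimensions(defined, variables)
```

In this toy case `problems` contains only `var_b`, with `dim_c` as the undefined dimension.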

@glemieux
Contributor

Since we've determined the cause of the original issue and the rest of the troubleshooting is not specific to FATES, I'm going to close this out.

@github-project-automation github-project-automation bot moved this from ❕Todo to ✔ Done in FATES issue board Mar 27, 2025