Crash using netcdf-fortran >=4.6 #16

Open
hippalectryon-0 opened this issue Mar 11, 2024 · 5 comments

@hippalectryon-0

I recently recompiled ecrad using the latest netcdf-fortran library (4.6.1) and noticed a new crash with a config that previously worked fine:

*** Error defining variable pressure_hl: NetCDF: Name contains illegal characters
Error writing NetCDF file
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
ERROR STOP 1

Error termination. Backtrace:
#0  0x7f9c74223960 in ???
#1  0x7f9c742244d9 in ???
#2  0x7f9c74225bb2 in ???
#3  0x55cc96b506ed in __radiation_io_MOD_radiation_abort
        at ecrad/utilities/radiation_io.F90:52
#4  0x55cc96b4434e in __easy_netcdf_MOD_define_variable
        at ecrad/utilities/easy_netcdf.F90:2072
#5  0x55cc96a8a4a2 in __radiation_save_MOD_save_fluxes
        at ecrad/radiation/radiation_save.F90:156
#6  0x55cc96a4c368 in ecrad_driver
        at ecrad/driver/ecrad_driver.F90:376
#7  0x55cc96a4a738 in main
        at ecrad/driver/ecrad_driver.F90:33

After some investigation, I can confirm that the command that triggers this error, ../bin/ecrad conf.nam nc1.nc nc_out.nc, works fine with netcdf-fortran 4.5.4 but fails with 4.6.0 and 4.6.1.

@reuterbal
Contributor

Thanks for this. We have seen similar problems and traced them to an issue in HDF5 1.14.3, which NetCDF uses under the hood.
The problem has been fixed in 1.14.4.

Please check whether you are using a 1.14 release of HDF5 that is older than 1.14.4.
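
The version check suggested here can be sketched in a few lines of Python. This is only an illustration, not part of ecrad or HDF5: the helper names parse_version and is_affected_hdf5 are hypothetical, and the version string is assumed to have been read by hand from whatever reports the HDF5 version on the machine in question.

```python
import re


def parse_version(text):
    """Return (major, minor, patch) from a string such as '1.14.4-3'."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected_hdf5(version):
    """True for a 1.14 release older than 1.14.4 (hypothetical helper)."""
    major, minor, patch = parse_version(version)
    return (major, minor) == (1, 14) and patch < 4


# Versions mentioned in this thread, plus one older release for contrast.
for candidate in ("1.12.2", "1.14.3", "1.14.4-3"):
    print(candidate, is_affected_hdf5(candidate))
```

A False result only means the HDF5 release lies outside the range named above; it does not by itself rule out the crash.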

Bug report: HDFGroup/hdf5#3831
"Fix" commit (if you can call that a fix): HDFGroup/hdf5@e0d095e

@hippalectryon-0
Author

This still occurs using hdf5 1.14.4-3

@reuterbal
Contributor

Can you share the conf.nam and nc1.nc file to reproduce the problem?

@reuterbal reopened this Oct 4, 2024

@hippalectryon-0
Author

hippalectryon-0 commented Oct 5, 2024

Of course!
Attached is a zip containing a "minimal" reproduction of the crash (run.sh). Tested locally, it crashes on my machine but not on our "older" cluster.
ecrad_segfault_reprod.zip

@reuterbal
Contributor

reuterbal commented Oct 14, 2024

Thank you for providing the reproducer. Unfortunately, I'm unable to observe the same behaviour.
I have tested the latest ecrad master, built with

  • Intel 2021.4.0
  • GNU 8.5.0
  • GNU 13.2.0

and netCDF C 4.9.2 + netCDF Fortran 4.6.1 + hdf5 1.14.3 (with the SIGFPE bug patched).
In all three configurations I could run your example without any problems.
