Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cell_methods in Example 7.13 #587

Open
TomLav opened this issue Jan 22, 2025 · 8 comments
Open

Fix cell_methods in Example 7.13 #587

TomLav opened this issue Jan 22, 2025 · 8 comments
Labels
defect Conventions text meaning not as intended, misleading, unclear, has typos, format or language errors

Comments

@TomLav
Copy link

TomLav commented Jan 22, 2025

I have a (maybe stupid) question on Example 7.13. Below I reproduce the main variable in this.

  float temperature(time,lat,lon);
    temperature:long_name="surface air temperature";
    temperature:cell_methods="time: mean within days ",
      "time: mean over days time: mean over years";
    temperature:units="K";

Here we see that :cell_methods is over two strings, with a ',' linking the two. I am wondering if this is valid CDL syntax? In any case (valid or not), would the example be easier to read if :cell_methods was a single string?

@larsbarring
Copy link
Contributor

To me this looks like an array of strings, something that is currently discussed elsewhere (can't find the issue/discussion just now).

@davidhassell
Copy link
Contributor

Yes - a quick test suggests that this is indeed how CDL represents an attribute that is a vector of strings, though such attributes are prefix with string by ncdump. I created test.nc.gz with:

# Python
import netCDF4
n = netCDF4.Dataset('test.nc', "w", format="NETCDF4")
n.Conventions = "CF-1.12"
n.vector = ['a', 'bb']
n.close()

and its CDL looks like:

$ ncdump test.nc
netcdf test {

// global attributes:
		:Conventions = "CF-1.12" ;
		string :vector = "a", "bb" ;
}

So, as things currently stand I think that example 7.13 is defective, and the temperature:cell_methods attribute should indeed be a single string.

I found one of the issues discussing vector string: https://github.com/orgs/cf-convention/discussions/341. That issue is about global attributes only. There are are other aspects when considering vector variable attributes, but I can't find the discussion on that one right now, either.

David

@TomLav
Copy link
Author

TomLav commented Jan 22, 2025

Thank you both.

So would the correct syntax be?

variables:
  float temperature(time,lat,lon);
    temperature:long_name="surface air temperature";
    temperature:cell_methods="time: mean within days time: mean over days time: mean over years";
    temperature:units="K";

The description of Example 7.13 reads: Temperature for each hour of the typical climatological day

Someone with expertise of :cell_methods should check what the correct string should be wrt the description of 7.13 before we correct this.

@davidhassell davidhassell added the defect Conventions text meaning not as intended, misleading, unclear, has typos, format or language errors label Jan 22, 2025
@davidhassell
Copy link
Contributor

Hi Thomas, I think your corrected version looks good - thanks. I think it's safe to marks this issue as a defect.

As an aside, and as far as I'm aware, in CDL you can split a string across multiple lines, but the result is a string new-line characters inserted:

$ cat test2.cdl
netcdf test {

// global attributes:
		:Conventions = "CF-1.12" ;
		string :vector = "a", "bb" ;
		:cell_methods =
"time: mean within days
 time: mean over days
 time: mean over years";
}
$ ncgen -knc4 -o test2.nc test2.cdl
$ ncdump test2.nc
netcdf test2 {

// global attributes:
		:Conventions = "CF-1.12" ;
		string :vector = "a", "bb" ;
		:cell_methods = "time: mean within days\n time: mean over days\n time: mean over years" ;
}

@JonathanGregory
Copy link
Contributor

The NUG says

The length of an attribute is the number of data values or the number of characters in the character string assigned to it if the type is char. Multiple values are assigned to non-character attributes by separating the values with commas (',').

By "character" attribute it means type char. If you provide a single string, ncgen inteprets this as a char array, but I suppose ncgen infers that you mean string, not char, if you provide more than one string value. Hence this agrees that the text in the Example implies a vector of strings, and is defective.

Thanks, Thomas and David.

@TomLav
Copy link
Author

TomLav commented Feb 7, 2025

Title

Fix :cell_methods string in Example 7.13

Moderator

@davidhassell (?)

Requirement Summary

Attribute :cell_methods in Example 7.13 is broken: it should be a single string.

Technical Proposal Summary

Change the value of the cell_methods attribute to "time: mean within days time: mean over days time: mean over years"

Associated pull request

PR #591

@TomLav TomLav changed the title Question about Example 7.13 Fix cell_methods in Example 7.13 Feb 7, 2025
@davidhassell
Copy link
Contributor

Thanks, Thomas. The PR look good to me.

@JonathanGregory
Copy link
Contributor

Since this is a defect issue, it will be accepted if no-one raises any objection in three weeks, before 1st March. Thanks, Thomas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect Conventions text meaning not as intended, misleading, unclear, has typos, format or language errors
Projects
None yet
Development

No branches or pull requests

4 participants