CF file handler support for YAML-defined datasets is incomplete #3056

Open
gerritholl opened this issue Feb 11, 2025 · 15 comments

@gerritholl
Member

gerritholl commented Feb 11, 2025

Describe the bug

A custom composite depending on a custom reader using SatpyCFFileHandler is not shown when I call available_composite_ids() or available_composite_names(). However, the YAML-defined dataset is still loadable.

To Reproduce

import tempfile
import os
import pathlib

comp_config = """sensor_name: visir

composites:
  world_comp_fci_masked:
    compositor: !!python/name:satpy.composites.LongitudeMaskingCompositor
    prerequisites:
    - image_fci
    standard_name: world_comp_fci_masked
    lon_min: -37.5
    lon_max: 22.75
"""

reader_config = """
reader:
    name: world-nc-image
    description: reads intermediate NetCDF files for world composite
    reader: !!python/name:satpy.readers.yaml_reader.FileYAMLReader
    sensors: [world-images]

file_types:
    graphic_fci:
      file_reader: !!python/name:satpy.readers.satpy_cf_nc.SatpyCFFileHandler
      file_patterns:
         - 'fci-ir105-wcm3km_epsg4326-{start_time:%Y%m%d%H%M}.nc'

datasets:
  image_fci:
    nc_store_name: ir_105
    name: image_fci
    file_type: graphic_fci
"""

with tempfile.TemporaryDirectory() as td:
    p = pathlib.Path(td)
    fn = p / "composites" / "visir.yaml"
    fn.parent.mkdir(exist_ok=True, parents=True)
    with fn.open(mode="wt", encoding="ascii") as fp:
        fp.write(comp_config)
    fn = p / "readers" / "world-nc-image.yaml"
    fn.parent.mkdir(exist_ok=True, parents=True)
    with fn.open(mode="wt", encoding="ascii") as fp:
        fp.write(reader_config)
    os.environ["SATPY_CONFIG_PATH"] = str(td)
    from satpy import Scene
    from satpy.utils import debug_on; debug_on()

    sc = Scene(filenames={"world-nc-image": ['/media/nas/x21308/scratch/wcm/tmp/fci-ir105-wcm3km_epsg4326-202502111130.nc']})
    print(sc.available_composite_ids())  # does not list world_comp_fci_masked
    sc.load(["world_comp_fci_masked"])  # works (loadable)

Expected behavior

I expect that available_composite_ids() shows all composites that can be loaded.

Actual results

[DEBUG: 2025-02-11 14:23:39 : satpy.readers.yaml_reader] Reading ('/tmp/tmp1wm134uk/readers/world-nc-image.yaml',)
[DEBUG: 2025-02-11 14:23:39 : satpy.readers.yaml_reader] Assigning to world-nc-image: ['/media/nas/x21308/scratch/wcm/tmp/fci-ir105-wcm3km_epsg4326-202502111130.nc']
[DEBUG: 2025-02-11 14:23:40 : satpy.composites.config_loader] Looking for composites config file fci.yaml
[DEBUG: 2025-02-11 14:23:40 : satpy.composites.config_loader] Looking for composites config file visir.yaml
[DEBUG: 2025-02-11 14:23:40 : pyorbital.tlefile] Path to the Pyorbital configuration (where e.g. platforms.txt is found): /home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/pyorbital/etc
[DEBUG: 2025-02-11 14:23:40 : satpy.readers.satpy_cf_nc] Getting data for: image_fci
[WARNING: 2025-02-11 14:23:40 : satpy.readers.yaml_reader] Can't load ancillary dataset ir_105_pixel_quality
[DEBUG: 2025-02-11 14:23:41 : satpy.scene] Unloading dataset: DataID(name='image_fci', modifiers=())
[DataID(name='colorized_ir_clouds'), DataID(name='geo_color_high_clouds'), DataID(name='ir108_3d'), DataID(name='ir_cloud_day'), DataID(name='night_ir105')]
<sys>:0: DeprecationWarning: Call to deprecated function (or staticmethod) _destroy.

Environment Info:

  • OS: openSUSE Leap 15.6
  • Satpy Version: current main (v0.54.0-84-gc776b2607)

Additional context

@djhoese
Member

djhoese commented Feb 11, 2025

My guess is that Satpy doesn't know to load visir.yaml because there is no composite config for composites/world-images.yaml (not sure if the hyphen versus underscore is a problem anywhere in Satpy, regardless...). My guess is that if you created that world-images.yaml composite file with sensor_name: visir/world-images and nothing else, it would start working.

@gerritholl
Member Author

But it does know how to load it. The load call is successful. It's just available_composite_names() that doesn't list it.

@gerritholl
Member Author

Replacing sensor_name: visir with sensor_name: visir/world-images results in infinite recursion:

[DEBUG: 2025-02-11 16:01:52 : satpy.readers.yaml_reader] Reading ('/tmp/tmpydym7ez1/readers/world-nc-image.yaml',)
[DEBUG: 2025-02-11 16:01:52 : satpy.readers.yaml_reader] Assigning to world-nc-image: ['/media/nas/x21308/scratch/wcm/tmp/fci-ir105-wcm3km_epsg4326-202502111130.nc']
[DEBUG: 2025-02-11 16:01:52 : satpy.composites.config_loader] Looking for composites config file fci.yaml
[DEBUG: 2025-02-11 16:01:52 : satpy.composites.config_loader] Looking for composites config file visir.yaml
[DEBUG: 2025-02-11 16:01:53 : pyorbital.tlefile] Path to the Pyorbital configuration (where e.g. platforms.txt is found): /home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/pyorbital/etc
[DEBUG: 2025-02-11 16:01:53 : satpy.composites.config_loader] Looking for composites config file visir.yaml
[DEBUG: 2025-02-11 16:01:53 : satpy.composites.config_loader] Looking for composites config file visir.yaml
[DEBUG: 2025-02-11 16:01:53 : satpy.composites.config_loader] Looking for composites config file visir.yaml
[DEBUG: 2025-02-11 16:01:53 : satpy.composites.config_loader] Looking for composites config file visir.yaml
[DEBUG: 2025-02-11 16:01:53 : satpy.composites.config_loader] Looking for composites config file visir.yaml
[DEBUG: 2025-02-11 16:01:53 : satpy.composites.config_loader] Looking for composites config file visir.yaml
[DEBUG: 2025-02-11 16:01:53 : satpy.composites.config_loader] Looking for composites config file visir.yaml
...
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/yaml/parser.py", line 477, in parse_flow_sequence_entry
    if not self.check_token(FlowSequenceEndToken):
           ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/yaml/scanner.py", line 116, in check_token
    self.fetch_more_tokens()
    ~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/yaml/scanner.py", line 255, in fetch_more_tokens
    return self.fetch_plain()
           ~~~~~~~~~~~~~~~~^^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/yaml/scanner.py", line 671, in fetch_plain
    self.save_possible_simple_key()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/yaml/scanner.py", line 309, in save_possible_simple_key
    self.index, self.line, self.column, self.get_mark())
                                        ~~~~~~~~~~~~~^^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/yaml/reader.py", line 119, in get_mark
    return Mark(self.name, self.index, self.line, self.column,
            None, None)
RecursionError: maximum recursion depth exceeded

@djhoese
Member

djhoese commented Feb 11, 2025

What does sc.sensor_names give you?

@djhoese
Member

djhoese commented Feb 11, 2025

I didn't say replace it, I said create the new file with that sensor_name. You've created a visir.yaml that says that it has a parent sensor config called visir.

@gerritholl
Member Author

sc.sensor_names gives {fci} (that's in the NetCDF file).

I tried adding a second file composites/world-images.yaml with the sole content sensor_name: visir/world-images, but it makes no difference as far as available_composite_names() is concerned.

Is the sensor attribute important?

@djhoese
Member

djhoese commented Feb 11, 2025

Yeah, the sensors are given to the composite config loading to know which YAML files to load. From your log we can see it is loading "fci" so it is finding "fci.yaml" based on the sensor name(s) and "visir.yaml" based on the fci.yaml file listing it as a parent.
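
The lookup described here can be modelled in a few lines. This is a simplified sketch and not Satpy's actual loader: `SENSOR_NAMES` and `config_chain` are hypothetical names, and the only assumption carried over is that every path segment before the last one in `sensor_name` names a parent config. The same model suggests why a visir.yaml declaring `sensor_name: visir/world-images` ends up asking for visir.yaml again.

```python
# Toy model: maps a composite config file stem to its sensor_name entry.
# These values are illustrative, not read from any real Satpy install.
SENSOR_NAMES = {
    "fci": "visir/fci",  # fci.yaml lists visir as its parent
    "visir": "visir",    # visir.yaml has no parent
}


def config_chain(sensor, sensor_names, _seen=()):
    """Return the config stems needed for ``sensor``, parents before children."""
    if sensor in _seen:
        raise ValueError(f"{sensor}.yaml is its own ancestor")
    if sensor not in sensor_names:
        return []  # no composite config exists for this sensor
    # Every segment before the final one is treated as a parent config.
    parents = sensor_names[sensor].split("/")[:-1]
    chain = []
    for parent in parents:
        chain += config_chain(parent, sensor_names, _seen + (sensor,))
    return chain + [sensor]


print(config_chain("fci", SENSOR_NAMES))  # ['visir', 'fci']
```

Swapping the `"visir"` entry for `"visir/world-images"` makes `config_chain("fci", ...)` raise `ValueError` in this model, because visir then names itself as a parent; without the `_seen` guard the function would recurse until Python's recursion limit is hit.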

The available-names lookup does create a separate dependency tree from the main Scene dependency tree, but it shouldn't be coming up with completely different results. Let me see if I can run your code and figure this out.

@djhoese
Member

djhoese commented Feb 11, 2025

Oh I'd need your NetCDF file though.

@gerritholl
Member Author

@gerritholl
Member Author

Is the problem that the sensor attribute returned by the reader does not match any of the sensors defined in the reader YAML?

@djhoese
Member

djhoese commented Feb 11, 2025

The reader should be taking its sensor information from the file handlers if they exist. Since you do have a file handler, it should be taking that information from the file handler. And we see that is true since sc.sensor_names has "fci".

Thanks for the file. Let me try this out.

@djhoese
Member

djhoese commented Feb 11, 2025

So I had to change the example from 3km to 10km, but not a big deal. Here is what the reader is saying is available inside the available composite method:

list(self._readers["world-nc-image"].available_dataset_ids)
[
DataID(name='wcm10km_epsg4326', modifiers=()),
DataID(name='ir_105', wavelength=WavelengthRange(min=9.8, central=10.5, max=11.2, unit='µm'), resolution=np.int64(2000), calibration=<2>, modifiers=()),
DataID(name='y', modifiers=()),
DataID(name='x', modifiers=())
]

So the image_fci variable/dataset is not showing as available. The composite does show up in the list of all possible composites. So that is being loaded fine.

I kept debugging and it looks like the loading of the variable is "bugged" in a sense. Or rather, the variable being listed as not available is the bug, but the file handler is able to load it anyway. When it comes to Scene.available_X, the Scene says "only tell me about things that we know are available" and the file handler says that image_fci is not available. When we do Scene.load, it doesn't care whether something is available or not, because we want to inform the user why we couldn't load something they asked for. So the Scene goes to the reader and says "load image_fci" and the file handler returns it just fine...even though it isn't "available". So let's see why it is marked as unavailable...
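
A toy model of that mismatch, with no Satpy imports. `ToyHandler`, `FILE_VARIABLES`, and `CONFIGURED` are hypothetical names, and the real reader and file handler classes are more involved. The point is only that the listing step reports variables discovered in the file, while the load step also resolves a YAML-configured name through its `nc_store_name`.

```python
# Illustrative stand-ins: variables present in the NetCDF file and the
# dataset configured in the reader YAML.
FILE_VARIABLES = {"ir_105", "x", "y"}
CONFIGURED = {"image_fci": {"nc_store_name": "ir_105"}}


class ToyHandler:
    """Hypothetical handler whose listing and loading paths disagree."""

    def available_names(self):
        # Only names discovered in the file are reported; configured
        # datasets are never marked as available.
        return sorted(FILE_VARIABLES)

    def load(self, name):
        # Loading resolves the configured alias and never consults
        # available_names().
        stored = CONFIGURED.get(name, {}).get("nc_store_name", name)
        if stored not in FILE_VARIABLES:
            raise KeyError(name)
        return f"<data for {stored}>"


handler = ToyHandler()
print("image_fci" in handler.available_names())  # False
print(handler.load("image_fci"))                 # <data for ir_105>
```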

@djhoese
Member

djhoese commented Feb 11, 2025

Here we go, you're doing something that isn't supported:

    def _existing_datasets(self, configured_datasets=None):
        """Add information of existing datasets."""
        for is_avail, ds_info in (configured_datasets or []):
            yield is_avail, ds_info

This for loop should, technically, be checking if the file type matches and if so check if "nc_store_name" or "name" of the configured datasets are in the file. Then is_avail should be set to True, otherwise it should remain None.
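
A minimal sketch of that check, written as a standalone generator rather than a patch to SatpyCFFileHandler. The function name `mark_configured_datasets` and the `file_type` and `variables_in_file` parameters are assumptions for illustration; only the `nc_store_name`, `name`, and `file_type` keys come from the reader YAML above. It also assumes `file_type` is a single string in the dataset config.

```python
def mark_configured_datasets(configured_datasets, file_type, variables_in_file):
    """Yield (is_avail, ds_info), flagging datasets this file can provide."""
    for is_avail, ds_info in (configured_datasets or []):
        if is_avail is not None:
            # A previous handler already decided; pass it through unchanged.
            yield is_avail, ds_info
            continue
        matches = ds_info.get("file_type") == file_type
        # Prefer the stored variable name, falling back to the dataset name.
        stored = ds_info.get("nc_store_name", ds_info.get("name"))
        if matches and stored in variables_in_file:
            yield True, ds_info
        else:
            yield is_avail, ds_info  # stays None: still unknown


configured = [
    (None, {"name": "image_fci", "nc_store_name": "ir_105",
            "file_type": "graphic_fci"}),
]
result = list(
    mark_configured_datasets(configured, "graphic_fci", {"ir_105", "x", "y"})
)
print(result[0][0])  # True
```

With a check along these lines, image_fci would be reported as available whenever its file type matches and ir_105 is present in the file, which is what the available-composites lookup needs.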

@gerritholl
Member Author

Should this "not supported" be considered a bug / wart / missing feature in the satpy CF filehandler, or in the YAML configuration for my local reader that uses the CF filehandler?

@djhoese
Member

djhoese commented Feb 17, 2025

I would consider it a missing feature of the CF file handler. It does not support YAML-defined datasets.

@gerritholl changed the title from "available_composite_ids omits composites depending on custom reader" to "CF file handler support for YAML-defined datasets is incomplete" on Feb 17, 2025