Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: Explore land sea mask generation #576

Open
lee1043 opened this issue Dec 5, 2023 · 8 comments
Open

[Feature]: Explore land sea mask generation #576

lee1043 opened this issue Dec 5, 2023 · 8 comments
Assignees
Labels
type: enhancement New enhancement request

Comments

@lee1043
Copy link
Collaborator

lee1043 commented Dec 5, 2023

Is your feature request related to a problem?

I know that we had discussed this before and concluded to take an independent path to host land sea mask generation capability in a separate package (i.e., pcmdi_utils), I recently came across much simpler and more reliable method that possibly be worth considering to implement to xcdat.

The new approach uses regionmask, a package that is available via conda and pip.

This method has following advantages:
(1) It overcomes the complexity of the regrid2-based method that is originated from cdutil. Although the more precised land sea fraction conservation is considered in the regrid2-based method, in common practical case of global uses its influence is not very critical.
(2) It is not using global-land-mask, which is only available via pip, thus does not complicate xcdat installation.

Describe the solution you'd like

Function

[Proposed function updated -- boolean option added]

import regionmask
import xarray as xr
import xcdat as xc


def create_land_sea_mask(ds: xr.Dataset, boolean: bool=False) -> xr.DataArray:
    """
    A function generates land sea mask (1: land, 0: sea) for given xarray Dataset,
    assuming the given xarray dataset and has latitude and longitude coordinates. 

    Parameters
    ----------
    ds : xr.Dataset
        A Dataset object.
    boolen : bool, optional
        Set mask value to True (land) or False (sea), by default False
        
    Returns
    -------
    xr.DataArray
        A DataArray of land sea mask (1: land, 0: sea)
    """
    # Create a land-sea mask using regionmask
    land_mask = regionmask.defined_regions.natural_earth_v5_0_0.land_110

    # Get the longitude and latitude from the xarray dataset
    key_lon = xc.axis.get_dim_keys(ds, axis="X")
    key_lat = xc.axis.get_dim_keys(ds, axis="Y")
    
    lon = ds[key_lon]
    lat = ds[key_lat]

    # Mask the land-sea mask to match the dataset's coordinates
    land_sea_mask = land_mask.mask(lon, lat)
    
    if not boolean:
        # Convert the land-sea mask to a boolean mask
        land_sea_mask = xr.where(land_sea_mask, 0, 1)  

    return land_sea_mask

Examples

[Examples updated -- boolean option and regional use case added]

target_grid = xc.create_uniform_grid(-90, 90, 1, 0, 359, 1)
mask = create_land_sea_mask(target_grid)
mask.plot()

output1

target_grid = xc.create_uniform_grid(-90, 90, 0.5, -180, 179, 0.5)
mask = create_land_sea_mask(target_grid)
mask.plot()

output 2png

mask = create_land_sea_mask(target_grid, boolean=True)
mask.plot()

output4

target_grid = xc.create_uniform_grid(20, 45, 0.1, 110, 135, 0.1)
mask = create_land_sea_mask(target_grid, boolean=True)
mask.plot()

output5

Describe alternatives you've considered

No response

Additional context

No response

@lee1043
Copy link
Collaborator Author

lee1043 commented Dec 5, 2023

@mzelinka @pochedls @bosup would you be interested in trying this function that I drafted and share your feedback?

@pochedls
Copy link
Collaborator

pochedls commented Dec 5, 2023

@lee1043 – I like regionmask. I ran into an issue with it that was recently closed. Have you hit any issues with NaNs in experimenting with it?

I'm supportive of including support for land/sea masking in xcdat, though it might need some discussion (e.g., why not just use regionmask or other packages directly without wrapping them).

@lee1043
Copy link
Collaborator Author

lee1043 commented Dec 5, 2023

@pochedls glad to learn you already have explored this path! Thank you for sharing the edge case. I haven't hit such issue, maybe the issue you posted is specifically coming with ocean_basins_50. Is there a specific region you are using the ocean_basins_50? Otherwise, I tried a workaround of your edge case using land_110, not sure if that is something you were looking for.

import numpy as np

# create mask
#ocean = regionmask.defined_regions.natural_earth_v5_0_0.ocean_basins_50
ocean = regionmask.defined_regions.natural_earth_v5_0_0.land_110
mask = ocean.mask(ds.lon, ds.lat)
mask = xr.DataArray(np.nan_to_num(mask, nan=True), coords={"lat": ds.lat, "lon": ds.lon}, dims=["lat", "lon"])
mask.plot()

output3

@pochedls
Copy link
Collaborator

pochedls commented Dec 5, 2023

This issue is likely fixed with ocean_basins_50 (the issue I opened was closed). I was just wondering if you had hit anything similar. Mark also found a library to do land/sea masking, but I don't think the library was available via anaconda.

@lee1043
Copy link
Collaborator Author

lee1043 commented Dec 5, 2023

@pochedls I was able to reproduce the edge case issue you had, so I presume the issue has not been fully resolved. And you are right, the library Mark found, global-land-mask, is only available via pip install.

@lee1043
Copy link
Collaborator Author

lee1043 commented Jun 3, 2024

Just a note, this capability is implemented to PMP (here), which I'd be happy to copy it to xCDAT if we decide to implement.

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Jun 5, 2024

As discussed in our meeting today (06/05/24), we decided that @lee1043 will open a PR with his function from the PMP commented above since this feature seems to be useful for the community and brought up frequently. We will perform additional validation in this new PR, add unit tests, and update documentation.

@xCDAT/core-developers

@pochedls
Copy link
Collaborator

pochedls commented Jun 5, 2024

@lee1043 - thanks for working on this. One feature that would be nice to include: An (optional?) flag to mask land or ocean (so either land/ocean grid cells are 1 and the other is np.nan). This would be very helpful to include in this function (rather than having the user do .where operations later to do the masking).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement New enhancement request
Projects
Status: Todo
Development

No branches or pull requests

3 participants