Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

h3_polygon_to_cells: Invalid geometries can return millions of Hex IDs #157

Open
jmealo opened this issue Oct 10, 2024 · 4 comments
Open
Labels
enhancement 🚀 New feature or request good first issue 👶 Good for newcomers

Comments

@jmealo
Copy link

jmealo commented Oct 10, 2024

Issue:

I've found that calling h3_polygon_to_cells with invalid geometries can result in as many as 2.3 million hex_ids being returned at a resolution of 7. This is not a bug in h3-pg but, rather how the h3 library handles invalid geometries. It's unclear whether we can guard against this, or should at the very least provide documentation/warnings about the issues and workarounds to mitigate them.

Cause:

  • All of the geometries that caused these issues fail ST_IsValid().
  • Using ST_MakeValid() turns them in bow ties.

Test case:

{
  "type": "Polygon",
  "coordinates": [
    [
      [-148.5, 29.1],
      [-148.5, 63.9],
      [-72.5, 29.1],
      [-72.5, 63.9],
      [-148.5, 29.1]
    ]
  ]
}
WITH h3_cells AS (
  SELECT h3_polygon_to_cells(
    ST_GeomFromGeoJSON('{
      "type": "Polygon",
      "coordinates": [
        [
          [-148.5, 29.1],
          [-148.5, 63.9],
          [-72.5, 29.1],
          [-72.5, 63.9],
          [-148.5, 29.1]
        ]
      ]
    }'),
    7
  ) AS cells
)
SELECT COUNT(1) 
FROM h3_cells;
# count = 2,364,092

Solution:

  • It might make sense to update the documentation or examples to include a call to ST_IsValid() ... or a warning that invalid geometries could return millions of hex ids?

Upstream issue:

uber/h3#926

@jmealo jmealo changed the title Documentation Update/Warning: Invalid geometries can create million of Hex IDs h3_polygon_to_cells: Invalid geometries can return millions of Hex IDs Oct 10, 2024
@zachasme
Copy link
Owner

We can definitely handle this more nicely. Principally I'm trying to keep the base h3 extension as close to upstream as possible, instead using the h3-pg extension as a friendlier PostGIS wrapper around those. So I would be open to checking validity or using ST_MakeValid when calling the h3-pg functions.

@zachasme zachasme added enhancement 🚀 New feature or request good first issue 👶 Good for newcomers labels Oct 11, 2024
@jmealo
Copy link
Author

jmealo commented Oct 11, 2024

@zachasme: The issue I ran into with my particular geometries is that ST_MakeValid() was returning huge bowties, that would still create what I assume to be millions of hexes at higher resolutions. I'm not 100% sure what the correct behavior is. I'm pushing to add ST_IsValid() constraints or triggers to our tables that store geometry/geography.

@zachasme
Copy link
Owner

I would be okay with adding a flag to the postgis variants of polygon_to_cells that either enables or disables running ST_IsValid on inputs. What do you think of that solution?

@jmealo
Copy link
Author

jmealo commented Nov 5, 2024

@zachasme: I think that could make sense. I got pulled off this but will be revisiting it in the next few weeks.

Thoughts:

  • It's unsafe to call these with invalid geometry
  • It's very cheap to use ST_IsValid() ... should the flag default to ON by default?

I would be OK with providing a flag, if it defaults to ON. We should verify that checking a flag at runtime isn't more expensive than calling ST_IsValid() 🤔 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🚀 New feature or request good first issue 👶 Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants