-
-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use rtree spatial index from PlanetilerConfig bounds by default for Geopackages #635
Conversation
Kudos, SonarCloud Quality Gate passed! |
Full logs: https://github.com/onthegomap/planetiler/actions/runs/13371218015 |
planetiler-core/src/main/java/com/onthegomap/planetiler/reader/GeoPackageReader.java
Outdated
Show resolved
Hide resolved
c8d3fee
to
0b2c894
Compare
If I use the 4326 land/water polygons this drops the protomaps basemap build from ~30 seconds to 10 seconds for Monaco. Once we can port Natural Earth to use GPKG it should be even faster. |
|
||
Iterable<FeatureRow> results; | ||
|
||
if (this.bounds != null && indexer.isIndexed() && srsId == 4326) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to transform the bounds from 4326 to whatever CRS the geopackage source uses. See ShapefileReader as an example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if that handled non-conformal transformations correctly, but if so we can use that. How were you able to use the Shapefile filter without .sbx
sidecar files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Geotools ReferencedEnvelope#transform
drops a bunch of points along the edge of the bounding box in the old coordinate system, then computes the bounding box of those points in the new coordinate system. Also if it throws a TransformException then it just skips the filter.
I'm not 100% sure how the geotools filter predicate works internally. My guess/hope would be that it uses the spatial index if available, but then falls back to just applying it against each shape if not. I added it based off the description here: https://docs.geotools.org/latest/userguide/library/main/filter.html
planetiler-core/src/main/java/com/onthegomap/planetiler/reader/GeoPackageReader.java
Outdated
Show resolved
Hide resolved
WorkerPipeline.start("test", Stats.inMemory()) | ||
.fromGenerator("geopackage", reader::readFeatures, 1) | ||
.addBuffer("reader_queue", 100, 1) | ||
.sinkToConsumer("counter", 1, elem -> { | ||
points.add(elem.latLonGeometry()); | ||
}).await(); | ||
assertEquals(expectedCount, points.size(), id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need WorkerPipeline here? I think we could potentially simplify this to just:
reader.readFeatures(feature -> {
points.add(elem.latLonGeometry());
});
|
Thanks! This looks good to me. |
This is a first attempt at addressing #478 .
The existing issue is that all Shapefile or Geopackage features are read by every run, even if the
--bounds
argument is passed to limit the tile output or the bounds are determined implicitly based on theosm.pbf
. To reduce the runtime we can:.fbn
,.fbx
and/orqix
sidecars depending on what software created the Shapefile. Maybe GeoTools can read all of these, but I'm not familiar with its internals.rtree_
indexes (A vanilla SQLite feature, no Spatialite required)ne_50m_admin_0_boundary_lines_land
as tables we care about. IMO, this doesn't reduce the runtime by much relative to spatial indexing, so it may be an unworthy optimization.My proposal here is to not change any APIs, but change the default behavior of Geopackage to use the determined
PlanetilerConfig
bounds and apply the spatial index. Remaining todos:rtree_
indexes will always exist on gpkg or are optionalAnother relevant format would be FlatGeobuf, which has an optional spatial index as part of the single file, but I don't see key advantages of FGB over GPKG in most Planetiler use cases where data lives on disk.