-
Notifications
You must be signed in to change notification settings - Fork 819
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert springs to use ST_PointOnSurface and reformat SQL #3233
Conversation
"natural" | ||
FROM planet_osm_point | ||
WHERE way && !bbox! | ||
) _ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like unwanted space + underscore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would you suggest as an alternative for what to call the inner subselect? _
is a reasonably standard name when you just need a name. s
, p
, q
are also common throw-away names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, however I don't like such shortcuts in the big SQL section of a public project. Typical i
for loop index might be OK in typical code, but this one is already hard to maintain, underscore is worse than any letter and really looks like a typo, and you can't expect that other people will be real programmers to catch it. Something self-commenting like inner_subselect
would be probably good.
BTW: there are more undocumented conventions and sections that I don't understand, it might make sense to add more comments one day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inner_subselect
as a name would do for me.
I support this. Longer identifiers do no harm during processing, but increase legibility of the code. |
"natural" | ||
FROM | ||
(SELECT | ||
ST_PointOnSurface(way) AS way, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: is psql smart enough to only do the ST_PointOnSurface calculation for objects satisfying "natural" IN ('spring')
, in this query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see now it avoids the duplication of WHERE "natural" IN ('spring')
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case the conditions on the tag columns are simple enough a subselect isn't necessary, but on longer queries, these can be over a dozen lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, clear. And what about the question above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postgres can do the query either way, the results are the same. It will pick whichever is cheapest, which will normally be using the index, filtering by geometry, then calculating st_pointonsurface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I understand you correctly, in this query it will also calculate ST_PointOnSurface for polygons that are not natural=spring, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no guarantee one way or the other. It's unlikely to, but with a small database and the right indexes it might.
@@ -285,14 +285,22 @@ Layer: | |||
<<: *osm2pgsql | |||
table: |- | |||
(SELECT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does wrapping this in a superquery accomplish anything at all? It seems to me that leaving lines 287-290 and 302 out would yield exactly the same result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, since then the query would return all polygons regardless of their natural
column.
This is essentially the same method as #3228 (after my last changes), which gives performance problems. We therefore should check if this PR causes performance problems too. |
I fixed the filter on the points table now. Unfortunately, this still gives me a performance hit of about 50% of overall rendering time. |
FROM planet_osm_point | ||
WHERE way && !bbox! | ||
) _ | ||
WHERE "natural" IN ('spring') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we not add another WHERE way && !bbox!
in the outer subquery? Because now it will also forward points to Mapnik that are outside of the bounding box that is currently being rendered (as long as they are the label position of a polygon that falls within that bounding box).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but the PR is no different than current behavior. I suppose the difference is that filtering on contains is now cheap to do in-db
Marking this as WIP as I suspect this will cause performance issues. |
What are those? I don't see any. |
Is anybody able to check the performance of this PR? I think it would be good if we could give this attention, as it is a necessary step for the move to vector tikes. |
I found no performance issues when testing this and see nothing that could cause performance issues. |
@pnorman I still don't fully understand why we need the subselect. Why can't we simply do this?
Is the only reason to avoid duplication of |
Regarding performance: I made a quick and dirty test to extend the method proposed here to all layers. This shows no performance degradation. So I'm happy to merge this - I'm sorry that I didn't review it sooner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Awaiting your question about the reason for the subselect, and then this can be merged.
Performance is the same. It avoids duplication and is a different way to think about it. The duplication isn't a big issue here, but when we get to a query like the POI layers, there's a huge WHERE clause, and deduplicating that is a big gain. |
Clear. You can merge it if you like (or I can merge it, as you prefer). |
Resolves gravitystorm#1644. This is a necessary step towards vector rendering (see gravitystorm#1644). This is a second trial of gravitystorm#3228, based on the word by @pnorman in gravitystorm#3233. This is a first step, once we have this than we can probably merge the point and poly layers (for example amenity-points and amenity-points-poly).
This partly replaces #3228 and is an example of converting a layer to use
ST_PointOnSurface