Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Switch to ST_PointOnSurface #3228

Closed

Conversation

matthijsmelissen
Copy link
Collaborator

@matthijsmelissen matthijsmelissen commented May 12, 2018

Resolves #1644.

This is a necessary step towards vector rendering (see #1644).

As this changes the label placement algorithm for polygons, this causes differences in label placement.

It seems this comes with a performance penalty. This should be investigated further.

Before:
screen shot 2018-05-12 at 02 19 55
After:
screen shot 2018-05-12 at 02 19 12

(SQL deliberately not indented to increase readability of the diff.)

@kocio-pl
Copy link
Collaborator

Big general questions from me:

  1. What is the source of the performance problem? Why database is much slower than Mapnik?
  2. How could it be fixed? Is there a way to use some faster algorithm in the database?

@matthijsmelissen
Copy link
Collaborator Author

Personally I have no idea unfortunately.

@pnorman
Copy link
Collaborator

pnorman commented May 12, 2018

If there's a big performance penalty that gets worse as database size goes up, it's probably from a missing way && !bbox!. What layers are slow?

How could it be fixed? Is there a way to use some faster algorithm in the database?

We'd have to write the function.

@pnorman
Copy link
Collaborator

pnorman commented May 12, 2018

I'm inclined towards doing this layer by layer to start, starting with layers like bridges where there isn't a point equivalent.

@matthijsmelissen
Copy link
Collaborator Author

matthijsmelissen commented May 12, 2018

I did some more testing.

it's probably from a missing way && !bbox!.

No, interestingly enough it's the exact opposite: the way && !bbox! check itself causes nearly all of the performance degration.

I used a set of tiles which takes 11s to render in master, and 17s in this PR, both on z17.

  • Replacing ST_PointOnSurface(w) everywhere by w did not improve performance (still 17s). This means that the ST_PointOnSurface check itself does not cause a noticeable degradation (at least if we apply it only to ways matching way && !bbox!).
  • However, replacing ST_PointOnSurface(w) by w and also dropping the way && !bbox! (but keeping the subquery) did fix the issue (back to 11s). This means the way && !bbox! check is the limiting check.
  • As a sanity check, I also dropped way && !bbox! without replacing ST_PointOnSurface(w). As expected, this caused an even worse performance loss (no tiles rendered within one minute).

@pnorman Is there anything we can do with the condition to speed this up? Does Postgres guarantee that conjuncts are evaluated left to right, i.e. is fast_check AND slow_check necessarily faster than slow_check and fast_check?

Edit: it seems replacing way && !bbox! AND man_made = 'bridge' by man_made = 'bridge' AND way && !bbox! etc. everywhere does not make any difference.

@pnorman
Copy link
Collaborator

pnorman commented May 13, 2018

No, interestingly enough it's the exact opposite: the way && !bbox! check itself causes nearly all of the performance degration.

Then you're missing the addition of a way && !bbox! check. When Mapnik has a subquery fragment for a layer that doesn't contain a !bbox! token it wraps it in SELECT ... FROM subquery WHERE <geom name> & !bbox!, so in a worst-case scenario you can make it at least as fast as it was before by adding an explicit way && !bbox!

Copy link
Collaborator

@pnorman pnorman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flagging as request changes as it's got performance issues, some missed SQL, and the SQL will need reformatting.

@@ -167,7 +161,6 @@
#springs {
[natural = 'spring'][zoom >= 14] {
marker-file: url('symbols/spring.svg');
marker-placement: interior;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There don't seem to be the required changes to the springs SQL

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

math1985 added 2 commits May 14, 2018 01:11
Resolves gravitystorm#1644.

This is a necessary step towards vector rendering (see gravitystorm#1644).

As this changes the label placement algorithm for polygons, this causes differences in label placement.

(SQL deliberately not indented to increase readability of the diff.)
@matthijsmelissen
Copy link
Collaborator Author

In fact the subqueries I created were not necessary, so I removed them.

This should also fix the formatting issue.

@matthijsmelissen
Copy link
Collaborator Author

The performance issues remain in my latest version. The code should now be simpler and easier to review, @pnorman could you check if this is the right approach?

I'm inclined towards doing this layer by layer to start, starting with layers like bridges where there isn't a point equivalent.

I'm fine with either approach, however I'm afraid we might miss performance issues if we do this layer by layer. The performance issues created by each of the layer changes might be insignificant, but significant when considered together. Also, some layers might have a worse performance hit then others, so when converting layer by layer we can't be sure we won't run into performance problems when converting the last layer.

@ImreSamu
Copy link

As I remember - calling ST_Centroid for simple polygons ( ST_NPoints(polygon) <= 5 ) is faster than ST_PointOnSurface()

see more:

@kocio-pl
Copy link
Collaborator

We can't say which ones are "simple", so we have to treat them all the same. ST_Centroid is not suitable for this job anyway, because it does not guarantee to return the point on the surface (maybe that's why it's faster):

@ImreSamu
Copy link

ImreSamu commented May 14, 2018

@kocio-pl

it does not guarantee to return the point on the surface

As I understand - checking convexity of polygons gives better result .
What is your opinion ?

CASE  WHEN ST_NPoints(ST_ConvexHull(geometry))=ST_NPoints(geometry)
           THEN ST_Centroid(geometry)
           ELSE ST_PointOnSurface(geometry)

@kocio-pl
Copy link
Collaborator

I'm not a programmer and I don't know just by looking at the code, but I was involved in investigating #1465 and testing improved interior algorithm implementation in Mapnik (mapnik/mapnik#3780), so you could reuse their tests and I'm willing to evaluate the output (not the performance however). Do you like to try?

@pnorman
Copy link
Collaborator

pnorman commented May 15, 2018

As I understand - checking convexity of polygons gives better result .
What is your opinion ?

Too much SQL, and if there are ST_PointOnSurface performance improvements we should get them in PostGIS.

@pnorman
Copy link
Collaborator

pnorman commented May 15, 2018

The performance issues remain in my latest version. The code should now be simpler and easier to review, @pnorman could you check if this is the right approach?

I still find it difficult to review the changes to all the unrelated layout, and we need to establish conventions for how we're going to do the SQL. This is why I did #3233.

In this pr, the springs code is incorrect since it only applies the !bbox! condition to the rows coming from the polygons table, not the points table.

@matthijsmelissen
Copy link
Collaborator Author

In this pr, the springs code is incorrect since it only applies the !bbox! condition to the rows coming from the polygons table, not the points table.

Whoops. How does Mapnik decide then whether to wrap the code in a superquery with WHERE way && !bbox!? Does it simply check whether !bbox! already occurs in the query?

Now I think of it, this might well be the reason for the performance issues I encountered (it would certainly explain why adding AND way && !bbox! introduced the problem).

@matthijsmelissen
Copy link
Collaborator Author

I still find it difficult to review the changes to all the unrelated layout,

There was one case of unrelated layout, and that's fixed now. Some brackets needed to be added in unrelated code due to OR/AND precedence order.

and we need to establish conventions for how we're going to do the SQL. This is why I did #3233.

Yes, I agree we need that.

In this pr, the springs code is incorrect since it only applies the !bbox! condition to the rows coming from the polygons table, not the points table.

Fixed that now.

Unfortunately, after the fix performance is still as bad as it used to be.

@pnorman
Copy link
Collaborator

pnorman commented May 16, 2018

Whoops. How does Mapnik decide then whether to wrap the code in a superquery with WHERE way && !bbox!? Does it simply check whether !bbox! already occurs in the query?

Mapnik always wraps the query fragment we supply, but it only adds the way && !bbox! condition if it hasn't put the bbox token into the query fragment already.

@matthijsmelissen matthijsmelissen changed the title Switch to ST_PointOnSurface [WIP] Switch to ST_PointOnSurface Jan 2, 2019
matthijsmelissen pushed a commit to matthijsmelissen/openstreetmap-carto that referenced this pull request Jan 12, 2019
Resolves gravitystorm#1644.

This is a necessary step towards vector rendering (see gravitystorm#1644).

This is a second trial of gravitystorm#3228, based on the word by @pnorman in gravitystorm#3233.

This is a first step, once we have this than we can probably merge the point and poly layers (for example amenity-points and amenity-points-poly).
@matthijsmelissen
Copy link
Collaborator Author

Superseded by #3636

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants