Based on a review, a couple of improvements would be possible. Feel free to correct me if I'm wrong, and note that we can have @stevenbal help out with these changes if necessary:
1) Pagination is covered in "As developer, I want the API to be paginated" (#148) and has been implemented by Anna in PR "As developer, I want the objects API to include a reference to the objecttype" (#36). It has yet to be merged or released and will not be a backwards-compatible API change. Keep in mind that pagination might hamper bulk retrievals: with 500 objects per page, getting 1M objects needs 2000 API calls instead of 1, and those 2000 calls will be much more difficult to optimize performance-wise than one big call. I would suggest allowing the API user to toggle pagination, or allowing a MAX_INT pageSize, as the appropriate size depends on the use-case.
2) All objectrecords have a date and filtering seems date-enabled (last record / current record), but I didn't notice a DB index on the date fields. One could be added, and you might want to order only on '-index' for the last record to avoid the date-ordering overhead altogether: https://github.com/maykinmedia/objects-api/blob/master/src/objects/core/models.py#L71
3) The postgres JSONField could be switched to the built-in db.models.JSONField introduced in Django 3.1 ( https://docs.djangoproject.com/en/3.2/ref/models/fields/#django.db.models.JSONField ). Both use jsonb under the hood, so it shouldn't matter too much.
4) Specific indexes can be added on-the-fly to a jsonb field ( https://www.postgresql.org/docs/current/datatype-json.html ). This does depend on the use-case, though, and the optimal indexes will differ per object-type.
4.1) It might make sense to add indexes to certain fields which occur in many object-types.
4.2) As an alternative, viewing and creating jsonb indexes could be made available to an administrator of the objects-api. This would allow an administrator to tune performance for their use-case. In-depth knowledge of the objects stored and the API calls used would be essential to do this properly, though, and performance could also be negatively impacted if used incorrectly. I would not recommend exposing this functionality via an API.
5) The geometry field used is automatically indexed with a spatial index, so it doesn't need one set explicitly.
Based on the above I would recommend 2+3 and would like to discuss 1+4 further. 2+3+4 can be implemented without API changes, so I would only want to pursue 1 in the short term to avoid the API change later on.
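As an illustration of the per-key jsonb index idea, here is a minimal sketch of how an admin-side helper could build a PostgreSQL expression index statement for one key. The helper name, the table name `core_objectrecord`, the column name `data`, and the key `status` are all assumptions made for the example; they are not taken from the objects-api code base. The sketch only builds the SQL string and validates the names, since unvalidated input is exactly how such a feature could do harm.

```python
import re

# Only plain identifiers are accepted, so a key or table name can never
# smuggle extra SQL into the generated statement.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def jsonb_key_index_sql(table: str, column: str, key: str) -> str:
    """Build a CREATE INDEX statement on one top-level key of a jsonb column.

    Hypothetical helper: the names passed in are assumptions, not the
    real objects-api schema.
    """
    for name in (table, column, key):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    index_name = f"idx_{table}_{column}_{key}"
    # Expression index on the text value of the key (PostgreSQL syntax).
    return f"CREATE INDEX {index_name} ON {table} (({column} ->> '{key}'));"


# Assumed table, column and key, for illustration only.
print(jsonb_key_index_sql("core_objectrecord", "data", "status"))
```

An index like this only helps queries that filter on that exact expression, which is why the best choice differs per object-type.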
Let's assess the time-to-response for different page sizes in the performance tests before making any decision (Feature/pagination #153)
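To make the page-size trade-off concrete, here is a small sketch of the request-count arithmetic. Only the 1M objects and 500-per-page figures come from the review; the function name and the other page sizes are illustrative assumptions.

```python
def calls_needed(total_objects: int, page_size: int) -> int:
    """Number of paginated requests needed to retrieve every object."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Integer ceiling division: a partially filled last page still costs a call.
    return (total_objects + page_size - 1) // page_size


# Hypothetical page sizes one might compare in a performance test.
for page_size in (100, 500, 5_000, 1_000_000):
    print(page_size, calls_needed(1_000_000, page_size))
```

The number of calls is only half of the picture: the time per call also grows with the page size, which is what the performance tests would need to measure.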
Nice catch! No DB indexes have been created to optimize performance yet; indexes on the date fields are certainly needed. I hope we will find other fields to index when running the performance tests.
No objection, but I thought that a DB-specific field would be better optimized for this particular DB.
AFAIK we don't yet have enough information about which data attributes are used by many object types. I think this optimization is a bit premature; let's collect some data from clients first.
It looks like the geometry field representation can be a bottleneck in itself, even without filtering on it. We'll see the results after the performance testing.
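The effect of an index on a date field can be checked from the query plan. The sketch below uses Python's built-in SQLite as a stand-in for PostgreSQL, so the planner output differs from what the objects-api database would show. The table `record` and the column `start_at` are assumed names for illustration, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE record (id INTEGER PRIMARY KEY, object_id INTEGER, start_at TEXT)"
)
conn.executemany(
    "INSERT INTO record (object_id, start_at) VALUES (?, ?)",
    [(i % 100, f"2021-01-{(i % 28) + 1:02d}") for i in range(1000)],
)

# "Latest record on or before a given date", the kind of date-based lookup
# the review mentions.
QUERY = "SELECT id FROM record WHERE start_at <= ? ORDER BY start_at DESC LIMIT 1"


def plan(connection: sqlite3.Connection) -> str:
    """Return the planner's description of how QUERY would be executed."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN " + QUERY, ("2021-01-15",)
    ).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)
conn.execute("CREATE INDEX idx_record_start_at ON record (start_at)")
plan_after = plan(conn)

print("without index:", plan_before)
print("with index:   ", plan_after)
```

On PostgreSQL the equivalent check would be running `EXPLAIN` on the actual query before and after adding the index.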
From: https://github.com/orgs/Gemeente-DenHaag/projects/3#card-62299949