-
-
Notifications
You must be signed in to change notification settings - Fork 64
20200507 DB Q&A Notes
- Joel Henderson
- John Ritchey
- Adam Kendis
- Jake Mensch
- John Darragh
- Cynthia Kiser
- Tim Eccleston
Meeting Video Recording
PG Performance Test
Postgres SQL Range Types
Intro to materialized views:
PG Pass
Allows for Env File auto look at startup
- Background: 3 teams were looking for help with their postgres DB. We reached out to CFA and got Tim and Joel (with H4LA) to advise our devs.
- 311 - Needs help improving DB performance to have much faster response times for user requests prior to beta release in time for meeting with the mayor. We only write to the database once per day. The user only reads from the database. 32 fields. 5.2M rows. Only use 9 of those fields. User Submits filters. When broad enough, can take 10-15 seconds for a typical query. Redis running. Looking for strategies to improve read times.
- Food Oasis - Needs some help with postgres. John has a lot of experience with DB Administration
- Civic Tech Index - Needs to setup a database for the project.
- Normalize our 1 table
- Remove unused cols
Postgres is a great transactional DB. For Speed, do more with Redis or similarly designed solution. Build things as ready to serve as possible:
-
1st Filter Set needs to hit Postgres
-
Faster Open Source Alternatives to Postgres: Redis / MongoDB.
-
Date Range Queries are very slow. Make a clustered index that starts with that date field.
-
Make dates a day or week number as an index so searching is faster. (strings, vchar etc. are very slow. Needs to be enum. Will be 10x faster)
-
Have pages for most common queries
Joel, using a typical MacBook, typical PG 12, and I generated a demo table and data. A few findings are below.
- No meaningful speed difference between using categories as text vs. categories as enum. I’m presuming that the categories are generally short words. Adding an index gave 4x speed.
- The PG time comparison functions are even slower than I thought. Adding an index to the timestamp column gave a 30x improvement.
- Pre-calculating a column “recent_week” and its index, as we discussed, was 10x /worse/ than just using the timestamp field and its index. EXPLAIN shows PG ignores the recent_week index, instead doing a full scan.