Skip to content

2024‐03‐23 ‐ Intermittent API Errors

Mx Corey Frang edited this page May 6, 2024 · 5 revisions

Incident: Intermittent API errors

Date: 2024-03-23 through 2024-04-08

Cause of error

The API server was configured with an SQL connection pool of 5, which wasn't ever a problem with standard usage patterns at the time. However new automation features had changed the number of requests the average user would make.

More than one automation job was "incomplete" (tests still marked as queued), which caused any admins looking at the test queue page to be requesting status updates every 2 seconds. Each of these status update requests was 2 - 3 API requests per incomplete test run (2 at the time). It makes sense that the site became more unstable when multiple admins were looking at the test queue.

Solutions

  • #990 - Increase connection limit.

Linked Issues & PRs

  • #983 - Initial issue that discovered root cause.
  • #990 - Fix for connection limits
  • #984
  • #1000 - New status update UI reduces number of requests per active test (and hopefully results in more "complete" or "cancelled" test results)
  • #1002 - Automation job incomplete