Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch error tracking to Sentry #4409

Merged
merged 4 commits into from
Jan 21, 2025
Merged

Switch error tracking to Sentry #4409

merged 4 commits into from
Jan 21, 2025

Conversation

rakyi
Copy link
Contributor

@rakyi rakyi commented Jan 8, 2025

  • Remove all usage of Bugsnag
  • Add Sentry to admin client bundle and all server side processes
    • Move the Sentry initialization to be as early as possible to maximize the chance an error will be captured
    • Include node profiling integration for server side
    • Create a new admin project for the admin client and server processes
  • Tell Sentry about:
    • release is the COMMIT_SHA of the currently running code - helps with triage and works with GitHub integration for stuff like comments on PRs and other features
    • environment is good for filtering issues, we are gonna distinguish between prod and staging from now on. Running Sentry on staging should help us catch issues before they get to prod.
  • Add GitHub action for creating Sentry releases
    • Telling Sentry about a release before it gets the first event from that version makes some features nicer to use (I think e.g. "Mark as resolved in the next release")
  • Add Sentry Vite plugin - used to upload source maps as part of the production build. This is the recommended way to do it and AFAIK the easiest to get client source maps to work also in staging.
  • Rework manual capturing of errors:
    • Rename logErrorAndMaybeSendToBugsnag to logErrorAndMaybeCaptureInSentry
    • Where it makes sense, throw an error instead of calling this function and let the Sentry error handler capture it
    • Remove two instances of logging gdoc validation errors and warnings during baking, since we get more than 1k of them in one bake, which makes them useless and consumes a lot of our quota. We should add it back only if we making it smarter or fix most of the errors first.
    • Remove unnecessary warn wrapper for console.warn
    • Change JsonError to Error where it isn't necessary
  • Increase Bundlewatch limit substantially - Sentry dependencies were previously not included in the build at all, giving us a false sense of the bundle being smaller than it was

You can see various testing issues and new issues from baking etc. from this branch as well as current ones from production in Sentry.

Resolves #4204

Depends on https://github.com/owid/ops/pull/263

@rakyi rakyi force-pushed the remove-bugsnag branch 2 times, most recently from 0ab560e to 34156fa Compare January 8, 2025 17:28
@rakyi
Copy link
Contributor Author

rakyi commented Jan 8, 2025

Bundle watch originally failed here with a large bundle size increase. We are in fact making the bundle smaller from 2.91 MB to 2.87 MB (I compared prod vs staging), but Bundlewatch didn't pick it up, because it seems that rollup is smart enough to remove code that won't run at import time when it's guarded by a falsy condition, i.e. the SENTRY_DSN isn't set in CI, so the code that was using Sentry didn't run, so rollup didn't include in the bundle previously.

We have now removed the condition and initialize Sentry always (which is fine to do and fails gracefully without DSN). So rollup now doesn't remove the code, and Bundlewatch thus detects the bundle increase.

@owidbot
Copy link
Contributor

owidbot commented Jan 8, 2025

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-remove-bugsnag

SVG tester:

Number of differences (default views): 0 ✅
Number of differences (all views): 220 (5246b1) ❌

Edited: 2025-01-20 13:10:54 UTC
Execution time: 1.29 seconds

@rakyi rakyi force-pushed the remove-bugsnag branch 16 times, most recently from 0fe4200 to fe458c8 Compare January 14, 2025 10:00
@rakyi rakyi marked this pull request as ready for review January 14, 2025 11:02
@ikesau ikesau self-requested a review January 16, 2025 21:05
Copy link
Member

@ikesau ikesau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice to have someone go through this all and make it sane.

There's one other random mention of Bugsnag in the codebase (a comment explaining why some object isn't desctructured) which I think we can also remove.

I think doing some tire kicking over a video call would be a good way to test this. Wanna book me for 15 minutes? 🙂

adminSiteClient/instrument.ts Outdated Show resolved Hide resolved
@@ -0,0 +1,7 @@
import * as Sentry from "@sentry/react"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have sentry and scope in the filename, I think

e.g.

initSentryAdminSiteClient.ts
initSentryServer.ts
initSentrySite.ts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep the name, because:

  • instrument is what is used in Sentry docs, so if you read the docs and then search for such file, you'll easily find all three of them. Scope is in the module path, i.e. the directory name, we don't usually repeat that in nested file names.
  • instrument is named by what it does, so if we ever want to add instrumentation of something else, we can.
  • We don't have to change the name, when we change providers. We also don't call files/folders as specific as mysql when we don't need to, but rather db.

adminSiteServer/appClass.tsx Outdated Show resolved Hide resolved
adminSiteServer/authentication.ts Show resolved Hide resolved
.github/workflows/sentry.yml Show resolved Hide resolved
baker/SiteBaker.tsx Show resolved Hide resolved
site/instrument.ts Show resolved Hide resolved
site/instrument.ts Outdated Show resolved Hide resolved
@rakyi rakyi requested a review from ikesau January 17, 2025 11:13
@ikesau
Copy link
Member

ikesau commented Jan 17, 2025

Manual testing notes.

1.

image

This is from trying to add a circular redirect (which is inferrable through the stack trace) but I'm wondering if this is the top-level message given for all the JsonErrors?

2.

image

OOC, what's up with all these different ADMIN-X instances? Do they represent different staging instances? Is there a way we can name them more clearly?

3.

Here's an error from me adding a malformed gdoc. In Bugsnag, I was able to look at the headers and see the user's email in them, but it looks like they're not here in Sentry? Not essential, but it would be nice to have as I often used this information to quickly follow up with authors who might be stuck on something.

4.

I wonder if we should add some more DSNs - the fact that errors/crashes from baking and indexing are both under admin is possible to understand, but if there's not much cost 2 having 2 additional buckets for those issues then maybe that would make things clearer?

@rakyi
Copy link
Contributor Author

rakyi commented Jan 20, 2025

Thanks a lot for the manual testing!

  1. Good catch. The handling of API errors is pretty weird in admin. I switched the code to throw JsonError instead of a plain object, which seems to have fixed the problem.
  2. Labels like ADMIN-X are identifiers for Sentry issues unique across an org so you can easily reference them in various places like comments, commit messages, etc.
  3. Good point, I forgot to check whether users are associated with requests! I added the code to set a user in Sentry. See here for how it looks with a user.
  4. We can have basically as many projects as we want. It's all about finding what works best for a given org/team. In a past job, I was frustrated with having too many of them and IME it worked better with fewer. I'd say let's see if two work well for us and adjust later if necessary.

Copy link
Member

@ikesau ikesau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Let's get this in and ready to demo for knowledge sharing. 🔥

@rakyi rakyi merged commit c949584 into master Jan 21, 2025
21 of 23 checks passed
@rakyi rakyi deleted the remove-bugsnag branch January 21, 2025 09:59
Copy link

sentry-io bot commented Jan 21, 2025

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ Error: Gateway Timeout loadVariableDataAndMetadata(packages/@ourworldi... View Issue
  • ‼️ TypeError: Failed to fetch loadVariableDataAndMetadata(packages/@ourworldi... View Issue
  • ‼️ TypeError: Failed to fetch promiseGetter(packages/@ourworldindata/utils/sr... View Issue
  • ‼️ TypeError: Failed to fetch promiseGetter(packages/@ourworldindata/utils/sr... View Issue
  • ‼️ Error: Gateway Timeout loadVariableDataAndMetadata(packages/@ourworldi... View Issue

Did you find this useful? React with a 👍 or 👎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

🐝 analytics: swap out Bugsnag for Sentry
3 participants