Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref(core): Improve URL parsing utilities #15882

Merged
merged 7 commits into from
Apr 1, 2025

Conversation

AbhiPrasad
Copy link
Member

@AbhiPrasad AbhiPrasad commented Mar 27, 2025

The bottleneck of many of the tasks written down in our Node SDK performance improvement task #15861 is parseUrl, defined here:

export function parseUrl(url: string): PartialURL {

We created #15767 to track removal of parseUrl. See more details in the GH issue.

While working on tasks for #15767, I initially PR'd #15768, which introduced a parseStringToURL method as a replacement for parseUrl. While trying to implement that method though I realized parseStringToURL has a lot of downsides.

This PR removes parseStringToURL in favor of a new parseStringToURLObject. Given parseStringToURL was never exported by the core SDK, this is not a breaking change.

To understand parseStringToURLObject, let's first look at it's method signature:

function parseStringToURLObject(url: string, urlBase?: string | URL | undefined): URLObject | undefined

parseStringToURLObject takes a string URL and turns it into a URLObject, that is typed like so:

type RelativeURL = {
  isRelative: true;
  pathname: URL['pathname'];
  search: URL['search'];
  hash: URL['hash'];
};

type URLObject = RelativeURL | URL;

the JavaScript URL built-in does not handle relative URLs, so we need to use a separate object to to track relative URLs. Luckily it's pretty small surface area, we only need to worry about pathname, search, and hash.

For ease of usage of this function, I also introduced a isURLObjectRelative helper. This will make sure that users can handle both relative and absolute URLs with ease.

Given packages/core/src/fetch.ts was using parseStringToURL, I refactored that file to use parseStringToURLObject. The change feels way better to me, much easier to read and understand what is going on.

@AbhiPrasad AbhiPrasad requested a review from a team March 27, 2025 20:16
@AbhiPrasad AbhiPrasad self-assigned this Mar 27, 2025
@AbhiPrasad AbhiPrasad requested review from mydea and andreiborza and removed request for a team March 27, 2025 20:16
Copy link
Contributor

github-actions bot commented Mar 27, 2025

size-limit report 📦

Path Size % Change Change
@sentry/browser 23.2 KB - -
@sentry/browser - with treeshaking flags 23.02 KB - -
@sentry/browser (incl. Tracing) 36.76 KB +0.43% +160 B 🔺
@sentry/browser (incl. Tracing, Replay) 73.92 KB +0.21% +153 B 🔺
@sentry/browser (incl. Tracing, Replay) - with treeshaking flags 67.32 KB +0.27% +180 B 🔺
@sentry/browser (incl. Tracing, Replay with Canvas) 78.59 KB +0.2% +160 B 🔺
@sentry/browser (incl. Tracing, Replay, Feedback) 91.16 KB +0.18% +165 B 🔺
@sentry/browser (incl. Feedback) 40.33 KB - -
@sentry/browser (incl. sendFeedback) 27.83 KB - -
@sentry/browser (incl. FeedbackAsync) 32.63 KB - -
@sentry/react 25 KB - -
@sentry/react (incl. Tracing) 38.68 KB +0.42% +164 B 🔺
@sentry/vue 27.41 KB - -
@sentry/vue (incl. Tracing) 38.48 KB +0.49% +191 B 🔺
@sentry/svelte 23.23 KB - -
CDN Bundle 24.44 KB - -
CDN Bundle (incl. Tracing) 36.78 KB +0.45% +165 B 🔺
CDN Bundle (incl. Tracing, Replay) 71.8 KB +0.24% +170 B 🔺
CDN Bundle (incl. Tracing, Replay, Feedback) 76.99 KB +0.24% +182 B 🔺
CDN Bundle - uncompressed 71.24 KB - -
CDN Bundle (incl. Tracing) - uncompressed 108.76 KB +0.32% +349 B 🔺
CDN Bundle (incl. Tracing, Replay) - uncompressed 220.05 KB +0.16% +349 B 🔺
CDN Bundle (incl. Tracing, Replay, Feedback) - uncompressed 232.62 KB +0.15% +349 B 🔺
@sentry/nextjs (client) 39.96 KB +0.41% +165 B 🔺
@sentry/sveltekit (client) 37.2 KB +0.46% +172 B 🔺
@sentry/node 142.89 KB +0.01% +1 B 🔺
@sentry/node - without tracing 96.08 KB - -
@sentry/aws-serverless 120.43 KB +0.01% +1 B 🔺

View base workflow run

@AbhiPrasad AbhiPrasad force-pushed the abhi-ref-parse-string-to-url branch from 2718fd1 to a2d3ae6 Compare March 28, 2025 17:10
Copy link
Member

@lforst lforst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually quite beautiful


type URLObject = RelativeURL | URL;

// Curious about `thismessage:/`? See: https://www.rfc-editor.org/rfc/rfc2557.html
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dayum

@AbhiPrasad AbhiPrasad enabled auto-merge (squash) April 1, 2025 07:37
@AbhiPrasad AbhiPrasad merged commit f0fc41f into develop Apr 1, 2025
151 of 152 checks passed
@AbhiPrasad AbhiPrasad deleted the abhi-ref-parse-string-to-url branch April 1, 2025 08:01
onurtemizkan pushed a commit that referenced this pull request Apr 3, 2025
The bottleneck of many of the tasks written down in our Node SDK
performance improvement task
#15861 is
`parseUrl`, defined here:

https://github.com/getsentry/sentry-javascript/blob/3d63621714b31c1ea4c2ab2d90d5684a36805a43/packages/core/src/utils-hoist/url.ts#L17

We created #15767
to track removal of `parseUrl`. See more details in the GH issue.

While working on tasks for
#15767, I initially
PR'd #15768, which
introduced a `parseStringToURL` method as a replacement for `parseUrl`.
While trying to implement that method though I realized
`parseStringToURL` has a lot of downsides.

This PR removes `parseStringToURL` in favor of a new
`parseStringToURLObject`. Given `parseStringToURL` was never exported by
the core SDK, this is not a breaking change.

To understand `parseStringToURLObject`, let's first look at it's method
signature:

```ts
function parseStringToURLObject(url: string, urlBase?: string | URL | undefined): URLObject | undefined
```

`parseStringToURLObject` takes a string URL and turns it into a
`URLObject`, that is typed like so:

```ts
type RelativeURL = {
  isRelative: true;
  pathname: URL['pathname'];
  search: URL['search'];
  hash: URL['hash'];
};

type URLObject = RelativeURL | URL;
``` 

the JavaScript URL built-in does not handle relative URLs, so we need to
use a separate object to to track relative URLs. Luckily it's pretty
small surface area, we only need to worry about `pathname`, `search`,
and `hash`.

For ease of usage of this function, I also introduced a
`isURLObjectRelative` helper. This will make sure that users can handle
both relative and absolute URLs with ease.

Given `packages/core/src/fetch.ts` was using `parseStringToURL`, I
refactored that file to use `parseStringToURLObject`. The change feels
way better to me, much easier to read and understand what is going on.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants