Cache parsed URIs throughout Dialogue #2432

Open
wants to merge 6 commits into
base: develop

schlosna
Contributor

Before this PR

Noticed new URI(String) and some of its side effects in JFRs from services making high volumes of Dialogue requests.

After this PR

==COMMIT_MSG==
Parsing String to URI can be expensive in terms of CPU and allocations for high throughput services. This generalizes the caching previously added to DnsSupport in #2398 to also cover ApacheHttpClientBlockingChannel requests and HttpsProxyDefaultRoutePlanner proxy parsing to leverage a shared parsed URI cache.
==COMMIT_MSG==
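
A minimal sketch of what a shared parsed-URI cache could look like, assuming a plain `ConcurrentHashMap` with a size bound. `ParsedUriCache`, `MAX_ENTRIES`, and `parse` are hypothetical names for illustration; this is not the `Uris` helper from this PR, and the real implementation may use a different cache and eviction policy.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper for illustration; not the Uris class from this PR. */
final class ParsedUriCache {
    // Assumed bound so high-cardinality inputs cannot grow the map without limit.
    // The size check is approximate under concurrency.
    private static final int MAX_ENTRIES = 1_000;
    private static final Map<String, URI> CACHE = new ConcurrentHashMap<>();

    private ParsedUriCache() {}

    /**
     * Returns the cached URI for the input if present, otherwise parses it.
     * Failed parses are not cached; once the cache is full, new inputs are
     * parsed on every call instead of being stored.
     */
    static URI parse(String input) {
        URI cached = CACHE.get(input);
        if (cached != null) {
            return cached;
        }
        URI parsed;
        try {
            parsed = new URI(input);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid URI: " + input, e);
        }
        if (CACHE.size() < MAX_ENTRIES) {
            // If another thread stored a value first, hand back that instance.
            URI previous = CACHE.putIfAbsent(input, parsed);
            if (previous != null) {
                return previous;
            }
        }
        return parsed;
    }
}
```

A cache like this pays off for low-cardinality inputs such as base URIs and proxy addresses, where the same string is parsed repeatedly.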

Possible downsides?

@schlosna schlosna requested a review from carterkozak November 22, 2024 16:28

* This prefix may reconfigure several aspects of the client to work better in a world where requests are routed
* through a service mesh like istio/envoy.
*/
private static final String MESH_PREFIX = "mesh-";
Contributor

At this point the mesh- prefix is no longer used (or supported) in any environment -- it should make things simpler if we introduce another change first to remove support

Contributor Author

I took a look at removing mesh mode, and it has tentacles across a bunch of modules & tests, so might hold off on removing that until after this PR if that's ok.

@@ -100,8 +102,9 @@ final class ApacheHttpClientBlockingChannel implements BlockingChannel {
    public Response execute(Endpoint endpoint, Request request) throws IOException {
        // Create base request given the URL
        URL target = baseUrl.render(endpoint, request);
        URI targetUri = Uris.tryParse(target.toString()).uriOrThrow();
Contributor

Unlike the base URIs, this value will have much higher cardinality any time an endpoint includes path parameters. I'm a little bit uneasy about caching.

We could bypass the URL and URI types entirely, setting specific elements directly on the request builder instead. That would require expanding the BaseUrl API a bit, but may produce better results. What do you think?

Contributor Author

Good call. Updated to use just the URL protocol, authority, and file, which avoids some of the more expensive new URI(String) parsing and avoids flooding the URI parsing cache with high-cardinality URIs.
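
For illustration, a rough sketch of reading the protocol, authority, and file from an already-rendered `java.net.URL` without going through `new URI(String)`. `RequestTarget` is a hypothetical holder invented for this example; how the pieces are handed to the actual request builder is not shown here and is left as an assumption.

```java
import java.net.URL;

/** Hypothetical holder for the parts a request builder would need. */
final class RequestTarget {
    final String scheme;
    final String authority;
    final String pathAndQuery;

    private RequestTarget(String scheme, String authority, String pathAndQuery) {
        this.scheme = scheme;
        this.authority = authority;
        this.pathAndQuery = pathAndQuery;
    }

    /** Copies components the URL has already parsed; no URI parsing or caching involved. */
    static RequestTarget from(URL url) {
        return new RequestTarget(
                url.getProtocol(), // e.g. "https"
                url.getAuthority(), // host, plus ":port" when a port is set
                url.getFile()); // path, plus "?query" when a query is present
    }
}
```

Because the URL components are read directly, per-request targets with path parameters never touch the shared URI cache.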

Contributor Author

Split this portion out to a separate PR, #2437, as this portion is the biggest win: it avoids a URI parse per request.

} catch (final URISyntaxException ex) {
    throw new HttpException("Cannot convert host to URI: " + target, ex);
}
final URI targetUri = parseTargetUri(target);
Contributor

I suspect the cardinality is similarly high here. However, unlike the per-request parse, this should only occur once per new connection we create, which should be an order of magnitude or two less frequent than requests.

My preference is not to use the cache here. I suspect the perf cost of URI parsing is insignificant compared to TLS handshake overhead. If that turns out not to be the case, I think we could plumb the original base URI through the HttpContext so we can reuse the same value for each connection.
