Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Propagate trace to NDK #4137

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from
Draft

feat: Propagate trace to NDK #4137

wants to merge 6 commits into from

Conversation

bitsandfoxes
Copy link
Contributor

@bitsandfoxes bitsandfoxes commented Feb 4, 2025

Resolves #3562

Corresponding PR on native: getsentry/sentry-native#1137
Corresponding PR on Unity: getsentry/sentry-unity#1997

Context

Native events and crashes do not share the same trace ID as the rest of the layers. This is relevant in two ways:

  1. Synching the trace for native events
  2. Allowing hybrid SDKs (i.e. RN, Unity) to propagate their trace to Java - and implicitly to native

Proposal

Synching trace to native

The NDKScopeObserver is already getting invoked on changes to the propagation context via setTrace. The changes in getsentry/sentry-native#1137 provide a way to set the Trace context on the native scope.

Setting the trace from hybrid SDKs

The Java SDK could/should provide a way to continue a trace outside of accepting trace-header & baggage. The most straightforward way I can think of is to accept the trace ID and a span ID as String and update the propagation context.

Implementation

Option 1: Leveraging PropagationContext

Since the propagation context is already getting used as a fallback to provide the trace context in case of there being no active transaction, it could be reused to hold the trace passed into the SDK.

Option 2: Writing the trace context to the scope

The trace passed into the SDK could also be written directly to the scope. The rest of the event processing would need to be required to not overwrite the context.

Result

Screenshot 2025-02-06 at 12 34 19

Copy link
Contributor

github-actions bot commented Feb 4, 2025

Fails
🚫 Please consider adding a changelog entry for the next release.
Messages
📖 Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Instructions and example for changelog

Please add an entry to CHANGELOG.md to the "Unreleased" section. Make sure the entry includes this PR's number.

Example:

## Unreleased

- Propagate trace to NDK ([#4137](https://github.com/getsentry/sentry-java/pull/4137))

If none of the above apply, you can opt out of this check by adding #skip-changelog to the PR description.

Generated by 🚫 dangerJS against ac54b3c

Comment on lines +135 to +137
if(spanContext == null) {
return;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is questionable. I think we should "clear" the trace here too.
My question is: Since I'm "reusing" the setTrace this receives null when the transaction get cleared from the scope. But I think in a hybrid SDK context, there should always be a trace active.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, with TWP (Trace without performance) we always have a trace in the scope. Regardless of spans or transactions.
That's so we can propagate this to downstream services, as well as connecting events (errors) happning on the same process, that are on the same trace. Now how long we keep the same trace is something to be defined depednant on what's possible. We want something that's less than the whole lifetime of the process (it's OK on a CLI for example, but not on a game). And more than just a few milliseconds (if arbitrary). Ideally it's something that's reasonable. On mobile for example, could be a trace id for each screen. But that would reset if the app goes on to the background for 30 seconds or longer (e.g: Session). This way we'd have some related telemetry tied to each other via that trace id

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah we're falling back to scope.getPropagationContext().getSpanContext() here:

if (spanContext == null) {
// we always need a trace_id to properly link with traces/replays, so we fallback to
// propagation context values and create a fake SpanContext
store(scope.getPropagationContext().toSpanContext(), TRACE_FILENAME);
} else {
store(spanContext, TRACE_FILENAME);
}

maybe you could do the same?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bitsandfoxes When do you generate new traceId and spanId in unity?

RN for example generated new traceId and spanId on every new screen, browser on every reload.

RN never clears the id. When does unity clear the traceId?

Comment on lines +1157 to +1160
public static void setTrace(
final @NotNull String traceId, final @NotNull String spanId) {
getCurrentScopes().setTrace(traceId, spanId);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good and could be used in RN as is.

Rn will propagate the traceId from the JS scope.

https://github.com/getsentry/sentry-javascript/blob/fc48f8397d1ed647a4e2d95dba33f022df03daa3/packages/core/src/scope.ts#L120

@krystofwoldrich
Copy link
Member

In JS the spanId is now called propagationSpanId but it's the same thing with different names.

https://github.com/getsentry/sentry-javascript/blob/95cc948ef0136e1228916e3216c4a08b84fa8230/packages/core/src/types-hoist/tracing.ts#L45

@@ -125,4 +129,17 @@ public void removeExtra(final @NotNull String key) {
.log(SentryLevel.ERROR, e, "Scope sync removeExtra(%s) has an error.", key);
}
}

@Override
public void setTrace(@Nullable SpanContext spanContext, @NotNull IScope scope) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'd be curious how it works with the existing integrations of the Android SDK that generate a new trace (e.g. Activity). I guess we'd have to disable those in the hybrid SDKs, otherwise it'd override what the hybrids set whenever a new activity shows up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I assumed that the sentry-android doesn't generate a new traceId since we are not using performance/tracing on the java layer.

We set the following.

    options.setTracesSampleRate(null);
    options.setTracesSampler(null);
    options.setEnableTracing(false);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at

we will need an option to disable generating of the traceId in java.

At the moment the trace would be broken because RN would set traceId when navigation starts, but sentry-android would generate new one once the navigation was processed in JS and thus native created a new activity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that's the issue, since we always need a trace running, we're still spinning up a new one even if perf/tracing is disabled 😅 I guess we need a different mechanism of letting us know we shouldn't create a new trace

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also generate a new trace on a new fragment the most popular RN library creates fragments as "screens".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, fragments create only child spans for now, but we may change it soon. I updated #3719 to reflect that

*/
// return TransactionContext (if performance enabled) or null (if performance disabled)
public static void setTrace(
final @NotNull String traceId, final @NotNull String spanId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the sample_rand spec. getsentry/sentry-docs#12592

The continuing of a trace should require sample_rand.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Synchronize PropagationContext across hybrid layers
4 participants