Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tracing instrumentation #80

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 45 additions & 2 deletions AerospikeClient/Async/AsyncCommand.cs
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,8 @@ public abstract class AsyncCommand : Command, IAsyncCommand, ITimeout
private bool compressed;
private bool inAuthenticate;
private bool inHeader = true;

private Activity activity;

/// <summary>
/// Default Constructor.
/// </summary>
Expand Down Expand Up @@ -203,14 +204,25 @@ private void ExecuteCore()
}
ExecuteCommand();
}

private void ExecuteCommand()
{
iteration++;

Exception exception = null;

StartActivity();

try
{
node = (AsyncNode)GetNode(cluster);

if (activity != null)
{
activity.SetTag("node.name", node.Name);
activity.SetTag("node.address", node.NodeAddress.ToString());
}
Comment on lines +222 to +224
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those don't seem to be standard tags, I would use network.peer* as mentioned below.


node.ValidateErrorCount();
conn = node.GetAsyncConnection();

Expand All @@ -229,29 +241,60 @@ private void ExecuteCommand()
}
catch (AerospikeException.Connection aec)
{
exception = aec;
ErrorCount++;
ConnectionFailed(aec);
}
catch (AerospikeException.Backoff aeb)
{
exception = aeb;
ErrorCount++;
Backoff(aeb);
}
catch (AerospikeException ae)
{
exception = ae;
ErrorCount++;
FailOnApplicationError(ae);
}
catch (SocketException se)
{
exception = se;
ErrorCount++;
OnSocketError(se.SocketErrorCode);
}
catch (Exception e)
{
exception = e;
ErrorCount++;
FailOnApplicationError(new AerospikeException(e));
}
finally
{
if (activity != null && exception != null)
{
activity.SetStatus(ActivityStatusCode.Error, exception.ToString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the status description should contain the full exception. There are exception tags made for this https://opentelemetry.io/docs/specs/semconv/attributes-registry/exception. See how they do it in OpenTelemetry.Api: https://github.com/open-telemetry/opentelemetry-dotnet/blob/14ee5c456ee91ffc11e29b35333834b2e04f53d3/src/OpenTelemetry.Api/Trace/TelemetrySpan.cs#L266.

That would allow a UI to format the exception if the rights tags are present.

}
}
}

private void StartActivity()
{
// Because of the retry mecanism, one operation can result in several I/Os, on several nodes.
// For this reason, we create a new Activity on every retry, so that we can set the actual node
// adress and the I/O status as tags of this current Activity.

// If we're retrying, stop the previous Activity before restarting a new one.
activity?.Dispose();
activity = Tracing.Source.StartActivity(GetType().Name, ActivityKind.Client);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://opentelemetry.io/docs/specs/semconv/database/database-spans/

According to the spec, we should try to make the span name look like <db.operation> <db.name>.<db.sql.table>. Also this string should probably be cached as we don't want such allocation in the very hot path.

if (activity != null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

activity.IsAllDataRequested should also be checked before setting tags.

{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great to have the following tags:

  • db.users
  • network.peer.address
  • network.peer.port
  • db.name
  • db.operation.
  • db.statement

// https://github.com/open-telemetry/opentelemetry-specification/blob/6ce62202e5407518e19c56c445c13682ef51a51d/specification/trace/semantic_conventions/span-general.md#general-remote-service-attributes
activity.SetTag("peer.service", Tracing.SERVICE_NAME);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The service name is one of the most important information. When the trace is displayed on a UI (Jaeger, Tempo), it will be the most visible info. I would avoid hardcoding it but either use the cluster name or a new user-provided string.

// https://github.com/open-telemetry/opentelemetry-specification/blob/6ce62202e5407518e19c56c445c13682ef51a51d/specification/trace/semantic_conventions/database.md#connection-level-attributes
activity.SetTag("db.system", Tracing.SERVICE_NAME);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the spec, db.system should be lowercased (Aerospike -> aerospike). I've tried adding it to the spec but kind of got ignored open-telemetry/opentelemetry-specification#2090

activity.SetTag("iteration", iteration);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There doesn't seem to be a standard for retries tags. Though there is http.request.resend_count so maybe we could name it db.resend_count?

}
}

public void OnConnected()
Expand Down
13 changes: 13 additions & 0 deletions AerospikeClient/Util/Tracing.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
using System.Diagnostics;

namespace Aerospike.Client
{
public class Tracing
{
private const string SOURCE_NAME = "Aerospike.Client";

internal const string SERVICE_NAME = "Aerospike";

internal static readonly ActivitySource Source = new ActivitySource(SOURCE_NAME);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add the version of the assembly in the ActivitySource ctor

}
}