README
jmacd committed Sep 13, 2024
1 parent e6ac103 commit c2cd1ba
Showing 2 changed files with 108 additions and 60 deletions.
104 changes: 104 additions & 0 deletions lightstep/processor/satellitesamplerprocessor/README.md
@@ -0,0 +1,104 @@
# Lightstep Satellite Sampler

This package contains an OpenTelemetry processor for an OpenTelemetry
traces pipeline that makes sampling decisions consistent with the
legacy Lightstep Satellite. This component enables a slow transition
from Lightstep Satellites to OpenTelemetry Collectors without
simultaneously changing sampling algorithms.

## Recommended usage

This component supports operating a mixture of Lightstep Satellites
and OpenTelemetry Collectors with consistent probability sampling.
Here is a recommended sequence of steps for performing a migration
to OpenTelemetry Collectors for Lightstep Satellite users.

### Build a custom OpenTelemetry Collector

This component is provided as a standalone component, meant for
incorporating into a custom build of the OpenTelemetry Collector using
the [OpenTelemetry Collector
builder](https://opentelemetry.io/docs/collector/custom-collector/)
tool. In your Collector's build configuration, add the following
processor component:

```
- gomod: github.com/lightstep/otel-collector-charts/lightstep/processor/satellitesamplerprocessor VERSIONTAG
```

where `VERSIONTAG` corresponds to the targeted OpenTelemetry
Collector release version. At the time of this writing, the version
tag is `v0.109.0`.

Users are advised to include the OpenTelemetry Probabilistic Sampler
processor in their build, to complete this transition. For example:

```
- gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/probabilisticsamplerprocessor v0.109.0
```

Follow the usual steps to build your collector (e.g., `builder
--config build.yaml`).

### Configure the sampler

You will need to know the sampling probability configured with
Lightstep Satellites, in percentage terms. Say the Lightstep
Satellite is configured with 10% sampling (i.e., 1-in-10).

Edit the OpenTelemetry Collector configuration to include a
`satellitesampler` block. In the following example, the OTel-Arrow
receiver and exporter are configured with `satellitesampler` at 10%
sampling and the [concurrent batch
processor](https://github.com/open-telemetry/otel-arrow/blob/main/collector/processor/concurrentbatchprocessor/README.md).

```
exporters:
  otelarrow:
    ...
receivers:
  otelarrow:
    ...
processors:
  satellitesampler:
    percent: 10
  concurrentbatch:
service:
  pipelines:
    traces:
      receivers: [otelarrow]
      processors: [satellitesampler, concurrentbatch]
      exporters: [otelarrow]
```

Collectors with this configuration may be deployed alongside a pool of
Lightstep Satellites that sample at the same rate, and the resulting
traces will be complete.
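
Mixed pools can produce complete traces because a consistent sampler
decides from the trace ID and the configured rate alone, so every node
reaches the same answer for every span of a trace. The following Go
sketch illustrates that property only. It is not the Satellite's
algorithm, which this README does not describe; it assumes the OTel
convention of using the low 56 bits of the trace ID as randomness, and
`keep` and `randomnessOf` are hypothetical helper names.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// maxAdjustedCount is 2^56, the size of the 56-bit randomness space.
const maxAdjustedCount = uint64(1) << 56

// randomnessOf returns the low 56 bits of a 16-byte trace ID.
// Assumption: randomness is taken from the last 7 bytes of the ID.
func randomnessOf(traceID [16]byte) uint64 {
	return binary.BigEndian.Uint64(traceID[8:]) & (maxAdjustedCount - 1)
}

// keep reports whether a trace is sampled at the given percentage.
// Hypothetical helper: the result depends only on its two arguments,
// so identically configured nodes always agree.
func keep(traceID [16]byte, percent float64) bool {
	rejection := uint64((1 - percent/100) * float64(maxAdjustedCount))
	return randomnessOf(traceID) >= rejection
}

func main() {
	// Made-up trace ID; only its last 7 bytes influence the decision.
	traceID := [16]byte{0x4b, 0xf9, 0x2f, 0x35, 0x77, 0xb3, 0x4d, 0xa6,
		0xa3, 0xce, 0x92, 0x9d, 0x0e, 0x0e, 0x47, 0x36}

	// Two nodes see different spans of the same trace and decide independently.
	nodeA := keep(traceID, 10)
	nodeB := keep(traceID, 10)
	fmt.Println("decisions agree:", nodeA == nodeB)
}
```

Two samplers that derive their decision differently, or that run at
different rates, lose this agreement, which is why a change of
algorithm or probability tends to break traces.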

### Migrate to the OpenTelemetry Probabilistic Sampler

After decommissioning Lightstep Satellites and replacing them with
OpenTelemetry Collectors, users are advised to migrate to an
OpenTelemetry Collector processor with equivalent functionality, the
[Probabilistic Sampler Processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/probabilisticsamplerprocessor/README.md).

A change of sampling configuration, either to change algorithm or to
change probability, typically results in broken traces. Users are
advised to plan accordingly and make a quick transition between
samplers, with only a brief, planned period of broken traces.

Redeploy the pool of Collectors with the Probabilistic Sampler
processor configured instead of the Satellite sampler processor. Make
this transition as quickly as possible, because traces may be
incomplete for as long as both samplers are configured for the
same destination.

```
processors:
  probabilistic_sampler:
    mode: equalizing
    sampling_percentage: 10
```

The "equalizing" mode is recommended; see that [component's
documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/probabilisticsamplerprocessor/README.md#equalizing).
@@ -1,61 +1,5 @@
The code in this directory is a copy of the OpenTelemetry package
in collector-contrib/pkg/sampling. The copy here has ServiceNow
copyright because it was originally authored here.

Code organization:

# Tracestate handling

- w3ctracestate.go: the outer tracestate structure like `key=value,...`
- oteltracestate.go: the inner tracestate structure like `key:value;...`
- cloudobstracestate.go: the inner tracestate structure like `key:value;...` (internal only)
- common.go: shared parser, serializer for either tracestate

This includes an implementation of the W3C trace randomness feature,
described here: https://www.w3.org/TR/trace-context-2/#randomness-of-trace-id
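
The two-level structure listed above (outer `key=value,...`, inner
`key:value;...`) can be sketched in a few lines of Go. This is a
simplified illustration, not the code in `w3ctracestate.go` or
`common.go`: `parseTraceState` is a hypothetical helper, the header
value is made up, and the validation a real parser needs is omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTraceState splits a W3C tracestate header into vendor sections
// (outer `key=value,...`) and each section into fields (inner `key:value;...`).
// Hypothetical helper: malformed members are silently skipped.
func parseTraceState(header string) map[string]map[string]string {
	out := map[string]map[string]string{}
	for _, member := range strings.Split(header, ",") {
		vendor, inner, ok := strings.Cut(strings.TrimSpace(member), "=")
		if !ok {
			continue
		}
		fields := map[string]string{}
		for _, kv := range strings.Split(inner, ";") {
			if k, v, found := strings.Cut(kv, ":"); found {
				fields[k] = v
			}
		}
		out[vendor] = fields
	}
	return out
}

func main() {
	// Made-up header with two vendor sections.
	ts := parseTraceState("ot=th:c;rv:0123456789abcd,sn=s:4")
	fmt.Println(ts["ot"]["th"], ts["ot"]["rv"], ts["sn"]["s"])
	// prints "c 0123456789abcd 4"
}
```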

# Overview of tracestate identifiers

There are two vendor codes:

- "ot" refers to formal OTel specifications
- "sn" for ServiceNow refers to internal Cloud Observability sampling (as by the Lightstep satellite)

The OTel trace state keys:

- "p" refers to the [legacy OTel power-of-two sampling](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md)
- "r" used in the legacy convention, 1-2 decimal digits
- "th" refers to the [modern OTel 56-bit sampling](https://github.com/open-telemetry/oteps/pull/235)
- "rv" refers to the modern randomness value, 14 hex digits.

The Cloud Observability trace state keys:

- "s" refers to the satellite sampling, uses the same encoding as "th" but is modeled as an acceptance threshold.

Note that to convert from an OTel rejection threshold to a Satellite sampler acceptance threshold, the unsigned value of the threshold should be subtracted from the maximum adjusted count:

```
satelliteSamplerThreshold, _ = UnsignedToThreshold(MaxAdjustedCount - otelModernThreshold.Unsigned())
```
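
As a worked instance of that subtraction, the sketch below uses plain
`uint64` arithmetic instead of the package's `Threshold` type, assuming
the maximum adjusted count is 2^56 (the size of the 56-bit space).
`toAcceptance` is a hypothetical name. At 25% sampling the rejection
threshold is `0xc0000000000000`, and the acceptance threshold covers the
remaining quarter of the space.

```go
package main

import "fmt"

// maxAdjustedCount is assumed to be 2^56.
const maxAdjustedCount = uint64(1) << 56

// toAcceptance converts an OTel rejection threshold into an acceptance
// threshold by subtracting it from 2^56. Hypothetical helper.
func toAcceptance(rejection uint64) uint64 {
	return maxAdjustedCount - rejection
}

func main() {
	rejection := uint64(0xc0000000000000) // 25% sampling: 75% of the space is rejected
	acceptance := toAcceptance(rejection)
	fmt.Printf("%014x\n", acceptance) // prints "40000000000000"

	// The acceptance threshold divided by 2^56 recovers the probability.
	fmt.Println(float64(acceptance) / float64(maxAdjustedCount)) // prints "0.25"
}
```

A rejection threshold of zero (100% sampling) yields 2^56 itself, which
no longer fits in 56 bits; this sketch does not handle that case.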

# Encoding and decoding

- probability.go: defines
`ProbabilityToThreshold()`
`(Threshold).Probability()`
- threshold.go: defines
`TValueToThreshold()`
`(Threshold).TValue()`
`(Threshold).ShouldSample()`
- randomness.go: defines
`TraceIDToRandomness()`
`RValueToRandomness()`
`(Randomness).RValue()`
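
The roles of these functions can be sketched with plain integers. The
code below is an illustration under stated assumptions, not the package
API: the lowercase helpers are hypothetical stand-ins, thresholds are
rejection thresholds in a 2^56 space, a T-value is the threshold in hex
with trailing zeros removed, and a span is sampled when its randomness
is at least the threshold. Unlike a production encoder, the sketch does
not round inexact probabilities to a limited precision.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const maxAdjustedCount = uint64(1) << 56

// probabilityToThreshold maps a probability in (0, 1] to a rejection threshold.
func probabilityToThreshold(p float64) uint64 {
	return uint64((1 - p) * float64(maxAdjustedCount))
}

// tValue renders a threshold as 14 hex digits with trailing zeros removed.
func tValue(threshold uint64) string {
	s := strings.TrimRight(fmt.Sprintf("%014x", threshold), "0")
	if s == "" {
		return "0" // a zero threshold means 100% sampling
	}
	return s
}

// rValueToRandomness parses an explicit randomness value of 14 hex digits.
func rValueToRandomness(rv string) (uint64, error) {
	if len(rv) != 14 {
		return 0, fmt.Errorf("rv must have 14 hex digits, got %d", len(rv))
	}
	return strconv.ParseUint(rv, 16, 64)
}

// shouldSample keeps a span when its randomness is not below the threshold.
func shouldSample(threshold, randomness uint64) bool {
	return randomness >= threshold
}

func main() {
	th := probabilityToThreshold(0.25)
	fmt.Println(tValue(th)) // prints "c"

	rnd, err := rValueToRandomness("d0000000000000")
	if err != nil {
		panic(err)
	}
	fmt.Println(shouldSample(th, rnd)) // prints "true"
}
```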

# Callers of note

- In common-go/wire/oteltoegresspb/otel_to_egresspb.go:
`TraceStateToAdjustedCount()`

- In internalcollector/components/satellitesamplerprocessor/traces.go:
`createTracesProcessor()`
in
[collector-contrib/pkg/sampling](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/sampling/README.md). The
copy here has Copyright ServiceNow, Inc. because it was originally
authored at ServiceNow before being contributed to OpenTelemetry.
