Simplified self-diagnostics example (#2274)
cijothomas authored Nov 5, 2024
1 parent 91f44ff commit dd7b531
Showing 6 changed files with 61 additions and 243 deletions.
4 changes: 0 additions & 4 deletions examples/self-diagnostics/Cargo.toml
@@ -9,11 +9,7 @@ publish = false
opentelemetry = { path = "../../opentelemetry" }
opentelemetry_sdk = { path = "../../opentelemetry-sdk", features = ["rt-tokio"]}
opentelemetry-stdout = { path = "../../opentelemetry-stdout"}
opentelemetry-appender-tracing = { path = "../../opentelemetry-appender-tracing"}
tokio = { workspace = true, features = ["full"] }
tracing = { workspace = true, features = ["std"]}
tracing-core = { workspace = true }
tracing-subscriber = { version = "0.3.18", features = ["env-filter","registry", "std"]}
opentelemetry-otlp = { path = "../../opentelemetry-otlp", features = ["http-proto", "reqwest-client", "logs"] }
once_cell ={ version = "1.19.0"}
ctrlc = "3.4"
6 changes: 0 additions & 6 deletions examples/self-diagnostics/Dockerfile

This file was deleted.

111 changes: 23 additions & 88 deletions examples/self-diagnostics/README.md
@@ -1,93 +1,28 @@
# Basic OpenTelemetry metrics example with custom error handler:

This example shows how to setup the custom error handler for self-diagnostics.

## Custom Error Handling:

A custom error handler is set up to capture and record errors using the `tracing` crate's `error!` macro. These errors are then exported to a collector using the `opentelemetry-appender-tracing` crate, which utilizes the OTLP log exporter over `HTTP/protobuf`. As a result, any errors generated by the configured OTLP metrics pipeline are funneled through this custom error handler for proper recording and export.
This example shows how to self-diagnose OpenTelemetry by enabling its internal
logs. OpenTelemetry crates publish internal logs when the "internal-logs"
feature is enabled (it is enabled by default). Internal logs are emitted as
`tracing` events, so a `tracing` subscriber must be configured; without one,
the logs are simply discarded.
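
As a minimal, illustrative sketch (not the exact setup used in `src/main.rs`,
and assuming `tracing-subscriber` with its `env-filter` feature), such a
subscriber could be installed like this:

```rust
fn main() {
    // Print INFO-and-above `tracing` events, including OpenTelemetry's
    // internal logs, to stdout. Without a subscriber they are discarded.
    tracing_subscriber::fmt()
        .with_env_filter(tracing_subscriber::EnvFilter::new("info"))
        .init();

    // ... build OpenTelemetry providers after the subscriber is installed ...
}
```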

## Filtering logs from external dependencies of OTLP Exporter:

The example configures a tracing `filter` to restrict logs from external crates (`hyper`, `tonic`, and `reqwest`) used by the OTLP Exporter to the `error` level. This helps prevent an infinite loop of log generation when these crates emit logs that are picked up by the tracing subscriber.

## Ensure that the internally generated errors are logged only once:

By using a hashset to track seen errors, the custom error handler ensures that the same error is not logged multiple times. This is particularly useful for handling scenarios where continuous error logging might occur, such as when the OpenTelemetry collector is not running.


## Usage

### `docker-compose`

By default runs against the `otel/opentelemetry-collector:latest` image, and uses `reqwest-client`
as the http client, using http as the transport.

```shell
docker-compose up
```

In another terminal run the application `cargo run`

The docker-compose terminal will display logs, traces, metrics.

Press Ctrl+C to stop the collector, and then tear it down:

```shell
docker-compose down
```

### Manual

If you don't want to use `docker-compose`, you can manually run the `otel/opentelemetry-collector` container
and inspect the logs to see traces being transferred.

On Unix based systems use:

```shell
# From the current directory, run `opentelemetry-collector`
docker run --rm -it -p 4318:4318 -v $(pwd):/cfg otel/opentelemetry-collector:latest --config=/cfg/otel-collector-config.yaml
```

On Windows use:

```shell
# From the current directory, run `opentelemetry-collector`
docker run --rm -it -p 4318:4318 -v "%cd%":/cfg otel/opentelemetry-collector:latest --config=/cfg/otel-collector-config.yaml
```

Run the app which exports logs, metrics and traces via OTLP to the collector

```shell
cargo run
```

### Output:

- If the docker instance for collector is running, below error should be logged into the container. There won't be any logs from the `hyper`, `reqwest` and `tonic` crates.
```
otel-collector-1 | 2024-06-05T17:09:46.926Z info LogExporter {"kind": "exporter", "data_type": "logs", "name": "logging", "resource logs": 1, "log records": 1}
otel-collector-1 | 2024-06-05T17:09:46.926Z info ResourceLog #0
otel-collector-1 | Resource SchemaURL:
otel-collector-1 | Resource attributes:
otel-collector-1 | -> telemetry.sdk.name: Str(opentelemetry)
otel-collector-1 | -> telemetry.sdk.version: Str(0.23.0)
otel-collector-1 | -> telemetry.sdk.language: Str(rust)
otel-collector-1 | -> service.name: Str(unknown_service)
otel-collector-1 | ScopeLogs #0
otel-collector-1 | ScopeLogs SchemaURL:
otel-collector-1 | InstrumentationScope opentelemetry-appender-tracing 0.4.0
otel-collector-1 | LogRecord #0
otel-collector-1 | ObservedTimestamp: 2024-06-05 17:09:45.931951161 +0000 UTC
otel-collector-1 | Timestamp: 1970-01-01 00:00:00 +0000 UTC
otel-collector-1 | SeverityText: ERROR
otel-collector-1 | SeverityNumber: Error(17)
otel-collector-1 | Body: Str(OpenTelemetry metrics error occurred: Metrics error: Warning: Maximum data points for metric stream exceeded. Entry added to overflow. Subsequent overflows to same metric until next collect will not be logged.)
otel-collector-1 | Attributes:
otel-collector-1 | -> name: Str(event examples/self-diagnostics/src/main.rs:42)
otel-collector-1 | Trace ID:
otel-collector-1 | Span ID:
otel-collector-1 | Flags: 0
otel-collector-1 | {"kind": "exporter", "data_type": "logs", "name": "logging"}
```

- The SDK will keep trying to upload metrics at regular intervals if the collector's Docker instance is down. To avoid a logging loop, internal errors like 'Connection refused' will be attempted to be logged only once.
The example configures a tracing `filter` to restrict logs from external crates
(`hyper`, `tonic`, `reqwest`, etc.) used by the OTLP Exporter to the `error`
level. This helps prevent an infinite loop of log generation when these crates
emit logs that are picked up by the tracing subscriber. This is only a
workaround until [the root
issue](https://github.com/open-telemetry/opentelemetry-rust/issues/761) is
resolved.
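
Concretely, the filter configured in this example's `src/main.rs` looks roughly
like the following sketch:

```rust
use tracing_subscriber::{fmt, prelude::*, EnvFilter};

fn main() {
    // Default to INFO, but cap the OTLP exporter's HTTP/gRPC dependencies
    // at ERROR so their logs cannot feed back into the exporter itself.
    let filter = EnvFilter::new("info")
        .add_directive("hyper=error".parse().unwrap())
        .add_directive("tonic=error".parse().unwrap())
        .add_directive("h2=error".parse().unwrap())
        .add_directive("tower=error".parse().unwrap())
        .add_directive("reqwest=error".parse().unwrap());

    tracing_subscriber::registry()
        .with(fmt::layer().with_thread_names(true).with_filter(filter))
        .init();
}
```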

## Filtering logs to be sent to OpenTelemetry itself

If you use the [OpenTelemetry Tracing
Appender](../../opentelemetry-appender-tracing/README.md) to send `tracing` logs
to OpenTelemetry, then enabling OpenTelemetry internal logs can also cause
infinite, recursive logging. You can filter out all OpenTelemetry internal logs
from being sent to the [OpenTelemetry Tracing
Appender](../../opentelemetry-appender-tracing/README.md) with a filter such as
`add_directive("opentelemetry=off".parse().unwrap())`, in the same way
directives are added to the filter for tracing's `FmtSubscriber` in this
example.
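
A hypothetical sketch of that approach (assuming an
`opentelemetry_sdk::logs::LoggerProvider` has already been built; this example
itself does not wire up the appender) might look like:

```rust
use opentelemetry_appender_tracing::layer::OpenTelemetryTracingBridge;
use opentelemetry_sdk::logs::LoggerProvider;
use tracing_subscriber::{fmt, prelude::*, EnvFilter};

fn init_subscriber(logger_provider: &LoggerProvider) {
    // Drop OpenTelemetry's own internal logs before they reach the bridge,
    // so exporting log records does not recursively generate more of them.
    let otel_filter = EnvFilter::new("info")
        .add_directive("opentelemetry=off".parse().unwrap());
    let otel_layer =
        OpenTelemetryTracingBridge::new(logger_provider).with_filter(otel_filter);

    // Internal logs still reach stdout for self-diagnostics.
    tracing_subscriber::registry()
        .with(otel_layer)
        .with(fmt::layer())
        .init();
}
```
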
11 changes: 0 additions & 11 deletions examples/self-diagnostics/docker-compose.yaml

This file was deleted.

29 changes: 0 additions & 29 deletions examples/self-diagnostics/otel-collector-config.yaml

This file was deleted.

143 changes: 38 additions & 105 deletions examples/self-diagnostics/src/main.rs
@@ -1,85 +1,16 @@
use opentelemetry::global::{self, Error as OtelError};
use opentelemetry::global;
use opentelemetry::KeyValue;
use opentelemetry_appender_tracing::layer;
use opentelemetry_otlp::{LogExporter, MetricExporter, WithExportConfig};
use opentelemetry_sdk::metrics::PeriodicReader;
use tracing_subscriber::prelude::*;

use std::error::Error;

use once_cell::sync::Lazy;
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

use std::sync::mpsc::channel;

fn init_logger_provider() -> opentelemetry_sdk::logs::LoggerProvider {
let exporter = LogExporter::builder()
.with_http()
.with_endpoint("http://localhost:4318/v1/logs")
.build()
.unwrap();

let provider = opentelemetry_sdk::logs::LoggerProvider::builder()
.with_batch_exporter(exporter, opentelemetry_sdk::runtime::Tokio)
.build();

let cloned_provider = provider.clone();

// Specialized filter to process
// - ERROR logs from specific targets
// - ERROR logs generated internally.
let internal_and_dependency_filter = tracing_subscriber::filter::filter_fn(|metadata| {
let target = metadata.target();

// Only allow ERROR logs from specific targets
(target.starts_with("hyper")
|| target.starts_with("hyper_util")
|| target.starts_with("hyper")
|| target.starts_with("tonic")
|| target.starts_with("tower")
|| target.starts_with("reqwest")
|| target.starts_with("opentelemetry"))
&& metadata.level() == &tracing::Level::ERROR
});
// Configure fmt::Layer to print detailed log information, including structured fields
let fmt_internal_and_dependency_layer =
tracing_subscriber::fmt::layer().with_filter(internal_and_dependency_filter.clone());

// Application filter to exclude specific targets entirely, regardless of level
let application_filter = tracing_subscriber::filter::filter_fn(|metadata| {
let target = metadata.target();

// Exclude logs from specific targets for the application layer
!(target.starts_with("hyper")
|| target.starts_with("hyper_util")
|| target.starts_with("hyper")
|| target.starts_with("tonic")
|| target.starts_with("tower")
|| target.starts_with("reqwest")
|| target.starts_with("opentelemetry"))
});

let application_layer = layer::OpenTelemetryTracingBridge::new(&cloned_provider)
.with_filter(application_filter.clone());

tracing_subscriber::registry()
.with(fmt_internal_and_dependency_layer)
.with(application_layer)
.init();
provider
}
use tracing::info;
use tracing_subscriber::fmt;
use tracing_subscriber::prelude::*;
use tracing_subscriber::EnvFilter;

fn init_meter_provider() -> opentelemetry_sdk::metrics::SdkMeterProvider {
let exporter = MetricExporter::builder()
.with_http()
.with_endpoint("http://localhost:4318/v1/metrics")
.build()
.unwrap();
let exporter = opentelemetry_stdout::MetricExporterBuilder::default().build();

let reader = PeriodicReader::builder(exporter, opentelemetry_sdk::runtime::Tokio)
.with_interval(std::time::Duration::from_secs(1))
.build();
let reader = PeriodicReader::builder(exporter, opentelemetry_sdk::runtime::Tokio).build();

let provider = opentelemetry_sdk::metrics::SdkMeterProvider::builder()
.with_reader(reader)
@@ -92,41 +23,43 @@ fn init_meter_provider() -> opentelemetry_sdk::metrics::SdkMeterProvider {

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync + 'static>> {
let logger_provider = init_logger_provider();
// OpenTelemetry uses `tracing` crate for its internal logging. Unless a
// tracing subscriber is set, the logs will be discarded. In this example,
// we configure a `tracing` subscriber to:
// 1. Print logs of level INFO or higher to stdout.
// 2. Filter logs from OpenTelemetry's dependencies (like tonic, hyper,
// reqwest etc. which are commonly used by the OTLP exporter) to only print
// ERROR-level logs. This filtering helps reduce repetitive log messages
// that could otherwise create an infinite loop of log output. This is a
// workaround until
// https://github.com/open-telemetry/opentelemetry-rust/issues/761 is
// resolved.

// Target names used by OpenTelemetry always start with "opentelemetry".
// Hence, one may use "add_directive("opentelemetry=off".parse().unwrap())"
// to turn off all logs from OpenTelemetry.

let filter = EnvFilter::new("info")
.add_directive("hyper=error".parse().unwrap())
.add_directive("tonic=error".parse().unwrap())
.add_directive("h2=error".parse().unwrap())
.add_directive("tower=error".parse().unwrap())
.add_directive("reqwest=error".parse().unwrap());
tracing_subscriber::registry()
.with(fmt::layer().with_thread_names(true).with_filter(filter))
.init();

// Initialize the MeterProvider with the stdout Exporter.
let meter_provider = init_meter_provider();
info!("Starting self-diagnostics example");

// Create a meter from the above MeterProvider.
let meter = global::meter("example");
// Create a Counter Instrument.
let counter = meter.u64_counter("my_counter").build();

// Record measurements with unique key-value pairs to exceed the cardinality limit
// of 2000 and trigger error message
for i in 0..3000 {
counter.add(
10,
&[KeyValue::new(
format!("mykey{}", i),
format!("myvalue{}", i),
)],
);
}

let (tx, rx) = channel();

ctrlc::set_handler(move || tx.send(()).expect("Could not send signal on channel."))
.expect("Error setting Ctrl-C handler");

println!("Press Ctrl-C to continue...");
rx.recv().expect("Could not receive from channel.");
println!("Got Ctrl-C, Doing shutdown and existing.");
// Create a counter using an invalid name to trigger
// internal log about the same.
let counter = meter.u64_counter("my_counter with_space").build();
counter.add(10, &[KeyValue::new("key", "value")]);

// MeterProvider is configured with an OTLP Exporter to export metrics every 1 second;
// however, shutting down the MeterProvider here instantly flushes
// the metrics instead of waiting for the 1 sec interval.
meter_provider.shutdown()?;
let _ = logger_provider.shutdown();
info!("Shutdown complete. Bye!");
Ok(())
}
