diff --git a/docs/configuration/node-config.md b/docs/configuration/node-config.md index cecd1ae8d97..225c75c2a93 100644 --- a/docs/configuration/node-config.md +++ b/docs/configuration/node-config.md @@ -192,11 +192,24 @@ This section contains the configuration options for a Searcher. | --- | --- | --- | | `aggregation_memory_limit` | Controls the maximum amount of memory that can be used for aggregations before aborting. This limit is per request and single leaf query (a leaf query is querying one or multiple splits concurrently). It is used to prevent excessive memory usage during the aggregation phase, which can lead to performance degradation or crashes. Since it is per request, concurrent requests can exceed the limit. | `500M`| | `aggregation_bucket_limit` | Determines the maximum number of buckets returned to the client. | `65000` | -| `fast_field_cache_capacity` | Fast field cache capacity on a Searcher. If your filter by dates, run aggregations, range queries, or if you use the search stream API, or even for tracing, it might worth increasing this parameter. The [metrics](../reference/metrics.md) starting by `quickwit_cache_fastfields_cache` can help you make an informed choice when setting this value. | `1G` | -| `split_footer_cache_capacity` | Split footer cache (it is essentially the hotcache) capacity on a Searcher.| `500M` | -| `partial_request_cache_capacity` | Partial request cache capacity on a Searcher. Cache intermediate state for a request, possibly making subsequent requests faster. It can be disabled by setting the size to `0`. | `64M` | +| `fast_field_cache_capacity` | Fast field in-memory cache capacity on a Searcher. If you filter by dates, run aggregations or range queries, use the search stream API, or even for tracing, it might be worth increasing this parameter. The [metrics](../reference/metrics.md) starting with `quickwit_cache_fastfields_cache` can help you make an informed choice when setting this value. 
| `1G` | +| `split_footer_cache_capacity` | Split footer in-memory cache (it is essentially the hotcache) capacity on a Searcher.| `500M` | +| `partial_request_cache_capacity` | Partial request in-memory cache capacity on a Searcher. Caches intermediate state for a request, possibly making subsequent requests faster. It can be disabled by setting the size to `0`. | `64M` | | `max_num_concurrent_split_searches` | Maximum number of concurrent split search requests running on a Searcher. | `100` | | `max_num_concurrent_split_streams` | Maximum number of concurrent split stream requests running on a Searcher. | `100` | +| `split_cache` | Searcher split cache configuration options defined in the section below. | | + + +### Searcher split cache configuration + +This section contains the configuration options for the searcher split cache. + +| Property | Description | Default value | +| --- | --- | --- | +| `max_num_bytes` | Maximum size in bytes allowed in the split cache. | `1G` | +| `max_num_splits` | Maximum number of splits allowed in the split cache. | `10000` | +| `num_concurrent_downloads` | Maximum number of concurrent split downloads. | `1` | + Example: @@ -205,6 +218,10 @@ searcher: fast_field_cache_capacity: 1G split_footer_cache_capacity: 500M partial_request_cache_capacity: 64M + split_cache: + max_num_bytes: 1G + max_num_splits: 10000 + num_concurrent_downloads: 1 ``` ## Jaeger configuration diff --git a/docs/distributed-tracing/otel-service.md b/docs/distributed-tracing/otel-service.md index 49e9b0376b0..68d35d1eb88 100644 --- a/docs/distributed-tracing/otel-service.md +++ b/docs/distributed-tracing/otel-service.md @@ -5,7 +5,7 @@ sidebar_position: 5 Quickwit natively supports the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and provides a gRPC endpoint to receive spans from an OpenTelemetry collector, or from your application directly, via an exporter. This endpoint is enabled by default. 
-When enabled, Quickwit will start the gRPC service ready to receive spans from an OpenTelemetry collector. The spans are indexed on the `otel-trace-v0` index, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#trace-and-span-data-model). +When enabled, Quickwit will start the gRPC service ready to receive spans from an OpenTelemetry collector. The spans are indexed in the `otel-traces-v0_7` index by default, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#trace-and-span-data-model). If for any reason, you want to disable this endpoint, you can: - Set the `QW_ENABLE_OTLP_ENDPOINT` environment variable to `false` when starting Quickwit. @@ -17,26 +17,32 @@ indexer: enable_otlp_endpoint: false ``` +## Sending spans to your own index + +You can send spans to the index of your choice by setting the `qw-otel-traces-index` header of your gRPC request to the target index ID. + + ## Trace and span data model -A trace is a collection of spans that represents a single request. A span represents a single operation within a trace. OpenTelemetry collectors send spans, Quickwit then indexes them in the `otel-trace-v0` index that maps OpenTelemetry span model to an indexed document in Quickwit. +A trace is a collection of spans that represents a single request. A span represents a single operation within a trace. OpenTelemetry collectors send spans, and Quickwit then indexes them, by default, in the `otel-traces-v0_7` index, which maps the OpenTelemetry span model to an indexed document in Quickwit. The span model is derived from the [OpenTelemetry specification](https://opentelemetry.io/docs/reference/specification/trace/api/). 
-Below is the doc mapping of the `otel-trace-v0` index: +Below is the doc mapping of the `otel-traces-v0_7` index: ```yaml version: 0.7 -index_id: otel-trace-v0 +index_id: otel-traces-v0_7 doc_mapping: mode: strict field_mappings: - name: trace_id - type: text - tokenizer: raw + type: bytes + input_format: hex + output_format: hex fast: true - name: trace_state type: text @@ -44,6 +50,7 @@ doc_mapping: - name: service_name type: text tokenizer: raw + fast: true - name: resource_attributes type: json tokenizer: raw @@ -63,37 +70,39 @@ doc_mapping: type: u64 indexed: false - name: span_id - type: text - tokenizer: raw + type: bytes + input_format: hex + output_format: hex - name: span_kind type: u64 - name: span_name type: text tokenizer: raw + fast: true - name: span_fingerprint type: text tokenizer: raw - name: span_start_timestamp_nanos - type: u64 + type: datetime + input_formats: [unix_timestamp] + output_format: unix_timestamp_nanos indexed: false + fast: true + fast_precision: milliseconds - name: span_end_timestamp_nanos - type: u64 - indexed: false - - name: span_start_timestamp_secs type: datetime input_formats: [unix_timestamp] + output_format: unix_timestamp_nanos indexed: false - fast: true - fast_precision: seconds - stored: false + fast: false - name: span_duration_millis type: u64 indexed: false fast: true - stored: false - name: span_attributes type: json tokenizer: raw + fast: true - name: span_dropped_attributes_count type: u64 indexed: false @@ -105,13 +114,16 @@ doc_mapping: indexed: false - name: span_status type: json - indexed: false + indexed: true - name: parent_span_id - type: text + type: bytes + input_format: hex + output_format: hex indexed: false - name: events type: array tokenizer: raw + fast: true - name: event_names type: array tokenizer: default @@ -121,7 +133,7 @@ doc_mapping: type: array tokenizer: raw - timestamp_field: span_start_timestamp_secs + timestamp_field: span_start_timestamp_nanos indexing_settings: commit_timeout_secs: 
5 @@ -132,10 +144,8 @@ search_settings: ## Known limitations -There are a few limitations on the current distributed tracing setup in Quickwit 0.5: -- Aggregations are not available on sparse fields and JSON field, this will be fixed in 0.6. This means that only the timestamp and `trace_id` fields can support aggregations. -- The OTLP gRPC service does not provide High-Availability and High-Durability, this will be fixed in Q2/Q3. -- OTLP gRPC service index documents only in the `otel-trace-v0` index. -- OTLP HTTP is not available but it should be easy to add. +There are a few limitations on the current distributed tracing setup in Quickwit 0.7: +- The OTLP gRPC service does not provide High-Availability and High-Durability; this will be fixed in 0.8. +- OTLP HTTP is only available with the Binary Protobuf Encoding. OTLP HTTP with JSON encoding is not planned yet, but it could easily be added in the next version. Please open an issue if you need this feature. If you are interested in new features or discovered other limitations, please open an issue on [GitHub](https://github.com/quickwit-oss/quickwit). diff --git a/docs/distributed-tracing/overview.md b/docs/distributed-tracing/overview.md index 30e22838ef7..bbd71cb9c79 100644 --- a/docs/distributed-tracing/overview.md +++ b/docs/distributed-tracing/overview.md @@ -8,13 +8,13 @@ Distributed Tracing is a process that tracks your application requests flowing t Quickwit is a cloud-native engine to index and search unstructured data which makes it a perfect fit for a traces backend. -Moreover, Quickwit supports natively the [OpenTelemetry protocol](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and the [Jaeger UI](https://www.jaegertracing.io/). **This means that you can use Quickwit to store your traces and to query them with Jaeger UI**. 
+Moreover, Quickwit natively supports the [OpenTelemetry gRPC and HTTP (protobuf only) protocol](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and the [Jaeger gRPC API (SpanReader only)](https://www.jaegertracing.io/). **This means that you can use Quickwit to store your traces and to query them with Jaeger UI**. ![Quickwit Distributed Tracing](../assets/images/distributed-tracing-overview-light.png#gh-light-mode-only)![Quickwit Distributed Tracing](../assets/images/distributed-tracing-overview-dark.png#gh-dark-mode-only) ## Plug Quickwit to Jaeger -Quickwit implements a gRPC service compatible with Jaeger UI. All you need is to configure Jaeger with a (span) storage type `grpc-plugin` and you will be able to visualize your traces in Jaeger that are stored in Quickwit. +Quickwit implements a gRPC service compatible with Jaeger UI. All you need is to configure Jaeger with a (span) storage type `grpc-plugin` and you will be able to visualize in Jaeger your traces stored in any Quickwit index matching the pattern `otel-traces-v0_*`. We made a tutorial on [how to plug Quickwit to Jaeger UI](plug-quickwit-to-jaeger.md) that will guide you through the process. diff --git a/docs/distributed-tracing/plug-quickwit-to-jaeger.md b/docs/distributed-tracing/plug-quickwit-to-jaeger.md index f09512f681d..91002e20cad 100644 --- a/docs/distributed-tracing/plug-quickwit-to-jaeger.md +++ b/docs/distributed-tracing/plug-quickwit-to-jaeger.md @@ -40,7 +40,7 @@ docker run --rm --name jaeger-qw \ ### Linux -By default, quickwit is listening to `127.0.0.1`, and will not respond to request directed +By default, Quickwit is listening to `127.0.0.1`, and will not respond to requests directed to the docker bridge (`172.17.0.1`). There are different ways to solve this problem. The easiest is probably to use host network mode. 
diff --git a/docs/distributed-tracing/send-traces/using-otel-collector.md b/docs/distributed-tracing/send-traces/using-otel-collector.md index 87b27b36fbe..dd9213ecc6f 100644 --- a/docs/distributed-tracing/send-traces/using-otel-collector.md +++ b/docs/distributed-tracing/send-traces/using-otel-collector.md @@ -29,6 +29,10 @@ exporters: endpoint: host.docker.internal:7281 tls: insecure: true + # By default, traces are sent to the otel-traces-v0_7 index. + # You can customize the index ID by setting this header. + # headers: + # qw-otel-traces-index: otel-traces-v0_7 service: pipelines: diff --git a/docs/log-management/otel-service.md b/docs/log-management/otel-service.md index 6e55bd4b7cd..859b462a3e9 100644 --- a/docs/log-management/otel-service.md +++ b/docs/log-management/otel-service.md @@ -5,7 +5,7 @@ sidebar_position: 4 Quickwit natively supports the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and provides a gRPC endpoint to receive spans from an OpenTelemetry collector. This endpoint is enabled by default. -When enabled, Quickwit will start the gRPC service ready to receive spans from an OpenTelemetry collector. The spans are indexed on the `otel-trace-v0` index, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#trace-and-span-data-model). +When enabled, Quickwit will start the gRPC service ready to receive logs from an OpenTelemetry collector. The logs are indexed in the `otel-logs-v0_7` index by default, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#opentelemetry-logs-data-model). If for any reason, you want to disable this endpoint, you can: - Set the `QW_ENABLE_OTLP_ENDPOINT` environment variable to `false` when starting Quickwit. 
@@ -17,61 +17,72 @@ indexer: enable_otlp_endpoint: false ``` +## Sending logs to your own index + +You can send logs to the index of your choice by setting the `qw-otel-logs-index` header of your gRPC request to the target index ID. + + ## OpenTelemetry logs data model -Quickwit sends OpenTelemetry logs into the `otel-logs-v0` index which is automatically created if you enable the OpenTelemetry service. +Quickwit sends OpenTelemetry logs into the `otel-logs-v0_7` index by default, which is automatically created if you enable the OpenTelemetry service. The doc mapping of this index described below is derived from the [OpenTelemetry logs data model](https://opentelemetry.io/docs/reference/specification/logs/data-model/). ```yaml version: 0.7 -index_id: otel-logs-v0 +index_id: otel-logs-v0_7 doc_mapping: mode: strict field_mappings: - - name: timestamp_secs + - name: timestamp_nanos type: datetime input_formats: [unix_timestamp] + output_format: unix_timestamp_nanos indexed: false fast: true - fast_precision: seconds - stored: false - - name: timestamp_nanos - type: u64 - indexed: false + fast_precision: milliseconds - name: observed_timestamp_nanos - type: u64 - indexed: false + type: datetime + input_formats: [unix_timestamp] + output_format: unix_timestamp_nanos - name: service_name type: text tokenizer: raw + fast: true - name: severity_text type: text tokenizer: raw + fast: true - name: severity_number type: u64 + fast: true - name: body type: json + tokenizer: default - name: attributes type: json tokenizer: raw + fast: true - name: dropped_attributes_count type: u64 indexed: false - name: trace_id - type: text - tokenizer: raw + type: bytes + input_format: hex + output_format: hex - name: span_id - type: text - tokenizer: raw + type: bytes + input_format: hex + output_format: hex - name: trace_flags type: u64 indexed: false - name: resource_attributes type: json tokenizer: raw + fast: true - name: resource_dropped_attributes_count type: u64 indexed: false @@ 
-88,13 +99,13 @@ doc_mapping: type: u64 indexed: false - timestamp_field: timestamp_secs + timestamp_field: timestamp_nanos indexing_settings: commit_timeout_secs: 5 search_settings: - default_search_fields: [] + default_search_fields: [body.message] ``` ## UI Integration @@ -108,10 +119,7 @@ You can also send traces to Quickwit that you can visualize in Jaeger UI, as exp ## Known limitations There are a few limitations on the log management setup in Quickwit 0.7: -- Aggregations are not available on sparse fields and JSON field, this will be fixed in 0.7. This means that only the timestamp field can support aggregations. -- The ingest API does not provide High-Availability and High-Durability, this will be fixed in Q2/Q3. -- Grafana and Elasticsearch query API support are planned for Q2 2023. -- OTLP gRPC service index documents only in the `otel-logs-v0` index. -- OTLP HTTP is not available but it should be easy to add. +- The ingest API does not provide High-Availability and High-Durability; this will be fixed in 0.8. +- OTLP HTTP is only available with the Binary Protobuf Encoding. OTLP HTTP with JSON encoding is not planned yet, but it could easily be added in the next version. Please open an issue if you need this feature. If you are interested in new features or discover other limitations, please open an issue on [GitHub](https://github.com/quickwit-oss/quickwit). diff --git a/docs/log-management/overview.md b/docs/log-management/overview.md index c1fd11e97bc..6ff35da6556 100644 --- a/docs/log-management/overview.md +++ b/docs/log-management/overview.md @@ -5,7 +5,7 @@ sidebar_position: 1 --- Quickwit is built from the ground up to [efficiently index unstructured data](../guides/schemaless.md), and search it effortlessly on cloud storage. -Moreover, Quickwit supports OpenTelemetry out of the box and provides a REST API ready to ingest any JSON formatted logs. 
+Moreover, Quickwit supports OpenTelemetry gRPC and HTTP (protobuf only) protocols out of the box and provides a REST API ready to ingest any JSON formatted logs. **This makes Quickwit a perfect fit for logs!**. ![Quickwit Log Management](../assets/images/log-management-overview-light.svg#gh-light-mode-only)![Quickwit Log Management](../assets/images/log-management-overview-dark.svg#gh-dark-mode-only) diff --git a/docs/log-management/send-logs/using-otel-collector-with-helm.md b/docs/log-management/send-logs/using-otel-collector-with-helm.md index d2d9771286e..3d8d753bf91 100644 --- a/docs/log-management/send-logs/using-otel-collector-with-helm.md +++ b/docs/log-management/send-logs/using-otel-collector-with-helm.md @@ -144,10 +144,12 @@ config: exporters: otlp: endpoint: quickwit-indexer.qw-tutorial.svc.cluster.local:7281 - # Quickwit OTEL gRPC endpoint does not support compression yet. - compression: none tls: insecure: true + # By default, logs are sent to the otel-logs-v0_7 index. + # You can customize the index ID by setting this header. + # headers: + # qw-otel-logs-index: otel-logs-v0_7 service: pipelines: logs: diff --git a/docs/log-management/send-logs/using-otel-collector.md b/docs/log-management/send-logs/using-otel-collector.md index d21e59d8f51..04c67f734b3 100644 --- a/docs/log-management/send-logs/using-otel-collector.md +++ b/docs/log-management/send-logs/using-otel-collector.md @@ -29,8 +29,11 @@ exporters: otlp/quickwit: endpoint: host.docker.internal:7281 tls: - insecure: true - + insecure: true + # By default, logs are sent to the otel-logs-v0_7 index. + # You can customize the index ID by setting this header. + # headers: + # qw-otel-logs-index: otel-logs-v0_7 service: pipelines: logs: @@ -58,6 +61,10 @@ exporters: endpoint: 127.0.0.1:7281 tls: insecure: true + # By default, logs are sent to the otel-logs-v0_7 index. + # You can customize the index ID by setting this header. 
+ # headers: + # qw-otel-logs-index: otel-logs-v0_7 service: pipelines: diff --git a/docs/log-management/supported-agents.md b/docs/log-management/supported-agents.md index bbdc0eeb9c0..257358642b5 100644 --- a/docs/log-management/supported-agents.md +++ b/docs/log-management/supported-agents.md @@ -58,7 +58,7 @@ You can also send your logs directly to this index by using the [ingest API](/do Quickwit natively supports the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and provides a gRPC endpoint to receive spans from an OpenTelemetry collector. This endpoint is enabled by default. -When enabled, Quickwit will start the gRPC service ready to receive spans from an OpenTelemetry collector. The spans are indexed on the `otel-trace-v0` index, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#trace-and-span-data-model). +When enabled, Quickwit will start the gRPC service ready to receive spans from an OpenTelemetry collector. The spans are indexed in the `otel-traces-v0_7` index by default, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#trace-and-span-data-model). If for any reason, you want to disable this endpoint, you can: - Set the `QW_ENABLE_OTLP_ENDPOINT` environment variable to `false` when starting Quickwit. diff --git a/docs/operating/upgrades.md b/docs/operating/upgrades.md new file mode 100644 index 00000000000..29f52d8e58b --- /dev/null +++ b/docs/operating/upgrades.md @@ -0,0 +1,20 @@ +--- +title: Version 0.7 upgrade +sidebar_position: 4 +--- + +## Migration from 0.6.x to 0.7.0 + +The format of the index and internal objects stored in the metastore of 0.7 is backward compatible with 0.6. + +If you are using the OTEL indexes and ingesting data into the `otel-logs-v0_6` and `otel-traces-v0_6` indexes, you must stop indexing before upgrading.
Indeed, the first time you start Quickwit 0.7, it will update the doc mapping fields of Trace ID and Span ID of those two indexes by changing their input/output formats from `base64` to `hex`. This is automatic: you don't have to perform any manual operation. + +Quickwit 0.7 will also create the new index `otel-traces-v0_7`, which is now used by default when ingesting data with the OTEL gRPC and HTTP API. The Jaeger gRPC and HTTP APIs will query both `otel-traces-v0_6` and `otel-traces-v0_7` by default. It's possible to define the index ID you want to use for the OTEL gRPC endpoints and the Jaeger gRPC API by setting the request header `qw-otel-logs-index` or `qw-otel-traces-index` to the index ID you want to target. + + +## Migration from 0.7.0 to 0.7.1 + +Quickwit 0.7.1 will create the new index `otel-logs-v0_7`, which is now used by default when ingesting data with the OTEL gRPC and HTTP API. + +In the traces index `otel-traces-v0_7`, the `service_name` field is now `fast`. +No migration is done if `otel-traces-v0_7` already exists. If you want the `service_name` field to be `fast`, you must first delete the existing `otel-traces-v0_7` index or create your own index.