Merge branch 'main' into count_api
PSeitz authored Jan 18, 2024
2 parents c6f1705 + 170eead commit 1238406
Showing 90 changed files with 2,211 additions and 1,279 deletions.
3 changes: 3 additions & 0 deletions config/quickwit.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,9 @@ version: 0.7
# x-header-1: header-value-1
# x-header-2: header-value-2
#
# grpc:
# max_message_size: 10 MiB
#
# IP address advertised by the node, i.e. the IP address that peer nodes should use to connect to the node for RPCs.
# The environment variable `QW_ADVERTISE_ADDRESS` can also be used to override this value.
# The default advertise address is `listen_address`. If `listen_address` is unspecified (`0.0.0.0`),
Expand Down
2 changes: 1 addition & 1 deletion config/tutorials/vector-otel-logs/vector.toml
Original file line number Diff line number Diff line change
Expand Up @@ -46,4 +46,4 @@ method = "post"
inputs = ["remap_syslog"]
encoding.codec = "json"
framing.method = "newline_delimited"
uri = "http://127.0.0.1:7280/api/v1/otel-logs-v0_6/ingest"
uri = "http://127.0.0.1:7280/api/v1/otel-logs-v0_7/ingest"
42 changes: 38 additions & 4 deletions docs/configuration/node-config.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,6 @@ A wildcard, single origin, or multiple origins can be specified as part of the `
Example of a REST configuration:

```yaml

rest:
listen_port: 1789
extra_headers:
Expand All @@ -65,9 +64,28 @@ rest:
# cors_allow_origins: # Or allow multiple origins
# - https://my-hdfs-logs.domain.com
# - https://my-hdfs.other-domain.com
```

## gRPC configuration

This section contains the configuration options for gRPC services and clients used for internal communication between nodes.

| Property | Description | Env variable | Default value |
| --- | --- | --- | --- |
| `max_message_size` | The maximum size (in bytes) of messages exchanged by internal gRPC clients and services. | | `20 MiB` |

Example of a gRPC configuration:

```yaml
grpc:
max_message_size: 30 MiB
```
:::warning
We advise changing the default value of 20 MiB only if you encounter the following error:
`Error, message length too large: found 24732228 bytes, the limit is: 20971520 bytes.` In that case, increase `max_message_size` by increments of 10 MiB until the issue disappears. This is a temporary fix: the next version of Quickwit, 0.8, will rely exclusively on gRPC streaming endpoints and handle messages of any length.
:::
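
The arithmetic behind this warning can be checked with a short sketch. The Python below is illustrative only and is not Quickwit code: `MIB`, `STEP_MIB`, and `next_limit_mib` are hypothetical names, and it assumes that `MiB` means 1,048,576 bytes, which is consistent with the 20971520-byte limit quoted in the error message.

```python
MIB = 1024 * 1024        # assumption: 1 MiB = 2**20 bytes
DEFAULT_LIMIT_MIB = 20   # default `max_message_size` from the table above
STEP_MIB = 10            # increment suggested by the warning


def next_limit_mib(message_len_bytes: int, limit_mib: int = DEFAULT_LIMIT_MIB) -> int:
    """Raise the limit in 10 MiB steps until the message fits (hypothetical helper)."""
    while message_len_bytes > limit_mib * MIB:
        limit_mib += STEP_MIB
    return limit_mib


if __name__ == "__main__":
    # Values taken from the sample error message.
    print(DEFAULT_LIMIT_MIB * MIB)   # 20971520
    print(next_limit_mib(24732228))  # 30
```

Under these assumptions, the 24732228-byte message from the sample error fits after a single 10 MiB increment, which is why the example configuration uses `30 MiB`.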

## Storage configuration

Please refer to the dedicated [storage configuration](storage-config) page to learn more about configuring Quickwit for various storage providers.
Expand Down Expand Up @@ -174,11 +192,23 @@ This section contains the configuration options for a Searcher.
| --- | --- | --- |
| `aggregation_memory_limit` | Controls the maximum amount of memory that can be used for aggregations before aborting. This limit is per request and single leaf query (a leaf query is querying one or multiple splits concurrently). It is used to prevent excessive memory usage during the aggregation phase, which can lead to performance degradation or crashes. Since it is per request, concurrent requests can exceed the limit. | `500M`|
| `aggregation_bucket_limit` | Determines the maximum number of buckets returned to the client. | `65000` |
| `fast_field_cache_capacity` | Fast field cache capacity on a Searcher. If your filter by dates, run aggregations, range queries, or if you use the search stream API, or even for tracing, it might worth increasing this parameter. The [metrics](../reference/metrics.md) starting by `quickwit_cache_fastfields_cache` can help you make an informed choice when setting this value. | `1G` |
| `split_footer_cache_capacity` | Split footer cache (it is essentially the hotcache) capacity on a Searcher.| `500M` |
| `partial_request_cache_capacity` | Partial request cache capacity on a Searcher. Cache intermediate state for a request, possibly making subsequent requests faster. It can be disabled by setting the size to `0`. | `64M` |
| `fast_field_cache_capacity` | Fast field in-memory cache capacity on a Searcher. If you filter by dates, run aggregations or range queries, use the search stream API, or use tracing, it might be worth increasing this parameter. The [metrics](../reference/metrics.md) starting with `quickwit_cache_fastfields_cache` can help you make an informed choice when setting this value. | `1G` |
| `split_footer_cache_capacity` | Split footer in-memory cache (essentially the hotcache) capacity on a Searcher. | `500M` |
| `partial_request_cache_capacity` | Partial request in-memory cache capacity on a Searcher. It caches the intermediate state of a request, possibly making subsequent requests faster. It can be disabled by setting the size to `0`. | `64M` |
| `max_num_concurrent_split_searches` | Maximum number of concurrent split search requests running on a Searcher. | `100` |
| `max_num_concurrent_split_streams` | Maximum number of concurrent split stream requests running on a Searcher. | `100` |
| `split_cache` | Searcher split cache configuration options defined in the section below. | |


### Searcher split cache configuration

This section contains the configuration options for the searcher split cache.

| Property | Description | Default value |
| --- | --- | --- |
| `max_num_bytes` | Maximum size in bytes allowed in the split cache. | `1G` |
| `max_num_splits` | Maximum number of splits allowed in the split cache. | `10000` |
| `num_concurrent_downloads` | Maximum number of concurrent split downloads. | `1` |


Example:

Expand All @@ -187,6 +217,10 @@ searcher:
fast_field_cache_capacity: 1G
split_footer_cache_capacity: 500M
partial_request_cache_capacity: 64M
split_cache:
max_num_bytes: 1G
max_num_splits: 10000
num_concurrent_downloads: 1
```

## Jaeger configuration
Expand Down
58 changes: 34 additions & 24 deletions docs/distributed-tracing/otel-service.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ sidebar_position: 5

Quickwit natively supports the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and provides a gRPC endpoint to receive spans from an OpenTelemetry collector, or from your application directly, via an exporter. This endpoint is enabled by default.

When enabled, Quickwit will start the gRPC service ready to receive spans from an OpenTelemetry collector. The spans are indexed on the `otel-trace-v0` index, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#trace-and-span-data-model).
When enabled, Quickwit starts the gRPC service, ready to receive spans from an OpenTelemetry collector. By default, the spans are indexed in the `otel-traces-v0_7` index, which is created automatically if it does not exist. The index doc mapping is described in the next [section](#trace-and-span-data-model).

If for any reason, you want to disable this endpoint, you can:
- Set the `QW_ENABLE_OTLP_ENDPOINT` environment variable to `false` when starting Quickwit.
Expand All @@ -17,33 +17,40 @@ indexer:
enable_otlp_endpoint: false
```
## Sending spans in your own index
You can send spans to the index of your choice by setting the `qw-otel-traces-index` header of your gRPC request to the target index ID.


## Trace and span data model

A trace is a collection of spans that represents a single request. A span represents a single operation within a trace. OpenTelemetry collectors send spans, Quickwit then indexes them in the `otel-trace-v0` index that maps OpenTelemetry span model to an indexed document in Quickwit.
A trace is a collection of spans that represents a single request. A span represents a single operation within a trace. OpenTelemetry collectors send spans, and Quickwit indexes them by default in the `otel-traces-v0_7` index, whose doc mapping maps the OpenTelemetry span model to an indexed document in Quickwit.

The span model is derived from the [OpenTelemetry specification](https://opentelemetry.io/docs/reference/specification/trace/api/).
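
To make the trace/span relationship concrete, here is a minimal Python sketch. It is not Quickwit code: the sample spans and the helpers `group_by_trace` and `duration_millis` are hypothetical, and only the field names (`trace_id`, `span_id`, `span_start_timestamp_nanos`, `span_end_timestamp_nanos`) come from the doc mapping. Deriving the duration as end minus start in whole milliseconds is an assumption about how a field such as `span_duration_millis` could be computed.

```python
from collections import defaultdict

# Hypothetical spans; IDs are hex strings, as in the `hex` input format.
SPANS = [
    {"trace_id": "0af7651916cd43dd8448eb211c80319c", "span_id": "b7ad6b7169203331",
     "span_start_timestamp_nanos": 1_700_000_000_000_000_000,
     "span_end_timestamp_nanos": 1_700_000_000_250_000_000},
    {"trace_id": "0af7651916cd43dd8448eb211c80319c", "span_id": "00f067aa0ba902b7",
     "span_start_timestamp_nanos": 1_700_000_000_050_000_000,
     "span_end_timestamp_nanos": 1_700_000_000_120_000_000},
    {"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "span_id": "53995c3f42cd8ad8",
     "span_start_timestamp_nanos": 1_700_000_001_000_000_000,
     "span_end_timestamp_nanos": 1_700_000_001_003_000_000},
]


def group_by_trace(spans):
    """Group spans by trace ID: one trace is the set of spans sharing an ID."""
    traces = defaultdict(list)
    for span in spans:
        traces[span["trace_id"]].append(span)
    return dict(traces)


def duration_millis(span):
    """Assumed derivation: (end - start) in nanoseconds, truncated to milliseconds."""
    elapsed = span["span_end_timestamp_nanos"] - span["span_start_timestamp_nanos"]
    return elapsed // 1_000_000


if __name__ == "__main__":
    for trace_id, trace_spans in group_by_trace(SPANS).items():
        # OpenTelemetry trace IDs are 16 bytes and span IDs are 8 bytes.
        assert len(bytes.fromhex(trace_id)) == 16
        print(trace_id, [duration_millis(s) for s in trace_spans])
```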

Below is the doc mapping of the `otel-trace-v0` index:
Below is the doc mapping of the `otel-traces-v0_7` index:

```yaml
version: 0.7
index_id: otel-trace-v0
index_id: otel-traces-v0_7
doc_mapping:
mode: strict
field_mappings:
- name: trace_id
type: text
tokenizer: raw
type: bytes
input_format: hex
output_format: hex
fast: true
- name: trace_state
type: text
indexed: false
- name: service_name
type: text
tokenizer: raw
fast: true
- name: resource_attributes
type: json
tokenizer: raw
Expand All @@ -63,37 +70,39 @@ doc_mapping:
type: u64
indexed: false
- name: span_id
type: text
tokenizer: raw
type: bytes
input_format: hex
output_format: hex
- name: span_kind
type: u64
- name: span_name
type: text
tokenizer: raw
fast: true
- name: span_fingerprint
type: text
tokenizer: raw
- name: span_start_timestamp_nanos
type: u64
type: datetime
input_formats: [unix_timestamp]
output_format: unix_timestamp_nanos
indexed: false
fast: true
fast_precision: milliseconds
- name: span_end_timestamp_nanos
type: u64
indexed: false
- name: span_start_timestamp_secs
type: datetime
input_formats: [unix_timestamp]
output_format: unix_timestamp_nanos
indexed: false
fast: true
fast_precision: seconds
stored: false
fast: false
- name: span_duration_millis
type: u64
indexed: false
fast: true
stored: false
- name: span_attributes
type: json
tokenizer: raw
fast: true
- name: span_dropped_attributes_count
type: u64
indexed: false
Expand All @@ -105,13 +114,16 @@ doc_mapping:
indexed: false
- name: span_status
type: json
indexed: false
indexed: true
- name: parent_span_id
type: text
type: bytes
input_format: hex
output_format: hex
indexed: false
- name: events
type: array<json>
tokenizer: raw
fast: true
- name: event_names
type: array<text>
tokenizer: default
Expand All @@ -121,7 +133,7 @@ doc_mapping:
type: array<json>
tokenizer: raw
timestamp_field: span_start_timestamp_secs
timestamp_field: span_start_timestamp_nanos
indexing_settings:
commit_timeout_secs: 5
Expand All @@ -132,10 +144,8 @@ search_settings:

## Known limitations

There are a few limitations on the current distributed tracing setup in Quickwit 0.5:
- Aggregations are not available on sparse fields and JSON field, this will be fixed in 0.6. This means that only the timestamp and `trace_id` fields can support aggregations.
- The OTLP gRPC service does not provide High-Availability and High-Durability, this will be fixed in Q2/Q3.
- OTLP gRPC service index documents only in the `otel-trace-v0` index.
- OTLP HTTP is not available but it should be easy to add.
There are a few limitations on the current distributed tracing setup in Quickwit 0.7:
- The OTLP gRPC service does not provide High-Availability and High-Durability. This will be fixed in 0.8.
- OTLP HTTP is only available with the Binary Protobuf Encoding. OTLP HTTP with JSON encoding is not planned yet, although it could easily be added in the next version. Please open an issue if you need this feature.

If you are interested in new features or discovered other limitations, please open an issue on [GitHub](https://github.com/quickwit-oss/quickwit).
4 changes: 2 additions & 2 deletions docs/distributed-tracing/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,13 +8,13 @@ Distributed Tracing is a process that tracks your application requests flowing t

Quickwit is a cloud-native engine to index and search unstructured data which makes it a perfect fit for a traces backend.

Moreover, Quickwit supports natively the [OpenTelemetry protocol](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and the [Jaeger UI](https://www.jaegertracing.io/). **This means that you can use Quickwit to store your traces and to query them with Jaeger UI**.
Moreover, Quickwit natively supports the [OpenTelemetry gRPC and HTTP (protobuf only) protocols](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and the [Jaeger gRPC API (SpanReader only)](https://www.jaegertracing.io/). **This means that you can use Quickwit to store your traces and to query them with Jaeger UI**.

![Quickwit Distributed Tracing](../assets/images/distributed-tracing-overview-light.png#gh-light-mode-only)![Quickwit Distributed Tracing](../assets/images/distributed-tracing-overview-dark.png#gh-dark-mode-only)

## Plug Quickwit to Jaeger

Quickwit implements a gRPC service compatible with Jaeger UI. All you need is to configure Jaeger with a (span) storage type `grpc-plugin` and you will be able to visualize your traces in Jaeger that are stored in Quickwit.
Quickwit implements a gRPC service compatible with Jaeger UI. All you need to do is configure Jaeger with the (span) storage type `grpc-plugin`, and you will be able to visualize in Jaeger the traces stored in any Quickwit index matching the pattern `otel-traces-v0_*`.

We made a tutorial on [how to plug Quickwit to Jaeger UI](plug-quickwit-to-jaeger.md) that will guide you through the process.
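
As a rough illustration of which index IDs the pattern `otel-traces-v0_*` covers, the sketch below uses Python's standard `fnmatch` module. It assumes the pattern behaves like a shell-style glob, which may not match Quickwit's own matching rules exactly; `is_visible_in_jaeger` and the index IDs other than `otel-traces-v0_7` are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern quoted in this section; glob semantics are an assumption.
PATTERN = "otel-traces-v0_*"


def is_visible_in_jaeger(index_id: str) -> bool:
    """Return True if the index ID matches the pattern (hypothetical helper)."""
    return fnmatchcase(index_id, PATTERN)


if __name__ == "__main__":
    for index_id in ["otel-traces-v0_7", "otel-traces-v0_7-staging", "otel-logs-v0_7", "my-traces"]:
        print(index_id, is_visible_in_jaeger(index_id))
```

Under this assumption, a custom traces index needs an ID that starts with `otel-traces-v0_` for its traces to show up in Jaeger.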

Expand Down
2 changes: 1 addition & 1 deletion docs/distributed-tracing/plug-quickwit-to-jaeger.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ docker run --rm --name jaeger-qw \

### Linux

By default, quickwit is listening to `127.0.0.1`, and will not respond to request directed
By default, Quickwit listens on `127.0.0.1` and will not respond to requests directed
to the docker bridge (`172.17.0.1`). There are different ways to solve this problem.
The easiest is probably to use host network mode.

Expand Down
4 changes: 4 additions & 0 deletions docs/distributed-tracing/send-traces/using-otel-collector.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,10 @@ exporters:
endpoint: host.docker.internal:7281
tls:
insecure: true
# By default, traces are sent to the otel-traces-v0_7 index.
# You can customize the index ID by setting this header.
# headers:
# qw-otel-traces-index: otel-traces-v0_7

service:
pipelines:
Expand Down
52 changes: 30 additions & 22 deletions docs/log-management/otel-service.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ sidebar_position: 4

Quickwit natively supports the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/otlp/) and provides a gRPC endpoint to receive logs from an OpenTelemetry collector. This endpoint is enabled by default.

When enabled, Quickwit will start the gRPC service ready to receive spans from an OpenTelemetry collector. The spans are indexed on the `otel-trace-v0` index, and this index will be automatically created if not present. The index doc mapping is described in the next [section](#trace-and-span-data-model).
When enabled, Quickwit starts the gRPC service, ready to receive logs from an OpenTelemetry collector. By default, the logs are indexed in the `otel-logs-v0_7` index, which is created automatically if it does not exist. The index doc mapping is described in the next [section](#opentelemetry-logs-data-model).

If for any reason, you want to disable this endpoint, you can:
- Set the `QW_ENABLE_OTLP_ENDPOINT` environment variable to `false` when starting Quickwit.
Expand All @@ -17,61 +17,72 @@ indexer:
enable_otlp_endpoint: false
```
## Sending logs in your own index
You can send logs to the index of your choice by setting the `qw-otel-logs-index` header of your gRPC request to the target index ID.


## OpenTelemetry logs data model

Quickwit sends OpenTelemetry logs into the `otel-logs-v0` index which is automatically created if you enable the OpenTelemetry service.
By default, Quickwit sends OpenTelemetry logs to the `otel-logs-v0_7` index, which is automatically created if you enable the OpenTelemetry service.
The doc mapping of this index described below is derived from the [OpenTelemetry logs data model](https://opentelemetry.io/docs/reference/specification/logs/data-model/).

```yaml
version: 0.7
index_id: otel-logs-v0
index_id: otel-logs-v0_7
doc_mapping:
mode: strict
field_mappings:
- name: timestamp_secs
- name: timestamp_nanos
type: datetime
input_formats: [unix_timestamp]
output_format: unix_timestamp_nanos
indexed: false
fast: true
fast_precision: seconds
stored: false
- name: timestamp_nanos
type: u64
indexed: false
fast_precision: milliseconds
- name: observed_timestamp_nanos
type: u64
indexed: false
type: datetime
input_formats: [unix_timestamp]
output_format: unix_timestamp_nanos
- name: service_name
type: text
tokenizer: raw
fast: true
- name: severity_text
type: text
tokenizer: raw
fast: true
- name: severity_number
type: u64
fast: true
- name: body
type: json
tokenizer: default
- name: attributes
type: json
tokenizer: raw
fast: true
- name: dropped_attributes_count
type: u64
indexed: false
- name: trace_id
type: text
tokenizer: raw
type: bytes
input_format: hex
output_format: hex
- name: span_id
type: text
tokenizer: raw
type: bytes
input_format: hex
output_format: hex
- name: trace_flags
type: u64
indexed: false
- name: resource_attributes
type: json
tokenizer: raw
fast: true
- name: resource_dropped_attributes_count
type: u64
indexed: false
Expand All @@ -88,13 +99,13 @@ doc_mapping:
type: u64
indexed: false
timestamp_field: timestamp_secs
timestamp_field: timestamp_nanos
indexing_settings:
commit_timeout_secs: 5
search_settings:
default_search_fields: []
default_search_fields: [body.message]
```
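
The `timestamp_nanos` field above is a `datetime` with `fast_precision: milliseconds`. The sketch below shows what truncating a nanosecond timestamp to millisecond precision looks like. It is illustrative Python, not Quickwit code: `truncate_to_millis` and `to_utc_datetime` are hypothetical helpers, and treating `fast_precision` as a truncation of the stored fast value is an assumption.

```python
from datetime import datetime, timezone

NANOS_PER_MILLI = 1_000_000
NANOS_PER_SECOND = 1_000_000_000


def truncate_to_millis(timestamp_nanos: int) -> int:
    """Drop the sub-millisecond part of a Unix timestamp expressed in nanoseconds."""
    return timestamp_nanos - timestamp_nanos % NANOS_PER_MILLI


def to_utc_datetime(timestamp_nanos: int) -> datetime:
    """Convert to a UTC datetime; Python datetimes only keep microseconds."""
    secs, nanos = divmod(timestamp_nanos, NANOS_PER_SECOND)
    return datetime.fromtimestamp(secs, tz=timezone.utc).replace(microsecond=nanos // 1_000)


if __name__ == "__main__":
    ts = 1_700_000_000_123_456_789  # made-up log timestamp
    print(truncate_to_millis(ts))   # 1700000000123000000
    print(to_utc_datetime(ts).isoformat())
```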

## UI Integration
Expand All @@ -108,10 +119,7 @@ You can also send traces to Quickwit that you can visualize in Jaeger UI, as exp
## Known limitations

There are a few limitations on the log management setup in Quickwit 0.7:
- Aggregations are not available on sparse fields and JSON field, this will be fixed in 0.7. This means that only the timestamp field can support aggregations.
- The ingest API does not provide High-Availability and High-Durability, this will be fixed in Q2/Q3.
- Grafana and Elasticsearch query API support are planned for Q2 2023.
- OTLP gRPC service index documents only in the `otel-logs-v0` index.
- OTLP HTTP is not available but it should be easy to add.
- The ingest API does not provide High-Availability and High-Durability. This will be fixed in 0.8.
- OTLP HTTP is only available with the Binary Protobuf Encoding. OTLP HTTP with JSON encoding is not planned yet, although it could easily be added in the next version. Please open an issue if you need this feature.

If you are interested in new features or discover other limitations, please open an issue on [GitHub](https://github.com/quickwit-oss/quickwit).