Merge branch 'main' into etolbakov/grafana-tracing-support
etolbakov authored Dec 9, 2023
2 parents de10253 + 735e384 commit 870f389
Showing 172 changed files with 15,231 additions and 9,047 deletions.
42 changes: 42 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,42 @@
{
  "name": "Quickwit",
  "image": "mcr.microsoft.com/devcontainers/rust:latest",
  "customizations": {
    "codespaces": {
      "openFiles": [
        "CONTRIBUTING.md"
      ]
    },
    "vscode": {
      "extensions": [
        "rust-lang.rust-analyzer"
      ]
    }
  },
  "hostRequirements": {
    "cpus": 4,
    "memory": "16gb"
  },
  "runArgs": [
    "--init"
  ],
  "mounts": [
    {
      "source": "/var/run/docker.sock",
      "target": "/var/run/docker.sock",
      "type": "bind"
    }
  ],
  "features": {
    "docker-from-docker": {
      "version": "latest",
      "moby": true
    },
    "ghcr.io/devcontainers/features/node:1": {
      "version": "18"
    },
    "ghcr.io/devcontainers/features/aws-cli:1": {},
    "ghcr.io/devcontainers-contrib/features/protoc:1": {}
  },
  "postCreateCommand": ".devcontainer/post-create.sh"
}
53 changes: 53 additions & 0 deletions .devcontainer/post-create.sh
@@ -0,0 +1,53 @@
#!/bin/bash

# Define success and error color codes
SUCCESS_COLOR="\e[32m"
ERROR_COLOR="\e[31m"
RESET_COLOR="\e[0m"

# Define success tracking variables
rustupToolchainNightlyInstalled=false
cmakeInstalled=false


# Define installation functions

#Installing manually for now until we figure out why "ghcr.io/devcontainers-community/features/cmake": {} is not working
install_cmake() {
    echo -e "Installing CMake..."
    sudo apt-get update
    sudo apt-get install -y cmake > /dev/null 2>&1
    if [[ "$(cmake --version)" =~ "cmake version" ]]; then
        echo -e "${SUCCESS_COLOR}CMake installed successfully.${RESET_COLOR}"
        cmakeInstalled=true
    else
        echo -e "${ERROR_COLOR}CMake installation failed. Please install it manually.${RESET_COLOR}"
    fi
}

install_rustup_toolchain_nightly() {
    echo -e "Installing Rustup nightly toolchain..."
    rustup toolchain install nightly > /dev/null 2>&1
    rustup component add rustfmt --toolchain nightly > /dev/null 2>&1
    if [[ "$(rustup toolchain list)" =~ "nightly" && "$(rustup component list --toolchain nightly | grep rustfmt)" =~ "installed" ]]; then
        echo -e "${SUCCESS_COLOR}Rustup nightly toolchain and rustfmt installed successfully.${RESET_COLOR}"
        rustupToolchainNightlyInstalled=true
    else
        echo -e "${ERROR_COLOR}Rustup nightly toolchain and/or rustfmt installation failed. Please install them manually.${RESET_COLOR}"
    fi
}

# Install tools
install_cmake
install_rustup_toolchain_nightly

# Copy our custom welcome message to replace the default github welcome message
sudo cp .devcontainer/welcome.txt /usr/local/etc/vscode-dev-containers/first-run-notice.txt


# Check the success tracking variables
if $rustupToolchainNightlyInstalled && $cmakeInstalled; then
    echo -e "${SUCCESS_COLOR}All tools installed successfully.${RESET_COLOR}"
else
    echo -e "${ERROR_COLOR}One or more tools failed to install. Please check the output for errors and install the failed tools manually.${RESET_COLOR}"
fi
16 changes: 16 additions & 0 deletions .devcontainer/welcome.txt
@@ -0,0 +1,16 @@
👋 Welcome to the project!
All the necessary tools have already been installed for you 🎉.
You can go ahead and start hacking! Happy coding💻.

Here are some useful commands you can run:

🔧 `make test-all` - starts necessary Docker services and runs all tests.
🔧 `make -k test-all docker-compose-down` - the same as above, but tears down the Docker services after running all the tests.
🔧 `make fmt` - runs formatter, this command requires the nightly toolchain to be installed by running `rustup toolchain install nightly`.
🔧 `make fix` - runs formatter and clippy checks.
🔧 `make typos` - runs the spellcheck tool over the codebase. (Install by running `cargo install typos`)
🔧 `make build-docs` - builds docs.
🔧 `make docker-compose-up` - starts Docker services.
🔧 `make docker-compose-down` - stops Docker services.
🔧 `make docker-compose-logs` - shows Docker logs.

14 changes: 13 additions & 1 deletion CONTRIBUTING.md
@@ -32,12 +32,24 @@ You will be notified by email from the CI system if any issues are discovered, b
3. To build docs run `make build-docs`.

# Development

## Setup & run tests

### Local Development

1. Install Rust, CMake, Docker (https://docs.docker.com/engine/install/) and Docker Compose (https://docs.docker.com/compose/install/)
2. Install node@18 and `npm install -g yarn`
3. Install awslocal https://github.com/localstack/awscli-local
4. Install protoc https://grpc.io/docs/protoc-installation/ (you may need to install the latest binaries rather than your distro's flavor)
5. Run all tests using `make test-all`
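
The prerequisites above can be checked before running `make test-all`. This is a minimal sketch, assuming the tools are exposed under the command names `cargo`, `cmake`, `docker`, `node`, `yarn`, `awslocal`, and `protoc`; the `missing_tools` helper is hypothetical and not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" > /dev/null 2>&1 || echo "$tool"
  done
}

# Command names are assumptions derived from the install steps above.
missing=$(missing_tools cargo cmake docker node yarn awslocal protoc)
if [ -z "$missing" ]; then
  echo "All prerequisites found."
else
  echo "Missing tools:"
  echo "$missing"
fi
```

The check only confirms that each command resolves on `PATH`; it does not verify versions, so an outdated distro `protoc` would still pass.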

### GitHub Codespaces

[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/quickwit-oss/quickwit?devcontainer_path=.devcontainer/devcontainer.json)

GitHub Codespaces provides a fully configured development environment in the cloud, making it easy to get started with Quickwit development. By clicking the badge above, you can create a codespace with all the necessary tools installed and configured.

### Running tests
Run `make test-all` to run all tests.

## Useful commands
* `make test-all` - starts necessary Docker services and runs all tests.
14 changes: 8 additions & 6 deletions README.md
@@ -136,12 +136,14 @@ Our business model relies on our commercial license. There is no plan to become

We are always thrilled to receive contributions: code, documentation, issues, or feedback. Here's how you can help us build the future of log management:

- Check out the [GitHub issues labeled "Good first issue"](https://github.com/quickwit-oss/quickwit/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) for a great place to start.
- Familiarize yourself with our [Contributor Covenant Code of Conduct](https://github.com/quickwit-oss/quickwit/blob/0add0562f08e4edd46f5c5537e8ef457d42a508e/CODE_OF_CONDUCT.md).
- Delve into our [contributing guide](CONTRIBUTING.md).
- [Create a fork of Quickwit](https://github.com/quickwit-oss/quickwit/fork) and submit your pull request!

✨ And to thank you for your contributions, claim your swag by emailing us at [email protected].
- Start by checking out the [GitHub issues labeled "Good first issue"](https://github.com/quickwit-oss/quickwit/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22). These are a great place for newcomers to contribute.
- Read our [Contributor Covenant Code of Conduct](./CODE_OF_CONDUCT.md) to understand our community standards.
- [Create a fork of Quickwit](https://github.com/quickwit-oss/quickwit/fork) to have your own copy of the repository where you can make changes.
- To understand how to contribute, read our [contributing guide](./CONTRIBUTING.md).
- Set up your development environment following our [development setup guide](./CONTRIBUTING.md#development).
- Once you've made your changes and tested them, you can contribute by [submitting a pull request](./CONTRIBUTING.md#submitting-a-pr).

✨ After your contributions are accepted, don't forget to claim your swag by emailing us at [email protected]. Thank you for contributing!

# 💬 Join Our Community

9 changes: 8 additions & 1 deletion config/quickwit.yaml
@@ -32,7 +32,14 @@ version: 0.6
# 2. pass `0.0.0.0` and let Quickwit do its best to discover the node's IP (see `advertise_address`)
#
# listen_address: 127.0.0.1
# rest_listen_port: 7280
#
# rest:
#   listen_port: 7280
#   cors_allow_origins:
#     - "http://localhost:3000"
#   extra_headers:
#     x-header-1: header-value-1
#     x-header-2: header-value-2
#
# IP address advertised by the node, i.e. the IP address that peer nodes should use to connect to the node for RPCs.
# The environment variable `QW_ADVERTISE_ADDRESS` can also be used to override this value.
6 changes: 5 additions & 1 deletion config/tutorials/hdfs-logs/searcher-1.yaml
@@ -1,7 +1,11 @@
version: 0.6
node_id: searcher-1
listen_address: 127.0.0.1
rest_listen_port: 7280
rest:
  listen_port: 7280
ingest_api:
  max_queue_memory_usage: 4GiB
  max_queue_disk_usage: 8GiB
peer_seeds:
- 127.0.0.1:7290 # searcher-2
- 127.0.0.1:7300 # searcher-3
3 changes: 2 additions & 1 deletion config/tutorials/hdfs-logs/searcher-2.yaml
@@ -1,7 +1,8 @@
version: 0.6
node_id: searcher-2
listen_address: 127.0.0.1
rest_listen_port: 7290
rest:
  listen_port: 7290
peer_seeds:
- 127.0.0.1:7280 # searcher-1
- 127.0.0.1:7300 # searcher-3
3 changes: 2 additions & 1 deletion config/tutorials/hdfs-logs/searcher-3.yaml
@@ -1,7 +1,8 @@
version: 0.6
node_id: searcher-3
listen_address: 127.0.0.1
rest_listen_port: 7300
rest:
  listen_port: 7300
peer_seeds:
- 127.0.0.1:7280 # searcher-1
- 127.0.0.1:7290 # searcher-2
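
For configs that still use the flat `rest_listen_port` key shown as removed in the tutorial files above, the move to the nested `rest.listen_port` form can be scripted. The sketch below is one possible approach, not a tool shipped with Quickwit; `migrate_rest_port` is a hypothetical helper, and it assumes the key sits at the top level with a plain numeric value and no trailing comment:

```shell
#!/bin/sh
# Hypothetical helper: rewrite a flat `rest_listen_port: N` line into a nested `rest:` block.
# All other lines are passed through unchanged.
migrate_rest_port() {
  awk '
    /^rest_listen_port:/ { print "rest:"; print "  listen_port: " $2; next }
    { print }
  ' "$1"
}

# Usage (file names are illustrative): write the result to a new file and review it by hand.
# migrate_rest_port searcher-1.yaml > searcher-1.migrated.yaml
```

Writing to a separate file rather than editing in place keeps the original available for comparison.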
2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -208,7 +208,7 @@ services:
# It is not an official docker image
# if we prefer we can build a docker from the official docker image (gcloud cli)
# and install the pubsub emulator https://cloud.google.com/pubsub/docs/emulator
image: thekevjames/gcloud-pubsub-emulator:${GCLOUD_EMULATOR:-7555256f2c}
image: thekevjames/gcloud-pubsub-emulator:${GCLOUD_EMULATOR:-455.0.0}
container_name: gcp-pubsub-emulator
ports:
- "${MAP_HOST_GCLOUD_EMULATOR:-127.0.0.1}:8681:8681"
15 changes: 8 additions & 7 deletions docs/configuration/index-config.md
@@ -1,6 +1,7 @@
---
title: Index configuration
sidebar_position: 3
toc_max_heading_level: 4
---

This page describes how to configure an index.
@@ -82,11 +83,11 @@ The file storage will not work when running quickwit in distributed mode. Instea

## Doc mapping

The doc mapping defines how a document and the fields it contains are stored and indexed for a given index. A document is a collection of named fields, each having its own data type (text, binary, datetime, bool, i64, u64, f64).
The doc mapping defines how a document and the fields it contains are stored and indexed for a given index. A document is a collection of named fields, each having its own data type (text, bytes, datetime, bool, i64, u64, f64, ip, json).

| Variable | Description | Default value |
| ------------- | ------------- | ------------- |
| `field_mappings` | Collection of field mapping, each having its own data type (text, binary, datetime, bool, i64, u64, f64). | `[]` |
| `field_mappings` | Collection of field mappings, each having its own data type (text, bytes, datetime, bool, i64, u64, f64, ip, json). | `[]` |
| `mode` | Defines how quickwit should handle document fields that are not present in the `field_mappings`. In particular, the "dynamic" mode makes it possible to use quickwit in a schemaless manner. (See [mode](#mode)) | `dynamic`
| `dynamic_mapping` | This parameter is only allowed when `mode` is set to `dynamic`. It then defines whether dynamically mapped fields should be indexed, stored, etc. | (See [mode](#mode))
| `tag_fields` | Collection of fields* already defined in `field_mappings` whose values will be stored as part of the `tags` metadata. [Learn more about tags](../overview/concepts/querying.md#tag-pruning). | `[]` |
@@ -101,7 +102,7 @@ The doc mapping defines how a document and the fields it contains are stored and
### Field types

Each field[^1] has a type that indicates the kind of data it contains, such as integer on 64 bits or text.
Quickwit supports the following raw types [`text`](#text-type), [`i64`](#numeric-types-i64-u64-and-f64-type), [`u64`](#numeric-types-i64-u64-and-f64-type), [`f64`](#numeric-types-i64-u64-and-f64-type), [`datetime`](#datetime-type), [`bool`](#bool-type), [`ip`](#ip-type), and [`bytes`](#bytes-type), and also supports composite types such as array and object. Behind the scenes, Quickwit is using tantivy field types, don't hesitate to look at [tantivy documentation](https://github.com/tantivy-search/tantivy) if you want to go into the details.
Quickwit supports the following raw types [`text`](#text-type), [`i64`](#numeric-types-i64-u64-and-f64-type), [`u64`](#numeric-types-i64-u64-and-f64-type), [`f64`](#numeric-types-i64-u64-and-f64-type), [`datetime`](#datetime-type), [`bool`](#bool-type), [`ip`](#ip-type), [`bytes`](#bytes-type), and [`json`](#json-type), and also supports composite types such as array and object. Behind the scenes, Quickwit is using tantivy field types, don't hesitate to look at [tantivy documentation](https://github.com/tantivy-search/tantivy) if you want to go into the details.

### Raw types

@@ -135,7 +136,7 @@ fast:
| `fieldnorms` | Whether to store fieldnorms for the field. Fieldnorms are required to calculate the BM25 Score of the document. | `false` |
| `fast` | Whether the value is stored in a fast field. The fast field will contain the term ids and the dictionary. The default behaviour for `true` is to store the original text unchanged. The normalizer on the fast field is configured separately, for example via `normalizer: lowercase`. See [normalizers](#description-of-available-normalizers) for the list of available normalizers. | `false` |

#### **Description of available tokenizers**
##### Description of available tokenizers

| Tokenizer | Description |
| ------------- | ------------- |
@@ -145,7 +146,7 @@ fast:
| `chinese_compatible` | Chop between each CJK character in addition to what `default` does. Should be used with `record: position` to be able to properly search |
| `lowercase` | Applies a lowercase transformation on the text. It does not tokenize the text. |

#### **Description of available normalizers**
##### Description of available normalizers

| Normalizer | Description |
| ------------- | ------------- |
@@ -387,13 +388,13 @@ If, in addition, `attributes` is set as a default search field, then `color:red`

### Composite types

#### **array**
#### array

Quickwit supports arrays for all raw types except for `object` types.

To declare an array type of `i64` in the index config, you just have to set the type to `array<i64>`.

#### **object**
#### object

Quickwit supports nested objects as long as it does not contain arrays of objects.

66 changes: 42 additions & 24 deletions docs/configuration/node-config.md
@@ -25,14 +25,48 @@ A commented example is available here: [quickwit.yaml](https://github.com/quickw
| `enabled_services` | Enabled services (control_plane, indexer, janitor, metastore, searcher) | `QW_ENABLED_SERVICES` | all services |
| `listen_address` | The IP address or hostname that Quickwit service binds to for starting REST and GRPC server and connecting this node to other nodes. By default, Quickwit binds itself to 127.0.0.1 (localhost). This default is not valid when trying to form a cluster. | `QW_LISTEN_ADDRESS` | `127.0.0.1` |
| `advertise_address` | IP address advertised by the node, i.e. the IP address that peer nodes should use to connect to the node for RPCs. | `QW_ADVERTISE_ADDRESS` | `listen_address` |
| `rest_listen_port` | The port which to listen for HTTP REST API. | `QW_REST_LISTEN_PORT` | `7280` |
| `gossip_listen_port` | The port which to listen for the Gossip cluster membership service (UDP). | `QW_GOSSIP_LISTEN_PORT` | `rest_listen_port` |
| `grpc_listen_port` | The port which to listen for the gRPC service.| `QW_GRPC_LISTEN_PORT` | `rest_listen_port + 1` |
| `gossip_listen_port` | The port which to listen for the Gossip cluster membership service (UDP). | `QW_GOSSIP_LISTEN_PORT` | `rest.listen_port` |
| `grpc_listen_port` | The port on which gRPC services listen for traffic. | `QW_GRPC_LISTEN_PORT` | `rest.listen_port + 1` |
| `peer_seeds` | List of IP addresses or hostnames used to bootstrap the cluster and discover the complete set of nodes. This list may contain the current node address and does not need to be exhaustive. | `QW_PEER_SEEDS` | |
| `data_dir` | Path to directory where data (tmp data, splits kept for caching purpose) is persisted. This is mostly used in indexing. | `QW_DATA_DIR` | `./qwdata` |
| `metastore_uri` | Metastore URI. Can be a local directory or `s3://my-bucket/indexes` or `postgres://username:password@localhost:5432/metastore`. [Learn more about the metastore configuration](metastore-config.md). | `QW_METASTORE_URI` | `{data_dir}/indexes` |
| `default_index_root_uri` | Default index root URI that defines the location where index data (splits) is stored. The index URI is built following the scheme: `{default_index_root_uri}/{index-id}` | `QW_DEFAULT_INDEX_ROOT_URI` | `{data_dir}/indexes` |
| `rest_cors_allow_origins` | Configure the CORS origins which are allowed to access the API. [Read more](#configuring-cors-cross-origin-resource-sharing) | |

## REST configuration

This section contains the REST API configuration options.

| Property | Description | Env variable | Default value |
| --- | --- | --- | --- |
| `listen_port` | The port on which the REST API listens for HTTP traffic. | `QW_REST_LISTEN_PORT` | `7280` |
| `cors_allow_origins` | Configure the CORS origins which are allowed to access the API. [Read more](#configuring-cors-cross-origin-resource-sharing) | | |
| `extra_headers` | List of header names and values | | |

### Configuring CORS (Cross-origin resource sharing)

CORS (Cross-origin resource sharing) describes which address or origins can access the REST API from the browser.
By default, sharing resources cross-origin is not allowed.

A wildcard, single origin, or multiple origins can be specified as part of the `cors_allow_origins` parameter:


Example of a REST configuration:

```yaml

rest:
  listen_port: 1789
  extra_headers:
    x-header-1: header-value-1
    x-header-2: header-value-2
  cors_allow_origins: '*'

#   cors_allow_origins: https://my-hdfs-logs.domain.com   # Optionally we can specify one domain
#   cors_allow_origins:                                   # Or allow multiple origins
#     - https://my-hdfs-logs.domain.com
#     - https://my-hdfs.other-domain.com

```

## Storage configuration

@@ -203,7 +237,8 @@ version: 0.6
cluster_id: quickwit-cluster
node_id: my-unique-node-id
listen_address: ${QW_LISTEN_ADDRESS}
rest_listen_port: ${QW_LISTEN_PORT:-1111}
rest:
  listen_port: ${QW_LISTEN_PORT:-1111}
```
Will be interpreted by Quickwit as:
@@ -213,23 +248,6 @@ version: 0.6
cluster_id: quickwit-cluster
node_id: my-unique-node-id
listen_address: 0.0.0.0
rest_listen_port: 1111
```
## Configuring CORS (Cross-origin resource sharing)
CORS (Cross-origin resource sharing) describes which address or origins can access the REST API from the browser.
By default, sharing resources cross-origin is not allowed.
A wildcard, single origin, or multiple origins can be specified as part of the `rest_cors_allow_origins` parameter:

```yaml
version: 0.6
index_id: hdfs
rest_cors_allow_origins: '*' # Allow all origins
# rest_cors_allow_origins: https://my-hdfs-logs.domain.com # Optionally we can specify one domain
# rest_cors_allow_origins: # Or allow multiple origins
# - https://my-hdfs-logs.domain.com
# - https://my-hdfs.other-domain.com
rest:
  listen_port: 1111
```
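
The `${QW_LISTEN_PORT:-1111}` form in the example above uses the same syntax as POSIX shell parameter expansion. The sketch below shows the shell's behavior only; `resolve_port` is a hypothetical helper, and how Quickwit treats a variable that is set but empty is not covered here:

```shell
#!/bin/sh
# Hypothetical helper: print what `${QW_LISTEN_PORT:-1111}` expands to in a POSIX shell.
resolve_port() {
  echo "${QW_LISTEN_PORT:-1111}"
}

unset QW_LISTEN_PORT
resolve_port                          # variable unset: prints 1111

(QW_LISTEN_PORT=7280; resolve_port)   # variable set in a subshell: prints 7280
```

The subshell keeps the assignment from leaking into the rest of the session.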
10 changes: 5 additions & 5 deletions docs/configuration/ports-config.md
@@ -4,18 +4,18 @@ sidebar_position: 6
---

When starting a quickwit search server, one important parameter that can be configured is
the `rest_listen_port` (defaults to :7280).
the `rest.listen_port` (defaults to :7280).

Internally, Quickwit will, in fact, use three sockets. The ports of these three sockets
cannot be configured independently at the moment.
The ports used are computed relative to the `rest_listen_port` port, as follows.
The ports used are computed relative to the `rest.listen_port` port, as follows.


| Service | Port used | Protocol | Default |
|-------------------------------|---------------------------|----------|-----------|
| Http server with the rest api | `${rest_listen_port}` | TCP | 7280 |
| Cluster membership | `${rest_listen_port}` | UDP | 7280 |
| GRPC service | `${rest_listen_port} + 1` | TCP | 7281 |
| Http server with the rest api | `${rest.listen_port}` | TCP | 7280 |
| Cluster membership | `${rest.listen_port}` | UDP | 7280 |
| GRPC service | `${rest.listen_port} + 1` | TCP | 7281 |

It is not possible for the moment to configure these ports independently.
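
The derivation in the table can be sketched in a few lines of shell. `quickwit_ports` is a hypothetical helper written for illustration; it only reproduces the defaults listed above and does not query a running node:

```shell
#!/bin/sh
# Hypothetical helper: derive the three default ports from the REST port, per the table above.
quickwit_ports() {
  rest_port="${1:-7280}"
  echo "rest (TCP): ${rest_port}"
  echo "gossip (UDP): ${rest_port}"
  echo "grpc (TCP): $((rest_port + 1))"
}

quickwit_ports 7290
```

With the tutorial searchers on REST ports 7280, 7290, and 7300, the derived gRPC ports are 7281, 7291, and 7301, so nodes sharing a host do not collide.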
