[exporter/clickhouse] Add compress option to config, enabled by default (#34365)

**Description:**

This change adds a new `compress` option to the config and sets it to
`lz4` by default.

Previously, users had to know to add `compress` to the DSN URL to get
the network performance benefits of compression. The only hint was the
sample in the README, and that sample is likely overwritten when users
paste in their own server address.

ClickHouse has excellent compression for storage and network. It is
recommended to enable it for clients such as the OTel exporter to
improve performance.

In summary:
- Added `compress` field to config
- `compress` set in `endpoint` (DSN URL) or `connection_params` takes
priority over the config field
- If not set in any source, it defaults to `lz4`
- Valid options are based on the underlying `clickhouse-go` driver:
`none` (disabled), `zstd`, `lz4` (default), `gzip`, `deflate`, `br`,
`true` (lz4).

The `true` option comes from an older version of `clickhouse-go` and is
an alias for `lz4`. To prevent unexpected changes in behavior, I have
manually re-added this check to the config parser instead of assuming
the driver will still interpret it as `lz4`.
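
For illustration, the precedence described above can be sketched as a
standalone function. This is a minimal sketch, not the exporter's code:
`resolveCompress` and its signature are hypothetical, and it assumes that
`connection_params` overwrite query parameters from `endpoint` before the
`compress` config field is considered.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveCompress is a hypothetical helper (not part of the exporter).
// It returns the compress value that would end up in the DSN, given the
// endpoint URL, the connection_params map, and the compress config field.
func resolveCompress(endpoint string, connectionParams map[string]string, compress string) (string, error) {
	dsnURL, err := url.Parse(endpoint)
	if err != nil {
		return "", err
	}
	queryParams := dsnURL.Query()

	// connection_params overwrite individual query parameters from the endpoint.
	for k, v := range connectionParams {
		queryParams.Set(k, v)
	}

	// A value from the endpoint or connection_params wins over the config field.
	if queryParams.Has("compress") {
		return queryParams.Get("compress"), nil
	}

	// An empty config value and the legacy "true" alias both resolve to lz4.
	if compress == "" || compress == "true" {
		return "lz4", nil
	}
	return compress, nil
}

func main() {
	// Endpoint says none, connection_params says br, config says lz4.
	got, err := resolveCompress(
		"tcp://127.0.0.1:9000?compress=none",
		map[string]string{"compress": "br"},
		"lz4",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // br
}
```

In this sketch a `compress=true` value arriving through the DSN or
`connection_params` is passed through unchanged. Only the config field's
`true` is mapped to `lz4`, which matches the check added to `buildDSN`.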

**Testing:**
- Updated unit tests for DSN + config parsing
- Ran integration tests locally

**Documentation:**
- Updated README config options list + sample
- Added changelog

---------

Co-authored-by: Pablo Baeyens <[email protected]>
SpencerTorres and mx-psi authored Aug 8, 2024
1 parent 2d8a631 commit 71a6797
Showing 4 changed files with 96 additions and 22 deletions.
34 changes: 34 additions & 0 deletions .chloggen/clickhouse-add-compress-option.yaml
@@ -0,0 +1,34 @@
+# Use this changelog template to create an entry for release notes.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: breaking
+
+# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
+component: clickhouseexporter
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: "Add `compress` option to ClickHouse exporter, with default value of `lz4`"
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+issues: [34365]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext: |
+  This change adds a new `compress` option to the config field and enables it by default.
+  Prior to this change, compression was not enabled by default.
+  The only way to enable compression prior to this change was via the DSN URL.
+  With this change, `lz4` compression will be enabled by default.
+  The list of valid options is provided by the underlying `clickhouse-go` driver.
+  While this change is marked as breaking, there should be no effect to existing deployments by enabling compression.
+  Compression should improve network performance on most deployments that have a remote ClickHouse server.
+
+# If your change doesn't affect end users or the exported elements of any package,
+# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
+# Optional: The change log or logs in which this entry should be included.
+# e.g. '[user]' or '[user, api]'
+# Include 'user' if the change is relevant to end users.
+# Include 'api' if there is a change to a library API.
+# Default: '[user]'
+change_logs: []
4 changes: 3 additions & 1 deletion exporter/clickhouseexporter/README.md
@@ -287,6 +287,7 @@ Connection options:
 - `database` (default = default): The database name. Overrides the database defined in `endpoint` when this setting is not equal to `default`.
 - `connection_params` (default = {}). Params is the extra connection parameters with map format. Query parameters provided in `endpoint` will be individually overwritten if present in this map.
 - `create_schema` (default = true): When set to true, will run DDL to create the database and tables. (See [schema management](#schema-management))
+- `compress` (default = lz4): Controls the compression algorithm. Valid options: `none` (disabled), `zstd`, `lz4` (default), `gzip`, `deflate`, `br`, `true` (lz4). Ignored if `compress` is set in the `endpoint` or `connection_params`.
 - `async_insert` (default = true): Enables [async inserts](https://clickhouse.com/docs/en/optimize/asynchronous-inserts). Ignored if async inserts are configured in the `endpoint` or `connection_params`. Async inserts may still be overridden server-side.

 ClickHouse tables:
@@ -356,10 +357,11 @@ processors:
     send_batch_size: 100000
 exporters:
   clickhouse:
-    endpoint: tcp://127.0.0.1:9000?dial_timeout=10s&compress=lz4
+    endpoint: tcp://127.0.0.1:9000?dial_timeout=10s
     database: otel
     async_insert: true
     ttl: 72h
+    compress: lz4
     create_schema: true
     logs_table_name: otel_logs
     traces_table_name: otel_traces
8 changes: 8 additions & 0 deletions exporter/clickhouseexporter/config.go
@@ -46,6 +46,8 @@ type Config struct {
 	ClusterName string `mapstructure:"cluster_name"`
 	// CreateSchema if set to true will run the DDL for creating the database and tables. default is true.
 	CreateSchema bool `mapstructure:"create_schema"`
+	// Compress controls the compression algorithm. Valid options: `none` (disabled), `zstd`, `lz4` (default), `gzip`, `deflate`, `br`, `true` (lz4).
+	Compress string `mapstructure:"compress"`
 	// AsyncInsert if true will enable async inserts. Default is `true`.
 	// Ignored if async inserts are configured in the `endpoint` or `connection_params`.
 	// Async inserts may still be overridden server-side.
@@ -108,6 +110,12 @@ func (cfg *Config) buildDSN() (string, error) {
 		queryParams.Set("async_insert", fmt.Sprintf("%t", cfg.AsyncInsert))
 	}

+	if !queryParams.Has("compress") && (cfg.Compress == "" || cfg.Compress == "true") {
+		queryParams.Set("compress", "lz4")
+	} else if !queryParams.Has("compress") {
+		queryParams.Set("compress", cfg.Compress)
+	}
+
 	// Use database from config if not specified in path, or if config is not default.
 	if dsnURL.Path == "" || cfg.Database != defaultDatabase {
 		dsnURL.Path = cfg.Database
72 changes: 51 additions & 21 deletions exporter/clickhouseexporter/config_test.go
@@ -108,6 +108,7 @@ func TestConfig_buildDSN(t *testing.T) {
 		Username         string
 		Password         string
 		Database         string
+		Compress         string
 		ConnectionParams map[string]string
 		AsyncInsert      *bool
 	}
@@ -127,6 +128,9 @@ func TestConfig_buildDSN(t *testing.T) {
 		if fields.ConnectionParams != nil {
 			cfg.ConnectionParams = fields.ConnectionParams
 		}
+		if fields.Compress != "" {
+			cfg.Compress = fields.Compress
+		}
 		if fields.AsyncInsert != nil {
 			cfg.AsyncInsert = *fields.AsyncInsert
 		}
@@ -155,7 +159,7 @@ func TestConfig_buildDSN(t *testing.T) {
 			wantChOptions: ChOptions{
 				Secure: false,
 			},
-			want: "clickhouse://127.0.0.1:9000/default?async_insert=true",
+			want: "clickhouse://127.0.0.1:9000/default?async_insert=true&compress=lz4",
 		},
 		{
 			name: "Support tcp scheme",
@@ -165,7 +169,7 @@ func TestConfig_buildDSN(t *testing.T) {
 			wantChOptions: ChOptions{
 				Secure: false,
 			},
-			want: "tcp://127.0.0.1:9000/default?async_insert=true",
+			want: "tcp://127.0.0.1:9000/default?async_insert=true&compress=lz4",
 		},
 		{
 			name: "prefers database name from config over from DSN",
@@ -178,7 +182,7 @@ func TestConfig_buildDSN(t *testing.T) {
 			wantChOptions: ChOptions{
 				Secure: false,
 			},
-			want: "clickhouse://foo:[email protected]:9000/otel?async_insert=true",
+			want: "clickhouse://foo:[email protected]:9000/otel?async_insert=true&compress=lz4",
 		},
 		{
 			name: "use database name from DSN if not set in config",
@@ -190,7 +194,7 @@ func TestConfig_buildDSN(t *testing.T) {
 			wantChOptions: ChOptions{
 				Secure: false,
 			},
-			want: "clickhouse://foo:[email protected]:9000/otel?async_insert=true",
+			want: "clickhouse://foo:[email protected]:9000/otel?async_insert=true&compress=lz4",
 		},
 		{
 			name: "invalid config",
@@ -210,29 +214,29 @@ func TestConfig_buildDSN(t *testing.T) {
 			wantChOptions: ChOptions{
 				Secure: true,
 			},
-			want: "https://127.0.0.1:9000/default?async_insert=true&secure=true",
+			want: "https://127.0.0.1:9000/default?async_insert=true&compress=lz4&secure=true",
 		},
 		{
 			name: "Preserve query parameters",
 			fields: fields{
-				Endpoint: "clickhouse://127.0.0.1:9000?secure=true&foo=bar",
+				Endpoint: "clickhouse://127.0.0.1:9000?secure=true&compress=lz4&foo=bar",
 			},
 			wantChOptions: ChOptions{
 				Secure: true,
 			},
-			want: "clickhouse://127.0.0.1:9000/default?async_insert=true&foo=bar&secure=true",
+			want: "clickhouse://127.0.0.1:9000/default?async_insert=true&compress=lz4&foo=bar&secure=true",
 		},
 		{
 			name: "Parse clickhouse settings",
 			fields: fields{
-				Endpoint: "https://127.0.0.1:9000?secure=true&dial_timeout=30s&compress=lz4",
+				Endpoint: "https://127.0.0.1:9000?secure=true&dial_timeout=30s&compress=br",
 			},
 			wantChOptions: ChOptions{
 				Secure:      true,
 				DialTimeout: 30 * time.Second,
-				Compress:    clickhouse.CompressionLZ4,
+				Compress:    clickhouse.CompressionBrotli,
 			},
-			want: "https://127.0.0.1:9000/default?async_insert=true&compress=lz4&dial_timeout=30s&secure=true",
+			want: "https://127.0.0.1:9000/default?async_insert=true&compress=br&dial_timeout=30s&secure=true",
 		},
 		{
 			name: "Should respect connection parameters",
@@ -243,29 +247,29 @@ func TestConfig_buildDSN(t *testing.T) {
 			wantChOptions: ChOptions{
 				Secure: true,
 			},
-			want: "clickhouse://127.0.0.1:9000/default?async_insert=true&foo=bar&secure=true",
+			want: "clickhouse://127.0.0.1:9000/default?async_insert=true&compress=lz4&foo=bar&secure=true",
 		},
 		{
 			name: "support replace database in DSN with config to override database",
 			fields: fields{
 				Endpoint: "tcp://127.0.0.1:9000/otel",
 				Database: "override",
 			},
-			want: "tcp://127.0.0.1:9000/override?async_insert=true",
+			want: "tcp://127.0.0.1:9000/override?async_insert=true&compress=lz4",
 		},
 		{
 			name: "when config option is missing, preserve async_insert false in DSN",
 			fields: fields{
 				Endpoint: "tcp://127.0.0.1:9000?async_insert=false",
 			},
-			want: "tcp://127.0.0.1:9000/default?async_insert=false",
+			want: "tcp://127.0.0.1:9000/default?async_insert=false&compress=lz4",
 		},
 		{
 			name: "when config option is missing, preserve async_insert true in DSN",
 			fields: fields{
 				Endpoint: "tcp://127.0.0.1:9000?async_insert=true",
 			},
-			want: "tcp://127.0.0.1:9000/default?async_insert=true",
+			want: "tcp://127.0.0.1:9000/default?async_insert=true&compress=lz4",
 		},
 		{
 			name: "ignore config option when async_insert is present in connection params as false",
@@ -275,7 +279,7 @@ func TestConfig_buildDSN(t *testing.T) {
 				AsyncInsert: &configTrue,
 			},

-			want: "tcp://127.0.0.1:9000/default?async_insert=false",
+			want: "tcp://127.0.0.1:9000/default?async_insert=false&compress=lz4",
 		},
 		{
 			name: "ignore config option when async_insert is present in connection params as true",
@@ -285,7 +289,7 @@ func TestConfig_buildDSN(t *testing.T) {
 				AsyncInsert: &configFalse,
 			},

-			want: "tcp://127.0.0.1:9000/default?async_insert=true",
+			want: "tcp://127.0.0.1:9000/default?async_insert=true&compress=lz4",
 		},
 		{
 			name: "ignore config option when async_insert is present in DSN as false",
@@ -294,7 +298,7 @@ func TestConfig_buildDSN(t *testing.T) {
 				AsyncInsert: &configTrue,
 			},

-			want: "tcp://127.0.0.1:9000/default?async_insert=false",
+			want: "tcp://127.0.0.1:9000/default?async_insert=false&compress=lz4",
 		},
 		{
 			name: "use async_insert true config option when it is not present in DSN",
@@ -303,7 +307,7 @@ func TestConfig_buildDSN(t *testing.T) {
 				AsyncInsert: &configTrue,
 			},

-			want: "tcp://127.0.0.1:9000/default?async_insert=true",
+			want: "tcp://127.0.0.1:9000/default?async_insert=true&compress=lz4",
 		},
 		{
 			name: "use async_insert false config option when it is not present in DSN",
@@ -312,15 +316,15 @@ func TestConfig_buildDSN(t *testing.T) {
 				AsyncInsert: &configFalse,
 			},

-			want: "tcp://127.0.0.1:9000/default?async_insert=false",
+			want: "tcp://127.0.0.1:9000/default?async_insert=false&compress=lz4",
 		},
 		{
 			name: "set async_insert to true when not present in config or DSN",
 			fields: fields{
 				Endpoint: "tcp://127.0.0.1:9000",
 			},

-			want: "tcp://127.0.0.1:9000/default?async_insert=true",
+			want: "tcp://127.0.0.1:9000/default?async_insert=true&compress=lz4",
 		},
 		{
 			name: "connection_params takes priority over endpoint and async_insert option.",
@@ -330,7 +334,33 @@ func TestConfig_buildDSN(t *testing.T) {
 				AsyncInsert: &configFalse,
 			},

-			want: "tcp://127.0.0.1:9000/default?async_insert=true",
+			want: "tcp://127.0.0.1:9000/default?async_insert=true&compress=lz4",
 		},
+		{
+			name: "use compress br config option when it is not present in DSN",
+			fields: fields{
+				Endpoint: "tcp://127.0.0.1:9000",
+				Compress: "br",
+			},
+
+			want: "tcp://127.0.0.1:9000/default?async_insert=true&compress=br",
+		},
+		{
+			name: "set compress to lz4 when not present in config or DSN",
+			fields: fields{
+				Endpoint: "tcp://127.0.0.1:9000",
+			},
+
+			want: "tcp://127.0.0.1:9000/default?async_insert=true&compress=lz4",
+		},
+		{
+			name: "connection_params takes priority over endpoint and compress option.",
+			fields: fields{
+				Endpoint:         "tcp://127.0.0.1:9000?compress=none",
+				ConnectionParams: map[string]string{"compress": "br"},
+				Compress:         "lz4",
+			},
+			want: "tcp://127.0.0.1:9000/default?async_insert=true&compress=br",
+		},
 	}
 	for _, tt := range tests {
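
The expected DSN strings in the updated tests list query parameters
alphabetically, regardless of their order in the endpoint. That is
consistent with Go's `url.Values.Encode`, which sorts by key. The sketch
below assumes `buildDSN` re-encodes the query that way (the hunks above do
not show that line). `buildQuery` is a hypothetical helper, not exporter
code.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildQuery is a hypothetical helper: it parses a raw query string,
// fills in the async_insert and compress defaults when they are absent,
// and re-encodes the result.
func buildQuery(rawQuery string) (string, error) {
	queryParams, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	if !queryParams.Has("async_insert") {
		queryParams.Set("async_insert", "true")
	}
	if !queryParams.Has("compress") {
		queryParams.Set("compress", "lz4")
	}
	// Encode sorts by key, so output order does not depend on input order.
	return queryParams.Encode(), nil
}

func main() {
	q, err := buildQuery("secure=true&foo=bar")
	if err != nil {
		panic(err)
	}
	fmt.Println(q) // async_insert=true&compress=lz4&foo=bar&secure=true
}
```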
