Skip to content

Commit

Permalink
(otelarrowreceiver): add admission control to otlp path (#35021)
Browse files Browse the repository at this point in the history
  • Loading branch information
moh-osman3 authored Sep 26, 2024
1 parent a4cb201 commit c5aa92a
Show file tree
Hide file tree
Showing 16 changed files with 490 additions and 49 deletions.
31 changes: 31 additions & 0 deletions .chloggen/admission-control-for-otlp.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: otelarrowreceiver

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: |
Add admission control in the otelarrow receiver's standard otlp data path.
Also moves admission control config options to a separate block.
arrow.admission_limit_mib -> admission.request_limit_mib
arrow.waiter_limit -> admission.waiter_limit
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [35021]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
7 changes: 7 additions & 0 deletions internal/otelarrow/go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -44,13 +44,19 @@ require (
github.com/fxamacker/cbor/v2 v2.4.0 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-viper/mapstructure/v2 v2.1.0 // indirect
github.com/goccy/go-json v0.10.3 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/snappy v0.0.5-0.20220116011046-fa5810519dcb // indirect
github.com/google/flatbuffers v24.3.25+incompatible // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/klauspost/cpuid/v2 v2.2.8 // indirect
github.com/knadh/koanf/maps v0.1.1 // indirect
github.com/knadh/koanf/providers/confmap v0.1.0 // indirect
github.com/knadh/koanf/v2 v2.1.1 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
github.com/mitchellh/copystructure v1.2.0 // indirect
github.com/mitchellh/reflectwalk v1.0.2 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/mostynb/go-grpc-compression v1.2.3 // indirect
Expand All @@ -69,6 +75,7 @@ require (
go.opentelemetry.io/collector/config/configretry v1.16.0 // indirect
go.opentelemetry.io/collector/config/configtls v1.16.0 // indirect
go.opentelemetry.io/collector/config/internal v0.110.0 // indirect
go.opentelemetry.io/collector/confmap v1.16.0 // indirect
go.opentelemetry.io/collector/consumer/consumerprofiles v0.110.0 // indirect
go.opentelemetry.io/collector/extension v0.110.0 // indirect
go.opentelemetry.io/collector/extension/auth v0.110.0 // indirect
Expand Down
17 changes: 10 additions & 7 deletions receiver/otelarrowreceiver/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,16 @@ Several common configuration structures provide additional capabilities automati
- [gRPC settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configgrpc/README.md)
- [TLS and mTLS settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configtls/README.md)

### Admission Control Configuration

In the `admission` configuration block the following settings are available:

- `request_limit_mib` (default: 128): limits the number of requests that are received by the stream in terms of *uncompressed request size*. This should not be confused with `arrow.memory_limit_mib` which limits allocations made by the consumer when translating arrow records into pdata objects. i.e. request size is used to control how much traffic we admit, but does not control how much memory is used during request processing.

- `waiter_limit` (default: 1000): limits the number of requests waiting on admission once `admission_limit_mib` is reached. This is another dimension of memory limiting that ensures waiters are not holding onto a significant amount of memory while waiting to be processed.

`request_limit_mib` and `waiter_limit` are arguments supplied to [admission.BoundedQueue](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/internal/otelarrow/admission). This custom semaphore is meant to be used within receivers to help limit memory within the collector pipeline.

### Arrow-specific Configuration

In the `arrow` configuration block, the following settings are available:
Expand All @@ -87,13 +97,6 @@ When the limit is reached, the receiver will return RESOURCE_EXHAUSTED
error codes to the receiver, which are [conditionally retryable, see
exporter retry configuration](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md).

- `admission_limit_mib` (default: 64): limits the number of requests that are received by the stream based on request size information available. This should not be confused with `memory_limit_mib` which limits allocations made by the consumer when translating arrow records into pdata objects. i.e. request size is used to control how much traffic we admit, but does not control how much memory is used during request processing.

- `waiter_limit` (default: 1000): limits the number of requests waiting on admission once `admission_limit_mib` is reached. This is another dimension of memory limiting that ensures waiters are not holding onto a significant amount of memory while waiting to be processed.

`admission_limit_mib` and `waiter_limit` are arguments supplied to [admission.BoundedQueue](https://github.com/open-telemetry/otel-arrow/tree/main/collector/admission). This custom semaphore is meant to be used within receivers to help limit memory within the collector pipeline.


### Compression Configuration

In the `arrow` configuration block, `zstd` sub-section applies to all
Expand Down
51 changes: 43 additions & 8 deletions receiver/otelarrowreceiver/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ import (

"go.opentelemetry.io/collector/component"
"go.opentelemetry.io/collector/config/configgrpc"
"go.opentelemetry.io/collector/confmap"

"github.com/open-telemetry/opentelemetry-collector-contrib/internal/otelarrow/compression/zstd"
)
Expand All @@ -18,22 +19,30 @@ type Protocols struct {
Arrow ArrowConfig `mapstructure:"arrow"`
}

type AdmissionConfig struct {
// RequestLimitMiB limits the number of requests that are received by the stream based on
// uncompressed request size. Request size is used to control how much traffic we admit
// for processing.
RequestLimitMiB uint64 `mapstructure:"request_limit_mib"`

// WaiterLimit is the limit on the number of waiters waiting to be processed and consumed.
// This is a dimension of memory limiting to ensure waiters are not consuming an
// unexpectedly large amount of memory in the arrow receiver.
WaiterLimit int64 `mapstructure:"waiter_limit"`
}

// ArrowConfig support configuring the Arrow receiver.
type ArrowConfig struct {
// MemoryLimitMiB is the size of a shared memory region used
// by all Arrow streams, in MiB. When too much load is
// passing through, they will see ResourceExhausted errors.
MemoryLimitMiB uint64 `mapstructure:"memory_limit_mib"`

// AdmissionLimitMiB limits the number of requests that are received by the stream based on
// request size information available. Request size is used to control how much traffic we admit
// for processing, but does not control how much memory is used during request processing.
AdmissionLimitMiB uint64 `mapstructure:"admission_limit_mib"`
// Deprecated: This field is no longer supported, use cfg.Admission.RequestLimitMiB instead.
DeprecatedAdmissionLimitMiB uint64 `mapstructure:"admission_limit_mib"`

// WaiterLimit is the limit on the number of waiters waiting to be processed and consumed.
// This is a dimension of memory limiting to ensure waiters are not consuming an
// unexpectedly large amount of memory in the arrow receiver.
WaiterLimit int64 `mapstructure:"waiter_limit"`
// Deprecated: This field is no longer supported, use cfg.Admission.WaiterLimit instead.
DeprecatedWaiterLimit int64 `mapstructure:"waiter_limit"`

// Zstd settings apply to OTel-Arrow use of gRPC specifically.
Zstd zstd.DecoderConfig `mapstructure:"zstd"`
Expand All @@ -43,6 +52,8 @@ type ArrowConfig struct {
type Config struct {
// Protocols is the configuration for gRPC and Arrow.
Protocols `mapstructure:"protocols"`
// Admission is the configuration for controlling amount of request memory entering the receiver.
Admission AdmissionConfig `mapstructure:"admission"`
}

var _ component.Config = (*Config)(nil)
Expand All @@ -54,3 +65,27 @@ func (cfg *ArrowConfig) Validate() error {
}
return nil
}

func (cfg *Config) Validate() error {
if err := cfg.GRPC.Validate(); err != nil {
return err
}
if err := cfg.Arrow.Validate(); err != nil {
return err
}
return nil
}

// Unmarshal will apply deprecated field values to assist the user with migration.
func (cfg *Config) Unmarshal(conf *confmap.Conf) error {
if err := conf.Unmarshal(cfg); err != nil {
return err
}
if cfg.Admission.RequestLimitMiB == 0 && cfg.Arrow.DeprecatedAdmissionLimitMiB != 0 {
cfg.Admission.RequestLimitMiB = cfg.Arrow.DeprecatedAdmissionLimitMiB
}
if cfg.Admission.WaiterLimit == 0 && cfg.Arrow.DeprecatedWaiterLimit != 0 {
cfg.Admission.WaiterLimit = cfg.Arrow.DeprecatedWaiterLimit
}
return nil
}
40 changes: 34 additions & 6 deletions receiver/otelarrowreceiver/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -77,15 +77,41 @@ func TestUnmarshalConfig(t *testing.T) {
},
},
Arrow: ArrowConfig{
MemoryLimitMiB: 123,
AdmissionLimitMiB: 80,
WaiterLimit: 100,
MemoryLimitMiB: 123,
},
},
Admission: AdmissionConfig{
RequestLimitMiB: 80,
WaiterLimit: 100,
},
}, cfg)

}

// Tests that a deprecated config validation sets RequestLimitMiB and WaiterLimit in the correct config block.
func TestValidateDeprecatedConfig(t *testing.T) {
cm, err := confmaptest.LoadConf(filepath.Join("testdata", "deprecated.yaml"))
require.NoError(t, err)
cfg := &Config{}
assert.NoError(t, cm.Unmarshal(cfg))
assert.NoError(t, cfg.Validate())
assert.Equal(t,
&Config{
Protocols: Protocols{
Arrow: ArrowConfig{
MemoryLimitMiB: 123,
DeprecatedAdmissionLimitMiB: 80,
DeprecatedWaiterLimit: 100,
},
},
Admission: AdmissionConfig{
// cfg.Validate should now set these fields.
RequestLimitMiB: 80,
WaiterLimit: 100,
},
}, cfg)
}

func TestUnmarshalConfigUnix(t *testing.T) {
cm, err := confmaptest.LoadConf(filepath.Join("testdata", "uds.yaml"))
require.NoError(t, err)
Expand All @@ -103,11 +129,13 @@ func TestUnmarshalConfigUnix(t *testing.T) {
ReadBufferSize: 512 * 1024,
},
Arrow: ArrowConfig{
MemoryLimitMiB: defaultMemoryLimitMiB,
AdmissionLimitMiB: defaultAdmissionLimitMiB,
WaiterLimit: defaultWaiterLimit,
MemoryLimitMiB: defaultMemoryLimitMiB,
},
},
Admission: AdmissionConfig{
RequestLimitMiB: defaultRequestLimitMiB,
WaiterLimit: defaultWaiterLimit,
},
}, cfg)
}

Expand Down
14 changes: 8 additions & 6 deletions receiver/otelarrowreceiver/factory.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,9 +19,9 @@ import (
const (
defaultGRPCEndpoint = "0.0.0.0:4317"

defaultMemoryLimitMiB = 128
defaultAdmissionLimitMiB = defaultMemoryLimitMiB / 2
defaultWaiterLimit = 1000
defaultMemoryLimitMiB = 128
defaultRequestLimitMiB = 128
defaultWaiterLimit = 1000
)

// NewFactory creates a new OTel-Arrow receiver factory.
Expand All @@ -47,11 +47,13 @@ func createDefaultConfig() component.Config {
ReadBufferSize: 512 * 1024,
},
Arrow: ArrowConfig{
MemoryLimitMiB: defaultMemoryLimitMiB,
AdmissionLimitMiB: defaultAdmissionLimitMiB,
WaiterLimit: defaultWaiterLimit,
MemoryLimitMiB: defaultMemoryLimitMiB,
},
},
Admission: AdmissionConfig{
RequestLimitMiB: defaultRequestLimitMiB,
WaiterLimit: defaultWaiterLimit,
},
}
}

Expand Down
26 changes: 24 additions & 2 deletions receiver/otelarrowreceiver/internal/logs/otlp.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,12 @@ import (
"context"

"go.opentelemetry.io/collector/consumer"
"go.opentelemetry.io/collector/pdata/plog"
"go.opentelemetry.io/collector/pdata/plog/plogotlp"
"go.opentelemetry.io/collector/receiver/receiverhelper"
"go.uber.org/zap"

"github.com/open-telemetry/opentelemetry-collector-contrib/internal/otelarrow/admission"
)

const dataFormatProtobuf = "protobuf"
Expand All @@ -18,13 +22,19 @@ type Receiver struct {
plogotlp.UnimplementedGRPCServer
nextConsumer consumer.Logs
obsrecv *receiverhelper.ObsReport
boundedQueue *admission.BoundedQueue
sizer *plog.ProtoMarshaler
logger *zap.Logger
}

// New creates a new Receiver reference.
func New(nextConsumer consumer.Logs, obsrecv *receiverhelper.ObsReport) *Receiver {
func New(logger *zap.Logger, nextConsumer consumer.Logs, obsrecv *receiverhelper.ObsReport, bq *admission.BoundedQueue) *Receiver {
return &Receiver{
nextConsumer: nextConsumer,
obsrecv: obsrecv,
boundedQueue: bq,
sizer: &plog.ProtoMarshaler{},
logger: logger,
}
}

Expand All @@ -37,7 +47,19 @@ func (r *Receiver) Export(ctx context.Context, req plogotlp.ExportRequest) (plog
}

ctx = r.obsrecv.StartLogsOp(ctx)
err := r.nextConsumer.ConsumeLogs(ctx, ld)

sizeBytes := int64(r.sizer.LogsSize(req.Logs()))
err := r.boundedQueue.Acquire(ctx, sizeBytes)
if err != nil {
return plogotlp.NewExportResponse(), err
}
defer func() {
if releaseErr := r.boundedQueue.Release(sizeBytes); releaseErr != nil {
r.logger.Error("Error releasing bytes from semaphore", zap.Error(releaseErr))
}
}()

err = r.nextConsumer.ConsumeLogs(ctx, ld)
r.obsrecv.EndLogsOp(ctx, dataFormatProtobuf, numSpans, err)

return plogotlp.NewExportResponse(), err
Expand Down
Loading

0 comments on commit c5aa92a

Please sign in to comment.