VMware Aria Operations for Applications (formerly known as Tanzu Observability by Wavefront) is a high-performance streaming analytics platform for monitoring and optimizing your environment and applications.
The Kubernetes Metrics Collector is an agent that runs as a DaemonSet on each node within a Kubernetes cluster. It collects metrics and events about the cluster and sends them to the Operations for Applications service.
- Collects real-time data from all layers of a Kubernetes environment
- Multiple sources of metrics providing comprehensive insight:
  - Kubernetes (kubelet) source: For core Kubernetes metrics
  - Prometheus source: For scraping Prometheus metric endpoints (API server, etcd, NGINX, etc.)
  - Kubernetes state source: For resource state metrics
  - Telegraf source: For host and application level metrics
  - Systemd source: For host level systemd metrics
- Auto discovery of pods and services based on annotation and configuration
- DaemonSet mode for high scalability, with leader election for monitoring cluster level resources
- Rich filtering support
- Auto reload of configuration changes
- Internal metrics for tracking the collector health and configuration
Refer to the installation instructions.
The installation instructions use a default configuration suitable for most use cases. Refer to the documentation for details on all the configuration options.
Build using `make` and the provided `Makefile`.
Commonly used `make` options include:
- `fmt` to `go fmt` all your code
- `tests` to run all the unit tests
- `build` that creates a local executable
- `container` that uses a docker container to build for consistency and reproducibility
Formerly, the Wavefront proxy logs showed the following error when a metric had too many tags: `Too many point tags`.
However, logic has been added to the Collector to automatically drop tags in priority order, so that metrics make it through to the proxy and no longer cause this error.
This is the order of the logic used to drop tags:
- Tags are empty or are interpreted to be empty (`"tag.key": ""`, `"tag.key": "-"`, or `"tag.key": "/"`).
- Explicitly excluded tags (from `tagExclude` config). Refer here for an example scenario.
- Tags are explicitly excluded (`"namespace_id": "..."`, `"host_id": "..."`, `"pod_id": "..."`, or `"hostname": "..."`).
- Tag values are duplicated, and the shorter key is kept (`"tag.key": "same value"` is kept instead of `"tag.super.long.key": "same value"`).
- Extra tags are removed:
  - Tag key matches `alpha.*` or `beta.*`, after keys have been sorted (e.g. `"alpha.eksctl.io/nodegroup-name": "arm-group"` or `"beta.kubernetes.io/arch": "amd64"`).
  - Tag key matches IaaS-specific tags, after keys have been sorted (`"kubernetes.azure.com/agentpool": "agentpool"`, `"topology.gke.io/zone": "us-central1-c"`, `"eksctl.io/nodegroup-name": "arm-group"`, etc.).
  - Tag key matches `label.*`, after keys have been sorted.
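The drop order above can be illustrated with a minimal Python sketch. It is not the Collector's actual implementation (the Collector is built with Go tooling), and several details are assumptions made for illustration: the function name `drop_tags`, the `max_tags` parameter, the `IAAS_MARKERS` list (taken only from the examples above), the alphabetical tie-break between duplicate keys of equal length, and the choice to stop as soon as the tag count is within the limit.

```python
# Illustrative sketch only; names and stopping behavior are assumptions,
# not the Collector's real code.

EMPTY_VALUES = {"", "-", "/"}
ALWAYS_EXCLUDED = {"namespace_id", "host_id", "pod_id", "hostname"}
# Hypothetical stand-in for the IaaS-specific key patterns, built from the
# examples in the list above.
IAAS_MARKERS = ("kubernetes.azure.com", "topology.gke.io", "eksctl.io")


def drop_tags(tags, max_tags, tag_exclude=()):
    """Return a copy of `tags`, dropping tags in priority order while over `max_tags`."""
    tags = dict(tags)

    def over_limit():
        return len(tags) > max_tags

    def remove_all(should_drop):
        # Decide on every key first, then delete, so checks see a stable dict.
        for key in [k for k in tags if should_drop(k)]:
            del tags[key]

    def remove_sorted(should_drop):
        # Walk keys in sorted order and stop once the count is within the limit.
        for key in sorted(tags):
            if not over_limit():
                return
            if should_drop(key):
                del tags[key]

    def is_duplicate(key):
        # Among keys sharing a value, keep the shortest (ties: alphabetical).
        same_value = (k for k in tags if tags[k] == tags[key])
        keeper = min(same_value, key=lambda k: (len(k), k))
        return key != keeper

    whole_steps = [
        lambda k: tags[k] in EMPTY_VALUES,   # empty or empty-like values
        lambda k: k in tag_exclude,          # tagExclude configuration
        lambda k: k in ALWAYS_EXCLUDED,      # namespace_id, host_id, ...
        is_duplicate,                        # duplicated values, longer key
    ]
    for step in whole_steps:
        if not over_limit():
            return tags
        remove_all(step)

    sorted_steps = [
        lambda k: k.startswith(("alpha.", "beta.")),
        lambda k: any(marker in k for marker in IAAS_MARKERS),
        lambda k: k.startswith("label."),
    ]
    for step in sorted_steps:
        remove_sorted(step)
    return tags
```

For example, calling `drop_tags` with seven tags and a hypothetical limit of 3 removes the empty-like tag, then `pod_id`, then the longer duplicate key, and finally the `beta.*` key, leaving the remaining tags untouched because the limit has been reached.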
Public contributions are always welcome. Please feel free to report issues or submit pull requests.