Merge branch 'main' into hotfix/solve-grafana-hpa-selector
gritzkoo authored Apr 9, 2024
2 parents 577d5d7 + 1e803f8 commit 6d94096
Showing 31 changed files with 751 additions and 11 deletions.
13 changes: 9 additions & 4 deletions .github/workflows/release.yaml
@@ -47,14 +47,19 @@ jobs:
           CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
           CR_SKIP_EXISTING: "true"
 
+      - name: Login to GHCR
+        uses: docker/[email protected]
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
       - name: Push charts to GHCR
         run: |
           shopt -s nullglob
-          for pkg in .cr-release-packages/*; do
+          for pkg in .cr-release-packages/*.tgz; do
             if [ -z "${pkg:-}" ]; then
               break
             fi
-            if ! helm push "${pkg}" "oci://ghcr.io/${GITHUB_REPOSITORY_OWNER}/charts"; then
-              echo '::warning:: helm push failed!'
-            fi
+            helm push "${pkg}" "oci://ghcr.io/${GITHUB_REPOSITORY_OWNER}/helm-charts"
           done
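
Note on the push loop in this hunk: with `shopt -s nullglob`, a `*.tgz` pattern that matches nothing expands to an empty list, so the loop body never runs when no chart packages were produced. The following is a minimal dry-run sketch of that behavior, not the workflow itself. The `list_pushes` helper, the temporary directories, and the `example-owner` name are assumptions made for illustration, and `echo` stands in for the real `helm push` call.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical dry-run helper: prints the push command for each packaged
# chart in a directory instead of calling helm.
list_pushes() {
  local dir="$1" owner="$2"
  # Make an unmatched glob expand to nothing rather than to the literal pattern.
  shopt -s nullglob
  local pkg
  for pkg in "${dir}"/*.tgz; do
    echo "helm push ${pkg} oci://ghcr.io/${owner}/helm-charts"
  done
}

# A directory holding one chart package and one unrelated file.
workdir="$(mktemp -d)"
touch "${workdir}/grafana-sampling-0.1.0.tgz" "${workdir}/notes.txt"
pushes="$(list_pushes "${workdir}" "example-owner")"
echo "${pushes}"

# An empty directory: the loop body is skipped entirely.
empty_dir="$(mktemp -d)"
empty_pushes="$(list_pushes "${empty_dir}" "example-owner")"
echo "pushes from empty directory: '${empty_pushes}'"
```

Only the `.tgz` file yields a push line; `notes.txt` is skipped because the narrowed pattern no longer matches it, which is the effect of changing `*` to `*.tgz` in this hunk.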
2 changes: 1 addition & 1 deletion charts/agent-operator/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
 name: grafana-agent-operator
 description: A Helm chart for Grafana Agent Operator
 type: application
-version: 0.3.19
+version: 0.3.20
 appVersion: "0.40.3"
 home: https://grafana.com/docs/agent/v0.40/
 icon: https://raw.githubusercontent.com/grafana/agent/v0.40.3/docs/sources/assets/logo_and_name.png
5 changes: 4 additions & 1 deletion charts/agent-operator/README.md
@@ -1,6 +1,6 @@
 # grafana-agent-operator
 
-![Version: 0.3.19](https://img.shields.io/badge/Version-0.3.19-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 0.40.3](https://img.shields.io/badge/AppVersion-0.40.3-informational?style=flat-square)
+![Version: 0.3.20](https://img.shields.io/badge/Version-0.3.20-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 0.40.3](https://img.shields.io/badge/AppVersion-0.40.3-informational?style=flat-square)
 
 A Helm chart for Grafana Agent Operator
 
@@ -75,4 +75,7 @@ A major chart version change (like v1.2.3 -> v2.0.0) indicates that there is an
 | resources | object | `{}` | Resource limits and requests config |
 | serviceAccount.create | bool | `true` | Toggle to create ServiceAccount |
 | serviceAccount.name | string | `nil` | Service account name |
+| test.image.registry | string | `"docker.io"` | Test image registry |
+| test.image.repository | string | `"library/busybox"` | Test image repo |
+| test.image.tag | string | `"latest"` | Test image tag |
 | tolerations | list | `[]` | Tolerations applied to Pods |
4 changes: 2 additions & 2 deletions charts/agent-operator/templates/tests/test-grafanaagent.yaml
@@ -107,12 +107,12 @@ metadata:
 spec:
   containers:
     - name: busybox
-      image: busybox
+      image: "{{ .Values.test.image.registry }}/{{ .Values.test.image.repository }}:{{ .Values.test.image.tag }}"
       command: ['wget']
       args: ['grafana-agent-test-operated:8080/-/healthy']
   # Wait for GrafanaAgent CR
   initContainers:
     - name: sleep
-      image: busybox
+      image: "{{ .Values.test.image.registry }}/{{ .Values.test.image.repository }}:{{ .Values.test.image.tag }}"
       command: ['sleep', '60']
   restartPolicy: Never
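
Note on the templated image reference in this hunk: the string is assembled as `registry/repository:tag` from the three `test.image.*` values. A minimal sketch of that composition follows, assuming the chart defaults listed in the README table (`docker.io`, `library/busybox`, `latest`). The `image_ref` helper and the `registry.example.com` mirror are hypothetical and serve only to show what an override would render to.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper mirroring the template's "registry/repository:tag" layout.
image_ref() {
  local registry="$1" repository="$2" tag="$3"
  printf '%s/%s:%s\n' "${registry}" "${repository}" "${tag}"
}

# Chart defaults for test.image.* as documented in this commit.
default_image="$(image_ref "docker.io" "library/busybox" "latest")"

# An assumed private mirror with a pinned tag, for illustration only.
mirror_image="$(image_ref "registry.example.com" "mirror/busybox" "1.36")"

echo "${default_image}"
echo "${mirror_image}"
```

With the defaults, the rendered reference is the fully qualified form of the `busybox` image that the test pod used before this change, so default behavior should stay the same while clusters that cannot pull from Docker Hub gain an override point.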
9 changes: 9 additions & 0 deletions charts/agent-operator/values.yaml
@@ -43,6 +43,15 @@ image:
   # -- Image pull secrets
   pullSecrets: []
 
+test:
+  image:
+    # -- Test image registry
+    registry: docker.io
+    # -- Test image repo
+    repository: library/busybox
+    # -- Test image tag
+    tag: latest
+
 # -- hostAliases to add
 hostAliases: []
 # - ip: 1.2.3.4
23 changes: 23 additions & 0 deletions charts/grafana-sampling/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
9 changes: 9 additions & 0 deletions charts/grafana-sampling/Chart.lock
@@ -0,0 +1,9 @@
dependencies:
- name: grafana-agent
  repository: https://grafana.github.io/helm-charts
  version: 0.36.0
- name: grafana-agent
  repository: https://grafana.github.io/helm-charts
  version: 0.36.0
digest: sha256:6d04a55dce2c09c4c250c6453e0d58f7280750bf04fce51027b4e235062413e5
generated: "2024-03-11T15:41:30.921516-07:00"
18 changes: 18 additions & 0 deletions charts/grafana-sampling/Chart.yaml
@@ -0,0 +1,18 @@
apiVersion: v2
name: grafana-sampling
description: A Helm chart for a layered OTLP tail sampling and metrics generation pipeline.
type: application
version: 0.1.0
appVersion: "v0.40.2"
sources:
  - https://github.com/grafana/agent
  - https://grafana.com/docs/grafana-cloud/monitor-applications/application-observability/setup/sampling/tail/
dependencies:
  - name: grafana-agent
    version: 0.36.0
    repository: https://grafana.github.io/helm-charts
    alias: grafana-agent-deployment
  - name: grafana-agent
    version: 0.36.0
    repository: https://grafana.github.io/helm-charts
    alias: grafana-agent-statefulset
124 changes: 124 additions & 0 deletions charts/grafana-sampling/README.md
@@ -0,0 +1,124 @@
# grafana-sampling

![Version: 0.1.0](https://img.shields.io/badge/Version-0.1.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v0.40.2](https://img.shields.io/badge/AppVersion-v0.40.2-informational?style=flat-square)

A Helm chart for a layered OTLP tail sampling and metrics generation pipeline.

This chart deploys the following architecture to your environment:
![Photo of sampling architecture](./sampling-architecture.png)

Note: by default, only OTLP traces are accepted at the load balancing layer.

## Chart Repo

Add the following repo to use the chart:

```console
helm repo add grafana https://grafana.github.io/helm-charts
```
## Installing the Chart

Use the following command to install the chart with the release name `my-release`. Make sure to populate the required values.

```console
helm install my-release grafana/grafana-sampling --values - <<EOF | less
grafana-agent-statefulset:
  agent:
    extraEnv:
      - name: GRAFANA_CLOUD_API_KEY
        value: <REQUIRED>
      - name: GRAFANA_CLOUD_PROMETHEUS_URL
        value: <REQUIRED>
      - name: GRAFANA_CLOUD_PROMETHEUS_USERNAME
        value: <REQUIRED>
      - name: GRAFANA_CLOUD_TEMPO_ENDPOINT
        value: <REQUIRED>
      - name: GRAFANA_CLOUD_TEMPO_USERNAME
        value: <REQUIRED>
      # This is required for adaptive metric deduplication in Grafana Cloud
      - name: POD_UID
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.uid
EOF
```

## Uninstalling the Chart

To uninstall/delete the my-release deployment:

```console
helm delete my-release
```

The command removes all the Kubernetes components associated with the chart and deletes the release.

## Upgrading

A major chart version change indicates that there is an incompatible breaking change needing manual actions.

## Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| grafana-agent-deployment.agent.configMap.create | bool | `false` | |
| grafana-agent-deployment.agent.extraPorts[0].name | string | `"otlp-grpc"` | |
| grafana-agent-deployment.agent.extraPorts[0].port | int | `4317` | |
| grafana-agent-deployment.agent.extraPorts[0].protocol | string | `"TCP"` | |
| grafana-agent-deployment.agent.extraPorts[0].targetPort | int | `4317` | |
| grafana-agent-deployment.agent.extraPorts[1].name | string | `"otlp-http"` | |
| grafana-agent-deployment.agent.extraPorts[1].port | int | `4318` | |
| grafana-agent-deployment.agent.extraPorts[1].protocol | string | `"TCP"` | |
| grafana-agent-deployment.agent.extraPorts[1].targetPort | int | `4318` | |
| grafana-agent-deployment.agent.resources.requests.cpu | string | `"1"` | |
| grafana-agent-deployment.agent.resources.requests.memory | string | `"2G"` | |
| grafana-agent-deployment.controller.autoscaling.enabled | bool | `false` | Creates a HorizontalPodAutoscaler for controller type deployment. |
| grafana-agent-deployment.controller.autoscaling.maxReplicas | int | `5` | The upper limit for the number of replicas to which the autoscaler can scale up. |
| grafana-agent-deployment.controller.autoscaling.minReplicas | int | `2` | The lower limit for the number of replicas to which the autoscaler can scale down. |
| grafana-agent-deployment.controller.autoscaling.targetCPUUtilizationPercentage | int | `0` | Average CPU utilization across all relevant pods, a percentage of the requested value of the resource for the pods. Setting `targetCPUUtilizationPercentage` to 0 will disable CPU scaling. |
| grafana-agent-deployment.controller.autoscaling.targetMemoryUtilizationPercentage | int | `80` | Average Memory utilization across all relevant pods, a percentage of the requested value of the resource for the pods. Setting `targetMemoryUtilizationPercentage` to 0 will disable Memory scaling. |
| grafana-agent-deployment.controller.replicas | int | `1` | |
| grafana-agent-deployment.controller.type | string | `"deployment"` | |
| grafana-agent-deployment.nameOverride | string | `"deployment"` | Do not change this. |
| grafana-agent-statefulset.agent.configMap.create | bool | `false` | |
| grafana-agent-statefulset.agent.extraEnv[0].name | string | `"GRAFANA_CLOUD_API_KEY"` | |
| grafana-agent-statefulset.agent.extraEnv[0].value | string | `"<REQUIRED>"` | |
| grafana-agent-statefulset.agent.extraEnv[1].name | string | `"GRAFANA_CLOUD_PROMETHEUS_URL"` | |
| grafana-agent-statefulset.agent.extraEnv[1].value | string | `"<REQUIRED>"` | |
| grafana-agent-statefulset.agent.extraEnv[2].name | string | `"GRAFANA_CLOUD_PROMETHEUS_USERNAME"` | |
| grafana-agent-statefulset.agent.extraEnv[2].value | string | `"<REQUIRED>"` | |
| grafana-agent-statefulset.agent.extraEnv[3].name | string | `"GRAFANA_CLOUD_TEMPO_ENDPOINT"` | |
| grafana-agent-statefulset.agent.extraEnv[3].value | string | `"<REQUIRED>"` | |
| grafana-agent-statefulset.agent.extraEnv[4].name | string | `"GRAFANA_CLOUD_TEMPO_USERNAME"` | |
| grafana-agent-statefulset.agent.extraEnv[4].value | string | `"<REQUIRED>"` | |
| grafana-agent-statefulset.agent.extraEnv[5].name | string | `"POD_UID"` | |
| grafana-agent-statefulset.agent.extraEnv[5].valueFrom.fieldRef.apiVersion | string | `"v1"` | |
| grafana-agent-statefulset.agent.extraEnv[5].valueFrom.fieldRef.fieldPath | string | `"metadata.uid"` | |
| grafana-agent-statefulset.agent.extraPorts[0].name | string | `"otlp-grpc"` | |
| grafana-agent-statefulset.agent.extraPorts[0].port | int | `4317` | |
| grafana-agent-statefulset.agent.extraPorts[0].protocol | string | `"TCP"` | |
| grafana-agent-statefulset.agent.extraPorts[0].targetPort | int | `4317` | |
| grafana-agent-statefulset.agent.resources.requests.cpu | string | `"1"` | |
| grafana-agent-statefulset.agent.resources.requests.memory | string | `"2G"` | |
| grafana-agent-statefulset.controller.autoscaling.enabled | bool | `false` | Creates a HorizontalPodAutoscaler for controller type deployment. |
| grafana-agent-statefulset.controller.autoscaling.maxReplicas | int | `5` | The upper limit for the number of replicas to which the autoscaler can scale up. |
| grafana-agent-statefulset.controller.autoscaling.minReplicas | int | `2` | The lower limit for the number of replicas to which the autoscaler can scale down. |
| grafana-agent-statefulset.controller.autoscaling.targetCPUUtilizationPercentage | int | `0` | Average CPU utilization across all relevant pods, a percentage of the requested value of the resource for the pods. Setting `targetCPUUtilizationPercentage` to 0 will disable CPU scaling. |
| grafana-agent-statefulset.controller.autoscaling.targetMemoryUtilizationPercentage | int | `80` | Average Memory utilization across all relevant pods, a percentage of the requested value of the resource for the pods. Setting `targetMemoryUtilizationPercentage` to 0 will disable Memory scaling. |
| grafana-agent-statefulset.controller.replicas | int | `1` | |
| grafana-agent-statefulset.controller.type | string | `"statefulset"` | |
| grafana-agent-statefulset.nameOverride | string | `"statefulset"` | Do not change this. |
| grafana-agent-statefulset.rbac.create | bool | `false` | |
| grafana-agent-statefulset.service.clusterIP | string | `"None"` | |
| grafana-agent-statefulset.serviceAccount.create | bool | `false` | |
| metricsGeneration.dimensions | list | `["service.namespace","service.version","deployment.environment","k8s.cluster.name"]` | Additional dimensions to add to generated metrics. |
| metricsGeneration.enabled | bool | `true` | Toggle generation of spanmetrics and servicegraph metrics. |
| sampling.decisionWait | string | `"15s"` | Wait time since the first span of a trace before making a sampling decision. |
| sampling.enabled | bool | `true` | Toggle tail sampling. |
| sampling.extraPolicies | string | A policy to sample long requests is added by default. | User-defined policies in river format. |
| sampling.failedRequests.percentage | int | `50` | Percentage of failed requests to sample. |
| sampling.failedRequests.sample | bool | `false` | Toggle sampling failed requests. |
| sampling.successfulRequests.percentage | int | `10` | Percentage of successful requests to sample. |
| sampling.successfulRequests.sample | bool | `true` | Toggle sampling successful requests. |

63 changes: 63 additions & 0 deletions charts/grafana-sampling/README.md.gotmpl
@@ -0,0 +1,63 @@
{{ template "chart.header" . }}

{{ template "chart.versionBadge" . }}{{ template "chart.typeBadge" . }}{{ template "chart.appVersionBadge" . }}

{{ template "chart.description" . }}

This chart deploys the following architecture to your environment:
![Photo of sampling architecture](./sampling-architecture.png)

Note: by default, only OTLP traces are accepted at the load balancing layer.


## Chart Repo

Add the following repo to use the chart:

```console
helm repo add grafana https://grafana.github.io/helm-charts
```
## Installing the Chart

Use the following command to install the chart with the release name `my-release`. Make sure to populate the required values.

```console
helm install my-release grafana/grafana-sampling --values - <<EOF | less
grafana-agent-statefulset:
  agent:
    extraEnv:
      - name: GRAFANA_CLOUD_API_KEY
        value: <REQUIRED>
      - name: GRAFANA_CLOUD_PROMETHEUS_URL
        value: <REQUIRED>
      - name: GRAFANA_CLOUD_PROMETHEUS_USERNAME
        value: <REQUIRED>
      - name: GRAFANA_CLOUD_TEMPO_ENDPOINT
        value: <REQUIRED>
      - name: GRAFANA_CLOUD_TEMPO_USERNAME
        value: <REQUIRED>
      # This is required for adaptive metric deduplication in Grafana Cloud
      - name: POD_UID
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.uid
EOF
```

## Uninstalling the Chart

To uninstall/delete the my-release deployment:

```console
helm delete my-release
```

The command removes all the Kubernetes components associated with the chart and deletes the release.

## Upgrading

A major chart version change indicates that there is an incompatible breaking change needing manual actions.

{{ template "chart.valuesSection" . }}

@@ -0,0 +1,5 @@
{{- define "agent.config.deployment" -}}
{{- include "deployment.receiver.otlp" . }}
{{- include "deployment.processor.batch" . }}
{{- include "deployment.exporter.loadbalancing" . }}
{{- end -}}
@@ -0,0 +1,18 @@
{{- define "agent.config.statefulset" -}}
{{- include "statefulset.receiver.otlp" . }}
{{- if .Values.metricsGeneration.enabled -}}
{{- include "statefulset.connector.spanmetrics" . }}
{{- include "statefulset.processor.transform.drop_unneeded_resource_attributes" . }}
{{- include "statefulset.processor.transform.use_grafana_metric_names" . }}
{{- include "statefulset.processor.filter" . }}
{{- include "statefulset.connector.servicegraph" . }}
{{- include "statefulset.exporter.prometheus" . }}
{{- include "statefulset.prometheus.remote_write" . }}
{{- end -}}
{{- if .Values.sampling.enabled -}}
{{- include "statefulset.processor.tail_sampling" . }}
{{- end -}}
{{- include "statefulset.processor.batch" . }}
{{- include "exporter.otlp" . }}
{{- include "auth.basic" . }}
{{- end -}}
9 changes: 9 additions & 0 deletions charts/grafana-sampling/templates/_helpers.tpl
@@ -0,0 +1,9 @@
{{/* use the release name as the serviceAccount name for deployment and statefulset agents */}}
{{- define "grafana-agent.serviceAccountName" -}}
{{- default .Release.Name }}
{{- end }}

{{/* Calculate name of image ID to use for "grafana-agent". */}}
{{- define "grafana-agent.imageId" -}}
{{- printf ":%s" .Chart.AppVersion }}
{{- end }}
@@ -0,0 +1,8 @@
{{- define "auth.basic" -}}
otelcol.auth.basic "grafana_cloud_tempo" {
  // https://grafana.com/docs/agent/latest/flow/reference/components/otelcol.auth.basic/
  username = env("GRAFANA_CLOUD_TEMPO_USERNAME")
  password = env("GRAFANA_CLOUD_API_KEY")
}

{{ end }}
@@ -0,0 +1,20 @@
{{- define "statefulset.connector.servicegraph" -}}
otelcol.connector.servicegraph "default" {
  // https://grafana.com/docs/agent/latest/flow/reference/components/otelcol.connector.servicegraph/
  dimensions = [
    {{- range $.Values.metricsGeneration.dimensions }}
    {{ . | quote }},
    {{- end }}
  ]
  latency_histogram_buckets = ["0s", "0.005s", "0.01s", "0.025s", "0.05s", "0.075s", "0.1s", "0.25s", "0.5s", "0.75s", "1s", "2.5s", "5s", "7.5s", "10s"]

  store {
    ttl = "2s"
  }

  output {
    metrics = [otelcol.processor.batch.default.input]
  }
}

{{ end }}