Workload tracing #645

Merged 10 commits on Nov 13, 2024

@mmkay commented Nov 8, 2024

Issue

Add workload tracing for the Prometheus workload.

Solution

Use built-in tracing capabilities in Prometheus.

Context

This PR also follows the decision to split charm tracing and workload tracing into separate endpoints (charm-tracing and workload-tracing). If you have Tempo deployed and related to Prometheus in a production environment, see the Upgrade Notes below.

Prometheus tracing configuration reference: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tracing_config
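
For orientation, here is a minimal sketch of what a `tracing` section for the Prometheus configuration could look like when built from Python. It is not the code from this PR: the function name `build_tracing_config` and the endpoint `tempo.example:4317` are hypothetical, and the defaults chosen here (full sampling, insecure transport) are assumptions for a test setup. The keys `endpoint`, `client_type`, `sampling_fraction` and `insecure` are taken from the Prometheus reference linked above.

```python
from typing import Any, Dict, Optional


def build_tracing_config(
    endpoint: Optional[str],
    client_type: str = "grpc",
    sampling_fraction: float = 1.0,
    insecure: bool = True,
) -> Dict[str, Any]:
    """Return a `tracing` section for the Prometheus config, or {} without an endpoint.

    Hypothetical helper for illustration; the charm's actual logic may differ.
    """
    if not endpoint:
        # No tracing backend is related, so emit no tracing section at all.
        return {}
    if client_type not in ("grpc", "http"):
        raise ValueError(f"unsupported client_type: {client_type!r}")
    if not 0.0 <= sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must be between 0 and 1")
    return {
        "tracing": {
            "endpoint": endpoint,
            "client_type": client_type,
            "sampling_fraction": sampling_fraction,
            "insecure": insecure,
        }
    }


# Hypothetical endpoint; in a deployment this would come from the Tempo relation.
config = build_tracing_config("tempo.example:4317")
print(config)
```

Returning an empty dict when no endpoint is available lets the caller merge the result into the rest of the configuration unconditionally.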

Testing Instructions

Deploy the following bundle:

bundle: kubernetes
applications:
  alertmanager:
    charm: alertmanager-k8s
    channel: latest/edge
    revision: 138
    base: [email protected]/stable
    resources:
      alertmanager-image: 99
    scale: 1
    constraints: arch=amd64
    storage:
      data: kubernetes,1,1024M
    trust: true
  catalogue:
    charm: catalogue-k8s
    channel: latest/edge
    revision: 69
    base: [email protected]/stable
    resources:
      catalogue-image: 34
    scale: 1
    options:
      description: "Canonical Observability Stack Lite, or COS Lite, is a light-weight,
        highly-integrated, \nJuju-based observability suite running on Kubernetes.\n"
      tagline: Model-driven Observability Stack deployed with a single command.
      title: Canonical Observability Stack
    constraints: arch=amd64
    trust: true
  grafana:
    charm: grafana-k8s
    channel: latest/edge
    revision: 121
    base: [email protected]/stable
    resources:
      grafana-image: 70
      litestream-image: 45
    scale: 1
    constraints: arch=amd64
    storage:
      database: kubernetes,1,1024M
    trust: true
  loki:
    charm: loki-k8s
    channel: latest/edge
    revision: 174
    base: [email protected]/stable
    resources:
      loki-image: 100
      node-exporter-image: 3
    scale: 1
    constraints: arch=amd64
    storage:
      active-index-directory: kubernetes,1,1024M
      loki-chunks: kubernetes,1,1024M
    trust: true
  minio:
    charm: minio
    channel: ckf-1.9/edge
    revision: 380
    base: [email protected]/stable
    resources:
      oci-image: 545
    scale: 1
    options:
      access-key: accesskey
      secret-key: secretkey
    constraints: arch=amd64
    storage:
      minio-data: kubernetes,1,10240M
    trust: true
  prometheus-k8s:
    charm: local:prometheus-k8s-0
    base: [email protected]/stable
    scale: 1
    constraints: arch=amd64
    storage:
      database: kubernetes,1,1024M
    trust: true
  s3:
    charm: s3-integrator
    channel: latest/edge
    revision: 75
    scale: 1
    options:
      bucket: tempo
      endpoint: minio-0.minio-endpoints.test-self-tracing.svc.cluster.local:9000
    constraints: arch=amd64
    trust: true
  self-signed-certificates:
    charm: self-signed-certificates
    channel: latest/edge
    revision: 202
    scale: 1
    constraints: arch=amd64
    trust: true
  tempo-coordinator-k8s:
    charm: tempo-coordinator-k8s
    channel: latest/edge
    revision: 30
    resources:
      nginx-image: 5
      nginx-prometheus-exporter-image: 3
    scale: 1
    constraints: arch=amd64
    storage:
      data: kubernetes,1,1024M
    trust: true
  tempo-worker-k8s:
    charm: tempo-worker-k8s
    channel: latest/edge
    revision: 37
    resources:
      tempo-image: 4
    scale: 1
    constraints: arch=amd64
    storage:
      data: kubernetes,1,1024M
    trust: true
  traefik:
    charm: traefik-k8s
    channel: latest/edge
    revision: 214
    base: [email protected]/stable
    resources:
      traefik-image: 161
    scale: 1
    constraints: arch=amd64
    storage:
      configurations: kubernetes,1,1024M
    trust: true
relations:
- - traefik:ingress-per-unit
  - loki:ingress
- - traefik:traefik-route
  - grafana:ingress
- - traefik:ingress
  - alertmanager:ingress
- - grafana:grafana-source
  - loki:grafana-source
- - grafana:grafana-source
  - alertmanager:grafana-source
- - loki:alertmanager
  - alertmanager:alerting
- - grafana:grafana-dashboard
  - loki:grafana-dashboard
- - grafana:grafana-dashboard
  - alertmanager:grafana-dashboard
- - catalogue:ingress
  - traefik:ingress
- - catalogue:catalogue
  - grafana:catalogue
- - catalogue:catalogue
  - alertmanager:catalogue
- - alertmanager:alerting
  - prometheus-k8s:alertmanager
- - alertmanager:self-metrics-endpoint
  - prometheus-k8s:metrics-endpoint
- - catalogue:catalogue
  - loki:catalogue
- - catalogue:catalogue
  - prometheus-k8s:catalogue
- - grafana:metrics-endpoint
  - prometheus-k8s:metrics-endpoint
- - loki:metrics-endpoint
  - prometheus-k8s:metrics-endpoint
- - loki:logging
  - tempo-coordinator-k8s:logging
- - loki:logging
  - traefik:logging
- - minio:grafana-dashboard
  - grafana:grafana-dashboard
- - minio:metrics-endpoint
  - prometheus-k8s:metrics-endpoint
- - prometheus-k8s:grafana-dashboard
  - grafana:grafana-dashboard
- - prometheus-k8s:grafana-source
  - grafana:grafana-source
- - prometheus-k8s:receive-remote-write
  - tempo-coordinator-k8s:send-remote-write
- - s3:s3-credentials
  - tempo-coordinator-k8s:s3
- - tempo-coordinator-k8s:tracing
  - alertmanager:tracing
- - tempo-coordinator-k8s:tracing
  - catalogue:tracing
- - tempo-coordinator-k8s:grafana-dashboard
  - grafana:grafana-dashboard
- - tempo-coordinator-k8s:grafana-source
  - grafana:grafana-source
- - tempo-coordinator-k8s:tracing
  - grafana:tracing
- - tempo-coordinator-k8s:tracing
  - loki:tracing
- - tempo-coordinator-k8s:metrics-endpoint
  - prometheus-k8s:metrics-endpoint
- - tempo-coordinator-k8s:tracing
  - prometheus-k8s:charm-tracing
- - tempo-coordinator-k8s:tracing
  - prometheus-k8s:workload-tracing
- - tempo-coordinator-k8s:tempo-cluster
  - tempo-worker-k8s:tempo-cluster
- - tempo-coordinator-k8s:tracing
  - traefik:tracing
- - traefik:grafana-dashboard
  - grafana:grafana-dashboard
- - traefik:ingress-per-unit
  - prometheus-k8s:ingress
- - traefik:metrics-endpoint
  - prometheus-k8s:metrics-endpoint
- - traefik:traefik-route
  - tempo-coordinator-k8s:ingress
- - self-signed-certificates:certificates
  - alertmanager:certificates
- - self-signed-certificates:certificates
  - catalogue:certificates
- - self-signed-certificates:certificates
  - grafana:certificates
- - self-signed-certificates:send-ca-cert
  - grafana:receive-ca-cert
- - self-signed-certificates:certificates
  - loki:certificates
- - self-signed-certificates:certificates
  - prometheus-k8s:certificates
- - self-signed-certificates:certificates
  - tempo-coordinator-k8s:certificates
- - self-signed-certificates:certificates
  - traefik:certificates
- - self-signed-certificates:send-ca-cert
  - traefik:receive-ca-cert
- - tempo-coordinator-k8s:tracing
  - self-signed-certificates:tracing
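
The bundle relates Tempo to Prometheus twice, once per endpoint. As a quick sanity check, the sketch below lists which endpoints of an application are related to Tempo's tracing endpoint. The helper `tracing_endpoints` is hypothetical, and the relation pairs are copied by hand from the bundle above rather than parsed from YAML.

```python
from typing import Iterable, List, Tuple

Relation = Tuple[str, str]


def tracing_endpoints(
    relations: Iterable[Relation],
    app: str,
    tempo: str = "tempo-coordinator-k8s",
) -> List[str]:
    """Return the endpoints of `app` that are related to `tempo`'s tracing endpoint."""
    provider = f"{tempo}:tracing"
    found = []
    for a, b in relations:
        # A relation pair may list the two sides in either order.
        for left, right in ((a, b), (b, a)):
            if left == provider and right.startswith(f"{app}:"):
                found.append(right.split(":", 1)[1])
    return sorted(found)


# A subset of the bundle's relations, copied by hand.
relations = [
    ("tempo-coordinator-k8s:tracing", "prometheus-k8s:charm-tracing"),
    ("tempo-coordinator-k8s:tracing", "prometheus-k8s:workload-tracing"),
    ("tempo-coordinator-k8s:tracing", "loki:tracing"),
]
print(tracing_endpoints(relations, "prometheus-k8s"))
# prints ['charm-tracing', 'workload-tracing']
```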

Upgrade Notes

This breaking change affects you only if Prometheus is integrated with Tempo in a production environment. If you haven't deployed Tempo, you're not affected.

We are rolling out this split across the COS charms. If you have COS components deployed alongside Tempo and related to it over a tracing integration in your Juju model, remove the tracing relation before the upgrade and re-relate the applications to Tempo after the upgrade.
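
The upgrade order described above can be sketched as a list of Juju commands. This is an illustration under assumptions, not a tested procedure: the helper `upgrade_commands` is hypothetical, the pre-upgrade endpoint is assumed to be named `tracing`, the application names are taken from the test bundle, and `juju integrate` is the Juju 3.x spelling (older clients use `juju relate`). A real `juju refresh` may also need a channel or revision argument.

```python
from typing import List, Sequence


def upgrade_commands(
    app: str,
    tempo: str,
    old_endpoint: str = "tracing",  # assumed name of the pre-upgrade endpoint
    new_endpoints: Sequence[str] = ("charm-tracing", "workload-tracing"),
) -> List[str]:
    """Return the commands in order: remove the relation, refresh, then re-relate."""
    commands = [
        f"juju remove-relation {app}:{old_endpoint} {tempo}:tracing",
        f"juju refresh {app}",
    ]
    # After the upgrade, each new endpoint gets its own relation to Tempo.
    commands += [f"juju integrate {app}:{ep} {tempo}:tracing" for ep in new_endpoints]
    return commands


for command in upgrade_commands("prometheus-k8s", "tempo-coordinator-k8s"):
    print(command)
```

Printing the commands instead of running them keeps the sketch safe to try; review each line against your own model before executing anything.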

@mmkay marked this pull request as ready for review November 8, 2024 11:44
@mmkay requested a review from a team as a code owner November 8, 2024 11:44
@mmkay merged commit 3d0ac1a into main Nov 13, 2024
13 checks passed
@mmkay deleted the workload-tracing branch November 13, 2024 15:23