sentry-worker livenessProbe not working #1521

marcin-je · 2024-10-08T11:17:03Z

Issue submitter TODO list

I've searched for already existing issues here

Describe the bug (actual behavior)

The sentry-worker livenessProbe command fails with:

NameError: name 'celery' is not defined

Full error:

Traceback (most recent call last):
  File "<string>", line 7, in <module>
NameError: name 'celery' is not defined
Traceback (most recent call last):
  File "<string>", line 7, in <module>
NameError: name 'celery' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/.venv/bin/sentry", line 4, in <module>
    raise SystemExit(main())
                     ^^^^^^
  File "/usr/src/sentry/src/sentry/runner/main.py", line 149, in main
    func(**kwargs)
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/sentry/src/sentry/runner/commands/exec.py", line 118, in exec_
    exec(compile(script, file, "exec"), g, g)
  File "<string>", line 11, in <module>
ScriptError: Failed to execute script '<string>'
Sentry is attempting to send 2 pending events
Waiting up to 2 seconds
Press Ctrl-C to quit

Expected behavior

The livenessProbe should not error, or it should at least return a meaningful message.

values.yaml


  values:
    clickhouse:
      clickhouse:
        imageVersion: 21.8.14.5
        replicas: "1"
      serviceAccount:
        enabled: true
        name: sentry-clickhouse
    externalPostgresql:
      database: sentry
      existingSecret: postgres-credentials
      existingSecretKeys:
        password: user-password
      host: postgres-postgresql.sentry-new.svc.cluster.local
      port: 5432
      username: sentry
    externalRedis:
      host: redis-master.sentry-new.svc.cluster.local
      port: 6379
    filestore:
      backend: s3
      s3:
        bucketName: redacted
        default_acl: private
        encryption: "true"
        region_name: eu-west-2
    ingress:
      REDACTED
    kafka:
      resources:
        limits:
          cpu: 1
          memory: 2Gi
        requests:
          cpu: 250m
          memory: 1Gi
      zookeeper:
        resources:
          limits:
            cpu: 1
            memory: 1Gi
          requests:
            cpu: 250m
            memory: 400Mi
        serviceAccount:
          create: true
          name: sentry-zookeeper
    mail:
      backend: smtp
      existingSecret: sentry-mail-password
      from: [email protected]
      host: smtp.mandrillapp.com
      port: 587
      useTls: true
      username: redacted
    nginx:
      enabled: false
    postgresql:
      enabled: false
    rabbitmq:
      enabled: false
    redis:
      enabled: false
    relay:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - sentry
                  - key: role
                    operator: In
                    values:
                      - relay
              topologyKey: topology.kubernetes.io/zone
      init:
        resources:
          limits:
            cpu: 500m
            memory: 450Mi
          requests:
            cpu: 200m
            memory: 150Mi
      resources:
        limits:
          cpu: 1
          memory: 1Gi
        requests:
          cpu: 100m
          memory: 100Mi
    sentry:
      billingMetricsConsumer:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 150Mi
      cleanup:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 80Mi
        serviceAccount:
          name: sentry-cleanup
      cron:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - sentry
                    - key: role
                      operator: In
                      values:
                        - cron
                topologyKey: topology.kubernetes.io/zone
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 50m
            memory: 200Mi
      ingestConsumer:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 170Mi
      ingestMetricsConsumerPerf:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 310Mi
      ingestMetricsConsumerRh:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 300Mi
      ingestReplayRecordings:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 150Mi
      postProcessForwardErrors:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      postProcessForwardTransactions:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      subscriptionConsumerEvents:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 10m
            memory: 80Mi
      subscriptionConsumerSessions:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 80Mi
      subscriptionConsumerTransactions:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 80Mi
      web:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - sentry
                    - key: role
                      operator: In
                      values:
                        - web
                topologyKey: topology.kubernetes.io/zone
        resources:
          limits:
            cpu: 1
            memory: 1Gi
          requests:
            cpu: 200m
            memory: 200Mi
      worker:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - sentry
                    - key: role
                      operator: In
                      values:
                        - worker
                topologyKey: kubernetes.io/hostname
        resources:
          limits:
            cpu: 700m
            memory: 2Gi
          requests:
            cpu: 30m
            memory: 100Mi
    serviceAccount:
      annotations:
        eks.amazonaws.com/role-arn: redacted
      enabled: true
    snuba:
      api:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - sentry
                    - key: role
                      operator: In
                      values:
                        - snuba-api
                topologyKey: topology.kubernetes.io/zone
        resources:
          limits:
            cpu: 500m
            memory: 2Gi
          requests:
            cpu: 20m
            memory: 200Mi
      consumer:
        resources:
          limits:
            cpu: 500m
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 100Mi
      metricsConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      outcomesBillingConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      outcomesConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      replacer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      replaysConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      sessionsConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      subscriptionConsumerEvents:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      subscriptionConsumerSessions:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      subscriptionConsumerTransactions:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      transactionsConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
    user:
      create: true
      email: [email protected]
      existingSecret: sentry-admin-password
      existingSecretKey: admin-password
    zookeeper:
      persistence:
        size: 32Gi
      resources:
        limits:
          cpu: 1
          memory: 1Gi
        requests:
          cpu: 250m
          memory: 450Mi
      serviceAccount:
        create: true
        name: sentry-zookeeper-clickhouse
      sidecars:
        - args:
            - -c
            - while true; do sleep 86400 && find /bitnami/zookeeper/data -type f -mtime
              +14 -name 'log.*' -print0 | xargs -r0 rm --; done;
          command:
            - /bin/sh
          image: alpine:3.18.2
          name: cleanup-logs
          resources:
            limits:
              cpu: 1
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 128Mi
          volumeMounts:
            - mountPath: /bitnami/zookeeper
              name: data

Helm chart version

25.9.0

Steps to reproduce

 kubectl exec -it deployments/sentry-worker -c sentry-worker -- sentry exec -c  "from sentry.celery import app; import os; dest="celery@{}".format(os.environ["HOSTNAME"]); print(app.control.ping(destination=[dest], timeout=5)[0][dest]["ok"])"

Screenshots

No response

Logs

Traceback (most recent call last):
  File "<string>", line 7, in <module>
NameError: name 'celery' is not defined
Traceback (most recent call last):
  File "<string>", line 7, in <module>
NameError: name 'celery' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/.venv/bin/sentry", line 4, in <module>
    raise SystemExit(main())
                     ^^^^^^
  File "/usr/src/sentry/src/sentry/runner/main.py", line 149, in main
    func(**kwargs)
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/sentry/src/sentry/runner/commands/exec.py", line 118, in exec_
    exec(compile(script, file, "exec"), g, g)
  File "<string>", line 11, in <module>
ScriptError: Failed to execute script '<string>'
Sentry is attempting to send 2 pending events
Waiting up to 2 seconds
Press Ctrl-C to quit

Additional context

No response


patsevanton · 2024-10-25T14:15:41Z

I can't confirm this. I just installed the latest version of the sentry Helm chart, and the probe command works:

root@sentry-worker-58788dd48-dc58x:/usr/src/sentry# sentry exec -c 'from sentry.celery import app; import os; dest="celery@{}".format(os.environ["HOSTNAME"]); print(app.control.ping(destination=[dest], timeout=5)[0][dest]["ok"])'
pong
root@sentry-worker-58788dd48-dc58x:/usr/src/sentry#
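
One possible explanation for the difference, offered as an assumption rather than a confirmed diagnosis: the command in "Steps to reproduce" nests double quotes inside a double-quoted -c argument, while the command above wraps the script in single quotes. A POSIX shell strips the inner double quotes, so Python would see celery@{} as an expression instead of a string. The sketch below uses shlex.split to mimic the shell's word splitting; fake_os and the HOSTNAME value worker-0 are stand-ins invented for the demo, and only the dest= statement is executed so that Sentry does not need to be installed.

```python
import shlex
import types

# Command as written in "Steps to reproduce": double quotes nested in double quotes.
nested = (
    'sentry exec -c "from sentry.celery import app; import os; '
    'dest="celery@{}".format(os.environ["HOSTNAME"]); '
    'print(app.control.ping(destination=[dest], timeout=5)[0][dest]["ok"])"'
)

# Same script wrapped in single quotes, as in the comment above.
single = (
    "sentry exec -c 'from sentry.celery import app; import os; "
    'dest="celery@{}".format(os.environ["HOSTNAME"]); '
    "print(app.control.ping(destination=[dest], timeout=5)[0][dest][\"ok\"])'"
)


def script_of(command):
    """Return the last argument a POSIX shell would pass to `sentry exec -c`."""
    return shlex.split(command)[-1]


# Hypothetical stand-in for the os module so the demo does not depend on the
# real environment of the machine running it.
fake_os = types.SimpleNamespace(environ={"HOSTNAME": "worker-0"})


def run_dest_statement(script):
    """Execute only the third statement (dest=...) of the one-line script."""
    stmt = script.split("; ")[2]
    namespace = {"os": fake_os}
    try:
        exec(stmt, namespace)
    except NameError as exc:
        return f"NameError: {exc}"
    return namespace["dest"]


broken = script_of(nested)
working = script_of(single)

print(broken.split("; ")[2])        # inner quotes are gone
print(run_dest_statement(broken))   # fails before any ping is sent
print(run_dest_statement(working))  # a proper destination string
```

If this is what happened, the NameError comes from the kubectl exec invocation used to reproduce, not from the probe defined by the chart, which would be consistent with the single-quoted command returning pong.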

JaanJah · 2024-11-18T07:30:00Z

I had the same issue and figured out it was related to CPU throttling in the Sentry worker, so I increased the CPU limit and now the workers are fine.

Scaling up worker replicas might also help.

I'm currently running this config:

    worker:
      replicas: 3
      resources:
        requests:
          cpu: 250m
          memory: 800Mi
        limits:
          cpu: 1300m
          memory: 1Gi
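
For anyone who wants to check whether throttling is actually happening before raising limits, here is a minimal sketch that reads the cgroup CPU statistics from inside the worker container. It assumes cgroup v2 with the file at /sys/fs/cgroup/cpu.stat (cgroup v1 nodes typically expose /sys/fs/cgroup/cpu,cpuacct/cpu.stat instead), and the sample numbers are made up for illustration.

```python
from pathlib import Path

# Assumed cgroup v2 location inside the container; adjust for cgroup v1 nodes.
CPU_STAT = Path("/sys/fs/cgroup/cpu.stat")


def parse_cpu_stat(text):
    """Parse 'key value' lines from cpu.stat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats):
    """Fraction of CFS enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


# Made-up sample in the cgroup v2 layout, used when the real file is absent.
sample = """usage_usec 8123456
user_usec 6000000
system_usec 2123456
nr_periods 1200
nr_throttled 900
throttled_usec 45000000
"""

if __name__ == "__main__":
    text = CPU_STAT.read_text() if CPU_STAT.exists() else sample
    ratio = throttled_ratio(parse_cpu_stat(text))
    print(f"throttled in {ratio:.0%} of periods")
```

A ratio that stays high while the probe is running would support the throttling explanation; a ratio near zero suggests looking elsewhere.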

mway-niels · 2024-11-20T10:57:56Z

Not sure if this is related but our sentry-worker Pod keeps restarting due to:

Liveness probe failed: command timed out: "sentry exec -c from sentry.celery import app; import os; dest=\"celery@{}\".format(os.environ[\"HOSTNAME\"]); print(app.control.ping(destination=[dest], timeout=5)[0][dest][\"ok\"])" timed out after 10s
