sentry-worker livenessProbe not working #1521

Open
1 task done
marcin-je opened this issue Oct 8, 2024 · 3 comments

@marcin-je

Issue submitter TODO list

  • I've searched for already existing issues here

Describe the bug (actual behavior)

livenessProbe errors

NameError: name 'celery' is not defined

Full error:

Traceback (most recent call last):
  File "<string>", line 7, in <module>
NameError: name 'celery' is not defined
Traceback (most recent call last):
  File "<string>", line 7, in <module>
NameError: name 'celery' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/.venv/bin/sentry", line 4, in <module>
    raise SystemExit(main())
                     ^^^^^^
  File "/usr/src/sentry/src/sentry/runner/main.py", line 149, in main
    func(**kwargs)
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/sentry/src/sentry/runner/commands/exec.py", line 118, in exec_
    exec(compile(script, file, "exec"), g, g)
  File "<string>", line 11, in <module>
ScriptError: Failed to execute script '<string>'
Sentry is attempting to send 2 pending events
Waiting up to 2 seconds
Press Ctrl-C to quit

Expected behavior

The livenessProbe should either succeed or fail with a meaningful message.

values.yaml


  values:
    clickhouse:
      clickhouse:
        imageVersion: 21.8.14.5
        replicas: "1"
      serviceAccount:
        enabled: true
        name: sentry-clickhouse
    externalPostgresql:
      database: sentry
      existingSecret: postgres-credentials
      existingSecretKeys:
        password: user-password
      host: postgres-postgresql.sentry-new.svc.cluster.local
      port: 5432
      username: sentry
    externalRedis:
      host: redis-master.sentry-new.svc.cluster.local
      port: 6379
    filestore:
      backend: s3
      s3:
        bucketName: redacted
        default_acl: private
        encryption: "true"
        region_name: eu-west-2
    ingress:
      REDACTED
    kafka:
      resources:
        limits:
          cpu: 1
          memory: 2Gi
        requests:
          cpu: 250m
          memory: 1Gi
      zookeeper:
        resources:
          limits:
            cpu: 1
            memory: 1Gi
          requests:
            cpu: 250m
            memory: 400Mi
        serviceAccount:
          create: true
          name: sentry-zookeeper
    mail:
      backend: smtp
      existingSecret: sentry-mail-password
      from: [email protected]
      host: smtp.mandrillapp.com
      port: 587
      useTls: true
      username: redacted
    nginx:
      enabled: false
    postgresql:
      enabled: false
    rabbitmq:
      enabled: false
    redis:
      enabled: false
    relay:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - sentry
                  - key: role
                    operator: In
                    values:
                      - relay
              topologyKey: topology.kubernetes.io/zone
      init:
        resources:
          limits:
            cpu: 500m
            memory: 450Mi
          requests:
            cpu: 200m
            memory: 150Mi
      resources:
        limits:
          cpu: 1
          memory: 1Gi
        requests:
          cpu: 100m
          memory: 100Mi
    sentry:
      billingMetricsConsumer:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 150Mi
      cleanup:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 80Mi
        serviceAccount:
          name: sentry-cleanup
      cron:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - sentry
                    - key: role
                      operator: In
                      values:
                        - cron
                topologyKey: topology.kubernetes.io/zone
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 50m
            memory: 200Mi
      ingestConsumer:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 170Mi
      ingestMetricsConsumerPerf:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 310Mi
      ingestMetricsConsumerRh:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 300Mi
      ingestReplayRecordings:
        resources:
          limits:
            cpu: 500m
            memory: 700Mi
          requests:
            cpu: 10m
            memory: 150Mi
      postProcessForwardErrors:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      postProcessForwardTransactions:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      subscriptionConsumerEvents:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 10m
            memory: 80Mi
      subscriptionConsumerSessions:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 80Mi
      subscriptionConsumerTransactions:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 80Mi
      web:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - sentry
                    - key: role
                      operator: In
                      values:
                        - web
                topologyKey: topology.kubernetes.io/zone
        resources:
          limits:
            cpu: 1
            memory: 1Gi
          requests:
            cpu: 200m
            memory: 200Mi
      worker:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - sentry
                    - key: role
                      operator: In
                      values:
                        - worker
                topologyKey: kubernetes.io/hostname
        resources:
          limits:
            cpu: 700m
            memory: 2Gi
          requests:
            cpu: 30m
            memory: 100Mi
    serviceAccount:
      annotations:
        eks.amazonaws.com/role-arn: redacted
      enabled: true
    snuba:
      api:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - sentry
                    - key: role
                      operator: In
                      values:
                        - snuba-api
                topologyKey: topology.kubernetes.io/zone
        resources:
          limits:
            cpu: 500m
            memory: 2Gi
          requests:
            cpu: 20m
            memory: 200Mi
      consumer:
        resources:
          limits:
            cpu: 500m
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 100Mi
      metricsConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      outcomesBillingConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      outcomesConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      replacer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      replaysConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      sessionsConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
      subscriptionConsumerEvents:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      subscriptionConsumerSessions:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      subscriptionConsumerTransactions:
        resources:
          limits:
            cpu: 200m
            memory: 700Mi
          requests:
            cpu: 20m
            memory: 120Mi
      transactionsConsumer:
        resources:
          limits:
            cpu: 200m
            memory: 500Mi
          requests:
            cpu: 20m
            memory: 100Mi
    user:
      create: true
      email: [email protected]
      existingSecret: sentry-admin-password
      existingSecretKey: admin-password
    zookeeper:
      persistence:
        size: 32Gi
      resources:
        limits:
          cpu: 1
          memory: 1Gi
        requests:
          cpu: 250m
          memory: 450Mi
      serviceAccount:
        create: true
        name: sentry-zookeeper-clickhouse
      sidecars:
        - args:
            - -c
            - while true; do sleep 86400 && find /bitnami/zookeeper/data -type f -mtime
              +14 -name 'log.*' -print0 | xargs -r0 rm --; done;
          command:
            - /bin/sh
          image: alpine:3.18.2
          name: cleanup-logs
          resources:
            limits:
              cpu: 1
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 128Mi
          volumeMounts:
            - mountPath: /bitnami/zookeeper
              name: data

Helm chart version

25.9.0

Steps to reproduce

 kubectl exec -it deployments/sentry-worker -c sentry-worker -- sentry exec -c  "from sentry.celery import app; import os; dest="celery@{}".format(os.environ["HOSTNAME"]); print(app.control.ping(destination=[dest], timeout=5)[0][dest]["ok"])"

Screenshots

No response

Logs

Traceback (most recent call last):
  File "<string>", line 7, in <module>
NameError: name 'celery' is not defined
Traceback (most recent call last):
  File "<string>", line 7, in <module>
NameError: name 'celery' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/.venv/bin/sentry", line 4, in <module>
    raise SystemExit(main())
                     ^^^^^^
  File "/usr/src/sentry/src/sentry/runner/main.py", line 149, in main
    func(**kwargs)
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/sentry/src/sentry/runner/commands/exec.py", line 118, in exec_
    exec(compile(script, file, "exec"), g, g)
  File "<string>", line 11, in <module>
ScriptError: Failed to execute script '<string>'
Sentry is attempting to send 2 pending events
Waiting up to 2 seconds
Press Ctrl-C to quit

Additional context

No response

@patsevanton
Contributor

patsevanton commented Oct 25, 2024

I cannot reproduce this. I just installed the latest version of the sentry Helm chart:

root@sentry-worker-58788dd48-dc58x:/usr/src/sentry# sentry exec -c 'from sentry.celery import app; import os; dest="celery@{}".format(os.environ["HOSTNAME"]); print(app.control.ping(destination=[dest], timeout=5)[0][dest]["ok"])'
pong
root@sentry-worker-58788dd48-dc58x:/usr/src/sentry# 
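
One possible explanation for the different results, assuming the command under "Steps to reproduce" was typed exactly as shown: that command wraps the script in double quotes and also uses double quotes inside it, while the command above wraps it in single quotes. The sketch below uses Python's `shlex` to approximate what a POSIX shell would hand to `sentry exec -c` in each case. The helper names (`script_seen_by_sentry`, `first_error`) and the stand-in `os` object are hypothetical and exist only for this illustration; only the `dest=...` statement is executed, because `sentry` itself is not importable outside the container.

```python
import shlex
from types import SimpleNamespace

# Command as written under "Steps to reproduce": double quotes outside and inside.
double_quoted = 'sentry exec -c "from sentry.celery import app; import os; dest="celery@{}".format(os.environ["HOSTNAME"]); print(app.control.ping(destination=[dest], timeout=5)[0][dest]["ok"])"'

# Command as run in the comment above: single quotes outside, double quotes inside.
single_quoted = """sentry exec -c 'from sentry.celery import app; import os; dest="celery@{}".format(os.environ["HOSTNAME"]); print(app.control.ping(destination=[dest], timeout=5)[0][dest]["ok"])'"""


def script_seen_by_sentry(command: str) -> str:
    """Return the argument that follows -c after POSIX-style quote removal."""
    argv = shlex.split(command)
    return argv[argv.index("-c") + 1]


def first_error(script: str):
    """Execute only the `dest=...` statement and return the NameError text, if any."""
    stmt = next(s.strip() for s in script.split(";") if s.strip().startswith("dest="))
    # Stand-in for the os module so the example does not depend on a real HOSTNAME.
    fake_os = SimpleNamespace(environ={"HOSTNAME": "worker-0"})
    try:
        exec(stmt, {"os": fake_os})
    except NameError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    for label, command in (("double", double_quoted), ("single", single_quoted)):
        script = script_seen_by_sentry(command)
        print(label, "->", script)
        print("   error:", first_error(script))
```

With the double-quoted form, the inner quotes are consumed by the shell, so the script contains `dest=celery@{}.format(...)`. Python parses that as the `@` operator applied to an undefined name `celery`, which matches the `NameError` in the report. This only explains the manual reproduction; whether the probe command rendered by the chart has the same quoting is not shown in this thread.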

@JaanJah

JaanJah commented Nov 18, 2024

I had the same issue and traced it to CPU throttling in the Sentry worker. After I increased the CPU limit, the workers have been fine.

Scaling up the worker replicas might also help.

This is the config I am running at the moment:

    worker:
      replicas: 3
      resources:
        requests:
          cpu: 250m
          memory: 800Mi
        limits:
          cpu: 1300m
          memory: 1Gi
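
To check whether a worker container is actually being throttled before changing limits, the kernel's cgroup counters can be read from inside the container. The following is a minimal sketch, assuming `cpu.stat` is exposed at one of the two usual paths (cgroup v2 or v1) and contains the `nr_periods` and `nr_throttled` fields; the function names are hypothetical.

```python
from pathlib import Path

# Assumed locations of cpu.stat inside the container: cgroup v2 first, then v1.
CPU_STAT_PATHS = ("/sys/fs/cgroup/cpu.stat", "/sys/fs/cgroup/cpu/cpu.stat")


def parse_cpu_stat(text: str) -> dict[str, int]:
    """Parse 'key value' lines from a cpu.stat file into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats: dict[str, int]) -> float:
    """Fraction of CFS enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


def read_cpu_stat():
    """Return parsed stats from the first cpu.stat file found, or None."""
    for candidate in CPU_STAT_PATHS:
        path = Path(candidate)
        if path.is_file():
            return parse_cpu_stat(path.read_text())
    return None


if __name__ == "__main__":
    stats = read_cpu_stat()
    if stats is None:
        print("no cpu.stat found at the assumed paths")
    else:
        print("throttled in {:.1%} of periods".format(throttled_ratio(stats)))
```

A ratio that stays high while the probe is failing would be consistent with the throttling explanation above; a ratio near zero would point elsewhere.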

@mway-niels

Not sure if this is related but our sentry-worker Pod keeps restarting due to:

Liveness probe failed: command timed out: "sentry exec -c from sentry.celery import app; import os; dest=\"celery@{}\".format(os.environ[\"HOSTNAME\"]); print(app.control.ping(destination=[dest], timeout=5)[0][dest][\"ok\"])" timed out after 10s
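
For anyone debugging this timeout: the 10s probe budget has to cover both the startup of `sentry exec` and the ping itself, which waits up to 5 seconds. The sketch below writes the probe's one-liner out as a function that also reports how long the ping took. It assumes, based on how the one-liner indexes the result, that `app.control.ping` returns a list of `{destination: {"ok": "pong"}}` replies, and that the list is empty when the worker does not answer. `StubApp` and `StubControl` are hypothetical stand-ins so the example runs without Sentry; inside the container the real `app` from `sentry.celery` would be passed instead.

```python
import time


class StubControl:
    """Hypothetical stand-in for app.control that returns canned ping replies."""

    def __init__(self, replies):
        self._replies = replies

    def ping(self, destination=None, timeout=1.0):
        return self._replies


class StubApp:
    """Hypothetical stand-in for the Celery app imported from sentry.celery."""

    def __init__(self, replies):
        self.control = StubControl(replies)


def ping_worker(app, hostname: str, timeout: float = 5.0):
    """Ping one worker and return (ok, elapsed_seconds)."""
    dest = "celery@{}".format(hostname)
    started = time.monotonic()
    replies = app.control.ping(destination=[dest], timeout=timeout)
    elapsed = time.monotonic() - started
    # An empty reply list is treated as "not alive" instead of raising IndexError,
    # which is what indexing [0] in the one-liner would do.
    ok = any(reply.get(dest, {}).get("ok") == "pong" for reply in replies)
    return ok, elapsed


if __name__ == "__main__":
    app = StubApp([{"celery@worker-0": {"ok": "pong"}}])
    ok, elapsed = ping_worker(app, "worker-0")
    print("alive:", ok, "ping took {:.3f}s".format(elapsed))
```

If the measured ping time is small but the probe still times out, most of the budget is probably spent starting the `sentry` CLI, which would fit the CPU throttling explanation given earlier in the thread.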
