Add signal handling to agbot_start.sh and css_start.sh #4039

Open
wants to merge 1 commit into
base: master

Conversation

lbergesio
Contributor

@lbergesio lbergesio commented Apr 16, 2024

Description

When run in a container, anax is started via a shell script that does not handle SIGTERM, so when the container is stopped the signal never reaches the binary and it cannot exit gracefully. As a consequence, when the containers are started with docker compose up, docker compose down has to wait for the timeout to expire before the process is killed.

The same happens for the css-api.

This change adds proper SIGTERM handling to agbot_start.sh and css_start.sh.
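
For readers unfamiliar with the pattern, here is a minimal sketch of one common way a wrapper script can forward SIGTERM to the process it starts. It is an illustration only, not the diff in this PR: the function names `run_with_forwarding` and `forward_term` are hypothetical, and the real start scripts may be structured differently.

```shell
#!/bin/sh
# Sketch only: a generic signal-forwarding wrapper, not the code in this PR.
# run_with_forwarding and forward_term are hypothetical names.

child=""
got_signal=0

forward_term() {
    got_signal=1
    # Relay the signal so the child can shut down gracefully.
    if [ -n "$child" ]; then
        kill -TERM "$child" 2>/dev/null || true
    fi
}

run_with_forwarding() {
    trap forward_term TERM
    # Start the real program in the background so the shell stays free
    # to run the trap handler while it waits.
    "$@" &
    child=$!
    status=0
    # wait returns early (status > 128) when a trapped signal arrives.
    wait "$child" || status=$?
    if [ "$got_signal" -eq 1 ]; then
        # Wait again to collect the child's exit status after the relay.
        status=0
        wait "$child" || status=$?
    fi
    return "$status"
}
```

A start script using this sketch would end by calling `run_with_forwarding` with the binary and its arguments. When the script has nothing left to do after launching the binary, a simpler alternative is to `exec` the binary so that it replaces the shell and receives signals directly.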

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

I ran the container images with docker compose up and compared the output of docker compose down.
Running docker compose down -t 20s without this change:

✔ Container agbot                              Removed                       21.9s
✔ Container css-api                            Removed                       20.4s

With this change:

✔ Container agbot                              Removed                       3.4s
✔ Container css-api                            Removed                       4.2s
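
To reproduce this kind of measurement for the whole command rather than per container, a small timing helper can be used. This is a rough sketch: `elapsed_seconds` is a hypothetical name, and it assumes a `date` that supports `+%s` (as GNU and BSD `date` do).

```shell
# Hypothetical helper: runs a command and prints how many whole seconds it took.
elapsed_seconds() {
    start=$(date +%s)
    rc=0
    # Send the command's own output to stderr so stdout carries only the number.
    "$@" >&2 || rc=$?
    end=$(date +%s)
    echo "$((end - start))"
    return "$rc"
}
```

For example, `elapsed_seconds docker compose down -t 20` would print the total shutdown time in seconds, which should drop well below the timeout once the signal reaches the binaries.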

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings
  • I have tagged the reviewers in a comment below in case my pull request is ready for a review
  • I have signed the commit message to agree to Developer Certificate of Origin (DCO) (to certify that you wrote or otherwise have the right to submit your contribution to the project.) by adding "--signoff" to my git commit command.

@lbergesio lbergesio force-pushed the add_sighandler_agbot_start branch from efe8e4c to e828422 Compare April 16, 2024 10:06
@lbergesio lbergesio changed the title Add signal handling to agbot_start.sh Add signal handling to agbot_start.sh and css_start.sh Apr 16, 2024
@lbergesio
Contributor Author

Not sure why the CI failed; any hint would be appreciated.

When run in a container anax is run via a shell script that does not
handle SIGTERM and hence when the container is stopped this signal does
not arrive to the binary for graceful exit. A consequence of this is
when run via docker compose up, docker compose down will have to wait
until the timeout in order to kill the process.

The same happens for the ccs-api.

This change adds proper SIGTERM handling to agbot_start.sh and
css_start.sh.

Signed-off-by: Leonardo Bergesio <[email protected]>
@LiilyZhang LiilyZhang force-pushed the add_sighandler_agbot_start branch from e828422 to d851295 Compare April 17, 2024 13:48
@lbergesio
Contributor Author

Hi @LiilyZhang, is there anything preventing this from being merged that I should address?

@LiilyZhang
Contributor

We are still investigating how to test this in an OpenShift environment, @lbergesio.

@lbergesio
Contributor Author

We are still investigating how to test this in an OpenShift environment, @lbergesio.

I think the kubelet handles the pod the same way Docker handles the container: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
I would say just run the pod, exec into it, and check that the process is running. Then delete the pod; this should send SIGTERM to the pod's process?
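
Something along these lines could serve as a starting point for that check. It is only a sketch under assumptions: `check_graceful_stop` is a hypothetical helper, the pod name is supplied by the caller, the image is assumed to contain `ps`, and the 30-second grace period is an arbitrary choice.

```shell
# Hypothetical helper: the pod name and the process check are assumptions.
check_graceful_stop() {
    pod=$1
    # Confirm the process is running inside the pod (assumes `ps` is in the image).
    kubectl exec "$pod" -- ps || return 1
    start=$(date +%s)
    # Deleting the pod asks the kubelet to send SIGTERM to the container,
    # followed by SIGKILL once the grace period runs out.
    kubectl delete pod "$pod" --grace-period=30 || return 1
    end=$(date +%s)
    echo "pod $pod removed in $((end - start))s"
}
```

If the signal reaches the binary, the reported time should be clearly shorter than the grace period; if it does not, the deletion should take roughly the full grace period.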

@lbergesio
Contributor Author

lbergesio commented Jun 18, 2024

Hi! Any update on this? Thanks.

2 participants