Run CI jobs in container #1136

Closed
5 of 9 tasks
demisx opened this issue Feb 24, 2023 · 3 comments · Fixed by #1154
Labels: ci (Continuous integration), tech debt

Comments

@demisx
Collaborator

demisx commented Feb 24, 2023

Currently, we execute all jobs directly on self-hosted runners. As a side effect, each runner retains state left over from the previously executed job. We try to clean this up with the common "Reset Runner" action, but that approach has its own drawbacks:

  1. It's a manual process, so it's easy to miss something.
  2. Leftover state leads to intermittent errors that are hard to reproduce and fix.
  3. The Reset Runner action cannot be used as the first step, because the repo that hosts this action needs to be checked out beforehand. If the error is in the checkout action itself, we are looking at a catch-22 (e.g. Fix GitHub check out error #1133).

To make the CI process more reliable, I recommend we consider running each job in a container, so that every job starts from a clean state. Running in a container has another benefit: we can build and use our own images with all tools and libraries preinstalled, which should make CI faster as well.
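As a rough illustration of the proposal, a job can be pointed at a container with the `container:` key of a GitHub Actions workflow. This is only a sketch: the job name, the trigger, and the final step are placeholders, and `ubuntu:20.04` stands in for whatever image ends up being used.

```yaml
# Hypothetical workflow fragment; names below are illustrative only.
name: Verify PR Commit
on: pull_request
jobs:
  build:                      # placeholder job name
    runs-on: ubuntu-20.04     # GitHub shared runner hosting the container
    container:
      image: ubuntu:20.04     # a fresh container is created for every job run
    steps:
      - uses: actions/checkout@v3
      - name: Show container OS
        run: cat /etc/os-release
```

Because the container is created for the job and discarded afterwards, files written to the container's own filesystem do not carry over to the next job.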

Tasks:

  • Research running CI jobs in a container and the caveats associated with it
  • Move shared runner jobs to containers
  • Move self-hosted runner jobs to containers
  • Remove debug steps
  • Delete test Rust file
  • Disable Reset Runner common action (globally)
  • Remove HOME: /root env var?
  • Evaluate the benefit of pre-building our own images with pre-installed tools and libs
  • Switch from root to runner user
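
For context on the HOME: /root task: GitHub Actions appears to set HOME to /github/home inside job containers, which can hide tools that an image installed under /root. A minimal sketch of what such a job-level override looks like, assuming a placeholder job name and image. Whether it can be removed depends on where the final image installs its tools and which user the job runs as.

```yaml
jobs:
  build:                      # hypothetical job name
    runs-on: ubuntu-20.04
    container:
      image: ubuntu:20.04     # placeholder image
    env:
      HOME: /root             # assumption: tools in the image live under /root
    steps:
      - name: Show effective home directory
        run: echo "$HOME"
```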

References:

@demisx demisx added discussion Topic for Discussion at a Community Call tech debt ci Continuous integration labels Feb 24, 2023
@demisx
Collaborator Author

demisx commented Feb 24, 2023

@wilwade @saraswatpuneet I think this is the next logical step in the evolution of Frequency CI. Let me know your thoughts on this.

@demisx demisx self-assigned this Feb 27, 2023
@demisx demisx removed the discussion Topic for Discussion at a Community Call label Feb 28, 2023
@demisx
Collaborator Author

demisx commented Feb 28, 2023

Here is my plan of attack. We currently run two types of executors:

  1. "Self-hosted runners" for compute-intensive jobs (8 total)
  2. "GitHub shared runners" for everything else

  • I am going to attempt to run ALL jobs in a container, regardless of the executor. All jobs will now run as the non-privileged runner user (I'm worried that the self-hosted runners run them as root right now).
  • The official ubuntu-20.04 Docker image will be the base image.
  • Required tools and libs will be pre-installed in the image, which will hopefully reduce CI run times further.
  • An Alpine image is the fallback in case we see poor performance from the Ubuntu-based image.
  • The jobs running on the shared runners are simpler and will be converted to use containers first, in their own PR.
  • A separate issue, Harden GitHub Actions #1152, has been created to further improve the security of GH Actions, especially since this is a public repo and the community is allowed to fork it.
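
The plan above could translate into a job definition roughly like the following. This is a sketch under assumptions: the image name is hypothetical (a prebuilt image based on the official Ubuntu 20.04 image with tools pre-installed), the image is assumed to define a runner user, and the job name and verification step are placeholders.

```yaml
jobs:
  test:                                           # placeholder job name
    runs-on: self-hosted                          # same key accepts a shared runner label
    container:
      image: ghcr.io/example-org/ci-base:latest   # hypothetical prebuilt image
      options: --user runner                      # assumes the image defines this user
    steps:
      - uses: actions/checkout@v3
      - name: Confirm the job is not running as root
        run: id -un
```

Running as a non-root user inside a job container can cause permission problems on the mounted workspace, so this setting would need to be verified against each workflow.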

@demisx demisx linked a pull request Feb 28, 2023 that will close this issue
@demisx
Collaborator Author

demisx commented Mar 1, 2023

The calc-code-coverage job fails with an error when running in a container. I filed the following bug: actions-rs/tarpaulin#24.

demisx added a commit that referenced this issue Mar 7, 2023
# Goal
The goal of this PR is to move jobs to the Docker executor in the following
workflows:

1. Verify PR Commit
2. Run Post PR Merge Actions

Part of #1136