Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PORT] Bring our workflows into (mostly) parity with tgstation's workflows #2348

Merged
merged 18 commits into from
Jun 26, 2024

Conversation

LikeLakers2
Copy link
Contributor

@LikeLakers2 LikeLakers2 commented Jun 21, 2024

About The Pull Request

Ports the following workflow-related PRs, taken from the list of commits at https://github.com/tgstation/tgstation/commits/master/.github/workflows:

Reverted PRs

The following PR(s) were originally included, but were reverted:

https://github.com/tgstation/tgstation/pull/80631
Title: Fix flaky wizard screenshot test failure, and try to make screenshot tests run more consistently
Reason: Appeared to cause a test failure in this PR. It's not a very important fix, so I've opted to revert it for now.

Unported PRs

The following PRs listed at https://github.com/tgstation/tgstation/commits/master/.github/workflows have NOT been ported:

https://github.com/tgstation/tgstation/pull/75570
Title: Quietens Spaceman when debugging UNIT_TESTS locally
Reason: Unsure if this PR is wanted/needed

https://github.com/tgstation/tgstation/pull/76663
Title: Bumps rust-g to 3.0.0
Reason: Unsure if this PR is wanted/needed

https://github.com/tgstation/tgstation/pull/77352
Title: Remove merge conflict labeling workflow
Reason: Unsure if this PR is wanted/needed

https://github.com/tgstation/tgstation/pull/79642
Title: Puts all traits in the globalvars file + CI Testing
Reason: Porting this is planned, but this will need more attention than I can give it in this PR

https://github.com/tgstation/tgstation/pull/79659
Title: Icon Autoslicing
Reason: Porting this is planned, but this will need more attention than I can give it in this PR

https://github.com/tgstation/tgstation/pull/80558
Title: Update TGS Tester script for TGS6
Reason: Cherry-picking resulted in empty commit

https://github.com/tgstation/tgstation/pull/83015
Title: Fixes CI not checking warnings.
Reason: Not sure how to deal with JS conflicts

Why It's Good For The Game

Keeping in parity with their workflows makes things much easier for us, especially when bringing content from tgstation.

Changelog

Nothing player-facing has been changed. However, here are some noteworthy changes relevant to contributors:

  • If you face a linter error, it should now be less of a slog searching through the logs.
  • Several tools used during workflows should now have their installs cached on Github, making for faster CI runs. If the caches are ever cleared for any reason, the next CI run will automatically rebuild it with a fresh version.
  • Several Github Actions have been updated to Node 20, which should reduce the number of useless "Node 16" warnings.
  • OpenDream lints are improved.
  • Existing CI runs are now cancelled when a new commit comes in.
  • Certain workflows will now timeout if they run for too long. This shouldn't ever come up in practice, but it will keep GitHub Actions from being bogged down.

Wayland-Smithy and others added 18 commits June 21, 2024 07:29
## About The Pull Request

Bumps the emoji remover action's commit hash +1 to [the current
one](Wayland-Smithy/emoji-stripper-action@de0c1d1)
which has passed my node16 testing.
Necessary for when github intends to disable node12 later this year.

## Why It's Good For The Repo

Paying off some tech debt so the action continues to help maintain
maintainer sanity into 2024 and byond,

## Changelog
Non player/admin facing change.
@LemonInTheDark wanted this to be able to hunt down the progress bar
hard dels.

Code is here:
https://gist.github.com/Mothblocks/06b415afd75672fb03637804435350d2

Intended to be reverted once we figure out the issue, or if we realize
it's not related. This library is not stable.
## About The Pull Request

This PR removes the "Annotate Lints" job step and merges it with the
"Run Linters" step above. To achieve this, I wrote a python script that
should be identical to the action. I even added the ability to read the
output from a file to the script if we ever needed that, but I decided
to have the job step pipe the output into the script instead.
## Why It's Good For The Game

It always bugged me a bit that we had to check the results for a
separate step if we wanted to see the linter results for dm code. I also
noticed a few people getting confused as to why their CI failed on
linters.

Turns out that the action is just a few lines that match the
dreamchecker output and reformat it to a format that GitHub can annotate
code with. It's so brain dead simple that it shouldn't need to be a
whole new step, and for the previous two reasons.
## Changelog
not playerfacing
## About The Pull Request

Splits the big "Run Linters" step into multiple steps. Also since all of
these steps are independent of eachother, I've made them all run
regardless of if the job is currently failing.

<details>
<summary>Proof of testing:</summary>

Fail in install tools, all linting steps are skipped:
https://github.com/distributivgesetz/tgstation/actions/runs/6151628214/job/16692089726
Fail in Run DreamChecker, other steps continue to run:
https://github.com/distributivgesetz/tgstation/actions/runs/6151664705/job/16692203569?pr=2
</details>

<details>
<summary>Pictured: me breaking CI for a day</summary>


https://github.com/tgstation/tgstation/assets/47710522/ea12ad30-2b69-4aa3-9642-7d0818eab2d1


</details>

## Why It's Good For The Game

Going through the Run Linters step has always been a slog. Finding an
error is like finding a needle in a haystack. Seeing what command
exactly went wrong is going to go a long way in helping people find out
which linters have failed.

## Changelog
nothing playerfacing
## About The Pull Request

few errors which were odd and annoying
## Why It's Good For The Game

stealing PRs from san7890, they wanted to do this
## Changelog
nothing playerfacing
> `bash`/`sh`:
By default, fail-fast behavior is enforced using `set -e` for both `sh`
and `bash`. When `shell: bash` is specified, `-o pipefail` is also
applied to enforce early exit from pipelines that generate a non-zero
exit status.

it helps when I actually read the docs
## About The Pull Request

I was planning on doing this a lot earlier but ran into some weird
roadblocks, but this time I finally did the research and it's done
properly.

We do a lot of useless work on every single CI run, and in the interest
of saving the world's energy (by a matter of at least 10 seconds per my
testing), lets stop doing so much extra work by caching both the work we
do on the python bootstrap installation (we literally call it `cache`
for a reason) and the Node installation by sharing it between github
actions runners.

Here's a random CI run on master where you can see that the `Install
Tools` portion takes about 19 seconds -
https://github.com/tgstation/tgstation/actions/runs/6167104927/job/16737570519

Here's the CI run I was testing with where you can see we slim it down
to about 6 seconds for the `Install Tools` step, but with a second-or-so
taken to restore the cache since the runner needs to download+unzip the
cache. it tends to be shorter or longer by a second at times but i'm
certain this is just noise -
https://github.com/san7890/bruhstation/actions/runs/6167245722/job/16737919874

On average, we save about **10 seconds** a run on Run Linters, which is
meant to be the fastest CI step. Future improvements would lie in making
either maplint or the tgui test suite faster, but that would be a much
more complicated and code-intensive task. cache go weeeee

## Changelog

Nothing that concerns players.
I thought `set +e` would still make the shell exit with an error if any
command failed, I didn't think that it would just use the exit code of
the last command. I am dumb, I'm an idiot and I want to cry.
Ripgrep got updated some time ago, and that means that our cache went
out of date as we had to keep doing more work to install the new
dependencies that it has. This means we were eating up another **50
seconds** every time we ran Run Linters. Unfortunately, github caches
will only hash to a file that exists on the repository, not anything
that might exist after a tool is installed.

So, in order to do this, we create a new bash file `install_ripgrep.sh`
and also pin the ripgrep version. If we need to reap some new benefits
or it's substantially faster (which might not be the case as ripgrep is
already blazing fast), we just need to update `ci_dependencies.sh` and
the cache should work for future versions without us needing to purge
rust caches manually through github.

You can view a successful run here
https://github.com/san7890/bruhstation/actions/runs/7123029395/job/19394997831

I also did something similar for SpacemanDMM while i noticed it.
…0287)

It wasn't exiting gracefully because of a typo? here
…tests run more consistently (#80631)

From tgstation/tgstation#80542

Not exactly sure why this only fails once. My hint is that it happens
when the alternate tests are skipped from this:


![image](https://github.com/tgstation/tgstation/assets/35135081/67c261c0-e771-4696-add2-f41bb800ad65)

...though this should still be *consistent* at least.
Node 16 actions are deprecated, we need to use Node 20 actions now (as
noted by this handy screenshot which spams it on every CI run)

![image](https://github.com/tgstation/tgstation/assets/34697715/24ea3013-c762-4027-951c-d598b1eec8a3)

You may see
https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/
for more information.
## About The Pull Request

This PR does a few things.

The main goal was to have it annotate the code like we do for `Run
Dreamchecker`, which it now does. The code for this was mostly
shamelessly ripped from the very same process I just mentioned, but
stripped down a lot more for the task. Since I had to write a lot more
code to handle the way we get warnings and the like from OpenDream, I
didn't fold both into one program.


![image](https://github.com/tgstation/tgstation/assets/34697715/f4a05b59-5407-412e-a73f-5c069fdb4b02)


![image](https://github.com/tgstation/tgstation/assets/34697715/0d3335aa-0da1-45ed-af19-1d17a7fc4174)

It works! Ta-da.

Anyways, in order to facilitate the code annotation, we need to use a
Python script. This Python script means that we need to use the
bootstrap, and the bootstrap is best used with the bootstrap cache. I
then realized that we should just really fold this special CI runner
into the Run Linters workflow anyways, instead of having it on its own
runner (since it'll cut into the number of runners we have available at
a time anyways).

I think it's a good way to save precious CPU cycles on our dying Earth,
and the issues are now visible enough to the average coder (in the
`Files Changed` tab) to where I feel pretty good about doing this.

There's a bit of weirdness and jank in the Python program, but it's all
done for a reason I assure you (it's always silly when stuff breaks
silently and no one knows to fix it until years later). You can look an
example of a failing run here:
https://github.com/san7890/bruhstation/actions/runs/8303759645/job/22728427623#step:17:14
## About The Pull Request

Fun fact, nothing ever hashbanged this, so we've been using the same
weeks-old cache which has been necessitating that we use the same old
versions of BYOND until someone cleaned the cache. Let's tie to
`dependencies.sh` like everything else (which we already did in #78307
(5e1c8bd)) so we don't have to keep
wasting time in CI compilations having to reinstall the needed BYOND
version from scratch (and have caches date and auto-clear themselves
when outdated)

the advantage is that we spend less time downloading/installing BYOND in
CI runs and can actually run CI
This sets concurrency at the workflow level, which should provide a more
concise way to eject if a new commit is entered

https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency

I've also added timeouts that are between 5-15 minutes per test. Default
is 360 minutes without this
…eenshot tests run more consistently (#80631)"

This reverts commit 93edfa1.
@LikeLakers2 LikeLakers2 changed the title [PORT] GitHub Workflow Parity [PORT] Bring our workflows into (mostly) parity with tgstation's workflows Jun 21, 2024
@LemonInTheDark
Copy link
Contributor

You should not port tgstation/tgstation#77773 it was removed after it served its use

also you should port autoslicing

@LikeLakers2
Copy link
Contributor Author

LikeLakers2 commented Jun 21, 2024

You should not port tgstation#77773 it was removed after it served its use

When I ported that PR, I also ported the PR that reverted it (tgstation/tgstation#78324).

To be clear: I did not look at the specific changes those two PRs made - I only noted that they seemed to touch different files. Thus, to avoid conflicts caused by not bringing those in, I opted to cherry-pick both.

That said, it seems I forgot to include tgstation/tgstation#78324 in the PR comment though - my bad. That's fixed now though.

also you should port autoslicing

See the comment below.

(Original response: I considered doing that, but alongside being unsure if we needed it, that port appears to need more attention than simply running git cherry-pick - in particular, I need to make sure they work properly with any modular icons we have too. Thus I left it out of this PR.)

@LikeLakers2
Copy link
Contributor Author

LikeLakers2 commented Jun 22, 2024

@LemonInTheDark It occurred to me that you may not be aware of monkestation's codebase structure, seeing as how you haven't contributed to this repo. If that's the case, then let me explain in more detail why I didn't include autoslicing:

Whenever we can, we try to put our own stuff in separate files from the tgstation codebase - for example, monkestation/code/ instead of code/. This way, we can have our own content and changes, without it intermingling or conflicting with tgstation stuff. This also makes it slightly easier to cherry-pick fixes from tgstation, because we don't have to figure out what code is ours and what code is tgstation's.

This also means we have our own icons - under monkestation/icons/ - which I'll need to ensure the autoslicer works upon. And so, because it requires that extra attention, I've opted to port it in its own separate PR, rather than delay the merging of these ports.

I hope that helps. :)

@LemonInTheDark
Copy link
Contributor

LemonInTheDark commented Jun 23, 2024

Heyo
I was mostly just ribbing you tbh. That seems reasonable. You'll have to add separate calls to the cutter in both the build.js and lint to ensure both the icons and monkeystation icons folders are covered (alongside modifying build.js to properly treat the monkeystation folder as a source for templates). (it would be easier if you had nested the monkeystations inside /icons but that bridge is far byond crossed by now)

@LikeLakers2
Copy link
Contributor Author

@LemonInTheDark Thanks for the info. :) If I need any help when I get to porting the icon-cutter, I'll be sure to let you know.

@dwasint dwasint merged commit 614dc06 into Monkestation:master Jun 26, 2024
28 checks passed
@LikeLakers2 LikeLakers2 deleted the ports/github-workflow-parity branch June 29, 2024 10:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants