Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically update our dependencies using dependabot #1012

Closed
2 of 3 tasks
sgibson91 opened this issue Feb 21, 2022 · 31 comments
Closed
2 of 3 tasks

Automatically update our dependencies using dependabot #1012

sgibson91 opened this issue Feb 21, 2022 · 31 comments
Assignees

Comments

@sgibson91
Copy link
Member

sgibson91 commented Feb 21, 2022

Description of problem and opportunity to address it

Context to understand the problem

We maintain a lot of tools in this repository that have dependencies. Remembering to update them for security or bugfix reasons is often not at the top of our to-do lists and only gets there as a result of a wider problem.

Proposed solution

Dependabot is a tool (supported by GitHub) that can watch select package ecosystems for upgrades to listed dependencies and generate PRs to bump them over a given frequency. The generated PRs come with changelogs and can be managed using a range of @dependabot commands (that are also explained in the PR) and we can choose to ignore a given dependency bump if needed.

What's the value and who would benefit

We would no longer need to "remember" to make PRs like this #1005 while still retaining flexibility on when we upgrade.

Dependabot also tracks more than just Python dependencies! It can also bump GitHub Action versions, terraform modules, and Docker base images https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/configuration-options-for-dependency-updates#package-ecosystem

Implementation guide and constraints

Folk are concerned that dependabot is noisy and so we should phase the rollout of dependabot managing our dependencies so we don't get an influx of of Pull Requests.

  • @sgibson91 will create a PR that will enable dependabot update only for the pip dependencies in our docs folder (since those are regularly built/tested by RTD)
    • Ensure frequency is set to weekly
    • Ignore patch updates to further reduce noise
  • We monitor the PRs and see how noisy we feel dependabot is being
  • Decide whether to apply this to Terraform as well

Updates and ongoing work

No response

@choldgraf
Copy link
Member

I'm +1 on this in theory, but have one concern to figure out: my experience with dependabot in other repositories is that it tends to generate a decent amount of noise. Maybe this is less of a problem for us because we don't have a ton of dependencies, but I've seen a lot of repositories with large numbers of un-merged dependabot PRs for patch versions and such. To that extent, what if we also adopt a practice like:

Any dependabot PR that bumps a minor or patch version should be merged quickly by any team member, so long as the tests are passing. For major version bumps, we should notify the @2i2c-org/tech-team and give a day or two for others to respond before merging.

This will require to believe in our testing infrastructure, but I think that this is a target we should be shooting for anyway).

This also reminds me I need to figure out that "PR is open, somebody merge it" alert bot again :-)

@sgibson91
Copy link
Member Author

You can set the frequency for when it checks. In my projects I set it to check weekly to reduce noise. But you could set it to run on a specific day of the week if you wanted. https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/configuration-options-for-dependency-updates#scheduleinterval

There is also an ignore.update-types key that can be used to ignore patch or minor bumps if that's what we would prefer. https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/configuration-options-for-dependency-updates#specifying-dependencies-and-versions-to-ignore

@choldgraf
Copy link
Member

I am +1 on trying this out and seeing how it goes. If it feels noisy we can revisit it later on.

@sgibson91
Copy link
Member Author

Another thing we can do is phase in dependabot for different ecosystems, since it will always be noisy when you first add it. So an example strategy could look like:

  • Add dependabot config that just looks at Python deps
  • Wait for Python deps to settle
  • Add dependabot config that just looks at GitHub Actions deps
  • Wait for GHA deps to settle
  • etc...

@damianavila
Copy link
Contributor

Let's experiment in the iterative fashion you proposed above...
I have also experienced really noisy behavior in the past and I am usually reluctant to use it since then 😛

@choldgraf
Copy link
Member

For what it's worth, the places where I've seen dependabot be the most annoying are in JavaScript worlds where there are seemingly 10,000 dependencies for every project, and they all make patch releases every 15 minutes. 😅 hopefully we wouldn't have as much of an issue with that here

@sgibson91
Copy link
Member Author

Ok, I will make a PR to enable dependabot just for the Python dependencies in our docs folder later today. Why the docs folder? Our docs get tested/built regularly by readthedocs so we should quickly know if we run into issues.

@sgibson91
Copy link
Member Author

Sorry, I ended up not having time to make the PR yesterday, but it's here now! #1022

After we merge it, I recommend we monitor for 4 weeks (so 4 runs of the bot) and then reassess.

@sgibson91
Copy link
Member Author

Adding a note here to remind myself that this should be documented here too https://team-compass.2i2c.org/en/latest/practices/styleguide.html (And should maybe be renamed when that change happens)

@sgibson91
Copy link
Member Author

Just a note that I don't think we got any PRs today because we don't actually pin anything in docs/requirements.txt. pip fetches the latest version by default, so to dependabot there's nothing to update.

@choldgraf
Copy link
Member

yeah I realized that in another project as well - so we should probably use a different requirements file to test this out with, I don't think dependabot will ever open a PR against a requirements file with no pins right?

@sgibson91
Copy link
Member Author

sgibson91 commented Feb 28, 2022

I think that is true, yeah. Or instead of using a requirements file, we could enable updates for GitHub Actions instead since those are pinned.

@choldgraf
Copy link
Member

either works for me - thus far, for Python packages I haven't found dependabot to be annoying, especially if you don't ask it to make PRs for patch versions.

@sgibson91
Copy link
Member Author

I opened a new PR to swap to GitHub Actions #1037

@damianavila
Copy link
Contributor

Just a note that I don't think we got any PRs today because we don't actually pin anything in docs/requirements.txt

Wondering if we should pin those dependencies 🤔.

@sgibson91
Copy link
Member Author

Wondering if we should pin those dependencies 🤔.

I'd be plus one on that. Better to know when your deps are changing than letting them define themselves. Even if it's "only" docs.

@sgibson91
Copy link
Member Author

I realised I had let this go a little bit stale, so here's another PR to start tracking our pip dependencies #1332

@choldgraf
Copy link
Member

Update

We discussed this in a meeting today:

  • This is generally implemented, but isn't that active because we don't pin many dependencies
  • We need to decide whether we want to also apply this to terraform, but there was concern about this because we don't auto-run terraform as part of CI/CD so it might be hard to debug

updated top comment to reflect this

@sgibson91
Copy link
Member Author

Topics to discuss to follow-up this issue:

  • Dependabot for Python is quiet because we don't pin a lot - should we?
  • We can also bump terraform modules, but that often comes with issues around deprecated attributes, etc. We could do one of the following:
    • Implement terraform module bumps with dependabot and define a process where an engineer is assigned to review the PR making sure nothing like deprecated attributes would break it on merge. Either open an issue or implement a fix if it would.
    • Do nothing.

@yuvipanda
Copy link
Member

With terraform, if we bump we'd also have to terraform apply across all our workspaces. So my inclination is to not.

@sgibson91
Copy link
Member Author

Agreed, but we can also run into unforeseen toil if we have to do this the next time we touch the terraform config. So one thing we could do is, from a project management POV, always allocate extra time to a task that touches terraform in case such an issue crops up. (People are always mad you're overdue but happy that you're early!)

@damianavila
Copy link
Contributor

So one thing we could do is, from a project management POV, always allocate extra time to a task that touches terraform in case such an issue crops up.

Is there any "silly" task we might run repeatedly or are you thinking about proactively selecting terraform-related issues from the backlog?

@sgibson91
Copy link
Member Author

sgibson91 commented Jun 24, 2022

@damianavila Here's a scenario I'm imagining:

  • There's a task on our backlog that involves adding some "feature" to our terraform config
  • In the process of adding said feature, some terraform module needs bumping and that causes issues, like attributes being deprecated/changed
  • This results in a new PR that is not addressing the original task, but instead fixing what the module bump broke, and also has to be applied across all clusters
  • This is extra toil that we cannot reliably forecast when we scoped/assigned the original task - you don't know it's coming until you bump the module, it could not happen
  • So when assigning a terraform-related task to someone we give them, say, one week extra than we scoped the task for just in case the above scenario arises

So not proactively selecting terraform tasks from the backlog per se and it's not even that this task would crop up all the time, but when we do select terraform tasks, give the engineer a little more time than we expect them to need

@damianavila
Copy link
Contributor

but when we do select terraform tasks, give the engineer a little more time than we expect them to need

Makes sense to me!

@sgibson91
Copy link
Member Author

Dependabot for Python is quiet because we don't pin a lot - should we?

I think we can close this issue once we reach consensus on the above.

@damianavila
Copy link
Contributor

I think we can close this issue once we reach consensus on the above.

I am usually in favor/biased towards pinning requirements.

@consideRatio
Copy link
Contributor

I've not read through this issue, but here are three strategies for three situations that I've come to appreciate:

  1. dependabot for bumping referenced github actions' versions in a repo's .github/workflows
  2. pre-commit.ci for bumping .pre-commit-config.yaml referenced pre-commit hooks
  3. using a github workflow/job to re-freezing pip installable dependencies for docker images in a requirements.txt file, based on a requirements.in file, using the tool pip-compile

requirements.txt pinning?

Based on a comment in chat I think it could be relevant to also to provide input if we should pin requirements.txt in this repo - it is used as a declaration of the python dependencies needed for the deployer script.

I think this is reasonable at least if we have experienced or expect to see the CI/CD pipeline crash due to bugs in the dependencies over time. If so, then I think a good approach is to use requirements.in + requirements.txt and refreeze requirements.txt in a way that z2jh's does: using pip-compile run from within a docker container. See this logic and the README.md in this folder.

@sgibson91
Copy link
Member Author

We already have dependabot setup to watch requirements.txt files, it just wasn't generating any PRs since none of the packages listed in our requirements files are pinned:

# Maintain dependencies for pip
- package-ecosystem: "pip"
directory: "/"
schedule:
# Check for updates once a week. By default, this check happens on a Monday.
interval: "weekly"
ignore:
# Ignore patch upgrades to further reduce noise
- update-types: ["version-update:semver-patch"]
dependency-name: "*" # Match all packages
reviewers:
- "2i2c-org/tech-team"

Given #1559, I just went ahead and pinned stuff: #1561

@GeorgianaElena
Copy link
Member

Given @damianavila's comment in #1583 (comment), and the fact that there are other pkgs in the hub image apart from oauthenticator for which we install an unreleased version, I was wondering if we could setup depandabot for that Dockerfile too?

@sgibson91, I believe you mentioned some time ago automating these upgrades, but I don't remember the context exactly and I'm trying to figure out if I should open a new issue to track this?

@sgibson91
Copy link
Member Author

I don't think dependabot will be able to bump OAuthenticator in the Dockerfile. dependabot only really bumps the base image, i.e., FROM image:tag-to-bump. I also don't think my actions can bump things in Dockerfiles either, only YAML configs, so I was probably referring to bumping the image referenced here in basehub's values:

image:
name: quay.io/2i2c/pilot-hub
tag: "0.0.1-n3784.hc740a271"

@sgibson91
Copy link
Member Author

@GeorgianaElena I'm going to recommend a new issue for further discussion around your topic and close this one given that #1561 was merged. We can iterate in a new issue on dependabot/pinning generally if folk think it is needed.

Repository owner moved this from In progress to Complete in DEPRECATED Engineering and Product Backlog Aug 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

No branches or pull requests

6 participants