As a developer, I want dependency updates #368
Feel free to edit the OP to replace
I'm going to start on this today, but given this is a short day for me, I don't anticipate finishing it. In fact, if we start bumping to major versions, I may not even come close to finishing it today. To begin, my plan is to do a minor version bump. I'll then take a look at what major versions there are to consider and decide which, if any, to tackle with this update. Hank
I've pulled the latest and am running unit tests. Hank
I looked at the past dependency update to 6.27, #340, and, searching a few libraries, it appears that, if a major version update is indicated by the dependency check, any intermediate minor updates are not reported. The results of the dependency check follow. Hank
Yeah, that is unfortunately a weakness of our dependency checker: it doesn't provide both the minor and major version options, so you do need to check for minor updates that are hidden by major ones.
Or just do the major update, but I would take major updates carefully, one at a time. Our automated tests are probably good enough to catch most issues, once it is compiling (and major upgrades mean it may not compile, of course), but there is an extra risk, especially when working in the COWRES components, tasker/worker/broker, as these are generally not very well covered by unit/integration tests.
Adding #100 to the notable tickets in the description. Hank
Thanks, James. I'll look into minor updates that are between the reported current and major. Hank
I completed the (what I hope are) low-hanging-fruit minor upgrades. I'm unit testing now. If that completes successfully, I'll attempt to deploy to the -dev COWRES. I just need to see what the current process is to deploy from the development area. Hank
I've tackled all of the low-hanging fruit, minor version upgrades. That is, those identified by the dependency check. All pass unit tests except for this one:
I took the above to mean that I need to update the line,
to reference 0.9.4. I made that change. When I did so, the build failed:
When I backed that one up to 0.9.1, I was able to build the code and the unit tests passed. So, referring to the dependency check shared in my previous comment, I've handled all of the minor updates directly reported by dependency check, except for that one. Hank

P.S. Note to self so I don't lose it by the time I return to this ticket next week... When deploying the locally built .zips, the following command should work:

./scripts/dockerize.sh 20241126-b735d54-dev 20241113-6f1b17d-dev 20241113-6f1b17d-dev 20241126-b735d54-dev 20241126-b735d54-dev 20241126-b735d54-dev 20241126-2da40a3-dev 20241126-2da40a3-dev
See #103. I had thought Evan addressed this recently, but I guess that was the protobuf dependencies themselves, not the gradle plugin.
Thanks, James. I'll take a look. I was able to deploy what I have to -dev COWRES and get a smoke test to pass. Step 1 of many steps. :) Hank
I read through #103. I also note #70. They appear to be related. I think what I'm seeing is that to update com.google.protobuf to 0.9.4, we need the general protobuf dependencies to go to 4.26.1 or later. Is that right? If so, I can try that as my first major update when I get started on them. Thanks, Hank
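For what it's worth, once I get to that bump, something like Gradle's built-in dependencyInsight task should confirm which protobuf runtime actually ends up on the classpath. The module and configuration names below are my guesses rather than anything specific to this repo.

```
# Show where protobuf-java comes from and which version wins after the bump.
# Run against whichever subproject declares it; names here are illustrative.
./gradlew dependencyInsight --dependency protobuf-java --configuration runtimeClasspath
```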
I do this so rarely that every time I update dependencies, I spend a lot of time reminding myself of the process. Hank
Probably, but we'll see. According to the ticket by the people that maintain the gradle dep, it is fixed, so I would browse that ticket w/r to versions. I do recall that they hadn't backported the fix.
Final update for today... I've updated the Dockerfiles; the Redhat UBI version changed. Once the minor changes are made, the next step will be to step through the major version updates one at a time and identify the ones that will cause problems for us. We'll then determine whether to work on each change soon/now, or push them off until later and find an interim minor version update (that is hidden by the major one pointed to by the dependency check).

After a discussion with Josh Walston during "office hours", we decided that we should deploy next week. He's waiting on a couple of features that we will be deploying in 6.28. We'll have to decide if any of these dependency updates should go out in that deployment. If so, then I'll have to push what I can on Tuesday when I return to work.

My day is done. Happy Thanksgiving! Hank
I've been on sick leave a couple of days, so I wasn't able to look at the CVEs. My hope is to work on that tomorrow. Today is likely going to be a partial day and I won't have the time available to look at this. Hank
Running the dependency check now. I need to take my kid to school. Back in a bit, Hank
I'm having NVD request failures so I bailed on the run:
I have the following configured:

```
dependencyCheck {
    nvd {
        apiKey = 'omitted'
    }
}
```

Hank
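For reference, the invocation is just the standard task named in the ticket description; --info is only there to get more detail on the NVD requests when they fail.

```
# Run the OWASP dependency check task; --info surfaces more detail on the NVD requests.
./gradlew dependencyCheckAnalyze --info
```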
Seems like the issue is with the NIST service, not the tool - it has been a consistent issue. Will probably 503 eventually:
I got a fresh NVD key just in case mine expired or something. I'm running it again. Without that service, I can only use the dependabot, right? Hank
I'm not currently aware of another good tool. There may be one. But, ultimately, it sounds like it isn't the tool, it's the underlying service (cannot say for sure in this instance, but that has been the case, and the above ticket points to it too).
Thanks. Well, I scanned the dependabot stuff, at least, through the "high" CVEs: https://github.com/NOAA-OWP/wres/security/dependabot

I already noted that the Redisson critical CVE is waiting on #100, with a comment in that ticket. However, the rest either appear to be false alarms (we are already using a new enough version that the CVE should not be an issue) or do not apply (they are libraries we are not referencing in build.gradle). Anyway, I'm not sure there is much more I can do for the CVEs unless someone has a better idea. Hank
Be careful, because transitive deps are still deps. That is why we override a bunch to remove CVEs in the build.gradle.
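One way to catch those transitive occurrences (just a generic sketch on my part; the configuration name and the grep target below are examples, not our exact setup):

```
# Dump the resolved runtime dependency tree for the relevant subproject and search
# for a flagged library so transitive pulls are not missed. Names are illustrative.
./gradlew -q dependencies --configuration runtimeClasspath | grep -i redisson
```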
I would also keep trying the dependency check.
Updated the checkboxes. Beyond the
I also updated
Oh, and, look at that, the Hank
No, I've made no changes to the Hank
I need to get Arvin some feedback on his GUI changes, so I'm taking a short break from this ticket. Should be back in a bit, Hank
I'm back. No closer to a solution. I double checked that nothing else was changed in the broker. I see no reason for an image built today to fail when brought up in a container when an image built two months ago comes up just fine. Hank
I pushed my changes to branch 368_dependency_updates. I've asked Evan to pull down that branch and see if he encounters the same problem I'm encountering with building and deploying the broker image, in particular. I can build and deploy the other images just fine, and can confirm they work successfully when I use the 6.27 broker image, but, as soon as I use a broker image built today (with no changes in the broker itself), nothing runs. Thanks, Hank

(NOTE to self: The link to create a pull request later is https://github.com/NOAA-OWP/wres/pull/new/368_dependency_updates. So I don't have to fumble around for it later.)
Evan recommended trying to build the broker image on the -dev03 machine. I did that, and the problem persisted. He also recommended rebuilding a clean version of the 6.27 images to see what happens. I might try that tomorrow. Hank
OK, so I've been looking through this, and here is some stuff I have noticed. First off, I have no clue why using an older version of just the broker seems to fix things, other than maybe something to do with Docker's cache. Looking at the containers, it seems like the broker actually starts successfully.
It seems like this is an issue with the worker and tasker talking to the broker, and not an isolated issue with the broker starting up or something like that. I think that means that changes made to the tasker/worker should be scrutinized as well. Once again, I am unsure why using an older broker version only would change this.
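If it helps narrow it down, the TLS handshake the worker/tasker attempts can also be reproduced by hand from outside the application. This is only a sketch; the host, port, and file names below are placeholders rather than our actual deployment values.

```
# Attempt a TLS handshake against the broker using the tasker's client cert and key,
# to see the alert/failure outside the application. All values are placeholders.
openssl s_client -connect <broker-host>:5671 \
    -cert tasker_client.crt -key tasker_client.key -CAfile ca_bundle.crt
```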
Okay, I was able to reproduce what Hank was able to. Deploying the old version of the broker has everything working fine, and using the new version leads to the above posted error. Here is a potential thread to pull on (though, if this is the case, then it reflects really bad practice): with the above being true, SOMETHING must have changed in the broker image between the last deploy and the current deploy. Out of interest, I decided to do some digging, and I saw that the image we use was "Last pushed" 3 days ago.

I also decided to look into why this may have been, and it looks like there is some automated version bumping happening on official docker images. What was bumped was something related to Erlang in rabbitmq, which is where the SSL error message we are getting originates from: https://www.erlang.org/doc/apps/ssl/ssl.html

Now this is my theory: the automated dependency check saw that it was a patch change (I think; I have never seen a 4-decimal version haha) and therefore just raw pushed the changes, and the same goes for whoever updated the docker registry.

If I am correct, then it seems like there was a small change to the underlying erlang logic of the rabbitmq version we use. The bump got pushed all the way up to the official docker registry without triggering any flags for a version change, when in reality it contains changes that interact poorly with how we have things set up. That at least is my best (and really only) thought on what is causing things. If this is true, then re-creating the broker image with the old code we have will be breaking.
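If we want to guard against this kind of in-place tag update in the future, one option (just a sketch; the tag below is an example, not necessarily the one our Dockerfile actually uses) is to record the digest of a known-good image and pin the Dockerfile to it, e.g. FROM rabbitmq@sha256:<digest>.

```
# Capture the digest of the currently working image so the Dockerfile can pin to it.
# The tag here is illustrative, not necessarily the one we use.
docker pull rabbitmq:3-management
docker inspect --format '{{index .RepoDigests 0}}' rabbitmq:3-management
```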
Sounds right; this is exactly what I speculated above. If you read the message, it just sounds like an additional check related to the key usage extensions, which suggests our cert is missing something related to this. That said, this sounds like very dodgy practice indeed, updating existing images in place, if I understood correctly. Anyway, we'll need to decipher the message in full and then update our cert.
The OTP version went from 26.2.5.5 to 26.2.5.6. Information on that patch is here, including release notes which can be downloaded: https://www.erlang.org/patches/otp-26.2.5.6

I'm scanning the release notes now, Hank
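(Side note: if anyone wants to double-check what actually shipped in a given broker container, something like the following should print the running Erlang/OTP version; the container name is a placeholder.)

```
# Print the Erlang/OTP version inside the running broker container.
# The container name is a placeholder; rabbitmq-diagnostics ships with the official image.
docker exec <broker-container> rabbitmq-diagnostics erlang_version
```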
Probably one of these two, likely the latter:
Hank
What has probably happened here is that you have created a server cert without any constraints on extended key usage. As far as I recall, the instructions were to constrain these uses. For example, for the eventsbroker, I use this:
Note the extended key usage in particular.
I posted this comment earlier to the wrong ticket, https://github.com/NOAA-OWP/wres-gui/issues/25. I'm going to repost it here... Oh, and I was able to deploy a version of the broker with SSL logging jacked up, but it didn't help a whole lot with understanding why the certificate is unsupported. Restating the complete message here:
Now for the new content... These are the key usages for the tasker client cert:
So that "TLS Web Server Authentication" key usage would appear to be invalid. What makes it invalid? Looking, Hank
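(For reference, these fields can be printed straight off a cert; the file name here is a placeholder, not the actual path on our systems.)

```
# Print the key usage and extended key usage extensions from a certificate.
# The file name is a placeholder.
openssl x509 -in tasker_client_cert.pem -noout -text | grep -A1 -E "Key Usage"
```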
Here are the key usages that are supposed to be used when the cert is created:
Where is "TLS Web Server Authentication" coming from based on that list? Note that this comes from the Updating Certificates wiki in VLab. Is this certificate consistent with how other certificates have been defined? Let me check that real quick, Hank
When the CA signs a cert, they can change anything they want about it, so this may be an IT issue and not us.
Yes, that is consistent with how the certs are defined in staging. However, it's not consistent with how we used to define the certs many moons ago. I found some certs in -dev from a long time ago (2021?). For the tasker client cert, I see this:
The extended key usage is client authentication. So the question is, where is the server authentication extended key usage coming from, given we ask for client authentication in the CSR? Hank
Server authentication doesn't sound like a valid use for a client cert. What happened here is that either the CA incorrectly overrode the extended key usage, or the wrong source file was used; either way, it should be client auth for a client cert.
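Purely to illustrate what I mean by constraining it (this is not our wiki's recipe; the subject and file names are made up): a CSR can carry the client-auth extended key usage in its extension request, so the signed cert should not claim serverAuth unless the CA overrides it.

```
# Illustrative only: request clientAuth as the extended key usage in the CSR.
# Subject and file names are made up; requires OpenSSL 1.1.1+ for -addext.
openssl req -new -newkey rsa:4096 -nodes \
    -keyout example_client.key -out example_client.csr \
    -subj "/CN=example-tasker-client" \
    -addext "keyUsage = digitalSignature, keyEncipherment" \
    -addext "extendedKeyUsage = clientAuth"
```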
Oh. When we sign the cert using the web interface, we select "Web Server". What are the other options? Hank
Or that. Client and server certs are obvs not the same thing, and I think you are saying that the web interface for generating the CSR only supports server certs...
Yes, they specifically told me to select web server. They also added web client and server for us for our 2-way auth certs.
Sounds like we may need to ask for a client-only option. Regardless, 6.28 cannot be deployed until we have a fix in place for this, because our certs are invalid (if that's the right word) and RabbitMQ is now calling us out on it. I'll create a Jira ticket, Hank
ITSG-4052 has been reported.

Evan: Please feel free to add or correct any of the details I provided in the ticket. Also, thank you for spotting the update to the image, which was done in place and which added the checks that exposed our invalid certificates.

All: My minor dependency updates are still sitting in the branch. I don't think I can merge the changes, because I cannot fully check that they work in -dev. They appear to work when using the 6.27 broker image, but nothing is running when I use a freshly built broker image. I'm hoping one of you tells me not to worry about it and just merge, because I hate having that branch sitting off to the side for potentially multiple days (I'm out tomorrow because of window replacements). Hank
We don't normally deploy to check dependency updates, FWIW. I mean, any problem is ultimately exposed during an official deployment cycle, and that has always been the case. There is no special requirement to deploy in order to verify a software change, although it can certainly make sense sometimes. This is definitely a case of personal prerogative. Feel free to merge, if you want. For major updates to cluster dependencies, it probably does make sense to deploy to -dev before merging in, preempting the deployment to -ti for an official deploy UAT.
Sounds good to me. :) Before I merge the branch, I need to back out some changes. I need to step away for a few minutes, but will handle that when I return. Thanks, Hank
I backed out those changes. The pull request has been made and I'm waiting for the build checks to complete before merging, Hank
Pull request handled and the changes were merged. Hank
Next up is deploying 6.28... which unfortunately needs to wait until we can get the certs fixed. Hank
An update... ITSG pointed to the web client and server option as one to use. I used it, and, yes, I was able to deploy to -dev and post a smoke test. However, I'm not comfortable with our client-authentication-only certs having an extended key usage indicating that they can be used for server authentication as well. So I asked that they still provide a client-authentication-only option. We are going to have to redo the tasker and worker client certs in all environments, and I'd rather only do that once. Hence, I'm going to wait to see if they will provide the client-only option before deploying 6.28, which now looks like it will happen Thursday or Friday. Hank
Known Issues:
#68
#100

CVEs flagged by the GitHub bot, which we can use alongside the gradle task dependencyCheckAnalyze:
https://github.com/NOAA-OWP/wres/security/dependabot