Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature crypto backend2 #202

Closed
wants to merge 10 commits into from

Conversation

jvanasco
Copy link
Contributor

PR #196 got too out of date from the main branch, and there were too many messy merge-commits for tests to run; so this is just a manual diff to continue that branch/discussion and get tests running.

@jvanasco
Copy link
Contributor Author

This is an idea for conditional usage with the new/legacy concept as discussed in #196, while still making pyOpenSSL a conditional import. I think this should not break compatibility, but I need more time to think about it.

In this version:

  • .wrapped_new: Cryptography
  • .wrapped_legacy: PyOpenSSL or None
  • .wrapped: PyOpenSSL or Cryptography

All internal package functions reference .wrapped_new

On instantiation of ComparableX509 objects:

  • PyOpenSSL and Cryptography both accepted
  • PyOpenSSL always transcoded to Cryptography
  • Cryptography transcoded to PyOpenSSL if support is detected

What I don't like about this and misc concerns:

  1. Not sure what to do about def __getattr__, which I generally dislike, and seems prone to causing issues in any possible strategy.
  2. Although this drops pyOpenSSL to a legacy support mode, it likely needs to list it as a normal dependency to protect against the version conflict with Cryptography.

There are a few other things I was concerned with, but I have too much brain frog now. Maybe one of the Cryptography maintainers can chime in with comments/concerns, as I know they're focused on getting Cerbot off PyOpenSSL as well.

CHANGELOG.rst Outdated
migrate code to the Cryptography objects. This is offered as a minimally
breaking change to aid in migration to Cryptography. Affected projects should
either pin to `1.14.0` or utilize the new attribute in a "hotfix" release.
Please note, due to the removal of `X509_V_FLAG_NOTIFY_POLICY` in pyOpenSSL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a bit more? Where does X509_V_FLAG_NOTIFY_POLICY come into play? Not seeing it on a grep.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pretty smart developer on this project, @ohemorange, figured this out. Check out their comment in the other PR ;) : #196 (comment)

In terms of this causing issues...

This breaks the linting routine (the Python Tests for Josepy / external (pull_request) test), if it picks up a system pyOpenSSL that is incompatible. I got around that by setting a compatible pyopenssl as a dev/testing requirement - however (going back to my concern 2 above), even though the code does not require PyOpenSSL to operate, it would probably be wise to list the minimum compatible version as a regular dependency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, but you see that was last-year @ohemorange, a person that this-year @ohemorange retains no memory of.

I still don't see why there would be a version conflict though, what's causing the conflict? Is this just saying that there will be a version conflict only if they don't update to the new required version (or remove pyopenssl entirely)? Or is there a conflicting pyopenssl requirement somewhere else, that is has to be under some version? We usually just write required version updates as "josepy now requires project version x.x.x," see https://github.com/certbot/certbot/pull/10130/files for an ongoing example.

And as I mentioned in my other comment, it's probably easiest to remove the pyopenssl requirement after this PR in one fell swoop rather than making it conditional, especially if it's both a conditional requirement and a dev/testing requirement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see why there would be a version conflict though, what's causing the conflict?

The only time I've been able to trigger this is that test suite, and bunch of debugging/analysis suggests this is what happens:

  • PyOpenSSL is not required by josepy or any dependencies, so is not installed into the virtualenv
  • The linter sees the PyOpenSSL import, and loads the system installation of PyOpenSSL or something from the last cached download - which sometimes (but not always) is an earlier version.
  • The linter picks up the incompatibility,

I don't think this would be triggered in production code - I think it's just a happenstance of the linter.

And as I mentioned in my other comment, it's probably easiest to remove the pyopenssl requirement after this PR in one fell swoop rather than making it conditional, especially if it's both a conditional requirement and a dev/testing requirement.

Agreed. I'm going to migrate PyOpenSSL into a requirement, but not use it. Then it can be dropped on the next release. If someone is still using PyOpenSSL, forcing them to use a more modern version makes sense.

@ohemorange
Copy link
Contributor

Some thoughts:

Dropping pyopenssl is of course nice, but conditional dependencies are a pain. Probably easier to just keep it around for now and drop it on next release than do the legacy thing. I don't see the benefit of asserting the wrapped cert type based on what's installed.

Didn't realize that .wrapped was public, that's annoying. What's the benefit of having a public wrapped_legacy access? Won't those users just be able to call .wrapped? And why do we need to create a legacy version from a Cryptography.x509 input? It looks like it's really just needed for tests; I don't think anyone is going to start referencing that when they're about to move off of it.

Once pyopenssl is totally gone, we don't technically need util.ComparableX509 at all, though we could just keep it as a convenience wrapper for x509.Certificate and x509.CertificateSigningRequest. If we do plan to get rid of it entirely, then this update should change methods that take util.ComparableX509 in the signature to also take Cryptography.x509 types, so that users who want to start a conversion can do so in the PR. Are you explicitly proposing we should continue to keep util.ComparableX509 around going forward by not doing so?

If we do want to get rid of ComparableX509, looks like x509.Certificate and x509.CertificateSigningRequest don't have any common class they inherit from, so we'd end up defining the type in the signature as a union of those two anyway. In which case, we can either create a custom type that's a union of those two, or just stick a union in the signature. Either way, we could do that now and also include util.ComparableX509 in that union, and check the types within methods that take all three to see if we should be using it directly or accessing .wrapped.

@jvanasco
Copy link
Contributor Author

These are good ideas, and I agree. I should have time to implement them tomorrow.

In terms of the design motivations: Sometimes you spend so much time in the weeds, you forget to look at the sky.

And why do we need to create a legacy version from a Cryptography.x509 input?

I think this is what might be needed:

  • Drop .wrapped_legacy; .wrapped is the legacy and will be deprecated.
  • All internal functions use .wrapped_new; also accept a Cryptography.x509 object as well.
  • Input of Cryptography.x509 is only turned into a legacy object if .wrapped is accessed; this is the compatibility layer for any 3rd party libraries

@ohemorange
Copy link
Contributor

Thanks for that summary, that sounds mostly right to me.

Thinking about your third bullet raises an interesting question -- should self.wrapped be "always pyopenssl object no matter what was used to initialize the wrapper" or "always return whatever you put in there in the first place"? I was initially thinking the latter, but on further consideration I don't think that either direction is clear from the existing usages. I definitely know that I don't like the answer "it depends on whether or not you have pyopenssl installed."

@ohemorange
Copy link
Contributor

I feel like for migration purposes, the latter would be more useful, but might also be more in violation of backwards compatibility. Which may indicate a fundamental problem with this approach?

@ohemorange
Copy link
Contributor

Maybe we're overthinking this whole thing. Going back to basics, what we want here is for users to be able to use either pyopenssl or cryptography certs in their functions. If they're using cryptography, they don't actually need ComparableX509 at all, so maybe it doesn't even have to ever accept cryptography objects, and instead what we need to do is mark it as deprecated, and everywhere it's used elsewhere in the code, also accept cryptography objects directly.

@jvanasco
Copy link
Contributor Author

I think .wrapped should offer PyOpenSSL no matter what the object is constructed with - that preserves backwards compatibility. It can be lazily generated off of .wrapped_new on-demand. All internal functions will accept raw Cryptography.X509 objects OR ComparableX509 objects, and utilize the ComparableX509.wrapped_new if that's what they get.

The ComparableX509 object, .wrapped, and PyOpenSSL become officially deprecated and removed on the next release.

This would preserve the interface for existing usage, migrate all josepy/acme/cert operations off of OpenSSL, and give the upgrade path (raw Cryptography objects). It looks like the relevant Cryptography objects support eq through the rust bindings

@ohemorange
Copy link
Contributor

If ComparableX509 is removed on the next release, then the proper migration step for people to take would be to stop using it in this release (but also allow it to still be used in some places if needed). I don't see a benefit to switching your code to stick cryptography objects inside the deprecated ComparableX509 this release, then having to switch your code again to use cryptography objects directly next release. I really only see a reason to have ComparableX509 accept both objects if we are not in fact going to deprecate ComparableX509 this release and remove it next release. If you see otherwise, what would that migration process look like?

@jvanasco
Copy link
Contributor Author

I had a thoughtful and insightful reply, and lost it due to iOS switching tabs.

I hope to convince you with the next PR.

@jvanasco
Copy link
Contributor Author

Ok, here is the general idea:

ComparableX509 is deprecated, but still exists for migration. It will accept Cryptography and PyOpenSSL objects. If anything (third party) touches .wrapped or uses getattr it will use the PyOpenSSL object or transcode a new one from Cryptography.

All internal usage is 100% Cryptography

The only issues:

Breaking Change:

  • x5c headers decode to Cryptography; perhaps this should still be ComparableX509

Migration Concerns:

  • decode_cert and decode_csr still decode to legacy ComparableX509;
  • new functions decode_cert_cryptography and decode_csr_cryptography are added to return Cryptography objects
  • perhaps the original functions take a flag to return a Cryptography object, which is defaulted to False now but flips to True on the next release.

Aside from the headers, this is as minimally a breaking change I can think of that gets 100% of library operations off of PyOpenSSL, and will not break any integrations that just pass Cryptography objects around.

@ohemorange
Copy link
Contributor

ohemorange commented Jan 15, 2025

I really appreciate your efforts here, but I still think we can do be doing things way more minimally that allow for migration and are less prone to breakage. The key principles are

  1. switch to using cryptography methods inside of functions where possible now,
  2. if pyopenssl objects are returned, deprecate that function and create a parallel one, and call the new function internally, and
  3. if objects to be deprecated are in the inputs, including pyopenssl objects directly and ComparableX509, accept either the deprecated or cryptography objects. This last one I'm open to alternatives such as creating parallel functions as in 2.

This is how we've been doing it in the certbot code, and it allows for a smooth transition period where new functionality in added in a update, and removed in a major update, and also allows for small modular changes, which are both easier to review and (therefore) statistically less likely to lead to bugs.

By this logic, the updates inside of ComparableX509 are not necessary, and additionally this process makes it more clear what type of object should be returned or passed to any given function. I still don't see any need to modify ComparableX509 to accept other types, given that we're removing it in the next major release. People should only need to update their code a total of once to make it through this transition, and wrapping cryptography objects in ComparableX509 would be updating calling code twice.

Edit: I do realize that this plan is slightly less than ideal when it comes to the future names of methods; decode_cert is certainly a nicer name than decode_cert_cryptography, and the pattern in jws of having short header labels as function names is certainly nicer than adding an x5c_cryptography class variable, but imo it's a tradeoff between that, or just doing what we can internally now and not changing any headers then doing what #182 does and just dropping pyopenssl entirely from the API in one fell swoop in a follow-up PR.

Edit 2: I forgot about josepy fields and how things are done automagically here, I don't think we can in fact do something like x5c_cryptography, at least not without adding even more nonsense magic. Yet we still can't have x5c.decode return a different type of object right now without making a breaking change, that's backwards incompatible. And even if we stuck cryptography objects in ComparableX509, we'd still have to provide the existing .wrapped until making a breaking change. So again working from the principle of "do the thing such that people only need to make changes to any given code once," we really will just have to switch the return type in a future version. Unless, of course, we're going back to "keep ComparableX509 around forever," in which case adjust that statement to say "we'll have to change what wrapped returns in a future version." And I really do think it's simpler and cleaner to just have it switch to cryptography objects in a future release.

@ohemorange
Copy link
Contributor

Took a stab at the sort of smaller, more minimal change I'm talking about in #204, which is largely based on this PR, just pulling things out and changing them around to match the vision I described.

@ohemorange
Copy link
Contributor

After writing that PR, it occurs to me that maybe the reason you want to do the ComparableX509 change is to remove as much pyopenssl as possible immediately? For example, if encode_cert received a ComparableX509, it could access a ._wrapped_cryptography and operate on that for encoding. But in that case, I still don't think it makes sense for ComparableX509 to take cryptography objects as inputs, it could just convert internally.

@jvanasco
Copy link
Contributor Author

After writing that PR, it occurs to me that maybe the reason you want to do the ComparableX509 change is to remove as much pyopenssl as possible immediately?

That is a major factor. It's a mix of a few things; my two main rationales are this:

1- Drop PyOpenSSL immediately. PyOpenSSL is only used for transcoding to/from Cryptography, if needed.
All library operations are now done on:

  • raw Cryptography objects
  • the Cryptography object wrapped in a ComparableX509

2- There are two distinct populations who will be affected by this change - Package Maintainers that utilize josepy directly (e.g. certbot) and Tertiary Projects (for example: packages like acme, certbot plugins, or custom tools that are leveraging certbot or acme). Using the Certbot/Josepy/Acme integration an example - but let's pretend that it's not the same maintainers:

  • There is no need for Certbot to use PyOpenSSL internally, and it could immediately switch to using Cryptography instead of ComparableX509, except...
  • The Acme library utilizes a ComparableX509 extensively for inputs and outputs. The Acme library doesn't do much with this object, but it does create this API problem.

Given situations like this - where multiple packages are relying on this object and passing it between one another - some populations will have to handle this migration across multiple releases. Other populations will be blocked. Many people trying to migrate will likely get blocked due to things breaking against a random project, and need to wait until an Issue+PR is resolved.

In my example (again, pretend it's different teams), Certbot would not be able to migrate to Cryptography and would need to keep using PyOpenSSL simply to keep the ComparableX509 object around until the Acme library changes. If the ComparableX509 object accepts Cryptography objects though, Certbot can immediately drop PyOpenSSL and do everything in Cryptography without breaking any of the integrated packages. Because the .wrapped attribute is lazily transcoded, PyOpenSSL might never be invoked.

So my goal is to let whomever invokes Josepy directly the ability to drop PyOpenSSL immediately from their code, but not break any contracts with integrated packages that depend on it.

As someone with private code that integrates against joespy/certbot/acme, the only reason why I have PyOpenSSL in my stack right now is to maintain the ComparableX509 interface. IIRC, the functions I leverage from ACME don't touch the .wrapped pyopenssl attribute - but the API expects ComparableX509 and not x509.Certificate – so this becomes a blocker on my development roadmap. If ComparableX509 accepts a Cryptography object, then I can drop PyOpenSSL internally today – and not worry about the other packages down the line. I'll figure out what third party projects break, create or subscribe to a github issue for those projects to drop PyOpenSSL/ComparableX509, and in a few weeks or months I'll get a notification email about a merge that lets me finally drop ComparableX509.

IMHO, I think the best two options for Josepy are something like I propose in this PR to support the different types of users through a migration, or just go straight to an abrupt and completely breaking change with a 2.0 release that removes ComparableX509 and replaces every function (keeping the name) to use raw Cryptography objects for input and output. (I was low-key suggesting that should be the next release a PR like this; this PR would accept Comparable+Crypto, the next PR would drop the Comparable object; the _crypto decoders would become aliased to the existing versions as well).

People should only need to update their code a total of once to make it through this transition, and wrapping cryptography objects in ComparableX509 would be updating calling code twice.

For the reasons mentioned above, I don't think this is universally viable. I think some developers will be able to immediately migrate to raw Cryptography and should - but I think a lot of developers will be stuck with situations where some random dependency in their stack is requiring the ComparableX509 object... and that dependency is keeping them from migrating to Cryptography.

I guess in simpler terms, my vision is this:

v 1.15 - ComparableX509 accepts Cryptography and PyOpenSSL, all functions use Cryptography; PyOpenSSL only used for transcoding.
v 2.0 - Fully drop PyOpenSSL and ComparableX509

They could even be released on the same day. v1.15 is just the migration path to 2.0, so projects can immediately migrate to Cryptography without being blocked by third party dependencies.

Does that explanation make more sense?

@jvanasco
Copy link
Contributor Author

Also, this would be the migration path for Certbot - and what (IMHO) most developers would be experiencing due to third-party dependencies:

  • v1.15 released
  • Certbot migrates from CompatibleX509(PyOpenSSL) to CompatibleX509(Cryptography)
  • new Acme drops CompatibleX509 for Cryptography
  • Certbot migrates from CompatibleX509(Cryptography) to Cryptography

While it is a change over 2 cycles, the second change is minimal and would not require a coordinated timed release of Certbot+Josepy+Acme or create any blockages.

@ohemorange ohemorange changed the base branch from master to main January 16, 2025 19:15
@ohemorange
Copy link
Contributor

I really appreciate all the work you've put in here, but after considering the various options, a slower migration adds a lot of complexity and we don't expect there to be that much benefit so we're just going to go forward with more direct changes. To help people avoid breakage, I've opened #206 to issue deprecation warnings in affected places.

@ohemorange ohemorange closed this Jan 16, 2025
@jvanasco
Copy link
Contributor Author

No worries. As I’ve said before, I think the only two options are to either do a massive breaking change or offer a soft migration path before a massive breaking change.

While I see benefit to the latter for most users, my only concern is being able to drop pyopenssl asap.

@jvanasco jvanasco mentioned this pull request Jan 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants