Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pyatls: use urllib3 v2 #12

Merged
merged 11 commits into from
Sep 1, 2023
Merged

pyatls: use urllib3 v2 #12

merged 11 commits into from
Sep 1, 2023

Conversation

HernanGatta
Copy link
Contributor

@HernanGatta HernanGatta commented Aug 31, 2023

Overview

There are multiple mutually-incompatible HTTP libraries for Python: the built-in http library, urllib3, and requests.

These are layered in that requests uses urllib3 which in turn uses http. These libraries provide similarly or identically named classes to those in the lower layers but have different interfaces and exhibit different behavior, depending on usage.

As a result, care must be taken to not mix them. Additionally, the general consensus in the Python world is to use urllib3 because of its superior feature set, such as built-in support for connection pooling.

Note: The individual commits are meant to be reviewed individually.

Changes

Previously, the aTLS library used the http library directly, provided a shim to plug it into urllib3, and provided an aTLS adapter for the requests library directly atop the http-based implementation.

Additionally, the library used urllib3 v1.x, which is being deprecated in favor of v2.x, which is not API-compatible with v1.x. It follows that currently the aTLS library only works when v1.x is in the environment, though in a clean environment pip would install the v2.x series. Pinning urllib3 to < 2.0.0 would be less than ideal as its users move on.

Lastly, the current implementation of the aTLS library is not compatible with Python 3.8 due to its usage of 3.10 syntax.

This PR fixes all of these problems.

Tested

Against an Opaque Prompts instance in confidential ACI:

[X] Clean Python 3.8 environment
[X] Idem, Python 3.10
[X] Idem, Python 3.11

@HernanGatta HernanGatta added the fix Bug fix for the user, not a fix to a build script label Aug 31, 2023
@HernanGatta HernanGatta self-assigned this Aug 31, 2023
@github-actions
Copy link

🦙 MegaLinter status: ✅ SUCCESS

Descriptor Linter Files Fixed Errors Elapsed time
✅ ACTION actionlint 1 0 0.01s
✅ PYTHON black 18 0 0.8s
✅ PYTHON flake8 18 0 0.44s
✅ PYTHON isort 18 0 0.24s
✅ YAML yamllint 5 0 0.25s

See detailed report in MegaLinter reports

You could have the same capabilities but better runtime performances if you use a MegaLinter flavor:

MegaLinter is graciously provided by OX Security

@HernanGatta HernanGatta changed the title pyatls: use urllib3 pyatls: use urllib3 v2 Aug 31, 2023
Copy link

@zizhong zizhong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG!

@@ -1,4 +1,5 @@
cryptography==41.0.3
PyJWT==2.8.0
pyOpenSSL==23.2.0
urllib3==1.26.16
requests==2.31.0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The requests library is not marked as a dependency of this package

do we need it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it depends on what we use this file for. If it's for development dependencies, then yes; if it's for general dependencies, then I'd say no.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should rename it to requirements-dev.txt.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine to have it here.

Copy link
Collaborator

@ggroode ggroode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this change, code itself looks good just have a couple comments on conventions and some values we are importing.

@@ -1,4 +1,5 @@
cryptography==41.0.3
PyJWT==2.8.0
pyOpenSSL==23.2.0
urllib3==1.26.16
requests==2.31.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine to have it here.

Comment on lines +5 to +6
from urllib3.util.connection import _TYPE_SOCKET_OPTIONS
from urllib3.util.timeout import _DEFAULT_TIMEOUT, _TYPE_TIMEOUT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge deal, but importing variables starting with _ feels a bit weird since they are the python equivalent of private variables and methods (and therefore I am not certain if they are protected from changing / breaking between minor version bumps unlike public methods).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't like it either. The alternative though is duplicating them, which felt nastier when I was considering what to do.

Thinking about it more now, maybe I should duplicate them since these end up being part of the aTLS package's public API.

from urllib3.util.timeout import _DEFAULT_TIMEOUT, _TYPE_TIMEOUT


class _HTTPAConnectionShim(HTTPAConnection):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason this file and class have an _ in front, because my understanding is that means you don't intend to import it anywhere else, but you do end up importing it in other files (I also have never seen a python file that starts with an _, but that might be a convention I am not aware of).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the file name, I saw it in the requests library:
https://github.com/psf/requests/blob/main/src/requests/_internal_utils.py

As for the class name, the intention is to mark it as internal to the library, but still available to the library itself. Some tools (like VS Code) automatically hide modules (.py files) and types that start with an underscore.

I suppose that if the file starts with an underscore, then there is no point to also the class starting with one.

I can leave the underscore in the file and remove it from the class. Thoughts?

@HernanGatta HernanGatta merged commit 7063347 into dev Sep 1, 2023
1 check passed
@HernanGatta HernanGatta deleted the hegatta/urllib3v2 branch September 1, 2023 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Bug fix for the user, not a fix to a build script
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants