AttributeError: module 'urllib.request' has no attribute 'HTTPSHandler' when using astropy #33

Closed
ManonMarchand opened this issue May 12, 2023 · 17 comments


@ManonMarchand

Hello and thanks for this library!

I was unsure where to post this issue, but I'm wondering why pyodide-http does not work with astropy.

Here is a minimal non-working example :

# apply the pyodide-http patches as described in the README
import pyodide_http
pyodide_http.patch_all()

from astropy.coordinates import SkyCoord
SkyCoord.from_name("Crab Nebula")

In jupyterlite, the output is like this :

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[4], line 1
----> 1 SkyCoord.from_name("M1")

File /lib/python3.10/site-packages/astropy/coordinates/sky_coordinate.py:2218, in SkyCoord.from_name(cls, name, frame, parse, cache)
   2183 """
   2184 Given a name, query the CDS name resolver to attempt to retrieve
   2185 coordinate information for that object. The search database, sesame
   (...)
   2213     Instance of the SkyCoord class.
   2214 """
   2216 from .name_resolve import get_icrs_coordinates
-> 2218 icrs_coord = get_icrs_coordinates(name, parse, cache=cache)
   2219 icrs_sky_coord = cls(icrs_coord)
   2220 if frame in ("icrs", icrs_coord.__class__):

File /lib/python3.10/site-packages/astropy/coordinates/name_resolve.py:170, in get_icrs_coordinates(name, parse, cache)
    167 for url in urls:
    168     try:
    169         resp_data = get_file_contents(
--> 170             download_file(url, cache=cache, show_progress=False)
    171         )
    172         break
    173     except urllib.error.URLError as e:

File /lib/python3.10/site-packages/astropy/utils/data.py:1509, in download_file(remote_url, cache, show_progress, timeout, sources, pkgname, http_headers, ssl_context, allow_insecure)
   1507 for source_url in sources:
   1508     try:
-> 1509         f_name = _download_file_from_source(
   1510             source_url,
   1511             timeout=timeout,
   1512             show_progress=show_progress,
   1513             cache=cache,
   1514             remote_url=remote_url,
   1515             pkgname=pkgname,
   1516             http_headers=http_headers,
   1517             ssl_context=ssl_context,
   1518             allow_insecure=allow_insecure,
   1519         )
   1520         # Success!
   1521         break

File /lib/python3.10/site-packages/astropy/utils/data.py:1293, in _download_file_from_source(source_url, show_progress, timeout, remote_url, cache, pkgname, http_headers, ftp_tls, ssl_context, allow_insecure)
   1290         else:
   1291             raise
-> 1293 with _try_url_open(
   1294     source_url,
   1295     timeout=timeout,
   1296     http_headers=http_headers,
   1297     ftp_tls=ftp_tls,
   1298     ssl_context=ssl_context,
   1299     allow_insecure=allow_insecure,
   1300 ) as remote:
   1301     info = remote.info()
   1302     try:

File /lib/python3.10/site-packages/astropy/utils/data.py:1205, in _try_url_open(source_url, timeout, http_headers, ftp_tls, ssl_context, allow_insecure)
   1201 # Always try first with a secure connection
   1202 # _build_urlopener uses lru_cache, so the ssl_context argument must be
   1203 # converted to a hashshable type (a set of 2-tuples)
   1204 ssl_context = frozenset(ssl_context.items() if ssl_context else [])
-> 1205 urlopener = _build_urlopener(
   1206     ftp_tls=ftp_tls, ssl_context=ssl_context, allow_insecure=False
   1207 )
   1208 req = urllib.request.Request(source_url, headers=http_headers)
   1210 try:

File /lib/python3.10/site-packages/astropy/utils/data.py:1179, in _build_urlopener(ftp_tls, ssl_context, allow_insecure)
   1176 if cert_chain:
   1177     ssl_context.load_cert_chain(**cert_chain)
-> 1179 https_handler = urllib.request.HTTPSHandler(context=ssl_context)
   1181 if ftp_tls:
   1182     urlopener = urllib.request.build_opener(_FTPTLSHandler(), https_handler)

AttributeError: module 'urllib.request' has no attribute 'HTTPSHandler'

You can have a look at it there in the notebook 04-sesame.ipynb :

https://cds-astro.github.io/jupyterlite/lab/index.html

From there, what I understand is that maybe urllib needs more patching in order to work with astropy? Or is this an issue that I should rather report on the astropy side?

Thanks again!

(PS: the example uses a really cool function that outputs the coordinates of any objects for any of their registered names or designations :) )

@koenvo
Owner

koenvo commented May 12, 2023

Thanks for the kind words!

Your example makes it easier to find the problem. pyodide-http does not patch urllib.request.HTTPSHandler at the moment.

Let me figure out if we can patch it and ignore the context argument ( https://github.com/astropy/astropy/blob/cc73b24619ce37f2af26a0140bbdda8015ac8265/astropy/utils/data.py#LL1170C49-L1170C56 )

@ManonMarchand
Author

ManonMarchand commented May 12, 2023

That would be super cool. Do you need help?

Also, the example https://github.com/koenvo/pyodide-http/blob/main/examples/pyvo.html returns the same error because pyvo is using astropy.

@koenvo
Owner

koenvo commented May 12, 2023

Still trying to reproduce the issue. The example pyvo.html needs a small change, since it requires pyodide-http>=0.2.1 to work in Firefox/Safari, but otherwise it works fine here.

Could it be a different pyodide version? When loading it shows version pyodide-0.22.1.

When I try this code, it also works fine in Chrome, Firefox and Safari:

<html>
    <head>
        <link rel="stylesheet" href="https://pyscript.net/latest/pyscript.css" />
        <script defer src="https://pyscript.net/latest/pyscript.js"></script>
    </head>
    <body>
    <py-config>
        packages = ["ssl", "pyodide-http>=0.2.1", "astropy"]
    </py-config>

    <py-script>
        import pyodide_http
        pyodide_http.patch_all()

        from astropy.coordinates import SkyCoord
        res = SkyCoord.from_name("Crab Nebula")

        print(res)
    </py-script>
    </body>
</html>

output

<SkyCoord (ICRS): (ra, dec) in deg
    (83.6287, 22.0147)>

@ManonMarchand
Author

ManonMarchand commented May 12, 2023

The example works for me too now with pyscript ✨ 🦀 ⭐

Then the issue might be more on jupyterlite side?
@jtpio sorry to tag you but do you know what's happening?

On the question of the pyodide version, in the cds-astro/jupyterlite there is the jupyterlite-pyodide-kernel v0.0.8. It looks like they are using pyodide 0.23.2 ? :
https://github.com/jupyterlite/pyodide-kernel

@jtpio

jtpio commented May 12, 2023

Thanks both for looking into this!

On the question of the pyodide version, in the cds-astro/jupyterlite there is the jupyterlite-pyodide-kernel v0.0.8. It looks like they are using pyodide 0.23.2 ? :

Right, jupyterlite-pyodide-kernel uses the latest stable release of Pyodide which is currently 0.23.2.

So it could indeed be related to the Pyodide version.

@jtpio

jtpio commented May 12, 2023

Just checked with the Pyodide console directly and it is giving the same error. Although an extra micropip.install("ssl") seems to be required in the console: https://pyodide.org/en/stable/console.html

image

The console also runs 0.23.2:

image

@rth
Contributor

rth commented May 13, 2023

I think requests 2.30.0 that was released on May 3 broke pyodide-http monkeypatching. Using a previous version works.

>>> import micropip
>>> await micropip.install(["requests==2.29.0", "ssl", "pyodide-http>=0.2.1", "astropy"])
>>> import pyodide_http; pyodide_http.patch_all()
>>> from astropy.coordinates import SkyCoord
>>> res = SkyCoord.from_name("Crab Nebula")
>>> res
<SkyCoord (ICRS): (ra, dec) in deg
    (83.6287, 22.0147)>

So there should probably be a range of compatible requests versions specified with a given pyodide-http version?

@koenvo
Owner

koenvo commented May 13, 2023

Hmm I tried to use requests 2.30.0 from the examples/pyvo.html and that works.

Pyodide 0.22.1 includes Python 3.10 and Pyodide 0.23.2 includes Python 3.11. It also seems that astropy doesn't use the requests library, right?

I suspect it has to do with the different Python version rather than a different requests version.

[edit]
The AttributeError: module 'urllib.request' has no attribute 'HTTPSHandler' exception seems to be related to the ssl package and not to pyodide_http.

urllib.request.HTTPSHandler is defined when http.client.HTTPSConnection exists: https://github.com/python/cpython/blob/3.11/Lib/urllib/request.py#L1381

http.client.HTTPSConnection is defined when ssl can be imported:
https://github.com/python/cpython/blob/3.11/Lib/http/client.py#L1402
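
The dependency chain described above can be checked directly in a running session. The following is only a minimal diagnostic sketch and not part of pyodide-http; the helper name https_support_report is made up for illustration.

```python
import http.client
import urllib.request


def https_support_report():
    """Report which HTTPS-related stdlib pieces are available (hypothetical helper)."""
    try:
        import ssl  # noqa: F401  (we only check that the import succeeds)
        ssl_importable = True
    except ImportError:
        ssl_importable = False

    return {
        "ssl importable": ssl_importable,
        # Defined by http.client only if ssl could be imported when http.client was loaded.
        "http.client.HTTPSConnection": hasattr(http.client, "HTTPSConnection"),
        # Defined by urllib.request only if http.client.HTTPSConnection existed at load time.
        "urllib.request.HTTPSHandler": hasattr(urllib.request, "HTTPSHandler"),
    }


for name, available in https_support_report().items():
    print(f"{name}: {available}")
```

If ssl is importable but the other two entries are False, the stdlib modules were most likely imported before ssl became available.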

@rth
Contributor

rth commented May 14, 2023

It also seems that astropy doesn't use the requests library, right?

Yeah, you are right, it can't be that. I jumped to that conclusion too quickly :)

http.client.HTTPSConnection is defined when ssl can be imported:

So indeed, in the notebook 04-sesame.ipynb (cds-astro.github.io/jupyterlite/lab/index.html), where this is reproducible, http.client.HTTPSConnection is not defined:

image

while the Pyodide REPL, which runs exactly the same Pyodide version, does have http.client.HTTPSConnection defined.

A plausible scenario is that http.client is imported somewhere in JupyterLite, directly or indirectly, before the ssl module is loaded, leading to this. Maybe we should add a post-initialization step, run after the ssl module is loaded, that reloads the stdlib modules that depend on it (pyodide/pyodide#3856). Or maybe you could do that in pyodide-http.

If I add,

from importlib import reload

import http.client
import urllib.request

reload(http.client)
reload(urllib.request)

to the above notebook, I would now get,

BadStatusLine: HTTP/1.1 0
---------------------------------------------------------------------------
BadStatusLine                             Traceback (most recent call last)
Cell In[5], line 1
----> 1 SkyCoord.from_name("M1")

File /lib/python3.11/site-packages/astropy/coordinates/sky_coordinate.py:2218, in SkyCoord.from_name(cls, name, frame, parse, cache)
   2183 """
   2184 Given a name, query the CDS name resolver to attempt to retrieve
   2185 coordinate information for that object. The search database, sesame
   (...)
   2213     Instance of the SkyCoord class.
   2214 """
   2216 from .name_resolve import get_icrs_coordinates
-> 2218 icrs_coord = get_icrs_coordinates(name, parse, cache=cache)
   2219 icrs_sky_coord = cls(icrs_coord)
   2220 if frame in ("icrs", icrs_coord.__class__):

File /lib/python3.11/site-packages/astropy/coordinates/name_resolve.py:170, in get_icrs_coordinates(name, parse, cache)
    167 for url in urls:
    168     try:
    169         resp_data = get_file_contents(
--> 170             download_file(url, cache=cache, show_progress=False)
    171         )
    172         break
    173     except urllib.error.URLError as e:

File /lib/python3.11/site-packages/astropy/utils/data.py:1509, in download_file(remote_url, cache, show_progress, timeout, sources, pkgname, http_headers, ssl_context, allow_insecure)
   1507 for source_url in sources:
   1508     try:
-> 1509         f_name = _download_file_from_source(
   1510             source_url,
   1511             timeout=timeout,
   1512             show_progress=show_progress,
   1513             cache=cache,
   1514             remote_url=remote_url,
   1515             pkgname=pkgname,
   1516             http_headers=http_headers,
   1517             ssl_context=ssl_context,
   1518             allow_insecure=allow_insecure,
   1519         )
   1520         # Success!
   1521         break

File /lib/python3.11/site-packages/astropy/utils/data.py:1293, in _download_file_from_source(source_url, show_progress, timeout, remote_url, cache, pkgname, http_headers, ftp_tls, ssl_context, allow_insecure)
   1290         else:
   1291             raise
-> 1293 with _try_url_open(
   1294     source_url,
   1295     timeout=timeout,
   1296     http_headers=http_headers,
   1297     ftp_tls=ftp_tls,
   1298     ssl_context=ssl_context,
   1299     allow_insecure=allow_insecure,
   1300 ) as remote:
   1301     info = remote.info()
   1302     try:

File /lib/python3.11/site-packages/astropy/utils/data.py:1211, in _try_url_open(source_url, timeout, http_headers, ftp_tls, ssl_context, allow_insecure)
   1208 req = urllib.request.Request(source_url, headers=http_headers)
   1210 try:
-> 1211     return urlopener.open(req, timeout=timeout)
   1212 except urllib.error.URLError as exc:
   1213     reason = exc.reason

File /lib/python3.11/site-packages/pyodide_http/_urllib.py:58, in urlopen_self_removed(self, url, *args, **kwargs)
     57 def urlopen_self_removed(self, url, *args, **kwargs):
---> 58     return urlopen(url, *args, **kwargs)

File /lib/python3.11/site-packages/pyodide_http/_urllib.py:53, in urlopen(url, *args, **kwargs)
     41 response_data = (
     42     b"HTTP/1.1 "
     43     + str(resp.status_code).encode("ascii")
   (...)
     49     + resp.body
     50 )
     52 response = HTTPResponse(FakeSock(response_data))
---> 53 response.begin()
     54 return response

File /lib/python311.zip/http/client.py:318, in HTTPResponse.begin(self)
    316 # read until we get a non-100 response
    317 while True:
--> 318     version, status, reason = self._read_status()
    319     if status != CONTINUE:
    320         break

File /lib/python311.zip/http/client.py:306, in HTTPResponse._read_status(self)
    304     status = int(status)
    305     if status < 100 or status > 999:
--> 306         raise BadStatusLine(line)
    307 except ValueError:
    308     raise BadStatusLine(line)

BadStatusLine: HTTP/1.1 0

which is still unclear but at least different from the original error. No idea how the status return code can be 0.
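
The traceback shows pyodide_http assembling raw response bytes and handing them to the stdlib HTTPResponse parser, which rejects any status code outside 100 to 999. The sketch below reproduces just that parsing step with plain stdlib code. FakeSocket and parse_status are hypothetical names, not the pyodide_http implementation. Browsers report an XMLHttpRequest status of 0 when a request did not complete; whether that is what happened here is an assumption.

```python
import io
from http.client import BadStatusLine, HTTPResponse


class FakeSocket:
    """Hypothetical stand-in for a socket: HTTPResponse only needs makefile()."""

    def __init__(self, data: bytes):
        self._data = data

    def makefile(self, *args, **kwargs):
        return io.BytesIO(self._data)


def parse_status(raw: bytes) -> int:
    """Parse raw HTTP response bytes with the stdlib parser and return the status code."""
    response = HTTPResponse(FakeSocket(raw))
    response.begin()
    return response.status


# A regular status line parses without problems.
print(parse_status(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"))

# A status of 0 is rejected by http.client before any headers are read.
try:
    parse_status(b"HTTP/1.1 0 \r\n\r\n")
except BadStatusLine as exc:
    print("BadStatusLine:", exc)
```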

@jobovy

jobovy commented May 15, 2023

Because galpy has a from_name function similar to astropy's, I'm quite interested in this issue. I actually tried the fix @rth suggested and when I run

import micropip
await micropip.install(["ssl", "pyodide-http>=0.2.1", "astropy"])
import pyodide_http; pyodide_http.patch_all()
from astropy.coordinates import SkyCoord
res = SkyCoord.from_name("Crab Nebula")

in the stable pyodide REPL, I get

pyodide.ffi.JsException: NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'http://cdsweb.u-strasbg.fr/cgi-bin/nph-sesame/A?Crab%20Nebula'.

with the full error message:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.11/site-packages/astropy/coordinates/sky_coordinate.py", line 2218, in from_name
    icrs_coord = get_icrs_coordinates(name, parse, cache=cache)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/astropy/coordinates/name_resolve.py", line 170, in get_icrs_coordinates
    download_file(url, cache=cache, show_progress=False)
  File "/lib/python3.11/site-packages/astropy/utils/data.py", line 1509, in download_file
    f_name = _download_file_from_source(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/astropy/utils/data.py", line 1293, in _download_file_from_source
    with _try_url_open(
         ^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/astropy/utils/data.py", line 1211, in _try_url_open
    return urlopener.open(req, timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/pyodide_http/_urllib.py", line 58, in urlopen_self_removed
    return urlopen(url, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/pyodide_http/_urllib.py", line 31, in urlopen
    resp = send(request)
           ^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/pyodide_http/_core.py", line 121, in send
    xhr.send(to_js(request.body))
pyodide.ffi.JsException: NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'http://cdsweb.u-strasbg.fr/cgi-bin/nph-sesame/A?Crab%20Nebula'.

The console actually reveals that this is an error from mixing HTTP content on an HTTPS site:

pyodide.asm.js:9 Mixed Content: The page at 'https://pyodide.org/en/latest/console.html' was loaded over HTTPS, but requested an insecure XMLHttpRequest endpoint 'http://cdsweb.u-strasbg.fr/cgi-bin/nph-sesame/A?Crab%20Nebula'. This request has been blocked; the content must be served over HTTPS.

In Chrome, one can allow insecure content, and with that the code runs fine. However, in jupyterlite, even with the from importlib import reload... fix, this still doesn't work, perhaps because jupyterlite runs in a web worker? The same mixed-content error keeps appearing even when allowing insecure content.

So I think we should just fix this upstream by making an HTTPS request here in astropy:
https://github.com/astropy/astropy/blob/cc73b24619ce37f2af26a0140bbdda8015ac8265/astropy/coordinates/name_resolve.py#L30-39
because I believe those URLs work as https:// ones.
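
As a rough illustration of the mixed-content condition and of the proposed change of scheme, here is a minimal sketch using only urllib.parse. The helper names is_mixed_content and upgrade_to_https are hypothetical, and the check is deliberately simplified compared with what browsers actually implement.

```python
from urllib.parse import urlsplit, urlunsplit


def is_mixed_content(page_url: str, request_url: str) -> bool:
    """True if an https page requests a plain-http resource (simplified check)."""
    return (
        urlsplit(page_url).scheme == "https"
        and urlsplit(request_url).scheme == "http"
    )


def upgrade_to_https(url: str) -> str:
    """Return the same URL with http replaced by https; other schemes are left alone."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url
    return urlunsplit(parts._replace(scheme="https"))


page = "https://pyodide.org/en/latest/console.html"
target = "http://cdsweb.u-strasbg.fr/cgi-bin/nph-sesame/A?Crab%20Nebula"

if is_mixed_content(page, target):
    print("would be blocked, trying instead:", upgrade_to_https(target))
```

Upgrading the scheme only helps if the server actually answers over https, which has to be verified for each service.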

@ManonMarchand
Author

Thanks for looking! This PR open in astropy is exactly to change the link to sesame to its https version (and other CDS things too) 🙂

astropy/astropy#14681

But what about data that don't have an https address, for example from some old NASA missions? Will we never be able to query them through jupyterlite?

@koenvo
Owner

koenvo commented May 15, 2023

I tried to summarise the issue to better understand what's going on, and came to this summary:

  1. In some cases the ssl module isn't loaded in time, which causes urllib.request.HTTPSHandler to be unavailable.
  2. By default pyodide_http version 0.2.0 is used, which passes the User-Agent header to XMLHttpRequest. This causes the browser to reject the request, as setting the User-Agent header is not allowed (in some browsers).
  3. The source data for astropy is requested over http while the page is hosted over https. This causes a Mixed Content exception and results in a failed request in Python.
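
For the second point, one conceivable approach is to drop headers that the browser may refuse before the request is handed to XMLHttpRequest. The sketch below is an illustration under that assumption and not necessarily how pyodide-http 0.2.1 handles it; BLOCKED_HEADERS and drop_blocked_headers are hypothetical names, and the set is a small illustrative subset rather than the full list from the Fetch specification.

```python
# Illustrative subset only: the authoritative list is the "forbidden request-header"
# definition in the Fetch specification, and browsers differ on User-Agent.
BLOCKED_HEADERS = {"user-agent", "host", "connection", "content-length", "cookie"}


def drop_blocked_headers(headers):
    """Return a copy of headers without names the browser may refuse (hypothetical helper)."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in BLOCKED_HEADERS
    }


headers = {"User-Agent": "astropy", "Accept": "*/*"}
print(drop_blocked_headers(headers))
```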

Curious if you come to the same summary.

From these issues there are some possible solutions/fixes. When I look at the possibilities at the pyodide_http side, I see the following options:

  1. Inform the user when they are mixing content, i.e. requesting over http while the page is served over https. Optionally try to request the content over https instead of http. This is related to "Show a meaningful error on CORS error" #26.
  2. Reload http.client and the modules that depend on it when http.client.HTTPSConnection isn't available.
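
The second option could look roughly like the sketch below, which builds on the reload snippet posted earlier in this thread. The function name ensure_https_handler is hypothetical and this is not code from pyodide-http. It only reloads when the handler is missing and ssl can actually be imported.

```python
import http.client
import importlib
import urllib.request


def ensure_https_handler() -> bool:
    """Reload stdlib HTTP modules if they were imported before ssl was available.

    Returns True if urllib.request.HTTPSHandler exists afterwards (hypothetical helper).
    """
    if hasattr(urllib.request, "HTTPSHandler"):
        return True  # nothing to do

    try:
        import ssl  # noqa: F401  (reloading is pointless if ssl is still missing)
    except ImportError:
        return False

    # Order matters: urllib.request checks http.client.HTTPSConnection at import time.
    importlib.reload(http.client)
    importlib.reload(urllib.request)
    return hasattr(urllib.request, "HTTPSHandler")


print("HTTPSHandler available:", ensure_https_handler())
```

Code that already holds references to objects from the old module versions is not updated by a reload, so this may not be sufficient in every environment.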

To answer @ManonMarchand's question about old data: the source needs to support both https and CORS headers to work in JupyterLite. Would it be possible to host the data somewhere else? Another option is to proxy the data, but that can become quite costly.

@jobovy

jobovy commented Dec 15, 2023

Hi all,

It seems like the fix that I proposed earlier in this thread stopped working, because the CORS error that I got here (that we resolved through changing the URL to https://) reared its head again as

pyodide.asm.js:9 Mixed Content: The page at 'https://jupyterlite.github.io/demo/extensions/@jupyterlite/pyodide-kernel-extension/static/568.621d55d3f28fca39d88b.js?v=621d55d3f28fca39d88b' 
was loaded over HTTPS, but attempted to connect to the insecure WebSocket endpoint 'ws://cds.unistra.fr:443/'. 
This request has been blocked; this endpoint must be available over WSS.

So it seems like with newer versions of jupyterlite/pyodide, a websocket is used instead of an XMLHttpRequest, and the websocket is insecure even if the starting URL was secure. I can't figure out where this change happened. I wonder whether it has to do with jupyterlite switching to running in a Service Worker.

@rth
Contributor

rth commented Dec 16, 2023

I'm pretty sure it didn't happen in Pyodide, so you probably should report this to JupyterLite.

@jobovy

jobovy commented Dec 18, 2023

I'm pretty sure it didn't happen in Pyodide, so you probably should report this to JupyterLite.

Actually, this happens in the pyodide REPL as well (both stable and latest), so I think it must be in pyodide? EDIT: Never mind, it does work, see below.

Screenshot 2023-12-18 at 11 57 34 AM

@ManonMarchand
Author

It works in pyodide repl with :

>>> import micropip
>>> await micropip.install(["ssl", "pyodide-http>=0.2.1", "astropy"])
>>> from astropy.coordinates import SkyCoord
>>> import pyodide_http; pyodide_http.patch_all()
>>> SkyCoord.from_name('NGC3256')
<SkyCoord (ICRS): (ra, dec) in deg
    (156.9636833, -43.9037639)>

But I cannot find any combination of load/reload/patch that makes it work in jupyterlite. So I guess we should open an issue there?

@jobovy

jobovy commented Jan 17, 2024

I found that the issue in jupyterlite is that ssl isn't installed yet when http.client and urllib.request are first imported (these are likely imported in their IPython setup), and for some reason reloading them doesn't fix the issue in jupyterlite as it does in pure pyodide. So I fixed the issue by installing ssl before anything else is installed in the initialization of jupyterlite's pyodide-kernel, which is enough to fix this issue here: jupyterlite/pyodide-kernel#79.

A new version of pyodide-kernel that includes this fix has been released. You can check it here: https://jupyterlite-pyodide-kernel.readthedocs.io/en/latest/_static/

Of course, you still need the pyodide-http patching, so try this example:

>>> import micropip
>>> await micropip.install(["ssl", "pyodide-http>=0.2.1", "astropy"])
>>> from astropy.coordinates import SkyCoord
>>> import pyodide_http; pyodide_http.patch_all()
>>> SkyCoord.from_name('NGC3256')

I think this issue can therefore be closed now.
