SSLCertVerificationError while running the simplest of PubChemPy examples #89
Comments
I am having the same issue. Did you find a solution? |
No. And authors don't bother to answer, either. |
It might not be applicable, but have you tried an older Python, such as 3.10 or below? |
@aromring Aiming to replicate your findings (Python 3.12.4 on Windows 10, pubchempy installed in a virtual environment), I started with the example from the project's landing page. Initially, I mistyped the cid in question, and the mistyped compound was resolved; after correcting the typo, I got pretty much the same error you report. However, the story doesn't end here. Your entry 5090 was resolved fine. And on returning to cid 1423 (the one which failed during the first attempt), this one was now resolved as well. See this log:
Out of curiosity, I tested the approach again on Linux Debian 13/trixie; record 5090 was resolved successfully on the first attempt:
Conceptually, it may be worth letting pubchempy i) attempt once to connect to the NIH servers and, in case this fails, ii) repeat the attempt a couple of times (similar to (Aspects of parallelization on the local side, and a potential throttle to prevent DDoS-like scenarios on the remote/server side, are not considered here.) |
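The attempt-then-retry idea suggested above can be sketched as follows. This is only an illustration, not code from pubchempy: the helper name `call_with_retries` and its parameters are hypothetical, and the growing delay between attempts is an assumption added here. Retrying can help with transient failures such as the `getaddrinfo` error; it would not be expected to help with a permanent `CERTIFICATE_VERIFY_FAILED`.

```python
import time


def call_with_retries(action, attempts=3, delay=1.0, retry_on=(OSError,)):
    """Call action() and retry on failure (hypothetical helper).

    urllib.error.URLError, socket.gaierror and ssl.SSLError are all
    subclasses of OSError, so the default retry_on covers them.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on as error:
            last_error = error
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                time.sleep(delay * attempt)
    # Every attempt failed: re-raise the last error seen.
    raise last_error


# Possible use with pubchempy (assumes pubchempy is installed):
#   import pubchempy as pcp
#   compound = call_with_retries(lambda: pcp.Compound.from_cid(5090))
```

Passing the request as a callable keeps the retry logic independent of whether the request is made with pubchempy, `urllib.request`, or something else.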
wow, that is a very interesting find, @nbehrnd !! Thank you so much for your work! |
Hi @nbehrnd, Thank you for spending time on this problem! Yours is an interesting find, indeed. I tried your slightly different way of getting SMILES, directly in Python, and the results are enclosed below. I am getting a different error message (CERTIFICATE_VERIFY_FAILED) from yours (getaddrinfo). Unfortunately, it's permanent: it does not matter how many times I repeat the same command. :(
`Python 3.12.5 (tags/v3.12.5:ff3bc82, Aug 6 2024, 20:45:27) [MSC v.1940 64 bit (AMD64)] on win32
During handling of the above exception, another exception occurred: Traceback (most recent call last):
|
@aromring I'm not aware if running pubchempy in a virtual environment which equally has You mention that access to the PubChem data with python/pubchempy fails (and in one case, it did for me, too) while access to the data with a web browser works. To narrow down whether the hump in the road is in pubchempy or elsewhere, can you test the two Python scripts attached below? Download them, remove the .txt file extension (this was only added to easily attach them here), and launch them from the command line.
I agree, they may appear "toy-like", and in terms of performance they are a setback compared to programmatic queries of multiple cids at once. This is not a fix in/around pubchempy, only a (hopefully temporary) bypass. But they are intentionally simple (e.g., no PubChem session key for a performance better than 5 requests/s max, a 30 s timeout, only functions provided by Python's standard library, etc.). |
@khoivan88 pubchempy as designed by @mcs07 offers to check its functions with pytest. I just ran it (see log attached below) on the project's state as left in April 2017. There are a few errors which appear to affect the interaction with NIH. For instance line 212 reading
Because you (@khoivan88) worked on/around openenventory, could you have a look at the log? Do these errors relate to, or might they cause, the problems reported by @aromring? |
Hi @nbehrnd, Thank you again for looking into it. I've run both scripts and got the same result in each case: SSL: CERTIFICATE_VERIFY_FAILED. I do have pip-system-certs installed, so this is not the issue. `Enter a chemical name: acetone Select the value below to retrieve
MOLECULAR FORMULA[2] Enter a number choice? 4 During handling of the above exception, another exception occurred: Traceback (most recent call last): |
@aromring Do you equally use Python 3.12.4? Your log suggests
Next attempt: start a virtual environment of Python, amend it with urllib3 via
pip install urllib3
After a successful installation of this one (i.e. in addition to Python's standard library), launch this "script":
import urllib3
resp = urllib3.request("GET", "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2010/property/MolecularFormula/TXT")
print(str(resp.data))
print(str(resp.data)[2:-3])  # after trimming
On my side, this yields
-- first line is the raw result, the second after trimming the string a little.
import urllib.request
from time import sleep  # allows to later limit the rate of requests

list_of_cid = [1020, 1234, 5678]
for entry in list_of_cid:
    string1 = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/"
    string3 = "/property/MolecularFormula/TXT"
    query = "".join([string1, str(entry), string3])
    try:
        reply_by_nih = urllib.request.urlopen(query).read()
        formula = reply_by_nih.decode("UTF-8").strip()
        print(f"{entry}\t{formula}")
    except Exception as e:
        print(e)
    sleep(0.4)  # i.e. a delay of 0.4 s between each request to NIH
into a trinket, press the play button to obtain a table with one column about the cid sent, and the molecular formula received: Only the what follows Edit: in case approach with
$ python multiple02.py 123 456
123 C5H12N4O
456 C3H4N2O4 |
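The contents of multiple02.py are not shown in this thread. A command-line variant of the loop above, taking the cids as arguments, might look roughly like the following sketch. The function names `build_query` and `fetch_formula` are hypothetical; the URL pattern, the 30 s timeout, and the 0.4 s delay are taken from the comments above.

```python
import sys
import urllib.request
from time import sleep

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/"
SUFFIX = "/property/MolecularFormula/TXT"


def build_query(cid):
    """Assemble the PUG REST URL for the molecular formula of one cid."""
    return f"{BASE}{cid}{SUFFIX}"


def fetch_formula(cid, timeout=30):
    """Request the molecular formula of one cid as plain text."""
    with urllib.request.urlopen(build_query(cid), timeout=timeout) as reply:
        return reply.read().decode("utf-8").strip()


def main(cids):
    for cid in cids:
        try:
            print(f"{cid}\t{fetch_formula(cid)}")
        except Exception as error:  # report the failure and continue
            print(f"{cid}\t{error}")
        sleep(0.4)  # delay of 0.4 s between requests to NIH


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a name of your choice, it would be launched with the cids as arguments, as in the invocation shown above.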
Hi @nbehrnd, I owe you a beer. :) I use Python 3.12.5 downloaded from Python.org. There are no hurdles for my directly connected computer in terms of accessing the Internet. WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get issuer certificate (_ssl.c:1000)'))': /rest/pug/compound/cid/2010/property/MolecularFormula/TXT
|
Finally! I've found a solution thanks to https://stackoverflow.com/questions/30405867/how-to-get-python-requests-to-trust-a-self-signed-ssl-certificate import os The question remains: why can't PubChemPy just get the certificate?? |
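One way to approach the open question is to inspect which certificate store the Python installation actually consults. The following diagnostic sketch uses only the standard library; the function name `describe_trust_store` is hypothetical, and the sketch only reports settings, it does not change them or reproduce the fix found above.

```python
import os
import ssl


def describe_trust_store():
    """Collect where this Python/OpenSSL build looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    context = ssl.create_default_context()
    return {
        "openssl_version": ssl.OPENSSL_VERSION,
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Environment variables OpenSSL honours (usually SSL_CERT_FILE
        # and SSL_CERT_DIR); None means the variable is not set.
        "cafile_env": os.environ.get(paths.openssl_cafile_env),
        "capath_env": os.environ.get(paths.openssl_capath_env),
        # Number of CA certificates loaded into the default context so far.
        "loaded_ca_certs": context.cert_store_stats()["x509_ca"],
    }


if __name__ == "__main__":
    for key, value in describe_trust_store().items():
        print(f"{key}: {value}")
```

Comparing this output between a machine where the request works and one where it fails may hint at whether an issuer certificate is missing from the store Python uses, even though the web browser (which may rely on a different store) connects fine.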
@aromring Nice to read that a contact to NIH was established. My points now are:
MacOS and Linux Debian, though both Unix-like, still differ enough that I don't find in * And what about the instance on PyPi which -- like the root repository of the project here -- fully belongs to @mcs07 to accept PR and publish new/updated versions ... (</comment on>: possibly a cid based request to NIH now can report additional types data than (implemented) in 2017. </comment off>.) |
The following example from PCP documentation:
import pubchempy as pcp
c = pcp.Compound.from_cid(5090)
print(c.molecular_formula)
results in
`SSLCertVerificationError Traceback (most recent call last)
File c:\Python312\Lib\urllib\request.py:1344, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
1343 try:
-> 1344 h.request(req.get_method(), req.selector, req.data, headers,
1345 encode_chunked=req.has_header('Transfer-encoding'))
1346 except OSError as err: # timeout error
File c:\Python312\Lib\http\client.py:1336, in HTTPConnection.request(self, method, url, body, headers, encode_chunked)
1335 """Send a complete request to the server."""
-> 1336 self._send_request(method, url, body, headers, encode_chunked)
File c:\Python312\Lib\http\client.py:1382, in HTTPConnection._send_request(self, method, url, body, headers, encode_chunked)
1381 body = _encode(body, 'body')
-> 1382 self.endheaders(body, encode_chunked=encode_chunked)
File c:\Python312\Lib\http\client.py:1331, in HTTPConnection.endheaders(self, message_body, encode_chunked)
1330 raise CannotSendHeader()
-> 1331 self._send_output(message_body, encode_chunked=encode_chunked)
File c:\Python312\Lib\http\client.py:1091, in HTTPConnection._send_output(self, message_body, encode_chunked)
1090 del self._buffer[:]
-> 1091 self.send(msg)
1093 if message_body is not None:
1094
...
-> 1347 raise URLError(err)
1348 r = h.getresponse()
1349 except:
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get issuer certificate (_ssl.c:1000)>`
I can browse to https://pubchem.ncbi.nlm.nih.gov/ without a problem, and a search for CID=5090 returns rofecoxib.
How to fix it?
BTW, I am on Windows 10, not Mac.