CI is failing, probably due to resource location change #126

Closed
fabianegli opened this issue Jun 7, 2022 · 3 comments

@fabianegli
Collaborator

The CI currently fails because a test tries to retrieve

http://purl.obolibrary.org/obo/NCBITaxon_9606&size=100

and the request fails.

Entering this URL in a browser redirects to http://ontologies.berkeleybop.org/
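
The trailing `&size=100` in the URL above suggests that the page-size parameter was appended by plain string concatenation to a link that has no query string yet. A minimal sketch, using only the standard library, of adding the parameter so that it works with or without an existing query. The helper name `with_query_param` is hypothetical and not part of sdrf-pipelines:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_query_param(url: str, key: str, value) -> str:
    """Return `url` with key=value added to its query string (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep any parameters that are already present, then append the new one.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, str(value)))
    # urlencode re-escapes the values, so an encoded `iri` stays encoded.
    return urlunsplit(parts._replace(query=urlencode(query)))
```

With this approach a URL without a query string gets `?size=100`, while a URL that already carries parameters gets `&size=100`.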

@fabianegli
Collaborator Author

Dumping this here:

https://www.ebi.ac.uk/spot/zooma/docs/api

This is the part of the test code that runs through:

In [1]: from sdrf_pipelines.zooma.zooma import Zooma, SlimOlsClient

In [2]: keyword = 'human'

In [3]: client = Zooma()

In [4]: results = client.recommender(keyword, filters="ontologies:[nbcitaxon]")

In [5]: results
Out[5]: 
[{'uri': None,
  'annotatedProperty': {'uri': 'http://rdf.ebi.ac.uk/resource/zooma/69A7F59A19C8525E4C7B04C798596A69',
   'propertyType': 'factor',
   'propertyValue': 'Human'},
  '_links': {'olslinks': [{'href': 'https://ves-oy-be:8080/ols/api/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FNCBITaxon_9606',
     'semanticTag': 'http://purl.obolibrary.org/obo/NCBITaxon_9606'}]},
  'semanticTags': ['http://purl.obolibrary.org/obo/NCBITaxon_9606'],
  'replacedBy': [],
  'replaces': [],
  'derivedFrom': {'uri': 'http://rdf.ebi.ac.uk/resource/zooma/metabolights/8F94CFC1124A59BCEB894A9917CA77C2',
   'annotatedProperty': {'uri': 'http://rdf.ebi.ac.uk/resource/zooma/69A7F59A19C8525E4C7B04C798596A69',
    'propertyType': 'factor',
    'propertyValue': 'Human'},
   '_links': {'olslinks': [{'href': 'http://purl.obolibrary.org/obo/NCBITaxon_9606',
      'semanticTag': 'http://purl.obolibrary.org/obo/NCBITaxon_9606'}]},
   'semanticTags': ['http://purl.obolibrary.org/obo/NCBITaxon_9606'],
   'replacedBy': [],
   'replaces': [],
   'provenance': {'source': {'type': 'DATABASE',
     'name': 'metabolights',
     'uri': 'https://www.ebi.ac.uk/metabolights'},
    'evidence': 'MANUAL_CURATED',
    'accuracy': 'NOT_SPECIFIED',
    'generator': 'https://www.ebi.ac.uk/metabolights',
    'generatedDate': 1649104721000,
    'annotator': 'Zoe May Pendlington',
    'annotationDate': 1538852400000},
   'annotatedBiologicalEntities': [{'uri': 'http://rdf.ebi.ac.uk/resource/zooma/metabolights/1846EDBCECE5A0D6E04AC107EF21F950',
     'name': 'metabo_219',
     'types': ['http://www.w3.org/2002/07/owl#NamedIndividual',
      'http://rdf.ebi.ac.uk/terms/zooma/Target'],
     'studies': [{'uri': 'http://rdf.ebi.ac.uk/resource/zooma/metabolights/A6AA403F9EF597D6BCFAAAFB79571724',
       'accession': 'MTBLS176',
       'types': ['http://www.w3.org/2002/07/owl#NamedIndividual',
        'http://rdf.ebi.ac.uk/terms/zooma/DatabaseEntrySource']}]}]},
  'confidence': 'HIGH',
  'provenance': {'source': {'type': 'DATABASE',
    'name': 'zooma',
    'uri': 'www.ebi.ac.uk/spot/zooma'},
   'evidence': 'ZOOMA_INFERRED_FROM_CURATED',
   'accuracy': None,
   'generator': 'ZOOMA',
   'generatedDate': 1654622176809,
   'annotator': 'ZOOMA',
   'annotationDate': 1654622176809},
  'annotatedBiologicalEntities': []}]

In [6]: ols_terms = client.process_zooma_results(results)

In [7]: ols_terms
Out[7]: 
[{'queryValue': 'Human',
  'confidence': 'HIGH',
  'ols_url': 'https://ves-oy-be:8080/ols/api/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FNCBITaxon_9606'}]

In [8]: ols_client = SlimOlsClient()

This then fails (simplified for interactive debugging):

In [9]: ols_client.get_term_from_url(ols_terms[0]['ols_url'], ontology="ncbitaxon")
---------------------------------------------------------------------------
gaierror                                  Traceback (most recent call last)
File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/urllib3/connection.py:174, in HTTPConnection._new_conn(self)
    173 try:
--> 174     conn = connection.create_connection(
    175         (self._dns_host, self.port), self.timeout, **extra_kw
    176     )
    178 except SocketTimeout:

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/urllib3/util/connection.py:72, in create_connection(address, timeout, source_address, socket_options)
     68     return six.raise_from(
     69         LocationParseError(u"'%s', label empty or too long" % host), None
     70     )
---> 72 for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
     73     af, socktype, proto, canonname, sa = res

File /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/socket.py:918, in getaddrinfo(host, port, family, type, proto, flags)
    917 addrlist = []
--> 918 for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    919     af, socktype, proto, canonname, sa = res

gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

NewConnectionError                        Traceback (most recent call last)
File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/urllib3/connectionpool.py:703, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    702 # Make the request on the httplib connection object.
--> 703 httplib_response = self._make_request(
    704     conn,
    705     method,
    706     url,
    707     timeout=timeout_obj,
    708     body=body,
    709     headers=headers,
    710     chunked=chunked,
    711 )
    713 # If we're going to release the connection in ``finally:``, then
    714 # the response doesn't need to know about the connection. Otherwise
    715 # it will also try to release it and we'll have a double-release
    716 # mess.

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/urllib3/connectionpool.py:386, in HTTPConnectionPool._make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    385 try:
--> 386     self._validate_conn(conn)
    387 except (SocketTimeout, BaseSSLError) as e:
    388     # Py2 raises this as a BaseSSLError, Py3 raises it as socket timeout.

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/urllib3/connectionpool.py:1040, in HTTPSConnectionPool._validate_conn(self, conn)
   1039 if not getattr(conn, "sock", None):  # AppEngine might not have  `.sock`
-> 1040     conn.connect()
   1042 if not conn.is_verified:

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/urllib3/connection.py:358, in HTTPSConnection.connect(self)
    356 def connect(self):
    357     # Add certificate verification
--> 358     self.sock = conn = self._new_conn()
    359     hostname = self.host

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/urllib3/connection.py:186, in HTTPConnection._new_conn(self)
    185 except SocketError as e:
--> 186     raise NewConnectionError(
    187         self, "Failed to establish a new connection: %s" % e
    188     )
    190 return conn

NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x10c74d6a0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/requests/adapters.py:440, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    439 if not chunked:
--> 440     resp = conn.urlopen(
    441         method=request.method,
    442         url=url,
    443         body=request.body,
    444         headers=request.headers,
    445         redirect=False,
    446         assert_same_host=False,
    447         preload_content=False,
    448         decode_content=False,
    449         retries=self.max_retries,
    450         timeout=timeout
    451     )
    453 # Send the request.
    454 else:

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/urllib3/connectionpool.py:785, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    783     e = ProtocolError("Connection aborted.", e)
--> 785 retries = retries.increment(
    786     method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    787 )
    788 retries.sleep()

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/urllib3/util/retry.py:592, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
    591 if new_retry.is_exhausted():
--> 592     raise MaxRetryError(_pool, url, error or ResponseError(cause))
    594 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)

MaxRetryError: HTTPSConnectionPool(host='ves-oy-be', port=8080): Max retries exceeded with url: /ols/api/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FNCBITaxon_9606&size=100 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x10c74d6a0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

During handling of the above exception, another exception occurred:

ConnectionError                           Traceback (most recent call last)
Input In [9], in <cell line: 1>()
----> 1 ols_client.get_term_from_url(ols_terms[0]['ols_url'], ontology="ncbitaxon")

File ~/github/fabianegli/sdrf-pipelines/sdrf_pipelines/zooma/zooma.py:33, in SlimOlsClient.get_term_from_url(url, page_size, ontology)
     25 """
     26 Return a list of terms by ontology
     27 :param url:
   (...)
     30 :return:
     31 """
     32 url += "&" + "size=" + str(page_size)
---> 33 r = requests.get(url)
     34 if r.status_code == 414:
     35     raise HTTPError('URL do not exist in OLS')

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/requests/api.py:75, in get(url, params, **kwargs)
     64 def get(url, params=None, **kwargs):
     65     r"""Sends a GET request.
     66 
     67     :param url: URL for the new :class:`Request` object.
   (...)
     72     :rtype: requests.Response
     73     """
---> 75     return request('get', url, params=params, **kwargs)

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/requests/api.py:61, in request(method, url, **kwargs)
     57 # By using the 'with' statement we are sure the session is closed, thus we
     58 # avoid leaving sockets open which can trigger a ResourceWarning in some
     59 # cases, and look like a memory leak in others.
     60 with sessions.Session() as session:
---> 61     return session.request(method=method, url=url, **kwargs)

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/requests/sessions.py:529, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    524 send_kwargs = {
    525     'timeout': timeout,
    526     'allow_redirects': allow_redirects,
    527 }
    528 send_kwargs.update(settings)
--> 529 resp = self.send(prep, **send_kwargs)
    531 return resp

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/requests/sessions.py:645, in Session.send(self, request, **kwargs)
    642 start = preferred_clock()
    644 # Send the request
--> 645 r = adapter.send(request, **kwargs)
    647 # Total elapsed time of the request (approximately)
    648 elapsed = preferred_clock() - start

File ~/github/fabianegli/sdrf-pipelines/venv/lib/python3.8/site-packages/requests/adapters.py:519, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    515     if isinstance(e.reason, _SSLError):
    516         # This branch is for urllib3 v1.22 and later.
    517         raise SSLError(e, request=request)
--> 519     raise ConnectionError(e, request=request)
    521 except ClosedPoolError as e:
    522     raise ConnectionError(e, request=request)

ConnectionError: HTTPSConnectionPool(host='ves-oy-be', port=8080): Max retries exceeded with url: /ols/api/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FNCBITaxon_9606&size=100 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x10c74d6a0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
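
The traceback boils down to a DNS failure: the link returned by Zooma points at the host `ves-oy-be:8080`, which cannot be resolved here. One possible client-side workaround, sketched under assumptions, is to keep only the `iri` parameter from that link and rebuild the request against a public OLS endpoint. The function name `rebase_ols_url` and the default base URL `https://www.ebi.ac.uk/ols/api/terms` are assumptions for illustration and are not confirmed anywhere in this thread:

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# Assumed public OLS terms endpoint; adjust if the service lives elsewhere.
ASSUMED_OLS_TERMS_ENDPOINT = "https://www.ebi.ac.uk/ols/api/terms"


def rebase_ols_url(ols_url: str, base: str = ASSUMED_OLS_TERMS_ENDPOINT) -> str:
    """Rebuild an OLS terms link against `base`, keeping only its `iri` parameter."""
    params = parse_qs(urlsplit(ols_url).query)
    iris = params.get("iri")
    if not iris:
        # Links such as the bare purl URL carry no `iri` parameter at all.
        raise ValueError(f"no 'iri' parameter in {ols_url!r}")
    return base + "?" + urlencode({"iri": iris[0]})
```

This only hides the symptom on the client side; the internal hostname in the Zooma response is still something the service itself would have to correct.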

The relevant code from zooma.py:

def process_zooma_results(results):

    ontology_terms = []
    for result in results:
        # Only the first OLS link of each Zooma result is used.
        ols_term = {'queryValue': result['annotatedProperty']['propertyValue'], 'confidence': result['confidence'],
                    'ols_url': result['_links']['olslinks'][0]['href']}
        ontology_terms.append(ols_term)
    return ontology_terms

class SlimOlsClient(object):
    def __init__(self) -> None:
        super().__init__()
        self._ols_client = OlsClient()

    @staticmethod
    def get_term_from_url(url, page_size: int = 100, ontology: str = None):
        """
        Return a list of terms by ontology
        :param url:
        :param page_size:
        :param ontology:
        :return:
        """
        # The page size is appended to the URL as an extra query parameter.
        url += "&" + "size=" + str(page_size)
        r = requests.get(url)
        if r.status_code == 414:
            raise HTTPError('URL do not exist in OLS')
        json_response = r.json()
        old_terms = json_response['_embedded']['terms']
        old_terms = list(filter(lambda k: ontology in k['ontology_name'], old_terms))
        return [OlsTerm(x['iri'], x['label'], x['ontology_name']) for x in old_terms]
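
A side note on the excerpt: `get_term_from_url` defaults `ontology` to `None`, and `None in k['ontology_name']` raises a `TypeError` when the right-hand side is a string. A small sketch of the filtering step in isolation, treating a missing ontology as "no filter". The function name `filter_terms` is hypothetical, plain tuples stand in for `OlsTerm`, and the response shape (`_embedded.terms` entries with `iri`, `label`, `ontology_name`) is taken from the excerpt rather than from the OLS documentation:

```python
from typing import Optional


def filter_terms(json_response: dict, ontology: Optional[str] = None) -> list:
    """Return (iri, label, ontology_name) tuples, optionally filtered by ontology."""
    terms = json_response.get("_embedded", {}).get("terms", [])
    if ontology is not None:
        # Same substring match as the excerpt: `ontology in ontology_name`.
        terms = [t for t in terms if ontology in t["ontology_name"]]
    return [(t["iri"], t["label"], t["ontology_name"]) for t in terms]
```

Keeping the filtering separate from the HTTP call also makes it testable without network access, which matters for a CI that depends on an external service.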

@fabianegli
Collaborator Author

I reported the issue to the Zooma project: EBISPOT/zooma#99

@fabianegli
Collaborator Author

The issue was solved upstream.
