💄 Update ruff #3707
Conversation
Quick links (staging server):
chart-diff: ✅ No charts for review. (Edited: 2024-12-09 08:20:09 UTC)
Force-pushed from bd5ae3e to 9942bd3
Force-pushed from 6adab7b to 9a3cdc7
I would say that it works, but I get some errors. I think some of these errors are just classic errors that I see always when running make test:
==> Installing packages
Resolved 393 packages in 2ms
Built etl @ file:///home/lucas/repos/etl
Built owid-catalog @ file:///home/lucas/repos/etl/lib/catalog
Built owid-datautils @ file:///home/lucas/repos/etl/lib/datautils
Built owid-repack @ file:///home/lucas/repos/etl/lib/repack
Built walden @ file:///home/lucas/repos/etl/lib/walden
Prepared 6 packages in 1.05s
Uninstalled 7 packages in 12ms
Installed 6 packages in 2ms
~ etl==0.1.0 (from file:///home/lucas/repos/etl)
~ owid-catalog==0.3.11 (from file:///home/lucas/repos/etl/lib/catalog)
~ owid-datautils==0.5.3 (from file:///home/lucas/repos/etl/lib/datautils)
~ owid-repack==0.1.4 (from file:///home/lucas/repos/etl/lib/repack)
- pyreadr==0.5.2
- ruff==0.1.6
+ ruff==0.8.2
~ walden==0.1.1 (from file:///home/lucas/repos/etl/lib/walden)
==> Checking formatting
3816 files already formatted
==> Checking linting
All checks passed!
==> Checking types
. .venv/bin/activate && .venv/bin/pyright etl snapshots apps api tests docs
WARNING: there is a new pyright version available (v1.1.373 -> v1.1.390).
Please install the new version or set PYRIGHT_PYTHON_FORCE_VERSION to `latest`
0 errors, 0 warnings, 0 informations
==> Running unit tests
.venv/bin/pytest -m "not integration" tests
================================================ test session starts =================================================
platform linux -- Python 3.11.10, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/lucas/repos/etl
configfile: pyproject.toml
plugins: typeguard-4.3.0, anyio-4.4.0, hydra-core-1.3.2, Faker-28.4.1
collected 170 items / 3 deselected / 167 selected
tests/apps/wizard/pages/expert/test_prompts.py . [ 0%]
tests/apps/wizard/utils/test_utils.py . [ 1%]
tests/backport/datasync/test_data_metadata.py ...... [ 4%]
tests/data_helpers/test_geo.py ......................................................... [ 38%]
tests/data_helpers/test_misc.py .................. [ 49%]
tests/test_command.py ... [ 51%]
tests/test_config.py .... [ 53%]
tests/test_converters.py . [ 54%]
tests/test_datadiff.py FF [ 55%]
tests/test_etl.py ..... [ 58%]
tests/test_etl_step_code.py . [ 59%]
tests/test_files.py ...... [ 62%]
tests/test_grapher_helpers.py .......... [ 68%]
tests/test_grapher_import.py .. [ 70%]
tests/test_grapher_model.py .. [ 71%]
tests/test_helpers.py .................... [ 83%]
tests/test_metadata_schemas.py .. [ 84%]
tests/test_prune.py s [ 85%]
tests/test_snapshot.py .. [ 86%]
tests/test_steps.py .......F. [ 91%]
tests/test_tempcompare.py .... [ 94%]
tests/test_version_tracker.py .......... [100%]
====================================================== FAILURES ======================================================
______________________________________________ test_DatasetDiff_summary ______________________________________________
tmp_path = PosixPath('/tmp/pytest-of-lucas/pytest-6/test_DatasetDiff_summary0')
def test_DatasetDiff_summary(tmp_path):
ds_a, ds_b = _create_datasets(tmp_path)
tab_a = Table(pd.DataFrame({"a": [1, 2]}), short_name="tab")
tab_a.metadata.description = "tab"
tab_b = Table(pd.DataFrame({"a": [1, 3], "b": ["a", "b"]}), short_name="tab")
tab_b["a"].metadata.description = "col a"
> ds_a.add(tab_a)
tests/test_datadiff.py:31:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Dataset(path='/tmp/pytest-of-lucas/pytest-6/test_DatasetDiff_summary0/catalog_a/ds', metadata=DatasetMeta(channel='gar...blic=True, additional_info=None, version='v', update_period_days=None, non_redistributable=False, source_checksum='1'))
table = a
0 1
1 2, formats = ['feather'], repack = True
def add(
self,
table: tables.Table,
formats: List[FileFormat] = DEFAULT_FORMATS,
repack: bool = True,
) -> None:
"""
Add this table to the dataset by saving it in the dataset's folder. By default we
save in multiple formats, but if you need a specific one (e.g. CSV for explorers)
you can specify it.
:param repack: if True, try to cast column types to the smallest possible type (e.g. float64 -> float32)
to reduce binary file size. Consider using False when your dataframe is large and the repack is failing.
"""
utils.validate_underscore(table.metadata.short_name, "Table's short_name")
for col in list(table.columns) + list(table.index.names):
utils.validate_underscore(col, "Variable's name")
if not table.primary_key:
if "OWID_STRICT" in environ:
> raise PrimaryKeyMissing(
f"Table `{table.metadata.short_name}` does not have a primary_key -- please use t.set_index([col, ...], verify_integrity=True) to indicate dimensions before saving"
)
E owid.catalog.datasets.PrimaryKeyMissing: Table `tab` does not have a primary_key -- please use t.set_index([col, ...], verify_integrity=True) to indicate dimensions before saving
lib/catalog/owid/catalog/datasets.py:123: PrimaryKeyMissing
___________________________________________________ test_new_data ____________________________________________________
tmp_path = PosixPath('/tmp/pytest-of-lucas/pytest-6/test_new_data0')
def test_new_data(tmp_path):
ds_a, ds_b = _create_datasets(tmp_path)
tab_a = Table({"country": ["UK", "US"], "a": [1, 3]}, short_name="tab")
tab_b = Table({"country": ["UK", "US", "FR"], "a": [1, 2, 3]}, short_name="tab")
> ds_a.add(tab_a)
tests/test_datadiff.py:52:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Dataset(path='/tmp/pytest-of-lucas/pytest-6/test_new_data0/catalog_a/ds', metadata=DatasetMeta(channel='garden', names...blic=True, additional_info=None, version='v', update_period_days=None, non_redistributable=False, source_checksum='1'))
table = country a
0 UK 1
1 US 3, formats = ['feather'], repack = True
def add(
self,
table: tables.Table,
formats: List[FileFormat] = DEFAULT_FORMATS,
repack: bool = True,
) -> None:
"""
Add this table to the dataset by saving it in the dataset's folder. By default we
save in multiple formats, but if you need a specific one (e.g. CSV for explorers)
you can specify it.
:param repack: if True, try to cast column types to the smallest possible type (e.g. float64 -> float32)
to reduce binary file size. Consider using False when your dataframe is large and the repack is failing.
"""
utils.validate_underscore(table.metadata.short_name, "Table's short_name")
for col in list(table.columns) + list(table.index.names):
utils.validate_underscore(col, "Variable's name")
if not table.primary_key:
if "OWID_STRICT" in environ:
> raise PrimaryKeyMissing(
f"Table `{table.metadata.short_name}` does not have a primary_key -- please use t.set_index([col, ...], verify_integrity=True) to indicate dimensions before saving"
)
E owid.catalog.datasets.PrimaryKeyMissing: Table `tab` does not have a primary_key -- please use t.set_index([col, ...], verify_integrity=True) to indicate dimensions before saving
lib/catalog/owid/catalog/datasets.py:123: PrimaryKeyMissing
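Both datadiff failures are the same strict-mode check: when OWID_STRICT is set in the environment, Dataset.add() refuses tables that have no primary key. A minimal sketch of the fix the error message itself suggests, declaring the dimension column(s) via set_index before saving (ds here stands in for a Dataset like the ones _create_datasets builds in the test):

import pandas as pd
from owid.catalog import Table

tab = Table(pd.DataFrame({"country": ["UK", "US"], "a": [1, 3]}), short_name="tab")
# Declare the dimension column(s) as the primary key before saving;
# verify_integrity=True raises early if the index contains duplicates.
tab = tab.set_index(["country"], verify_integrity=True)
ds.add(tab)  # ds: an owid.catalog.Dataset, e.g. from _create_datasets(tmp_path)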
___________________________________________________ test_get_etag ____________________________________________________
self = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7fa980407890>, method = 'HEAD'
url = '/owid/owid-grapher/master/README.md', body = None
headers = {'User-Agent': 'python-requests/2.32.3', 'Accept-Encoding': 'gzip, deflate, br', 'Accept': '*/*', 'Connection': 'keep-alive'}
retries = Retry(total=0, connect=None, read=False, redirect=None, status=None), redirect = False
assert_same_host = False, timeout = Timeout(connect=None, read=None, total=None), pool_timeout = None
release_conn = False, chunked = False, body_pos = None
response_kw = {'decode_content': False, 'preload_content': False}
parsed_url = Url(scheme=None, auth=None, host=None, port=None, path='/owid/owid-grapher/master/README.md', query=None, fragment=None)
destination_scheme = None, conn = None, release_this_conn = True, http_tunnel_required = False, err = None
clean_exit = False
def urlopen(
self,
method,
url,
body=None,
headers=None,
retries=None,
redirect=True,
assert_same_host=True,
timeout=_Default,
pool_timeout=None,
release_conn=None,
chunked=False,
body_pos=None,
**response_kw
):
"""
Get a connection from the pool and perform an HTTP request. This is the
lowest level call for making a request, so you'll need to specify all
the raw details.
.. note::
More commonly, it's appropriate to use a convenience method provided
by :class:`.RequestMethods`, such as :meth:`request`.
.. note::
`release_conn` will only behave as expected if
`preload_content=False` because we want to make
`preload_content=False` the default behaviour someday soon without
breaking backwards compatibility.
:param method:
HTTP request method (such as GET, POST, PUT, etc.)
:param url:
The URL to perform the request on.
:param body:
Data to send in the request body, either :class:`str`, :class:`bytes`,
an iterable of :class:`str`/:class:`bytes`, or a file-like object.
:param headers:
Dictionary of custom headers to send, such as User-Agent,
If-None-Match, etc. If None, pool headers are used. If provided,
these headers completely replace any pool-specific headers.
:param retries:
Configure the number of retries to allow before raising a
:class:`~urllib3.exceptions.MaxRetryError` exception.
Pass ``None`` to retry until you receive a response. Pass a
:class:`~urllib3.util.retry.Retry` object for fine-grained control
over different types of retries.
Pass an integer number to retry connection errors that many times,
but no other types of errors. Pass zero to never retry.
If ``False``, then retries are disabled and any exception is raised
immediately. Also, instead of raising a MaxRetryError on redirects,
the redirect response will be returned.
:type retries: :class:`~urllib3.util.retry.Retry`, False, or an int.
:param redirect:
If True, automatically handle redirects (status codes 301, 302,
303, 307, 308). Each redirect counts as a retry. Disabling retries
will disable redirect, too.
:param assert_same_host:
If ``True``, will make sure that the host of the pool requests is
consistent else will raise HostChangedError. When ``False``, you can
use the pool on an HTTP proxy and request foreign hosts.
:param timeout:
If specified, overrides the default timeout for this one
request. It may be a float (in seconds) or an instance of
:class:`urllib3.util.Timeout`.
:param pool_timeout:
If set and the pool is set to block=True, then this method will
block for ``pool_timeout`` seconds and raise EmptyPoolError if no
connection is available within the time period.
:param release_conn:
If False, then the urlopen call will not release the connection
back into the pool once a response is received (but will release if
you read the entire contents of the response such as when
`preload_content=True`). This is useful if you're not preloading
the response's content immediately. You will need to call
``r.release_conn()`` on the response ``r`` to return the connection
back into the pool. If None, it takes the value of
``response_kw.get('preload_content', True)``.
:param chunked:
If True, urllib3 will send the body using chunked transfer
encoding. Otherwise, urllib3 will send the body using the standard
content-length form. Defaults to False.
:param int body_pos:
Position to seek to in file-like body in the event of a retry or
redirect. Typically this won't need to be set because urllib3 will
auto-populate the value when needed.
:param \\**response_kw:
Additional parameters are passed to
:meth:`urllib3.response.HTTPResponse.from_httplib`
"""
parsed_url = parse_url(url)
destination_scheme = parsed_url.scheme
if headers is None:
headers = self.headers
if not isinstance(retries, Retry):
retries = Retry.from_int(retries, redirect=redirect, default=self.retries)
if release_conn is None:
release_conn = response_kw.get("preload_content", True)
# Check host
if assert_same_host and not self.is_same_host(url):
raise HostChangedError(self, url, retries)
# Ensure that the URL we're connecting to is properly encoded
if url.startswith("/"):
url = six.ensure_str(_encode_target(url))
else:
url = six.ensure_str(parsed_url.url)
conn = None
# Track whether `conn` needs to be released before
# returning/raising/recursing. Update this variable if necessary, and
# leave `release_conn` constant throughout the function. That way, if
# the function recurses, the original value of `release_conn` will be
# passed down into the recursive call, and its value will be respected.
#
# See issue #651 [1] for details.
#
# [1] <https://github.com/urllib3/urllib3/issues/651>
release_this_conn = release_conn
http_tunnel_required = connection_requires_http_tunnel(
self.proxy, self.proxy_config, destination_scheme
)
# Merge the proxy headers. Only done when not using HTTP CONNECT. We
# have to copy the headers dict so we can safely change it without those
# changes being reflected in anyone else's copy.
if not http_tunnel_required:
headers = headers.copy()
headers.update(self.proxy_headers)
# Must keep the exception bound to a separate variable or else Python 3
# complains about UnboundLocalError.
err = None
# Keep track of whether we cleanly exited the except block. This
# ensures we do proper cleanup in finally.
clean_exit = False
# Rewind body position, if needed. Record current position
# for future rewinds in the event of a redirect/retry.
body_pos = set_file_position(body, body_pos)
try:
# Request a connection from the queue.
timeout_obj = self._get_timeout(timeout)
conn = self._get_conn(timeout=pool_timeout)
conn.timeout = timeout_obj.connect_timeout
is_new_proxy_conn = self.proxy is not None and not getattr(
conn, "sock", None
)
if is_new_proxy_conn and http_tunnel_required:
self._prepare_proxy(conn)
# Make the request on the httplib connection object.
> httplib_response = self._make_request(
conn,
method,
url,
timeout=timeout_obj,
body=body,
headers=headers,
chunked=chunked,
)
.venv/lib64/python3.11/site-packages/urllib3/connectionpool.py:716:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.venv/lib64/python3.11/site-packages/urllib3/connectionpool.py:404: in _make_request
self._validate_conn(conn)
.venv/lib64/python3.11/site-packages/urllib3/connectionpool.py:1061: in _validate_conn
conn.connect()
.venv/lib64/python3.11/site-packages/urllib3/connection.py:419: in connect
self.sock = ssl_wrap_socket(
.venv/lib64/python3.11/site-packages/urllib3/util/ssl_.py:458: in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
.venv/lib64/python3.11/site-packages/urllib3/util/ssl_.py:502: in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
/usr/lib64/python3.11/ssl.py:517: in wrap_socket
return self.sslsocket_class._create(
/usr/lib64/python3.11/ssl.py:1104: in _create
self.do_handshake()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <ssl.SSLSocket [closed] fd=-1, family=2, type=1, proto=6>, block = False
@_sslcopydoc
def do_handshake(self, block=False):
self._check_connected()
timeout = self.gettimeout()
try:
if timeout == 0.0 and block:
self.settimeout(None)
> self._sslobj.do_handshake()
E ssl.SSLEOFError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)
/usr/lib64/python3.11/ssl.py:1382: SSLEOFError
During handling of the above exception, another exception occurred:
self = <requests.adapters.HTTPAdapter object at 0x7fa98051fb90>, request = <PreparedRequest [HEAD]>, stream = False
timeout = Timeout(connect=None, read=None, total=None), verify = True, cert = None, proxies = OrderedDict()
def send(
self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
):
"""Sends PreparedRequest object. Returns Response object.
:param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
:param stream: (optional) Whether to stream the request content.
:param timeout: (optional) How long to wait for the server to send
data before giving up, as a float, or a :ref:`(connect timeout,
read timeout) <timeouts>` tuple.
:type timeout: float or tuple or urllib3 Timeout object
:param verify: (optional) Either a boolean, in which case it controls whether
we verify the server's TLS certificate, or a string, in which case it
must be a path to a CA bundle to use
:param cert: (optional) Any user-provided SSL certificate to be trusted.
:param proxies: (optional) The proxies dictionary to apply to the request.
:rtype: requests.Response
"""
try:
conn = self.get_connection_with_tls_context(
request, verify, proxies=proxies, cert=cert
)
except LocationValueError as e:
raise InvalidURL(e, request=request)
self.cert_verify(conn, request.url, verify, cert)
url = self.request_url(request, proxies)
self.add_headers(
request,
stream=stream,
timeout=timeout,
verify=verify,
cert=cert,
proxies=proxies,
)
chunked = not (request.body is None or "Content-Length" in request.headers)
if isinstance(timeout, tuple):
try:
connect, read = timeout
timeout = TimeoutSauce(connect=connect, read=read)
except ValueError:
raise ValueError(
f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
f"or a single float to set both timeouts to the same value."
)
elif isinstance(timeout, TimeoutSauce):
pass
else:
timeout = TimeoutSauce(connect=timeout, read=timeout)
try:
> resp = conn.urlopen(
method=request.method,
url=url,
body=request.body,
headers=request.headers,
redirect=False,
assert_same_host=False,
preload_content=False,
decode_content=False,
retries=self.max_retries,
timeout=timeout,
chunked=chunked,
)
.venv/lib64/python3.11/site-packages/requests/adapters.py:667:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.venv/lib64/python3.11/site-packages/urllib3/connectionpool.py:802: in urlopen
retries = retries.increment(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Retry(total=0, connect=None, read=False, redirect=None, status=None), method = 'HEAD'
url = '/owid/owid-grapher/master/README.md', response = None
error = SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)'))
_pool = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7fa980407890>
_stacktrace = <traceback object at 0x7fa980404340>
def increment(
self,
method=None,
url=None,
response=None,
error=None,
_pool=None,
_stacktrace=None,
):
"""Return a new Retry object with incremented retry counters.
:param response: A response object, or None, if the server did not
return a response.
:type response: :class:`~urllib3.response.HTTPResponse`
:param Exception error: An error encountered during the request, or
None if the response was received successfully.
:return: A new ``Retry`` object.
"""
if self.total is False and error:
# Disabled, indicate to re-raise the error.
raise six.reraise(type(error), error, _stacktrace)
total = self.total
if total is not None:
total -= 1
connect = self.connect
read = self.read
redirect = self.redirect
status_count = self.status
other = self.other
cause = "unknown"
status = None
redirect_location = None
if error and self._is_connection_error(error):
# Connect retry?
if connect is False:
raise six.reraise(type(error), error, _stacktrace)
elif connect is not None:
connect -= 1
elif error and self._is_read_error(error):
# Read retry?
if read is False or not self._is_method_retryable(method):
raise six.reraise(type(error), error, _stacktrace)
elif read is not None:
read -= 1
elif error:
# Other retry?
if other is not None:
other -= 1
elif response and response.get_redirect_location():
# Redirect retry?
if redirect is not None:
redirect -= 1
cause = "too many redirects"
redirect_location = response.get_redirect_location()
status = response.status
else:
# Incrementing because of a server error like a 500 in
# status_forcelist and the given method is in the allowed_methods
cause = ResponseError.GENERIC_ERROR
if response and response.status:
if status_count is not None:
status_count -= 1
cause = ResponseError.SPECIFIC_ERROR.format(status_code=response.status)
status = response.status
history = self.history + (
RequestHistory(method, url, error, status, redirect_location),
)
new_retry = self.new(
total=total,
connect=connect,
read=read,
redirect=redirect,
status=status_count,
other=other,
history=history,
)
if new_retry.is_exhausted():
> raise MaxRetryError(_pool, url, error or ResponseError(cause))
E urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /owid/owid-grapher/master/README.md (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)')))
.venv/lib64/python3.11/site-packages/urllib3/util/retry.py:594: MaxRetryError
During handling of the above exception, another exception occurred:
def test_get_etag():
> etag = get_etag("https://raw.githubusercontent.com/owid/owid-grapher/master/README.md")
tests/test_steps.py:165:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
etl/steps/__init__.py:1085: in get_etag
resp = requests.head(url, verify=TLS_VERIFY)
.venv/lib64/python3.11/site-packages/requests/api.py:100: in head
return request("head", url, **kwargs)
.venv/lib64/python3.11/site-packages/requests/api.py:59: in request
return session.request(method=method, url=url, **kwargs)
.venv/lib64/python3.11/site-packages/requests/sessions.py:589: in request
resp = self.send(prep, **send_kwargs)
.venv/lib64/python3.11/site-packages/requests/sessions.py:703: in send
r = adapter.send(request, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <requests.adapters.HTTPAdapter object at 0x7fa98051fb90>, request = <PreparedRequest [HEAD]>, stream = False
timeout = Timeout(connect=None, read=None, total=None), verify = True, cert = None, proxies = OrderedDict()
def send(
self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
):
"""Sends PreparedRequest object. Returns Response object.
:param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
:param stream: (optional) Whether to stream the request content.
:param timeout: (optional) How long to wait for the server to send
data before giving up, as a float, or a :ref:`(connect timeout,
read timeout) <timeouts>` tuple.
:type timeout: float or tuple or urllib3 Timeout object
:param verify: (optional) Either a boolean, in which case it controls whether
we verify the server's TLS certificate, or a string, in which case it
must be a path to a CA bundle to use
:param cert: (optional) Any user-provided SSL certificate to be trusted.
:param proxies: (optional) The proxies dictionary to apply to the request.
:rtype: requests.Response
"""
try:
conn = self.get_connection_with_tls_context(
request, verify, proxies=proxies, cert=cert
)
except LocationValueError as e:
raise InvalidURL(e, request=request)
self.cert_verify(conn, request.url, verify, cert)
url = self.request_url(request, proxies)
self.add_headers(
request,
stream=stream,
timeout=timeout,
verify=verify,
cert=cert,
proxies=proxies,
)
chunked = not (request.body is None or "Content-Length" in request.headers)
if isinstance(timeout, tuple):
try:
connect, read = timeout
timeout = TimeoutSauce(connect=connect, read=read)
except ValueError:
raise ValueError(
f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
f"or a single float to set both timeouts to the same value."
)
elif isinstance(timeout, TimeoutSauce):
pass
else:
timeout = TimeoutSauce(connect=timeout, read=timeout)
try:
resp = conn.urlopen(
method=request.method,
url=url,
body=request.body,
headers=request.headers,
redirect=False,
assert_same_host=False,
preload_content=False,
decode_content=False,
retries=self.max_retries,
timeout=timeout,
chunked=chunked,
)
except (ProtocolError, OSError) as err:
raise ConnectionError(err, request=request)
except MaxRetryError as e:
if isinstance(e.reason, ConnectTimeoutError):
# TODO: Remove this in 3.0.0: see #2811
if not isinstance(e.reason, NewConnectionError):
raise ConnectTimeout(e, request=request)
if isinstance(e.reason, ResponseError):
raise RetryError(e, request=request)
if isinstance(e.reason, _ProxyError):
raise ProxyError(e, request=request)
if isinstance(e.reason, _SSLError):
# This branch is for urllib3 v1.22 and later.
> raise SSLError(e, request=request)
E requests.exceptions.SSLError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /owid/owid-grapher/master/README.md (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)')))
.venv/lib64/python3.11/site-packages/requests/adapters.py:698: SSLError
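This one looks like a local network/TLS hiccup rather than anything ruff-related: the HEAD request dies during the SSL handshake. Per the traceback, get_etag just issues requests.head on the raw GitHub URL, so it can be reproduced outside pytest with a minimal check like:

import requests

url = "https://raw.githubusercontent.com/owid/owid-grapher/master/README.md"
# Same request get_etag makes (etl/steps/__init__.py); if this also raises
# SSLEOFError, the failure is local connectivity, not the ruff upgrade.
resp = requests.head(url)
resp.raise_for_status()
print(resp.headers.get("ETag"))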
================================================== warnings summary ==================================================
.venv/lib64/python3.11/site-packages/rdata/parser/_parser.py:11
/home/lucas/repos/etl/.venv/lib64/python3.11/site-packages/rdata/parser/_parser.py:11: DeprecationWarning: 'xdrlib' is deprecated and slated for removal in Python 3.13
import xdrlib
tests/data_helpers/test_geo.py::TestHarmonizeCountries::test_one_country_harmonized_and_one_excluded
tests/data_helpers/test_geo.py::TestHarmonizeCountries::test_one_country_left_equal_one_harmonized_and_one_excluded
/home/lucas/repos/etl/lib/datautils/owid/datautils/common.py:41: UserWarning: Unknown country names in excluded countries file:
warnings.warn(warning_message)
tests/data_helpers/test_misc.py::test_expand_time_column_full_range_dimension
/home/lucas/repos/etl/etl/data_helpers/misc.py:223: DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
df = df.groupby(dimension_col).apply(_reindex_dates).reset_index(drop=True).set_index(index) # type: ignore
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_zero
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_interpolate_and_zero
/home/lucas/repos/etl/etl/data_helpers/misc.py:317: FutureWarning: DataFrameGroupBy.fillna is deprecated and will be removed in a future version. Use obj.ffill() or obj.bfill() for forward or backward filling instead. If you want to fill with a single value, use DataFrame.fillna instead
df[values_column] = df.groupby(dimension_col)[values_column].fillna(0)
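The replacement pandas recommends is straightforward; a sketch assuming the intent of the line above is filling gaps per dimension group:

import pandas as pd

df = pd.DataFrame({"dim": ["a", "a", "b"], "v": [1.0, None, None]})
# For a constant fill value the grouping makes no difference, so plain
# fillna on the column is enough:
df["v"] = df["v"].fillna(0)
# For forward/backward filling within each group, keep the groupby but
# call ffill()/bfill() directly, as the FutureWarning recommends:
df["v"] = df.groupby("dim")["v"].ffill()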
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
/home/lucas/repos/etl/tests/data_helpers/test_misc.py:308: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df_tag.loc[:, "expand"] = False
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_zero
/home/lucas/repos/etl/tests/data_helpers/test_misc.py:332: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df_tag.loc[:, "expand"] = False
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_ffill
/home/lucas/repos/etl/tests/data_helpers/test_misc.py:367: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df_tag.loc[:, "expand"] = False
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_bfill
/home/lucas/repos/etl/tests/data_helpers/test_misc.py:446: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df_tag.loc[:, "expand"] = False
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_interpolate
/home/lucas/repos/etl/tests/data_helpers/test_misc.py:525: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df_tag.loc[:, "expand"] = False
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_interpolate_and_zero
/home/lucas/repos/etl/tests/data_helpers/test_misc.py:604: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df_tag.loc[:, "expand"] = False
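The run of SettingWithCopyWarning entries above all comes from the same pattern in the tests: df_tag is a slice of another frame, and assigning into it is ambiguous. The usual fix is an explicit copy before mutating (hypothetical frame, same shape of operation):

import pandas as pd

df = pd.DataFrame({"tag": ["x", "y"], "expand": [True, True]})
# Slicing returns a view-like object; taking an explicit copy makes the
# assignment below unambiguous and silences the warning.
df_tag = df[df["tag"] == "x"].copy()
df_tag.loc[:, "expand"] = False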
tests/data_helpers/test_misc.py::test_expand_time_column_and_extra_years
tests/data_helpers/test_misc.py::test_expand_time_complex
/home/lucas/repos/etl/lib/catalog/owid/catalog/tables.py:403: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
super().__setitem__(key, value)
tests/data_helpers/test_misc.py::test_expand_time_complex
/home/lucas/repos/etl/lib/catalog/owid/catalog/tables.py:1069: FutureWarning: DataFrameGroupBy.fillna is deprecated and will be removed in a future version. Use obj.ffill() or obj.bfill() for forward or backward filling instead. If you want to fill with a single value, use Table.fillna instead
df = getattr(self.groupby, name)(*args, **kwargs)
tests/test_steps.py::test_data_step
tests/test_steps.py::test_data_step_becomes_dirty_when_pandas_version_changes
tests/test_steps.py::test_data_step_private
/home/lucas/repos/etl/lib/catalog/owid/catalog/datasets.py:191: UserWarning: Dataset test is missing namespace
warnings.warn(f"Dataset {self.metadata.short_name} is missing namespace")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================== short test summary info ===============================================
FAILED tests/test_datadiff.py::test_DatasetDiff_summary - owid.catalog.datasets.PrimaryKeyMissing: Table `tab` does not have a primary_key -- please use t.set_index([col, ...
FAILED tests/test_datadiff.py::test_new_data - owid.catalog.datasets.PrimaryKeyMissing: Table `tab` does not have a primary_key -- please use t.set_index([col, ...
FAILED tests/test_steps.py::test_get_etag - requests.exceptions.SSLError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceed...
=================== 3 failed, 163 passed, 1 skipped, 3 deselected, 22 warnings in 80.94s (0:01:20) ===================
make: *** [Makefile:65: unittest] Error 1
Force-pushed from b0ecfe7 to 583e9bd
Force-pushed from 583e9bd to 38d6493
Heads up that you can now schedule PRs using the command:
/schedule 2024-12-12
⌛ Merge Schedule
Update ruff to the latest version in etl and all libraries. We're currently on version 0.1.6, which is getting outdated. The new version has more linting rules, but most changes are just added/removed newlines at the top of modules.
How to review
Just run
make test
locally to confirm that it works on your machine.
TODO before merging
.git-blame-ignore-revs
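For context on that TODO: listing the reformatting-only commits in .git-blame-ignore-revs keeps them out of git blame. A sketch of the file (the hash is a placeholder for the actual reformatting commit, not a real value):

# .git-blame-ignore-revs
# 💄 Update ruff: reformatting only
<full-40-char-commit-hash>

Locally, git honors the file via git config blame.ignoreRevsFile .git-blame-ignore-revs (or per-invocation with git blame --ignore-revs-file); GitHub's blame view picks it up automatically when the file sits at the repo root.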