
💄 Update ruff #3707

Merged: 2 commits merged into master from update-ruff on Dec 12, 2024

Conversation

@Marigold (Collaborator) commented on Dec 9, 2024

/schedule 2024-12-12

Update ruff to the latest version in etl and all libraries. We're currently on version 0.1.6, which is getting outdated. The new version has more linting rules, but most changes are just added or removed newlines at the top of modules.
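
For anyone reproducing the bump locally, a rough sketch of the steps involved (an assumption on my part: that ruff is pinned in pyproject.toml and installed via uv, as the install log further down suggests):

uv sync             # install the new pin (ruff 0.1.6 -> 0.8.2)
ruff format .       # reformat everything under the new version's rules
ruff check . --fix  # apply any newly enabled, auto-fixable lint rules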

How to review

Just run make test locally to confirm that it works on your machine.

TODO before merging

  • Add this commit to .git-blame-ignore-revs (see the sketch below)
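
For context, a minimal sketch of how that mechanism works (the hash is a placeholder for the actual reformatting commit):

echo "<full-sha-of-reformatting-commit>" >> .git-blame-ignore-revs  # record the commit to skip
git config blame.ignoreRevsFile .git-blame-ignore-revs              # make local `git blame` honor the file
# GitHub's blame view picks up .git-blame-ignore-revs by name automatically.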

@owidbot (Contributor) commented on Dec 9, 2024

Quick links (staging server):

Site Dev · Site Preview · Admin · Wizard · Docs

Login: ssh owid@staging-site-update-ruff

chart-diff: ✅ No charts for review.

Edited: 2024-12-09 08:20:09 UTC
Execution time: 3.95 seconds

@Marigold Marigold marked this pull request as ready for review December 9, 2024 08:24
@Marigold Marigold force-pushed the update-ruff branch 3 times, most recently from bd5ae3e to 9942bd3, on December 9, 2024 11:27
@Marigold Marigold added the staging-bake Full grapher bake on the staging server label Dec 9, 2024
@Marigold Marigold force-pushed the update-ruff branch 2 times, most recently from 6adab7b to 9a3cdc7, on December 10, 2024 08:33
@Marigold Marigold requested a review from lucasrodes December 10, 2024 08:34
@lucasrodes (Member) commented

I would say that it works, but I get some errors. I think some of these are the classic errors I always see when running make test, so I'm not sure they are relevant here. Leaving the complete traceback in case it is helpful (a quick triage sketch follows it).

Heads up that you can now schedule PRs using the /schedule command in your PR description (see the guide here). Fiona and I have been using it and it works well.

 make test
==> Installing packages
Resolved 393 packages in 2ms
   Built etl @ file:///home/lucas/repos/etl
   Built owid-catalog @ file:///home/lucas/repos/etl/lib/catalog
   Built owid-datautils @ file:///home/lucas/repos/etl/lib/datautils
   Built owid-repack @ file:///home/lucas/repos/etl/lib/repack
   Built walden @ file:///home/lucas/repos/etl/lib/walden
Prepared 6 packages in 1.05s
Uninstalled 7 packages in 12ms
Installed 6 packages in 2ms
 ~ etl==0.1.0 (from file:///home/lucas/repos/etl)
 ~ owid-catalog==0.3.11 (from file:///home/lucas/repos/etl/lib/catalog)
 ~ owid-datautils==0.5.3 (from file:///home/lucas/repos/etl/lib/datautils)
 ~ owid-repack==0.1.4 (from file:///home/lucas/repos/etl/lib/repack)
 - pyreadr==0.5.2
 - ruff==0.1.6
 + ruff==0.8.2
 ~ walden==0.1.1 (from file:///home/lucas/repos/etl/lib/walden)
==> Checking formatting
3816 files already formatted
==> Checking linting
All checks passed!
==> Checking types
. .venv/bin/activate && .venv/bin/pyright etl snapshots apps api tests docs
WARNING: there is a new pyright version available (v1.1.373 -> v1.1.390).
Please install the new version or set PYRIGHT_PYTHON_FORCE_VERSION to `latest`

0 errors, 0 warnings, 0 informations 
==> Running unit tests
.venv/bin/pytest -m "not integration" tests
================================================ test session starts =================================================
platform linux -- Python 3.11.10, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/lucas/repos/etl
configfile: pyproject.toml
plugins: typeguard-4.3.0, anyio-4.4.0, hydra-core-1.3.2, Faker-28.4.1
collected 170 items / 3 deselected / 167 selected                                                                    

tests/apps/wizard/pages/expert/test_prompts.py .                                                               [  0%]
tests/apps/wizard/utils/test_utils.py .                                                                        [  1%]
tests/backport/datasync/test_data_metadata.py ......                                                           [  4%]
tests/data_helpers/test_geo.py .........................................................                       [ 38%]
tests/data_helpers/test_misc.py ..................                                                             [ 49%]
tests/test_command.py ...                                                                                      [ 51%]
tests/test_config.py ....                                                                                      [ 53%]
tests/test_converters.py .                                                                                     [ 54%]
tests/test_datadiff.py FF                                                                                      [ 55%]
tests/test_etl.py .....                                                                                        [ 58%]
tests/test_etl_step_code.py .                                                                                  [ 59%]
tests/test_files.py ......                                                                                     [ 62%]
tests/test_grapher_helpers.py ..........                                                                       [ 68%]
tests/test_grapher_import.py ..                                                                                [ 70%]
tests/test_grapher_model.py ..                                                                                 [ 71%]
tests/test_helpers.py ....................                                                                     [ 83%]
tests/test_metadata_schemas.py ..                                                                              [ 84%]
tests/test_prune.py s                                                                                          [ 85%]
tests/test_snapshot.py ..                                                                                      [ 86%]
tests/test_steps.py .......F.                                                                                  [ 91%]
tests/test_tempcompare.py ....                                                                                 [ 94%]
tests/test_version_tracker.py ..........                                                                       [100%]

====================================================== FAILURES ======================================================
______________________________________________ test_DatasetDiff_summary ______________________________________________

tmp_path = PosixPath('/tmp/pytest-of-lucas/pytest-6/test_DatasetDiff_summary0')

    def test_DatasetDiff_summary(tmp_path):
        ds_a, ds_b = _create_datasets(tmp_path)
    
        tab_a = Table(pd.DataFrame({"a": [1, 2]}), short_name="tab")
        tab_a.metadata.description = "tab"
    
        tab_b = Table(pd.DataFrame({"a": [1, 3], "b": ["a", "b"]}), short_name="tab")
        tab_b["a"].metadata.description = "col a"
    
>       ds_a.add(tab_a)

tests/test_datadiff.py:31: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Dataset(path='/tmp/pytest-of-lucas/pytest-6/test_DatasetDiff_summary0/catalog_a/ds', metadata=DatasetMeta(channel='gar...blic=True, additional_info=None, version='v', update_period_days=None, non_redistributable=False, source_checksum='1'))
table =    a
0  1
1  2, formats = ['feather'], repack = True

    def add(
        self,
        table: tables.Table,
        formats: List[FileFormat] = DEFAULT_FORMATS,
        repack: bool = True,
    ) -> None:
        """
        Add this table to the dataset by saving it in the dataset's folder. By default we
        save in multiple formats, but if you need a specific one (e.g. CSV for explorers)
        you can specify it.
    
        :param repack: if True, try to cast column types to the smallest possible type (e.g. float64 -> float32)
            to reduce binary file size. Consider using False when your dataframe is large and the repack is failing.
        """
    
        utils.validate_underscore(table.metadata.short_name, "Table's short_name")
        for col in list(table.columns) + list(table.index.names):
            utils.validate_underscore(col, "Variable's name")
    
        if not table.primary_key:
            if "OWID_STRICT" in environ:
>               raise PrimaryKeyMissing(
                    f"Table `{table.metadata.short_name}` does not have a primary_key -- please use t.set_index([col, ...], verify_integrity=True) to indicate dimensions before saving"
                )
E               owid.catalog.datasets.PrimaryKeyMissing: Table `tab` does not have a primary_key -- please use t.set_index([col, ...], verify_integrity=True) to indicate dimensions before saving

lib/catalog/owid/catalog/datasets.py:123: PrimaryKeyMissing
___________________________________________________ test_new_data ____________________________________________________

tmp_path = PosixPath('/tmp/pytest-of-lucas/pytest-6/test_new_data0')

    def test_new_data(tmp_path):
        ds_a, ds_b = _create_datasets(tmp_path)
    
        tab_a = Table({"country": ["UK", "US"], "a": [1, 3]}, short_name="tab")
        tab_b = Table({"country": ["UK", "US", "FR"], "a": [1, 2, 3]}, short_name="tab")
    
>       ds_a.add(tab_a)

tests/test_datadiff.py:52: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Dataset(path='/tmp/pytest-of-lucas/pytest-6/test_new_data0/catalog_a/ds', metadata=DatasetMeta(channel='garden', names...blic=True, additional_info=None, version='v', update_period_days=None, non_redistributable=False, source_checksum='1'))
table =   country  a
0      UK  1
1      US  3, formats = ['feather'], repack = True

    def add(
        self,
        table: tables.Table,
        formats: List[FileFormat] = DEFAULT_FORMATS,
        repack: bool = True,
    ) -> None:
        """
        Add this table to the dataset by saving it in the dataset's folder. By default we
        save in multiple formats, but if you need a specific one (e.g. CSV for explorers)
        you can specify it.
    
        :param repack: if True, try to cast column types to the smallest possible type (e.g. float64 -> float32)
            to reduce binary file size. Consider using False when your dataframe is large and the repack is failing.
        """
    
        utils.validate_underscore(table.metadata.short_name, "Table's short_name")
        for col in list(table.columns) + list(table.index.names):
            utils.validate_underscore(col, "Variable's name")
    
        if not table.primary_key:
            if "OWID_STRICT" in environ:
>               raise PrimaryKeyMissing(
                    f"Table `{table.metadata.short_name}` does not have a primary_key -- please use t.set_index([col, ...], verify_integrity=True) to indicate dimensions before saving"
                )
E               owid.catalog.datasets.PrimaryKeyMissing: Table `tab` does not have a primary_key -- please use t.set_index([col, ...], verify_integrity=True) to indicate dimensions before saving

lib/catalog/owid/catalog/datasets.py:123: PrimaryKeyMissing
___________________________________________________ test_get_etag ____________________________________________________

self = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7fa980407890>, method = 'HEAD'
url = '/owid/owid-grapher/master/README.md', body = None
headers = {'User-Agent': 'python-requests/2.32.3', 'Accept-Encoding': 'gzip, deflate, br', 'Accept': '*/*', 'Connection': 'keep-alive'}
retries = Retry(total=0, connect=None, read=False, redirect=None, status=None), redirect = False
assert_same_host = False, timeout = Timeout(connect=None, read=None, total=None), pool_timeout = None
release_conn = False, chunked = False, body_pos = None
response_kw = {'decode_content': False, 'preload_content': False}
parsed_url = Url(scheme=None, auth=None, host=None, port=None, path='/owid/owid-grapher/master/README.md', query=None, fragment=None)
destination_scheme = None, conn = None, release_this_conn = True, http_tunnel_required = False, err = None
clean_exit = False

    def urlopen(
        self,
        method,
        url,
        body=None,
        headers=None,
        retries=None,
        redirect=True,
        assert_same_host=True,
        timeout=_Default,
        pool_timeout=None,
        release_conn=None,
        chunked=False,
        body_pos=None,
        **response_kw
    ):
        """
        Get a connection from the pool and perform an HTTP request. This is the
        lowest level call for making a request, so you'll need to specify all
        the raw details.
    
        .. note::
    
           More commonly, it's appropriate to use a convenience method provided
           by :class:`.RequestMethods`, such as :meth:`request`.
    
        .. note::
    
           `release_conn` will only behave as expected if
           `preload_content=False` because we want to make
           `preload_content=False` the default behaviour someday soon without
           breaking backwards compatibility.
    
        :param method:
            HTTP request method (such as GET, POST, PUT, etc.)
    
        :param url:
            The URL to perform the request on.
    
        :param body:
            Data to send in the request body, either :class:`str`, :class:`bytes`,
            an iterable of :class:`str`/:class:`bytes`, or a file-like object.
    
        :param headers:
            Dictionary of custom headers to send, such as User-Agent,
            If-None-Match, etc. If None, pool headers are used. If provided,
            these headers completely replace any pool-specific headers.
    
        :param retries:
            Configure the number of retries to allow before raising a
            :class:`~urllib3.exceptions.MaxRetryError` exception.
    
            Pass ``None`` to retry until you receive a response. Pass a
            :class:`~urllib3.util.retry.Retry` object for fine-grained control
            over different types of retries.
            Pass an integer number to retry connection errors that many times,
            but no other types of errors. Pass zero to never retry.
    
            If ``False``, then retries are disabled and any exception is raised
            immediately. Also, instead of raising a MaxRetryError on redirects,
            the redirect response will be returned.
    
        :type retries: :class:`~urllib3.util.retry.Retry`, False, or an int.
    
        :param redirect:
            If True, automatically handle redirects (status codes 301, 302,
            303, 307, 308). Each redirect counts as a retry. Disabling retries
            will disable redirect, too.
    
        :param assert_same_host:
            If ``True``, will make sure that the host of the pool requests is
            consistent else will raise HostChangedError. When ``False``, you can
            use the pool on an HTTP proxy and request foreign hosts.
    
        :param timeout:
            If specified, overrides the default timeout for this one
            request. It may be a float (in seconds) or an instance of
            :class:`urllib3.util.Timeout`.
    
        :param pool_timeout:
            If set and the pool is set to block=True, then this method will
            block for ``pool_timeout`` seconds and raise EmptyPoolError if no
            connection is available within the time period.
    
        :param release_conn:
            If False, then the urlopen call will not release the connection
            back into the pool once a response is received (but will release if
            you read the entire contents of the response such as when
            `preload_content=True`). This is useful if you're not preloading
            the response's content immediately. You will need to call
            ``r.release_conn()`` on the response ``r`` to return the connection
            back into the pool. If None, it takes the value of
            ``response_kw.get('preload_content', True)``.
    
        :param chunked:
            If True, urllib3 will send the body using chunked transfer
            encoding. Otherwise, urllib3 will send the body using the standard
            content-length form. Defaults to False.
    
        :param int body_pos:
            Position to seek to in file-like body in the event of a retry or
            redirect. Typically this won't need to be set because urllib3 will
            auto-populate the value when needed.
    
        :param \\**response_kw:
            Additional parameters are passed to
            :meth:`urllib3.response.HTTPResponse.from_httplib`
        """
    
        parsed_url = parse_url(url)
        destination_scheme = parsed_url.scheme
    
        if headers is None:
            headers = self.headers
    
        if not isinstance(retries, Retry):
            retries = Retry.from_int(retries, redirect=redirect, default=self.retries)
    
        if release_conn is None:
            release_conn = response_kw.get("preload_content", True)
    
        # Check host
        if assert_same_host and not self.is_same_host(url):
            raise HostChangedError(self, url, retries)
    
        # Ensure that the URL we're connecting to is properly encoded
        if url.startswith("/"):
            url = six.ensure_str(_encode_target(url))
        else:
            url = six.ensure_str(parsed_url.url)
    
        conn = None
    
        # Track whether `conn` needs to be released before
        # returning/raising/recursing. Update this variable if necessary, and
        # leave `release_conn` constant throughout the function. That way, if
        # the function recurses, the original value of `release_conn` will be
        # passed down into the recursive call, and its value will be respected.
        #
        # See issue #651 [1] for details.
        #
        # [1] <https://github.com/urllib3/urllib3/issues/651>
        release_this_conn = release_conn
    
        http_tunnel_required = connection_requires_http_tunnel(
            self.proxy, self.proxy_config, destination_scheme
        )
    
        # Merge the proxy headers. Only done when not using HTTP CONNECT. We
        # have to copy the headers dict so we can safely change it without those
        # changes being reflected in anyone else's copy.
        if not http_tunnel_required:
            headers = headers.copy()
            headers.update(self.proxy_headers)
    
        # Must keep the exception bound to a separate variable or else Python 3
        # complains about UnboundLocalError.
        err = None
    
        # Keep track of whether we cleanly exited the except block. This
        # ensures we do proper cleanup in finally.
        clean_exit = False
    
        # Rewind body position, if needed. Record current position
        # for future rewinds in the event of a redirect/retry.
        body_pos = set_file_position(body, body_pos)
    
        try:
            # Request a connection from the queue.
            timeout_obj = self._get_timeout(timeout)
            conn = self._get_conn(timeout=pool_timeout)
    
            conn.timeout = timeout_obj.connect_timeout
    
            is_new_proxy_conn = self.proxy is not None and not getattr(
                conn, "sock", None
            )
            if is_new_proxy_conn and http_tunnel_required:
                self._prepare_proxy(conn)
    
            # Make the request on the httplib connection object.
>           httplib_response = self._make_request(
                conn,
                method,
                url,
                timeout=timeout_obj,
                body=body,
                headers=headers,
                chunked=chunked,
            )

.venv/lib64/python3.11/site-packages/urllib3/connectionpool.py:716: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib64/python3.11/site-packages/urllib3/connectionpool.py:404: in _make_request
    self._validate_conn(conn)
.venv/lib64/python3.11/site-packages/urllib3/connectionpool.py:1061: in _validate_conn
    conn.connect()
.venv/lib64/python3.11/site-packages/urllib3/connection.py:419: in connect
    self.sock = ssl_wrap_socket(
.venv/lib64/python3.11/site-packages/urllib3/util/ssl_.py:458: in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
.venv/lib64/python3.11/site-packages/urllib3/util/ssl_.py:502: in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
/usr/lib64/python3.11/ssl.py:517: in wrap_socket
    return self.sslsocket_class._create(
/usr/lib64/python3.11/ssl.py:1104: in _create
    self.do_handshake()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ssl.SSLSocket [closed] fd=-1, family=2, type=1, proto=6>, block = False

    @_sslcopydoc
    def do_handshake(self, block=False):
        self._check_connected()
        timeout = self.gettimeout()
        try:
            if timeout == 0.0 and block:
                self.settimeout(None)
>           self._sslobj.do_handshake()
E           ssl.SSLEOFError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)

/usr/lib64/python3.11/ssl.py:1382: SSLEOFError

During handling of the above exception, another exception occurred:

self = <requests.adapters.HTTPAdapter object at 0x7fa98051fb90>, request = <PreparedRequest [HEAD]>, stream = False
timeout = Timeout(connect=None, read=None, total=None), verify = True, cert = None, proxies = OrderedDict()

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.
    
        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """
    
        try:
            conn = self.get_connection_with_tls_context(
                request, verify, proxies=proxies, cert=cert
            )
        except LocationValueError as e:
            raise InvalidURL(e, request=request)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )
    
        chunked = not (request.body is None or "Content-Length" in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
>           resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )

.venv/lib64/python3.11/site-packages/requests/adapters.py:667: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib64/python3.11/site-packages/urllib3/connectionpool.py:802: in urlopen
    retries = retries.increment(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Retry(total=0, connect=None, read=False, redirect=None, status=None), method = 'HEAD'
url = '/owid/owid-grapher/master/README.md', response = None
error = SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)'))
_pool = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7fa980407890>
_stacktrace = <traceback object at 0x7fa980404340>

    def increment(
        self,
        method=None,
        url=None,
        response=None,
        error=None,
        _pool=None,
        _stacktrace=None,
    ):
        """Return a new Retry object with incremented retry counters.
    
        :param response: A response object, or None, if the server did not
            return a response.
        :type response: :class:`~urllib3.response.HTTPResponse`
        :param Exception error: An error encountered during the request, or
            None if the response was received successfully.
    
        :return: A new ``Retry`` object.
        """
        if self.total is False and error:
            # Disabled, indicate to re-raise the error.
            raise six.reraise(type(error), error, _stacktrace)
    
        total = self.total
        if total is not None:
            total -= 1
    
        connect = self.connect
        read = self.read
        redirect = self.redirect
        status_count = self.status
        other = self.other
        cause = "unknown"
        status = None
        redirect_location = None
    
        if error and self._is_connection_error(error):
            # Connect retry?
            if connect is False:
                raise six.reraise(type(error), error, _stacktrace)
            elif connect is not None:
                connect -= 1
    
        elif error and self._is_read_error(error):
            # Read retry?
            if read is False or not self._is_method_retryable(method):
                raise six.reraise(type(error), error, _stacktrace)
            elif read is not None:
                read -= 1
    
        elif error:
            # Other retry?
            if other is not None:
                other -= 1
    
        elif response and response.get_redirect_location():
            # Redirect retry?
            if redirect is not None:
                redirect -= 1
            cause = "too many redirects"
            redirect_location = response.get_redirect_location()
            status = response.status
    
        else:
            # Incrementing because of a server error like a 500 in
            # status_forcelist and the given method is in the allowed_methods
            cause = ResponseError.GENERIC_ERROR
            if response and response.status:
                if status_count is not None:
                    status_count -= 1
                cause = ResponseError.SPECIFIC_ERROR.format(status_code=response.status)
                status = response.status
    
        history = self.history + (
            RequestHistory(method, url, error, status, redirect_location),
        )
    
        new_retry = self.new(
            total=total,
            connect=connect,
            read=read,
            redirect=redirect,
            status=status_count,
            other=other,
            history=history,
        )
    
        if new_retry.is_exhausted():
>           raise MaxRetryError(_pool, url, error or ResponseError(cause))
E           urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /owid/owid-grapher/master/README.md (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)')))

.venv/lib64/python3.11/site-packages/urllib3/util/retry.py:594: MaxRetryError

During handling of the above exception, another exception occurred:

    def test_get_etag():
>       etag = get_etag("https://raw.githubusercontent.com/owid/owid-grapher/master/README.md")

tests/test_steps.py:165: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
etl/steps/__init__.py:1085: in get_etag
    resp = requests.head(url, verify=TLS_VERIFY)
.venv/lib64/python3.11/site-packages/requests/api.py:100: in head
    return request("head", url, **kwargs)
.venv/lib64/python3.11/site-packages/requests/api.py:59: in request
    return session.request(method=method, url=url, **kwargs)
.venv/lib64/python3.11/site-packages/requests/sessions.py:589: in request
    resp = self.send(prep, **send_kwargs)
.venv/lib64/python3.11/site-packages/requests/sessions.py:703: in send
    r = adapter.send(request, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <requests.adapters.HTTPAdapter object at 0x7fa98051fb90>, request = <PreparedRequest [HEAD]>, stream = False
timeout = Timeout(connect=None, read=None, total=None), verify = True, cert = None, proxies = OrderedDict()

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.
    
        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """
    
        try:
            conn = self.get_connection_with_tls_context(
                request, verify, proxies=proxies, cert=cert
            )
        except LocationValueError as e:
            raise InvalidURL(e, request=request)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )
    
        chunked = not (request.body is None or "Content-Length" in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )
    
        except (ProtocolError, OSError) as err:
            raise ConnectionError(err, request=request)
    
        except MaxRetryError as e:
            if isinstance(e.reason, ConnectTimeoutError):
                # TODO: Remove this in 3.0.0: see #2811
                if not isinstance(e.reason, NewConnectionError):
                    raise ConnectTimeout(e, request=request)
    
            if isinstance(e.reason, ResponseError):
                raise RetryError(e, request=request)
    
            if isinstance(e.reason, _ProxyError):
                raise ProxyError(e, request=request)
    
            if isinstance(e.reason, _SSLError):
                # This branch is for urllib3 v1.22 and later.
>               raise SSLError(e, request=request)
E               requests.exceptions.SSLError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /owid/owid-grapher/master/README.md (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)')))

.venv/lib64/python3.11/site-packages/requests/adapters.py:698: SSLError
================================================== warnings summary ==================================================
.venv/lib64/python3.11/site-packages/rdata/parser/_parser.py:11
  /home/lucas/repos/etl/.venv/lib64/python3.11/site-packages/rdata/parser/_parser.py:11: DeprecationWarning: 'xdrlib' is deprecated and slated for removal in Python 3.13
    import xdrlib

tests/data_helpers/test_geo.py::TestHarmonizeCountries::test_one_country_harmonized_and_one_excluded
tests/data_helpers/test_geo.py::TestHarmonizeCountries::test_one_country_left_equal_one_harmonized_and_one_excluded
  /home/lucas/repos/etl/lib/datautils/owid/datautils/common.py:41: UserWarning: Unknown country names in excluded countries file:
    warnings.warn(warning_message)

tests/data_helpers/test_misc.py::test_expand_time_column_full_range_dimension
  /home/lucas/repos/etl/etl/data_helpers/misc.py:223: DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
    df = df.groupby(dimension_col).apply(_reindex_dates).reset_index(drop=True).set_index(index)  # type: ignore

tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_zero
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_interpolate_and_zero
  /home/lucas/repos/etl/etl/data_helpers/misc.py:317: FutureWarning: DataFrameGroupBy.fillna is deprecated and will be removed in a future version. Use obj.ffill() or obj.bfill() for forward or backward filling instead. If you want to fill with a single value, use DataFrame.fillna instead
    df[values_column] = df.groupby(dimension_col)[values_column].fillna(0)

tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
tests/data_helpers/test_misc.py::test_expand_time_column_fillna_basic
  /home/lucas/repos/etl/tests/data_helpers/test_misc.py:308: SettingWithCopyWarning: 
  A value is trying to be set on a copy of a slice from a DataFrame.
  Try using .loc[row_indexer,col_indexer] = value instead
  
  See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    df_tag.loc[:, "expand"] = False

tests/data_helpers/test_misc.py::test_expand_time_column_fillna_zero
  /home/lucas/repos/etl/tests/data_helpers/test_misc.py:332: SettingWithCopyWarning: 
  A value is trying to be set on a copy of a slice from a DataFrame.
  Try using .loc[row_indexer,col_indexer] = value instead
  
  See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    df_tag.loc[:, "expand"] = False

tests/data_helpers/test_misc.py::test_expand_time_column_fillna_ffill
  /home/lucas/repos/etl/tests/data_helpers/test_misc.py:367: SettingWithCopyWarning: 
  A value is trying to be set on a copy of a slice from a DataFrame.
  Try using .loc[row_indexer,col_indexer] = value instead
  
  See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    df_tag.loc[:, "expand"] = False

tests/data_helpers/test_misc.py::test_expand_time_column_fillna_bfill
  /home/lucas/repos/etl/tests/data_helpers/test_misc.py:446: SettingWithCopyWarning: 
  A value is trying to be set on a copy of a slice from a DataFrame.
  Try using .loc[row_indexer,col_indexer] = value instead
  
  See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    df_tag.loc[:, "expand"] = False

tests/data_helpers/test_misc.py::test_expand_time_column_fillna_interpolate
  /home/lucas/repos/etl/tests/data_helpers/test_misc.py:525: SettingWithCopyWarning: 
  A value is trying to be set on a copy of a slice from a DataFrame.
  Try using .loc[row_indexer,col_indexer] = value instead
  
  See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    df_tag.loc[:, "expand"] = False

tests/data_helpers/test_misc.py::test_expand_time_column_fillna_interpolate_and_zero
  /home/lucas/repos/etl/tests/data_helpers/test_misc.py:604: SettingWithCopyWarning: 
  A value is trying to be set on a copy of a slice from a DataFrame.
  Try using .loc[row_indexer,col_indexer] = value instead
  
  See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    df_tag.loc[:, "expand"] = False

tests/data_helpers/test_misc.py::test_expand_time_column_and_extra_years
tests/data_helpers/test_misc.py::test_expand_time_complex
  /home/lucas/repos/etl/lib/catalog/owid/catalog/tables.py:403: SettingWithCopyWarning: 
  A value is trying to be set on a copy of a slice from a DataFrame.
  Try using .loc[row_indexer,col_indexer] = value instead
  
  See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    super().__setitem__(key, value)

tests/data_helpers/test_misc.py::test_expand_time_complex
  /home/lucas/repos/etl/lib/catalog/owid/catalog/tables.py:1069: FutureWarning: DataFrameGroupBy.fillna is deprecated and will be removed in a future version. Use obj.ffill() or obj.bfill() for forward or backward filling instead. If you want to fill with a single value, use Table.fillna instead
    df = getattr(self.groupby, name)(*args, **kwargs)

tests/test_steps.py::test_data_step
tests/test_steps.py::test_data_step_becomes_dirty_when_pandas_version_changes
tests/test_steps.py::test_data_step_private
  /home/lucas/repos/etl/lib/catalog/owid/catalog/datasets.py:191: UserWarning: Dataset test is missing namespace
    warnings.warn(f"Dataset {self.metadata.short_name} is missing namespace")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================== short test summary info ===============================================
FAILED tests/test_datadiff.py::test_DatasetDiff_summary - owid.catalog.datasets.PrimaryKeyMissing: Table `tab` does not have a primary_key -- please use t.set_index([col, ...
FAILED tests/test_datadiff.py::test_new_data - owid.catalog.datasets.PrimaryKeyMissing: Table `tab` does not have a primary_key -- please use t.set_index([col, ...
FAILED tests/test_steps.py::test_get_etag - requests.exceptions.SSLError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceed...
=================== 3 failed, 163 passed, 1 skipped, 3 deselected, 22 warnings in 80.94s (0:01:20) ===================
make: *** [Makefile:65: unittest] Error 1
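
A quick triage sketch for the three failures, assuming OWID_STRICT is set in the local shell (the PrimaryKeyMissing branch in the traceback above only fires when it is) and that test_get_etag hit a transient SSL/network error rather than anything ruff-related:

env -u OWID_STRICT .venv/bin/pytest tests/test_datadiff.py   # rerun the datadiff tests without the strict flag
.venv/bin/pytest tests/test_steps.py::test_get_etag          # retry once the network is stable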

@github-actions bot commented on Dec 11, 2024

Merge Schedule
Scheduled to be merged on 2024-12-12 00:00:00 (UTC)

@Marigold Marigold merged commit 4acec4a into master Dec 12, 2024
9 of 10 checks passed
@Marigold Marigold deleted the update-ruff branch December 12, 2024 07:49