Merge pull request #217 from khaeru/enh/testing
Improve test suite, utilities
khaeru authored Jan 7, 2025
2 parents 9698161 + 3fea832 commit fc68ccf
Showing 20 changed files with 876 additions and 549 deletions.
27 changes: 9 additions & 18 deletions .github/workflows/pytest.yaml
@@ -38,35 +38,26 @@ jobs:
steps:
- uses: actions/checkout@v4

- name: Checkout test data
uses: actions/checkout@v4
- uses: astral-sh/setup-uv@v5
with:
repository: khaeru/sdmx-test-data
path: sdmx-test-data

- uses: astral-sh/setup-uv@v4
with:
enable-cache: true
cache-dependency-glob: "**/pyproject.toml"
python-version: ${{ matrix.python-version }}

- name: Install Python, the package, and dependencies
run: |
uv venv --python=${{ matrix.python-version }}
uv pip install .[tests]
run: uv pip install .[tests]

- name: Run pytest
env:
SDMX_TEST_DATA: ./sdmx-test-data/
run: |
uv run --no-sync \
pytest \
-ra --color=yes --verbose \
--sdmx-fetch-data \
--cov-report=xml \
--numprocesses auto
shell: bash

- name: Upload test coverage to Codecov.io
uses: codecov/codecov-action@v4
uses: codecov/codecov-action@v5
with: { token: "${{ secrets.CODECOV_TOKEN }}" }

pre-commit:
@@ -76,16 +67,16 @@ jobs:

steps:
- uses: actions/checkout@v4
- uses: astral-sh/setup-uv@v4
- uses: astral-sh/setup-uv@v5
with:
enable-cache: true
cache-dependency-glob: "**/pyproject.toml"
# TEMPORARY Use Python 3.12 to avoid https://github.com/python/mypy/issues/18216
python-version: "3.12"
- uses: actions/cache@v4
with:
path: ~/.cache/pre-commit
key: ${{ github.job }}|${{ hashFiles('.pre-commit-config.yaml') }}
lookup-only: ${{ github.event_name == 'schedule' }}
# lookup-only: true
- name: Run pre-commit
# TEMPORARY Use Python 3.12 to avoid https://github.com/python/mypy/issues/18216
run: uvx --python=3.12 pre-commit run --all-files --show-diff-on-failure --color=always
run: uvx pre-commit run --all-files --show-diff-on-failure --color=always
3 changes: 2 additions & 1 deletion .pre-commit-config.yaml
@@ -4,11 +4,12 @@ repos:
hooks:
- id: mypy
additional_dependencies:
+ - GitPython
  - lxml-stubs
  - pandas-stubs
  - pytest
  - requests-cache
- - requests-mock
+ - responses
  - types-Jinja2
  - types-python-dateutil
  - types-PyYAML
5 changes: 5 additions & 0 deletions doc/conf.py
@@ -83,11 +83,16 @@

# -- Options for sphinx.ext.linkcode ---------------------------------------------------

LINKCODE_ALIAS = {
    "sdmx/testing": "sdmx/testing/__init__",
}


def linkcode_resolve(domain, info):
    if domain != "py" or not info["module"]:
        return None
    filename = info["module"].replace(".", "/")
    filename = LINKCODE_ALIAS.get(filename, filename)
    return f"https://github.com/khaeru/sdmx/tree/main/{filename}.py"
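
To illustrate what the alias changes, here is a minimal, self-contained sketch that copies the two definitions from this hunk and resolves one package and one ordinary module. The module names passed in are only illustrative inputs, and nothing here invokes Sphinx itself.

```python
# Copy of the mapping and resolver from doc/conf.py in this commit.
LINKCODE_ALIAS = {
    "sdmx/testing": "sdmx/testing/__init__",
}


def linkcode_resolve(domain, info):
    if domain != "py" or not info["module"]:
        return None
    filename = info["module"].replace(".", "/")
    filename = LINKCODE_ALIAS.get(filename, filename)
    return f"https://github.com/khaeru/sdmx/tree/main/{filename}.py"


# A package listed in the alias table resolves to its __init__.py.
package_url = linkcode_resolve("py", {"module": "sdmx.testing"})
# A module that is not in the alias table keeps its path unchanged.
module_url = linkcode_resolve("py", {"module": "sdmx.client"})
# Non-Python domains are skipped.
skipped = linkcode_resolve("c", {"module": "sdmx.testing"})

print(package_url)
print(module_url)
print(skipped)
```

Without the alias, the first call would point at `sdmx/testing.py`, which does not match the `sdmx/testing/__init__` path that the mapping names.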


105 changes: 85 additions & 20 deletions doc/dev.rst
@@ -38,28 +38,69 @@ Code style

.. _testing:

Test specimens
==============
Testing
=======

Specimens and data
------------------

.. versionadded:: 2.0

A variety of *specimens*—example files from real web services, or published with the standards—are used to test that :mod:`sdmx` correctly reads and writes the different SDMX message formats.
Since v2.0, specimens are stored in the separate `sdmx-test-data <https://github.com/khaeru/sdmx-test-data>`_ repository.

Specimens are stored in the separate `sdmx-test-data <https://github.com/khaeru/sdmx-test-data>`_ repository.

Running the test suite requires these files.
To retrieve them, use one of the following methods:
The simplest way to do this is to give the :program:`--sdmx-fetch-data` option when invoking :program:`pytest`::

$ pytest --sdmx-fetch-data

This invokes :meth:`SpecimenCollection.fetch`, which uses :program:`git` (via `GitPython <https://gitpython.readthedocs.io>`_) to retrieve and unpack the files to a directory like :file:`$HOME/.cache/sdmx/test-data/`.
:ref:`See below <sdmx-test-data>` for more advanced options.

Contents and layout
~~~~~~~~~~~~~~~~~~~

**Specimen files** are:

- Arranged in directories with names matching particular sources in :file:`sources.json`.
- Named with:

- Certain keywords:

- ``-structure``: a structure message, often associated with a file with a similar name containing a data message.
- ``ts``: time-series data, i.e. with a TimeDimensions at the level of individual Observations.
- ``xs``: cross-sectional data arranged in other ways.
- ``flat``: flat DataSets with all Dimensions at the Observation level.
- ``ss``: structure-specific data messages.

- In some cases, the query string or data flow/structure ID as the file name.
- Hyphens ``-`` instead of underscores ``_``.

.. _recorded-responses:

The :file:`recorded/` directory contains **recorded HTTP responses** from certain SDMX-REST web services.
These files are stored using the :mod:`requests_cache` :doc:`file system backend <requests-cache:user_guide/backends/filesystem>`; see those docs for the name and format of the files.

.. _sdmx-test-data:

Custom test data directory
~~~~~~~~~~~~~~~~~~~~~~~~~~

It is also possible to place the test data in a specific directory; for instance, in order to commit new files to the specimen collection.
Use one of the following methods:

1. Obtain the files by one of two methods:

a. Clone ``khaeru/sdmx-test-data``::
a. Clone ``sdmx-test-data``::

$ git clone git@github.com:khaeru/sdmx-test-data.git

b. Download https://github.com/khaeru/sdmx-test-data/archive/main.zip

2. Indicate where pytest can find the files, by one of two methods:
2. Indicate where :program:`pytest` can find the files, by one of two methods:

a. Set the `SDMX_TEST_DATA` environment variable::
a. Set the ``SDMX_TEST_DATA`` environment variable::

# Set the variable only for one command
$ SDMX_TEST_DATA=/path/to/files pytest
@@ -68,26 +109,35 @@ To retrieve them, use one of the following methods:
$ export SDMX_TEST_DATA
$ pytest

b. Give the option ``--sdmx-test-data=<PATH>`` when invoking pytest::
b. Give the option ``--sdmx-test-data=<PATH>`` when invoking :program:`pytest`::

$ pytest --sdmx-test-data=/path/to/files

The files are:
.. _test-network:

- Arranged in directories with names matching particular sources in :file:`sources.json`.
- Named with:
Network vs. offline tests
-------------------------

- Certain keywords:
Tests related to particular SDMX-REST web services can be categorized as:

- ``-structure``: a structure message, often associated with a file with a similar name containing a data message.
- ``ts``: time-series data, i.e. with a TimeDimensions at the level of individual Observations.
- ``xs``: cross-sectional data arranged in other ways.
- ``flat``: flat DataSets with all Dimensions at the Observation level.
- ``ss``: structure-specific data messages.
- Ensuring :mod:`sdmx` can interact with the service *as-is*.

- In some cases, the query string or data flow/structure ID as the file name.
- Hyphens ``-`` instead of underscores ``_``.
These include the :ref:`full matrix of source-endpoint tests <source-policy>`, which run on a nightly schedule because they are slow.
They also include other tests (for instance, of code snippets appearing in this documentation) marked with the custom pytest mark :py:`@pytest.mark.network` that make actual network requests.
These tests may appear ‘flaky’: they are vulnerable to network interruptions, or temporary downtime/incapacity of the targeted service(s).

- Ensuring :mod:`sdmx` can handle certain SDMX messages or HTTP responses returned by services.
This should remain true *whether or not* those services actually return the same content as they did at the moment the tests were written.

These are handled using :ref:`recorded responses <recorded-responses>`, as described above.
This makes the test outcomes deterministic, even if the services are periodically unavailable.

These tests use :func:`.session_with_stored_responses`, which is an in-memory :class:`~requests_cache.CachedSession` prepared using:

- The recorded/stored responses from ``sdmx-test-data``.
- Other responses generated by :func:`.add_responses` / :func:`.save_response`.
- :func:`.offline` / :class:`.OfflineAdapter`.
This ensures that *only* the cached URLs/requests can be queried; all other queries raise :class:`.RuntimeError`.
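
The offline rule described in this hunk can be sketched without any HTTP library. This is not the implementation of `session_with_stored_responses` or `OfflineAdapter`: the class `RecordedOnlySession`, its `get` method, and the example URLs are hypothetical names chosen for illustration. The sketch only shows the rule that recorded URLs return stored content and every other query raises `RuntimeError`.

```python
class RecordedOnlySession:
    """Hypothetical stand-in for a session limited to recorded responses."""

    def __init__(self, recorded: dict[str, bytes]) -> None:
        # Map of URL -> stored response body, playing the role of the cache.
        self._recorded = dict(recorded)

    def get(self, url: str) -> bytes:
        try:
            return self._recorded[url]
        except KeyError:
            # No network fallback: an unrecorded URL is an error.
            raise RuntimeError(f"offline; no recorded response for {url}") from None


# Illustrative URL and body; real recordings come from sdmx-test-data.
session = RecordedOnlySession(
    {"https://example.org/sdmx/dataflow": b"<Structure/>"}
)

# A recorded URL returns its stored body.
body = session.get("https://example.org/sdmx/dataflow")

# Any other URL fails instead of reaching the network.
try:
    session.get("https://example.org/sdmx/data/UNRECORDED")
    raised = False
except RuntimeError:
    raised = True
```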

Releasing
=========
@@ -154,11 +204,26 @@ Internal code reference
:undoc-members:
:show-inheritance:

.. automodule:: sdmx.testing.data
:members:
:undoc-members:
:show-inheritance:

.. automodule:: sdmx.testing.report
:members:
:undoc-members:
:show-inheritance:

``util``: Utilities
-------------------

.. automodule:: sdmx.util
:noindex:
:members:
:undoc-members:
:show-inheritance:

.. automodule:: sdmx.util.requests
:members:
:undoc-members:
:show-inheritance:

2 changes: 1 addition & 1 deletion doc/install.rst
@@ -23,7 +23,7 @@ Optional dependencies for extra features

  - for ``cache``, allowing the caching of SDMX messages in memory, MongoDB, Redis, and more: `requests-cache <https://requests-cache.readthedocs.io>`_.
  - for ``docs``, to build the documentation: `sphinx <https://sphinx-doc.org>`_ and `IPython <https://ipython.org>`_.
- - for ``tests``, to run the test suite: `pytest <https://pytest.org>`_, and `requests-mock <https://requests-mock.readthedocs.io>`_.
+ - for ``tests``, to run the test suite: `pytest <https://pytest.org>`_ and others.

Instructions
============
4 changes: 4 additions & 0 deletions doc/whatsnew.rst
@@ -8,6 +8,10 @@ What's new?
Next release
============

- Simplify :class:`.Session` via direct inheritance from :class:`.requests_cache.session.CacheMixin`, where installed (:pull:`217`).
- Add an optional :py:`session=...` keyword argument to :class:`.Client` (:pull:`217`).
- Improve :ref:`network and offline tests <test-network>` via new and improved test utilities (:pull:`217`).
New test fixtures :func:`.session_with_pytest_cache` and :func:`.session_with_stored_responses`.
- Bug fix for reading :xml:`<str:Categorisation>` from SDMX-ML 2.1: the :attr:`.Categorisation.category` attribute was read as an instance of Categorisation, rather than Category (:pull:`215`).

.. _2.20.0:
4 changes: 3 additions & 1 deletion pyproject.toml
@@ -40,11 +40,12 @@ dependencies = [
cache = ["requests-cache"]
docs = ["furo", "IPython", "sphinx >= 8"]
  tests = [
+ "GitPython",
  "Jinja2",
  "pytest >= 5",
  "pytest-cov",
  "pytest-xdist",
- "requests-mock >= 1.4",
+ "responses",
"sdmx1[cache]",
]

@@ -75,6 +76,7 @@ exclude = ["^build/"]
# Packages/modules for which no type hints are available.
  module = [
  "lxml.builder", # Not covered by types-lxml
+ "xdist",
]
ignore_missing_imports = true

29 changes: 22 additions & 7 deletions sdmx/client.py
@@ -45,15 +45,18 @@ class Client:
source : str or source.Source
Identifier of a data source. If a string, must be one of the known sources in
:meth:`list_sources`.
session :
:class:`.requests.Session` instance. If not supplied, an instance of
:class:`.Session` is created.
log_level : int
Override the package-wide logger with one of the
:ref:`standard logging levels <py:levels>`.
.. deprecated:: 2.0
Will be removed in :mod:`sdmx` version 3.0.
**session_opts
Additional keyword arguments are passed to :class:`.Session`.
Additional keyword arguments are passed to :class:`.Session` and thus to
:class:`.requests_cache.CachedSession` and its backend classes (if installed).
"""

cache: dict[str, "sdmx.message.Message"] = {}
@@ -67,16 +70,28 @@ class Client:
# Stored keyword arguments "allow_redirects" and "timeout" for pre-requests.
_send_kwargs: dict[str, Any] = {}

def __init__(self, source=None, log_level=None, **session_opts):
def __init__(
self,
source=None,
*,
session: Optional["requests.Session"] = None,
log_level=None,
**session_opts,
):
try:
self.source = sources[source.upper()] if source else NoSource
except KeyError:
raise ValueError(
f"source must be None or one of: {' '.join(list_sources())}"
)

# Create an HTTP Session object to reuse a connection for multiple requests
self.session = Session(**session_opts)
if session:
if session_opts:
raise ValueError("Client(…, session=…) with additional keyword args")
self.session = session
else:
# Create an HTTP Session object to reuse a connection for multiple requests
self.session = Session(**session_opts)

if log_level:
message = "Client(…, log_level=…) parameter"
@@ -192,8 +207,8 @@ def _request_from_args(self, kwargs):
resource_id=resource_id,
params=kwargs.pop("params", {}),
)
if provider := kwargs.pop("provider", None):
warn("provider= keyword argument; use agency_id", DeprecationWarning, 2)
if provider := kwargs.pop("provider", None): # pragma: no cover
warn("provider= keyword argument; use agency_id", DeprecationWarning, 3)
kw.update(agency_id=provider)
if version := kwargs.pop("version", None):
kw.update(version=version)
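
The new `session` handling in `Client.__init__` can be sketched in isolation. `SketchClient` and `DefaultSession` below are hypothetical stand-ins, not part of `sdmx`. They mirror only the branch added in this commit: a supplied session is used as-is, combining it with extra keyword arguments raises `ValueError`, and otherwise a session is built from those arguments.

```python
from typing import Any, Optional


class DefaultSession:
    """Hypothetical stand-in for sdmx's Session; it only records its options."""

    def __init__(self, **opts: Any) -> None:
        self.opts = opts


class SketchClient:
    """Mirrors only the session-selection branch of Client.__init__."""

    def __init__(self, *, session: Optional[Any] = None, **session_opts: Any) -> None:
        if session:
            if session_opts:
                raise ValueError("Client(…, session=…) with additional keyword args")
            # Reuse the caller's session unchanged.
            self.session = session
        else:
            # No session given: build one from the keyword arguments.
            self.session = DefaultSession(**session_opts)


shared = DefaultSession(timeout=10)
reused = SketchClient(session=shared)
built = SketchClient(timeout=5)

# Passing both a session and session options is rejected.
try:
    SketchClient(session=shared, timeout=5)
    conflict_raised = False
except ValueError:
    conflict_raised = True
```

As in the diff, the check is `if session:` rather than `is not None`, so a session object that evaluates as false would fall through to the default branch.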