Remove VCF support #1264

Merged 9 commits on Oct 10, 2024

15 changes: 0 additions & 15 deletions .github/scripts/test_sgkit_vcf.py

This file was deleted.

4 changes: 2 additions & 2 deletions .github/workflows/build-numpy-2.yml
@@ -23,7 +23,8 @@ jobs:
  run: |
  python -m pip install --upgrade pip
  pip install -r requirements.txt -r requirements-dev.txt
- pip install -U 'numpy<2.1'
+ # update bio2zarr for NumPy 2, see https://github.com/sgkit-dev/bio2zarr/issues/256
+ pip install -U 'numpy<2.1' -U git+https://github.com/sgkit-dev/bio2zarr.git
  - name: Run pre-commit
  uses: pre-commit/[email protected]
  - name: Test with pytest (numba jit disabled)
@@ -32,7 +33,6 @@ jobs:
  run: |
  # avoid guvectorized functions #1194
  pytest -v sgkit/tests/test_pedigree.py
- pytest -v sgkit/tests/io/vcf/test_vcf_writer_utils.py
  - name: Test with pytest and coverage
  run: |
  pytest -v --cov=sgkit --cov-report=term-missing

1 change: 0 additions & 1 deletion .github/workflows/build.yml
@@ -34,7 +34,6 @@ jobs:
  run: |
  # avoid guvectorized functions #1194
  pytest -v sgkit/tests/test_pedigree.py
- pytest -v sgkit/tests/io/vcf/test_vcf_writer_utils.py
  - name: Test with pytest and coverage
  run: |
  pytest -v --cov=sgkit --cov-report=term-missing

38 changes: 4 additions & 34 deletions .github/workflows/wheels.yml
@@ -38,14 +38,14 @@ jobs:
  with:
  path: dist

- unix-test:
+ test:
  # This workflow only runs on the origin org
  if: github.repository_owner == 'sgkit-dev'
  needs: ['build']
  strategy:
  matrix:
  # don't use macos-latest as it uses M1 which doesn't work
- os: [ubuntu-latest, macos-12]
+ os: [ubuntu-latest, macos-12, windows-latest]
  python-version: ["3.9", "3.10", "3.11"]
  runs-on: ${{ matrix.os }}
  steps:
@@ -64,46 +64,16 @@ jobs:
  python -VV
  # Install the local wheel
  wheel=$(ls artifact/sgkit-*.whl)
- pip install ${wheel} ${wheel}[bgen] ${wheel}[plink] ${wheel}[vcf]
+ pip install ${wheel} ${wheel}[bgen] ${wheel}[plink]
  python sgkit-copy/.github/scripts/test_sgkit.py
  python sgkit-copy/.github/scripts/test_sgkit_bgen.py
  python sgkit-copy/.github/scripts/test_sgkit_plink.py
- python sgkit-copy/.github/scripts/test_sgkit_vcf.py

- # Windows doesn't support vcf
- windows-test:
- # This workflow only runs on the origin org
- if: github.repository_owner == 'sgkit-dev'
- runs-on: windows-latest
- needs: ['build']
- strategy:
- matrix:
- python-version: ["3.9"]
- steps:
- # checkout repo to subdirectory to get access to scripts
- - uses: actions/checkout@v2
- with:
- path: sgkit-copy
- - name: Download artifacts
- uses: actions/[email protected]
- - name: Set up Python ${{ matrix.python-version }}
- uses: actions/setup-python@v2
- with:
- python-version: ${{ matrix.python-version }}
- - name: Install wheel and test
- run: |
- python -VV
- # Install the local wheel
- $env:wheel = $(ls artifact/sgkit-*.whl)
- pip install $env:wheel "$env:wheel[bgen]" "$env:wheel[plink]"
- python sgkit-copy/.github/scripts/test_sgkit.py
- python sgkit-copy/.github/scripts/test_sgkit_bgen.py
- python sgkit-copy/.github/scripts/test_sgkit_plink.py

  pypi-upload:
  if: github.repository_owner == 'sgkit-dev'
  runs-on: ubuntu-latest
- needs: ['unix-test', 'windows-test']
+ needs: ['test']
  steps:
  - name: Download all
  uses: actions/[email protected]

142 changes: 0 additions & 142 deletions benchmarks/benchmarks_vcf.py

This file was deleted.

3 changes: 1 addition & 2 deletions conftest.py
@@ -1,5 +1,4 @@
- # Ignore VCF files during pytest collection, so it doesn't fail if cyvcf2 isn't installed.
- collect_ignore_glob = ["benchmarks/**", "sgkit/io/vcf/*.py", ".github/scripts/*.py"]
+ collect_ignore_glob = ["benchmarks/**", ".github/scripts/*.py"]


  def pytest_addoption(parser):

42 changes: 3 additions & 39 deletions docs/api.rst
@@ -32,47 +32,11 @@ PLINK
  write_plink
  zarr_to_plink

- VCF (reading)
+ VCF
  -------------

- .. deprecated:: 0.9.0
- Functions for reading VCF are deprecated, please use the `bio2zarr <https://github.com/sgkit-dev/bio2zarr>`_ package.
-
- .. currentmodule:: sgkit.io.vcf
- .. autosummary::
- :toctree: generated/
-
- read_vcf
- vcf_to_zarr
-
- For more low-level control:
-
- .. currentmodule:: sgkit.io.vcf
- .. autosummary::
- :toctree: generated/
-
- partition_into_regions
- vcf_to_zarrs
- concat_zarrs
- zarr_array_sizes
-
- For converting from `scikit-allel's VCF Zarr representation <https://scikit-allel.readthedocs.io/en/stable/io.html#allel.vcf_to_zarr>`_ to sgkit's Zarr representation:
-
- .. currentmodule:: sgkit
- .. autosummary::
- :toctree: generated/
-
- read_scikit_allel_vcfzarr
-
- VCF (writing)
- -------------
-
- .. currentmodule:: sgkit.io.vcf
- .. autosummary::
- :toctree: generated/
-
- write_vcf
- zarr_to_vcf
+ Functions for reading and writing VCF were removed from sgkit, please use the `bio2zarr <https://github.com/sgkit-dev/bio2zarr>`_
+ and `vcztools <https://github.com/sgkit-dev/vcztools>`_ packages.

  Dataset
  -------

20 changes: 12 additions & 8 deletions docs/changelog.rst
@@ -14,8 +14,12 @@ New Features
  - Add 'matching' method to :func:`identity_by_state` function.
  (:user:`timothymillar`, :pr:`1229`, :issue:`1227`)

- .. Breaking changes
- .. ~~~~~~~~~~~~~~~~
+ Breaking changes
+ ~~~~~~~~~~~~~~~~
+
+ - Functions for reading and writing VCF were removed from sgkit, please use the `bio2zarr <https://github.com/sgkit-dev/bio2zarr>`_
+ and `vcztools <https://github.com/sgkit-dev/vcztools>`_ packages instead.
+ (:user:`tomwhite`, :pr:`1264`)

  .. Deprecations
  .. ~~~~~~~~~~~~
@@ -147,22 +151,22 @@ New Features
  - Add :func:`sgkit.convert_call_to_index` method.
  (:user:`timothymillar`, :pr:`1050`, :issue:`1048`)

- - Add ``read_chunk_length`` option to :func:`sgkit.io.vcf.vcf_to_zarr` and
- :func:`sgkit.io.vcf.vcf_to_zarrs` functions. These are useful to reduce memory usage
+ - Add ``read_chunk_length`` option to ``sgkit.io.vcf.vcf_to_zarr`` and
+ ``sgkit.io.vcf.vcf_to_zarrs`` functions. These are useful to reduce memory usage
  with large sample counts or a large ``chunk_length``.
  (:user:`benjeffery`, :pr:`1044`, :issue:`1042`)

- - Add ``retain_temp_files`` to :func:`sgkit.io.vcf.vcf_to_zarr` function.
+ - Add ``retain_temp_files`` to ``sgkit.io.vcf.vcf_to_zarr`` function.
  (:user:`benjeffery`, :pr:`1046`, :issue:`1036`)

- - Add :func:`sgkit.io.vcf.read_vcf` convenience function.
+ - Add ``sgkit.io.vcf.read_vcf`` convenience function.
  (:user:`tomwhite`, :pr:`1052`, :issue:`1004`)

  - Add :func:`sgkit.hybrid_relationship`, :func:`sgkit.hybrid_inverse_relationship`
  and :func:`invert_relationship_matrix` methods.
  (:user:`timothymillar`, :pr:`1053`, :issue:`993`)

- - Add :func:`sgkit.io.vcf.zarr_array_sizes` for determining array sizes for storage in Zarr.
+ - Add ``sgkit.io.vcf.zarr_array_sizes`` for determining array sizes for storage in Zarr.
  (:user:`tomwhite`, :pr:`1073`, :issue:`734`)

  - Add ``skipna`` option to :func:`genomic_relationship` function.
@@ -174,7 +178,7 @@ New Features
  Breaking changes
  ~~~~~~~~~~~~~~~~

- - Generate VCF header by default when writing VCF using :func:`sgkit.io.vcf.write_vcf` or :func:`sgkit.io.vcf.zarr_to_vcf`.
+ - Generate VCF header by default when writing VCF using ``sgkit.io.vcf.write_vcf`` or ``sgkit.io.vcf.zarr_to_vcf``.
  Previously, the dataset had to contain a ``vcf_header`` attribute.
  (:user:`tomwhite`, :pr:`1021`, :issue:`1020`)

1 change: 0 additions & 1 deletion docs/index.rst
@@ -19,7 +19,6 @@ both popular Python genetics toolkits with a respective focus on population and

  getting_started
  user_guide
- vcf
  examples/index
  api
  how_do_i
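
Migration note (not part of the diff): the changelog entry above points users at bio2zarr and vcztools in place of the removed ``sgkit.io.vcf`` functions. As a rough sketch only, the helper below builds (but does not run) the command lines that would likely stand in for ``vcf_to_zarr`` and ``zarr_to_vcf``. It assumes the ``vcf2zarr convert`` subcommand from bio2zarr and the ``vcztools view`` subcommand from vcztools; the function name ``migration_commands`` and the file names are hypothetical, so check each tool's documentation for the exact options before relying on it.

```python
import shlex


def migration_commands(vcf_path: str, vcz_path: str) -> dict:
    """Build, without running, CLI invocations that stand in for the removed functions.

    Assumption: bio2zarr provides `vcf2zarr convert` and vcztools provides
    `vcztools view`; neither command is defined by this pull request.
    """
    return {
        # VCF -> VCF Zarr, in place of sgkit.io.vcf.vcf_to_zarr
        "vcf_to_zarr": ["vcf2zarr", "convert", vcf_path, vcz_path],
        # VCF Zarr -> VCF text on stdout, in place of sgkit.io.vcf.zarr_to_vcf
        "zarr_to_vcf": ["vcztools", "view", vcz_path],
    }


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name, argv in migration_commands("sample.vcf.gz", "sample.vcz").items():
        print(f"{name}: {shlex.join(argv)}")
```

The helper only returns argument lists, so it can be inspected or passed to ``subprocess.run`` once the two packages are installed.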