Remove VCF docs and refer to bio2zarr and vcztools
tomwhite committed Sep 24, 2024
1 parent db67ea5 commit eeef90b
Showing 4 changed files with 7 additions and 427 deletions.
42 changes: 3 additions & 39 deletions docs/api.rst
@@ -32,47 +32,11 @@ PLINK
 write_plink
 zarr_to_plink

-VCF (reading)
+VCF
 -------------

-.. deprecated:: 0.9.0
-Functions for reading VCF are deprecated, please use the `bio2zarr <https://github.com/sgkit-dev/bio2zarr>`_ package.
-
-.. currentmodule:: sgkit.io.vcf
-.. autosummary::
-:toctree: generated/
-
-read_vcf
-vcf_to_zarr
-
-For more low-level control:
-
-.. currentmodule:: sgkit.io.vcf
-.. autosummary::
-:toctree: generated/
-
-partition_into_regions
-vcf_to_zarrs
-concat_zarrs
-zarr_array_sizes
-
-For converting from `scikit-allel's VCF Zarr representation <https://scikit-allel.readthedocs.io/en/stable/io.html#allel.vcf_to_zarr>`_ to sgkit's Zarr representation:
-
-.. currentmodule:: sgkit
-.. autosummary::
-:toctree: generated/
-
-read_scikit_allel_vcfzarr
-
-VCF (writing)
--------------
-
-.. currentmodule:: sgkit.io.vcf
-.. autosummary::
-:toctree: generated/
-
-write_vcf
-zarr_to_vcf
+Functions for reading and writing VCF were removed from sgkit, please use the `bio2zarr <https://github.com/sgkit-dev/bio2zarr>`_
+and `vcztools <https://github.com/sgkit-dev/vcztools>`_ packages.

 Dataset
 -------
1 change: 0 additions & 1 deletion docs/index.rst
@@ -19,7 +19,6 @@ both popular Python genetics toolkits with a respective focus on population and

 getting_started
 user_guide
-vcf
 examples/index
 api
 how_do_i
28 changes: 4 additions & 24 deletions docs/user_guide.rst
@@ -18,14 +18,10 @@ Reading and writing genetic data
 Installation
 ------------

-Sgkit can read standard genetic file formats, including VCF, PLINK, and BGEN. It can also export
-to VCF.
+Sgkit can read standard genetic file formats, including PLINK and BGEN. For reading VCF,
+please use the `bio2zarr <https://github.com/sgkit-dev/bio2zarr>`_ package.

-If sgkit has been installed using conda, support for reading BGEN and PLINK is included, but
-VCF is not because there is no Windows support for cyvcf2, the library we use for reading VCF data.
-If you are using Linux or a Mac, please install cyvcf2 using the following to enable VCF support::
-
-$ conda install -c bioconda cyvcf2
+If sgkit has been installed using conda, support for reading BGEN and PLINK is included.

 If sgkit has been installed using pip, then support for reading these formats is
 not included, and requires additional dependencies, which can be installed
@@ -39,10 +35,6 @@ To install sgkit with PLINK support::

 $ pip install 'sgkit[plink]'

-To install sgkit with VCF support::
-
-$ pip install 'sgkit[vcf]'
-
 Converting genetic data to Zarr
 -------------------------------

@@ -88,22 +80,10 @@ arrays within an :class:`xarray.Dataset` from ``bed``, ``bim``, and ``fam`` file
 The :func:`sgkit.io.plink.write_plink` and :func:`sgkit.io.plink.zarr_to_plink`
 functions convert sgkit's Xarray data representation to PLINK.

-VCF
----
-
-The :func:`sgkit.io.vcf.vcf_to_zarr` function converts one or more VCF files to
-Zarr files stored in sgkit's Xarray data representation, which can then be opened
-as a :class:`xarray.Dataset`.
-
-The :func:`sgkit.io.vcf.write_vcf` and :func:`sgkit.io.vcf.zarr_to_vcf` functions
-convert sgkit's Xarray data representation to VCF.
-
-See :ref:`vcf` for installation instructions, and details on using VCF in sgkit.
-
 Working with cloud-native data
 ------------------------------

-TODO: Show how to read/write Zarr (and VCF?) data in cloud storage
+TODO: Show how to read/write Zarr data in cloud storage


 Datasets