Add README + release workflow + fix __repr__ bug and update tutorials #7

Merged: 5 commits, Aug 5, 2024
35 changes: 35 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,35 @@
name: Publish Python Package to PyPI

on:
  release:
    types: [published]
  workflow_dispatch:

jobs:
  Publish:
    # prevents this action from running on forks
    if: github.repository == 'open2c/assemblyinfo'

    runs-on: ubuntu-latest
    permissions:
      id-token: write

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.x"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build

      - name: Build
        run: python -m build

      - name: Publish distribution 📦 to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
114 changes: 113 additions & 1 deletion README.md
@@ -1 +1,113 @@
# assemblyinfo
# Assemblyinfo: Interact with assembly metadata in Python

![CI](https://github.com/open2c/assemblyinfo/actions/workflows/ci.yml/badge.svg)
[![Docs status](https://readthedocs.org/projects/genomeinfo/badge/)](https://genomeinfo.readthedocs.io/en/latest/)
[![Slack](https://img.shields.io/badge/chat-slack-%233F0F3F?logo=slack)](https://bit.ly/open2c-slack)

Assemblyinfo simplifies the management and analysis of genome assembly metadata in Python.

This package provides:

* Efficient tools for querying and manipulating assembly information datasets.
* Streamlined methods for importing, exporting, and converting between common chromosome formats.
* Utilities for retrieving assembly statistics across different versions or species.

Read the [documentation](https://genomeinfo.readthedocs.io/en/latest/) for more information.


## Installation

Assemblyinfo is available on [PyPI](https://pypi.org/project/assemblyinfo/):

```sh
pip install assemblyinfo
```

## Basic operations on chromosome data

Assemblyinfo offers a flexible and straightforward interface for interacting with assembly data and performing basic queries.

```python
import assemblyinfo

db = assemblyinfo.connect()
hg38 = db.assembly_info("hg38", roles=["assembled"])
```

From there, you can easily get chromosome sizes:

```text
hg38.chromsizes

> name
> chr1 248956422
> chr2 242193529
> ...
```

chromosome equivalences:

```text
hg38.chromeq

> ncbi genbank refseq
> chr1 1 CM000663.2 NC_000001.11
> chr2 2 CM000664.2 NC_000002.12
> chr3 3 CM000665.2 NC_000003.12
> ...
```

or assembly metadata:

```text
hg38.metadata

> {'assembly_level': 'Chromosome',
'assembly_type': 'haploid-with-alt-loci',
'bioproject': 'PRJNA168',
'submitter': 'Genome Reference Consortium',
'synonyms': ['GRCh38', 'hg38'],
'taxid': '9606',
'species': 'homo_sapiens',
'common_name': 'human',
... }
```

and more!
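
The tables above can be combined with ordinary pandas operations. The sketch below is only an illustration: it does not call assemblyinfo at all, but builds small stand-ins by hand from the values printed above. It assumes `chromsizes` behaves like a pandas Series indexed by chromosome name, and that `chromeq` is a DataFrame with one row per chromosome. The variable names `total_bp`, `chr_to_refseq`, and `sizes_by_refseq` are made up for this example.

```python
import pandas as pd

# Hand-built stand-in for `hg38.chromsizes`, using the two values shown above.
# Assumption: the real attribute behaves like a pandas Series indexed by name.
chromsizes = pd.Series({"chr1": 248956422, "chr2": 242193529})
chromsizes.index.name = "name"

# Hand-built stand-in for `hg38.chromeq`: a dict of per-chromosome aliases,
# transposed so that each chromosome becomes a row.
aliases = {
    "chr1": {"ncbi": "1", "genbank": "CM000663.2", "refseq": "NC_000001.11"},
    "chr2": {"ncbi": "2", "genbank": "CM000664.2", "refseq": "NC_000002.12"},
}
chromeq = pd.DataFrame(aliases).T

# Total length of the listed chromosomes, in base pairs.
total_bp = int(chromsizes.sum())

# Map the chr-prefixed names to RefSeq accessions.
chr_to_refseq = chromeq["refseq"].to_dict()

# Relabel the sizes so they are indexed by RefSeq accession instead.
sizes_by_refseq = chromsizes.rename(index=chr_to_refseq)

print(total_bp)
print(sizes_by_refseq)
```

Relabeling through a plain dictionary keeps the conversion explicit, which is convenient when a downstream tool expects a different naming scheme than the one used in your files.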

## Request an assembly

Feel free to open an issue and request a non-reference assembly! Currently supported species are:

```plaintext
['caenorhabditis_elegans',
'homo_sapiens',
'mus_musculus',
'drosophila_melanogaster',
'danio_rerio',
'bos_taurus',
'gallus_gallus',
'canis_lupus_familiaris']
```

You can also see which specific assemblies are supported with:

```python
db = assemblyinfo.connect()
db.available_assemblies()
```

## Citing

If you use ***assemblyinfo*** in your work, please cite:

```bibtex
@software{assemblyinfo_2024,
author = {Open2C},
title = {assemblyinfo},
year = {2024},
publisher = {GitHub},
version = {v0.0.1},
url = {https://github.com/open2c/assemblyinfo}
}
```
4 changes: 2 additions & 2 deletions assemblyinfo/core/assembly.py
@@ -35,8 +35,8 @@ def chromeq(self) -> Dict[str, Dict[str, str]]:
         return pd.DataFrame(self.aliases).T
 
     def __repr__(self):
-        return (f"Assembly(assembly={self.assembly}",
-                f"species={self.species}",
+        return (f"Assembly(assembly={self.assembly}, "
+                f"species={self.species}, "
                 f"common_name={self.common_name})")
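
To illustrate why this change matters, here is a minimal sketch. `BuggyAssembly` and `FixedAssembly` are hypothetical stand-ins written for this example, not the project's actual `Assembly` class. With the commas placed outside the f-strings, the parenthesized expression is a tuple of three strings, and Python raises a `TypeError` when `__repr__` returns anything other than a `str`. Moving the commas inside the strings lets adjacent literals concatenate into one string.

```python
class BuggyAssembly:
    """Hypothetical stand-in reproducing the pre-fix __repr__."""

    def __init__(self, assembly, species, common_name):
        self.assembly = assembly
        self.species = species
        self.common_name = common_name

    def __repr__(self):
        # Commas outside the f-strings build a tuple of three strings.
        return (f"Assembly(assembly={self.assembly}",
                f"species={self.species}",
                f"common_name={self.common_name})")


class FixedAssembly(BuggyAssembly):
    """Hypothetical stand-in using the post-fix __repr__."""

    def __repr__(self):
        # Adjacent string literals are concatenated into a single str.
        return (f"Assembly(assembly={self.assembly}, "
                f"species={self.species}, "
                f"common_name={self.common_name})")


args = ("hg38", "homo_sapiens", "human")

# repr() rejects the tuple returned by the buggy version.
try:
    repr(BuggyAssembly(*args))
    buggy_error = None
except TypeError as exc:
    buggy_error = exc

fixed_repr = repr(FixedAssembly(*args))

print(type(buggy_error).__name__)
print(fixed_repr)
```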


139 changes: 35 additions & 104 deletions docs/tutorials/get_quick_stats.ipynb

Large diffs are not rendered by default.
