Merge devel into master (#88)
* add 10x feature barcode map

* count kite

* add 10x feature barcode map

* count kite

* update 10x feature barcode

* update kite workflow

* update macos and linux bustools binaries for dev

* add 10x feature barcode map

* count kite

* update 10x feature barcode

* update kite workflow

* update macos and linux bustools binaries for dev

* fix kmer length in kite index

* kallisto dev builds

* try different linux binary

* allow custom technologies

* allow different barcode lengths

* allow index without hamming distance 1 variants (hidden --no-mismatches)

* allow overriding kmer length

* fix tests and code quality

* detect when feature barcode mapping has wrong columns

* forgot to remove print

* add missing check for None

* allow multiple fastas, gtfs

* update binary

* allow multiple indices + bus file merging

* fix kite ref

* HTML report

* update requirements

* velocity reports

* --report option

* anndata name column

* t2g includes transcript name (if available)

* more information in t2g

* fix adata import

* update h5py requirement

* write kb_info.json

* update binaries

* comment logger disable

* Revert "update binaries"

This reverts commit 9c2602a.

* cellranger matrix is transposed

* update kallisto to 0.46.2

* disable logging from anndata

* update how anndatas are overlayed & summed

* allow whitelist override with "none"

* remove kernelspec from report

* hidden option to disable inspect

* freeze dev requirements and fix code quality

* fix split indices

* fix dry run

* fix dry run with split indices

* improve test coverage

* add codecov

* ignore main.py for coverage

* Fix temp directory (#72)

* add workflow for stale issues (#54)

* add workflow for stale issues

* update workflow file

* Update README.md

* Fix temp_dir

Co-authored-by: Joseph Min <[email protected]>

* fix tests

* add runtimes to kb_info

* hidden option to turn off validation

* split with --workflow lamanno only splits intron fasta

* pass in memory to filter_with_bustools

* hidden option to set flank

* Iupac extension (#83) (#84)

* add workflow for stale issues (#54)

* add workflow for stale issues

* update workflow file

* Update README.md

* iupac extension

* smarter upper

Co-authored-by: Joseph Min <[email protected]>

Co-authored-by: Maarten-vd-Sande <[email protected]>

* update setup.py so that it doesn't install tests

* fix tests and force LF line endings

* split index fix

* update setup.py and makefile

* add python 3.8 to CI

* Smartseq support (#87)

* count smartseq

* fix code quality

Co-authored-by: Yueh-Hua Tu <[email protected]>
Co-authored-by: Maarten-vd-Sande <[email protected]>
3 people authored Nov 19, 2020
1 parent 3ce4d8e commit f92e159
Showing 88 changed files with 3,193,111 additions and 897 deletions.
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
+* text=auto eol=lf
8 changes: 6 additions & 2 deletions .github/workflows/ci.yml
@@ -22,7 +22,7 @@ jobs:
     runs-on: ubuntu-18.04
     strategy:
       matrix:
-        python: [ '3.6', '3.7' ]
+        python: [ '3.6', '3.7', '3.8' ]
         os: ['ubuntu-18.04']
     name: Test on Python ${{ matrix.python }}
     steps:
@@ -36,4 +36,8 @@ jobs:
       - name: Install dependencies
         run: pip install -r requirements.txt && pip install -r dev-requirements.txt
       - name: Run tests
-        run: nosetests --verbose --with-coverage --cover-package kb_python
+        run: make test
+      - name: Upload coverage
+        run: bash <(curl -s https://codecov.io/bash)
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -1,3 +1,5 @@
 include kb_python/info.txt
 include kb_python/whitelists/*
 include kb_python/maps/*
+recursive-include kb_python/report *
+recursive-include kb_python/bins *
4 changes: 3 additions & 1 deletion Makefile
@@ -1,7 +1,8 @@
 .PHONY : install test check build docs clean push_release

 test:
-	nosetests --verbose --with-coverage --cover-package kb_python
+	rm -f .coverage
+	nosetests --verbose --with-coverage --cover-package kb_python tests/* tests/dry/*

 check:
 	flake8 kb_python && echo OK
@@ -19,6 +20,7 @@ clean:
 	rm -rf kb_python.egg-info
 	rm -rf docs/_build
 	rm -rf docs/api
+	rm -rf .coverage

 bump_patch:
 	bumpversion patch
1 change: 1 addition & 0 deletions README.md
@@ -3,6 +3,7 @@
 [![pypi version](https://img.shields.io/pypi/v/kb-python)](https://pypi.org/project/kb-python/0.24.4/)
 ![python versions](https://img.shields.io/pypi/pyversions/kb_python)
 ![status](https://github.com/pachterlab/kb_python/workflows/CI/badge.svg)
+[![codecov](https://codecov.io/gh/pachterlab/kb_python/branch/master/graph/badge.svg)](https://codecov.io/gh/pachterlab/kb_python)
 [![pypi downloads](https://img.shields.io/pypi/dm/kb-python)](https://pypi.org/project/kb-python/)
 [![docs](https://readthedocs.org/projects/kb-python/badge/?version=latest)](https://kb-python.readthedocs.io/en/latest/?badge=latest)
 [![license](https://img.shields.io/pypi/l/kb-python)](LICENSE)
2 changes: 2 additions & 0 deletions codecov.yml
@@ -0,0 +1,2 @@
+ignore:
+  - "kb_python/main.py"
13 changes: 7 additions & 6 deletions dev-requirements.txt
@@ -1,7 +1,8 @@
-bumpversion>=0.5.3
-coverage>=4.5.4
-flake8>=3.7.8
-nose>=1.3.7
+bumpversion==0.6.0
+coverage==5.1
+flake8==3.8.2
+nose==1.3.7
+pre-commit==2.4.0
 twine>=2.0.0
-wheel>=0.33.6
-yapf>=0.29.0
+wheel==0.34.2
+yapf==0.30.0
Binary file modified kb_python/bins/darwin/bustools/bustools
Binary file not shown.
Binary file modified kb_python/bins/darwin/kallisto/kallisto
Binary file not shown.
Binary file modified kb_python/bins/linux/bustools/bustools
Binary file not shown.
Binary file modified kb_python/bins/linux/kallisto/kallisto
Binary file not shown.
Binary file modified kb_python/bins/windows/bustools/bustools.exe
Binary file not shown.
Binary file modified kb_python/bins/windows/kallisto/kallisto.exe
Binary file not shown.
Empty file modified kb_python/bins/windows/kallisto/license.txt
100755 → 100644
Empty file.
107 changes: 91 additions & 16 deletions kb_python/config.py
@@ -8,39 +8,79 @@

 TEMP_DIR = 'tmp'
 DRY = False
+VALIDATE = True
 CHUNK_SIZE = 1024 * 1024 * 4  # Download files in chunks of 4 Mb

 # Technology to file position mapping
 Technology = namedtuple(
     'Technology', [
         'name', 'description', 'nfiles', 'reads_file', 'umi_positions',
-        'barcode_positions', 'whitelist_archive'
+        'barcode_positions', 'whitelist_archive', 'map_archive'
     ]
 )
 WHITELIST_DIR = 'whitelists'
+MAP_DIR = 'maps'
 TECHNOLOGIES = [
     Technology(
         '10XV1',
         '10x version 1',
         3,
-        0,
-        [(1, 0, 0)],
-        [(2, 0, 0)],
+        2,
+        [(1, 0, 10)],
+        [(0, 0, 14)],
         '10xv1_whitelist.txt.gz',
+        None,
     ),
     Technology(
+        '10XV2',
+        '10x version 2',
+        2,
+        1,
+        [(0, 16, 26)],
+        [(0, 0, 16)],
+        '10xv2_whitelist.txt.gz',
+        None,
+    ),
+    Technology(
+        '10XV3',
+        '10x version 3',
+        2,
+        1,
+        [(0, 16, 28)],
+        [(0, 0, 16)],
+        '10xv3_whitelist.txt.gz',
+        '10xv3_feature_barcode_map.txt.gz',
+    ),
+    Technology(
-        '10XV2', '10x version 2', 2, 1, [(0, 16, 26)], [(0, 0, 16)],
-        '10xv2_whitelist.txt.gz'
+        'CELSEQ',
+        'CEL-Seq',
+        2,
+        1,
+        [(0, 8, 12)],
+        [(0, 0, 8)],
+        None,
+        None,
     ),
     Technology(
-        '10XV3', '10x version 3', 2, 1, [(0, 16, 28)], [(0, 0, 16)],
-        '10xv3_whitelist.txt.gz'
+        'CELSEQ2',
+        'CEL-SEQ version 2',
+        2,
+        1,
+        [(0, 0, 6)],
+        [(0, 6, 12)],
+        None,
+        None,
     ),
-    Technology('CELSEQ', 'CEL-Seq', 2, 1, [(0, 8, 12)], [(0, 0, 8)], None),
     Technology(
-        'CELSEQ2', 'CEL-SEQ version 2', 2, 1, [(0, 0, 6)], [(0, 6, 12)], None
+        'DROPSEQ',
+        'DropSeq',
+        2,
+        1,
+        [(0, 12, 20)],
+        [(0, 0, 12)],
+        None,
+        None,
     ),
-    Technology('DROPSEQ', 'DropSeq', 2, 1, [(0, 12, 20)], [(0, 0, 12)], None),
     Technology(
         'INDROPSV1',
         'inDrops version 1',
@@ -49,6 +89,7 @@
         [(0, 42, 48)],
         [(0, 0, 11), (0, 30, 38)],
         None,
+        None,
     ),
     Technology(
         'INDROPSV2',
@@ -58,6 +99,7 @@
         [(1, 42, 48)],
         [(1, 0, 11), (1, 30, 38)],
         None,
+        None,
     ),
     Technology(
         'INDROPSV3',
@@ -67,14 +109,31 @@
         [(1, 8, 14)],
         [(0, 0, 8), (1, 0, 8)],
         'inDropsv3_whitelist.txt.gz',
+        None,
     ),
-    Technology('SCRUBSEQ', 'SCRB-Seq', 2, 1, [(0, 6, 16)], [(0, 0, 6)], None),
     Technology(
-        'SURECELL', 'SureCell for ddSEQ', 2, 1, [(0, 51, 59)], [(0, 0, 6),
-                                                                (0, 21, 27),
-                                                                (0, 42, 48)],
-        None
+        'SCRUBSEQ',
+        'SCRB-Seq',
+        2,
+        1,
+        [(0, 6, 16)],
+        [(0, 0, 6)],
+        None,
+        None,
     ),
+    Technology(
+        'SURECELL',
+        'SureCell for ddSEQ',
+        2,
+        1,
+        [(0, 51, 59)],
+        [(0, 0, 6), (0, 21, 27), (0, 42, 48)],
+        None,
+        None,
+    ),
+    Technology(
+        'SMARTSEQ', 'Smart-seq2', 2, '0, 1 (paired)', [], [], None, None
+    )
 ]
 TECHNOLOGIES_MAPPING = {t.name: t for t in TECHNOLOGIES}

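The position tuples in this hunk read as (file index, start, stop) triplets, the same shape `kallisto bus` takes through its `-x` option for custom technologies. Below is a minimal sketch of that conversion, not code from this commit. `Tech`, `positions_to_string` and `technology_string` are hypothetical names, and two things are assumptions about the `-x` format: the `barcode:umi:sequence` ordering, and a stop of 0 meaning "to the end of the read".

```python
from collections import namedtuple

# Hypothetical, trimmed-down stand-in for the Technology namedtuple in the
# diff; only the fields this sketch needs are kept.
Tech = namedtuple(
    'Tech', ['name', 'reads_file', 'umi_positions', 'barcode_positions']
)


def positions_to_string(positions):
    """Join (file, start, stop) triplets into one comma-separated string."""
    return ','.join(
        f'{file_index},{start},{stop}' for file_index, start, stop in positions
    )


def technology_string(tech):
    """Build a `barcode:umi:sequence` string (assumed `-x` layout)."""
    barcode = positions_to_string(tech.barcode_positions)
    umi = positions_to_string(tech.umi_positions)
    # Assumption: start=0, stop=0 selects the whole read in `reads_file`.
    sequence = f'{tech.reads_file},0,0'
    return f'{barcode}:{umi}:{sequence}'


tenx_v2 = Tech('10XV2', 1, [(0, 16, 26)], [(0, 0, 16)])
print(technology_string(tenx_v2))
```

Multi-segment barcodes such as the inDrops v3 entry simply contribute several triplets to the first field. The SMARTSEQ entry stores a descriptive string in `reads_file`, so it would need separate handling.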
@@ -165,3 +224,19 @@ def is_dry():
     :rtype: bool
     """
     return DRY
+
+
+def no_validate():
+    """Turn off validation.
+    """
+    global VALIDATE
+    VALIDATE = False
+
+
+def is_validate():
+    """Return whether validation is turned on.
+    :return: whether validation is on
+    :rtype: bool
+    """
+    return VALIDATE
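The two functions added here follow the same module-level flag pattern as `DRY` and `is_dry()`. As a rough illustration of how a caller might consult such a flag, here is a self-contained sketch. `looks_like_fastq` is a hypothetical check invented for this example; the commit does not show which checks the flag actually gates.

```python
VALIDATE = True


def no_validate():
    """Turn off validation."""
    global VALIDATE
    VALIDATE = False


def is_validate():
    """Return whether validation is turned on."""
    return VALIDATE


def looks_like_fastq(path):
    """Hypothetical check: accept any path when validation is off,
    otherwise require a FASTQ-style extension."""
    if not is_validate():
        return True
    return path.endswith(('.fastq', '.fastq.gz', '.fq', '.fq.gz'))
```

With the flag on, `looks_like_fastq('reads.txt')` is rejected. After `no_validate()` the same call passes, which is the behaviour a hidden "turn off validation" option would presumably rely on.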
30 changes: 30 additions & 0 deletions kb_python/constants.py
@@ -16,16 +16,46 @@
BUS_SC_FILENAME = 'output.s.c.bus'
BUS_UNFILTERED_FILENAME = 'output.unfiltered.bus'
BUS_FILTERED_FILENAME = 'output.filtered.bus'
BUS_MASHED_FILENAME = 'mashed.bus'
BUS_MERGED_FILENAME = 'merged.bus'
ECMAP_MERGED_FILENAME = 'merged.ec'
BUS_CDNA_PREFIX = 'spliced'
BUS_INTRON_PREFIX = 'unspliced'
ECMAP_FILENAME = 'matrix.ec'
TXNAMES_FILENAME = 'transcripts.txt'
KB_INFO_FILENAME = 'kb_info.json'
KALLISTO_INFO_FILENAME = 'run_info.json'
REPORT_NOTEBOOK_FILENAME = 'report.ipynb'
REPORT_HTML_FILENAME = 'report.html'
COUNTS_PREFIX = 'cells_x_genes'
TCC_PREFIX = 'cells_x_tcc'
FEATURE_PREFIX = 'cells_x_features'
ADATA_PREFIX = 'adata'
GENE_NAME = 'gene'
FEATURE_NAME = 'feature'
TRANSCRIPT_NAME = 'transcript'

UNFILTERED_COUNTS_DIR = 'counts_unfiltered'
FILTERED_COUNTS_DIR = 'counts_filtered'
CELLRANGER_DIR = 'cellranger'
CELLRANGER_MATRIX = 'matrix.mtx'
CELLRANGER_BARCODES = 'barcodes.tsv'
CELLRANGER_GENES = 'genes.tsv'

BUS_UNFILTERED_SUFFIX = '.unfiltered.bus'
BUS_FILTERED_SUFFIX = '.filtered.bus'

# Smartseq file names
BATCH_FILENAME = 'batch.txt'
ABUNDANCE_FILENAME = 'matrix.abundance.mtx'
CELLS_FILENAME = 'matrix.cells'
GENE_DIR = 'counts_gene'

# File codes.
# These are appended to the filename whenever it undergoes some kind of
# transformation.
SORT_CODE = 's'
CORRECT_CODE = 'c'
FILTERED_CODE = 'filtered'
UNFILTERED_CODE = 'unfiltered'
PROJECT_CODE = 'p'
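The comment on the file codes says they are appended to a filename each time it is transformed, which is consistent with names such as `output.s.c.bus` and `output.unfiltered.bus` defined earlier in this file. One way this could work is sketched below. `add_code` is a hypothetical helper written for this example, and the constants are copied from the diff so the block runs on its own.

```python
import os

# Copied from the constants in the diff so the sketch is self-contained.
SORT_CODE = 's'
CORRECT_CODE = 'c'
UNFILTERED_CODE = 'unfiltered'


def add_code(filename, code):
    """Hypothetical helper: insert `code` just before the file extension."""
    root, extension = os.path.splitext(filename)
    return f'{root}.{code}{extension}'


# Sorting and then correcting a BUS file accumulates both codes.
sorted_bus = add_code('output.bus', SORT_CODE)
corrected_bus = add_code(sorted_bus, CORRECT_CODE)
print(sorted_bus, corrected_bus)
```

Chaining the helper keeps the processing history visible in the filename, which may be why fixed names like `BUS_SC_FILENAME` sit alongside the individual codes.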