Pydamage version 0.80 #27

Merged
merged 38 commits into from
Sep 19, 2024
Commits
f10da15
add GtoA damage
maxibor Jul 11, 2024
e40005d
feat: add GtoA damage
maxibor Jul 11, 2024
d3cd9cf
feat: add damage rescaling
maxibor Jul 12, 2024
956a9b7
feat: add g2a substitutions
maxibor Jul 16, 2024
d5ff542
feat: add rescaling feature
maxibor Jul 16, 2024
7b1dafe
fix: damage test
maxibor Jul 16, 2024
2223195
feat: add numba to rescale
maxibor Jul 16, 2024
b0cb6df
pydamage version update
maxibor Jul 16, 2024
8670abc
dep: update conda env
maxibor Jul 16, 2024
3344de0
ci: update python version
maxibor Jul 16, 2024
ed5cdd0
dep: remove version pinning
maxibor Jul 16, 2024
3d4e30d
dep: remove python
maxibor Jul 16, 2024
cef0003
ci: update miniconda action
maxibor Jul 16, 2024
f3e6895
ci: update rtd config
maxibor Jul 16, 2024
d311267
ci: update rtd python version
maxibor Jul 16, 2024
3ea58d3
ci: update rtd dependancies
maxibor Jul 16, 2024
c8f0b35
doc: add benchmark notebook
maxibor Jul 17, 2024
fd989b4
fix: grouped mode rescaling
maxibor Jul 17, 2024
ec32bcb
doc: add rescaling notebook to rtd
maxibor Jul 17, 2024
a709509
feat: add more parallelization
maxibor Jul 18, 2024
3d5ff0d
feat: jit cache
maxibor Jul 18, 2024
d22777d
doc: update output plot
maxibor Jul 22, 2024
8c8d312
feat: add subsample
maxibor Jul 22, 2024
0b1b40b
feat: add pydamage CLI to bam rescaled bam output
maxibor Jul 22, 2024
7a6deb8
feat: reduce window to 13
maxibor Jul 23, 2024
6b41c33
fix: remove hardcoded variable
maxibor Jul 23, 2024
e19f8b1
feat: numpyfy code where possible
maxibor Jul 23, 2024
278efbf
Merge branch 'dev' of github.com:maxibor/pydamage into dev
maxibor Jul 23, 2024
ce1a560
cleanup: remove old code
maxibor Jul 23, 2024
77b0f88
cleanup: remove comment
maxibor Jul 23, 2024
fd54950
test: add subsample kwarg
maxibor Jul 23, 2024
97ba7a4
doc: (re)add repr
maxibor Jul 23, 2024
f165c2e
log: add progress tracking
maxibor Jul 23, 2024
77f0d7f
feat: move contig stats to dedicated modules
maxibor Jul 24, 2024
89c55b0
feat: switch print to logging
maxibor Jul 24, 2024
80ba101
log: add warning for no g2a used together with rescale
maxibor Jul 24, 2024
90146a3
feat: limit rescaling to window size
maxibor Jul 25, 2024
04ec8ab
doc: pydamage mapdamage benchmark
maxibor Sep 19, 2024
13 changes: 2 additions & 11 deletions .github/workflows/pydamage_ci.yml
@@ -10,23 +10,14 @@ jobs:
if: "!contains(github.event.head_commit.message, '[skip_ci]')"
steps:
- uses: actions/checkout@v2
- uses: conda-incubator/setup-miniconda@v2
- uses: conda-incubator/setup-miniconda@v3
with:
python-version: 3.7
auto-update-conda: true
mamba-version: "*"
channels: conda-forge,bioconda,defaults
channel-priority: true
environment-file: environment.yml
activate-environment: pydamage
- uses: actions/setup-java@v2
with:
distribution: 'adopt'
java-version: '11'
- name: Lint with flake8
shell: bash -l {0}
run: |
pip install flake8
flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
- name: Test with pytest
shell: bash -l {0}
run: |
5 changes: 5 additions & 0 deletions .readthedocs.yml
@@ -5,6 +5,11 @@
# Required
version: 2

build:
os: ubuntu-22.04
tools:
python: "mambaforge-latest"

# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/source/conf.py
1 change: 0 additions & 1 deletion conda/meta.yaml
@@ -31,7 +31,6 @@ requirements:
- tqdm>=4.45.0
- biopython>=1.78
- kneed>=0.7.0
- pypmml>=0.9.7

test:
commands:
Binary file added docs/img/reference.png
5 changes: 3 additions & 2 deletions docs/source/conf.py
@@ -19,7 +19,7 @@
# -- Project information -----------------------------------------------------

project = "pydamage"
copyright = "2020, Maxime Borry"
copyright = "2024, Maxime Borry"
author = "Maxime Borry"

# The full version, including alpha/beta/rc tags
@@ -42,6 +42,7 @@
"sphinx.ext.mathjax",
"sphinx_click.ext",
"recommonmark",
"nbsphinx",
]


@@ -76,7 +77,7 @@
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = [".rst", ".md", ".ipynb"]
source_suffix = [".rst", ".md"]

# The master toctree document.
master_doc = "index"
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -24,6 +24,7 @@ __ homepage_
CLI
output
troubleshooting
rescaling



11 changes: 10 additions & 1 deletion docs/source/output.md
@@ -29,6 +29,15 @@ The tabular outputs are comma-separated file (`.csv`) with the following column

Same file as above, but with contigs filtered with `qvalue <= 0.05` and `predicted_accuracy >= threshold`, where the filtering threshold is either user-defined (default = 0.5) or determined with the [kneedle](https://ieeexplore.ieee.org/document/5961514) method.
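
The filtering rule above can be sketched in a few lines of Python. This is only an illustration, not pydamage's own code: the `filter_contigs` helper, the `reference` column name, and the toy rows are assumptions made for the example. Only `qvalue`, `predicted_accuracy`, and the default threshold of 0.5 come from the description above.

```python
import csv
import io


def filter_contigs(rows, threshold=0.5, alpha=0.05):
    """Keep rows with qvalue <= alpha and predicted_accuracy >= threshold.

    Hypothetical helper for illustration; not part of the pydamage API.
    """
    return [
        row
        for row in rows
        if float(row["qvalue"]) <= alpha
        and float(row["predicted_accuracy"]) >= threshold
    ]


# Toy data invented for this example (not real pydamage output).
example_csv = """reference,qvalue,predicted_accuracy
contig_1,0.001,0.92
contig_2,0.20,0.95
contig_3,0.01,0.30
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
kept = filter_contigs(rows)
print([row["reference"] for row in kept])
```

Here only `contig_1` passes both conditions: `contig_2` fails the q-value cutoff and `contig_3` fails the accuracy threshold.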


### `pydamage_rescaled.bam`

The input alignment file with rescaled base quality scores, written when running `pydamage analyze` with the `-r` or `--rescale` flag.

For each read containing ancient DNA damage, the rescaled base quality scores are computed with the following formula, where `i` is the position in the read, `p_err` is the original base calling error probability, `p_pydam` is the ancient damage probability computed by pydamage, and `p_new` is the updated base calling error probability.

`p_new(i) = 1 - (1 - p_err(i)) (1 - p_pydam(i))`
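
As a minimal sketch of how this formula acts on a single Phred-scaled base quality, assuming the usual conversion `p_err = 10^(-Q/10)`: the function names, the rounding, and the cap at 93 below are assumptions made for illustration and do not reflect how pydamage implements rescaling internally.

```python
import math


def phred_to_prob(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def prob_to_phred(p, max_q=93):
    """Convert an error probability back to an integer Phred score.

    The cap (max_q) and the rounding are assumptions for this sketch.
    """
    if p <= 0:
        return max_q
    return max(0, min(max_q, int(round(-10 * math.log10(p)))))


def rescale_quality(q, p_pydam):
    """Apply p_new = 1 - (1 - p_err) * (1 - p_pydam) to one base quality."""
    p_err = phred_to_prob(q)
    p_new = 1 - (1 - p_err) * (1 - p_pydam)
    return prob_to_phred(p_new)


print(rescale_quality(30, 0.0))  # no damage: quality unchanged, 30
print(rescale_quality(30, 0.3))  # 30% damage probability: quality drops to 5
```

Because `p_new` is never smaller than `p_err`, rescaling can only lower a base quality, and it leaves positions with zero damage probability unchanged.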

### Plots

The visual outputs are PNG files, one per reference contig. They show the frequency of observed C to T and G to A transitions at the 5' end of the sequencing data, overlaid with the fitted null and damage models, including 95% confidence intervals. Each plot also provides a "residuals versus fitted" panel to help evaluate the fit of the pydamage damage model. Finally, the plot contains information on the average coverage along the reference and the p-value calculated from the likelihood-ratio test statistic using a chi-squared distribution.
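
To make the last point concrete, here is a hedged sketch of how a p-value can be derived from a likelihood-ratio statistic. It assumes a chi-squared distribution with one degree of freedom, which is an assumption made for this example; the degrees of freedom pydamage actually uses are not stated here. The function and argument names are hypothetical.

```python
import math


def lrt_pvalue(loglik_null, loglik_damage):
    """P-value of a likelihood-ratio test, assuming 1 degree of freedom.

    The statistic is 2 * (loglik_damage - loglik_null). For a chi-squared
    distribution with 1 degree of freedom, the survival function at x
    equals erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (loglik_damage - loglik_null))
    return math.erfc(math.sqrt(stat / 2.0))


# Identical log-likelihoods give a statistic of 0 and a p-value of 1.
print(lrt_pvalue(-100.0, -100.0))
```

A larger gap between the two log-likelihoods gives a larger statistic and therefore a smaller p-value.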
@@ -43,4 +52,4 @@ The visual output are PNG files, one per reference contig. They show the frequen

* **Visual output**

![pydamage_plot](../img/NZ_JHCB02000011.1.png)
![pydamage_plot](../img/reference.png)