feat: starting the book
mrvollger committed Jun 11, 2024
1 parent 46b418a commit cf03b1d
Showing 4 changed files with 102 additions and 60 deletions.
4 changes: 2 additions & 2 deletions docs/book.toml
@@ -29,6 +29,6 @@ renderer = ["html"]
max-level = 4


[preprocessor.auto-links]
command = "python auto-links.py"
#[preprocessor.auto-links]
#command = "python auto-links.py"

1 change: 1 addition & 0 deletions docs/make_help_docs.sh
@@ -6,6 +6,7 @@ export DYLD_LIBRARY_PATH=${LIBTORCH}/lib:$LD_LIBRARY_PATH

out="src/help.md"
echo "# Help pages for fibertools subcommands" >$out
echo "<!-- toc -->" >>$out
echo "" >>$out

for subcommand in "" "predict-m6a" "fire" "extract" "center" "add-nucleosomes" "footprint" "clear-kinetics" "strip-basemods" "track-decorators" "pileup"; do
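The body of this loop is not shown in the diff. Judging only from the layout of the generated `docs/src/help.md` (a level-two heading naming the subcommand, followed by a `console` fence holding the help text), it might look roughly like the sketch below. The `ft` stub, the temporary output path, and the shortened subcommand list are assumptions made so the example runs without fibertools installed.

```bash
#!/usr/bin/env bash
# Sketch only: the real loop body is not part of this diff.
# `ft` is stubbed here so the example runs without fibertools installed.
ft() { echo "stub help for: ft $*"; }

# Hypothetical output path; the real script writes to src/help.md.
out="$(mktemp)"
echo "# Help pages for fibertools subcommands" >"$out"
echo "<!-- toc -->" >>"$out"
echo "" >>"$out"

# Shortened list for illustration; "" stands for the top-level `ft` help.
for subcommand in "" "fire" "extract"; do
    {
        echo "## \`ft $subcommand\`"
        # \140 is the octal code for a backtick, so this prints a console fence.
        printf '\140\140\140console\n'
        # Left unquoted on purpose so "" expands to no argument at all.
        ft $subcommand --help
        printf '\140\140\140\n'
        echo ""
    } >>"$out"
done

cat "$out"
```

Writing the `<!-- toc -->` marker once, before the loop, keeps it directly under the page title, which matches where it appears in the regenerated `help.md`.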
9 changes: 8 additions & 1 deletion docs/src/creating/fire.md
@@ -5,4 +5,11 @@ This command identifies **<ins>F</ins>iber-seq <ins>I</ins>nferred <ins>R</ins>e
This command can be run in isolation; however, it is usually preferable to run the [FIRE pipeline](https://github.com/fiberseq/FIRE), which runs `ft fire` and performs many additional analyses and visualizations.


[**The help page**](../help.md#ft-fire).
[**The help page**](../help.md#ft-fire).

## Extracting from a FIRE BAM
`ft fire` can also be used to extract Fiber-seq data from an already processed FIRE BAM file.
```bash
ft fire --extract fire.bam > all.bed
```
This produces a BED file containing all the MSPs, FIREs, and nucleosomes in the FIRE BAM file. The output is analogous to the [now removed](https://github.com/fiberseq/FIRE/issues/24) `model.results.bed.gz` file produced by older versions of the FIRE pipeline.
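As a rough illustration of working with the extracted BED, the sketch below tallies the number of records and the mean feature length per chromosome. It relies only on the three standard BED columns (chrom, start, end); the layout of the remaining bed9+ columns is not described here, so they are ignored. The `sample_bed` and `summarize_bed` functions and the sample rows are invented for the example.

```bash
# Hypothetical stand-in for `ft fire --extract fire.bam`; the rows are invented
# and only the first three (standard BED) columns are meaningful here.
sample_bed() {
    printf 'chr1\t100\t250\n'
    printf 'chr1\t300\t385\n'
    printf 'chr2\t50\t150\n'
}

# Count records and average length (end - start) per chromosome.
# Lines starting with '#' are skipped in case the file carries a header.
summarize_bed() {
    awk 'BEGIN { FS = OFS = "\t" }
         !/^#/ { n[$1]++; len[$1] += $3 - $2 }
         END   { for (c in n) print c, n[c], len[c] / n[c] }' | sort
}

sample_bed | summarize_bed
```

With fibertools installed, the same function could be fed real data, for example `ft fire --extract fire.bam | summarize_bed`.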
148 changes: 91 additions & 57 deletions docs/src/help.md
@@ -1,4 +1,5 @@
# Help pages for fibertools subcommands
<!-- toc -->

## `ft `
```console
@@ -7,13 +8,16 @@ Fiber-seq toolkit in rust
Usage: ft [OPTIONS] <COMMAND>

Commands:
predict-m6a Predict m6A positions using HiFi kinetics data and encode the results in the MM and ML bam tags. Also adds
nucleosome (nl, ns) and MTase sensitive patches (al, as) [aliases: m6A, m6a]
predict-m6a Predict m6A positions using HiFi kinetics data and encode the results in the
MM and ML bam tags. Also adds nucleosome (nl, ns) and MTase sensitive
patches (al, as) [aliases: m6A, m6a]
add-nucleosomes Add nucleosomes to a bam file with m6a predictions
fire Add FIREs (Fiber-seq Inferred Regulatory Elements) to a bam file with m6a predictions
fire Add FIREs (Fiber-seq Inferred Regulatory Elements) to a bam file with m6a
predictions
extract Extract fiberseq data into plain text files [aliases: ex, e]
center This command centers fiberseq data around given reference positions. This is useful for making aggregate m6A
and CpG observations, as well as visualization of SVs [aliases: c, ct]
center This command centers fiberseq data around given reference positions. This is
useful for making aggregate m6A and CpG observations, as well as
visualization of SVs [aliases: c, ct]
footprint Infer footprints from fiberseq data
track-decorators Make decorated bed files for fiberseq data
pileup Make a pileup track of Fiber-seq features from a FIRE bam
@@ -35,14 +39,15 @@ Debug-Options:

## `ft predict-m6a`
```console
Predict m6A positions using HiFi kinetics data and encode the results in the MM and ML bam tags. Also adds nucleosome (nl, ns) and
MTase sensitive patches (al, as)
Predict m6A positions using HiFi kinetics data and encode the results in the MM and ML bam tags.
Also adds nucleosome (nl, ns) and MTase sensitive patches (al, as)

Usage: ft predict-m6a [OPTIONS] [BAM] [OUT]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]
[OUT] Output bam file with m6A calls in new/extended MM and ML bam tags [default: -]

Options:
@@ -62,7 +67,8 @@ Options:
Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options:
@@ -73,9 +79,12 @@ Debug-Options:
--quiet Turn off all logging

Developer-Options:
--force-min-ml-score <FORCE_MIN_ML_SCORE> Force a different minimum ML score
--all-calls Keep all m6A calls regardless of how low the ML value is
-b, --batch-size <BATCH_SIZE> Number of reads to include in batch prediction [default: 1]
--force-min-ml-score <FORCE_MIN_ML_SCORE>
Force a different minimum ML score
--all-calls
Keep all m6A calls regardless of how low the ML value is
-b, --batch-size <BATCH_SIZE>
Number of reads to include in batch prediction [default: 1]
```

## `ft fire`
@@ -85,10 +94,11 @@ Add FIREs (Fiber-seq Inferred Regulatory Elements) to a bam file with m6a predic
Usage: ft fire [OPTIONS] [BAM] [OUT]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[OUT] Output file (BAM by default, table of MSP features if `--feats-to-text` is used, and bed9 + if `--extract`` is used)
[default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]
[OUT] Output file (BAM by default, table of MSP features if `--feats-to-text` is used, and bed9 +
if `--extract`` is used) [default: -]

Options:
-e, --extract
@@ -102,18 +112,21 @@ Options:
--min-msp <MIN_MSP>
Skip reads without at least `N` MSP calls [env: MIN_MSP=] [default: 0]
--min-ave-msp-size <MIN_AVE_MSP_SIZE>
Skip reads without an average MSP size greater than `N` [env: MIN_AVE_MSP_SIZE=] [default: 0]
Skip reads without an average MSP size greater than `N` [env: MIN_AVE_MSP_SIZE=] [default:
0]
-w, --width-bin <WIDTH_BIN>
Width of bin for feature collection [env: WIDTH_BIN=] [default: 40]
-b, --bin-num <BIN_NUM>
Number of bins to collect [env: BIN_NUM=] [default: 9]
--best-window-size <BEST_WINDOW_SIZE>
Calculate stats for the highest X bp window within each MSP Should be a fair amount higher than the expected linker length
[env: BEST_WINDOW_SIZE=] [default: 100]
Calculate stats for the highest X bp window within each MSP Should be a fair amount higher
than the expected linker length [env: BEST_WINDOW_SIZE=] [default: 100]
--min-msp-length-for-positive-fire-call <MIN_MSP_LENGTH_FOR_POSITIVE_FIRE_CALL>
Minium length of msp to call a FIRE [env: MIN_MSP_LENGTH_FOR_POSITIVE_FIRE_CALL=] [default: 85]
Minium length of msp to call a FIRE [env: MIN_MSP_LENGTH_FOR_POSITIVE_FIRE_CALL=]
[default: 85]
--model <MODEL>
Optional path to a model json file. If not provided ft will use the default model (recommended) [env: FIRE_MODEL=]
Optional path to a model json file. If not provided ft will use the default model
(recommended) [env: FIRE_MODEL=]
--fdr-table <FDR_TABLE>
Optional path to a FDR table [env: FDR_TABLE=]
-h, --help
@@ -122,7 +135,8 @@ Options:
Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options:
@@ -140,8 +154,9 @@ Extract fiberseq data into plain text files
Usage: ft extract [OPTIONS] [BAM]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]

Options:
-r, --reference Report in reference sequence coordinates
@@ -155,7 +170,8 @@ Options:
-V, --version Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options:
@@ -172,21 +188,25 @@ All-Format-Options:

## `ft center`
```console
This command centers fiberseq data around given reference positions. This is useful for making aggregate m6A and CpG observations, as
well as visualization of SVs
This command centers fiberseq data around given reference positions. This is useful for making
aggregate m6A and CpG observations, as well as visualization of SVs

Usage: ft center [OPTIONS] --bed <BED> [BAM]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]

Options:
-b, --bed <BED> Bed file on which to center fiberseq reads. Data is adjusted to the start position of the bed file and corrected
for strand if the strand is indicated in the 6th column of the bed file. The 4th column will also be checked for
the strand but only after the 6th is. If you include strand information in the 4th (or 6th) column it will
orient data accordingly and use the end position of bed record instead of the start if on the minus strand. This
means that profiles of motifs in both the forward and minus orientation will align to the same central position
-b, --bed <BED> Bed file on which to center fiberseq reads. Data is adjusted to the start
position of the bed file and corrected for strand if the strand is indicated in
the 6th column of the bed file. The 4th column will also be checked for the
strand but only after the 6th is. If you include strand information in the 4th
(or 6th) column it will orient data accordingly and use the end position of bed
record instead of the start if on the minus strand. This means that profiles of
motifs in both the forward and minus orientation will align to the same central
position
-d, --dist <DIST> Set a maximum distance from the start of the motif to keep a feature
-w, --wide Provide data in wide format, one row per read
-r, --reference Return relative reference position instead of relative molecular position
@@ -195,7 +215,8 @@ Options:
-V, --version Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options:
@@ -213,8 +234,9 @@ Add nucleosomes to a bam file with m6a predictions
Usage: ft add-nucleosomes [OPTIONS] [BAM] [OUT]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]
[OUT] Output bam file with nucleosome calls [default: -]

Options:
@@ -232,7 +254,8 @@ Options:
Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options:
@@ -250,19 +273,21 @@ Infer footprints from fiberseq data
Usage: ft footprint [OPTIONS] --bed <BED> --yaml <YAML> [BAM]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]

Options:
-b, --bed <BED> BED file with the regions to footprint. Should all contain the same motif with proper strand information, and
ideally be ChIP-seq peaks
-b, --bed <BED> BED file with the regions to footprint. Should all contain the same motif with
proper strand information, and ideally be ChIP-seq peaks
-y, --yaml <YAML> yaml describing the modules of the footprint
-o, --out <OUT> Output bam [default: -]
-h, --help Print help
-V, --version Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options:
@@ -280,16 +305,18 @@ Remove HiFi kinetics tags from the input bam file
Usage: ft clear-kinetics [OPTIONS] [BAM] [OUT]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]
[OUT] Output bam file without hifi kinetics [default: -]

Options:
-h, --help Print help
-V, --version Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options:
@@ -307,17 +334,20 @@ Strip out select base modifications
Usage: ft strip-basemods [OPTIONS] [BAM] [OUT]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]
[OUT] Output bam file [default: -]

Options:
-b, --basemod <BASEMOD> base modification to strip out of the bam file [default: m6A] [possible values: m6A, 6mA, 5mC, CpG]
-b, --basemod <BASEMOD> base modification to strip out of the bam file [default: m6A] [possible
values: m6A, 6mA, 5mC, CpG]
-h, --help Print help
-V, --version Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options:
@@ -335,8 +365,9 @@ Make decorated bed files for fiberseq data
Usage: ft track-decorators [OPTIONS] --bed12 <BED12> [BAM]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]

Options:
-b, --bed12 <BED12> Output path for bed12 file to be decorated
@@ -345,7 +376,8 @@ Options:
-V, --version Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options:
@@ -363,8 +395,9 @@ Make a pileup track of Fiber-seq features from a FIRE bam
Usage: ft pileup [OPTIONS] [BAM] [RGN]

Arguments:
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a HiFi bam file with kinetics data.
For other commands, this should be a bam file with m6A calls [default: -]
[BAM] Input BAM file. If no path is provided stdin is used. For m6A prediction, this should be a
HiFi bam file with kinetics data. For other commands, this should be a bam file with m6A
calls [default: -]
[RGN] Region string to make a pileup of. If not provided will make a pileup of the whole genome

Options:
@@ -378,7 +411,8 @@ Options:
-V, --version Print version

BAM-Options:
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default: 0]
-F, --filter <BIT_FLAG> BAM bit flags to filter on, equivalent to `-F` in samtools view [default:
0]
--ml <MIN_ML_SCORE> Minium score in the ML tag to use or include in the output [default: 125]

Global-Options: