fix bug add todo
berntpopp committed Oct 19, 2023
1 parent 19b4782 commit 4830a43
Showing 4 changed files with 17 additions and 5 deletions.
14 changes: 13 additions & 1 deletion analyses/calling/README.md
@@ -58,4 +58,16 @@ The pipeline produces the following outputs in the output_folder specified in th
- `variant_calls`: A folder containing the VCF files with the variant calls.
- `logs`: A folder containing the log files for the MuTect2 runs.

-Each VCF file is named with the format `<individual1>_<analysis>_<chromosome>.vcf.gz`.
+Each VCF file is named with the format `<individual1>_<analysis>_<chromosome>.vcf.gz`.
+
+# TODO
+- [ ] script to merge VCF files, f1r2 files, and stats files
+- use GatherVcfs (Picard) to merge VCF files
+- use gatk MergeMutectStats to merge stats files
+- the f1r2 files are not merged but are all used as input for LearnReadOrientationModel
+- [ ] script for LearnReadOrientationModel
+- this should be part of the merge script
+- [ ] script for FilterMutectCalls
+- [ ] script for CalculateContamination (plus GetPileupSummaries)
+
+--> see: https://gatk.broadinstitute.org/hc/en-us/articles/360035531132--How-to-Call-somatic-mutations-using-GATK4-Mutect2
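The merge steps listed in the TODO could be sketched roughly as follows. This is only an illustration, not the repository's script: the function names, directory names, and output file names are hypothetical, the per-chromosome inputs are assumed to follow the `<individual1>_<analysis>_<chromosome>` pattern from the README, and the stats files are assumed to sit next to each VCF as `<vcf>.stats`, which is where Mutect2 writes them by default. The sketch only builds the command lines; it does not run GATK.

```python
from typing import List

# Assumption: the same chromosome list as in the .smk files of this commit.
CHROMOSOMES = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def repeated(flag: str, paths: List[str]) -> List[str]:
    """Return [flag, p1, flag, p2, ...] for tools that take one flag per input."""
    args: List[str] = []
    for path in paths:
        args += [flag, path]
    return args


def merge_commands(variant_dir: str, merged_dir: str,
                   individual: str, analysis: str) -> List[List[str]]:
    """Build (not run) the three merge-stage commands named in the TODO."""
    base = f"{individual}_{analysis}"
    vcfs = [f"{variant_dir}/{base}_{c}.vcf.gz" for c in CHROMOSOMES]
    stats = [f"{vcf}.stats" for vcf in vcfs]
    f1r2 = [f"{variant_dir}/{base}_{c}.f1r2.tar.gz" for c in CHROMOSOMES]
    out = f"{merged_dir}/{base}"  # hypothetical output prefix
    return [
        # GatherVcfs expects its inputs in genomic order.
        ["gatk", "GatherVcfs", *repeated("-I", vcfs), "-O", f"{out}.vcf.gz"],
        ["gatk", "MergeMutectStats", *repeated("-stats", stats),
         "-O", f"{out}.vcf.gz.stats"],
        # The f1r2 archives are not merged; all of them are passed as inputs.
        ["gatk", "LearnReadOrientationModel", *repeated("-I", f1r2),
         "-O", f"{out}.read-orientation-model.tar.gz"],
    ]


if __name__ == "__main__":
    for cmd in merge_commands("results/variant_calls", "results/variant_merge",
                              "IND1", "somatic"):
        print(" ".join(cmd[:4]), "...", " ".join(cmd[-2:]))
```

Returning argument lists rather than shell strings keeps the sketch easy to test and lets a caller pass each list to `subprocess.run` or join it into a Snakemake `shell:` directive.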
2 changes: 1 addition & 1 deletion analyses/calling/bcftools_concat.smk
@@ -78,7 +78,7 @@ SCATTER_NAME_PREFIX = config.get("scatter_name_prefix", "")
# Define the scattering interval names based on the ranges and prefixes specified in the
# configuration file. This allows for a flexible definition of intervals, accommodating
# various naming conventions and range specifications.
-SCATTERING_INTERVAL_RANGES = config.get("scattering_interval_ranges", [str(i) for i in range(1, 22)] + ['X', 'Y'])
+SCATTERING_INTERVAL_RANGES = config.get("scattering_interval_ranges", [str(i) for i in range(1, 23)] + ['X', 'Y'])
SCATTERING_INTERVAL_NAMES = expand_interval_names(SCATTERING_INTERVAL_RANGES, SCATTER_NAME_PREFIX)

# Get the delimiter for scatter names; default is "."
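The change from `range(1, 22)` to `range(1, 23)` in this hunk (and in the two other `.smk` files of this commit) matters because Python's `range` excludes its stop value, so the old default silently skipped chromosome 22. A minimal check, using the default list from this file with illustrative variable names:

```python
# Default interval list before and after the fix.
old = [str(i) for i in range(1, 22)] + ["X", "Y"]
new = [str(i) for i in range(1, 23)] + ["X", "Y"]

# Names present after the fix but absent before it.
missing = [name for name in new if name not in old]

print(len(old), len(new), missing)  # 23 24 ['22']
```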
2 changes: 1 addition & 1 deletion analyses/calling/merge_mutect2_calls.smk
@@ -37,7 +37,7 @@ MERGED_DIR = prefix_results('variant_merge')
LOG_DIR = prefix_results('logs')

# List of chromosomes for processing
-chromosomes = [f"chr{i}" for i in range(1, 22)] + ["chrX", "chrY"]
+chromosomes = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]
# ----------------------------------------------------------------------------------- #

# ----------------------------------------------------------------------------------- #
4 changes: 2 additions & 2 deletions analyses/calling/mutect2_calling.smk
@@ -39,7 +39,7 @@ VARIANT_DIR = prefix_results('variant_calls')
LOG_DIR = prefix_results('logs')

# List of chromosomes to loop through
-chromosomes = [f"chr{i}" for i in range(1, 22)] + ["chrX", "chrY"]
+chromosomes = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]
# ----------------------------------------------------------------------------------- #

# ----------------------------------------------------------------------------------- #
@@ -98,7 +98,7 @@ rule call_variants:
$normal_sample_option \
--germline-resource {params.af_only_gnomad} \
--panel-of-normals {params.panel_of_normals} \
---f1r2-tar-gz {OUTPUT_DIR}/{params.individual}_{params.analysis}_{wildcards.chromosome}.f1r2.tar.gz \
+--f1r2-tar-gz {VARIANT_DIR}/{params.individual}_{params.analysis}_{wildcards.chromosome}.f1r2.tar.gz \
$scatter_option \
-O {output.variant_file} 2> {log.mutect2}
"""