Unable to generate count table with reference peaks #219

gbloeb · 2022-03-25T04:48:12Z

I ran a project with reference peaks, but I am unable to generate a count table for the reference peaks.
When the project is run:

- Peaks are still called for each sample individually.
- Each sample/peak_calling_mm10 directory contains both:
  - GLIS3_Ctrl_1_S2_ref_peaks_coverage.bed, corresponding to the reference peaks, and
  - GLIS3_Ctrl_1_S2_peaks_coverage.bed.gz, corresponding to the called peaks.
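
One detail of the two file names above is easy to miss: a name ending in `_ref_peaks_coverage.bed` also ends in `_peaks_coverage.bed`, so a loose suffix or glob match would pick up both kinds of file. The sketch below is only an illustration of telling them apart by name. The function and constant names are hypothetical, and nothing here is taken from PEPATACr; whether PEPATACr matches files this way is not established in this thread.

```python
from pathlib import Path
from typing import Optional, Tuple

# Suffixes as they appear in the file names quoted in this issue.
REF_SUFFIX = "_ref_peaks_coverage.bed"
CALLED_SUFFIX = "_peaks_coverage.bed"


def strip_gz(name: str) -> str:
    """Drop a trailing .gz so compressed and plain files compare alike."""
    return name[:-3] if name.endswith(".gz") else name


def classify_coverage_file(path: str) -> Optional[Tuple[str, str]]:
    """Return (kind, sample_name) for a coverage file, or None if unrelated.

    The reference suffix is tested first because a reference file name
    also ends with the called-peak suffix.
    """
    name = strip_gz(Path(path).name)
    if name.endswith(REF_SUFFIX):
        return "reference", name[: -len(REF_SUFFIX)]
    if name.endswith(CALLED_SUFFIX):
        return "called", name[: -len(CALLED_SUFFIX)]
    return None
```

Running this over the contents of each peak_calling_mm10 directory gives a quick inventory of which samples have a reference coverage file, a called-peak coverage file, or both.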

When I run the project-level pipeline, consensus peaks are still generated and the count table is built from the consensus peaks, with this warning:
Warning message:
In PEPATACr::peakCounts(sample_table, summary_dir, argv$results, :
Peak coverage files are not derived from a singular reference peak set.
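
As a possible stopgap while the project-level table is built from consensus peaks, the per-sample reference coverage files could be merged by hand. The following is a minimal sketch, not PEPATACr's implementation. It assumes the files are tab-separated with chromosome, start, and end in the first three columns; the column holding the count is not shown in this thread, so it is left as a caller-supplied index (`count_col`, hypothetical). The function names are hypothetical as well.

```python
import csv
import gzip
from typing import Dict, List, Tuple

Peak = Tuple[str, int, int]


def read_coverage(path: str, count_col: int) -> Dict[Peak, str]:
    """Read one tab-separated coverage file into {(chrom, start, end): count}."""
    opener = gzip.open if path.endswith(".gz") else open
    counts: Dict[Peak, str] = {}
    with opener(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row or row[0].startswith("#"):
                continue  # skip blank lines and comment/header lines
            counts[(row[0], int(row[1]), int(row[2]))] = row[count_col]
    return counts


def build_count_table(files: Dict[str, str], count_col: int) -> List[List[str]]:
    """Merge per-sample files into rows of [chr, start, end, count per sample].

    `files` maps sample name to file path. Raises ValueError if any sample's
    peak coordinates differ from the first sample's, i.e. the files do not
    share a single reference peak set.
    """
    if not files:
        raise ValueError("no coverage files given")
    samples = sorted(files)
    per_sample = {s: read_coverage(files[s], count_col) for s in samples}
    reference = set(per_sample[samples[0]])
    for s in samples[1:]:
        if set(per_sample[s]) != reference:
            raise ValueError(f"{s} does not share the peak set of {samples[0]}")
    table = [["chr", "start", "end"] + samples]
    for chrom, start, end in sorted(reference):
        row = [chrom, str(start), str(end)]
        row += [per_sample[s][(chrom, start, end)] for s in samples]
        table.append(row)
    return table
```

The coordinate check mirrors the condition the warning describes: if it raises, the inputs really do not come from one peak set; if it passes, the resulting rows can be written out with `csv.writer` as a TSV.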

My config:

# This project config file describes your project. See looper docs for details.
name: GLIS3_ATAC_nolambda_qe-7_sh-30_peaks # The name that summary files will be prefaced with

pep_version: 2.0.0
sample_table: annotation_onlyGLIS3.csv  # sheet listing all samples in the project

looper:  # relative paths are relative to this config file
  output_dir: ~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results
  pipeline_interfaces: ~/pepatac/project_pipeline_interface.yaml  # PATH to the directory where looper will find the pipeline repository.

sample_modifiers:
  append:
    pipeline_interfaces: ~/pepatac/sample_pipeline_interface.yaml
  derive:
    attributes: [read1, read2]
    sources:
      R1: "~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/fastq/{sample_name}_R1_001.fastq.gz"
      R2: "~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/fastq/{sample_name}_R2_001.fastq.gz"
  imply:
    - if:
        organism: ["mouse"]
      then:
        genome: mm10
        prealignment_names: ["mouse_chrM2x"]
        genome_size: "2.3e9"
        frip_ref_peaks: ~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/comb_peak_call/GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.narrowPeak
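
YAML does not allow tab characters in indentation, and a config pasted into an issue or edited in different editors can pick them up. A minimal sketch for checking a config's text before handing it to looper; the function name is hypothetical and the check uses only the standard library:

```python
from typing import List, Tuple


def find_tab_indented_lines(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) for every line whose indentation holds a tab."""
    offenders = []
    for number, line in enumerate(text.splitlines(), start=1):
        # The indentation is the leading run of spaces and tabs.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            offenders.append((number, line))
    return offenders
```

Reading the config with `open(path).read()` and passing the text to this function lists any lines that need their tabs replaced with spaces.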

a sample log file:
PEPATAC_log.md

project log file:
`### Pipeline run code and environment:

         Command:  `/wynton/protected/home/reiter/gloeb/pepatac/pipelines/pepatac_collator.py --config /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml -O /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results -P 1 -M 16G -n GLIS3_ATAC_nolambda_qe-7_sh-30_peaks -r /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/results_pipeline`

    Compute host:  plog1.wynton.ucsf.edu

     Working dir:  /wynton/group/reiter/gabe/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results

       Outfolder:  /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/

Pipeline started at: (03-24 21:30:49) elapsed: 0.0 TIME

Version log:

```
  Python version:  3.9.7
     Pypiper dir:  /wynton/protected/home/reiter/gloeb/miniconda3/envs/pepatac/lib/python3.9/site-packages/pypiper
 Pypiper version:  0.12.3
    Pipeline dir:  /wynton/protected/home/reiter/gloeb/pepatac/pipelines
Pipeline version:  0.0.4
```

Arguments passed to pipeline:

```
   config_file:  /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml
         cores:  1
        cutoff:  2
         dirty:  False
  force_follow:  False
        logdev:  False
           mem:  16G
      min_olap:  1
     min_score:  5
          name:  GLIS3_ATAC_nolambda_qe-7_sh-30_peaks
     new_start:  False
    normalized:  False
 output_parent:  /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results
      poverlap:  False
       recover:  False
       results:  /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/results_pipeline
        silent:  False
skip_consensus:  False
    skip_table:  False
      testmode:  False
     verbosity:  None
```

Target to produce: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_libComplexity.pdf,/wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_*_consensusPeaks.narrowPeak,/wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_peaks_coverage.tsv

Rscript /wynton/protected/home/reiter/gloeb/pepatac/tools/PEPATAC_summarizer.R /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/results_pipeline 2 5 1 (9429)

Loading config file: /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml
Creating stats summary...
Summary (n=4): /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_stats_summary.tsv
Creating assets summary...
Summary (n=4): /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_assets_summary.tsv
Creating summary plots...
4 of 4 library complexity files available.
INFO: Found real counts for GLIS3_Ctrl_1_S2 - Total (M): 126.745044 Unique (M): 108.93747
INFO: Found real counts for GLIS3_Ctrl_2_S7 - Total (M): 111.712142 Unique (M): 96.955292
INFO: Found real counts for GLIS3_Dox_1_S3 - Total (M): 221.935342 Unique (M): 183.353884
INFO: Found real counts for GLIS3_Dox_2_S8 - Total (M): 116.520356 Unique (M): 105.904826

WARNING: y-max value changed from default 139.24586665 to the max real data 201.6892724
Successfully produced project summary plots.

Calculating mm10 consensus peak set from 4 samples...
Consensus peak set: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_mm10_consensusPeaks.narrowPeak

Calculating mm10 peak counts for 4 samples...
Counts table: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_mm10_peaks_coverage.tsv

Counts table: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_mm10_peaks_coverage.tsv

Warning message:
In PEPATACr::peakCounts(sample_table, summary_dir, argv$results,  :
  Peak coverage files are not derived from a singular reference peak set.

Command completed. Elapsed time: 0:00:57. Running peak memory: 0.91GB.
PID: 9429; Command: Rscript; Return code: 0; Memory used: 0.91GB

Pipeline completed. Epilogue

```
      Elapsed time (this run):  0:00:57
Total elapsed time (all runs):  0:00:57
       Peak memory (this run):  0.9104 GB
      Pipeline completed time:  2022-03-24 21:31:46
```

`


Kange2014 · 2022-11-07T09:30:34Z

Does anyone have an update on this issue? I have encountered the same problem. Thanks.

ljmills · 2023-01-06T15:51:16Z

I am also having this issue.

zhongzheng1999 · 2024-02-20T06:58:28Z

The issue seems to persist; I am running into it as well. @donaldcampbelljr Could you please help resolve it? Thanks!

zhongzheng1999 · 2024-02-20T07:00:18Z

@ljmills Did you ever find a solution? Thanks!

zhongzheng1999 mentioned this issue Feb 20, 2024

The peakCounts function in PEPATACr.R should probably need fixing #273

Closed

nsheff assigned donaldcampbelljr and jpsmith5 May 14, 2024
