Unable to generate count table with reference peaks #219

Open
gbloeb opened this issue Mar 25, 2022 · 4 comments

@gbloeb

gbloeb commented Mar 25, 2022

I ran a project with reference peaks, but I am unable to generate a count table for the reference peaks.
When the project is run:

  1. Peaks are still called for each sample individually.
  2. Each sample's peak_calling_mm10 directory contains both:
    GLIS3_Ctrl_1_S2_ref_peaks_coverage.bed, corresponding to the reference peaks, and
    GLIS3_Ctrl_1_S2_peaks_coverage.bed.gz, corresponding to the called peaks.

When I run the project-level pipeline, consensus peaks are still generated, and the count table is built from the consensus peaks rather than the reference peaks, with this warning:
Warning message:
In PEPATACr::peakCounts(sample_table, summary_dir, argv$results, :
Peak coverage files are not derived from a singular reference peak set.
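
For anyone trying to reproduce this, a small script along the following lines could confirm which kinds of coverage file each sample directory holds. It is only a sketch and not part of PEPATAC: the function name `classify_coverage_files` is hypothetical, and the layout (one directory per sample containing `peak_calling_<genome>`) and the file suffixes are assumptions taken from the file names listed above.

```python
from pathlib import Path

# Suffixes assumed from the file names reported in this issue.
REF_SUFFIX = "_ref_peaks_coverage.bed"
CALLED_SUFFIXES = ("_peaks_coverage.bed", "_peaks_coverage.bed.gz")


def classify_coverage_files(results_dir, genome="mm10"):
    """Map each sample directory name to the kinds of coverage file it holds.

    Hypothetical helper: 'reference' means a *_ref_peaks_coverage.bed(.gz)
    file was found, 'called' means a *_peaks_coverage.bed(.gz) file was found.
    """
    found = {}
    for sample_dir in sorted(Path(results_dir).iterdir()):
        peak_dir = sample_dir / f"peak_calling_{genome}"
        if not peak_dir.is_dir():
            continue  # not a sample directory, or peak calling was not run
        kinds = set()
        for path in peak_dir.iterdir():
            name = path.name
            # Test the reference suffix first: it also ends with the
            # called-peak suffix, so the order of the checks matters.
            if name.endswith((REF_SUFFIX, REF_SUFFIX + ".gz")):
                kinds.add("reference")
            elif name.endswith(CALLED_SUFFIXES):
                kinds.add("called")
        found[sample_dir.name] = kinds
    return found
```

Pointing it at the `results_pipeline` directory returns a dict keyed by sample name. A sample whose set contains both `"reference"` and `"called"` matches the mixed state described above.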

My config:

# This project config file describes your project. See looper docs for details.
name: GLIS3_ATAC_nolambda_qe-7_sh-30_peaks # The name that summary files will be prefaced with

pep_version: 2.0.0
sample_table: annotation_onlyGLIS3.csv  # sheet listing all samples in the project

looper:  # relative paths are relative to this config file
  output_dir: ~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results
  pipeline_interfaces: ~/pepatac/project_pipeline_interface.yaml  # PATH to the directory where looper will find the pipeline repository.

sample_modifiers:
  append:
    pipeline_interfaces: ~/pepatac/sample_pipeline_interface.yaml
  derive:
    attributes: [read1, read2]
    sources:
      R1: "~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/fastq/{sample_name}_R1_001.fastq.gz"
      R2: "~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/fastq/{sample_name}_R2_001.fastq.gz"
  imply:
    - if:
        organism: ["mouse"]
      then:
        genome: mm10
        prealignment_names: ["mouse_chrM2x"]
        genome_size: "2.3e9"
        frip_ref_peaks: ~/group/bulk_atac/220126_GLIS_ATAC_IMCD3/comb_peak_call/GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.narrowPeak

A sample log file:
PEPATAC_log.md

Project log file:
`### Pipeline run code and environment:

  •          Command:  `/wynton/protected/home/reiter/gloeb/pepatac/pipelines/pepatac_collator.py --config /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml -O /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results -P 1 -M 16G -n GLIS3_ATAC_nolambda_qe-7_sh-30_peaks -r /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/results_pipeline`
    
  •     Compute host:  plog1.wynton.ucsf.edu
    
  •      Working dir:  /wynton/group/reiter/gabe/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results
    
  •        Outfolder:  /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/
    
  • Pipeline started at: (03-24 21:30:49) elapsed: 0.0 TIME

Version log:

  • Python version: 3.9.7
  •      Pypiper dir:  `/wynton/protected/home/reiter/gloeb/miniconda3/envs/pepatac/lib/python3.9/site-packages/pypiper`
    
  •  Pypiper version:  0.12.3
    
  •     Pipeline dir:  `/wynton/protected/home/reiter/gloeb/pepatac/pipelines`
    
  • Pipeline version:  0.0.4
    

Arguments passed to pipeline:

  •    `config_file`:  `/wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml`
    
  •          `cores`:  `1`
    
  •         `cutoff`:  `2`
    
  •          `dirty`:  `False`
    
  •   `force_follow`:  `False`
    
  •         `logdev`:  `False`
    
  •            `mem`:  `16G`
    
  •       `min_olap`:  `1`
    
  •      `min_score`:  `5`
    
  •           `name`:  `GLIS3_ATAC_nolambda_qe-7_sh-30_peaks`
    
  •      `new_start`:  `False`
    
  •     `normalized`:  `False`
    
  •  `output_parent`:  `/wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results`
    
  •       `poverlap`:  `False`
    
  •        `recover`:  `False`
    
  •        `results`:  `/wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/results_pipeline`
    
  •         `silent`:  `False`
    
  • `skip_consensus`:  `False`
    
  •     `skip_table`:  `False`
    
  •       `testmode`:  `False`
    
  •      `verbosity`:  `None`
    

Target to produce: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_libComplexity.pdf,/wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_*_consensusPeaks.narrowPeak,/wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_peaks_coverage.tsv

Rscript /wynton/protected/home/reiter/gloeb/pepatac/tools/PEPATAC_summarizer.R /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/results_pipeline 2 5 1 (9429)

Loading config file: /wynton/protected/home/reiter/gloeb/pepatac/220126_GLIS_ATAC_IMCD3/config_GLIS_doxpctrl_shift.bed_nolambda_q0.0000001_sh-30_peaks.yaml
Creating stats summary...
Summary (n=4): /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_stats_summary.tsv
Creating assets summary...
Summary (n=4): /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_assets_summary.tsv
Creating summary plots...
4 of 4 library complexity files available.
INFO: Found real counts for GLIS3_Ctrl_1_S2 - Total (M): 126.745044 Unique (M): 108.93747
INFO: Found real counts for GLIS3_Ctrl_2_S7 - Total (M): 111.712142 Unique (M): 96.955292
INFO: Found real counts for GLIS3_Dox_1_S3 - Total (M): 221.935342 Unique (M): 183.353884
INFO: Found real counts for GLIS3_Dox_2_S8 - Total (M): 116.520356 Unique (M): 105.904826

WARNING: y-max value changed from default 139.24586665 to the max real data 201.6892724
Successfully produced project summary plots.

Calculating mm10 consensus peak set from 4 samples...
Consensus peak set: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_mm10_consensusPeaks.narrowPeak

Calculating mm10 peak counts for 4 samples...
Counts table: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_mm10_peaks_coverage.tsv

Counts table: /wynton/protected/home/reiter/gloeb/group/bulk_atac/220126_GLIS_ATAC_IMCD3/GLIS3_nolambda_qe-7_sh-30_peaks/pepatac_results/summary/GLIS3_ATAC_nolambda_qe-7_sh-30_peaks_mm10_peaks_coverage.tsv

Warning message:
In PEPATACr::peakCounts(sample_table, summary_dir, argv$results,  :
  Peak coverage files are not derived from a singular reference peak set.

Command completed. Elapsed time: 0:00:57. Running peak memory: 0.91GB.
PID: 9429; Command: Rscript; Return code: 0; Memory used: 0.91GB

Pipeline completed. Epilogue

  •    Elapsed time (this run):  0:00:57
    
  • Total elapsed time (all runs): 0:00:57
  •     Peak memory (this run):  0.9104 GB
    
  •    Pipeline completed time: 2022-03-24 21:31:46
    

`

@Kange2014

Does anyone have an update on this issue? I have encountered the same problem. Thanks.

@ljmills

ljmills commented Jan 6, 2023

I am also having this issue.

@zhongzheng1999

The issue seems to persist; I am also running into it. @donaldcampbelljr, could you please take a look? Thanks!

@zhongzheng1999

@ljmills Did you ever find a solution? Thanks!
