NextFlow #112

MadeleineOman · 2025-01-23T16:56:06Z

I'm trying to run the Nextflow pipeline. It creates temporary folders under the "work" folder during the analysis, but during execution it for some reason cannot find these folders (which I verified do exist once the pipeline has errored out). I've included a screenshot of the error, as well as the whole .log file. I'm sure this is just some small issue (i.e. a previous step not functioning exactly as expected, so the next step errors completely), but because the error itself isn't that informative I don't know where within the pipeline to look for troubleshooting.

Any help is appreciated!

2025_01_23_nextflow.log

fa8sanger · 2025-01-24T09:49:53Z

The log says the "bsub" command wasn't found. This is the program that submits jobs to the LSF queue system. Which queue system are you using?
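A quick way to answer that question is to check which scheduler submit commands are on the PATH. This is only a rough sketch: the function name `detect_executor` is made up for illustration, and the mapping assumes the usual submit commands (`bsub` for LSF, `sbatch` for Slurm, `qsub` for SGE or PBS). If none of them is found, the machine is most likely a plain server with no queue system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess the scheduler from the submit commands on PATH.
detect_executor() {
    if command -v bsub >/dev/null 2>&1; then
        echo "lsf"
    elif command -v sbatch >/dev/null 2>&1; then
        echo "slurm"
    elif command -v qsub >/dev/null 2>&1; then
        # SGE and PBS both provide qsub, so this check cannot tell them apart.
        echo "sge-or-pbs"
    else
        echo "local"
    fi
}

echo "Detected scheduler: $(detect_executor)"
```

On a server without any queue system this should print "local", which matches the "bsub not found" message in the log.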

MadeleineOman · 2025-01-27T16:11:44Z

Actually we are using a local server, not a cluster. Can we still run the pipeline using Nextflow?

If not, I've taken a look at the manual implementation but also see "bsub" commands there. Would the fix be simply to run the code directly? I.e., instead of
bsub -q basement -G team78-grp -o out -e log -M20000 -R"span[hosts=1] select[mem>20000] rusage[mem=20000]" "/software/CGP/external-apps/bwa-0.7.5a/bwa mem -C /lustre/scratch117/casm/team78/ro4/hs37d5/hs37d5.fa 70#1R1.fastq.gz 70#1R2.fastq > 70#1.sam"
run
bwa mem -C /lustre/scratch117/casm/team78/ro4/hs37d5/hs37d5.fa 70#1R1.fastq.gz 70#1R2.fastq > 70#1.sam
We are using a smaller genome, so I do not think the same level of computation/memory optimization will be required.

fa8sanger · 2025-01-28T04:55:53Z

It should be possible to run locally, but it will take a very long time. I am not good at Nextflow, but this is what Google says:

To run Nextflow without LSF, simply set the "executor" option in your Nextflow configuration file to "local"; this will instruct Nextflow to run all pipeline tasks on the machine where you launched the command, effectively bypassing the need for a cluster job scheduler like LSF.
Key points:
Configuration file: Modify your nextflow.config file to include the following line:
process.executor = 'local'

When to use "local": This is suitable for testing your pipeline on a single machine, running small workflows, or when you don't require large-scale cluster computing.
Alternative executors for cluster environments:
Slurm: If your cluster uses Slurm, set process.executor = 'slurm'.
SGE: For Sun Grid Engine, use process.executor = 'sge'.
PBS: For Portable Batch System, set process.executor = 'pbs'.
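As a rough sketch of the above, the setting can also live in a small override file instead of editing the pipeline's own nextflow.config. The file name `local.config` and the CPU and memory numbers are placeholders, not values from the NanoSeq documentation; `executor.cpus` and `executor.memory` are the Nextflow settings that cap what the local executor uses in total.

```shell
#!/usr/bin/env bash
# Write a small override config in the launch directory.
# File name and resource numbers are placeholders; adjust to the server.
cat > local.config <<'EOF'
process.executor = 'local'

executor {
    cpus   = 8        // upper bound on CPUs used by all running tasks
    memory = '32 GB'  // upper bound on memory used by all running tasks
}
EOF

echo "Wrote local.config"
```

The file can then be passed with Nextflow's `-c` option, for example `nextflow run NanoSeq_main.nf -c local.config` followed by the usual pipeline arguments. Note that a profile selected with `-profile` may set its own executor, so it is worth checking what the chosen profile defines.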

MadeleineOman · 2025-01-30T23:07:53Z

Ah, I understand. I've changed the config file to process.executor = 'local' and also changed the profile flag in the command to run nextflow from -profile lsf_singularity to -profile standard.

I'm getting a new error now, that

full output log: 2025_01_30_nextflow.log

I'm not sure where this bwa_mem.pl command is coming from, since I have bwa-mem2 installed in the conda env I'm using to run nextflow. Using grep I see bwa_mem.pl is referenced in the NanoSeq/Nextflow/modules/bwa.nf file, but I'm still not sure how to solve this problem.
Once again, thanks for any help!

fa8sanger · 2025-01-31T09:01:04Z

Oh, that's used for remapping bams, which is something you usually don't do. Indeed I think that's only an internal option for the Sanger. From your log it seems you invoked nextflow with:
nextflow run NanoSeq_main.nf -qs 300 -profile standard --ref /research/projects/PBCV1/preliminary/raw_data/sequencing/reference/test/genome.fa --sample_sheet /research/projects/PBCV1/preliminary/data/samplesheet/test_samplesheet.csv

So you were not requesting remapping. Can you share your test_samplesheet.csv file? I'll then ask the person who wrote the Nextflow part

MadeleineOman · 2025-01-31T22:16:15Z

Here is the test_samplesheet.csv: test_samplesheet.csv. I wanted to test the pipeline on the test files provided by you, just to get the pipeline up and running first. I assumed /test/duplex.bam was an example bam for input, /test/normal.bam was the corresponding matched normal, and /test/hs37d5.fa.gz was the reference (which I had to prep and index myself, using this script: prepGenome.txt.txt).

Just in case I made some false assumptions, I retried running the pipeline with my data, and did get a different error. Not sure if this helps, but here is the log and the associated samplesheet:
2025_01_31_nextflow_mypipeline.log
samplesheet.csv

fa8sanger · 2025-02-02T10:57:41Z

Thanks. I see in your log that it says "remap:true". You don't want that. Could you try specifying in the call to nextflow "remap false"? I'd also recommend using the noise_bed and snp_bed masks.
If that doesn't work either I'll ask the person who wrote the nextflow.
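For reference, a sketch of what that call might look like, reusing the command from your log. It assumes the option is passed as `--remap false` (Nextflow turns double-dash options into pipeline parameters, and the "remap:true" line in the log suggests the parameter is named remap). The noise_bed and snp_bed masks are left out here because their file locations are not given in this thread.

```shell
#!/usr/bin/env bash
# Build the command as an array so every argument stays intact.
# Paths are copied from the earlier log; --remap false is the assumed spelling.
cmd=(
    nextflow run NanoSeq_main.nf
    -qs 300
    -profile standard
    --ref /research/projects/PBCV1/preliminary/raw_data/sequencing/reference/test/genome.fa
    --sample_sheet /research/projects/PBCV1/preliminary/data/samplesheet/test_samplesheet.csv
    --remap false
)

# Print the command for inspection; run it with: "${cmd[@]}"
printf '%s ' "${cmd[@]}"
echo
```

After the run starts, the parameter summary in the log should no longer show remap as true if the option was picked up.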

fa8sanger · 2025-02-02T15:13:42Z

Also, the log says: Jan.-31 11:03:01.961 [main] INFO nextflow.script.BaseScript - running with fastqs as input that will be trimmed,tagged and mapped
but you are not running it with fastqs but with bam files. Not sure what's wrong. Hopefully Raúl will be able to give you a hand (just emailed him).
