Error when some assembly metadata tsv files are empty and don't contain accessions to download #94

Open
masudermann opened this issue Sep 18, 2024 · 0 comments
Labels
bug Something isn't working

Comments

Contributor

masudermann commented Sep 18, 2024

Description of the bug

I am testing the complex_dataset_minimal profile. The run starts well, but the pipeline fails if some of the TSV files containing the accessions to download for a given family are empty. These files are linked from path_surveil_data/assembly_metadata.

Is there a way to just proceed and ignore these empty files?
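
To help narrow down which inputs trigger the failure, here is a rough shell sketch that lists the TSV files in a directory that have no data rows. The function name `list_empty_tsvs` is hypothetical, and the definition of "empty" (at most one non-blank line, i.e. a header only or nothing at all) is an assumption about how these metadata files are laid out, not something confirmed by the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every *.tsv file in a directory that has no data rows.
# Assumption: a file is "empty" when it holds at most one non-blank line
# (only a header, or no content at all).
list_empty_tsvs() {
    local dir="$1"
    local f rows
    for f in "$dir"/*.tsv; do
        # Skip the literal pattern when the glob matches nothing,
        # and skip broken symlinks.
        [ -e "$f" ] || continue
        # Count lines containing at least one non-whitespace character.
        # grep exits non-zero when the count is 0, hence the `|| true`.
        rows=$(grep -c '[^[:space:]]' "$f" || true)
        if [ "$rows" -le 1 ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

It could be run against the process work directory reported in the error output, for example `list_empty_tsvs /path/to/work/dir`, since the staged input files there are links to the assembly metadata files.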

Command used and terminal output

(nf-core) marthasudermann@pop-os:~/pathogensurveillance$ nextflow main.nf -profile complex_dataset_minimal,docker
N E X T F L O W  ~  version 23.10.1
Launching `main.nf` [peaceful_carson] DSL2 - revision: cc83aa0c27


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/plantpathsurveil v1.0dev
------------------------------------------------------
Core Nextflow options
  runName                   : peaceful_carson
  containerEngine           : docker
  launchDir                 : /home/marthasudermann/pathogensurveillance
  workDir                   : /home/marthasudermann/pathogensurveillance/work
  projectDir                : /home/marthasudermann/pathogensurveillance
  userName                  : marthasudermann
  profile                   : complex_dataset_minimal,docker
  configFiles               : /home/marthasudermann/pathogensurveillance/nextflow.config

Input/output options
  sample_data               : test/data/metadata/complex_dataset_minimal.csv
  out_dir                   : test/output/complex_dataset_minimal
  download_bakta_db         : true
  cache_type                : lenient

Institutional config options
  config_profile_name       : Test dataset for Minimal complex dataset
  config_profile_description: Test dataset for Minimal complex dataset

Generic options
  trace_dir                 : null/pipeline_info

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/plantpathsurveil for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x
#####
ERROR ~ Error executing process > 'PATHOGENSURVEILLANCE:PREPARE_INPUT:PICK_ASSEMBLIES (SRR12888960)'

Caused by:
  Process `PATHOGENSURVEILLANCE:PREPARE_INPUT:PICK_ASSEMBLIES (SRR12888960)` terminated with an error exit status (1)

Command executed:

  pick_assemblies.R SRR12888960_families.txt SRR12888960_genera.txt SRR12888960_species.txt 30 20 10 SRR12888960.tsv Aleyrodidae.tsv Amborellaceae.tsv Aphididae.tsv Castoridae.tsv Chrysomelidae.tsv Cordycipitaceae.tsv Cricetidae.tsv Cucurbitaceae.tsv Dasypodidae.tsv Dasyuridae.tsv Fabaceae.tsv Fagaceae.tsv Formicidae.tsv Halomonadaceae.tsv Lepisosteidae.tsv Liviidae.tsv Macroscelididae.tsv Malvaceae.tsv Micrococcaceae.tsv Moraceae.tsv Nectriaceae.tsv Nitidulidae.tsv Otariidae.tsv Penaeidae.tsv Pentatomidae.tsv Phocidae.tsv Theaceae.tsv Theridiidae.tsv Tupaiidae.tsv Xanthomonadaceae.tsv
  
  cat <<-END_VERSIONS > versions.yml
  "PATHOGENSURVEILLANCE:PREPARE_INPUT:PICK_ASSEMBLIES":
      r-base: $(echo $(R --version 2>&1) | sed 's/^.*R version //; s/ .*$//')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Error in `$<-.data.frame`(`*tmp*`, "family", value = "Dasypodidae") : 
    replacement has 1 row, data has 0
  Calls: lapply -> FUN -> $<- -> $<-.data.frame
  Execution halted

Work dir:
  /home/marthasudermann/pathogensurveillance/work/4c/daae218f1fb1874d272e6c13634773

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

Desktop: System76 Linux computer

@masudermann masudermann added the bug Something isn't working label Sep 18, 2024
@masudermann masudermann changed the title from "Error when some tsv files in find_assemblies directory are empty and not contain accessions to download" to "Error when some assembly metadata tsv files are empty and don't contain accessions to download" Sep 18, 2024