Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a safe guard to integron #5132

Merged
merged 2 commits into from
Feb 17, 2023
Merged

Conversation

bgruening
Copy link
Member

This is needed because of gem-pasteur/Integron_Finder#90 which results in a directory with multi-million of files.

@bgruening
Copy link
Member Author

@pimarin can you maybe have a look at this one?

@pimarin
Copy link
Contributor

pimarin commented Feb 15, 2023

I will check yes !

@pimarin
Copy link
Contributor

pimarin commented Feb 15, 2023

So you check that length of input won't be so big for integron due to number tmp file need ?

@bgruening
Copy link
Member Author

Yes, we only allow fasta files with less than 10.000 sequences. Which will already create 30-000 tp 40-000 files in one directory.

@pimarin
Copy link
Contributor

pimarin commented Feb 15, 2023

Oky it's okay to me, test pass so it's okay, maybe an issue to the integronfinder to change it coulbe be better for galaxy in the futur

@bgruening
Copy link
Member Author

Commented on an already existing issue: gem-pasteur/Integron_Finder#90

@pimarin
Copy link
Contributor

pimarin commented Feb 15, 2023

Oky it's good

@bgruening
Copy link
Member Author

@bernt-matthias can you please review and maybe merge? Thanks.

@@ -35,7 +35,9 @@
&& mv Results_Integron_Finder_* Results_Integron_Finder
]]></command>
<inputs>
<param type="data" name="sequence" format="fasta" label="Replicon file" help="Replicon can be entire chromosome, contif, PCR fragments..." />
<param type="data" name="sequence" format="fasta" label="Replicon file" help="Replicon can be entire chromosome, contif, PCR fragments...">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of problems does integron finder have with large files?

Do you have an upstream issue that we could link here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

integron finder make 1 tmp directory per analyzed contigs from the fasta file, applied to metagenome or huge genome it make thousand of tmp directories

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for a smallest genome test (e faecalis) i have 100 directories x 10 file into them, for a contamined genome I obtained 1708 directories

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if we really need the hard limit (bu I would be fine with it).

Maybe we can add doc (to the parameters help or the tool help) how to deal with larger data.

For your reference: https://integronfinder.readthedocs.io/en/latest/user_guide/tutorial.html#for-big-data-people

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an upstream issue that we could link here?

See the PR description ;)

Its the pure amount of small files that are created, we had 10.000.000 in one directory. We can discuss if we should limit to 10.000 or 100.000 but we need to limit it imho.

@bernt-matthias bernt-matthias merged commit 04aa903 into main Feb 17, 2023
@gxydevbot
Copy link
Collaborator

Attention: deployment skipped!

https://github.com/galaxyproject/tools-iuc/actions/runs/4201763505

@bernt-matthias
Copy link
Contributor

Should we have bumped the version? I stopped deployment...

@bernt-matthias bernt-matthias deleted the integron_finder_safe_guard branch February 17, 2023 07:56
@bgruening
Copy link
Member Author

We did with VERSION_SUFFIX correct?

@bernt-matthias
Copy link
Contributor

My bad. Will restart

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants