Generating WARNING: No newline at ending of file 'nonewline.fasta' during construction results in a broken/incorrect index #35

Open
tmaklin opened this issue Aug 21, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@tmaklin
Contributor

tmaklin commented Aug 21, 2024

I ran fulgor on broken input: some of the files make ggcat print WARNING: No newline at ending of file 'nonewline.fasta'. The index that fulgor builds from such input is wrong.

For example, a large .fur index was 206G on disk when built from the broken inputs. After the inputs were fixed, the size grew to 281G, which is closer to what I expected. Queries on the first index ran, but the results contained no matches to the broken input files.

The difference in index size also reproduces on artificial data containing a file that triggers the warning.

So, just as a heads-up, it might be better to abort if the inputs have this error. I've also reported this to ggcat and suggested that the warning should be an error.
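
Until that happens, the inputs can be checked before construction. Below is a minimal sketch in Python, assuming the inputs are uncompressed FASTA files (for gzip-compressed files the last raw byte says nothing about the decompressed content). The function names are hypothetical and not part of fulgor or ggcat. Empty files are flagged as well, as a conservative choice.

```python
import os


def ends_with_newline(path):
    """Return True if the file is non-empty and its last byte is a newline."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        if fh.tell() == 0:
            # Empty file: there is no final newline, so report it as a problem.
            return False
        fh.seek(-1, os.SEEK_END)
        return fh.read(1) == b"\n"


def find_files_missing_newline(paths):
    """Return the subset of paths whose last byte is not a newline."""
    return [p for p in paths if not ends_with_newline(p)]
```

If find_files_missing_newline returns a non-empty list, the build can be aborted or the listed files fixed before running the construction.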

jermp added the enhancement label on Aug 21, 2024
@jermp
Owner

jermp commented Aug 21, 2024

Hi Tommi, and thanks for the suggestion. I agree: if the input is broken, GGCAT should abort the construction (and Fulgor too, in turn). Right now, I don't think there is a way to fix this in Fulgor, as the warning is just a printed message. Right?

@tmaklin
Contributor Author

tmaklin commented Aug 21, 2024

Yeah, it's just text printed from Rust using eprintln (https://github.com/algbio/ggcat/blob/a91ecc97f286b737b37195c0a86f0e11ad6bfc3b/crates/io/src/lines_reader.rs#L155), so detecting it would require either capturing the text from Rust and parsing it, or checking the input files somewhere within the fulgor code. I don't think this is a very common error to run into, though, so it's probably OK to wait and see if ggcat changes this.
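
For the first option, a rough sketch of the parsing step in Python, assuming the build's stderr has already been captured as text (for example by redirecting it to a log file). The pattern is built from the warning quoted in this issue, so it may need adjusting if ggcat changes the wording; the function name is hypothetical.

```python
import re

# Assumption: the message has the form quoted in this issue,
# "WARNING: No newline at ending of file '<name>'".
WARNING_RE = re.compile(r"WARNING: No newline at ending of file '([^']*)'")


def files_flagged_in_log(log_text):
    """Return the file names mentioned in no-newline warnings, in order of appearance."""
    return WARNING_RE.findall(log_text)
```

A non-empty result means at least one input triggered the warning and the resulting index should not be trusted.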

@jermp
Owner

jermp commented Aug 21, 2024

Ok, as I thought. I'll leave this issue open anyway as a reminder.

thanks!
