No read me #3

Open
jianshu93 opened this issue Apr 1, 2021 · 4 comments


@jianshu93

Hello Lukasz,

I can compile it now, but I have no clue how to run it. Is it katome r1.fasta r2.fasta? What about the output? I also could not find how to run it in your original paper. Thank you very much for all your efforts. I was surprised that your software is not more widely used, since Rust is so efficient in terms of compiling and memory safety.

Thanks,

@fuine
Owner

fuine commented Apr 6, 2021

Sorry for the late response. To configure input/output files etc., you need to create a config/settings.toml file (for an example, take a look at the config/settings_example.toml file in the repo). Then, to run it, you simply need to issue cargo run --release.

@fuine
Owner

fuine commented Apr 6, 2021

As for more widespread usage, I think there are better assembly tools out there. katome was an experimental project to see whether I could write an in-RAM assembler in Rust that would be both efficient and readable. If you want to know more or talk about the project, you can contact me directly via email.

@jianshu93
Author

Hello Lukasz,

Thank you very much for the information. Do you have any interest in developing it further and making it more user-friendly? I cannot find your email, but I would be happy to talk more, and to use it not just for single cells but also for mixed cells (i.e. metagenomic assembly). I think the idea is similar; check the IDBA-UD and MEGAHIT assemblers for details. rust-mdbg for HiFi long reads is very popular, and I am hoping that we could make it more user-friendly for publication. Have you tried submitting your manuscript to, for example, BMC Bioinformatics? I am also new to Rust but would be very interested to learn more. My email is [email protected]

Thanks,

Jianshu

@jianshu93
Author

When I create the settings.toml file and run it, I get the following error:

thread 'main' panicked at 'attempted to leave type linked_hash_map::Node<yaml::Yaml, yaml::Yaml> uninitialized, which is invalid', /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/mem/mod.rs:660:9
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

Any idea why?

I attached settings:

# input files path, should contain at least one filename
input_files = ["/storage/home/hcoda1/4/jzhao399/p-ktk3-0/rich_project_bio-konstantinidis/apps/Competitive_mapping/T4AerOil_R1.fa", "/storage/home/hcoda1/4/jzhao399/p-ktk3-0/rich_project_bio-konstantinidis/apps/Competitive_mapping/T4AerOil_R2.fa"]

# output file path
output_file = "/storage/home/hcoda1/4/jzhao399/p-ktk3-0/rich_project_bio-konstantinidis/apps/Competitive_mapping/assembly.katome.fa"

# original genome length
original_genome_length = 100

# minimal weight of the edge in De Bruijn Graph
minimal_weight_threshold = 0

# input file type, currently can have one of three values:
# BFCounter, Fasta, Fastq
input_file_type = "Fasta"

# size of the k-mer
k_mer_size = 131

# Whether or not katome should create reverse complementary sequences to the
# original reads. While this option noticeably slows down the process of
# assembly it usually will create higher quality output. Note that it is highly
# advisable to use that option when using BFCounter file input due to the fact
# that it randomly chooses between complementary and 'normal' representation of
# the edge and resulting graph without complementary sequences will contain a
# lot of small weakly connected components, which results in poor assembly
# quality
reverse_complement = true

Thanks,

Jianshu
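
For readers unfamiliar with the k_mer_size, minimal_weight_threshold and reverse_complement settings above, here is a minimal sketch of what such options typically control in a De Bruijn graph assembler. It is not katome's code: the function names reverse_complement and count_kmers are hypothetical, the reads are made up, and the rule that an edge is kept when its weight is at least the threshold is an assumption about how the threshold is applied.

```rust
use std::collections::HashMap;

/// Reverse complement of a DNA sequence (A<->T, C<->G).
/// Any other byte (for example N) is copied through unchanged.
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .rev()
        .map(|&b| match b {
            b'A' => b'T',
            b'T' => b'A',
            b'C' => b'G',
            b'G' => b'C',
            other => other,
        })
        .collect()
}

/// Count every k-mer in `reads`. When `with_rc` is true, the reverse
/// complement of each read is counted as well. K-mers whose count
/// ("weight") is below `min_weight` are dropped afterwards; this cutoff
/// rule is an assumption, not taken from katome.
fn count_kmers(
    reads: &[&[u8]],
    k: usize,
    min_weight: usize,
    with_rc: bool,
) -> HashMap<Vec<u8>, usize> {
    let mut counts: HashMap<Vec<u8>, usize> = HashMap::new();
    if k == 0 {
        return counts; // `windows(0)` would panic, so bail out early
    }
    for read in reads {
        // Reads shorter than k simply yield no windows.
        for kmer in read.windows(k) {
            *counts.entry(kmer.to_vec()).or_insert(0) += 1;
        }
        if with_rc {
            let rc = reverse_complement(read);
            for kmer in rc.windows(k) {
                *counts.entry(kmer.to_vec()).or_insert(0) += 1;
            }
        }
    }
    counts.retain(|_, weight| *weight >= min_weight);
    counts
}

fn main() {
    // Two made-up overlapping reads, with k = 3 for readability.
    let reads: [&[u8]; 2] = [b"ACGTT", b"CGTTA"];
    let counts = count_kmers(&reads, 3, 0, true);
    for (kmer, weight) in &counts {
        println!("{} {}", String::from_utf8_lossy(kmer), weight);
    }
}
```

Under these assumptions, a threshold of 0 keeps every k-mer, while a higher threshold discards k-mers seen only rarely. Note also that a read shorter than k contributes no k-mers at all.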
