Training step problem #164

Open
heavennew opened this issue May 27, 2024 · 2 comments

@heavennew

Hello,

I ran into a problem while running multi-sample binning. The error output is below:

2024-05-28 00:44:40 localhost.localdomain SemiBin[102770] INFO Setting number of CPUs to 64
2024-05-28 00:44:40 localhost.localdomain SemiBin[102770] INFO Binning for short_read
2024-05-28 00:44:40 localhost.localdomain SemiBin[102770] INFO SemiBin will run in self supervised mode
2024-05-28 00:44:55 localhost.localdomain SemiBin[102770] INFO Did not detect GPU, using CPU.
2024-05-28 00:44:55 localhost.localdomain SemiBin[102770] INFO Performing multi-sample binning
2024-05-28 00:44:55 localhost.localdomain SemiBin[102770] INFO Generating training data...
2024-05-28 00:47:01 localhost.localdomain SemiBin[102770] INFO Calculating coverage for every sample.
2024-05-28 02:03:22 localhost.localdomain SemiBin[102770] INFO Processed: 1_sample.mapped.sorted.bam
2024-05-28 02:03:22 localhost.localdomain SemiBin[102770] INFO Processed: 2_sample.mapped.sorted.bam
2024-05-28 02:38:16 localhost.localdomain SemiBin[102770] INFO Training model and clustering for L1_contig_1000@L1.
2024-05-28 02:38:16 localhost.localdomain SemiBin[102770] INFO Start training from a single sample.
2024-05-28 02:38:31 localhost.localdomain SemiBin[102770] INFO Training model...
  0%|          | 0/15 [00:00<?, ?it/s]
  0%|          | 0/15 [00:30<?, ?it/s]
Traceback (most recent call last):
  File "/data/ang/.conda/envs/SemiBin2/bin/SemiBin2", line 10, in <module>
    sys.exit(main2())
             ^^^^^^^
  File "/data/ang/.conda/envs/SemiBin2/lib/python3.12/site-packages/SemiBin/main.py", line 1610, in main2
    multi_easy_binning(
  File "/data/ang/.conda/envs/SemiBin2/lib/python3.12/site-packages/SemiBin/main.py", line 1349, in multi_easy_binning
    training(logger, None, args.num_process,
  File "/data/ang/.conda/envs/SemiBin2/lib/python3.12/site-packages/SemiBin/main.py", line 1126, in training
    model = train_self(logger,
            ^^^^^^^^^^^^^^^^^^
  File "/data/ang/.conda/envs/SemiBin2/lib/python3.12/site-packages/SemiBin/self_supervised_model.py", line 108, in train_self
    dataset = feature_Dataset(train_input_1, train_input_2, train_labels)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/ang/.conda/envs/SemiBin2/lib/python3.12/site-packages/SemiBin/semi_supervised_model.py", line 120, in __init__
    assert len(embedding1) == len(embedding2)
AssertionError

Do you know how to fix it?
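
For readers unfamiliar with this kind of failure: the last frame of the traceback is a length check between two paired inputs. The sketch below is only an illustration of that pattern, not SemiBin's actual code; the class `PairedFeatures` and the helper `try_build` are hypothetical names made up for this example. It shows that such a check passes when both inputs have the same number of rows and raises a bare `AssertionError` (with no message, as in the log above) when they do not.

```python
# Hypothetical stand-in for a paired-feature dataset; NOT SemiBin's real class.
class PairedFeatures:
    def __init__(self, embedding1, embedding2, labels):
        # Same kind of check as the failing line in the traceback:
        # both inputs must contain the same number of rows.
        assert len(embedding1) == len(embedding2)
        self.embedding1 = embedding1
        self.embedding2 = embedding2
        self.labels = labels

    def __len__(self):
        return len(self.embedding1)

    def __getitem__(self, index):
        return self.embedding1[index], self.embedding2[index], self.labels[index]


def try_build(embedding1, embedding2, labels):
    """Report whether the dataset can be constructed from these inputs."""
    try:
        PairedFeatures(embedding1, embedding2, labels)
        return "ok"
    except AssertionError:
        return "AssertionError"


if __name__ == "__main__":
    print(try_build([[0.1], [0.2]], [[0.3], [0.4]], [1, 1]))  # equal lengths
    print(try_build([[0.1], [0.2]], [[0.3]], [1]))            # mismatched lengths
```

In other words, the error says that the two training inputs ended up with different lengths; it does not by itself say why.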

@luispedro
Member

Thanks for the report.

Can you tell us which command you ran?

@heavennew
Author

The command:
SemiBin multi_easy_bin -i concatenated.fa.gz -b *sample.mapped.sorted.bam -o easy_multi_sample_output --ml-threshold 30
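
Because the BAM files are passed through a shell glob, one quick sanity check is to list exactly which files the pattern expands to and compare them with the `Processed:` lines in the log (which show `1_sample.mapped.sorted.bam` and `2_sample.mapped.sorted.bam`). The sketch below uses only Python's standard library; the function name `matched_bams` is hypothetical, and this is a diagnostic aid under the assumption that it is run from the same directory as the command. It is not a confirmed explanation of the assertion failure.

```python
import glob
import os


def matched_bams(pattern="*sample.mapped.sorted.bam", directory="."):
    """Return the sorted list of paths in `directory` matching the glob pattern."""
    return sorted(glob.glob(os.path.join(directory, pattern)))


if __name__ == "__main__":
    # Print one matched file per line so the list can be compared with the log.
    for path in matched_bams():
        print(path)
```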
