training scooby #11
Hi,

First, I'm trying to reproduce your results for the RNA-only model on the NeurIPS dataset (using hg38 as reference). Could you please tell me which metrics to expect when training with LoRA? The validation metrics evolve as in the attached screenshot, and I don't think that is the expected behaviour.

Second, do you think it is OK to use LoRA to fine-tune to a different genome (e.g. hg19), or would you expect that a complete re-training would be necessary?

I'm training on 2 NVIDIA A100s with --gradient_accumulation_steps 4.

Thanks in advance for your responses.
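Since the question concerns LoRA fine-tuning, a generic sketch of how LoRA adapters are attached with the peft library may help frame it. Everything below (the stand-in model, rank, alpha, and target modules) is a placeholder assumption, not the scooby setup:

```python
# Generic LoRA sketch with the peft library; rank, alpha, and
# target_modules are placeholder assumptions, not scooby's values.
import torch.nn as nn
from peft import LoraConfig, get_peft_model

base = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 2))  # stand-in model
config = LoraConfig(r=8, lora_alpha=16, target_modules=["0", "2"])
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```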
Hi Sergey,

The Weights & Biases numbers are not super representative of model performance (PR coming soon), which is why we run the model evaluation notebook per epoch. We trained with batch size 8, and doing 4x gradient accumulation is not equivalent (as Borzoi has batch norms), so you may have to lower the learning rate or change other hyperparameters; I am also unsure whether you would need to warm up the learning rate 4x longer. We also observed that transform_borzoi_emb makes training less stable, so maybe try setting it to False - it does not drastically change results. Let me know if these changes help :)

Regarding your second question, I guess as long as you lift the train-val-test regions over to hg19 to maintain the same data split, it will still work, but we have not tried that.

Cheers,
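A minimal sketch of the stretched-warmup idea, assuming a plain PyTorch setup with linear warmup (the repo's actual scheduler, warmup length, and optimizer may differ):

```python
import torch

# Stand-ins for the real model/optimizer; only the scheduler logic matters here.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

grad_accum_steps = 4          # matches --gradient_accumulation_steps 4
base_warmup_steps = 1000      # hypothetical warmup length for the bs=8 reference run
warmup_steps = base_warmup_steps * grad_accum_steps  # warm up 4x longer

def warmup_lambda(step: int) -> float:
    # Linear ramp from ~0 up to the base learning rate, then hold constant.
    return min(1.0, (step + 1) / warmup_steps)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_lambda)
```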
Hi Johannes,

Thanks a lot for the prompt reply! To evaluate the RNA-only model using your evaluation notebook designed for the multiome model, I only change model_type to 'rna' when calling get_pseudobulk_count_pred and swap fix_rev_comp_multiome for fix_rev_comp_rna in the predict function. Are there other changes to make? Also, would you mind making the RNA-only model available on the hub?

Best,
@sergeyvilov yes, that seems to be all. Have you had success with training scooby? I have just uploaded neurips-scooby-rna to Hugging Face; you should be able to get it with:

```python
from scooby.modeling import Scooby

scooby_neurips_rna = Scooby.from_pretrained(
    'johahi/neurips-scooby-rna',
    cell_emb_dim=14,
    embedding_dim=1920,
    n_tracks=2,
    return_center_bins_only=True,
    disable_cache=False,
    use_transform_borzoi_emb=True,
)
```

@Jaureguy760 are you retraining scooby or scooby-rna? Metrics seem fine, but I am a bit surprised that your learning rate is not decaying as fast as mine; are you running on 8 GPUs with bs = 1? You can also try evaluating using the notebook; we will try to get the better eval out soon :)
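A hedged usage sketch to go with the snippet above, assuming Scooby behaves like a standard Hugging Face-style PreTrainedModel (i.e. an nn.Module), which the from_pretrained call suggests; input construction is model-specific and omitted here:

```python
import torch

# Move the pretrained model to GPU (if available) and switch to eval mode.
# These are standard nn.Module calls, so they should apply to the model
# loaded above; see the repo's evaluation notebook for actual inference.
device = "cuda" if torch.cuda.is_available() else "cpu"
scooby_neurips_rna = scooby_neurips_rna.to(device).eval()
```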
Hi, thanks for the response @johahi! I was using the default lr of 1e-4, since that is what the GitHub repo uses, but I just did a fresh run with 2e-4 and am seeing better results. I am using a different setup: 4 A40 GPUs with the flashzoi model and bs = 3 per GPU, so a global batch size of 12. I was able to fit 3 per GPU without gradient accumulation, but I might test gradient accumulation to see if it helps convergence. Quick follow-up questions: which metric in the wandb log did you use to decide when to test with the notebook, and which eval notebook would you recommend for quick evals during model training? Thanks for the help! Great model, and looking forward to the publication :)
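As a back-of-envelope illustration of the batch-size discussion (the linear LR-scaling heuristic below is an assumption, not a recommendation from the thread, and batch-norm statistics still differ between setups):

```python
# Effective batch size and a naive linear LR scaling for this setup.
per_gpu_bs, n_gpus, grad_accum = 3, 4, 1
effective_bs = per_gpu_bs * n_gpus * grad_accum    # 12 for this run
reference_bs = 8                                   # batch size used by the authors

base_lr = 1e-4
scaled_lr = base_lr * effective_bs / reference_bs  # 1.5e-4, near the 2e-4 that worked
print(effective_bs, scaled_lr)
```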
The across-tracks metric is more important, but the single-cell targets are generally pretty sparse, so it is still a pretty shaky metric. We recommend this notebook for eval; it is pretty similar to the FastEvaluator-fixed.ipynb you found up there.
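To make the across-tracks distinction concrete, a toy sketch of what such a metric might look like, assuming (positions x tracks) prediction and target arrays; this is not the repo's exact metric code:

```python
import numpy as np

# Toy (positions x tracks) arrays standing in for predicted and observed counts.
rng = np.random.default_rng(0)
preds = rng.poisson(2.0, size=(500, 64)).astype(float)
targets = rng.poisson(2.0, size=(500, 64)).astype(float)

def pearson(a: np.ndarray, b: np.ndarray) -> float:
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a**2).sum() * (b**2).sum() + 1e-12))

# "Across tracks": for each position, correlate prediction vs. target across
# tracks, then average. Sparse single-cell targets make this estimate noisy.
across_tracks = np.mean([pearson(preds[i], targets[i]) for i in range(preds.shape[0])])
print(f"across-tracks Pearson: {across_tracks:.3f}")
```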
Definitely will follow up and share. I'm training a few models to evaluate them at the same time downstream. Thanks again for the great work!
@johahi Thanks a lot for sharing the RNA-only model! I finally managed to train mine. The crucial thing was to set transform_borzoi_emb=False. Thank you very much for the tip.