Hello! I was wondering if you would release your pretraining code for DNABERT-2 and NT? The DNABERT-2 authors have not released the actual code they used for pre-training, only a suggestion of two similar implementations to use.
Hello, we are not publishing the training code at this stage. You may nevertheless find the following info useful:
All training was done with PyTorch. We trained the DNABERT-2 model using the architecture from https://huggingface.co/zhihan1996/DNABERT-2-117M/blob/main/bert_layers.py and the same learning rate scheduler as in the original DNABERT-2 paper. We recently retrained all models on 10 NVIDIA A100 80GB GPUs, with an effective batch size of 4480 for DNABERT-2 and 480 for NTv2-250M-3UTR. For DNABERT-2, training on Zoonomia 3'UTR sequences for 2 epochs took 1.3 hours (about 10x less than NT), which is equivalent to roughly 13 hours on a single A100 GPU. DNABERT-2 is also faster than NT when compared at the same batch size.
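For reference, here is a minimal sketch of how an MLM pre-training run along these lines could be set up with the Hugging Face Trainer. This is not our actual training script: the dataset path, sequence length, optimizer settings, and the split of the effective batch size across GPUs and gradient-accumulation steps are placeholders.

```python
# Minimal sketch (not the actual training script): masked-language-model
# pre-training with the DNABERT-2 architecture and the Hugging Face Trainer.
# Dataset path, sequence length, and optimizer settings are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "zhihan1996/DNABERT-2-117M"
# trust_remote_code pulls in the custom architecture from bert_layers.py
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# from_pretrained loads the published weights (continued pre-training);
# AutoModelForMaskedLM.from_config(...) would start from random weights instead.
model = AutoModelForMaskedLM.from_pretrained(model_name, trust_remote_code=True)

# Hypothetical text file with one 3'UTR sequence per line
raw = load_dataset("text", data_files={"train": "zoonomia_3utr.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

# Effective batch size 4480 could be reached, e.g., as
# 10 GPUs x 56 sequences per GPU x 8 gradient-accumulation steps (assumed split).
args = TrainingArguments(
    output_dir="dnabert2_3utr_mlm",
    per_device_train_batch_size=56,
    gradient_accumulation_steps=8,
    num_train_epochs=2,
    learning_rate=5e-4,          # placeholder; see the DNABERT-2 paper's schedule
    lr_scheduler_type="linear",  # placeholder
    warmup_ratio=0.05,           # placeholder
    bf16=True,                   # A100-friendly mixed precision
    logging_steps=100,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```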
(from DNABERT-2 website)
We used and slightly modified the MosaicBERT implementation for DNABERT-2: https://github.com/mosaicml/examples/tree/main/examples/benchmarks/bert. You should be able to replicate the model training by following the instructions there.
Or you can use the run_mlm.py at https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling by importing the BertModelForMaskedLM from https://huggingface.co/zhihan1996/DNABERT-2-117M/blob/main/bert_layers.py. It should produce a very similar model.
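If I understand that second route correctly, something like the sketch below should instantiate the custom masked-LM architecture so it can be plugged into run_mlm.py. I'm assuming here that the Hub config maps AutoModelForMaskedLM to the class in bert_layers.py; the exact class name may differ from what the website calls it.

```python
# Rough sketch of the second route (an assumption, not a tested recipe):
# instantiate the custom DNABERT-2 masked-LM architecture from its Hub config
# with random weights, which is what run_mlm.py would need for pre-training
# from scratch. If the Hub config does not map AutoModelForMaskedLM, the class
# defined in bert_layers.py would have to be imported directly instead.
from transformers import AutoConfig, AutoModelForMaskedLM, AutoTokenizer

name = "zhihan1996/DNABERT-2-117M"
config = AutoConfig.from_pretrained(name, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)

# Randomly initialized model for pre-training from scratch
model = AutoModelForMaskedLM.from_config(config, trust_remote_code=True)

print(f"parameters: {sum(p.numel() for p in model.parameters()):,}")
```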
I am interested in using your implementation of DNABERT-2 pre-training because you were able to train it in such a short time.
Thank you for any help you can provide.
LeAnn