Supporting other data types (e.g. video) #53
Hi! Other input types are currently not implemented, although we've had this request more than once. We don't really have enough hands to look into it properly; hence, up until now, we've focused on text-only applications. It should however be feasible with a reasonably small amount of changes to the codebase, depending on what exactly you're looking for. If a hacky solution and vanilla transformer layers are good enough, then you can try the following to train a model:
If you'd like to have a look at doing that more properly, external contributions are very much welcome! |
Thank you for your detailed reply. I've been working on this for a month now and I'm able to train the model properly. I haven't run many experiments with it yet, and I'm trying different hyperparameters to see if I can reach acceptable performance. One question: isn't it possible to use the loss as the criterion for early stopping? |
Yes, early stopping should be supported out of the box. Assuming you want to evaluate every 10k steps, and stop training if it no longer improves after 5 evaluation loops, then:
early_stopping: 5
early_stopping_criteria: ppl
valid_steps: 10000

The early stopper will evaluate perplexity on the validation dataset(s), which should be equivalent to the cross-entropy loss used to train the model. If you want to score on the loss directly, you can implement a new scorer based on the base class (mammoth/mammoth/utils/earlystopping.py, line 11 in 1ba7af9) and then make sure to make it available as one of the default scorers and scorer builders (mammoth/mammoth/utils/earlystopping.py, lines 60 to 63 in 1ba7af9).
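A minimal sketch of what such a scorer could look like, assuming the base class follows the common OpenNMT-style pattern (the class names, method signatures, and `SCORER_BUILDER` dict below are assumptions for illustration, not mammoth's exact API):

```python
# Hypothetical loss-based early-stopping scorer, modelled on the Scorer
# pattern referenced in the permalinks above. Names and signatures are
# assumptions, not mammoth's actual interface.

class Scorer:
    """Stand-in for the base scorer class."""
    def __init__(self, best_score, name):
        self.best_score = best_score
        self.name = name

    def _score(self, stats):
        raise NotImplementedError

    def is_improving(self, stats):
        raise NotImplementedError

    def update(self, stats):
        # Remember the best validation score seen so far.
        self.best_score = self._score(stats)


class XentScorer(Scorer):
    """Scores validation runs by cross-entropy loss; lower is better."""
    def __init__(self):
        super().__init__(float("inf"), "xent")

    def _score(self, stats):
        return stats.xent()

    def is_improving(self, stats):
        return stats.xent() < self.best_score


# Registering it alongside the default scorer builders would then look
# something like (merged into the library's existing dict):
SCORER_BUILDER = {"xent": XentScorer}
```

The early stopper would then instantiate the scorer from this dict when `early_stopping_criteria: xent` is set, in the same way the existing `ppl` criterion is resolved.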
|
Thank you. |
No worries! Don't hesitate to share your code if you want us to include it in the library; we would welcome a pull request if you have something working. |
Hi,
Is it possible to use Mammoth for other seq2seq problems, such as multilingual video/image captioning? What I have in mind is to prepare video features in the format (batch, n_frames, emb_size) and read them via src_embeddings in the training script, while keeping the rest of the config.yml unchanged (e.g., tgt_vocab).
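For illustration, the shape contract described above can be sketched in plain Python; the pass-through `embed` function here is a hypothetical stand-in for what src_embeddings would do with precomputed features, not mammoth's actual interface:

```python
# Hypothetical illustration of feeding precomputed video features in
# place of token embeddings. "embed" is a stand-in for src_embeddings.

batch, n_frames, emb_size = 2, 4, 8

# One precomputed feature vector per frame, instead of token IDs.
features = [[[0.0] * emb_size for _ in range(n_frames)]
            for _ in range(batch)]

def embed(src):
    """Pass precomputed features through unchanged, where a text model
    would instead look up token IDs in a learned embedding table."""
    return src

# The encoder input keeps the shape (batch, n_frames, emb_size).
encoder_input = embed(features)
```

In a real model, a learned linear projection would typically sit in place of the identity pass-through, to map the extracted feature dimension onto the transformer's model dimension.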