This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

Build vocabulary script #230

Open
wants to merge 1 commit into
base: master

Conversation


@ghost ghost commented Jan 3, 2018

Thanks for the TensorFlow NMT project, which has been a great help to my research. I found it difficult to build a vocabulary without downloading the data you provide, so I implemented a very simple script that builds the vocabulary and makes it easier for newcomers to use this system.
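For readers without access to the diff, here is a minimal sketch of what such a vocabulary builder might look like. It is not the script from this pull request. It assumes whitespace-tokenized text and assumes the vocab file format used by the nmt codebase: one token per line, with `<unk>`, `<s>`, and `</s>` as the first three entries. The names `build_vocab`, `write_vocab`, `min_count`, and `max_size` are hypothetical.

```python
import collections

# Special tokens assumed to come first in an nmt-style vocab file.
SPECIAL_TOKENS = ["<unk>", "<s>", "</s>"]


def build_vocab(lines, min_count=1, max_size=None):
    """Return a vocab list: special tokens, then corpus tokens by frequency.

    lines: iterable of whitespace-tokenized sentences.
    min_count: drop tokens seen fewer than this many times (hypothetical knob).
    max_size: optional cap on total vocab size, special tokens included.
    """
    counts = collections.Counter()
    for line in lines:
        counts.update(line.split())

    # Most frequent first; ties broken alphabetically so output is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    tokens = [
        tok for tok, count in ranked
        if count >= min_count and tok not in SPECIAL_TOKENS
    ]
    if max_size is not None:
        tokens = tokens[:max(0, max_size - len(SPECIAL_TOKENS))]
    return SPECIAL_TOKENS + tokens


def write_vocab(vocab, path):
    """Write one token per line, the layout assumed above."""
    with open(path, "w", encoding="utf-8") as f:
        for tok in vocab:
            f.write(tok + "\n")
```

A typical use would be to open the training file for one language, pass the file object to `build_vocab`, and write the result with `write_vocab`, repeating the process for the other language.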

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address on your commit. Check your existing CLA data and verify that your email is set on your git commits.
  • If your company signed a CLA, they designated a Point of Contact who decides which employees are authorized to participate. You may need to contact the Point of Contact for your company and ask to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the project maintainer to go/cla#troubleshoot. The email used to register you as an authorized contributor must be the email used for the Git commit.
  • In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

@ghost
Author

ghost commented Jan 3, 2018

I signed it!

@googlebot

CLAs look good, thanks!

@Sabyasachi18

Sabyasachi18 commented Jan 22, 2018

Hi @Goffic,
I have a question.
I want to run incremental training on my trained German-English NMT engine, which uses subword BPE encoding.
Can I update my vocab file with new words from the incremental training data? If yes, kindly let me know the process.

Should I append the new words to the end of the existing vocabulary file when running incremental training?
Or should I sort the vocab file after appending the new words to it?
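
The two options in this question differ in what happens to existing token ids. Below is a minimal sketch of the append option, assuming a token's id is its line position in the vocab file. Under that assumption, appending leaves every existing id unchanged, whereas re-sorting could shift ids away from the ones the trained model learned. `extend_vocab` is a hypothetical helper, not part of this pull request, and the sketch does not address whether a trained checkpoint can be loaded with a larger vocabulary.

```python
def extend_vocab(existing_tokens, new_lines):
    """Append tokens unseen in existing_tokens, keeping the original order.

    existing_tokens: list of tokens read from the current vocab file,
        one per line, in file order.
    new_lines: iterable of whitespace-tokenized sentences from the
        incremental training data.
    """
    seen = set(existing_tokens)
    extended = list(existing_tokens)
    for line in new_lines:
        for tok in line.split():
            if tok not in seen:
                seen.add(tok)
                # New tokens go at the end, so earlier positions never move.
                extended.append(tok)
    return extended


old_vocab = ["<unk>", "<s>", "</s>", "the", "cat"]
new_vocab = extend_vocab(old_vocab, ["the dog", "a cat dog"])

# Every token from the old vocab keeps its original position.
assert new_vocab[:len(old_vocab)] == old_vocab
```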

2 participants