Use Go for multithreading to increase performance #16

Open
Hk669 opened this issue Jun 10, 2024 · 2 comments
Labels: enhancement (New feature or request), roadmap

Comments

@Hk669
Owner

Hk669 commented Jun 10, 2024

Is your feature request related to a problem? Please describe.

I tried using Go, and it seems to be about 40% faster than Python. I think rewriting the code more efficiently in Go, with multithreading, would get the most out of tokenizer training.

Describe the solution you'd like

Replicate bpetokenizer in Go.
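
A rough sketch of where goroutines could help, assuming training follows the usual BPE loop of counting adjacent token pairs and merging the most frequent one. The pair-counting pass is the part that splits cleanly across chunks of text. The names `Pair`, `countPairs`, and `CountPairsParallel` are made up for illustration and are not taken from bpetokenizer.

```go
package main

import (
	"fmt"
	"sync"
)

// Pair is an adjacent pair of token IDs (hypothetical type for this sketch).
type Pair struct{ A, B int }

// countPairs tallies adjacent pairs within a single chunk of token IDs.
func countPairs(ids []int) map[Pair]int {
	counts := make(map[Pair]int)
	for i := 0; i+1 < len(ids); i++ {
		counts[Pair{ids[i], ids[i+1]}]++
	}
	return counts
}

// CountPairsParallel counts pairs in each chunk on its own goroutine,
// then merges the per-chunk maps. Pairs never span chunk boundaries.
func CountPairsParallel(chunks [][]int) map[Pair]int {
	results := make([]map[Pair]int, len(chunks))
	var wg sync.WaitGroup
	for i, chunk := range chunks {
		wg.Add(1)
		go func(i int, chunk []int) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no mutex is needed.
			results[i] = countPairs(chunk)
		}(i, chunk)
	}
	wg.Wait()

	// Merge the local counts sequentially after all goroutines finish.
	total := make(map[Pair]int)
	for _, local := range results {
		for p, n := range local {
			total[p] += n
		}
	}
	return total
}

func main() {
	chunks := [][]int{{1, 2, 3, 1, 2}, {1, 2, 2}}
	counts := CountPairsParallel(chunks)
	fmt.Println(counts[Pair{1, 2}]) // prints 3
}
```

This starts one goroutine per chunk, which is fine for a handful of chunks. For a large corpus, a fixed-size worker pool would probably be a better fit.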

Hk669 added the enhancement and roadmap labels on Jun 10, 2024
Hk669 self-assigned this on Jun 10, 2024
@zacharias1219

You started working on this??
Also there weren't any errors on my local machine while using bpetokenizer.

@Hk669
Owner Author

Hk669 commented Jun 14, 2024

> You started working on this??
> Also there weren't any errors on my local machine while using bpetokenizer.

Yeah, I've already started.
