- Prompsit SL
- Elche
- UTC +01:00
- http://motagirl.net
- @motagirl2
Stars
What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets
Targeted language identifier, based on FastText and Hunspell.
OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.
Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.
Repository for data models, dictionaries and other resources for Bicleaner
Utility that will help you to ROAM (Random Omit Anonymize and Mix) your parallel corpus.
Code for Neural Inverse Knitting: From Images to Manufacturing Instructions
A React site that simulates knitting different stitch widths using a skein of variegated yarn.
mbanon / segment
Forked from loomchild/segment
Program used to split text into segments
Tool to fix bitexts and tag near-duplicates for removal
Results of the human evaluation