subs2strudel

Creating semantic concept-feature norms using STRUDEL

Collecting the Data

Follow the steps below to contribute a language to the collection.

Open process_strudels.Rmd in the root (~/) folder, the same folder that contains this README.md.
Select a language that is not yet completed; check the Completed column in ~/data/udpipe_languages.csv.
Select the number of subprocesses. Each subprocess uses one core and ~2GB of RAM.
Run every chunk in the notebook; do NOT knit it. This will:
- Download a new language file into the ~/data folder.
- Download a new language udpipe control file into the ~/ folder.
- Split the language file into smaller files for parallel processing. This creates more files in the ~/data folder.
- Run each smaller file in its own process. This generates files in the ~/concept-feature folder.
- Combine the generated files into a single file.
Upload the combined file to the releases in GitHub.
Update the releases and ~/data/udpipe_languages.csv to note the progress.
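
As a rough illustration of the sizing rule in the steps above (one core and ~2GB of RAM per subprocess) and of the splitting step, here is a minimal Python sketch. It is not taken from process_strudels.Rmd, which is an R notebook. The function names, the 16 GB figure, and the contiguous-chunk strategy are assumptions made for this example only.

```python
import os


def max_subprocesses(cores: int, ram_gb: float, gb_per_process: float = 2.0) -> int:
    """Largest worker count that fits both the core budget and the RAM budget."""
    by_ram = int(ram_gb // gb_per_process)
    # Always allow at least one worker, even on a very small machine.
    return max(1, min(cores, by_ram))


def split_into_chunks(lines: list[str], n_chunks: int) -> list[list[str]]:
    """Split lines into n_chunks contiguous pieces whose sizes differ by at most one."""
    base, extra = divmod(len(lines), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional line each.
        size = base + (1 if i < extra else 0)
        chunks.append(lines[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    # Hypothetical machine: real core count, assumed 16 GB of RAM.
    n = max_subprocesses(cores=os.cpu_count() or 1, ram_gb=16)
    demo = [f"sentence {i}" for i in range(10)]
    for i, chunk in enumerate(split_into_chunks(demo, min(n, len(demo)))):
        print(i, len(chunk))
```

For example, under these assumptions a machine with 8 cores but only 6 GB of free RAM would be limited to 3 subprocesses, because RAM rather than core count is the tighter constraint.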
