hsci-r/flopo-data-pipeline

flopo-data-pipeline

Scripts to convert the main article datasets used in the FLOPO project to CONLL-CSV and index them in the Octavo index service.
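As a rough illustration of what a CoNLL-style CSV can look like, the sketch below flattens CoNLL-U formatted text into one CSV row per token. It is not taken from this repository: the function name `conllu_to_csv`, the extra `doc_id` and `sentence_id` columns, and the choice of the ten standard CoNLL-U fields are all assumptions, and the column layout actually used by the FLOPO scripts may differ.

```python
import csv
import io

# Standard CoNLL-U field names. Whether the project's CONLL-CSV uses exactly
# these columns is an assumption made for this sketch.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def conllu_to_csv(conllu_text: str, doc_id: str) -> str:
    """Flatten CoNLL-U text into CSV with one row per token (hypothetical helper)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["doc_id", "sentence_id"] + CONLLU_FIELDS)

    sentence_id = 1
    seen_token = False  # True once the current sentence has produced a row
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence, if it had any tokens.
            if seen_token:
                sentence_id += 1
                seen_token = False
            continue
        if line.startswith("#"):
            # Sentence-level comments (e.g. "# text = ...") are skipped.
            continue
        fields = line.split("\t")
        if len(fields) != len(CONLLU_FIELDS):
            raise ValueError(f"expected {len(CONLLU_FIELDS)} columns, got {len(fields)}")
        writer.writerow([doc_id, sentence_id] + fields)
        seen_token = True
    return out.getvalue()
```

Calling `conllu_to_csv(text, "article-1")` on parsed article text returns a CSV string with a header row followed by one row per token, which could then be written to disk before indexing.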

One of the four source datasets is not openly available, and the other three carry license restrictions that prohibit distributing them directly as part of this release.

The status per dataset is as follows:

To understand the workflow, consult the Luigi workflow defined in workflow.py.