This repository contains the datasets used to evaluate and fine-tune state-of-the-art POS taggers on spoken Spanish, as described in the paper by Bonilla (submitted). The datasets are in CoNLL-U format. They were first generated with automatic methods, then manually corrected and double-checked by experts for the LEMMA, UPOS and FEATS fields.
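Since the files are in CoNLL-U format, they can be read with any CoNLL-U library or with a few lines of plain Python. The following is a minimal sketch, not part of this repository: the names `read_conllu` and `parse_feats` are hypothetical helpers, and the `SAMPLE` sentence is invented for illustration and is not taken from the datasets. It relies only on the general CoNLL-U conventions (ten tab-separated columns, `#` comment lines, blank lines between sentences).

```python
from typing import Dict, Iterable, Iterator, List

# The ten columns defined by the CoNLL-U format, in order.
CONLLU_FIELDS = ["ID", "FORM", "LEMMA", "UPOS", "XPOS",
                 "FEATS", "HEAD", "DEPREL", "DEPS", "MISC"]


def read_conllu(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield one sentence at a time as a list of token dictionaries."""
    sentence: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comment (e.g. "# text = ..."); skipped here.
            continue
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        # Multiword-token ranges (ID like "1-2") are kept as ordinary rows.
        sentence.append(dict(zip(CONLLU_FIELDS, cols)))
    if sentence:
        # Handle a file that does not end with a blank line.
        yield sentence


def parse_feats(feats: str) -> Dict[str, str]:
    """Split a FEATS value such as 'Gender=Fem|Number=Plur' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Invented example sentence, NOT taken from the datasets.
SAMPLE = (
    "# text = las casas\n"
    "1\tlas\tel\tDET\t_\tDefinite=Def|Gender=Fem|Number=Plur|PronType=Art\t_\t_\t_\t_\n"
    "2\tcasas\tcasa\tNOUN\t_\tGender=Fem|Number=Plur\t_\t_\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    for sent in read_conllu(SAMPLE.splitlines()):
        for tok in sent:
            print(tok["FORM"], tok["LEMMA"], tok["UPOS"], parse_feats(tok["FEATS"]))
```

To read an actual file, pass an open file handle instead of `SAMPLE.splitlines()`, for example `read_conllu(open(path, encoding="utf-8"))`, where `path` points to one of the dataset files.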
To cite these datasets or check the results obtained with them, please refer to the following paper:
Bonilla, J.E. (submitted). Spoken Spanish PoS Tagging: Gold Standard Dataset.