Hi, first of all thank you for providing the datasets, they are very useful!
I am trying to link the labels of the dataset (particularly the version for Germany) to concepts from the ESCO taxonomy. However, for every annotation, the only identifiers available are ones like [C000756_en_000, C000756_en_001, ... , C000756_en_022], which all correspond to different alternative labels of a single ESCO concept: "aircraft engine specialist", with concept URI http://data.europa.eu/esco/occupation/0ac8fe65-32e6-4c25-8345-2b87bc7b2698.
Is there any mapping from the labels in annotations.tsv to the conceptUri field of the original ESCO taxonomy?
Create a mapping file between the MELO corpus key and the ESCO concept URI for each element in the corpus, during the creation of a MELO dataset, as suggested by @kuba-bialczyk in #1 (comment)
Thank you for your message and for your interest in the MELO Benchmark! We're thrilled to hear that you find the datasets useful.
We agree that having a mapping between the "corpus element ID" in MELO and the concept URI in ESCO would be helpful. Based on your suggestion, we've just added this functionality to the repository. A new file, concept_mapping.tsv, is now included in the directory for each dataset and provides this mapping. For example, you can find the mapping for the German cross-lingual dataset here.
Please note that the German datasets use data from ESCO version 1.0.3, not version 1.2 as mentioned in the issue title. (This is because ESCO 1.0.3 is the version used in the original crosswalk created by the German Bundesagentur für Arbeit.) While there are some differences between these ESCO versions, most concepts should align well.
Thanks again for your suggestion! If there’s anything else we can help you with, or if you have further ideas to enhance the project, please don’t hesitate to reach out.
Best regards,
Federico Retyk
(Avature Machine Learning Team)
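For readers who want to use the new file, here is a minimal sketch of how a mapping like concept_mapping.tsv could be loaded and used to resolve corpus element IDs to ESCO concept URIs. The column names `corpus_element_id` and `concept_uri` are assumptions made for illustration, as are the helper names; the thread does not show the actual header of the file, so check it and adjust the arguments before use. The sample rows reuse the IDs and the URI quoted in the question.

```python
import csv
import io


def load_concept_mapping(handle, key_col="corpus_element_id", uri_col="concept_uri"):
    """Read a tab-separated mapping into a dict of element ID -> concept URI.

    The default column names are hypothetical; pass the real header names
    found in concept_mapping.tsv.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[key_col]: row[uri_col] for row in reader}


def group_by_concept(mapping):
    """Invert the mapping: concept URI -> list of corpus element IDs."""
    grouped = {}
    for element_id, uri in mapping.items():
        grouped.setdefault(uri, []).append(element_id)
    return grouped


# In-memory stand-in for the real file, using the example from the question.
ESCO_URI = "http://data.europa.eu/esco/occupation/0ac8fe65-32e6-4c25-8345-2b87bc7b2698"
sample_tsv = (
    "corpus_element_id\tconcept_uri\n"
    f"C000756_en_000\t{ESCO_URI}\n"
    f"C000756_en_001\t{ESCO_URI}\n"
)

mapping = load_concept_mapping(io.StringIO(sample_tsv))
by_concept = group_by_concept(mapping)
```

With a real file, replace the `io.StringIO` stand-in with `open(path, newline="", encoding="utf-8")`. Grouping by URI collapses all alternative labels of one occupation into a single entry, which is the view the question asks for.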