Fixed loading error for customized pipelines and added a function for converting trankit outputs to CoNLL-U format
- Issue #17, an error when loading customized pipelines, has been fixed in this new release. Please check it out here.
- In this new release, trankit supports converting trankit outputs in JSON format to CoNLL-U format. The conversion is done via the new function trankit2conllu, which can be used as below:
from trankit import Pipeline, trankit2conllu
p = Pipeline('english')
# document level
json_doc = p('''Hello! This is Trankit.''')
conllu_doc = trankit2conllu(json_doc)
print(conllu_doc)
#1 Hello hello INTJ UH _ 0 root _ _
#2 ! ! PUNCT . _ 1 punct _ _
#
#1 This this PRON DT Number=Sing|PronType=Dem 3 nsubj _ _
#2 is be AUX VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 3 cop _ _
#3 Trankit Trankit PROPN NNP Number=Sing 0 root _ _
#4 . . PUNCT . _ 3 punct _ _
# sentence level
json_sent = p('''This is Trankit.''', is_sent=True)
conllu_sent = trankit2conllu(json_sent)
print(conllu_sent)
#1 This this PRON DT Number=Sing|PronType=Dem 3 nsubj _ _
#2 is be AUX VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 3 cop _ _
#3 Trankit Trankit PROPN NNP Number=Sing 0 root _ _
#4 . . PUNCT . _ 3 punct _ _
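
The converted output can be read back into Python structures for further processing. The following is a minimal sketch, not part of trankit: `parse_conllu` and `CONLLU_FIELDS` are hypothetical names introduced here. It assumes the string returned by trankit2conllu follows the standard CoNLL-U layout, with ten tab-separated columns per token and a blank line between sentences. The sample text is copied from the document-level output above, so the sketch runs without loading a pipeline.

```python
# The ten columns of a CoNLL-U token line, in order.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def parse_conllu(text):
    """Split a CoNLL-U string into sentences, each a list of token dicts.

    Hypothetical helper for illustration; assumes tab-separated columns
    and a blank line between sentences.
    """
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # skip CoNLL-U comment lines
        current.append(dict(zip(CONLLU_FIELDS, line.split("\t"))))
    if current:
        sentences.append(current)
    return sentences


# Sample copied from the document-level output above (tabs assumed).
sample = (
    "1\tHello\thello\tINTJ\tUH\t_\t0\troot\t_\t_\n"
    "2\t!\t!\tPUNCT\t.\t_\t1\tpunct\t_\t_\n"
    "\n"
    "1\tThis\tthis\tPRON\tDT\tNumber=Sing|PronType=Dem\t3\tnsubj\t_\t_\n"
    "2\tis\tbe\tAUX\tVBZ\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t3\tcop\t_\t_\n"
    "3\tTrankit\tTrankit\tPROPN\tNNP\tNumber=Sing\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t.\t_\t3\tpunct\t_\t_\n"
)

sentences = parse_conllu(sample)
for sent in sentences:
    # Print each token with its part-of-speech tag and dependency relation.
    print([(tok["form"], tok["upos"], tok["deprel"]) for tok in sent])
```

In real use, `sample` would be replaced by the string returned from `trankit2conllu(json_doc)`, provided the assumption about its layout holds.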