Tokens to align:
$ grep ps2013-001-01-002-002.u1.p5.s1.w18 /net/work/people/kopp/ParCzech/audio-alignment/Data/audio-corresp-tsv-in/www.psp.cz/eknih/2013ps/audio/2013/11/25/2013112513581412.tsv
se ps2013-001-01-002-002.u1.p5.s1.w18 MiroslavaNemcova.1952
Alignment output (multiple alignments of one token):
$ grep ps2013-001-01-002-002.u1.p5.s1.w18 /net/work/people/kopp/ParCzech/audio-alignment/Data/audio-align-token/2013112513581412.tsv
se - False ps2013-001-01-002-002.u1.p5.s1.w18 False - - - - - -
se - False ps2013-001-01-002-002.u1.p5.s1.w18 False - - - - - -
se - False ps2013-001-01-002-002.u1.p5.s1.w18 False - - - - - -
se to False ps2013-001-01-002-002.u1.p5.s1.w18 True 2 1.000 612760.0 612910.0 150.0 75.000
se se False ps2013-001-01-002-002.u1.p5.s1.w18 True 0 0.000 743260.0 743420.0 160.0 80.000
se - False ps2013-001-01-002-002.u1.p5.s1.w18 False - - - - - -
The first match in audio-align-token/2013112513581412.tsv is the one that ends up aligned in the TEI output (612760.0 to 612910.0):
<anchor synch="#ps2013-001-01-002-002.u1.p5.s1.w18.ab"/>
<w xml:id="ps2013-001-01-002-002.u1.p5.s1.w18" lemma="se" pos="PRON" msd="UPosTag=PRON|Case=Acc|PronType=Prs|Reflex=Yes|Variant=Short" ana="pdt:P7-X4----------">se</w>
<anchor synch="#ps2013-001-01-002-002.u1.p5.s1.w18.ae"/>
<!-- .... -->
<when xml:id="ps2013-001-01-002-002.u1.p5.s1.w18.ab" interval="612760.0" since="#ps2013-001-01-002-002.audio1.origin"/>
<when xml:id="ps2013-001-01-002-002.u1.p5.s1.w18.ae" interval="612910.0" since="#ps2013-001-01-002-002.audio1.origin"/>
$ grep ps2013-001-01-002-002.u1.p3.s1.w26 audio-corresp-tsv-in/www.psp.cz/eknih/2013ps/audio/2013/11/25/2013112513581412.tsv
ten ps2013-001-01-002-002.u1.p3.s1.w26 MiroslavaNemcova.1952
$ grep ps2013-001-01-002-002.u1.p3.s1.w26 /net/work/people/kopp/ParCzech/audio-alignment/Data/audio-align-token/2013112513581412.tsv
ten ten False ps2013-001-01-002-002.u1.p3.s1.w26 True 0 0.000 516570.0 516820.0 250.0 83.333
ten - False ps2013-001-01-002-002.u1.p3.s1.w26 False - - - - - -
ten - False ps2013-001-01-002-002.u1.p3.s1.w26 False - - - - - -
ten - False ps2013-001-01-002-002.u1.p3.s1.w26 False - - - - - -
ten ten False ps2013-001-01-002-002.u1.p3.s1.w26 True 0 0.000 693800.0 694080.0 280.0 93.333
ten - False ps2013-001-01-002-002.u1.p3.s1.w26 False - - - - - -
ParCzech 4.0:
/net/work/people/kopp/ParCzech/audio-alignment/Data/audio-align-token$ ls | xargs grep '^[^\t]*\t[^-]*\t'|cut -f4|grep -v CONTEXT|sort | uniq -c|sort -n| grep -v '^ *1 ' > ~/double-aligned-tokens.log
Affected sentences:
$ cat ~/double-aligned-tokens.log | sed "s/.* //;s/.w.*//"|sort|uniq|wc -l
32464
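The counts above depend on `grep` reading `\t` as a tab, which may not hold for every grep variant. As a cross-check, here is a minimal awk sketch of the same count. It assumes, based only on the sample rows in this issue, that the files are tab-separated, that column 4 holds the token ID, and that column 5 is `True` for an aligned row. The function name `double_aligned` is made up for illustration.

```shell
# Hypothetical helper: print "<count> <token-id>" for every token ID that
# has more than one aligned row in the given TSV files.
# Assumptions (taken from the sample rows, not from the tool's docs):
#   column 4 = token ID, column 5 = "True" when the row is aligned.
double_aligned() {
  awk -F'\t' '
    # Skip CONTEXT rows, as the original pipeline does with grep -v CONTEXT.
    $4 ~ /CONTEXT/ { next }
    # Count aligned rows per token ID.
    $5 == "True" { n[$4]++ }
    END {
      for (id in n)
        if (n[id] > 1) print n[id] " " id
    }
  ' "$@" | sort -k2
}
```

Called as `double_aligned *.tsv > ~/double-aligned-tokens.log` in the audio-align-token directory, it should give output that the `sed` pipeline above can consume unchanged, since each line is again a count, a space, and the token ID.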
matyaskopp