Currently, the audio alignment follows this structure:
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w1.ab"/>
<w xml:id="ps2013-001-01-000-999.u1.p1.s1.w1" lemma="vážený" pos="ADJ" msd="UPosTag=ADJ|Animacy=Anim|Case=Voc|Degree=Pos|Gender=Masc|Number=Sing|Polarity=Pos|VerbForm=Part|Voice=Pass" ana="pdt:AAFS5----1A----">Vážení</w>
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w1.ae"/>
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w2.ab"/>
<w xml:id="ps2013-001-01-000-999.u1.p1.s1.w2" lemma="paní" pos="NOUN" msd="UPosTag=NOUN|Case=Voc|Gender=Fem|Number=Sing|Polarity=Pos" ana="pdt:NNFS5-----A----">paní</w>
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w2.ae"/>
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w3.ab"/>
<w xml:id="ps2013-001-01-000-999.u1.p1.s1.w3" lemma="poslankyně" pos="NOUN" msd="UPosTag=NOUN|Case=Voc|Gender=Fem|Number=Sing|Polarity=Pos" ana="pdt:NNFS5-----A----" join="right">poslankyně</w>
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w3.ae"/>
<pc xml:id="ps2013-001-01-000-999.u1.p1.s1.w4" lemma="," pos="PUNCT" msd="UPosTag=PUNCT" ana="pdt:Z:-------------">,</pc>
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w5.ab"/>
<w xml:id="ps2013-001-01-000-999.u1.p1.s1.w5" lemma="vážený" pos="ADJ" msd="UPosTag=ADJ|Animacy=Anim|Case=Nom|Degree=Pos|Gender=Masc|Number=Plur|Polarity=Pos|VerbForm=Part|Voice=Pass" ana="pdt:AAMP5----1A----">vážení</w>
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w5.ae"/>
<!-- ... -->
Every aligned token is wrapped in two anchors:
w/preceding-sibling::anchor[1][ends-with(@synch,'b')]
w/following-sibling::anchor[1][ends-with(@synch,'e')]
This is fragile because it relies on specific suffixes in @synch and on the anchors being placed directly adjacent to the token.
So, the proposal is to add a @corresp attribute to the anchor that would point to the corresponding token:
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w1.ab" corresp="ps2013-001-01-000-999.u1.p1.s1.w1"/>
<w xml:id="ps2013-001-01-000-999.u1.p1.s1.w1" lemma="vážený" pos="ADJ" msd="UPosTag=ADJ|Animacy=Anim|Case=Voc|Degree=Pos|Gender=Masc|Number=Sing|Polarity=Pos|VerbForm=Part|Voice=Pass" ana="pdt:AAFS5----1A----">Vážení</w>
<anchor synch="#ps2013-001-01-000-999.u1.p1.s1.w1.ae" corresp="ps2013-001-01-000-999.u1.p1.s1.w1"/>
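To illustrate what the proposed attribute would allow, here is a minimal Python sketch that pairs tokens with their anchors by @corresp alone, without looking at @synch suffixes or sibling positions. The shortened ids (`s1.w1`, ...), the wrapping `<s>` element, and the function name `anchors_by_token` are assumptions made for the example. The TEI namespace is also left out to keep the sample short, so a real corpus file would need namespace-qualified element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample; real ids look like ps2013-001-01-000-999.u1.p1.s1.w1
SAMPLE = """<s>
  <anchor synch="#s1.w1.ab" corresp="s1.w1"/>
  <w xml:id="s1.w1">Vážení</w>
  <anchor synch="#s1.w1.ae" corresp="s1.w1"/>
  <anchor synch="#s1.w2.ab" corresp="s1.w2"/>
  <w xml:id="s1.w2">paní</w>
  <anchor synch="#s1.w2.ae" corresp="s1.w2"/>
</s>"""

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def anchors_by_token(root):
    """Map each token id to the @synch values of the anchors that point at it."""
    index = {}
    for anchor in root.iter("anchor"):
        target = anchor.get("corresp")
        if target is None:
            continue  # anchor without the proposed attribute
        # Accept both "id" and "#id" forms of the pointer.
        index.setdefault(target.lstrip("#"), []).append(anchor.get("synch"))
    return index


root = ET.fromstring(SAMPLE)
index = anchors_by_token(root)
for w in root.iter("w"):
    print(w.text, index.get(w.get(XML_ID), []))
```

Because the lookup depends only on @corresp, it would keep working if the anchors were moved away from the token or if the @synch naming scheme changed.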
Notes:
matyaskopp