Hi,
I'm working on the Italian_Old treebank, which consists (so far) of part of the Divine Comedy, an Old Italian poetry text.
During the process of annotation, I faced several problems with the annotation of ellipses.
As you already know, in UD there are two possibilities for annotating elliptical structures:
orphan deprel
promotion
However, basic UD annotation (that is, leaving aside Enhanced Dependencies, which so far are available for fewer treebanks than the basic annotation) makes it difficult to retrieve and analyze ellipses. On the one hand, the orphan relation signals the presence of an ellipsis, but it obscures the dependency relations of the sentence (see example 1 below). On the other hand, promotion does not signal the ellipsis explicitly, so the information that an ellipsis is present is lost (see example 2).
Example 1: Ed elli a me (Inferno, III v. 76)
Gloss = And he to me
Example 2: e la lingua (...) si fende, e la forcuta ne l'altro si richiude (Inferno, XXV, vv. 133-135)
Gloss = And the tongue (...) REFL cleave.3sing, and the forked.femsing in the other REFL close.3sing
I suggest the possibility of:
introducing a dependency relation subtype :ellipsis for promotion cases, so that they can be retrieved easily;
modifying the orphan relation, by keeping the original dependency relation of the node and adding the same :ellipsis subtype to it (X:ellipsis).
I will provide the same example given before with the suggested modification:
For the first example of ellipsis, it has also been suggested to me to select a me (to me) as the head, resulting in the following structure:
To deal with cases where we already have a subtype (e.g., nsubj:pass), we could adopt the @ symbol, as used in SUD, resulting in nsubj:pass@ellipsis.
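To make the two proposed notations concrete, here is a minimal Python sketch of how a query tool could read them. It is only an illustration under assumptions: `Deprel` and `parse_deprel` are hypothetical names, and neither `:ellipsis` nor `@ellipsis` is part of the current UD guidelines.

```python
from typing import NamedTuple, Optional


class Deprel(NamedTuple):
    base: str               # universal relation, e.g. "nsubj"
    subtype: Optional[str]  # ordinary subtype, e.g. "pass", or None
    elliptical: bool        # True if the label carries the proposed marker


def parse_deprel(label: str) -> Deprel:
    """Split a label such as 'obl:ellipsis' or 'nsubj:pass@ellipsis'.

    Assumption: ellipsis is marked either as the subtype ':ellipsis'
    or, when a subtype is already present, as the suffix '@ellipsis'.
    """
    # Peel off the '@' status marker first, if there is one.
    core, _, status = label.partition("@")
    elliptical = status == "ellipsis"
    # Then split the universal relation from its subtype.
    base, _, subtype = core.partition(":")
    if subtype == "ellipsis":
        # ':ellipsis' is the marker itself, not a regular subtype.
        elliptical = True
        subtype = ""
    return Deprel(base, subtype or None, elliptical)
```

With a helper like this, both `obl:ellipsis` and `nsubj:pass@ellipsis` are recognized as elliptical while the underlying relation (`obl`, `nsubj:pass`) stays available for ordinary queries.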
Thanks for bringing this up—I agree the current treatment of ellipsis is not fully satisfying!
Speaking just to what we do in English:
We are reluctant to introduce many new subtypes as we feel that ~50 deprels is what our annotators will be able to handle.
In EWT, I have started adding Promoted=Yes to MISC where I notice non-orphan cases of ellipsis. This will help us understand why an ADJ is attaching as nsubj, for example (and reassure us that it's not an error).
Regarding the orphan cases, in English we have enhanced graphs, so the underspecification of orphan is not an issue. If you wanted to hint at the inferred deprel without introducing an enhanced graph or adding a bunch of subtypes, you might experiment with a new MISC attribute for that, e.g. EllipsisDeprel=obl. This could be a stepping stone toward adding the enhanced graph in the future.
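As a rough sketch of how such MISC attributes could be queried, the Python snippet below reads the tenth CoNLL-U column and collects tokens that carry a given key. The function names and the sample annotation are assumptions made for illustration: the sentence is the familiar gapping example "Mary won gold and Peter bronze", and the `EllipsisDeprel=obj` value on "bronze" is hypothetical, not taken from any released treebank.

```python
def parse_misc(misc: str) -> dict:
    """Turn a MISC value such as 'A=1|B=2' into a dict; '_' means empty."""
    if misc == "_":
        return {}
    pairs = {}
    for item in misc.split("|"):
        key, _, value = item.partition("=")
        pairs[key] = value
    return pairs


def tokens_with_misc(conllu: str, key: str) -> list:
    """Return (ID, FORM, DEPREL, value) for every token whose MISC has `key`."""
    hits = []
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        misc = parse_misc(cols[9])
        if key in misc:
            hits.append((cols[0], cols[1], cols[7], misc[key]))
    return hits


# Toy sentence with a hypothetical EllipsisDeprel attribute on the orphan.
ROWS = [
    ("1", "Mary", "Mary", "PROPN", "_", "_", "2", "nsubj", "_", "_"),
    ("2", "won", "win", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "gold", "gold", "NOUN", "_", "_", "2", "obj", "_", "_"),
    ("4", "and", "and", "CCONJ", "_", "_", "5", "cc", "_", "_"),
    ("5", "Peter", "Peter", "PROPN", "_", "_", "2", "conj", "_", "_"),
    ("6", "bronze", "bronze", "NOUN", "_", "_", "5", "orphan", "_",
     "EllipsisDeprel=obj"),
]
SAMPLE = "# text = Mary won gold and Peter bronze\n" + "\n".join(
    "\t".join(row) for row in ROWS
)

print(tokens_with_misc(SAMPLE, "EllipsisDeprel"))
```

The same function can be called with `"Promoted"` to list promotion cases once such tokens are marked, so both kinds of ellipsis become retrievable without touching the deprel inventory.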
I would like to point out that subtypes are not new relations, though, especially when they are simple references to main types.
Here we are speaking more of the status of a relation as appearing in an elliptical construction or not. I am totally in favour of introducing "relation statuses", which remind me of feature layers, and whose exact notation we could discuss (@, [], ...). In fact, I am convinced this is strongly needed. It might be possible to consider regular "subtype extensions" for other underdefined relations, e.g. dislocated.
This is different from enhanced annotation, because it does not involve reconstructing the non-elliptical version of the construction (if that is possible at all!), which might be beyond the goals of many treebanks. It just signals challenging cases, and this is already extremely beneficial for queries and data extraction. Also, not everybody is willing to query enhanced graphs.
In EWT, I have started adding Promoted=Yes to MISC where I notice non-orphan cases of ellipsis. This will help us understand why an ADJ is attaching as nsubj, for example (and reassure us that it's not an error).
One could argue that, if this happens, it is always an ellipsis. This is one of the main points of the OP.