Regex group matching with enhanced parentage constraint only works with corresponding basic parentage constraint #46
I think this is the only issue blocking nmod:desc implementation for honorific titles in EWT. There needs to be a way to reattach enhanced deps that are there due to relative clauses, coordination, and control involving the name. If the bracket group matching problem is thorny I would actually prefer a more direct syntax for this, as discussed in #40.
I think the issue is that you're using edep for the capture replacement rather than edom. From the documentation:
The problem is that, unlike other fields, DEPS contains multiple edge and label annotations, so it's hard to operate on it in a simple way. I haven't implemented the notation suggested in #40 yet. It's still conceivable, but I'm not likely to get to it very soon: it seems nontrivial and I have a bunch of other things in the queue. PRs welcome though!
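To illustrate why DEPS is harder to operate on than single-valued fields, here is a minimal Python sketch that splits a CoNLL-U DEPS value into its individual edges. The helper names `parse_deps` and `format_deps` are hypothetical and not part of DepEdit; the only thing assumed is the standard CoNLL-U encoding of DEPS as `head:deprel` pairs joined by `|`.

```python
def parse_deps(deps):
    """Split a CoNLL-U DEPS value into a list of (head, deprel) pairs.

    Heads are kept as strings so that empty-node IDs such as "8.1" survive.
    Hypothetical helper for illustration; not DepEdit's internal code.
    """
    if deps in ("_", ""):
        return []
    edges = []
    for item in deps.split("|"):
        # Split on the first colon only: the deprel itself may contain
        # colons (e.g. "nmod:desc").
        head, _, deprel = item.partition(":")
        edges.append((head, deprel))
    return edges


def format_deps(edges):
    """Serialize (head, deprel) pairs back into a DEPS string."""
    if not edges:
        return "_"
    return "|".join(f"{head}:{deprel}" for head, deprel in edges)
```

A token with two `nsubj` edges, such as `4:nsubj|6:nsubj`, yields two pairs that differ only in the head, so any operation that addresses an edge by label alone is ambiguous.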
Maybe I misunderstood the Note that
does work, but including the
It might also help to give pseudocode in the docs for the rule matching algorithm. I am assuming something like:
I have wondered about the order and timing of bindings (are nodes matched left to right? computed once per sentence per rule, or with each rule application?). E.g. I was trying to get away with a rule that would apply to the leftmost

Also I mentioned "node and edge bindings" contemplating the case where a pair of nodes has two enhanced dependencies between them, so a rule could apply twice to the same nodes. But maybe that is not supported.
Can you post a minimal conllu input example and note what you expect to happen to the sentence which doesn't? Then I can take a look in the debugger
Yes, the algorithm you outlined above is right (for each sentence, each rule is applied, unless last or once is used as you wrote). Node and edge matches are computed only once, before the rule is applied, and then it is applied to all matches, so it's not important if node sets are ordered left to right or not - each set of matching nodes will be subjected to the 'action' column directives. There is no iterative changing and rechecking of constraints - they are only tested before a rule is applied. This also means there can never be an infinite loop à la Grew, because any changes triggered by a rule do not impact that rule's inputs anymore.
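For readers following along, the match-once-then-apply behavior described above can be sketched roughly as follows. This is a toy model, not DepEdit's actual code: a sentence is just a list of labels, a rule is a hypothetical `(find_matches, action)` pair, and the `last`/`once` options are not modeled.

```python
def apply_rules(sentences, rules):
    """Apply every rule to every sentence, matching once per rule.

    Toy model only: each rule is a (find_matches, action) pair of
    hypothetical callables.
    """
    for sentence in sentences:
        for find_matches, action in rules:
            # Matches are frozen here, before any action runs, so edits
            # made by the action cannot create new matches for this rule.
            matches = list(find_matches(sentence))
            for match in matches:
                action(sentence, match)
    return sentences


def find_a(sentence):
    """Match the index of every token currently labeled "a"."""
    return [i for i, label in enumerate(sentence) if label == "a"]


def spread_right(sentence, i):
    """Relabel the token to the right of a match as "a"."""
    if i + 1 < len(sentence):
        sentence[i + 1] = "a"


result = apply_rules([["a", "b", "b"]], [(find_a, spread_right)])
print(result)  # [['a', 'a', 'b']]
```

The third token stays `"b"` because the newly created `"a"` at position 1 was not in the frozen match list. A system that rechecked constraints after each change would keep spreading to the end of the sentence, which is the kind of cascade the single matching pass rules out.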
Well the order will matter for

Also, I assume the node bindings for any rule application must be distinct (
Sure, attaching the .conllu and the script (adding .txt extensions for GitHub reasons): sample.conllu.txt

With the rule I want to write (R1), it gives an error:
OK I finally figured out how to use the VSCode debugger and isolated the cause of the problem, which is that sometimes I don't see a direct way to specify a node-inequality constraint so I copied the rule and specified
When setting the new edeprel I can't figure out how to prevent it from overriding a different one. This is happening at least for nodes that have two edeps with the same deprel (an nsubj which corresponds to the basic dep, and an nsubj into the relative clause). Maybe there's a way to do this by capturing the head ID and setting
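To make the desired behavior concrete, here is a small Python sketch of reattaching one enhanced edge without disturbing another edge that carries the same deprel. It identifies the edge by its (head, deprel) pair rather than by deprel alone. The function `reattach_edge` and the token IDs are hypothetical illustrations, not DepEdit syntax or internals.

```python
def reattach_edge(edges, old_head, deprel, new_head):
    """Move exactly the (old_head, deprel) edge to new_head.

    `edges` is a list of (head, deprel) string pairs for one token.
    Other edges, including ones with the same deprel but a different
    head, are returned unchanged. Hypothetical helper for illustration.
    """
    return [
        (new_head, rel) if (head, rel) == (old_head, deprel) else (head, rel)
        for head, rel in edges
    ]


# A token with two nsubj edges: one mirroring the basic dependency
# (head 4) and one into a relative clause (head 6). IDs are made up.
edges = [("4", "nsubj"), ("6", "nsubj")]
moved = reattach_edge(edges, old_head="6", deprel="nsubj", new_head="7")
print(moved)  # [('4', 'nsubj'), ('7', 'nsubj')]
```

Keying on the pair is what keeps the `4:nsubj` edge intact; an update keyed on `nsubj` alone would have no way to tell the two edges apart.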
Debugging why it comes out as

I would love to be able to substitute a node variable instead:
I am trying to match and reattach all enhanced edges meeting certain criteria, copying the edeprel, but there is an error about a missing regex bracket group. See the commented-out line: