This came out of a conversation with Zach Pennington about finding the lemmas corresponding to a translated word across a whole translation. The plan would be to provide some generic support for this kind of query.
I think this should happen at the sequence level. For many use cases this means the main sequence of the document, but it would also be possible to analyse, say, footnotes or headings.
The 3 useful permutations seem to be:
scopes related to a token: sequence { tokenScopes { target context { payload nUses } } }
tokens related to a scope: sequence { scopeTokens { target context { payload nUses } } }
scopes related to a scope: sequence { scopeScopes { target context { payload nUses } } }
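To make the three mappings concrete, here is a rough Python sketch over a toy sequence. It is not Proskomma code: the data model (a list of `(payload, scopes)` pairs), the function names and the lemma values are all hypothetical, and counting `nUses` once per token is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical toy sequence: each token carries the scopes open around it.
# The real data model is richer; this only illustrates the three mappings.
SEQUENCE = [
    ("see", ["blockTag/p", "attribute/spanWithAtts/lemma/horao"]),
    ("hear", ["blockTag/p", "attribute/spanWithAtts/lemma/akouo"]),
    ("see", ["blockTag/q", "attribute/spanWithAtts/lemma/blepo"]),
]


def token_scopes(sequence):
    """Map each token payload to a count of the scopes applied to it."""
    result = defaultdict(Counter)
    for payload, scopes in sequence:
        result[payload].update(scopes)
    return result


def scope_tokens(sequence):
    """Map each scope to a count of the token payloads found inside it."""
    result = defaultdict(Counter)
    for payload, scopes in sequence:
        for scope in scopes:
            result[scope][payload] += 1
    return result


def scope_scopes(sequence):
    """Map each scope to a count of the other scopes that co-occur with it.

    Assumption: co-occurrence is counted once per token.
    """
    result = defaultdict(Counter)
    for _payload, scopes in sequence:
        for scope in scopes:
            result[scope].update(s for s in scopes if s != scope)
    return result


if __name__ == "__main__":
    for target, context in token_scopes(SEQUENCE).items():
        print(target, dict(context))
```

In each case the outer key plays the role of `target`, and the inner counter holds the `context` entries with their `nUses`.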
By default, these endpoints would produce an exhaustive mapping: in the case of tokenScopes, for example, all scopes applied to all tokens in the sequence. This could be useful for producing a lookup table or for populating a database. Two optional filters may be applied:
target - the terms to be searched for
context - the values to be collected with respect to the target
For example:
tokenScopes(target: ["see", "hear", "touch"]) { target context { payload } } to get all scopes applied to each of three tokens
tokenScopes(context: ["blockTag/" "attribute/spanWithAtts/lemma/"]) { target context { payload } } to get the blockTag and lemma scopes applied to every token
scopeScopes(target:"attribute/spanWithAtts/lemma" context:"attribute/spanWithAtts/x-content") { target context { payload } } to get all the source words corresponding to a source lemma.
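The two filters could behave roughly as in the Python sketch below. This is only an illustration under stated assumptions: the `(payload, scopes)` data model and the function name are hypothetical, `target` is assumed to match token payloads exactly, and `context` is assumed to match scopes by prefix (which the trailing slash in `"blockTag/"` suggests but the proposal does not state).

```python
from collections import Counter, defaultdict

# Hypothetical toy sequence of (token payload, scopes in effect) pairs.
SEQUENCE = [
    ("see", ["blockTag/p", "chapter/1", "attribute/spanWithAtts/lemma/horao"]),
    ("hear", ["blockTag/p", "chapter/1", "attribute/spanWithAtts/lemma/akouo"]),
    ("walk", ["blockTag/p", "chapter/2", "attribute/spanWithAtts/lemma/peripateo"]),
]


def token_scopes(sequence, target=None, context=None):
    """tokenScopes-style mapping with optional filters.

    target: keep only these token payloads (exact match, assumed).
    context: keep only scopes starting with one of these prefixes (assumed).
    """
    result = defaultdict(Counter)
    for payload, scopes in sequence:
        if target is not None and payload not in target:
            continue
        for scope in scopes:
            if context is None or any(scope.startswith(p) for p in context):
                result[payload][scope] += 1
    return result


if __name__ == "__main__":
    # All scopes for a few tokens.
    print(dict(token_scopes(SEQUENCE, target=["see", "hear", "touch"])))
    # Only blockTag and lemma scopes, for every token.
    wanted = ["blockTag/", "attribute/spanWithAtts/lemma/"]
    print(dict(token_scopes(SEQUENCE, context=wanted)))
```

With neither filter set, the function falls back to the exhaustive mapping.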
The optional caseInsensitive argument may be used to aggregate all tokens with the same lower-case value.
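As a sketch of what the aggregation might look like (Python, with a hypothetical data model and function name; lower-casing the payload before grouping is my reading of the sentence above, not a confirmed implementation):

```python
from collections import Counter, defaultdict

# Hypothetical toy sequence: the same word appears capitalised and lower-case.
SEQUENCE = [
    ("See", ["blockTag/p"]),
    ("see", ["blockTag/p"]),
    ("see", ["blockTag/q"]),
]


def token_scopes(sequence, case_insensitive=False):
    """Group scopes by token payload, optionally folding case first."""
    result = defaultdict(Counter)
    for payload, scopes in sequence:
        key = payload.lower() if case_insensitive else payload
        result[key].update(scopes)
    return result


if __name__ == "__main__":
    print(dict(token_scopes(SEQUENCE)))
    print(dict(token_scopes(SEQUENCE, case_insensitive=True)))
```

Without the flag, "See" and "see" are separate targets. With it, they collapse into one target whose `nUses` counts are summed.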