Token and Scope Usage Stats #137

Open
mvahowe opened this issue Jul 20, 2021 · 0 comments

mvahowe commented Jul 20, 2021

This came out of a conversation with Zach Pennington about finding the lemmas corresponding to a translated word across a whole translation. The plan would be to provide some generic support for this kind of query.

I think this should happen at the sequence level. For many use cases this means the main sequence of the document, but it would also be possible to analyse, say, footnotes or headings.

The three useful permutations seem to be:

  • scopes related to a token: sequence { tokenScopes { target context { payload nUses } } }
  • tokens related to a scope: sequence { scopeTokens { target context { payload nUses } } }
  • scopes co-enclosing content: sequence { scopeScopes { target context { payload nUses } } }
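To make the three permutations concrete, here is a rough Python sketch of the counting each one implies. It is only an illustration of the idea, not the proposed implementation: the input shape (a list of token payloads, each paired with the scopes open around it), the function names and the sample scope values are all assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (token payload, scopes open around that token).
# The scope strings are invented for illustration only.
SEQUENCE = [
    ("see", ["blockTag/p", "attribute/spanWithAtts/lemma/lemmaA"]),
    ("hear", ["blockTag/p", "attribute/spanWithAtts/lemma/lemmaB"]),
    ("see", ["blockTag/q", "attribute/spanWithAtts/lemma/lemmaC"]),
]


def token_scopes(sequence):
    """target = token, context = scopes applied to it, with use counts."""
    stats = defaultdict(Counter)
    for token, scopes in sequence:
        for scope in scopes:
            stats[token][scope] += 1
    return stats


def scope_tokens(sequence):
    """target = scope, context = tokens found inside it, with use counts."""
    stats = defaultdict(Counter)
    for token, scopes in sequence:
        for scope in scopes:
            stats[scope][token] += 1
    return stats


def scope_scopes(sequence):
    """target = scope, context = other scopes enclosing the same tokens."""
    stats = defaultdict(Counter)
    for _token, scopes in sequence:
        for outer in scopes:
            for other in scopes:
                if outer != other:
                    stats[outer][other] += 1
    return stats
```

In this sketch each inner counter plays the role of the `context { payload nUses }` list for one `target`.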

By default, these endpoints would produce an exhaustive mapping, e.g. all scopes applied to all tokens in a sequence in the case of tokenScopes. This could be useful for producing a lookup table or for populating a database. Two optional filters may be applied:

  • target - the terms to be searched for
  • context - the values to be collected with respect to the target

e.g.

  • tokenScopes(target: ["see", "hear", "touch"]) { target context { payload } } to get all scopes applied to each of three tokens
  • tokenScopes(context: ["blockTag/", "attribute/spanWithAtts/lemma/"]) { target context { payload } } to get the blockTag and lemma scopes applied to every token
  • scopeScopes(target: "attribute/spanWithAtts/lemma", context: "attribute/spanWithAtts/x-content") { target context { payload } } to get all the source words corresponding to a source lemma
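The effect of the two filters on tokenScopes might look something like the following Python sketch. This is a guess at the intended behaviour rather than a specification: the input shape and sample values are invented, and treating each `context` value as a prefix is an assumption suggested by the trailing slashes in the examples above.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (token payload, scopes open around that token).
SEQUENCE = [
    ("see", ["blockTag/p", "chapter/1", "attribute/spanWithAtts/lemma/lemmaA"]),
    ("hear", ["blockTag/p", "chapter/1", "attribute/spanWithAtts/lemma/lemmaB"]),
    ("walk", ["blockTag/q", "chapter/2", "attribute/spanWithAtts/lemma/lemmaC"]),
]


def token_scopes(sequence, target=None, context=None):
    """Count scopes per token, optionally filtered.

    target:  token payloads to report on (None means every token).
    context: scope prefixes to collect (None means every scope).
             Prefix matching is an assumption of this sketch.
    """
    stats = defaultdict(Counter)
    for token, scopes in sequence:
        if target is not None and token not in target:
            continue
        for scope in scopes:
            if context is not None and not any(
                scope.startswith(prefix) for prefix in context
            ):
                continue
            stats[token][scope] += 1
    return stats


# Only the scopes applied to two chosen tokens.
by_target = token_scopes(SEQUENCE, target=["see", "hear"])

# Only blockTag and lemma scopes, for every token.
by_context = token_scopes(
    SEQUENCE, context=["blockTag/", "attribute/spanWithAtts/lemma/"]
)
```

With neither argument supplied the function falls back to the exhaustive mapping.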

The optional caseInsensitive argument may be used to aggregate all tokens with the same lower-case value.
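As a small illustration of what that aggregation could mean, the Python sketch below folds token payloads to lower case before counting. The input shape, the `case_insensitive` parameter name and the sample data are assumptions for this example only.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (token payload, scopes open around that token).
SEQUENCE = [
    ("See", ["blockTag/p"]),
    ("see", ["blockTag/p"]),
    ("see", ["blockTag/q"]),
]


def token_scopes(sequence, case_insensitive=False):
    """Count scopes per token, optionally merging tokens that differ only in case."""
    stats = defaultdict(Counter)
    for token, scopes in sequence:
        key = token.lower() if case_insensitive else token
        for scope in scopes:
            stats[key][scope] += 1
    return stats


exact = token_scopes(SEQUENCE)
folded = token_scopes(SEQUENCE, case_insensitive=True)
```

Here `exact` keeps "See" and "see" as separate targets, while `folded` reports a single "see" target whose counts combine both.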

@mvahowe mvahowe added this to the GraphQL milestone Jul 20, 2021
@mvahowe mvahowe modified the milestones: GraphQL, 0.8 - GraphQL Refinements Apr 23, 2022