Own class: tokens_with_tokenvars #2

Open
3 of 4 tasks
chainsawriot opened this issue Nov 26, 2023 · 2 comments

@chainsawriot
Contributor

chainsawriot commented Nov 26, 2023

Given gesistsa/quanteda.proximity#35, and given that the quanteda::tokens_*() functions will not respect tokenvars, it would be better to make this a new class for now.

  • Create a new class tokens_with_tokenvars
  • tokens_with_tokenvars.as.tokens()
  • docvars.tokens_with_tokenvars()
  • meta.tokens_with_tokenvars()
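
A minimal base-R sketch of what such a class could look like, assuming the tokens are stored as a list of character vectors (one per document) with a parallel list of data frames attached as an attribute. The names `new_tokens_with_tokenvars()` and `tokenvars()` are hypothetical and only for illustration; they are not taken from quanteda or from this package.

```r
# Hypothetical constructor: one character vector of tokens per document,
# plus one data frame of token-level variables per document.
new_tokens_with_tokenvars <- function(tokens, tokenvars) {
    stopifnot(is.list(tokens), is.list(tokenvars),
              length(tokens) == length(tokenvars))
    for (i in seq_along(tokens)) {
        # Every token needs exactly one row of token variables.
        stopifnot(length(tokens[[i]]) == nrow(tokenvars[[i]]))
    }
    structure(tokens, tokenvars = tokenvars, class = "tokens_with_tokenvars")
}

# Hypothetical S3 generic and method to read the token variables back.
tokenvars <- function(x) {
    UseMethod("tokenvars")
}

tokenvars.tokens_with_tokenvars <- function(x) {
    attr(x, "tokenvars")
}

x <- new_tokens_with_tokenvars(
    tokens = list(d1 = c("spacy", "is")),
    tokenvars = list(d1 = data.frame(tag = c("NNP", "VBZ"),
                                     lemma = c("spaCy", "be")))
)
class(x)
tokenvars(x)$d1$lemma
```

Keeping the token variables in an attribute of a separate class means functions that only know about plain tokens objects cannot silently drop or misalign them, which seems to be the concern behind making this its own class.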
@chainsawriot
Contributor Author

chainsawriot commented Nov 26, 2023

tokens_with_tokenvars_VERB() is annoying to type (if we are going to follow the style guide), but this is an experiment anyway.

@chainsawriot
Contributor Author

chainsawriot commented Nov 26, 2023

  • print.tokens_with_tokenvars()
xtokenid <- c("t1", "t2")
xtoken <- c("spacy", "is")
xtokenvars <- data.frame(tag = c("NNP", "VBZ"), lemma = c("spaCy", "be"))

# Mockup of the proposed print output; the document count and "d1" are hard-coded.
mockup <- function(xtokenid, xtoken, xtokenvars) {
    # Collapse each row of token variables into one "tag/lemma" string.
    ugly <- vapply(seq_len(nrow(xtokenvars)), function(y) paste(as.character(xtokenvars[y, ]), collapse = "/"), character(1))
    cat("Tokens (with token variables) consisting of 2 documents.\n")
    cat("Token variables: (", paste(names(xtokenvars), collapse = "/"), ").\n", sep = "")
    cat("d1:\n")
    # Print each token as "[id]: token (vars) " on a single line.
    for (i in seq_along(xtoken)) {
        cat("[", xtokenid[i], "]: ", xtoken[i], " (", ugly[i], ") ", sep = "")
    }
    cat("\n")
}

mockup(xtokenid, xtoken, xtokenvars)
#> Tokens (with token variables) consisting of 2 documents.
#> Token variables: (tag/lemma).
#> d1:
#> [t1]: spacy (NNP/spaCy) [t2]: is (VBZ/be)

Created on 2023-11-26 with reprex v2.0.2
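
The mockup could be generalized into an actual S3 print method along the following lines. This is only a sketch: it assumes an object that is a named list of character vectors with a `tokenvars` attribute holding one data frame per document, and it generates token ids as `t1`, `t2`, ... per document. Both the layout and the id scheme are assumptions for illustration, not the package's design.

```r
# Sketch of a print method; the object layout (named list of tokens plus a
# "tokenvars" attribute with one data frame per document) is an assumption.
print.tokens_with_tokenvars <- function(x, ...) {
    tv <- attr(x, "tokenvars")
    cat("Tokens (with token variables) consisting of ", length(x),
        " document(s).\n", sep = "")
    cat("Token variables: (", paste(names(tv[[1]]), collapse = "/"),
        ").\n", sep = "")
    for (doc in names(x)) {
        toks <- x[[doc]]
        # Paste the columns row-wise, e.g. "NNP/spaCy".
        vars <- do.call(paste, c(unname(as.list(tv[[doc]])), sep = "/"))
        # Token ids are generated here; a real class might store them.
        ids <- paste0("t", seq_along(toks))
        cat(doc, ":\n", sep = "")
        cat(paste0("[", ids, "]: ", toks, " (", vars, ")"), sep = " ")
        cat("\n")
    }
    invisible(x)
}

# Build a toy object by hand to try the method.
x <- structure(
    list(d1 = c("spacy", "is")),
    tokenvars = list(d1 = data.frame(tag = c("NNP", "VBZ"),
                                     lemma = c("spaCy", "be"))),
    class = "tokens_with_tokenvars"
)
print(x)
```

Returning `invisible(x)` follows the usual convention for print methods, so the object can still be piped or assigned after printing.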

chainsawriot added a commit that referenced this issue Nov 26, 2023