[WIP] Add Hub Connection Functionality #15

Merged
merged 23 commits into from
Jan 6, 2023
Changes from 17 commits
Commits
9204b94
Add jsonlite dependency
annakrystalli Oct 4, 2022
e8782c2
Add license
annakrystalli Oct 5, 2022
85c3167
Add read_hubmeta function. Resolves #9
annakrystalli Oct 5, 2022
091aa12
Buildignore attic folder
annakrystalli Oct 5, 2022
cadb440
Style code
annakrystalli Oct 5, 2022
5824273
Add R-CMD-CHECK & lifecycle badges to README
annakrystalli Oct 5, 2022
0866c7f
Merge branch 'hub-connect' of https://github.com/Infectious-Disease-M…
annakrystalli Oct 5, 2022
c7b0a66
add GITHUB_ACTOR env variable
annakrystalli Oct 5, 2022
e69d999
style pkg
annakrystalli Oct 5, 2022
73afb1b
Remove style GitHub Action for now
annakrystalli Oct 5, 2022
914b813
Add functions to create new hub_connection classes
annakrystalli Oct 6, 2022
0577ed7
add task_ids_by_round attribute
annakrystalli Oct 6, 2022
da1077c
add age group to scenario hub example
annakrystalli Oct 17, 2022
dae7498
Add get_* metadata function family + draft input validation functions
annakrystalli Oct 19, 2022
d8541ae
make hub connection object a single "hub_connection" class (remove "l…
annakrystalli Oct 19, 2022
c22c315
use rlang::arg_match instead of match.arg
annakrystalli Oct 19, 2022
e2817ec
Use %>% instead of |>. Resolves #23
annakrystalli Oct 19, 2022
44ac1ac
change assing_hc_attrs typo to assign_hc_attrs
annakrystalli Oct 24, 2022
1335e45
Merge branch 'main' of https://github.com/Infectious-Disease-Modeling…
annakrystalli Oct 24, 2022
4497211
Add hub_connection print method. Resolves #26
annakrystalli Oct 24, 2022
cec178e
rename print argument con to x for consistency with generic print method
annakrystalli Oct 24, 2022
ab81ae6
add ... argument for consistency with generic print method
annakrystalli Oct 25, 2022
7a1fc39
Merge branch 'hub-connect' of https://github.com/Infectious-Disease-M…
annakrystalli Dec 15, 2022
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -5,3 +5,5 @@
^_pkgdown\.yml$
^docs$
^pkgdown$
^LICENSE\.md$
^attic$
68 changes: 0 additions & 68 deletions .github/workflows/style.yaml

This file was deleted.

17 changes: 14 additions & 3 deletions DESCRIPTION
@@ -1,15 +1,26 @@
Package: hubUtils
Title: Utility functions for Infectious Disease Modeling Hubs
Version: 0.0.0.9000
Version: 0.0.0.9001
Authors@R:
    person("Anna", "Krystalli", , "[email protected]", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0002-2378-4915"))
Description: A set of utility functions for downloading, plotting, and scoring
    forecast and truth data from Infectious Disease Modeling Hubs.
License: `use_mit_license()`, `use_gpl3_license()` or friends to pick a
    license
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.1
URL: https://github.com/Infectious-Disease-Modeling-Hubs/hubUtils
BugReports: https://github.com/Infectious-Disease-Modeling-Hubs/hubUtils/issues
Suggests:
    testthat (>= 3.0.0)
Config/testthat/edition: 3
Imports:
    checkmate,
    cli,
    fs,
    jsonlite,
    magrittr,
    purrr,
    rlang,
    yaml
2 changes: 2 additions & 0 deletions LICENSE
@@ -0,0 +1,2 @@
YEAR: 2022
COPYRIGHT HOLDER: Consortium of Infectious Disease Modeling Hubs
21 changes: 21 additions & 0 deletions LICENSE.md
@@ -0,0 +1,21 @@
# MIT License

Copyright (c) 2022 Consortium of Infectious Disease Modeling Hubs

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
8 changes: 8 additions & 0 deletions NAMESPACE
@@ -1,2 +1,10 @@
# Generated by roxygen2: do not edit by hand

export("%>%")
elray1 marked this conversation as resolved.
export(connect_hub)
export(get_round_ids)
export(get_task_id_vals)
export(get_task_ids)
export(read_hubmeta)
importFrom(magrittr,"%>%")
importFrom(rlang,`!!!`)
106 changes: 106 additions & 0 deletions R/get-metadata.R
@@ -0,0 +1,106 @@
#' Get task id values for a given round
#'
#' @param con a `hub_connection` class object.
#' @param task_id Character string. A task id name.
#' @param round_id Character string. A round id name. If rounds vary by a variable in hub,
#' argument is ignored and can be left `NULL` (default).
#' @param flatten Logical. Whether to flatten required & optional task ids into a single
#' character vector.
#'
#' @return If `flatten = TRUE` (default), a character vector of task id values.
#' If `flatten = FALSE`, a named list containing character vectors of
#' `required` and `optional` task id values.
#' @export
#' @family hub-metadata
#' @examples
#' con <- connect_hub(system.file("hub_1", package = "hubUtils"))
#' get_task_id_vals(con, task_id = "location")
#' con <- connect_hub(system.file("scnr_hub_1", package = "hubUtils"))
#' get_task_id_vals(con, round_id = "round-1", task_id = "location")
#' get_task_id_vals(con, round_id = "round-2", task_id = "age_group")
#' get_task_id_vals(con,
#'   round_id = "round-1", task_id = "location",
#'   flatten = FALSE
#' )
get_task_id_vals <- function(con,
                             task_id,
                             round_id = NULL,
                             flatten = TRUE) {
  checkmate::assert_class(con, "hub_connection")
  checkmate::assert_character(round_id, len = 1, null.ok = TRUE)
  checkmate::assert_character(task_id, len = 1)


  # validate inputs
  round_id <- validate_round_ids(con, round_id)
  # trigger validation error as a single task_id is being evaluated.
  task_id <- validate_task_ids(con,
    task_ids = task_id,
    round_id = round_id, val_type = "error"
  )


  # extract task_id values from connection
  values <- con[[round_id]]$model_tasks[[1]]$task_ids[[task_id]]
Contributor


I think that we don't want to subset to model_tasks[[1]] here. We may want to pull the values from all model_tasks entries.

Member Author


Could you clarify what you mean by:

We may want to pull the values from all model_tasks entries.

In the current hubmeta JSON files, the model_tasks element contains a list consisting of a single unnamed element, which in turn contains two named list elements, task_ids & output_types (see below).

Hence I'm using [[1]] to access the underlying task_ids & output_types. Perhaps this superfluous unnamed list element shouldn't be there?

library(hubUtils)
con <- connect_hub(system.file("hub_1", package = "hubUtils")) 
round_id <- "round_id_from_variable"
con[[round_id]]$model_tasks[[1]]
#> $task_ids
#> $task_ids$origin_date
#> $task_ids$origin_date$required
#> NULL
#> 
#> $task_ids$origin_date$optional
#>  [1] "2022-01-08" "2022-01-15" "2022-01-22" "2022-01-29" "2022-02-05"
#>  [6] "2022-02-12" "2022-02-19" "2022-02-26" "2022-03-05" "2022-03-12"
#> [11] "2022-03-19" "2022-03-26" "2022-04-02" "2022-04-09" "2022-04-16"
#> [16] "2022-04-23" "2022-04-30" "2022-05-07" "2022-05-14" "2022-05-21"
#> [21] "2022-05-28" "2022-06-04" "2022-06-11" "2022-06-18"
#> 
#> 
#> $task_ids$location
#> $task_ids$location$required
#> NULL
#> 
#> $task_ids$location$optional
#>  [1] "01" "02" "04" "05" "06" "08" "09" "10" "11" "12" "13" "15" "16" "17" "18"
#> [16] "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33"
#> [31] "34" "35" "36" "37" "38" "39" "40" "41" "42" "44" "45" "46" "47" "48" "49"
#> [46] "50" "51" "53" "54" "55" "56" "72" "78" "US"
#> 
#> 
#> $task_ids$horizon
#> $task_ids$horizon$required
#> NULL
#> 
#> $task_ids$horizon$optional
#> [1] 1 2 3 4
#> 
#> 
#> 
#> $output_types
#> $output_types$mean
#> $output_types$mean$type_id
#> $output_types$mean$type_id$required
#> NULL
#> 
#> $output_types$mean$type_id$optional
#> [1] NA
#> 
#> 
#> $output_types$mean$value
#> $output_types$mean$value$type
#> [1] "integer"
#> 
#> $output_types$mean$value$minimum
#> [1] 0
#> 
#> 
#> 
#> $output_types$quantile
#> $output_types$quantile$type_id
#> $output_types$quantile$type_id$required
#> NULL
#> 
#> $output_types$quantile$type_id$optional
#>  [1] 0.010 0.025 0.050 0.100 0.150 0.200 0.250 0.300 0.350 0.400 0.450 0.500
#> [13] 0.550 0.600 0.650 0.700 0.750 0.800 0.850 0.900 0.950 0.975 0.990
#> 
#> 
#> $output_types$quantile$value
#> $output_types$quantile$value$type
#> [1] "integer"
#> 
#> $output_types$quantile$value$minimum
#> [1] 0

str(con[[round_id]]$model_tasks[[1]])
#> List of 2
#>  $ task_ids    :List of 3
#>   ..$ origin_date:List of 2
#>   .. ..$ required: NULL
#>   .. ..$ optional: chr [1:24] "2022-01-08" "2022-01-15" "2022-01-22" "2022-01-29" ...
#>   ..$ location   :List of 2
#>   .. ..$ required: NULL
#>   .. ..$ optional: chr [1:54] "01" "02" "04" "05" ...
#>   ..$ horizon    :List of 2
#>   .. ..$ required: NULL
#>   .. ..$ optional: int [1:4] 1 2 3 4
#>  $ output_types:List of 2
#>   ..$ mean    :List of 2
#>   .. ..$ type_id:List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: logi NA
#>   .. ..$ value  :List of 2
#>   .. .. ..$ type   : chr "integer"
#>   .. .. ..$ minimum: int 0
#>   ..$ quantile:List of 2
#>   .. ..$ type_id:List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: num [1:23] 0.01 0.025 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 ...
#>   .. ..$ value  :List of 2
#>   .. .. ..$ type   : chr "integer"
#>   .. .. ..$ minimum: int 0
str(con[[round_id]]$model_tasks)
#> List of 1
#>  $ :List of 2
#>   ..$ task_ids    :List of 3
#>   .. ..$ origin_date:List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: chr [1:24] "2022-01-08" "2022-01-15" "2022-01-22" "2022-01-29" ...
#>   .. ..$ location   :List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: chr [1:54] "01" "02" "04" "05" ...
#>   .. ..$ horizon    :List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: int [1:4] 1 2 3 4
#>   ..$ output_types:List of 2
#>   .. ..$ mean    :List of 2
#>   .. .. ..$ type_id:List of 2
#>   .. .. .. ..$ required: NULL
#>   .. .. .. ..$ optional: logi NA
#>   .. .. ..$ value  :List of 2
#>   .. .. .. ..$ type   : chr "integer"
#>   .. .. .. ..$ minimum: int 0
#>   .. ..$ quantile:List of 2
#>   .. .. ..$ type_id:List of 2
#>   .. .. .. ..$ required: NULL
#>   .. .. .. ..$ optional: num [1:23] 0.01 0.025 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 ...
#>   .. .. ..$ value  :List of 2
#>   .. .. .. ..$ type   : chr "integer"
#>   .. .. .. ..$ minimum: int 0

Created on 2022-10-24 with reprex v2.0.2

Member Author

@annakrystalli annakrystalli Nov 2, 2022


Actually @elray1! I believe I'm the one who misunderstood here. When I was working on the JSON schema, I realised "round-1" in the complex example has two elements in model_tasks! Hence your comment is absolutely valid.

This needs a little more thought on how to handle it, so I will work on it first thing next week.
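
For illustration only, here is a minimal base R sketch of what pulling values from all model_tasks entries could look like. It is not the approach adopted in this PR: `round_cfg` is a hypothetical stand-in for one round of a `hub_connection` object, `collect_task_id_vals()` is a made-up helper name, and merging values with `unique()` is just one possible design choice.

```r
# Hypothetical stand-in for one round of a hub_connection object:
# two model_tasks entries, each defining values for "location".
round_cfg <- list(
  model_tasks = list(
    list(task_ids = list(
      location = list(required = "US", optional = c("01", "02"))
    )),
    list(task_ids = list(
      location = list(required = NULL, optional = c("02", "04"))
    ))
  )
)

# Hypothetical helper: collect the values of one task id across every
# model_tasks entry instead of only the first one.
collect_task_id_vals <- function(round_cfg, task_id, flatten = TRUE) {
  per_task <- lapply(round_cfg$model_tasks, function(mt) {
    # keep only the required/optional fields, as get_task_id_vals() does
    mt$task_ids[[task_id]][c("required", "optional")]
  })
  if (!flatten) {
    # one list(required, optional) per model_tasks entry
    return(per_task)
  }
  # merge everything into a single vector and drop duplicates
  unique(unlist(per_task, use.names = FALSE))
}

vals <- collect_task_id_vals(round_cfg, "location")
```

With `flatten = FALSE` the sketch keeps one `required`/`optional` pair per model_tasks entry, which avoids deciding whether a value that is required in one entry and optional in another should count as required overall.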


  # specifically select `required` and `optional` fields to ensure additional
  # metadata like format or units are excluded.
  out <- values[c("required", "optional")]

  # if distinction between required & optional not important,
  # flatten output into single vector
  if (flatten) {
    out <- out %>%
      unlist(use.names = FALSE)
  }

  # if present, add format and units as attributes
  out <- structure(out,
    units = values[["units"]],
    format = values[["format"]]
  )

  return(out)
}

#' Get round ids
#'
#' @inheritParams get_task_id_vals
#' @return a character vector of round ids.
#' @export
#' @family hub-metadata
#' @examples
#' con <- connect_hub(system.file("hub_1", package = "hubUtils"))
#' get_round_ids(con)
#' con <- connect_hub(system.file("scnr_hub_1", package = "hubUtils"))
#' get_round_ids(con)
get_round_ids <- function(con) {
  checkmate::assert_class(con, "hub_connection")
  attr(con, "round_ids")
}

#' Get task ids for a given round_id
#'
#' @inheritParams get_task_id_vals
#'
#' @return A character vector of task ids.
#' @export
#' @family hub-metadata
#' @examples
#' con <- connect_hub(system.file("hub_1", package = "hubUtils"))
#' get_task_ids(con)
#' con <- connect_hub(system.file("scnr_hub_1", package = "hubUtils"))
#' get_task_ids(con, round_id = "round-1")
#' get_task_ids(con, round_id = "round-2")
get_task_ids <- function(con, round_id = NULL) {
  checkmate::assert_class(con, "hub_connection")
  checkmate::assert_character(round_id, len = 1, null.ok = TRUE)
  validate_round_ids(con, round_id)

  if (attr(con, "task_ids_by_round")) {
    out <- attr(con, "task_id_names")[[round_id]]
  } else {
    out <- attr(con, "task_id_names")
  }
  out
}
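
A note on the `structure()` call at the end of `get_task_id_vals()`: the "if present" comment relies on base R dropping attributes whose value is `NULL`, so no explicit check for `units` or `format` is needed. The sketch below shows that behaviour on a hypothetical `values` list; the `"weeks"` unit is invented for the example and does not come from any hub config.

```r
# Hypothetical task id entry with no `units` or `format` metadata.
values <- list(required = NULL, optional = c("1", "2"))

out <- unlist(values[c("required", "optional")], use.names = FALSE)
out <- structure(out,
  units = values[["units"]],   # NULL, so no attribute is set
  format = values[["format"]]  # NULL, so no attribute is set
)
no_attrs <- is.null(attributes(out))

# Same entry with an invented `units` field added.
values$units <- "weeks"
out2 <- structure(
  unlist(values[c("required", "optional")], use.names = FALSE),
  units = values[["units"]],
  format = values[["format"]]
)
attr_names <- names(attributes(out2))
```

Here `no_attrs` is `TRUE` and `attr_names` contains only `"units"`. One edge case that may be worth checking: if both `required` and `optional` are `NULL` and `flatten = TRUE`, `out` is itself `NULL`, and recent versions of R warn when `structure()` is called on `NULL`.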