[WIP] Add Hub Connection Functionality #15

Merged
merged 23 commits into from
Jan 6, 2023

Conversation

annakrystalli
Member

@annakrystalli annakrystalli commented Oct 5, 2022

NAMESPACE Show resolved Hide resolved
round_ids <- round_ids[!invalid_round_ids_lgl]
}
} else {
round_ids <- "round_id_from_variable"
Contributor

Related to issue #22 -- I think we probably do want to be validating against (and returning) valid round id values even if the round id is derived from a variable name?

Member Author

Agree. Will change as part of resolving #22.

Member Author

Just to note that this currently works fine for non-round-varying hubs because round_id is ignored altogether and functions validate against a single vector of task_ids. The idea of #22 is to get rid of this behaviour.
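
For illustration only, the kind of round id check discussed in this thread could be sketched as below. The helper name `check_round_ids()` and the example dates are hypothetical and not part of hubUtils; the sketch assumes the valid round ids have already been extracted as a character vector.

```r
# Hypothetical set of valid round ids (e.g. origin_date values from hub metadata).
valid_round_ids <- c("2022-01-08", "2022-01-15", "2022-01-22")

# Hypothetical helper (not a hubUtils function): drop round ids that are not
# in the valid set and warn about the ones that were dropped.
check_round_ids <- function(round_ids, valid_round_ids) {
  invalid_round_ids_lgl <- !round_ids %in% valid_round_ids
  if (any(invalid_round_ids_lgl)) {
    warning(
      "Ignoring invalid round ids: ",
      paste(round_ids[invalid_round_ids_lgl], collapse = ", ")
    )
  }
  round_ids[!invalid_round_ids_lgl]
}

# One valid and one invalid round id; the warning is suppressed here only to
# keep the example output quiet.
checked <- suppressWarnings(
  check_round_ids(c("2022-01-08", "2022-03-01"), valid_round_ids)
)
```

The same check would apply whether the valid ids come from explicit round names or from the values of a round id variable.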



# extract task_id values from connection
values <- con[[round_id]]$model_tasks[[1]]$task_ids[[task_id]]
Contributor

I think that we don't want to subset to model_tasks[[1]] here. We may want to pull the values from all model_tasks entries.

Member Author

Could you clarify what you mean by:

We may want to pull the values from all model_tasks entries.

In the current hubmeta JSON files, the model_tasks element contains a list consisting of a single unnamed element, which in turn contains two named list elements, task_ids & output_types (see below).

Hence I'm using [[1]] to access the underlying task_ids & output_types. Perhaps this superfluous unnamed list element shouldn't be there?

library(hubUtils)
con <- connect_hub(system.file("hub_1", package = "hubUtils")) 
round_id <- "round_id_from_variable"
con[[round_id]]$model_tasks[[1]]
#> $task_ids
#> $task_ids$origin_date
#> $task_ids$origin_date$required
#> NULL
#> 
#> $task_ids$origin_date$optional
#>  [1] "2022-01-08" "2022-01-15" "2022-01-22" "2022-01-29" "2022-02-05"
#>  [6] "2022-02-12" "2022-02-19" "2022-02-26" "2022-03-05" "2022-03-12"
#> [11] "2022-03-19" "2022-03-26" "2022-04-02" "2022-04-09" "2022-04-16"
#> [16] "2022-04-23" "2022-04-30" "2022-05-07" "2022-05-14" "2022-05-21"
#> [21] "2022-05-28" "2022-06-04" "2022-06-11" "2022-06-18"
#> 
#> 
#> $task_ids$location
#> $task_ids$location$required
#> NULL
#> 
#> $task_ids$location$optional
#>  [1] "01" "02" "04" "05" "06" "08" "09" "10" "11" "12" "13" "15" "16" "17" "18"
#> [16] "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33"
#> [31] "34" "35" "36" "37" "38" "39" "40" "41" "42" "44" "45" "46" "47" "48" "49"
#> [46] "50" "51" "53" "54" "55" "56" "72" "78" "US"
#> 
#> 
#> $task_ids$horizon
#> $task_ids$horizon$required
#> NULL
#> 
#> $task_ids$horizon$optional
#> [1] 1 2 3 4
#> 
#> 
#> 
#> $output_types
#> $output_types$mean
#> $output_types$mean$type_id
#> $output_types$mean$type_id$required
#> NULL
#> 
#> $output_types$mean$type_id$optional
#> [1] NA
#> 
#> 
#> $output_types$mean$value
#> $output_types$mean$value$type
#> [1] "integer"
#> 
#> $output_types$mean$value$minimum
#> [1] 0
#> 
#> 
#> 
#> $output_types$quantile
#> $output_types$quantile$type_id
#> $output_types$quantile$type_id$required
#> NULL
#> 
#> $output_types$quantile$type_id$optional
#>  [1] 0.010 0.025 0.050 0.100 0.150 0.200 0.250 0.300 0.350 0.400 0.450 0.500
#> [13] 0.550 0.600 0.650 0.700 0.750 0.800 0.850 0.900 0.950 0.975 0.990
#> 
#> 
#> $output_types$quantile$value
#> $output_types$quantile$value$type
#> [1] "integer"
#> 
#> $output_types$quantile$value$minimum
#> [1] 0

str(con[[round_id]]$model_tasks[[1]])
#> List of 2
#>  $ task_ids    :List of 3
#>   ..$ origin_date:List of 2
#>   .. ..$ required: NULL
#>   .. ..$ optional: chr [1:24] "2022-01-08" "2022-01-15" "2022-01-22" "2022-01-29" ...
#>   ..$ location   :List of 2
#>   .. ..$ required: NULL
#>   .. ..$ optional: chr [1:54] "01" "02" "04" "05" ...
#>   ..$ horizon    :List of 2
#>   .. ..$ required: NULL
#>   .. ..$ optional: int [1:4] 1 2 3 4
#>  $ output_types:List of 2
#>   ..$ mean    :List of 2
#>   .. ..$ type_id:List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: logi NA
#>   .. ..$ value  :List of 2
#>   .. .. ..$ type   : chr "integer"
#>   .. .. ..$ minimum: int 0
#>   ..$ quantile:List of 2
#>   .. ..$ type_id:List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: num [1:23] 0.01 0.025 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 ...
#>   .. ..$ value  :List of 2
#>   .. .. ..$ type   : chr "integer"
#>   .. .. ..$ minimum: int 0
str(con[[round_id]]$model_tasks)
#> List of 1
#>  $ :List of 2
#>   ..$ task_ids    :List of 3
#>   .. ..$ origin_date:List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: chr [1:24] "2022-01-08" "2022-01-15" "2022-01-22" "2022-01-29" ...
#>   .. ..$ location   :List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: chr [1:54] "01" "02" "04" "05" ...
#>   .. ..$ horizon    :List of 2
#>   .. .. ..$ required: NULL
#>   .. .. ..$ optional: int [1:4] 1 2 3 4
#>   ..$ output_types:List of 2
#>   .. ..$ mean    :List of 2
#>   .. .. ..$ type_id:List of 2
#>   .. .. .. ..$ required: NULL
#>   .. .. .. ..$ optional: logi NA
#>   .. .. ..$ value  :List of 2
#>   .. .. .. ..$ type   : chr "integer"
#>   .. .. .. ..$ minimum: int 0
#>   .. ..$ quantile:List of 2
#>   .. .. ..$ type_id:List of 2
#>   .. .. .. ..$ required: NULL
#>   .. .. .. ..$ optional: num [1:23] 0.01 0.025 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 ...
#>   .. .. ..$ value  :List of 2
#>   .. .. .. ..$ type   : chr "integer"
#>   .. .. .. ..$ minimum: int 0

Created on 2022-10-24 with reprex v2.0.2

Member Author

@annakrystalli annakrystalli Nov 2, 2022

Actually @elray1, I believe I am the one who misunderstood here. When I was working on the JSON schema, I realised that "round-1" in the complex example has two elements in model_tasks! Hence your comment is absolutely valid.

This needs a little more thought on how to handle it, so I will work on it first thing next week.
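
As a rough sketch of what pulling values from all model_tasks entries might look like (not the implementation adopted in this PR): the helper `collect_task_id_values()` and the two-entry `round_config` list are hypothetical, built only to mirror the structure printed earlier in this thread.

```r
# Hypothetical stand-in for con[[round_id]] with two model_tasks entries.
round_config <- list(
  model_tasks = list(
    list(task_ids = list(
      location = list(required = NULL, optional = c("01", "02")),
      horizon  = list(required = NULL, optional = 1:2)
    )),
    list(task_ids = list(
      location = list(required = "US", optional = NULL),
      horizon  = list(required = NULL, optional = 1:4)
    ))
  )
)

# Hypothetical helper (not a hubUtils function): gather the required and
# optional values of one task_id across every model_tasks entry.
collect_task_id_values <- function(round_config, task_id) {
  per_entry <- lapply(round_config$model_tasks, function(model_task) {
    spec <- model_task$task_ids[[task_id]]
    c(spec$required, spec$optional)
  })
  # Flatten the per-entry vectors and drop values repeated across entries.
  unique(unlist(per_entry))
}

location_values <- collect_task_id_values(round_config, "location")
horizon_values <- collect_task_id_values(round_config, "horizon")
```

Taking the union is only one option; whether values should instead be kept separate per model_tasks entry is exactly the open design question here.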

R/hub-connection.R Outdated Show resolved Hide resolved
R/hub-connection.R Outdated Show resolved Hide resolved
x,
"round_id_from_variable",
"model_tasks",
1,
Contributor

Same comment as elsewhere: I think we don't want to pull only the entries at index 1.

Member Author

See comment above

) %>%
names()

round_ids <- x[["round_id_from_variable"]][["round_id_variable"]]
Contributor

I'm missing something about how this works. e.g., I was thinking that we'd need to do something like x[["round_id_from_variable"]]...[[x[["round_id_variable"]]]]?

Member Author

Here we're extracting the variable specified in round_id_variable and assigning it to the round_ids attribute. In non-round-varying hubs, all metadata are contained in the top-level round_id_from_variable element, from which we then select round_id_variable.

In all honesty, at the minute, this is largely ignored when validating non-round-varying hubs (as discussed above and in #22), but once #22 is implemented, the idea is that it will play a role in validation.

library(hubUtils)
x <- read_hubmeta(system.file("hub_1/hubmeta.json",
                              package = "hubUtils"))
str(x)
#> List of 1
#>  $ round_id_from_variable:List of 3
#>   ..$ model_tasks      :List of 1
#>   .. ..$ :List of 2
#>   .. .. ..$ task_ids    :List of 3
#>   .. .. .. ..$ origin_date:List of 2
#>   .. .. .. .. ..$ required: NULL
#>   .. .. .. .. ..$ optional: chr [1:24] "2022-01-08" "2022-01-15" "2022-01-22" "2022-01-29" ...
#>   .. .. .. ..$ location   :List of 2
#>   .. .. .. .. ..$ required: NULL
#>   .. .. .. .. ..$ optional: chr [1:54] "01" "02" "04" "05" ...
#>   .. .. .. ..$ horizon    :List of 2
#>   .. .. .. .. ..$ required: NULL
#>   .. .. .. .. ..$ optional: int [1:4] 1 2 3 4
#>   .. .. ..$ output_types:List of 2
#>   .. .. .. ..$ mean    :List of 2
#>   .. .. .. .. ..$ type_id:List of 2
#>   .. .. .. .. .. ..$ required: NULL
#>   .. .. .. .. .. ..$ optional: logi NA
#>   .. .. .. .. ..$ value  :List of 2
#>   .. .. .. .. .. ..$ type   : chr "integer"
#>   .. .. .. .. .. ..$ minimum: int 0
#>   .. .. .. ..$ quantile:List of 2
#>   .. .. .. .. ..$ type_id:List of 2
#>   .. .. .. .. .. ..$ required: NULL
#>   .. .. .. .. .. ..$ optional: num [1:23] 0.01 0.025 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 ...
#>   .. .. .. .. ..$ value  :List of 2
#>   .. .. .. .. .. ..$ type   : chr "integer"
#>   .. .. .. .. .. ..$ minimum: int 0
#>   ..$ round_id_variable: chr "origin_date"
#>   ..$ submissions_due  :List of 3
#>   .. ..$ relative_to: chr "origin_date"
#>   .. ..$ start      : int -4
#>   .. ..$ end        : int 2
x[["round_id_from_variable"]][["round_id_variable"]]
#> [1] "origin_date"

Created on 2022-10-24 with reprex v2.0.2
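
To make the two-step lookup raised in this thread concrete, here is a minimal sketch: first read the name stored in round_id_variable, then use that name to pull the corresponding values out of task_ids. The list `x` below is a trimmed, hand-built stand-in for the `read_hubmeta()` output above, and the variable names `round_id_variable` and `round_id_values` are illustrative assumptions.

```r
# Hand-built, trimmed stand-in for the read_hubmeta() result shown above.
x <- list(
  round_id_from_variable = list(
    model_tasks = list(
      list(task_ids = list(
        origin_date = list(required = NULL,
                           optional = c("2022-01-08", "2022-01-15")),
        horizon     = list(required = NULL, optional = 1:4)
      ))
    ),
    round_id_variable = "origin_date"
  )
)

meta <- x[["round_id_from_variable"]]

# Step 1: the NAME of the task id variable that defines rounds.
round_id_variable <- meta[["round_id_variable"]]

# Step 2: the VALUES of that variable, gathered over all model_tasks entries.
round_id_values <- unique(unlist(
  lapply(meta[["model_tasks"]], function(model_task) {
    spec <- model_task[["task_ids"]][[round_id_variable]]
    c(spec[["required"]], spec[["optional"]])
  })
))
```

Step 1 alone yields only the variable name; it is step 2 that produces values a round id could be validated against.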

Contributor

@elray1 elray1 Nov 4, 2022

OK, thanks. Noting that this may end up getting re-worked depending on where we head with the discussion here. Should we resolve this conversation though?

R/hub-connection.R Show resolved Hide resolved
@annakrystalli annakrystalli merged commit 7a1fc39 into main Jan 6, 2023
annakrystalli added a commit that referenced this pull request Jan 6, 2023
…idate

[WIP] Add json validation functionality. Integrate #15
@annakrystalli annakrystalli deleted the hub-connect branch June 7, 2023 15:47