Br v4.0.0 #103

Merged
merged 23 commits into from
Nov 25, 2024

annakrystalli (Member) commented Oct 9, 2024

This PR:

@LucieContamin LucieContamin self-requested a review October 9, 2024 12:30
zkamvar added a commit that referenced this pull request Oct 9, 2024
We were having trouble commenting the diff in #103 and have no clue why it's not working. Maybe this will fix it?
annakrystalli (Member, Author) commented

Note: I did manage to re-indent with jq, but it also expanded every array onto multiple lines, and I couldn't find a clear way to avoid that. Instead, I've manually focused the diff on what's important in v4.0.0.

If someone has better ideas let me know!
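
One possible approach to the re-indentation problem described above, offered as a minimal sketch rather than the method used in this PR: serialize the JSON with a small custom function that indents objects but keeps arrays of scalars on a single line. The function name `dump_compact_arrays` and the sample document are hypothetical and chosen only for illustration.

```python
import json


def dump_compact_arrays(value, indent=4, level=0):
    """Serialize to indented JSON, keeping arrays of scalars on one line.

    Hypothetical helper for illustration; not part of jq or any hubverse tool.
    """
    pad = " " * (indent * level)
    inner = " " * (indent * (level + 1))
    if isinstance(value, dict):
        if not value:
            return "{}"
        items = [
            f"{inner}{json.dumps(key)}: {dump_compact_arrays(val, indent, level + 1)}"
            for key, val in value.items()
        ]
        return "{\n" + ",\n".join(items) + "\n" + pad + "}"
    if isinstance(value, list):
        # Arrays containing only scalars stay on one line.
        if all(not isinstance(item, (dict, list)) for item in value):
            return json.dumps(value)
        items = [inner + dump_compact_arrays(item, indent, level + 1) for item in value]
        return "[\n" + ",\n".join(items) + "\n" + pad + "]"
    return json.dumps(value)


# Sample fragment shaped like the schema objects in this PR (illustrative only).
doc = {
    "required": ["required", "optional"],
    "properties": {"required": {"type": "null"}},
    "examples": [{"required": None}],
}
text = dump_compact_arrays(doc)
print(text)
```

Arrays that contain objects are still expanded, so nested structure stays readable while short arrays such as `"required": ["required", "optional"]` do not add noise to a diff.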

zkamvar (Member) commented Oct 9, 2024

/diff

github-actions bot commented Oct 9, 2024

Here are your diffs for this pull request

admin-schema.json

--- v3.0.1/admin-schema.json	2024-11-19 19:40:45.814307408 +0000
+++ v4.0.0/admin-schema.json	2024-11-19 19:40:47.062298932 +0000
@@ -1,6 +1,6 @@
 {
     "$schema": "https://json-schema.org/draft/2020-12/schema",
-    "$id": "https://raw.githubusercontent.com/hubverse-org/schemas/main/v3.0.1/admin-schema.json",
+    "$id": "https://raw.githubusercontent.com/hubverse-org/schemas/main/v4.0.0/admin-schema.json",
     "title": "Schema for Modeling Hub administrative settings",
     "description": "This JSON file provides a schema for modeling hub administrative information.",
     "type": "object",
@@ -59,9 +59,11 @@
                     ]
                 },
                 "owner": {
+                    "description": "The hub repository owner (user or organisation).",
                     "type": "string"
                 },
                 "name": {
+                    "description": "The name of the hub repository.",
                     "type": "string"
                 }
             }

tasks-schema.json
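
The most substantive change in this file's diff concerns the point estimate output types (mean and median). In v3.0.1, `output_type_id` encoded required versus optional through `required` and `optional` properties, one holding the array `["NA"]` and the other null. In v4.0.0, `output_type_id` must be `{"required": null}` and a new boolean `is_required` property carries that information. A minimal sketch of converting one such config, assuming Python; `migrate_point_estimate` is a hypothetical helper written for illustration and is not part of any hubverse package:

```python
import copy


def migrate_point_estimate(v3_config):
    """Convert a v3.0.1-style mean/median config to the v4.0.0 shape.

    Hypothetical helper for illustration only. Expects both 'required'
    and 'optional' to be present, as v3.0.1 demands.
    """
    old_id = v3_config["output_type_id"]
    # v3.0.1: exactly one of required/optional is ["NA"], the other is null.
    if old_id["required"] == ["NA"] and old_id["optional"] is None:
        is_required = True
    elif old_id["required"] is None and old_id["optional"] == ["NA"]:
        is_required = False
    else:
        raise ValueError("not a valid v3.0.1 point estimate output_type_id")

    v4_config = copy.deepcopy(v3_config)  # leave the input untouched
    # v4.0.0: output_type_id has a single 'required' property set to null.
    v4_config["output_type_id"] = {"required": None}
    # v4.0.0: required/optional status moves to a boolean.
    v4_config["is_required"] = is_required
    return v4_config


# An optional mean output type in v3.0.1 form.
v3_mean = {
    "output_type_id": {"required": None, "optional": ["NA"]},
    "value": {"type": "double"},
}
v4_mean = migrate_point_estimate(v3_mean)
print(v4_mean)
```

Because v4.0.0 also adds `"additionalProperties": false` to these objects, a leftover `optional` key inside `output_type_id` would fail validation, which is why the sketch replaces the object wholesale instead of editing it in place.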

--- v3.0.1/tasks-schema.json	2024-11-19 19:40:45.814307408 +0000
+++ v4.0.0/tasks-schema.json	2024-11-19 19:40:47.062298932 +0000
@@ -1,6 +1,6 @@
 {
     "$schema": "https://json-schema.org/draft/2020-12/schema",
-    "$id": "https://raw.githubusercontent.com/hubverse-org/schemas/main/v3.0.1/tasks-schema.json",
+    "$id": "https://raw.githubusercontent.com/hubverse-org/schemas/main/v4.0.0/tasks-schema.json",
     "title": "Schema for Modeling Hub model task definitions",
     "description": "This is the schema of the tasks.json configuration file that defines the tasks within a modeling hub.",
     "type": "object",
@@ -92,7 +92,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "forecast_date": {
                                             "description": "An object containing arrays of required and optional unique forecast dates. Forecast date usually defines the date that a model is run to produce a forecast.",
@@ -136,7 +137,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "scenario_id": {
                                             "description": "An object containing arrays of required and optional unique identifiers of each valid scenario.",
@@ -194,13 +196,16 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "location": {
                                             "description": "An object containing arrays of required and optional unique identifiers for each valid location, e.g. country codes, FIPS state or county level code etc.",
                                             "examples": [
                                                 {
-                                                    "required": "US",
+                                                    "required": [
+                                                        "US"
+                                                    ],
                                                     "optional": [
                                                         "01",
                                                         "02",
@@ -284,7 +289,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "target": {
                                             "description": "An object containing arrays of required and optional unique identifiers for each valid target. Usually represents a single task ID target key variable.",
@@ -332,7 +338,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "target_variable": {
                                             "description": "An object containing arrays of required and optional unique identifiers for each valid target variable. Usually forms part of a pair of task ID target key variables (along with target_outcome) which combine to define individual targets.",
@@ -382,7 +389,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "target_outcome": {
                                             "description": "An object containing arrays of required and optional unique identifiers for each valid target outcome. Usually forms part of a pair of task ID target key variables (along with target_variable) which combine to define individual targets.",
@@ -430,7 +438,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "target_date": {
                                             "description": "An object containing arrays of required and optional unique target dates. For short-term forecasts, the target_date specifies the date of occurrence of the outcome of interest. For instance, if models are requested to forecast the number of hospitalizations that will occur on 2022-07-15, the target_date is 2022-07-15",
@@ -474,7 +483,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "target_end_date": {
                                             "description": "An object containing arrays of required and optional unique target end dates. For short-term forecasts, the target_end_date specifies the date of occurrence of the outcome of interest. For instance, if models are requested to forecast the number of hospitalizations that will occur on 2022-07-15, the target_end_date is 2022-07-15",
@@ -518,7 +528,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "horizon": {
                                             "description": "An object containing arrays of required and optional unique horizons. Horizons define the difference between the target_date and the origin_date in time units specified by the hub (e.g., may be days, weeks, or months)",
@@ -567,7 +578,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "age_group": {
                                             "type": "object",
@@ -611,7 +623,8 @@
                                             "required": [
                                                 "required",
                                                 "optional"
-                                            ]
+                                            ],
+                                            "additionalProperties": false
                                         }
                                     },
                                     "additionalProperties": {
@@ -638,7 +651,8 @@
                                         "required": [
                                             "required",
                                             "optional"
-                                        ]
+                                        ],
+                                        "additionalProperties": false
                                     }
                                 },
                                 "output_type": {
@@ -650,60 +664,23 @@
                                             "description": "Object defining the mean of the predictive distribution output type.",
                                             "properties": {
                                                 "output_type_id": {
-                                                    "description": "output_type_id is not meaningful for a mean output_type. The property is primarily used to determine whether mean is a required or optional output type through properties required and optional. If mean is a required output type, the required property must be an array containing the single string element 'NA' and the optional property must be set to null. If mean is an optional output type, the optional property must be an array containing the single string element 'NA' and the required property must be set to null",
+                                                    "description": "output_type_id is not meaningful for a point estimate output_type. Must have a single property named 'required' with the value null.",
                                                     "examples": [
                                                         {
-                                                            "required": [
-                                                                "NA"
-                                                            ],
-                                                            "optional": null
-                                                        },
-                                                        {
-                                                            "required": null,
-                                                            "optional": [
-                                                                "NA"
-                                                            ]
+                                                            "required": null
                                                         }
                                                     ],
                                                     "type": "object",
-                                                    "oneOf": [
-                                                        {
-                                                            "properties": {
-                                                                "required": {
-                                                                    "description": "When mean is required, property set to single element 'NA' array",
-                                                                    "type": "array",
-                                                                    "items": {
-                                                                        "const": "NA",
-                                                                        "maxItems": 1
-                                                                    }
-                                                                },
-                                                                "optional": {
-                                                                    "description": "When mean is required, property set to null",
-                                                                    "type": "null"
-                                                                }
-                                                            }
-                                                        },
-                                                        {
-                                                            "properties": {
-                                                                "required": {
-                                                                    "description": "When mean is optional, property set to null",
-                                                                    "type": "null"
-                                                                },
-                                                                "optional": {
-                                                                    "description": "When mean is optional, property set to single element 'NA' array",
-                                                                    "type": "array",
-                                                                    "items": {
-                                                                        "const": "NA",
-                                                                        "maxItems": 1
-                                                                    }
-                                                                }
-                                                            }
+                                                    "properties": {
+                                                        "required": {
+                                                            "description": "Not relevant for point estimate output types. Must be null.",
+                                                            "type": "null"
                                                         }
-                                                    ],
+                                                    },
                                                     "required": [
-                                                        "required",
-                                                        "optional"
-                                                    ]
+                                                        "required"
+                                                    ],
+                                                    "additionalProperties": false
                                                 },
                                                 "value": {
                                                     "type": "object",
@@ -740,77 +717,55 @@
                                                     },
                                                     "required": [
                                                         "type"
-                                                    ]
+                                                    ],
+                                                    "additionalProperties": false
+                                                },
+                                                "is_required": {
+                                                    "description": "Is output type required? When required, property should be set to 'true'. If output type is optional, set to 'false'.",
+                                                    "examples": [
+                                                        {
+                                                            "is_required": true
+                                                        },
+                                                        {
+                                                            "is_required": false
+                                                        }
+                                                    ],
+                                                    "type": "boolean"
                                                 }
                                             },
                                             "required": [
                                                 "output_type_id",
-                                                "value"
-                                            ]
+                                                "value",
+                                                "is_required"
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "median": {
                                             "type": "object",
                                             "description": "Object defining the median of the predictive distribution output type",
                                             "properties": {
                                                 "output_type_id": {
-                                                    "description": "output_type_id is not meaningful for a median output_type. The property is primarily used to determine whether median is a required or optional output type through properties required and optional. If median is a required output type, the required property must be an array containing the single string element 'NA' and the optional property must be set to null. If median is an optional output type, the optional property must be an array containing the single string element 'NA' and the required property must be set to null",
+                                                    "description": "output_type_id is not meaningful for a point estimate output_type. Must have a single property named 'required' with the value null.",
                                                     "examples": [
                                                         {
-                                                            "required": [
-                                                                "NA"
-                                                            ],
-                                                            "optional": null
-                                                        },
-                                                        {
-                                                            "required": null,
-                                                            "optional": [
-                                                                "NA"
-                                                            ]
+                                                            "required": null
                                                         }
                                                     ],
                                                     "type": "object",
-                                                    "oneOf": [
-                                                        {
-                                                            "properties": {
-                                                                "required": {
-                                                                    "description": "When median is required, property set to single element 'NA' array",
-                                                                    "type": "array",
-                                                                    "items": {
-                                                                        "const": "NA",
-                                                                        "maxItems": 1
-                                                                    }
-                                                                },
-                                                                "optional": {
-                                                                    "description": "When median is required, property set to null",
-                                                                    "type": "null"
-                                                                }
-                                                            }
-                                                        },
-                                                        {
-                                                            "properties": {
-                                                                "required": {
-                                                                    "description": "When median is optional, property set to null",
-                                                                    "type": "null"
-                                                                },
-                                                                "optional": {
-                                                                    "description": "When median is optional, property set to single element 'NA' array",
-                                                                    "type": "array",
-                                                                    "items": {
-                                                                        "const": "NA",
-                                                                        "maxItems": 1
-                                                                    }
-                                                                }
-                                                            }
+                                                    "properties": {
+                                                        "required": {
+                                                            "description": "Not relevant for point estimate output types. Must be null.",
+                                                            "type": "null"
                                                         }
-                                                    ],
+                                                    },
                                                     "required": [
-                                                        "required",
-                                                        "optional"
-                                                    ]
+                                                        "required"
+                                                    ],
+                                                    "additionalProperties": false
                                                 },
                                                 "value": {
                                                     "type": "object",
-                                                    "description": "Object defining the characteristics of valid median values",
+                                                    "description": "Object defining the characteristics of valid median values.",
                                                     "examples": [
                                                         {
                                                             "type": "double",
@@ -819,7 +774,7 @@
                                                     ],
                                                     "properties": {
                                                         "type": {
-                                                            "description": "Data type of median values",
+                                                            "description": "Data type of median values.",
                                                             "type": "string",
                                                             "enum": [
                                                                 "double",
@@ -843,34 +798,47 @@
                                                     },
                                                     "required": [
                                                         "type"
-                                                    ]
+                                                    ],
+                                                    "additionalProperties": false
+                                                },
+                                                "is_required": {
+                                                    "description": "Is output type required? When required, property should be set to 'true'. If output type is optional, set to 'false'.",
+                                                    "examples": [
+                                                        {
+                                                            "is_required": true
+                                                        },
+                                                        {
+                                                            "is_required": false
+                                                        }
+                                                    ],
+                                                    "type": "boolean"
                                                 }
                                             },
                                             "required": [
                                                 "output_type_id",
-                                                "value"
-                                            ]
+                                                "value",
+                                                "is_required"
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "quantile": {
                                             "description": "Object defining the quantiles of the predictive distribution output type.",
                                             "type": "object",
                                             "properties": {
                                                 "output_type_id": {
-                                                    "description": "Object containing required and optional arrays defining the probability levels at which quantiles of the predictive distribution will be recorded.",
+                                                    "description": "Object containing arrays of required probability levels at which quantiles of the predictive distribution will be recorded.",
                                                     "examples": [
                                                         {
                                                             "required": [
-                                                                0.25,
-                                                                0.5,
-                                                                0.75
-                                                            ],
-                                                            "optional": [
                                                                 0.1,
                                                                 0.2,
+                                                                0.25,
                                                                 0.3,
                                                                 0.4,
+                                                                0.5,
                                                                 0.6,
                                                                 0.7,
+                                                                0.75,
                                                                 0.8,
                                                                 0.9
                                                             ]
@@ -879,24 +847,8 @@
                                                     "type": "object",
                                                     "properties": {
                                                         "required": {
-                                                            "description": "Array of unique probability levels between 0 and 1 that must be present for submission to be valid. Can be null if no probability levels are required and all valid probability levels are specified in the optional property.",
-                                                            "type": [
-                                                                "array",
-                                                                "null"
-                                                            ],
-                                                            "uniqueItems": true,
-                                                            "items": {
-                                                                "type": "number",
-                                                                "minimum": 0,
-                                                                "maximum": 1
-                                                            }
-                                                        },
-                                                        "optional": {
-                                                            "description": "Array of valid but not required unique probability levels. Can be null if all probability levels are required and are specified in the required property.",
-                                                            "type": [
-                                                                "array",
-                                                                "null"
-                                                            ],
+                                                            "description": "Array of unique probability levels between 0 and 1 inclusive that must be present for submission to be valid.",
+                                                            "type": "array",
                                                             "uniqueItems": true,
                                                             "items": {
                                                                 "type": "number",
@@ -906,9 +858,9 @@
                                                         }
                                                     },
                                                     "required": [
-                                                        "required",
-                                                        "optional"
-                                                    ]
+                                                        "required"
+                                                    ],
+                                                    "additionalProperties": false
                                                 },
                                                 "value": {
                                                     "type": "object",
@@ -945,27 +897,41 @@
                                                     },
                                                     "required": [
                                                         "type"
-                                                    ]
+                                                    ],
+                                                    "additionalProperties": false
+                                                },
+                                                "is_required": {
+                                                    "description": "Is output type required? When required, property should be set to 'true'. If output type is optional, set to 'false'.",
+                                                    "examples": [
+                                                        {
+                                                            "is_required": true
+                                                        },
+                                                        {
+                                                            "is_required": false
+                                                        }
+                                                    ],
+                                                    "type": "boolean"
                                                 }
                                             },
                                             "required": [
                                                 "output_type_id",
-                                                "value"
-                                            ]
+                                                "value",
+                                                "is_required"
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "cdf": {
                                             "description": "Object defining the cumulative distribution function of the predictive distribution output type.",
                                             "type": "object",
                                             "properties": {
                                                 "output_type_id": {
-                                                    "description": "Object containing required and optional arrays defining possible values of the target variable at which values of the cumulative distribution function of the predictive distribution will be recorded. These should be listed in order from low to high.",
+                                                    "description": "Object containing required arrays defining possible values of the target variable at which values of the cumulative distribution function of the predictive distribution will be recorded. These should be listed in order from low to high.",
                                                     "examples": [
                                                         {
                                                             "required": [
                                                                 10,
                                                                 20
-                                                            ],
-                                                            "optional": null
+                                                            ]
                                                         },
                                                         {
                                                             "required": [
@@ -977,40 +943,14 @@
                                                                 "EW202245",
                                                                 "EW202246",
                                                                 "EW202247"
-                                                            ],
-                                                            "optional": null
+                                                            ]
                                                         }
                                                     ],
                                                     "type": "object",
                                                     "properties": {
                                                         "required": {
-                                                            "description": "Array of unique target values that must be present for submission to be valid. Can be null if no target values are required and all valid target values are specified in the optional property.",
-                                                            "type": [
-                                                                "array",
-                                                                "null"
-                                                            ],
-                                                            "uniqueItems": true,
-                                                            "items": {
-                                                                "oneOf": [
-                                                                    {
-                                                                        "type": [
-                                                                            "number",
-                                                                            "integer"
-                                                                        ],
-                                                                        "minimum": 0
-                                                                    },
-                                                                    {
-                                                                        "type": "string"
-                                                                    }
-                                                                ]
-                                                            }
-                                                        },
-                                                        "optional": {
-                                                            "description": "Array of valid but not required unique target values. Can be null if all target values are required and are specified in the required property.",
-                                                            "type": [
-                                                                "array",
-                                                                "null"
-                                                            ],
+                                                            "description": "Array of unique target values that must be present for submission to be valid.",
+                                                            "type": "array",
                                                             "uniqueItems": true,
                                                             "items": {
                                                                 "oneOf": [
@@ -1018,8 +958,7 @@
                                                                         "type": [
                                                                             "number",
                                                                             "integer"
-                                                                        ],
-                                                                        "minimum": 0
+                                                                        ]
                                                                     },
                                                                     {
                                                                         "type": "string"
@@ -1029,9 +968,9 @@
                                                         }
                                                     },
                                                     "required": [
-                                                        "required",
-                                                        "optional"
-                                                    ]
+                                                        "required"
+                                                    ],
+                                                    "additionalProperties": false
                                                 },
                                                 "value": {
                                                     "type": "object",
@@ -1057,24 +996,38 @@
                                                         "type",
                                                         "minimum",
                                                         "maximum"
-                                                    ]
+                                                    ],
+                                                    "additionalProperties": false
+                                                },
+                                                "is_required": {
+                                                    "description": "Is output type required? When required, property should be set to 'true'. If output type is optional, set to 'false'.",
+                                                    "examples": [
+                                                        {
+                                                            "is_required": true
+                                                        },
+                                                        {
+                                                            "is_required": false
+                                                        }
+                                                    ],
+                                                    "type": "boolean"
                                                 }
                                             },
                                             "required": [
                                                 "output_type_id",
-                                                "value"
-                                            ]
+                                                "value",
+                                                "is_required"
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "pmf": {
                                             "description": "Object defining a probability mass function for a discrete variable output type. Includes nominal, binary and ordinal variable types.",
                                             "type": "object",
                                             "properties": {
                                                 "output_type_id": {
-                                                    "description": "Object containing required and optional arrays specifying valid categories of a discrete variable. Note that for ordinal variables, the category levels should be listed in order from low to high.",
+                                                    "description": "Object containing arrays of required values specifying valid categories of a discrete variable. Note that for ordinal variables, the category levels should be listed in order from low to high.",
                                                     "examples": [
                                                         {
-                                                            "required": null,
-                                                            "optional": [
+                                                            "required": [
                                                                 "low",
                                                                 "moderate",
                                                                 "high",
@@ -1085,22 +1038,8 @@
                                                     "type": "object",
                                                     "properties": {
                                                         "required": {
-                                                            "description": "Array of unique categories of a discrete variable that must be present for submission to be valid. Can be null if no categories are required and all valid categories are specified in the optional property.",
-                                                            "type": [
-                                                                "array",
-                                                                "null"
-                                                            ],
-                                                            "uniqueItems": true,
-                                                            "items": {
-                                                                "type": "string"
-                                                            }
-                                                        },
-                                                        "optional": {
-                                                            "description": "Array of valid but not required unique categories of a discrete variable. Can be null if all categories are required and are specified in the required property.",
-                                                            "type": [
-                                                                "array",
-                                                                "null"
-                                                            ],
+                                                            "description": "Array of unique categories of a discrete variable that must be present for submission to be valid.",
+                                                            "type": "array",
                                                             "uniqueItems": true,
                                                             "items": {
                                                                 "type": "string"
@@ -1108,9 +1047,9 @@
                                                         }
                                                     },
                                                     "required": [
-                                                        "required",
-                                                        "optional"
-                                                    ]
+                                                        "required"
+                                                    ],
+                                                    "additionalProperties": false
                                                 },
                                                 "value": {
                                                     "type": "object",
@@ -1140,13 +1079,28 @@
                                                         "type",
                                                         "minimum",
                                                         "maximum"
-                                                    ]
+                                                    ],
+                                                    "additionalProperties": false
+                                                },
+                                                "is_required": {
+                                                    "description": "Is output type required? When required, property should be set to 'true'. If output type is optional, set to 'false'.",
+                                                    "examples": [
+                                                        {
+                                                            "is_required": true
+                                                        },
+                                                        {
+                                                            "is_required": false
+                                                        }
+                                                    ],
+                                                    "type": "boolean"
                                                 }
                                             },
                                             "required": [
                                                 "output_type_id",
-                                                "value"
-                                            ]
+                                                "value",
+                                                "is_required"
+                                            ],
+                                            "additionalProperties": false
                                         },
                                         "sample": {
                                             "description": "Object defining a sample output type.",
@@ -1157,7 +1111,6 @@
                                                     "examples": [
                                                         {
                                                             "output_type_id_params": {
-                                                                "is_required": true,
                                                                 "type": "integer",
                                                                 "min_samples_per_task": 100,
                                                                 "max_samples_per_task": 100
@@ -1165,7 +1118,6 @@
                                                         },
                                                         {
                                                             "output_type_id_params": {
-                                                                "is_required": false,
                                                                 "type": "character",
                                                                 "max_length": 6,
                                                                 "min_samples_per_task": 100,
@@ -1181,10 +1133,6 @@
                                                     ],
                                                     "type": "object",
                                                     "properties": {
-                                                        "is_required": {
-                                                            "description": "Boolean. Whether inclusion of samples is required for the submission to be valid",
-                                                            "type": "boolean"
-                                                        },
                                                         "type": {
                                                             "description": "Data type of sample indices.",
                                                             "type": "string",
@@ -1220,7 +1168,6 @@
                                                         }
                                                     },
                                                     "required": [
-                                                        "is_required",
                                                         "type",
                                                         "min_samples_per_task",
                                                         "max_samples_per_task"
@@ -1236,7 +1183,8 @@
                                                         "required": [
                                                             "max_length"
                                                         ]
-                                                    }
+                                                    },
+                                                    "additionalProperties": false
                                                 },
                                                 "value": {
                                                     "type": "object",
@@ -1272,13 +1220,28 @@
                                                     },
                                                     "required": [
                                                         "type"
-                                                    ]
+                                                    ],
+                                                    "additionalProperties": false
+                                                },
+                                                "is_required": {
+                                                    "description": "Is output type required? When required, property should be set to 'true'. If output type is optional, set to 'false'.",
+                                                    "examples": [
+                                                        {
+                                                            "is_required": true
+                                                        },
+                                                        {
+                                                            "is_required": false
+                                                        }
+                                                    ],
+                                                    "type": "boolean"
                                                 }
                                             },
                                             "required": [
                                                 "output_type_id_params",
-                                                "value"
-                                            ]
+                                                "value",
+                                                "is_required"
+                                            ],
+                                            "additionalProperties": false
                                         }
                                     },
                                     "additionalProperties": false
@@ -1340,7 +1303,10 @@
                                                 "type": [
                                                     "object",
                                                     "null"
-                                                ]
+                                                ],
+                                                "additionalProperties": {
+                                                    "type": "string"
+                                                }
                                             },
                                             "description": {
                                                 "description": "a verbose description of the target that might include information such as the target_measure above, or definitions of a 'rate' or similar.",
@@ -1412,7 +1378,8 @@
                                 "task_ids",
                                 "output_type",
                                 "target_metadata"
-                            ]
+                            ],
+                            "additionalProperties": false
                         }
                     },
                     "submissions_due": {
@@ -1449,7 +1416,8 @@
                                     "relative_to",
                                     "start",
                                     "end"
-                                ]
+                                ],
+                                "additionalProperties": false
                             },
                             {
                                 "properties": {
@@ -1467,7 +1435,8 @@
                                 "required": [
                                     "start",
                                     "end"
-                                ]
+                                ],
+                                "additionalProperties": false
                             }
                         ],
                         "required": [
@@ -1503,6 +1472,22 @@
                                 "arrow"
                             ]
                         }
+                    },
+                    "derived_task_ids": {
+                        "description": "Names of derived task IDs, i.e. task IDs whose values are derived from (and therefore dependent on) the values of other variables. Use this property to override the global setting for a specific round.",
+                        "examples": [
+                            [
+                                "target_end_date"
+                            ]
+                        ],
+                        "type": [
+                            "array",
+                            "null"
+                        ],
+                        "uniqueItems": true,
+                        "items": {
+                            "type": "string"
+                        }
                     }
                 },
                 "required": [
@@ -1529,6 +1514,22 @@
                 "logical",
                 "Date"
             ]
+        },
+        "derived_task_ids": {
+            "description": "Names of derived task IDs, i.e. task IDs whose values are derived from (and therefore dependent on) the values of other variables.",
+            "examples": [
+                [
+                    "target_end_date"
+                ]
+            ],
+            "type": [
+                "array",
+                "null"
+            ],
+            "uniqueItems": true,
+            "items": {
+                "type": "string"
+            }
         }
     },
     "required": [

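For anyone updating an existing config by hand, here is a rough sketch of what the `output_type_id` change in the diff above implies for the output types that use `output_type_id` (e.g. quantile, cdf and pmf). It follows the schema examples in the diff, where the old `optional` values are folded into a single non-null `required` array. The helper name `to_v4_output_type` is hypothetical, and deriving `is_required` from whether v3.0.1 required any values is an assumption, not something the schema specifies.

```python
from copy import deepcopy


def to_v4_output_type(v3_output_type):
    """Convert one v3.0.1-style output type object to the v4.0.0 shape.

    Hypothetical helper for illustration only; not part of any hubverse package.
    """
    v4 = deepcopy(v3_output_type)
    ids = v4["output_type_id"]
    required = ids.get("required") or []  # v3.0.1 allowed null here
    optional = ids.get("optional") or []  # v3.0.1 allowed null here
    # v4.0.0 keeps a single non-null "required" array and no "optional" key.
    v4["output_type_id"] = {"required": required + optional}
    # Assumption: treat the output type as required if v3.0.1 required any value.
    v4["is_required"] = bool(required)
    return v4


v3_pmf = {
    "output_type_id": {"required": None, "optional": ["low", "moderate", "high"]},
    "value": {"type": "double", "minimum": 0, "maximum": 1},
}
v4_pmf = to_v4_output_type(v3_pmf)
print(v4_pmf)
```

Simple concatenation does not re-sort the merged values, so quantile levels or cdf target values would still need ordering from low to high afterwards.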
@hubverse-org hubverse-org deleted a comment from annakrystalli Oct 9, 2024
@zkamvar
Member

zkamvar commented Oct 9, 2024

Note I did manage to re-indent with jq but it expanded all arrays to multiple rows too with no clear way how to avoid that.

I think we should just bend to the formatter. One thing we could do to avoid this is to run 3.0.1 through the formatter in this PR as well so that the diff appears the same. The formatter will not change any structural information of the JSON, just its format.

From there, we can add guidance for future additions to run it through the formatter.
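As a quick illustration of the claim that reformatting changes only the layout, here is a minimal sketch using Python's standard `json` module rather than jq (a deliberate swap, not the formatter proposed for this repo). The example document is made up.

```python
import json

# A compact, made-up fragment in the v4.0.0 style.
compact = '{"output_type_id": {"required": [0.25, 0.5, 0.75]}, "is_required": true}'

parsed = json.loads(compact)
reformatted = json.dumps(parsed, indent=4)

# Parsing the reformatted text yields the same structure as the original.
assert json.loads(reformatted) == parsed
print(reformatted)
```

Like the jq run mentioned above, `json.dumps` with `indent` places each array element on its own line.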

@LucieContamin LucieContamin left a comment

Thank you for the update. I just added a question about derived_task_ids.

Comment on lines 1480 to 1490
"derived_task_ids": {
"description": "Names of derived task IDs, i.e. task IDs whose values are derived from (and therefore dependent on) the values of other variables.",
"type": [
"array",
"null"
],
"uniqueItems": true,
"items": {
"type": "string"
}
},

I might be wrong here, but should it not be at the "round" level, since, depending on the round, you might have really different task IDs? (Or is it already?)

Member Author

I feel like it's a hub property overall in that I can't see many situations where a task ID is derived in one round and not derived in another. The chances of it being stable are far greater than the property changing by round so it would be annoying to have to re-define it again in every round.

I'll make sure however that if a new derived task ID is added to a new round and the task id name added to derived_task_ids, it will not affect validation of older rounds, i.e. if the task ID is not present in the data, derived_task_ids will just be ignored.

There is the potential to allow for a derived_task_ids property at the round level that overrides the overall property, but I think we can wait and see if anyone requests that, as it feels like an extreme edge case.

Thank you for the additional information.

You make a really good point; however, I am a bit worried about the scenario hubs. They tend to have a lot of rounds, and some rounds have different task IDs: for the needs of a specific round or scenario, you might need to create new columns that are not replicated in the next round.
I am not sure how frequently these new columns will be tagged as derived task IDs, but even so, it looks a little bit weird to me to have the derived task IDs at the hub level when task IDs are not defined at the hub level but at the round level.

Member Author

@annakrystalli annakrystalli Oct 10, 2024

Interesting. The problem though does not arise from a derived task id not being in all rounds. The problem is if, say, you create a task ID as a derived task ID and then later on change that (i.e. the task ID is no longer derived). That's the only situation where this could be problematic, but it seems to me not good practice within a hub.

Member Author

@annakrystalli annakrystalli Oct 10, 2024

I definitely don't want everyone to have to respecify this property at each round as for the majority of hubs this is very stable so we will for sure keep the overall high level specification. As mentioned, there is the option of overriding at the round level, but I do not want to implement it unless it's actually necessary.

The problem though does not arise from a derived task id not being in all rounds

I misunderstood and was afraid that would be an issue, but if not, that solves some of my problems! Apologies for that!
I agree that changing the behavior of a column is bad practice and should not be supported.

I definitely don't want everyone to have to respecify this property at each round as for the majority of hubs this is very stable so we will for sure keep the overall high level specification.

I also agree, that's too much to ask for something that might not be used often.
And the optional overriding seems like a good idea, and I think a hub can write its own wrapper/function to deal with it if necessary.

So to summarize, thank you for the additional information. I have changed my mind and I am OK with it being at the hub level!

@annakrystalli
Member Author

I think we should just bend to the formatter. One thing we could do to avoid this is to run 3.0.1 through the formatter in this PR as well so that the diff appears the same. The formatter will not change any structural information of the JSON, just its format.

Good option! See #106

From there, we can add guidance for future additions to run it through the formatter.

I opened the following issue to discuss approach: #107

@annakrystalli
Member Author

/diff

Merge branch 'main' into br-v4.0.0
@annakrystalli
Member Author

Hey @LucieContamin! I have now added a round-level property too, so you can use that to override the global property.

Glad you asked for this feature as going back to introduce it I found that the global derived_task_ids property was nested at the wrong level!! 🙈
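
A minimal sketch of the precedence discussed in this thread, written in Python purely for illustration: a round-level derived_task_ids value, when present, takes the place of the hub-level one, and any name that does not appear in the data is ignored. The helper name `resolve_derived_task_ids`, the column names, and the choice to treat "override" as full replacement rather than a merge are all assumptions made for this example; this is not the hubverse implementation.

```python
from typing import Iterable, List, Optional


def resolve_derived_task_ids(
    hub_level: Optional[List[str]],
    round_level: Optional[List[str]],
    data_columns: Iterable[str],
) -> List[str]:
    """Return the derived task IDs that apply to one round (hypothetical helper)."""
    # Assumption: a round-level value, when set, replaces the hub-level value entirely.
    configured = round_level if round_level is not None else (hub_level or [])
    # Names that are absent from the data are ignored rather than treated as errors.
    present = set(data_columns)
    return [task_id for task_id in configured if task_id in present]


# The hub-level setting applies when the round does not override it.
print(resolve_derived_task_ids(["target_end_date"], None, ["origin_date", "target_end_date"]))
# An older round without the column: the hub-level setting is simply ignored.
print(resolve_derived_task_ids(["target_end_date"], None, ["origin_date", "horizon"]))
# A round-level setting replaces the hub-level one for that round.
print(resolve_derived_task_ids(["target_end_date"], ["scenario_label"], ["scenario_label", "target_end_date"]))
```

Filtering against the data columns is what keeps older rounds valid when a new derived task ID is introduced later.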

@annakrystalli
Member Author

/diff

@zkamvar
Member

zkamvar commented Oct 16, 2024

/diff

@zkamvar
Member

zkamvar commented Oct 18, 2024

point estimates

Something that was brought up in response to reichlab/variant-nowcast-hub#117 (comment) is that the "NA" is a bit confusing because it sure looks like a character string, but when we expand the grid the output_type_id columns become NA (which is an intentional move by Ooms, described in section 2.1.1 of the jsonlite package paper).

Now that we are using is_required for point estimate types, we might be able to take this opportunity to set the required property to a single-element null array. This will have exactly the same result as the "NA" array, but with the following advantages:

  1. inter-language compatibility can be achieved since null is a concept that even JSON can understand
  2. we can clearly communicate that this should be a missing value in the data as opposed to a character string.
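
To illustrate the inter-language point, here is a small sketch using Python's standard `json` module (Python is chosen only as an example of a non-R consumer; the `optional` key simply mirrors the config fragment discussed in this comment). A parser with no special handling of the string "NA" keeps it as text, while null maps to the language's native missing value.

```python
import json

# The two candidate encodings of a point-estimate output_type_id.
as_string = json.loads('{"optional": ["NA"]}')
as_null = json.loads('{"optional": [null]}')

# "NA" arrives as an ordinary string; null arrives as Python's None.
print(type(as_string["optional"][0]).__name__)  # str
print(as_null["optional"][0] is None)           # True

# The two parsed documents therefore differ in Python.
print(as_string == as_null)                     # False
```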

This is what I think it would look like in the schema:

"required": {
    "description": "Not relevant for point estimate output types. Must be a single-element array containing null.",
    "type": "array",
    "items": {
        "const": null
    },
    "maxItems": 1
}
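
As a rough check of what this proposal is meant to accept, the intended constraints (an array, null items only, at most one item) can be sketched with the Python standard library alone. The function name `is_valid_point_required` is hypothetical, and this hand-written check stands in for a real JSON Schema validator, which the hub tooling would use instead.

```python
import json


def is_valid_point_required(value) -> bool:
    """Check a parsed `required` value against the intended constraints (hypothetical helper)."""
    # "type": "array"
    if not isinstance(value, list):
        return False
    # "maxItems": 1
    if len(value) > 1:
        return False
    # "items": {"const": null}
    return all(item is None for item in value)


for text in ['[null]', '["NA"]', '[null, null]', 'null']:
    print(text, is_valid_point_required(json.loads(text)))
```

Note that an empty array also passes, because the proposal sets no minItems; enforcing exactly one element would need that keyword as well.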

Demo

Here's a demo that shows that ["NA"] and [null] are equivalent in R: it modifies a copy of a tasks.json file and reads both versions in with jsonlite.

hub_con <- hubData::connect_hub(
  system.file("testhubs/simple", package = "hubUtils")
)
hp <- attributes(hub_con)$hub_path
# Get the tasks file which has a `mean` output type
tasks <- fs::path(hp, "hub-config", "tasks.json")

# copy the file to a new temp file
tmp <- withr::local_tempfile()
fs::file_copy(tasks, tmp, overwrite = TRUE)
# The two files are identical
unname(tools::md5sum(tmp) == tools::md5sum(tasks))
#> [1] TRUE

cfg <- readLines(tmp)
writeLines(cfg[231:240]) # optional is NA
#>                 "output_type": {
#>                     "mean": {
#>                         "output_type_id": {
#>                             "required": null,
#>                             "optional": ["NA"]
#>                         },
#>                         "value": {
#>                             "type": "integer",
#>                             "minimum": 0
#>                         }

# Change '["NA"]' to '[null]'
nullcfg <- sub('["NA"]', '[null]', cfg, fixed = TRUE)
writeLines(nullcfg[231:240]) # optional is now null
#>                 "output_type": {
#>                     "mean": {
#>                         "output_type_id": {
#>                             "required": null,
#>                             "optional": [null]
#>                         },
#>                         "value": {
#>                             "type": "integer",
#>                             "minimum": 0
#>                         }
writeLines(nullcfg, con = tmp) # changing to null

# The two files are no longer identical
unname(tools::md5sum(tmp) == tools::md5sum(tasks))
#> [1] FALSE
waldo::compare(cfg, nullcfg)
#> old[81:87] vs new[81:87]
#>   "                    \"mean\": {"
#>   "                        \"output_type_id\": {"
#>   "                            \"required\": null,"
#> - "                            \"optional\": [\"NA\"]"
#> + "                            \"optional\": [null]"
#>   "                        },"
#>   "                        \"value\": {"
#>   "                            \"type\": \"integer\","
#> 
#> old[232:238] vs new[232:238]
#>   "                    \"mean\": {"
#>   "                        \"output_type_id\": {"
#>   "                            \"required\": null,"
#> - "                            \"optional\": [\"NA\"]"
#> + "                            \"optional\": [null]"
#>   "                        },"
#>   "                        \"value\": {"
#>   "                            \"type\": \"integer\","


# The resulting R objects after reading in the JSON are identical
identical(
  jsonlite::fromJSON(tasks, simplifyDataFrame = FALSE), 
  jsonlite::fromJSON(tmp, simplifyDataFrame = FALSE)
)
#> [1] TRUE

Created on 2024-10-18 with reprex v2.1.1

@zkamvar
Member

zkamvar commented Nov 19, 2024

/diff

@annakrystalli annakrystalli merged commit 0163a89 into main Nov 25, 2024
3 checks passed