Merge pull request #372 from dfe-analytical-services/development
Development
Hanco20 authored Jun 20, 2024
2 parents a3e8fcb + 0fe1f2a commit b3117b8
Showing 28 changed files with 1,834 additions and 2,074 deletions.
1 change: 1 addition & 0 deletions .github/workflows/deploy-shiny.yaml
@@ -29,6 +29,7 @@ jobs:

- uses: r-lib/actions/setup-r@v2
with:
r-version: 4.4.0
use-public-rspm: true

- name: Set env vars (dev)
Binary file modified Data/AppData/CoreIndicators.xlsx
Binary file not shown.
88 changes: 37 additions & 51 deletions DeveloperGuide/LocalSkillsDeveloperGuide.md
@@ -1,13 +1,14 @@
# Local skills developer guide


- [Introduction](#introduction)
- [I want to see the dashboard](#i-want-to-see-the-dashboard)
- [I want to run the dashboard in R](#i-want-to-run-the-dashboard-in-r)
- [I want to update a dataset](#i-want-to-update-a-dataset)
- [I want to add a metric to the
dashboard](#i-want-to-add-a-metric-to-the-dashboard)
- [It’s a metric that already exists in the
data.](#its-a-metric-that-already-exists-in-the-data.)
data.](#its-a-metric-that-already-exists-in-the-data)
- [It’s a brand new metric](#its-a-brand-new-metric)
- [I want to monitor dashboard
usage](#i-want-to-monitor-dashboard-usage)
@@ -146,23 +147,14 @@ should now be empty.

*For Nomis Datasets:*

You should just be able to skip to stage (3) (Run ExtractLoadData.R) -
this pulls all the latest Nomis data. However, please take note that
the highest qualifications data currently includes data from two timeseries
(NVQ and RQF), as RQF replaced NVQ in January 2022. The code for this section
will need to be updated at each release (yearly) to use all available RQF data,
and NVQ data for the remaining quarters. Once there have been five releases of
the RQF data the code can be changed to use only the five latest releases of
RQF data.

3. Run the ExtractLoadData.R script. This goes through all the numbered
folders in /Data and loads the datasets in each folder as a R
object.

`source("~/RProjects/lsip_dashboard/ExtractLoadData.R", echo=TRUE)`

This takes about 20 mins. You should see your environment is now
populated with data, including your new, updated dataset.
You should just be able to skip to stage (4) as ExtractLoadData.R in
step (5) will pull all the latest Nomis data. However, please take note
that the highest qualifications data currently includes data from two
timeseries (NVQ and RQF), as RQF replaced NVQ in January 2022. The code
for this section will need to be updated at each release (yearly) to use
all available RQF data, and NVQ data for the remaining quarters. Once
there have been five releases of the RQF data the code can be changed to
use only the five latest releases of RQF data.

4. Dashboard admin:

@@ -181,16 +173,14 @@ populated with data, including your new, updated dataset.
the /Data/3-2_dataText/dataText.xlsx file as well. These are used to
populate dynamic text on the dashboard’s Local Skills tab.

<!-- -->

5. We have datasets from a wide range of sources that come in all kinds
of formats. We do a lot of data cleaning, manipulating and
formatting to get into a format the dashboard can work with. We do
this in the TransformData.R script. If your new datset is in the
same (or even pretty close) format as the old dataset (as is
hopefully the case), you just need run this script:
5. Run the ExtractLoadData.R script. This runs all the scripts in
/importData, goes through all the numbered folders in /Data and
loads the datasets in each folder as an R object. Because our
datasets come from a wide range of sources in all kinds of formats,
the scripts do a lot of cleaning, manipulating and formatting to get
them into a format the dashboard can work with.

`source("~/RProjects/lsip_dashboard/TransformData.R", echo=TRUE)`
`source("~/RProjects/lsip_dashboard/ExtractLoadData.R", echo=TRUE)`

This takes about 20 mins. You should see your environment is populated
with new, clean datasets and your Data/AppData folder has been updated
@@ -255,16 +245,16 @@ being shown in the dashboard:

To add one of these into the dashboard:

1. Remove the metric from the unused list in TransformData.R under the
“4.1 Unused metrics” header.
1. Remove the metric from the unused list in importData/combineData.R
under the “4.1 Unused metrics” header.

2. Add your metric into metricChoices in global.R and assign it a more
user friendly name for use in the dashboard.

3. Add a row into /Data/3-2_dataText/dataText.xlsx for your metric
assigning it the relevant text.

4. Run TransformData.R
4. Run ExtractLoadData.R

5. Run `runApp()` and check the metric is available in the Local Skills
tab.
@@ -303,12 +293,12 @@ To add one of these into the dashboard:
charts ie percentage vs vol. If the problem is in the time chart or
map, check server.R where there are a few if statements to assign a
metric to either % or vol. If the problem is in the breakdown chart
check the server and TransformData where some metrics are split into
check the server and combineData.R where some metrics are split into
proportions and some are not.

- If you want it to appear in the Data Explorer tab, remove it from the
exclusion list in TransformData.r/4.4 C_dataHub and give it a good
name in there as well.
exclusion list in combineData.R/4.4 C_dataHub and give it a good name
in there as well.

- You may want to add to the Overview page. If so, add to ui.R wherever
you want and create KPIs and charts in server.R using the following
@@ -342,19 +332,15 @@ To add one of these into the dashboard:
4. Add some overview information about the dataset on the User
Guide page of the dashboard. You can do this in ui.R/2.1.2 Contents

4. Add some data extraction code to ExtractLoadData.R (copy one of the
examples in there) directing to your new folder and giving your data
a name like I_datasetName.

5. Run ExtractLoadData.R. You should now have your raw data in your
local environment.

6. You then need to format your data so that it matches the form the
dashboard is prepared for. This code is written in TransformData.R.
4. Add a new script to /importData that extracts and cleans the data,
and add the script to the run list in ExtractLoadData.R (copy one of
the examples in there). The script should point to your new folder,
import the data, and format it so that it matches the form the
dashboard is prepared for.

a\. Firstly clean the raw data so it is a usable format with
headings and data rows. There are some functions within
TransformData.R that might help (formatNomis cleans Nomis data,
headings and data rows. There are some functions within the
functions folder that might help (formatNomis cleans Nomis data,
formatLong puts data into long format)

b\. Ensure you have data for all the geographic areas used in the
@@ -382,17 +368,17 @@ To add one of these into the dashboard:
| value | The value of the metric in this subgroup in numeric terms. NA if suppressed. | numeric | 5000 |

7. Give the now clean and formatted dataset a name like C_dataName and
add into TransformData.R/3. Combine datasets.
add it to the importData/combineData.R file.

8. Run TransformData.R. You should see some new files in your local
8. Run ExtractLoadData.R. You should see some new files in your local
environment and the data in Data/AppData will have updated to
include your new metric.

9. Add your metric into metricChoices in global.R and assign it a more
user friendly name for use in the dashboard.

10. Give your metric a good name for the Data Explorer tab in
TransformData.r/4.4 C_dataHub.
combineData.R/4.4 C_dataHub.

11. Run the app `runApp()`.
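
The target format from step (c) can be sketched as follows. This is only an illustration under stated assumptions: the input data frame `raw`, its values, and the `"*"` suppression marker are hypothetical, and only the columns visible in the table above are built (the real format has further columns).

```r
# Hypothetical cleaned extract; names and values are made up for illustration.
raw <- data.frame(
  area      = c("Black Country", "Black Country", "Black Country"),
  geography = c("LEP", "LEP", "LEP"),
  breakdown = c("Total", "Age", "Age"),
  count     = c("5000", "3000", "*"), # "*" is an assumed suppression marker
  stringsAsFactors = FALSE
)

# Build a subset of the documented columns (not the full dashboard format).
C_exampleData <- data.frame(
  geogConcat = paste(raw$area, raw$geography), # area name and geography
  metric     = "inemployment",
  breakdown  = raw$breakdown,
  valueText  = raw$count,                      # keeps the source suppression
  # Non-numeric (suppressed) entries become NA in the numeric column.
  value      = suppressWarnings(as.numeric(raw$count)),
  stringsAsFactors = FALSE
)

# Every metric must have a "Total" row as well as any breakdowns.
stopifnot("Total" %in% C_exampleData$breakdown)
```

Keeping `valueText` as character alongside a numeric `value` means the suppression marker survives for display while charts can work from the numeric column.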

@@ -433,7 +419,7 @@ To add one of these into the dashboard:
charts ie percentage vs vol. If the problem is in the time chart or
map, check server.R where there are a few if statements to assign a
metric to either % or vol. If the problem is in the breakdown chart
check the server and TransformData where some metrics are split into
check the server and combineData.R where some metrics are split into
proportions and some are not.

- You may want to add to the Overview page. If so, add to ui.R wherever
@@ -477,9 +463,9 @@ Another source of problem is changing geographical areas:
latest data on <https://geoportal.statistics.gov.uk/>)

- area name changes. These can be random eg typos, or genuine updates to
names. In transformData.R there are a number of bits of correcting
code to allign names. We attempt to use codes where possible but some
data sets do not have them.
names. In the /importData files there are a number of bits of
correcting code to align names. We attempt to use codes where
possible but some data sets do not have them.

- some data sets use the boundaries at the time in their historical
data. Some project the current boundaries back. In the dashboard we
42 changes: 15 additions & 27 deletions DeveloperGuide/LocalSkillsDeveloperGuide.qmd
@@ -104,13 +104,7 @@ These are the common release points, however be aware that there may be revision

*For Nomis Datasets:*

You should just be able to skip to stage (3) (Run ExtractLoadData.R) - this pulls all the latest Nomis data. However, please take note that the highest qualifications data currently includes data from two timeseries (NVQ and RQF), as RQF replaced NVQ in January 2022. The code for this section will need to be updated at each release (yearly) to use all available RQF data, and NVQ data for the remaining quarters. Once there have been five releases of the RQF data the code can be changed to use only the five latest releases of RQF data.

3. Run the ExtractLoadData.R script. This goes through all the numbered folders in /Data and loads the datasets in each folder as a R object.

`source("~/RProjects/lsip_dashboard/ExtractLoadData.R", echo=TRUE)`

This takes about 20 mins. You should see your environment is now populated with data, including your new, updated dataset.
You should just be able to skip to stage (4) as ExtractLoadData.R in step (5) will pull all the latest Nomis data. However, please take note that the highest qualifications data currently includes data from two timeseries (NVQ and RQF), as RQF replaced NVQ in January 2022. The code for this section will need to be updated at each release (yearly) to use all available RQF data, and NVQ data for the remaining quarters. Once there have been five releases of the RQF data the code can be changed to use only the five latest releases of RQF data.

4. Dashboard admin:

@@ -122,11 +116,9 @@ This takes about 20 mins. You should see your environment is now populated with

- You may need to update the Latest Period and/or the data caveats in the /Data/3-2_dataText/dataText.xlsx file as well. These are used to populate dynamic text on the dashboard's Local Skills tab.

<!-- -->
5. Run the ExtractLoadData.R script. This runs all the scripts in /importData, goes through all the numbered folders in /Data and loads the datasets in each folder as an R object. Because our datasets come from a wide range of sources in all kinds of formats, the scripts do a lot of cleaning, manipulating and formatting to get them into a format the dashboard can work with.

5. We have datasets from a wide range of sources that come in all kinds of formats. We do a lot of data cleaning, manipulating and formatting to get into a format the dashboard can work with. We do this in the TransformData.R script. If your new datset is in the same (or even pretty close) format as the old dataset (as is hopefully the case), you just need run this script:

`source("~/RProjects/lsip_dashboard/TransformData.R", echo=TRUE)`
`source("~/RProjects/lsip_dashboard/ExtractLoadData.R", echo=TRUE)`

This takes about 20 mins. You should see your environment is populated with new, clean datasets and your Data/AppData folder has been updated with the updated versions of the files the dashboard uses.
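
For the Nomis note above on highest qualifications, the "all available RQF data, topped up with NVQ data" rule might look roughly like the sketch below. The data frames, period labels, and the five-period window are assumptions made for illustration; the real import script may structure this differently.

```r
# Assumed window: the note implies five releases are used in total.
nPeriods <- 5

# Hypothetical stand-ins for the two Nomis timeseries.
nvq <- data.frame(period = c("2018", "2019", "2020", "2021"),
                  source = "NVQ", stringsAsFactors = FALSE)
rqf <- data.frame(period = c("2022", "2023"),
                  source = "RQF", stringsAsFactors = FALSE)

# Take every available RQF period, capped at the window size.
rqfLatest <- tail(rqf[order(rqf$period), ], nPeriods)

# Fill whatever is left of the window with the most recent NVQ periods.
nNvqNeeded <- max(nPeriods - nrow(rqfLatest), 0)
nvqFill <- tail(nvq[order(nvq$period), ], nNvqNeeded)

combined <- rbind(nvqFill, rqfLatest)
```

Once `rqf` holds five or more periods, `nNvqNeeded` drops to zero and the result contains RQF data only, which matches the point at which the note says the NVQ handling can be removed.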

@@ -161,13 +153,13 @@ We currently have a fair few metrics that exist in the data but aren't being sho

To add one of these into the dashboard:

1. Remove the metric from the unused list in TransformData.R under the "4.1 Unused metrics" header.
1. Remove the metric from the unused list in importData/combineData.R under the "4.1 Unused metrics" header.

2. Add your metric into metricChoices in global.R and assign it a more user friendly name for use in the dashboard.

3. Add a row into /Data/3-2_dataText/dataText.xlsx for your metric assigning it the relevant text.

4. Run TransformData.R
4. Run ExtractLoadData.R

5. Run `runApp()` and check the metric is available in the Local Skills tab.

@@ -186,9 +178,9 @@ To add one of these into the dashboard:

**Optional extras/troubleshooting**

- You might need to make some adjustments to how it is presented in the charts ie percentage vs vol. If the problem is in the time chart or map, check server.R where there are a few if statements to assign a metric to either % or vol. If the problem is in the breakdown chart check the server and TransformData where some metrics are split into proportions and some are not.
- You might need to make some adjustments to how it is presented in the charts ie percentage vs vol. If the problem is in the time chart or map, check server.R where there are a few if statements to assign a metric to either % or vol. If the problem is in the breakdown chart check the server and combineData.R where some metrics are split into proportions and some are not.

- If you want it to appear in the Data Explorer tab, remove it from the exclusion list in TransformData.r/4.4 C_dataHub and give it a good name in there as well.
- If you want it to appear in the Data Explorer tab, remove it from the exclusion list in combineData.R/4.4 C_dataHub and give it a good name in there as well.

- You may want to add to the Overview page. If so, add to ui.R wherever you want and create KPIs and charts in server.R using the following functions: createOverviewKPI, createOverviewChart, renderOverviewChart.

@@ -208,13 +200,9 @@ To add one of these into the dashboard:

d. Add some overview information about the dataset on the User Guide page of the dashboard. You can do this in ui.R/2.1.2 Contents

4. Add some data extraction code to ExtractLoadData.R (copy one of the examples in there) directing to your new folder and giving your data a name like I_datasetName.

5. Run ExtractLoadData.R. You should now have your raw data in your local environment.

6. You then need to format your data so that it matches the form the dashboard is prepared for. This code is written in TransformData.R.
4. Add a new script to /importData that extracts and cleans the data, and add the script to the run list in ExtractLoadData.R (copy one of the examples in there). The script should point to your new folder, import the data, and format it so that it matches the form the dashboard is prepared for.

a\. Firstly clean the raw data so it is a usable format with headings and data rows. There are some functions within TransformData.R that might help (formatNomis cleans Nomis data, formatLong puts data into long format)
a\. Firstly clean the raw data so it is a usable format with headings and data rows. There are some functions within the functions folder that might help (formatNomis cleans Nomis data, formatLong puts data into long format)

b\. Ensure you have data for all the geographic areas used in the dashboard: LADUs, LSIPs, LEPs, MCAs, and national.

@@ -223,7 +211,7 @@ To add one of these into the dashboard:
c\. You then need to manipulate your now clean data into the following form:

| Column name | Description | Format | Example |
|------------------|-------------------|------------------|------------------|
|-------------|------------------------------------------------------------------------------------------------------------------------|-----------|-------------------|
| geogConcat | Area name and geography | character | Black Country LEP |
| metric | Variable of interest | character | inemployment |
| breakdown | *If* the dataset has a breakdown they are listed here. Every metric **must** have a "Total" as well as any breakdowns. | character | Age |
@@ -234,13 +222,13 @@ To add one of these into the dashboard:
| valueText | The value of the metric in this subgroup. Includes any suppression from the source. | character | 5000 |
| value | The value of the metric in this subgroup in numeric terms. NA if suppressed. | numeric | 5000 |

7. Give the now clean and formatted dataset a name like C_dataName and add into TransformData.R/3. Combine datasets.
7. Give the now clean and formatted dataset a name like C_dataName and add it to the importData/combineData.R file.

8. Run TransformData.R. You should see some new files in your local environment and the data in Data/AppData will have updated to include your new metric.
8. Run ExtractLoadData.R. You should see some new files in your local environment and the data in Data/AppData will have updated to include your new metric.

9. Add your metric into metricChoices in global.R and assign it a more user friendly name for use in the dashboard.

10. Give your metric a good name for the Data Explorer tab in TransformData.r/4.4 C_dataHub.
10. Give your metric a good name for the Data Explorer tab in combineData.R/4.4 C_dataHub.

11. Run the app `runApp()`.

@@ -261,7 +249,7 @@ To add one of these into the dashboard:

**Optional extras/troubleshooting**

- You might need to make some adjustments to how it is presented in the charts ie percentage vs vol. If the problem is in the time chart or map, check server.R where there are a few if statements to assign a metric to either % or vol. If the problem is in the breakdown chart check the server and TransformData where some metrics are split into proportions and some are not.
- You might need to make some adjustments to how it is presented in the charts ie percentage vs vol. If the problem is in the time chart or map, check server.R where there are a few if statements to assign a metric to either % or vol. If the problem is in the breakdown chart check the server and combineData.R where some metrics are split into proportions and some are not.

- You may want to add to the Overview page. If so, add to ui.R wherever you want and create KPIs and charts in server.R using the following functions: createOverviewKPI, createOverviewChart, renderOverviewChart

@@ -289,7 +277,7 @@ Another source of problem is changing geographical areas:

- updating boundaries (pages like this are good to keep an eye on <https://en.wikipedia.org/wiki/2019%E2%80%932023_structural_changes_to_local_government_in_England> but changes to LSIPs, MCAs and LEPs you are better off looking at the latest data on <https://geoportal.statistics.gov.uk/>)

- area name changes. These can be random eg typos, or genuine updates to names. In transformData.R there are a number of bits of correcting code to allign names. We attempt to use codes where possible but some data sets do not have them.
- area name changes. These can be random eg typos, or genuine updates to names. In the /importData files there are a number of bits of correcting code to align names. We attempt to use codes where possible but some data sets do not have them.

- some data sets use the boundaries at the time in their historical data. Some project the current boundaries back. In the dashboard we apply the latest boundaries to all files.
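
The kind of name-correcting code mentioned above could be sketched like this. The lookup `nameCorrections` and the helper `alignAreaNames` are hypothetical names used for illustration, not functions from the repository, and the misspellings are invented.

```r
# Hypothetical lookup: known spelling variants -> the name the dashboard uses.
nameCorrections <- c(
  "Blackcountry"       = "Black Country",
  "Black Country, The" = "Black Country"
)

# Trim stray whitespace, then replace any name found in the lookup.
alignAreaNames <- function(areaNames, corrections) {
  cleaned <- trimws(areaNames)
  matched <- cleaned %in% names(corrections)
  cleaned[matched] <- unname(corrections[cleaned[matched]])
  cleaned
}

alignedNames <- alignAreaNames(
  c(" Blackcountry", "Black Country", "Black Country, The"),
  nameCorrections
)
```

A lookup like this only covers variants someone has already spotted, which is why matching on area codes is preferable wherever the source data provides them.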
