Commit

December 2024 (#1029)
* Move `cli` warning messages before `return()` (#1016)

* reorder episode file functions cli messages

* reorder indiv file functions cli messages

* Refine cli messages

* reorder cli messages

* Style code

---------

Co-authored-by: Jennit07 <[email protected]>

* Update check_year_valid.R (#1017)

Co-authored-by: Jennit07 <[email protected]>

* Remove person_id. Matched in later process

* Remove redundant #TODO comments

* remove redundant #TODO comments

* Update news - sep release date

* Write temp data (#1014)

* add_test_to_filename and write_temp_data function

* Update documentation

* remove test_mode default

* Update documentation

* add read_temp_data

* Style code

* Update documentation

* change test_mode to write_temp_to_disk and add clean temp function

* Update documentation

* Style code

* Style code

* Include extra temp file

---------

Co-authored-by: lizihao-anu <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennifer Thom <[email protected]>

* sequence writing tests to excel (#1013)

* sequence writing tests to excel

* Style code

* minor changes

* fix bugs in anonymous function

---------

Co-authored-by: lizihao-anu <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* Sc latest quarter (#1012)

* prompt warnings of latest period when reading in sc data

* Style code

* add warning to show latest quarter for sc client

* Style code

* Update R/read_lookup_sc_client.R

Co-authored-by: Jennit07 <[email protected]>

* Update R/read_lookup_sc_demographics.R

Co-authored-by: Jennit07 <[email protected]>

* Update R/read_sc_all_alarms_telecare.R

Co-authored-by: Jennit07 <[email protected]>

* Update R/read_sc_all_care_home.R

Co-authored-by: Jennit07 <[email protected]>

* Update R/read_sc_all_home_care.R

Co-authored-by: Jennit07 <[email protected]>

* Update R/read_sc_all_sds.R

Co-authored-by: Jennit07 <[email protected]>

---------

Co-authored-by: lizihao-anu <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* death join and distinct refined death (#1015)

* distinct death date, keep the earliest one and remove na

* add activity after death 100% accurate joining

* Style code

* remove redundant combine death function

* Update documentation

* fix NA in activity_after_death

* Style code

---------

Co-authored-by: lizihao-anu <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* 1018 moving dd hl1 (#1019)

* moving dd and homelessness to front

* fix typos

* Update documentation

* Style code

---------

Co-authored-by: lizihao-anu <[email protected]>
Co-authored-by: Jennit07 <[email protected]>

* Update process_extract_ae.R (#1020)

* Update process_extract_ae.R

* Style code

---------

Co-authored-by: lizihao-anu <[email protected]>

* Organise pre processing scripts (#1023)

* Create a new folder to organise scripts

* Add comments

* Move script to folder

* Style code

---------

Co-authored-by: Jennit07 <[email protected]>

* Clean test folder (#1021)

* Update refs script

* Add check_year_valid to update_refs script

* Update documentation

* Add description to 00_update_refs

* Update wb name and paths for write_xlsx

* Style code

* Clean up github redundant scripts

* Rename script to match function name

* Update documentation

* Update cost test path

* write to disk only if file does not exist

* Remove object

* Update R/00-update_refs.R

Co-authored-by: Zihao Li <[email protected]>

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Zihao Li <[email protected]>

* Update homelessness completeness code  (#1026)

* Update homelessness completeness

* Style code

---------

Co-authored-by: Jennit07 <[email protected]>

* Update documentation

* update namespace

* Update references

* update process_tests_sc_demographics

* Update - write temp file on `create_episode_file`

* get_chi for data

* Style code

* IT deaths changes

* Style code

* remove get_chi

* Specify year in episode file tests

* specify year in indiv tests

* Update `check_year_valid`

* Fix `ch_provider` new coding guidance

* Add `full.names` parameter to `write_temp_data`

* Revert "Fix `ch_provider` new coding guidance"

This reverts commit d2d584b.

* filter na episodes by filtering period

* Update running scripts indiv file

* Update `check_year_valid`

* update end_date

* Update NEWS.md

* Update create_individual_file.R

* update test-check_year_valid

* fix binding issue and remove redundance

* Style code

* R cmd check over v4.1.2

---------

Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennit07 <[email protected]>
Co-authored-by: Jennifer Thom <[email protected]>
Co-authored-by: lizihao-anu <[email protected]>
5 people authored Dec 10, 2024
1 parent 4df8dad commit e4b0d43
Showing 107 changed files with 1,056 additions and 822 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/R-CMD-check.yaml
@@ -17,7 +17,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- r_version: ['4.0.2', '4.1.2', 'release']
+ r_version: ['4.1.2', 'release']

env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
8 changes: 6 additions & 2 deletions NAMESPACE
@@ -1,12 +1,12 @@
# Generated by roxygen2: do not edit by hand

export("%>%")
export(add_deceased_flag)
export(add_homelessness_date_flags)
export(add_homelessness_flag)
export(add_hri_variables)
export(add_nsu_cohort)
export(check_year_format)
export(clean_temp_data)
export(clean_up_free_text)
export(compute_mid_year_age)
export(convert_ca_to_lca)
@@ -21,10 +21,12 @@ export(create_episode_file)
export(create_homelessness_lookup)
export(create_individual_file)
export(create_service_use_cohorts)
export(end_date)
export(end_fy)
export(end_fy_quarter)
export(end_next_fy_quarter)
export(find_latest_file)
export(fy)
export(fy_interval)
export(get_boxi_extract_path)
export(get_ch_costs_path)
@@ -89,7 +91,6 @@ export(midpoint_fy)
export(next_fy)
export(phs_db_connection)
export(previous_update)
export(process_combined_deaths_lookup)
export(process_costs_ch_rmd)
export(process_costs_dn_rmd)
export(process_costs_gp_ooh_rmd)
@@ -156,6 +157,7 @@ export(produce_episode_file_tests)
export(produce_sc_sandpit_tests)
export(produce_source_extract_tests)
export(produce_test_comparison)
export(qtr)
export(read_dev_slf_file)
export(read_extract_acute)
export(read_extract_ae)
@@ -178,12 +180,14 @@ export(read_sc_all_alarms_telecare)
export(read_sc_all_care_home)
export(read_sc_all_home_care)
export(read_sc_all_sds)
export(read_temp_data)
export(rename_hscp)
export(setup_keyring)
export(start_fy)
export(start_fy_quarter)
export(start_next_fy_quarter)
export(write_file)
export(write_temp_data)
export(years_to_run)
importFrom(data.table,.N)
importFrom(data.table,.SD)
10 changes: 9 additions & 1 deletion NEWS.md
@@ -1,4 +1,12 @@
- # September 2024 Update - Unreleased
+ # December 2024 Update - released 10-Dec-24
+ * 24/25 files have been updated, containing data up to September 2024.
+ * 17/18 - 23/24 files have been updated.
+ * Homelessness completeness flag is now available in 23/24 files.
+ * Substance misuse flag updated.
+ * Mid-2023 & Mid-2022 population estimates for Scotland have been updated.
+ * Mid-2022 Small Area Population Estimates for 2011 Data Zones have been updated.
+
+ # September 2024 Update - released 13-Sep-24
* New 24/25 files created
* New NSU cohort for 23/24 available
* New SPARRA scores calculated from April 24/25
File renamed without changes.
File renamed without changes.
70 changes: 70 additions & 0 deletions Pre_processing_scripts/write_anon_chi_files.R
@@ -0,0 +1,70 @@
################################################################################
# Name of file - Write_anon_chi_files.R
#
# Original Authors - Jennifer Thom, Zihao Li
# Original Date - July 2024
# Written/run on - R Posit
# Version of R - 4.1.2
#
# Description: Run this script in stages to convert chi to anon chi and save files.
# By default this is set up to take the delayed discharges file
# convert the chi to anon_chi and save to disk. Important for
# ensuring we do not save chi anywhere on disk.
#
################################################################################

## Stage 1 - Setup environment
#-------------------------------------------------------------------------------

# Set up directory
source_dir <- "/conf/hscdiip/SLF_Extracts/Delayed_Discharges"

# Specify type of files e.g parquet, rds, csv
pattern <- ".parquet"
cat(stringr::str_glue("Looking in '{source_dir}' for parquet files."))

# List all files in the directory
parquet_files <- list.files(source_dir, pattern = ".parquet", full.names = TRUE)
print(stringr::str_glue("Found {length(parquet_files)} parquet files to process."))

# Create a function to read variable names and check if CHI is in the file
is_chi_in_file <- function(filename) {
data <- arrow::read_parquet(filename, nrow = 5)
return(grepl("chi", names(data)) %>% any())
}


# Stage 2 - In each file, convert chi to anon_chi and save to disk
#-------------------------------------------------------------------------------

# create a loop for converting to anon chi in all listed files
for (data_file in parquet_files) {
# specify new name and new file path
save_file_path <- file.path(source_dir, paste0("anon-", basename(data_file)))
chi_in_file <- is_chi_in_file(data_file)

# If chi is in the file, convert to anon_chi
if (chi_in_file) {
read_file(data_file) %>%
slfhelper::get_anon_chi() %>%
write_file(save_file_path)

cat("Replaced chi with anon chi:", data_file, "to", save_file_path, "\n")
} else {
read_file(data_file) %>%
write_file(save_file_path)
cat("renamed file with anon chi:", data_file, "to", save_file_path, "\n")
}
}


# Stage 3 - Remove files with CHI
#-------------------------------------------------------------------------------

# Create a loop for removing the old files with CHI
for (data_file in parquet_files) {
file.remove(data_file)
cat("Removed chi files:", data_file, "in", source_dir, "\n")
}

# End of Script #
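
A note on the `is_chi_in_file()` helper in the script above: `grepl("chi", names(data))` is a substring match, so it would also flag a file whose only matching column is `anon_chi`. A minimal sketch of a stricter check follows, assuming the identifying column is named exactly `chi`. The helper `has_chi_column()` and the example column names are hypothetical and are not part of this commit.

```r
# Hypothetical helper (not part of this commit): exact-name check for a CHI column.
# Assumption: the identifying column is called exactly "chi" (case-insensitive).
has_chi_column <- function(col_names) {
  "chi" %in% tolower(col_names)
}

# Illustrative column names only
anonymised_cols <- c("anon_chi", "record_keydate1")
identifiable_cols <- c("chi", "record_keydate1")

# Substring matching, as in is_chi_in_file(), also matches "anon_chi"
any(grepl("chi", anonymised_cols)) # TRUE

# Exact matching only flags the file that still holds CHI
has_chi_column(anonymised_cols) # FALSE
has_chi_column(identifiable_cols) # TRUE
```

With an exact match, re-running the script over a folder that already contains `anon-` files would not treat them as still containing CHI.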
137 changes: 129 additions & 8 deletions R/00-update_refs.R
@@ -1,3 +1,109 @@
################################################################################
# # Name of file - 00-update_refs.R
# Original Authors - Jennifer Thom, Zihao Li
# Original Date - August 2021
# Update - Oct 2024
#
# Written/run on - RStudio Server
# Version of R - 4.1.2
#
# Description - Use this script to update references needed for the SLF update.
#
# Manual changes needed to the following Essential Functions:
# # End_date
# # Check_year_valid
# # Delayed_discharges_period
# # Latest_update
#
################################################################################

#' End date
#'
#' @return Get the end date of the latest update period
#' @export
#'
end_date <- function() {
## UPDATE ##
# Specify update by indicating end of quarter date
# Q1 June = 30062024
# Q2 September = 30092024
# Q3 December = 31122024
# Q4 March = 31032024
lubridate::dmy(31122024)
}


#' Check data exists for a year
#'
#' @description Check there is data available for a given year
#' as some extracts are year dependent. E.g Homelessness
#' is only available from 2016/17 onwards.
#'
#' @param year Financial year
#' @param type name of extract
#'
#' @return A logical TRUE/FALSE
check_year_valid <- function(
year,
type = c(
"acute",
"ae",
"at",
"ch",
"client",
"cmh",
"cost_dna",
"dd",
"deaths",
"dn",
"gpooh",
"hc",
"homelessness",
"hhg",
"maternity",
"mh",
"nsu",
"outpatients",
"pis",
"sds",
"sparra"
)) {
if (year <= "1415" && type %in% c("dn", "sparra")) {
return(FALSE)
} else if (year <= "1516" && type %in% c("cmh", "homelessness", "dd")) {
return(FALSE)
} else if (year <= "1617" && type %in% c("ch", "hc", "sds", "at", "client", "cost_dna")) {
return(FALSE)
} else if (year <= "1718" && type %in% "hhg") {
return(FALSE)
} else if (year >= "2122" && type %in% c("cmh", "dn")) {
return(FALSE)
} else if (year >= "2324" && type %in% "hhg") {
return(FALSE)
} else if (year >= "2425" && type %in% c("nsu", "sds")) {
return(FALSE)
} else if (year >= "2526" && type %in% c("ch", "hc", "sds", "at", "sparra")) {
return(FALSE)
}

return(TRUE)
}
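
`check_year_valid()` compares financial-year codes as strings (for example `year <= "1516"`). This works because the codes are fixed-width, four-digit strings, so lexical order matches chronological order. A minimal sketch of that comparison, using the homelessness rule shown above; `homelessness_available()` is a hypothetical, simplified stand-in and not the package function.

```r
# Hypothetical, simplified stand-in for one rule in check_year_valid():
# homelessness data is treated as unavailable for "1516" and earlier.
homelessness_available <- function(year) {
  # Four-character year codes sort lexically in chronological order
  !(year <= "1516")
}

homelessness_available("1516") # FALSE
homelessness_available("1617") # TRUE
homelessness_available("2425") # TRUE
```

Because the rules are combined with `&&`, which expects length-one operands, a single `type` value should be passed rather than relying on the default vector of all extract names.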


#' Delayed Discharge period
#'
#' @description Get the period for Delayed Discharge
#'
#' @return The period for the Delayed Discharge file
#' as MMMYY_MMMYY
#' @export
#'
#' @family initialisation
get_dd_period <- function() {
"Jul16_Sep24"
}


#' Latest update
#'
#' @description Get the date of the latest update, e.g 'Jun_2022'
@@ -7,9 +113,10 @@
#'
#' @family initialisation
latest_update <- function() {
- "Sep_2024"
+ "Dec_2024"
}


#' Previous update
#'
#' @param months_ago Number of months since the previous update
@@ -51,19 +158,33 @@ previous_update <- function(months_ago = 3L, override = NULL) {
return(previous_update)
}

#' Delayed Discharge period

#' Extract latest FY from end_date
#'
#' @description Get the period for Delayed Discharge
#' @return fy in format "2024"
#' @export
#'
#' @return The period for the Delayed Discharge file
#' as MMMYY_MMMYY
fy <- function() {
# Latest FY
fy <- phsmethods::extract_fin_year(end_date()) %>% substr(1, 4)
}


#' Extract latest quarter from end_date
#'
#' @return qtr in format "Q1"
#' @export
#'
#' @family initialisation
get_dd_period <- function() {
"Jul16_Jun24"
qtr <- function() {
# Latest Quarter
qtr <- lubridate::quarter(end_date(), fiscal_start = 4)

qtr <- stringr::str_glue("Q{qtr}")

return(qtr)
}
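
The new `fy()` and `qtr()` helpers derive the latest financial year and quarter from `end_date()` using `phsmethods::extract_fin_year()` and `lubridate::quarter(fiscal_start = 4)`. The underlying arithmetic can be sketched in base R as below. This is an illustration only, assuming a financial year that starts in April; `fiscal_quarter()` and `fiscal_year_start()` are hypothetical names, not functions from this package.

```r
# Hypothetical base R sketch of the financial-year arithmetic; not the package code.
# Assumption: the financial year starts in April (fiscal_start = 4).
fiscal_quarter <- function(date, fiscal_start = 4L) {
  month <- as.integer(format(date, "%m"))
  # Months since the start of the financial year, grouped into blocks of three
  quarter <- ((month - fiscal_start) %% 12L) %/% 3L + 1L
  paste0("Q", quarter)
}

fiscal_year_start <- function(date, fiscal_start = 4L) {
  year <- as.integer(format(date, "%Y"))
  month <- as.integer(format(date, "%m"))
  # January to March belong to the financial year that started the previous April
  as.character(ifelse(month < fiscal_start, year - 1L, year))
}

end_of_period <- as.Date("2024-12-31")
fiscal_quarter(end_of_period) # "Q3"
fiscal_year_start(end_of_period) # "2024"
```

This agrees with the quarter mapping in the `end_date()` comments, where a December end date corresponds to Q3 and a March end date to Q4.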


#' The year list for slf to update
#'
#' @description Get the vector of years to update slf