Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fte point in time sql #23

Open
andreabringhurst opened this issue Aug 30, 2024 · 5 comments
Open

Add fte point in time sql #23

andreabringhurst opened this issue Aug 30, 2024 · 5 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@andreabringhurst
Copy link
Collaborator

Use the SQL for off the daily snap shot as a base for an fte point in time sql that will be similar to the headcount point in time sql.

@andreabringhurst andreabringhurst added the enhancement New feature or request label Aug 30, 2024
@andreabringhurst andreabringhurst self-assigned this Aug 30, 2024
@andreabringhurst
Copy link
Collaborator Author

FTE SQL sent from Craig in Slack
-- updated daily enrollment headcount
WITH cte_next_term_not_summer AS (
SELECT a.term_desc,
a.term_id AS real_term_id,
a.registration_start_date -1 AS real_next_reg_start_date,
CASE
WHEN a.season = 'Spring' THEN CAST(CAST(a.term_id AS integer) + 20 AS text)
ELSE a.next_term_id
END AS next_term_not_summer
FROM export.term a
),

cte_next_term_data AS (

SELECT a.term_desc,
       b.real_term_id,
       b.real_term_id::int - 100 AS last_term_id,
       current_date AS today
  FROM export.term a

LEFT JOIN cte_next_term_not_summer b
ON b.next_term_not_summer = a.term_id
WHERE a.registration_start_date::date > current_date - 2
AND b.real_next_reg_start_date::date < current_date - 2

)

SELECT a.days_to_class_start,
a.is_enrolled,
a.student_id,
a.term_id,
a.year,
a.season,
a.student_type_desc AS student_type,
a.student_type_code,
a.date AS enrollment_date,
a.institutional_attempted_credits,
a.level_id,
b.is_degree_seeking,
a.full_time_part_time_code,
p.college_abbrv AS college,
p.department_id AS department,
b.primary_program_id AS program,
c.gender_code AS gender,
c.ipeds_race_ethnicity AS race_ethnicity,
a.academic_year_desc,
d.census_date,
d.term_start_date,
d.term_end_date,
d.registration_start_date,
d.is_current_term,
e.real_term_id,
f.last_term_id,
e.real_term_id IS NOT NULL AS is_term_to_use,
b.freshman_cohort_code,
b.freshman_cohort_desc,
b.transfer_cohort_code,
b.transfer_cohort_desc,
b.is_cohort_excluded

 FROM export.daily_enrollment a

LEFT JOIN export.student_term_level b
ON a.student_id = b.student_id
AND a.term_id = b.term_id
AND b.is_primary_level
LEFT JOIN export.student c
ON c.student_id = a.student_id
LEFT JOIN export.academic_programs p
ON p.program_id = b.primary_program_id
LEFT JOIN export.term d
ON a.term_id = d.term_id
LEFT JOIN cte_next_term_data e
ON d.term_id = e.real_term_id
LEFT JOIN cte_next_term_data f
ON d.term_id::int = f.last_term_id
WHERE a.term_id::int IN (e.real_term_id::int, f.last_term_id)
AND a.is_enrolled;

@andreabringhurst
Copy link
Collaborator Author

The FTE shaping sent from Craig in Slack:
The real shaping is done with some R code. Here is the parts the do the shaping. There is a lot that gets done in the script so I parsed out only the code for FTE. the fte_check is doing the most.

pit_enrollment_df <- pit_enrollment_sql %>%
mutate(days_to_class_start_date = format(days_to_class_start_date, format = "%b %d %Y")) %>% # this adds the date
relocate(days_to_class_start_date, .after = days_to_class_start)

term_info_df <- pit_enrollment_df %>%
filter(is_term_to_use) %>%
head(1) %>%
mutate(days_to_class_start = as.integer(difftime(census_date, term_start_date, units='days')))

previous_term_info_df <- pit_enrollment_df %>%
filter(!is_term_to_use) %>%
head(1) %>%
mutate(days_to_class_start = as.integer(difftime(census_date, term_start_date, units='days')))

target_term <- term_info_df[["term_id"]]

previous_term <- (as.integer(target_term) - 100L) %>%
as.character()

##########
choose_date <- function(todays_date=TRUE, number_days_out) {

if(todays_date){
target_dtcs <- pit_enrollment_df %>%
filter(term_id == target_term) %>%
select(days_to_class_start) %>%
max()

return(target_dtcs)

}
else{

return(number_days_out)

}
}

fte_check <- pit_enrollment_df %>%
mutate(target_census_check =if_else(days_to_class_start >= target_census_dtsc, target_census_dtsc, target_dtcs)) %>%
filter(term_id == target_term &
days_to_class_start == target_census_check &
target_census_check <= min(target_census_check)) %>%
filter(is_degree_seeking & level_id == "UG") %>%
group_by(term_id) %>%
summarise(credits = sum(institutional_attempted_credits)) %>%
group_by(term_id, credits) %>%
summarise(fte = round(sum(credits/15), 2))

both_fte <- rbind(fte_sql, fte_check)

compare_to_fte <- both_fte %>%
select(term_id, fte) %>%
mutate(FTE = "FTE") %>%
mutate(term_id = if_else(term_id == previous_term, "previous_year", "current_year")) %>%
pivot_wider(names_from = term_id, values_from = fte) %>%
mutate(difference = current_year - previous_year) %>%
mutate(percent_change = (difference / ((current_year + previous_year) / 2)))

@andreabringhurst
Copy link
Collaborator Author

andreabringhurst commented Nov 12, 2024

This issue is waiting for the next enhancements to utDataStoR. The code provided above needs to be reviewed and made functional. The work in the app is being done in both sql and R code so there will be work that needs to be done to make the sql provide the right data. This will require an update the the fte vignette and the make_fte_sql function.

@andreabringhurst
Copy link
Collaborator Author

@CraigDemke I have written an sql that I think is producing the right data. It looks more like the headcount_point_in_time.sql (https://github.com/ut-effectiveness/utDataStoR/blob/master/inst/sql/headcount/headcount_point_in_time.sql) in utDataStoR. Please review it and give me some feedback. Also, I'm not sure what to check it against. I want to check it against Edify. Do I run it and then the Edify FTE, and take the most recent Edify FTE compared to 1 day ago of the fte_point_in_time?

WITH cte_undergrad_fte AS (SELECT a.term_id,
a.date AS enrollment_date,
date_part('week', a.date) - 11 AS week,
a.days_to_class_start,
SUM(ROUND(a.institutional_attempted_credits/15, 2)) AS undergrad_fte
FROM export.daily_enrollment a
WHERE a.is_current_term = TRUE
AND a.is_enrolled
AND a.level_id = 'UG'
GROUP BY a.term_id, a.date, a.days_to_class_start
ORDER BY a.date DESC),

cte_graduate_fte AS (SELECT b.term_id,
b.date AS enrollment_date,
date_part('week', b.date) - 11 AS week,
b.days_to_class_start,
SUM(ROUND(b.institutional_attempted_credits/10, 2)) AS graduate_fte
FROM export.daily_enrollment b
WHERE b.is_current_term = TRUE
AND b.is_enrolled
AND b.level_id = 'GR'
GROUP BY b.term_id, b.date, b.days_to_class_start
ORDER BY b.date DESC)

SELECT a.term_id,
a.enrollment_date,
a.week,
a.days_to_class_start,
COALESCE(a.undergrad_fte, 0) AS undergrad_fte,
COALESCE(b.graduate_fte, 0) AS graduate_fte,
a.undergrad_fte + b.graduate_fte AS total_fte
FROM cte_undergrad_fte a
LEFT JOIN cte_graduate_fte b
ON b.term_id = a.term_id
AND b.enrollment_date = a.enrollment_date
GROUP BY a.term_id, a.enrollment_date, a.week, a.days_to_class_start, a.undergrad_fte, b.graduate_fte
ORDER BY a.enrollment_date DESC;

@andreabringhurst
Copy link
Collaborator Author

I have put some sql together (edify_fte_point_in_time.sql). See the branch new_fte_sql - https://github.com/ut-effectiveness/utDataStoR/tree/new_fte_sql/inst/sql/fte

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant