Predict development pressure: how do we define “a lot of development”?

Define affordability burden: how do we define “affordability burden”? Candidate measures: year-over-year % change in the share of the population experiencing rent burden (will probably show extreme tipping points), population growth, and % change in area incomes.

Identify problem zoning

Calculate number of connected parcels

Predict development pressure at the block level

Identify areas that are not burdened

Identify problem zoning

Calculate number of connected parcels

Advocate for upzoning in parcels where there is local development pressure, no affordability burden, problem zoning, and a high number of connected parcels
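
The final step above amounts to a four-way filter. A minimal sketch of that filter, assuming a hypothetical block-level table: the column names, the toy values, and both cutoffs are illustrative assumptions, since the criteria themselves are still open questions here.

```r
# Hypothetical block-level table; every column name and value is illustrative.
blocks <- data.frame(
  block_id           = c("A", "B", "C", "D"),
  predicted_permits  = c(12, 1, 9, 15),            # stand-in for development pressure
  rent_burdened      = c(FALSE, FALSE, TRUE, FALSE),
  restrictive_zoning = c(TRUE, TRUE, TRUE, FALSE),  # stand-in for "problem zoning"
  connected_parcels  = c(8, 10, 7, 9),
  stringsAsFactors   = FALSE
)

# Assumed cutoffs; neither is specified in the text.
pressure_cutoff <- 5
parcel_cutoff   <- 5

# Keep blocks that meet all four conditions at once.
upzoning_candidates <- subset(
  blocks,
  predicted_permits >= pressure_cutoff &
    !rent_burdened &
    restrictive_zoning &
    connected_parcels >= parcel_cutoff
)

print(upzoning_candidates$block_id)
```

In this toy table only block A passes: B has too little predicted development, C is rent burdened, and D has no problem zoning.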
To begin, we run a simple regression incorporating three engineered groups of features: space lag, time lag, and distance to 2022. We include this last variable because of a Philadelphia tax abatement policy that led to a significant increase in residential development in the years immediately before 2022. We will use this as a baseline model to compare to our more complex models.
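
For readers unfamiliar with these feature groups, here is a rough base-R sketch of how they might be built on a toy panel. The data, the `geoid` column, the one-year lag, and the hand-written neighbor list are all assumptions for illustration; the actual feature definitions behind `permits_bg` may differ.

```r
# Toy panel: 3 block groups x 3 years (all values are made up).
toy <- data.frame(
  geoid            = rep(c("bg1", "bg2", "bg3"), each = 3),
  year             = rep(2020:2022, times = 3),
  permits_count    = c(2, 4, 6,  0, 1, 3,  5, 5, 8),
  stringsAsFactors = FALSE
)

# Distance to 2022: number of years between the observation and 2022.
toy$dist_to_2022 <- 2022 - toy$year

# Time lag: the same block group's count one year earlier (NA in its first year).
toy <- toy[order(toy$geoid, toy$year), ]
toy$lag_1yr <- ave(toy$permits_count, toy$geoid,
                   FUN = function(x) c(NA, head(x, -1)))

# Space lag: mean count across neighboring block groups in the same year.
# The neighbor list is hand-written here; in practice it would be derived
# from the block group geometries.
neighbors <- list(bg1 = "bg2", bg2 = c("bg1", "bg3"), bg3 = "bg2")
toy$space_lag <- mapply(function(g, y) {
  mean(toy$permits_count[toy$geoid %in% neighbors[[g]] & toy$year == y])
}, toy$geoid, toy$year, USE.NAMES = FALSE)

print(toy)
```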
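
library(dplyr)
library(sf)
library(ggplot2)

# Train on all years before 2022 and hold out 2022 for testing
permits_train <- filter(permits_bg %>% select(-mapname), year < 2022)
permits_test <- filter(permits_bg %>% select(-mapname), year == 2022)

# Baseline OLS on every remaining column (geometry dropped before fitting)
reg <- lm(permits_count ~ ., data = st_drop_geometry(permits_train))

# Predict the held-out year and attach the predictions to the test rows
predictions <- predict(reg, st_drop_geometry(permits_test))
predictions <- cbind(permits_test, predictions)

# Absolute error, plus percent error where the observed count is nonzero
# (dividing by a zero count would otherwise give Inf or NaN)
predictions <- predictions %>%
  mutate(abs_error = abs(permits_count - predictions),
         pct_error = if_else(permits_count > 0, abs_error / permits_count, NA_real_))

ggplot(predictions, aes(x = permits_count, y = predictions)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "Predicted vs. Actual Permits",
       subtitle = "2022")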
We find that our OLS model has an MAE of only 2.44: not bad for such a simple model! Still, it struggles most in the areas where we most need it to succeed, so we will try to introduce better variables and apply a more complex model to improve our predictions.
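
As a reminder of what the MAE summarizes, and of how a low overall figure can hide large errors in a few places, here is a small sketch on made-up numbers. The values and the bin edges are assumptions for illustration and are unrelated to the 2.44 reported above; grouping by observed count is just one possible way to see where errors concentrate.

```r
# Made-up observed and predicted permit counts, for illustration only.
actual    <- c(0, 3, 10, 1, 25)
predicted <- c(1.5, 2.0, 7.0, 1.0, 14.0)

# Mean absolute error: the average size of the miss, ignoring direction.
abs_error <- abs(actual - predicted)
mae <- mean(abs_error)

# MAE within bins of the observed count (bin edges are an arbitrary choice).
bins <- cut(actual, breaks = c(-Inf, 0, 5, Inf), labels = c("0", "1-5", "6+"))
mae_by_bin <- tapply(abs_error, bins, mean)

print(mae)
print(mae_by_bin)
```

In this toy example the error in the highest-count bin is well above the overall MAE, which is the kind of pattern a single summary number cannot show.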