bewilburring_causal_model.qmd

---
title: "Potentially Actual Causal Model"
format: 
  html:
    code-fold: true
    toc: true
    toc-location: left
    tbl-cap-location: top
    fig-cap-location: top
    df-print: paged
  
---

# Theory: Open Market Hypothes-ish and Reassessment Cycle as Exogenous Shock

The value of the availability of an incentive classification should already be capitalized into the fair market value of any given PIN. That said, despite CCAO's involvement in the re-classification process, it is fundamentally unaware whether a PIN is in the process of obtaining an incentive classification. Moreover, it is unlikely CCAO incorporates the value of the opportunity cost of foregoing an incentive classification when valuing income-producing properties.

Thus, obtaining an incentive classification should immediately increase the FMV of a PIN. At the same time, we would not see that increase reflected in the PIN's assessment until its next reassessment cycle. While CCAO claims it never uses "fully loaded cap rates," the point is fundamentally irrelevant to the FMV of a PIN, which is what CCAO is legally obligated to calculate.

The next reassessment following the grant of incentive classification, then, serves as the exogenous shock necessary to make a credible causal claim about the effect of incentivizing a PIN. For the first time, CCAO is able to observe the existence of an incentive classification as it shifts from an unknown (to CCAO) fact about a PIN to a known (to CCAO) face about a PIN.

# Data import and prep

Note that we have dropped outliers from our dataset for purposes of these models.

```{r setup}
#| output: false

options(scipen = 999, digits = 4) #no scientific notation

# Load packages

library(tidyverse)
library(glue)
library(fixest)
library(modelsummary)
library(tinytable)
library(sandwich)
library(tinytex)

#comm_ind <- read_csv("./Output/comm_ind_PINs_2011-2022_balanced.csv") 

comm_ind <- read_csv("./Output/comm_ind_PINs_2006to2022_timeseries.csv") |>
  filter(year >= 2011) |>
      
  # set reference levels
  mutate(incent_change = as.factor(incent_change),
         landuse_change = as.factor(landuse_change),
         triad = as.factor(Triad),
         in_tif = as.factor(in_tif),
         land_use = as.factor(land_use),
         incent_prop = as.factor(incent_prop),
         clean_name = as.factor(clean_name), 
         incent_change = relevel(incent_change, ref = "Never Incentive"),
         landuse_change = relevel(landuse_change, ref = "Always Commercial"),
         incent_prop = relevel(incent_prop, ref = "Non-Incentive"),
         triad = relevel(triad, ref = "City"),
         land_use = relevel(land_use, ref = "Commercial"))

# has binary variable for if it was a reassessment year or not. 
# Manually created based on the 3 year rotation used for reassessments.
reassessment_years <- read_csv("./Necessary_Files/Triad_reassessment_years.csv")


reassessments_long <- reassessment_years %>% 
  pivot_longer(cols = c(`2006`:`2022`), names_to = "year", values_to = "reassessed_year")
```


```{r}
table(comm_ind$first_incent_year, comm_ind$next_reassessment)
```


```{r convert_panel_data}

comm_ind_df <- comm_ind |> 

  select(pin, class, year, clean_name, fmv, fmv_growth_2011, incent_prop, 
         land_use, major_class_code, landuse_change, incent_change, triad, 
         reassess_lag, reassessed_year,
         fmv_growth_2011_w, base_year_fmv_2011_w,
         in_tif) %>%
  mutate(incent_prop = ifelse(incent_prop == "Incentive", 1, 0))


comm_ind_df <- panel(comm_ind_df, panel.id = c("pin", "year"))

```

**Figure out how to code something that would represent this situation and allow us to model it:** 

The treatment is having the class of property go from a non-incentive class to an incentive class. Need to consider if it is a reassessment year or not. If it is not a reassessment year, then the class change should result in a change in AV that comes purely from the change in the level of assessment (25% to 10%).  If it is a reassessment year, then the change in value is due to the change in level of assessment and the change in the FMV of the property due to local property trends.  

- We still have the mailed_av that is then appealed by the incentive property which should only reflect the change in market trends since CCAO is removed from the incentive process and the board of appeals is the one that changes the property class. The mailed_av and the clerk_av should still have the same proportion of change in the the AV from the level of assessment (25% to 10%).   

- After a property gets its class changed to an incentive class, how does its fair market value change in the **next assessment cycle** compared to properties that did not get an incentive class?  

- some kind of difference-in-differences design   

# Causal Model

We propose a basic difference-in-difference design for our casual model, using the next reassessment following an incentive class change as the exogenous shock.

```{r}
#| eval: false

## From AV_DID file in the levy project


TRA <- read_csv("./Necessary_Files/triad_reassessment_years_piv2.csv") %>% 
  select(Triad, `2014`:`2018`)

TRA2 <- TRA %>% pivot_longer(cols = `2014`:`2018`, names_to = "year", values_to = "reassess_01")

# panel_data0 <- recoded_data %>%
#   filter(minor_type == "MUNI") %>%
#   filter(home_rule_ind == 1) %>%
#  # mutate(reassess_year = as.numeric(reassess_year)) %>%
#   mutate(agency_num = as.factor(agency_num)) %>%
#   mutate(year_num = as.character(year),
#          year = as.factor(year))
# 
# panel_data1 <- panel_data0 %>% filter(year_num > 2014 & year_num < 2019)
# 
# 
# panel_data2 <- left_join(panel_data1, TRA2, by = c("Triad", "year_num" = "year"))
# table(panel_data2$year)
```


## Dependent Variable

% Change in FMV from Prior Year

Given that the value (or at least opportunity cost) of the availability of an incetive classification is "baked into" the value of a PIN, we should only see a single bump in growth: at the next assessment cycle following the grant of the incentive. The alternative--that land owners don't account for the value of incentive classifications--is implausible and thus, following the initial bump in value, the PIN's FMV growth should return to the general trend.

## Independent Variable of Interest

First Assessment Year following Reclassification

Given that CCAO does not "know," a priori, the availability of an incentive classification for a given PIN, we should not see its value appear until the next reassessment cycle. An alternative set of IVs of interest--several lag variables of having an incentive classification--also makes theoretical sense, but would reduce the explanatory value of the model.

## Control Variables

### Change in Land Use

Change in land use has consistently been statistically and practically significant in conection with PIN FMV.

### Year

Since we are using a DiD model with a staggered treatment, we can't use TWFE. Thus, we're including year as a control variable

## Fixed Effects

We will be using municipality-level fixed effects because those entities are the "deciders" on incentive classifications and the level at which we want to remove the "across" variation.

**Want to control for group differences and time differences. Essentially have a "Treated" and "Untreated" group Fixed Effects and "before treatment" and "after treatment" time fixed effects.**

- Coefficient on "Treated" would be the DID effect. Uses an interaction term for Groups and pre-post treatment time frames.

## Matching

- Match within Municipality  
- Control is property that never received incentive classification
- Treatment is having the property class become an incentive class (600 to 899)  
- Match on Difference between av_mailed and av_clerk?


# Models Lacking Valid Causal Claims

Absent incorporation of an exogenous shock, we cannot make causal claims based on the models below, but can identify important associations.

## Logit

```{r logit_models_simple}

logit_simple_list <- list(
  
 "pin/year FE" =  feglm(incent_prop ~ land_use | pin + year,
      data = comm_ind_df,
      family = binomial(link = "logit"),
      panel.id = ~ pin + year),
 
  "pin/muni/year FE" =  feglm(incent_prop ~ land_use | pin + clean_name + year,
      data = comm_ind_df,
      family = binomial(link = "logit"),
      panel.id = ~ pin + year),
  
  "muni/year FE" =  feglm(incent_prop ~ land_use | clean_name + year,
      data = comm_ind_df,
      family = binomial(link = "logit"),
      panel.id = ~ pin + year)
  
)

modelsummary(logit_simple_list,
             stars = TRUE)

```

```{r logit_year_landuse}

land_year_list <- list(
  
  "standard" = feglm(incent_prop ~ land_use | pin + clean_name + year,
      data = comm_ind_df,
      family = binomial(link = "logit"),
      panel.id = ~ pin + year),
  
  "land_use/year" = feglm(incent_prop ~ land_use + land_use*year | pin + clean_name + year,
      data = comm_ind_df,
      family = binomial(link = "logit"),
      panel.id = ~ pin + year) ,
  
  "triad" = feglm(incent_prop ~ land_use + triad | pin + clean_name + year,
                  data = comm_ind_df,
                  family = binomial(link = "logit"),
                  panel.id = ~ pin + year)
)

modelsummary(land_year_list,
             stars = TRUE)

```

```{r logit_intermediate_models}

logit_med_list <- list(
  
      "TIF?" =  feglm(incent_prop ~ land_use + in_tif | pin + clean_name + year,
      data = comm_ind_df,
      family = binomial(link = "logit"),
      panel.id = ~ pin + year),
    
     "Landuse-Year Interact." = feglm(incent_prop ~ land_use + land_use*year | 
                                       pin + clean_name + year,
      data = comm_ind_df,
      family = binomial(link = "logit"),
      panel.id = ~ pin + year),
    
    "Reassess?" = feglm(incent_prop ~ land_use + reassessed_year | 
                          pin +      clean_name + year,
      data = comm_ind_df,
      family = binomial(link = "logit"),
      panel.id = ~ pin + year),
    
    "Reassess-Year Interact." = feglm(incent_prop ~ land_use + reassessed_year + reassessed_year*land_use
      | pin + clean_name + year,
      data = comm_ind_df,
      family = binomial(link = "logit"),
      panel.id = ~ pin + year)
  
)

modelsummary(logit_med_list,
             stars = TRUE,
             coef_omit = "year"
             )

```