Some potentially useful resources:
- https://www.biostat.wisc.edu/~kbroman/presentations/steps2rr.pdf
- https://egap.org/resource/10-things-to-know-about-project-workflow/
- https://cghlewis.com/blog/data_mgmt_resources/
- https://cghlewis.com/blog/data_clean_02/
- Essential resource on how to organize data in spreadsheets, and why: Broman & Woo (2017)
Two great books on causal inference: The Mixtape emphasizes the maths behind the methods, while The Effect is more hands-on. Both provide code snippets for the analyses.
To delve into the specifics of different methods, the Mixtape Sessions are an incredible resource, providing both slides and code.
- Repository of recent advances in DiD methods, with a focus on their application: https://asjadnaqvi.github.io/DiD/
- MLU-EXPLAIN: visual explanations of core machine learning concepts, including OLS and logistic regression.
- https://lakens.github.io/statistical_inferences/
- Reporting results: https://modelsummary.com/index.html
- Marginal effects (e.g., interactions): https://marginaleffects.com
- https://tilburgsciencehub.com (among other things, on how to create a website)
- https://egap.org/methods-guides/
- The Epidemiologist R Handbook
- General advice (also technical) on writing, from a political scientist's perspective: https://macartan.github.io/teaching/how-to-write
- Effective abstracts: https://www.fabriziogilardi.org/resources/papers/good-abstracts.pdf
Using R, you can generally use the package "rvest" for static pages or "RSelenium" for dynamic ones. Before scraping, take into account ethical and legal considerations (e.g., respecting website terms of service, avoiding server overload, and ensuring compliance with legal regulations); this guide provides an introduction.
- These fantastic slides by Etienne Bacher provide almost everything necessary to successfully extract information from static and dynamic webpages.
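As a minimal sketch of the static case, assuming the rvest package is installed: the snippet below parses an inline HTML string rather than a live page, so no server is contacted. For a real site you would swap `minimal_html()` for `read_html("https://...")`, after checking the terms of service mentioned above.

```r
library(rvest)

# Parse a small HTML fragment in memory (stands in for read_html(url))
html <- minimal_html('<div><a href="https://example.com">Example</a></div>')

# Select nodes with CSS selectors, then pull attributes or text
links <- html |> html_elements("a") |> html_attr("href")
texts <- html |> html_elements("a") |> html_text2()

links  # "https://example.com"
texts  # "Example"
```

The same `html_elements()` / `html_attr()` / `html_text2()` pattern carries over to RSelenium workflows: once the dynamic page has rendered, you can hand its source to rvest for extraction.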