layout | title |
---|---|
page | Method Overview |
We are using the 2023 UW RDS (Respondent-Driven Sampling) and UW PSD (Puget Sound Data) datasets.
We prepared the data by:
- Creating functions to clean the data and impute additional columns where needed.
- Standardizing common columns in both UW RDS and UW PSD datasets to harmonize them, ensuring consistent factor levels and data formats.
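
As an illustration of the harmonization step, the sketch below maps two differently named and differently coded columns onto one shared schema. It is written in Python for compactness (the project's own functions are in R), and every column name, factor level, and mapping in it is hypothetical rather than taken from the UW RDS or UW PSD data.

```python
# Hypothetical column names and factor levels; not taken from the real datasets.
RDS_RENAME = {"Age": "age", "Gender_Identity": "gender"}
PSD_RENAME = {"age_years": "age", "sex": "gender"}

# Map each source spelling onto one shared set of factor levels.
GENDER_LEVELS = {
    "m": "Male", "male": "Male",
    "f": "Female", "female": "Female",
}


def harmonize(rows, rename, source):
    """Rename columns, coerce types, and normalize factor levels for one dataset."""
    cleaned = []
    for row in rows:
        out = {rename.get(key, key): value for key, value in row.items()}
        # Coerce age to an integer; keep None when it is missing or unparseable.
        try:
            out["age"] = int(out.get("age"))
        except (TypeError, ValueError):
            out["age"] = None
        # Unknown or missing levels stay visible as "Unknown" rather than being dropped.
        raw = str(out.get("gender") or "").strip().lower()
        out["gender"] = GENDER_LEVELS.get(raw, "Unknown")
        out["source"] = source  # Record which dataset the row came from.
        cleaned.append(out)
    return cleaned


rds = harmonize([{"Age": "34", "Gender_Identity": "F"}], RDS_RENAME, "UW_RDS")
psd = harmonize([{"age_years": 51, "sex": "male"}], PSD_RENAME, "UW_PSD")
combined = rds + psd  # Both halves now share column names and levels.
print(combined)
```

Keeping the rename maps and level maps as plain data, separate from the function, makes it easier to review and extend the harmonization rules without touching the cleaning logic.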
We used:
- RStudio for data analysis.
- GitHub for version control and collaboration.
- RStudio: Provides the environment for writing and executing R code.
- GitHub: Manages version control and collaboration.
- Interoperability: RStudio's built-in Git integration connects projects to GitHub, supporting collaboration and change tracking for scripts and data processing functions.
The workflow was divided into several steps:
- Function Creation: Fellows created functions to target specific columns and tasks in the data.
- Function Integration: All created functions were consolidated into a single script.
- Data Cleaning: The script produced cleaned dataframes, including:
  - A clean UW_RDS dataset
  - A clean UW_PSD dataset
  - A clean combined dataframe called UWRDS-PSD for integrated analysis.
- Created and tested individual functions for data cleaning and imputation.
- Standardized columns across datasets for harmonization.
- Combined the functions into a unified script.
- Ran the script to generate cleaned dataframes for subsequent analyses.
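
One way to picture consolidating per-column functions into a single script is as an ordered pipeline of small cleaning steps. The Python sketch below is illustrative only: the step functions, the column names, the age grouping, and the `run_pipeline` helper are hypothetical stand-ins for the fellows' R functions.

```python
from functools import reduce


def strip_whitespace(row):
    """Trim stray whitespace from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def impute_age_group(row):
    """Add a derived (imputed) column based on an existing one."""
    age = row.get("age")
    if age is None:
        group = None
    elif age < 25:
        group = "18-24"
    else:
        group = "25+"
    return {**row, "age_group": group}


# The consolidated script: every cleaning function, applied in a fixed order.
STEPS = [strip_whitespace, impute_age_group]


def run_pipeline(rows, steps=STEPS):
    """Apply each step to each row and return the cleaned rows."""
    return [reduce(lambda acc, step: step(acc), steps, row) for row in rows]


raw_rds = [{"id": " r1 ", "age": 22}]
raw_psd = [{"id": "p1", "age": None}]

clean_rds = run_pipeline(raw_rds)
clean_psd = run_pipeline(raw_psd)
combined = clean_rds + clean_psd  # Analogous to the combined dataframe.
print(combined)
```

Because each step takes a row and returns a new row, individual functions can be tested in isolation before they are added to the shared list.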
- Successful Approaches: The final approach used standardized functions to clean and harmonize the data, producing consistent datasets for analysis. We then applied RDS-II estimators and bootstrap methods for robust statistical analysis.
- Shortcomings:
  - Column Standardization: Despite these efforts, discrepancies may remain where the initial datasets contained inconsistencies or errors.
  - Dependency on Tools: Reliance on RStudio and GitHub requires users to be proficient with both tools; problems with tool integration or updates could disrupt the workflow.
- Improvements:
  - Ongoing Harmonization: Continuously review and update column standardization procedures to address emerging inconsistencies.
  - Tool Flexibility: Consider incorporating additional tools or methods to improve interoperability and address limitations of the primary tools.
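
For readers unfamiliar with the RDS-II estimator mentioned above, the following Python sketch shows its core idea: each respondent is weighted by the inverse of their reported network size (degree), so well-connected people, who are more likely to be recruited, count for less. The sample data and the `veteran` trait are hypothetical. The bootstrap shown is a deliberately naive respondent-level resample and is an assumption on our part; the source does not say which bootstrap variant was used, and RDS-specific bootstraps typically preserve the recruitment chain structure.

```python
import random


def rds2_proportion(sample, trait):
    """RDS-II (Volz-Heckathorn) estimate: inverse-degree weighted proportion."""
    weights = [1.0 / person["degree"] for person in sample]
    with_trait = sum(w for w, person in zip(weights, sample) if person[trait])
    return with_trait / sum(weights)


def bootstrap_interval(sample, trait, reps=1000, alpha=0.05, seed=0):
    """Percentile interval from a naive respondent-level bootstrap.

    Assumption: respondents are resampled independently, which ignores the
    recruitment chain structure that RDS-specific bootstraps preserve.
    """
    rng = random.Random(seed)
    estimates = sorted(
        rds2_proportion(rng.choices(sample, k=len(sample)), trait)
        for _ in range(reps)
    )
    lower = estimates[int(reps * alpha / 2)]
    upper = estimates[int(reps * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical respondents: self-reported network size and a binary trait.
sample = [
    {"degree": 10, "veteran": True},
    {"degree": 5, "veteran": False},
    {"degree": 2, "veteran": True},
    {"degree": 20, "veteran": False},
]

estimate = rds2_proportion(sample, "veteran")
low, high = bootstrap_interval(sample, "veteran")
print(round(estimate, 3), (round(low, 3), round(high, 3)))
```

With only four respondents the interval is very wide; the example is meant to show the mechanics, not realistic precision.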
The survey was designed to capture comprehensive data on the unsheltered population using the following methodology:
- Hub Locations: Established up to twelve hub locations, including public libraries and veterans shelters, to facilitate survey participation.
- Seed Participants: Recruited around twenty initial seed participants who distributed coupons to their peers.
- Coupon Distribution: Participants were given coupons to refer others, with incentives including cash cards and Metro bus tickets. Referring participants received credit for each completed referral.
- Data Collection: Conducted at hub locations with the assistance of trained students and volunteers, using tablets and the UW Qualtrics data entry form.
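
The coupon-based referral process above produces a recruitment tree: seeds form wave 0, the people they refer form wave 1, and so on. The Python sketch below shows one way such a tree could be summarized from coupon records. The participant IDs, the `recruiter_of` mapping, and the `assign_waves` helper are hypothetical and do not reflect the survey's actual data layout.

```python
from collections import Counter


def assign_waves(recruiter_of):
    """Return each participant's wave: seeds are wave 0, their recruits wave 1, etc.

    `recruiter_of` maps participant ID -> recruiter ID, or None for a seed.
    Assumes every recruiter also appears as a participant and there are no cycles.
    """
    waves = {}

    def wave(pid):
        if pid not in waves:
            parent = recruiter_of[pid]
            waves[pid] = 0 if parent is None else wave(parent) + 1
        return waves[pid]

    for pid in recruiter_of:
        wave(pid)
    return waves


# Hypothetical coupon records: two seeds and four referred participants.
recruiter_of = {"S1": None, "S2": None, "A": "S1", "B": "S1", "C": "A", "D": "S2"}

waves = assign_waves(recruiter_of)
# Completed referrals per recruiter, e.g. for crediting referral incentives.
referrals = Counter(r for r in recruiter_of.values() if r is not None)
print(waves)
print(dict(referrals))
```

Tracking waves matters because RDS relies on recruitment chains growing long enough that the sample depends less on who the initial seeds were.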
The sampling methodology employed includes:
- Respondent-Driven Sampling (RDS): The statistical model compensates for the fact that the sample was collected through peer referral rather than through a traditional random sampling approach, such as an address-based sample. The method relies on multiple waves of peer-to-peer recruitment to approximate random sampling within “hard-to-reach” populations.
- Initial Screener: Approximately 5 minutes, interviewer-administered, with a $20 gift card and two Metro bus tickets.
- Individual Interview: About 30 minutes, self-administered via tablet, with an additional $10 gift card as incentive.
Participants were selected based on the following criteria:
- Age: 18 years or older.
- Consent: Able to provide informed consent.
- Housing Status: Identified as unsheltered, per HUD definitions.
Interviews were conducted in English and Spanish, with real-time translation available for other languages.