---
title: Unaccompanied Migrant Children Part 2
header: Unaccompanied Migrant Children Part 2
---
# 2024-06-04-Unaccompanied Migrant Children Part 2

For the last two blog posts, I have worked on the [data set that the New York Times released](https://github.com/nytimes/hhs-child-migrant-data) about unaccompanied migrant children in the United States. These are children who crossed the border into the United States without their parents or legal guardians. In Part 1, I explored the overall trends in the data. The second blog post centers on a searchable database. The hope is that people can use it to better understand the situation and the challenges these children face in their communities. For instance, after searching my hometown zip code of “39305”, I saw that three individuals had been placed there.

A particularly useful element of the data is its spatial and temporal information. We know when individuals entered the United States, their country of origin, and where they were placed. This allows for some interesting analysis and visualizations. For instance, while there are children from around the globe, there is a large concentration from Central America, particularly Guatemala, El Salvador, and Honduras. I figured an arc map would make some of these trends easier to see.
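An arc map needs one record per origin-destination pair with a count attached. Here is a minimal sketch of that aggregation step in plain Python. The rows, the `origin` and `sponsor_zip` keys, and the `ZIP_A`/`ZIP_B` placeholders are all made up for illustration and are not the column names or values of the real data set. Coordinates for each end of the arc would still need to be joined on separately.

```python
from collections import Counter

# Toy rows standing in for the real records (hypothetical keys and values).
rows = [
    {"origin": "Guatemala", "sponsor_zip": "ZIP_A"},
    {"origin": "Guatemala", "sponsor_zip": "ZIP_A"},
    {"origin": "Honduras", "sponsor_zip": "ZIP_A"},
    {"origin": "El Salvador", "sponsor_zip": "ZIP_B"},
]

# Count how many children share each (origin, destination) pair.
flows = Counter((row["origin"], row["sponsor_zip"]) for row in rows)

# One record per arc, largest flows first.
arcs = [
    {"origin": origin, "sponsor_zip": zip_code, "children": count}
    for (origin, zip_code), count in flows.most_common()
]

for arc in arcs:
    print(arc)
```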
Below, you can see a visual representation of the movement of unaccompanied migrant children from their countries of origin to their placement locations in the United States.

<iframe width="100%" height="500px" src="https://studio.foursquare.com/map/public/1cb5eb5b-30bd-40e4-84ea-8cf0ffef1695/embed" frameborder="0" allowfullscreen></iframe>

However, simply knowing where these individuals land is not useful in and of itself. I have frequently found that people starting out in data science or data analytics do not take into account the dependent nature of the data. One way to handle this is with mixed models, but that becomes trickier with spatial data. [Tobler’s first law of geography](https://en.wikipedia.org/wiki/Tobler%27s_first_law_of_geography) states that “everything is related to everything else, but near things are more related than distant things.” In many cases, this means that we need to account for the spatial relationships between our data points. For example, if we are looking at the number of unaccompanied migrant children in a particular area, we need to consider the characteristics of the surrounding areas as well. One thing that we can use is [Cluster and Outlier Analysis with Anselin Local Moran's I](https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/cluster-and-outlier-analysis-anselin-local-moran-s.htm). With it, a location can fall into one of four categories:

- High-High
- Low-High
- High-Low
- Low-Low

The middle two are particularly interesting because they represent outliers in the data, while High-High and Low-Low represent clusters of similar values. A High-Low result indicates an area with a high number of unaccompanied migrant children surrounded by areas with low numbers. Conversely, a Low-High result indicates an area with a low number of unaccompanied migrant children surrounded by areas with high numbers. These outliers can provide valuable insights into the factors influencing the distribution of these children across the United States, such as local policies, community support, or economic conditions.
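To make the four categories concrete, here is a minimal pure-Python sketch of how a local Moran's I value and its quadrant label can be computed. It is not the ArcGIS or Foursquare Studio implementation. The cell names, counts, and neighbor lists are made up for illustration, and the weights are assumed to be row-standardized, meaning every neighbor of a cell counts equally.

```python
from statistics import fmean


def local_morans_i(values, neighbors):
    """Return {cell: (I_i, quadrant)} using row-standardized weights.

    values:    dict mapping cell -> count
    neighbors: dict mapping cell -> list of neighboring cells
    Assumes the counts are not all identical (otherwise m2 is zero).
    """
    mean = fmean(values.values())
    # Deviation of each cell from the overall mean.
    z = {cell: count - mean for cell, count in values.items()}
    # Second moment, used to scale the statistic.
    m2 = sum(d * d for d in z.values()) / len(z)

    results = {}
    for cell, nbrs in neighbors.items():
        # Spatial lag: average deviation of the cell's neighbors.
        lag = fmean(z[j] for j in nbrs)
        i_local = z[cell] * lag / m2
        # First word describes the cell, second word its neighborhood.
        # A deviation of exactly zero is labeled Low in this sketch.
        quadrant = ("High" if z[cell] > 0 else "Low") + "-" + ("High" if lag > 0 else "Low")
        results[cell] = (i_local, quadrant)
    return results


# Hypothetical counts: a, b are high; c is low next to them; f is high among lows.
values = {"a": 50, "b": 48, "c": 2, "d": 1, "e": 3, "f": 40}
neighbors = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "d": ["e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}

result = local_morans_i(values, neighbors)
for cell, (i_local, quadrant) in result.items():
    print(cell, round(i_local, 3), quadrant)
```

In this sketch, a negative value marks one of the two outlier categories, because the cell and its neighborhood sit on opposite sides of the mean. A positive value marks a High-High or Low-Low cluster.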
While enumeration units (e.g., counties, provinces, countries, zip codes) are useful, they may not always accurately represent the spatial relationships between data points. This is because many of their boundaries are arbitrary. Thus, we need to standardize them into a consistent shape. The standards for doing this are called [geospatial indexing systems](https://benfeifke.com/posts/geospatial-indexing-explained/). These can help minimize some of the effects of the [modifiable areal unit problem](https://en.wikipedia.org/wiki/Modifiable_areal_unit_problem). Popular ones are [Google’s S2](http://s2geometry.io/) and [GeoHash](https://en.wikipedia.org/wiki/Geohash). However, I have been obsessed recently with [Uber’s H3](https://www.uber.com/blog/h3/). Not only does it solve some issues with the previous ones that are not worth getting into here, but it is baked into [Foursquare Studio](https://studio.foursquare.com/), which is my GIS software of choice.

Below, you will see two maps separated by a slider. On the left side, we see H3 cells at [resolution 4](https://h3geo.org/docs/core-library/restable/) with the number of unaccompanied migrant children in each. The darker the color, the higher the number. On the right side, we see the results of the Anselin Local Moran's I analysis based on the counts in each H3 cell. A benefit of doing this is that we can also run a [permutation test](https://en.wikipedia.org/wiki/Permutation_test). Below, I have used six neighbors, one for each side of the hexagon, and 999 permutations. Furthermore, I have filtered the map to show only the cells that are [statistically significant](https://en.wikipedia.org/wiki/Statistical_significance) at an alpha of 0.05.

<iframe width="100%" height="500px" src="https://studio.foursquare.com/map/public/b5f52509-1d98-4e4f-b736-3b435b755537/embed" frameborder="0" allowfullscreen></iframe>

As you will notice, there are no outliers. This suggests that the distribution of unaccompanied migrant children across the United States is relatively consistent, with areas of high and low concentrations clustering together. Nonetheless, it is essential to note that this analysis is based on a snapshot of the data and does not account for changes over time.
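For readers curious about the permutation step behind the significance filter, here is a simplified pure-Python sketch of a conditional permutation test for local Moran's I. It is not the routine Foursquare Studio runs. The ring of ten cells, the counts, and the function name `pseudo_p_values` are hypothetical, each toy cell has two neighbors rather than six to keep the example small, and the one-sided pseudo p-value used here is only one of several conventions that tools use.

```python
import random
from statistics import fmean


def pseudo_p_values(values, neighbors, permutations=999, seed=0):
    """Return {cell: pseudo p-value} from a conditional permutation test.

    For each cell, its own value stays fixed while its neighbors are
    replaced by random draws from all the other cells.
    """
    rng = random.Random(seed)
    mean = fmean(values.values())
    z = {cell: count - mean for cell, count in values.items()}

    p_values = {}
    for cell, nbrs in neighbors.items():
        # The usual scaling constant is the same for every permutation,
        # so it is left out of the comparison.
        observed = z[cell] * fmean(z[j] for j in nbrs)
        others = [z[j] for j in z if j != cell]

        extreme = 0
        for _ in range(permutations):
            simulated = z[cell] * fmean(rng.sample(others, len(nbrs)))
            # Count draws at least as extreme, in the observed direction.
            if observed >= 0:
                extreme += simulated >= observed
            else:
                extreme += simulated <= observed
        p_values[cell] = (extreme + 1) / (permutations + 1)
    return p_values


# Hypothetical ring of ten cells: four high counts in a row, then six low ones.
counts = [40, 42, 45, 41, 2, 1, 3, 2, 1, 4]
n = len(counts)
values = {f"cell{i}": c for i, c in enumerate(counts)}
neighbors = {
    f"cell{i}": [f"cell{(i - 1) % n}", f"cell{(i + 1) % n}"] for i in range(n)
}

p_values = pseudo_p_values(values, neighbors)
significant = {cell for cell, p in p_values.items() if p <= 0.05}
print(sorted(significant))
```

With 999 permutations, the smallest possible pseudo p-value is 1/1000, which is why a count such as 999 is a common choice when filtering at an alpha of 0.05.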