Add support for Global Human Settlement Layer datasets #1

Open
travishathaway opened this issue Apr 19, 2022 · 1 comment

travishathaway commented Apr 19, 2022

The Global Human Settlement Layer (GHSL) offers a global data set of population densities. Combining it with OSM data could enable some pretty interesting analysis. I am creating this issue to document how I go about adding support for this data in this project.

Acceptance Criteria

  • The process for importing this data is clearly documented somewhere (docs for the project or even a link to a publicly available Google Doc)
  • A new report command has been created showing how to query this data and create some reports/visualizations
  • Perhaps even a new command for importing this data (optional)
@travishathaway travishathaway self-assigned this Apr 19, 2022
@travishathaway travishathaway added the enhancement New feature or request label Apr 19, 2022

travishathaway commented Apr 19, 2022

Here's the method I've come up with so far for this (written as a bash script):

# Download the whole data set first
curl -O -L https://cidportal.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_MT_GLOBE_R2019A/GHS_POP_E2015_GLOBE_R2019A_54009_250/V1-0/GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0.zip

# Unzip to preferred location (here: the current directory)
unzip GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0.zip

# Reproject data set to EPSG:3857
gdalwarp \
  -t_srs EPSG:3857 \
  GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0.tif \
  GHS_POP_tmp.tif

# Compress; the temp file we just created is about 45GB!!!
gdal_translate \
  -co compress=lzw \
  GHS_POP_tmp.tif \
  GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0-3857.tif

rm GHS_POP_tmp.tif

# Extract based on a country polygon
gdalwarp \
    -of GTiff \
    -cutline ../natural_earth_datasets/germany.geojson \
    -crop_to_cutline \
    GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0-3857.tif \
    germany_population.tiff

# Import into PostgreSQL
# This step can also be piped directly to the psql command
raster2pgsql \
  -c \
  -s 3857 \
  -t auto \
  -I \
  -M \
  germany_population.tiff \
  public.pop_data > germany_pop.sql

At this point, we are ready to begin querying our database.
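As a rough sketch of what a first query could look like: the snippet below loads the generated SQL file and sums the population over all imported tiles. The database name `osm` is a placeholder and not something from this issue. The column name `rast` is the `raster2pgsql` default, and the PostGIS raster functions are assumed to be installed in the target database.

```shell
#!/usr/bin/env bash
# Hypothetical database name; adjust to your setup.
DB_NAME="${DB_NAME:-osm}"

# Print a query that sums the per-tile population totals.
# Assumes the table public.pop_data created by raster2pgsql above,
# with the default raster column name "rast".
total_population_sql() {
  cat <<'SQL'
SELECT SUM((ST_SummaryStats(rast)).sum) AS total_population
FROM public.pop_data;
SQL
}

# Load the generated SQL file, then run the query.
# Needs a reachable PostgreSQL server with PostGIS raster support.
load_and_query() {
  psql -d "$DB_NAME" -f germany_pop.sql
  total_population_sql | psql -d "$DB_NAME"
}
```

Calling `load_and_query` runs both steps. `total_population_sql` on its own only prints the SQL, which is handy for checking it before touching the database.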

The only thing I don't like about the above solution is that reprojecting to a different SRS balloons the data set to 45GB (up from only about 500MB). I have no idea why this is happening, but the workaround seems to be to compress it again and delete the temp file.
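One possible explanation, offered as an assumption and not checked against this particular file: GDAL's GeoTIFF driver writes uncompressed output unless a creation option says otherwise, so a compressed source would grow considerably once warped. If that is the cause, passing the compression option straight to `gdalwarp` might make the temp file unnecessary. A minimal sketch, where the function name `warp_compressed` and the `DRY_RUN` switch are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: reproject and compress in one gdalwarp call (no temp file).
# Assumption: the size blow-up comes from uncompressed GeoTIFF output.
# BIGTIFF=IF_SAFER lets GDAL switch to BigTIFF if the result might
# exceed the 4GB limit of classic TIFF.
warp_compressed() {
  local src="$1" dst="$2"
  local cmd=(
    gdalwarp
    -t_srs EPSG:3857
    -co COMPRESS=LZW
    -co BIGTIFF=IF_SAFER
    "$src" "$dst"
  )
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Only show the command that would be executed.
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

For example, `DRY_RUN=1 warp_compressed GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0.tif GHS_POP_E2015_GLOBE_R2019A_54009_250_V1_0-3857.tif` prints the command without running it. Whether the resulting file ends up as small as the two-step version would need to be checked.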
