Skip to content

Latest commit

 

History

History
220 lines (178 loc) · 8.02 KB

Lab-answers.md

File metadata and controls

220 lines (178 loc) · 8.02 KB

Lab - Week 2 - More SQL and PostGIS Basics

Goal of Lab

  • Use PostGIS functions to explore properties of geometries in dataset
  • Combine PostGIS with other PostgreSQL WHERE conditions, GROUP BYs, ORDER BYs, etc.
  • Deal with the quirks of messy data
  • Get familiar with some OpenStreetMap data

Dataset

Dataset: OpenStreetMap Buildings for University City Neighborhood

  • Preview data
  • Download link (copy and paste this into the Carto upload tool): https://raw.githubusercontent.com/MUSA-509/week-2-digging-into-databases/master/data/university_city_osm_buildings.geojson

I fetched this data from GeoFabrik's wonderful OSM download service. I downloaded the Pennsylvania file and filtered to only included building footprints that land in University City in Philadelphia.

About the dataset, notice the schema:

  • cartodb_id (number, actually an integer): Carto adds this as a utility for optimizing performance of this table. It wasn't in the original dataset.
  • the_geom (geometry): Carto uses the_geom for the geometry column. In the GeoJSON it was geometry as per the GeoJSON specification.
  • osm_id (string): A unique identifier for this geometry in the OSM database
  • code (number): OSM code for type building. This are all the same in this dataset.
  • fclass (string): Description of OSM feature. All are building here.
  • name (string): Name of building. Note: many are not named
  • building_type (string): Type of building (if assigned). Notice that many are null valued.

Recommended Workflow

  1. Upload data to your Carto account
  2. Open dataset in map or data view and toggle on the SQL panel
  3. Clone this repository or download the Lab.md Markdown file
  4. Edit the file in your favorite text editor. Here are two recommendations:

Find Average Area (in square meters) of all buildings

Use the PostGIS function ST_Area. Since our geometries are in WGS84, we need to use geography type casting (the_geom::geography).

SELECT avg(ST_Area(the_geom::geography)) as avg_area_sq_m
FROM andyepenn.university_city_osm_buildings

Find Average Area (in square meters) of buildings by type

Write a query to give:

  • type of building
  • average area of building by type
  • count of buildings of this type
SELECT building_type, avg(ST_Area(the_geom::geography)) as avg_area_sq_m, count(*) as building_count
FROM andyepenn.university_city_osm_buildings
GROUP BY building_type

Which university building is largest? Smallest?

Which building is largest?

SELECT building_type, name, ST_Area(the_geom::geography) as building_area_sq_m
FROM andyepenn.university_city_osm_buildings
ORDER BY ST_Area(the_geom::geography) DESC
LIMIT 1

Which building is smallest?

SELECT building_type, name, ST_Area(the_geom::geography) as building_area_sq_m
FROM andyepenn.university_city_osm_buildings
ORDER BY ST_Area(the_geom::geography) ASC
LIMIT 1

Smallest with a building name that is not null:

SELECT building_type, name, ST_Area(the_geom::geography) as building_area_sq_m
FROM andyepenn.university_city_osm_buildings
WHERE name is not null
ORDER BY ST_Area(the_geom::geography) ASC
LIMIT 1

What is the area (in square meters) of Meyerson Hall?

SELECT building_type, name, ST_Area(the_geom::geography) as building_area_sq_m
FROM andyepenn.university_city_osm_buildings
WHERE name = 'Meyerson Hall'

Find the 10 closest buildings to Meyerson Hall

Using the ST_Distance function, find the 10 closest buildings to Meyerson Hall.

Hint: We can use ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326) for the geometry to represent Meyerson Hall.

SELECT
    building_type,
    name,
    ST_Distance(
      the_geom::geography,
      ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326)::geography
    ) as building_distance_m
FROM
    andyepenn.university_city_osm_buildings
ORDER BY
    ST_Distance(the_geom::geography, ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326)::geography) ASC
LIMIT 10

Find the 10 closest buildings to Meyerson Hall using the distance operator

PostGIS has a distance operator (<->) that allows one to do distance ordering in an ORDER BY.

SELECT
    building_type,
    name,
    ST_Distance(
      the_geom::geography,
      ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326)::geography
    ) as building_distance_m
FROM andyepenn.university_city_osm_buildings
ORDER BY the_geom::geography <-> ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326)::geography
LIMIT 10

Find the 10 closest buildings to Meyerson Hall excluding Meyerson Hall

Use the ST_Intersects function to exclude Meyerson Hall. We know that Meyerson Hall is at the point ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326). ST_Intersects returns true/false/null. To only included geometries that do not intersect, precede ST_Intersects with NOT.

SELECT
    building_type,
    name,
    ST_Distance(
      the_geom::geography,
      ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326)::geography
    ) as building_distance
FROM andyepenn.university_city_osm_buildings
WHERE name != 'Meyerson Hall'
ORDER BY the_geom::geography <-> ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326)::geography
LIMIT 10

Or use the following if we did't have a name attribute for Meyerson Hall and instead only had the geometry to uniquely define it.

SELECT
    building_type,
    name,
    ST_Distance(
      the_geom::geography,
      ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326)::geography
    ) as building_distance
FROM andyepenn.university_city_osm_buildings
WHERE Not ST_Intersects(the_geom, ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326))
ORDER BY the_geom::geography <-> ST_SetSRID(ST_MakePoint(-75.19265679, 39.9522405), 4326)::geography
LIMIT 10

Compute and Map the Convex Hull of all of the buildings

Use the ST_ConvexHull function to find the minimum convex shape that includes all of the buildings in University City. See step 5 in the Mapping Follow Along from this week's lecture.

SELECT
    ST_ConvexHull(
      ST_Collect(the_geom)
    ) as the_geom,
    ST_Transform(
      ST_ConvexHull(
        ST_Collect(the_geom)
      ),
      3857
    ) as the_geom_webmercator,
    1 as cartodb_id
FROM andyepenn.university_city_osm_buildings

Note: I added the utility columns the_geom_webmercator and cartodb_id so that we could visualize this on a map.

Finding Largest / Smallest buildings by class

This query gives us the largest building (by building footprint area) for each building class:

SELECT
  ST_Area(b.the_geom::geography) as area,
  b.name,
  b.building_type
FROM andyepenn.university_city_osm_buildings as b
JOIN (
  SELECT max(ST_Area(the_geom::geography)) as maxarea, building_type
  FROM andyepenn.university_city_osm_buildings
  GROUP BY building_type
  ) as m
ON m.building_type = b.building_type AND ST_Area(b.the_geom::geography) = m.maxarea

Modify this query to give the smallest building for each class instead:

SELECT
  ST_Area(b.the_geom::geography) as area,
  b.name,
  b.building_type
FROM andyepenn.university_city_osm_buildings as b
JOIN (
  SELECT min(ST_Area(the_geom::geography)) as maxarea, building_type
  FROM andyepenn.university_city_osm_buildings
  GROUP BY building_type
  ) as m
ON m.building_type = b.building_type AND ST_Area(b.the_geom::geography) = m.maxarea

Note: This is using a new keyword called "JOIN" that we will learn more about next week.