From 1d28344fda0ef785c0388517ac39f27f718db78b Mon Sep 17 00:00:00 2001
From: spwoodcock
Date: Tue, 3 Sep 2024 08:20:50 +0100
Subject: [PATCH] docs: add extra info on importer/updater plans

---
 importer/README.md | 46 ++++++++++++++++++++++++++++++++++++++++++++--
 updater/README.md  | 12 ++++++++++--
 2 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/importer/README.md b/importer/README.md
index 033536b..4e07db3 100644
--- a/importer/README.md
+++ b/importer/README.md
@@ -1,5 +1,47 @@
 # OSM Importer Service
 
-Import data into a fresh OSM Sanbox instance.
+Import data into a fresh OSM Sandbox instance.
 
-Uses Geofabrik country data, then filters down to the users required BBOX.
+Method:
+ 1. User provides BBOX to download data for.
+    - First we do a simple calculation to get centroid from BBOX.
+    - https://nominatim.org/release-docs/latest/api/Reverse
+ 2. (Optional) reverse geocode the country name from BBOX area.
+ 3. Download latest country data using GeoFabrik.
+ 4. Filter data using `osmium` BBOX functionality.
+ 5. Import the BBOX data into the sandbox db using `osmosis`.
+
+> [!NOTE]
+> While `osmium` is the most performant and best maintained tool
+> for dealing with OSM data, it does not support importing into
+> an OSM-type database (dbapi).
+>
+> Its primary purpose is for importing into an alternative
+> PostGIS database for data analysis, using PostGIS representations
+> of each geometry (the OSM db does not use PostGIS).
+>
+> As a result, the only available tool for importing into dbapi
+> format is `osmosis`, a now deprecated Java tool.
+
+## Work Modes
+
+### Option 1: Startup
+
+- Each osm-sandbox instance is throwaway.
+- The user starts sandbox with a bbox, the data is populated.
+- The mapping concludes, data is extracted, and the sandbox deleted.
+
+### Option 2: Triggered
+
+- We run one osm-sandbox instance.
+- The user triggers import for an AOI.
+- The data is imported using the workflow above.
+
+## Updating Data
+
+See the `updater` section of this repo.
+
+## Future
+
+- This is a test service to demo different approaches.
+- The end goal is to contribute to developmentseed/osm-seed.
diff --git a/updater/README.md b/updater/README.md
index 92b1ded..370fbe5 100644
--- a/updater/README.md
+++ b/updater/README.md
@@ -2,5 +2,13 @@
 
 Update the data in an existing OSM Sandbox instance with latest OSM data.
 
-Uses the `.osc` daily diff files provided by OSM, and filters down to the
-users required BBOX.
+- After initial load, the user may want to update the data in sandbox.
+- To sync with the current OSM database, we need to use replication data.
+- This is made available on OSM at intervals: minute, hour, day.
+  - https://wiki.openstreetmap.org/wiki/Planet.osm/diffs
+  - E.g. https://planet.osm.org/replication/day/000/004/
+- We download the daily `.osc` diff files provided by OSM.
+  - We must download each file since the date of first data import.
+  - The diffs can be filtered by BBOX using `osmium`.
+- Then the actual data import / applying of data in the db
+  will likely be done by `osmosis`.
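
Note on the updater plan: "download each file since the date of first data import" amounts to walking a range of replication sequence numbers. The sketch below is illustrative only and is not part of this patch. It assumes the daily diffs follow the nine-digit `AAA/BBB/CCC.osc.gz` layout seen in the example URL above, and that the caller already knows the first and last sequence numbers. The names `DAY_REPLICATION_URL`, `sequence_to_path` and `diff_urls` are hypothetical.

```python
from typing import Iterator

# Assumption: base URL of the daily replication directory, taken from the
# example link in updater/README.md.
DAY_REPLICATION_URL = "https://planet.osm.org/replication/day"


def sequence_to_path(sequence: int) -> str:
    """Map a replication sequence number to its AAA/BBB/CCC path."""
    if not 0 <= sequence <= 999_999_999:
        raise ValueError("sequence number must fit in nine digits")
    # Zero-pad to nine digits, then split into three groups of three.
    padded = f"{sequence:09d}"
    return f"{padded[0:3]}/{padded[3:6]}/{padded[6:9]}"


def diff_urls(first_sequence: int, last_sequence: int) -> Iterator[str]:
    """Yield the .osc.gz URL of every daily diff in the inclusive range."""
    for sequence in range(first_sequence, last_sequence + 1):
        yield f"{DAY_REPLICATION_URL}/{sequence_to_path(sequence)}.osc.gz"


if __name__ == "__main__":
    # Hypothetical range: three consecutive daily diffs.
    for url in diff_urls(4123, 4125):
        print(url)
```

The sequence number matching the date of the first import would still have to be looked up, for example from the `state.txt` files published alongside the diffs. Filtering each diff by BBOX with `osmium` and applying it with `osmosis` are left out here.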