lookup-osm-wikidata

This tool finds translations of an OSM feature's name in various languages from its corresponding Wikidata item, given the feature's ID and type.

Setting up the tool:

The required Python dependencies are listed in requirements.txt. Run the following command to install them.

pip install -r requirements.txt

Note: The tool needs a Python version higher than 2.6. When setting up on a Linux machine, you can use the install_python_dev_linux.sh script to install the dev version of Python 2.7.

This repository has two scripts that help match OSM features to Wikidata and fetch metadata, such as translations, from the Wikidata API.

osm.py

osm.py queries Mapbox's Dynamosm API to fetch the following properties for an OSM feature:

  • wikidata - Wikidata ID of the feature.
  • name
  • name:en
  • wikipedia - Wikipedia reference of the feature.
  • geometry
  • optional properties - any additional OSM properties requested by the user.

Input

A CSV file with the columns osm_id and osm_type.

Output

The output is a CSV file called output.csv, which contains the rows of input.csv with the additional properties appended. It also includes the following log information, which can be useful for debugging:

osm:logs:

  • Success: At least one of the wikidata or wikipedia tags is present on the OSM feature
  • No wikidata/wikipedia: Neither a wikidata nor a wikipedia tag is present
  • Dynamosm request failure: The API request to query OSM failed
  • No OSM Id/ Type: osm_id or osm_type is missing from the input
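The four log values above can be read as a simple decision order. The following is a minimal sketch of that order, not the code in osm.py: the function name `osm_log_status`, its parameters, and the order in which the checks run are assumptions made for illustration.

```python
def osm_log_status(row, tags, request_ok=True):
    """Return an osm:logs value for one input row.

    row        -- dict for one input CSV row (expects osm_id and osm_type)
    tags       -- dict of OSM tags returned for the feature (hypothetical shape)
    request_ok -- False if the API request for this feature failed
    """
    # The input row must identify the feature before anything can be queried.
    if not row.get("osm_id") or not row.get("osm_type"):
        return "No OSM Id/ Type"
    # A failed request means no tags are available to inspect.
    if not request_ok:
        return "Dynamosm request failure"
    # One of the two tags is enough to count as a success.
    if tags.get("wikidata") or tags.get("wikipedia"):
        return "Success"
    return "No wikidata/wikipedia"
```

Rows logged as "Success" are the ones wiki.py can process further, since they carry a wikidata or wikipedia reference.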

Example syntax

python osm.py name:zh name:es postal_code population

wiki.py

wiki.py queries the Wikipedia API to fetch the Wikidata ID for items that don't have osm:wikidata. It then queries the Wikidata API to fetch the required language translations.

Input

A CSV file with the columns osm:wikidata, osm:wikipedia and osm:geometry.

Output

The output is a CSV file called finalOutput.csv, which appends the following columns to the input file: wiki:wikidata, wiki:Distance, wiki:label:languageCode (one column per requested language code) and wiki:logs.

wiki:Distance

This is the distance in kilometers between the OSM feature and the corresponding Wikidata item, which can be used to validate the match. A higher distance indicates a potential discrepancy.
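This README does not say how the distance is computed. One common way to get a distance in kilometers between two latitude/longitude points is the haversine formula, sketched below. The function name `haversine_km` and the mean Earth radius of 6371 km are assumptions, and wiki.py may use a different method.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

For validation, a small value suggests the OSM feature and the Wikidata item describe the same place. What counts as too far depends on the feature type, since a large region can legitimately have its coordinates far from the OSM geometry.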

wiki:label:languageCode

Each of these columns contains the translation for the corresponding language code.

wiki:logs

  • languageCode Present: A translation for this language code is present in Wikidata
  • No languageCode label: A translation for this language code is not present in Wikidata
  • Wiki API Error: The API request to query Wikidata failed
  • No wikidata: Neither osm:wikidata nor wiki:wikidata is present
  • Wikipedia error: The API request to query Wikipedia failed
  • No wikidata / wikipedia: Neither osm:wikidata nor osm:wikipedia is present
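As an illustration of how the label columns and the per-language log entries relate, the sketch below fills them from a Wikidata entity. It assumes the entity has the shape returned by the Wikidata `wbgetentities` API (a `labels` dict keyed by language code, each entry holding a `value`). The function name `label_columns` and the comma separator used in wiki:logs are hypothetical, not taken from wiki.py.

```python
def label_columns(entity, language_codes):
    """Build wiki:label:* columns and a wiki:logs entry for one entity."""
    labels = entity.get("labels", {})
    row = {}
    logs = []
    for code in language_codes:
        column = "wiki:label:" + code
        if code in labels:
            row[column] = labels[code]["value"]
            logs.append(code + " Present")
        else:
            # Leave the column empty and record the missing translation.
            row[column] = ""
            logs.append("No " + code + " label")
    row["wiki:logs"] = ", ".join(logs)  # separator is an assumption
    return row


# Hand-written sample entity for illustration only.
sample_entity = {"labels": {"es": {"language": "es", "value": "Londres"}}}
result = label_columns(sample_entity, ["es", "zh"])
```

Here `result` has a filled wiki:label:es column, an empty wiki:label:zh column, and a wiki:logs value that records one present and one missing label.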

Example

python wiki.py zh zh-hans