Python module for the automated and metadata-based collection, ingestion and formatting of raw EU data from national providers.
documentation | available at: ... |
status | since 2020 – under construction |
contributors | |
license | EUPL |
Quick install and start
Once installed, the module can be imported simply:
>>> import pyeudatnat
Notebook examples
- A basic example on healthcare services to get started with the module.
- ...
Usage
You first need to create a dedicated class from the metadata associated with the national data:
>>> from pyeudatnat import base
>>> NewDataCategory = base.datnatFactory(cat = 'new')
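To illustrate the general idea of a metadata-driven class factory, here is a minimal sketch. It is not the implementation of `base.datnatFactory`: the names `BaseCategory`, `METADATA` and `make_category_class`, as well as the structure of the metadata, are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch of a metadata-driven class factory.
# None of these names come from pyeudatnat itself.

class BaseCategory:
    """Minimal base class holding category-level metadata."""
    category = None
    fields = ()

    def describe(self):
        return f"{self.category}: {', '.join(self.fields)}"


# Illustrative metadata, keyed by category name (assumed structure).
METADATA = {
    "new": {"fields": ("id", "name", "country")},
}


def make_category_class(cat, metadata=METADATA):
    """Build a new class for the category `cat` from its metadata."""
    if cat not in metadata:
        raise KeyError(f"no metadata registered for category {cat!r}")
    attrs = {"category": cat, "fields": tuple(metadata[cat]["fields"])}
    # type(name, bases, namespace) creates the class dynamically.
    return type(f"{cat.capitalize()}Category", (BaseCategory,), attrs)


NewCategory = make_category_class("new")
instance = NewCategory()
print(instance.describe())
```

Building the class with `type()` keeps one generic base class while letting each data category carry its own metadata as class attributes.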
Next, it is pretty straightforward to create an instance of a national dataset:
>>> datnat = NewDataCategory()
>>> datnat.load_data()
>>> datnat.format_data()
>>> datnat.save_data(fmt = 'csv')
Note that the output schema (see also "attributes" in the documentation below) should be defined externally, e.g. in a separate config.py file.
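As a rough illustration of keeping the output schema outside the processing code, the sketch below assumes a schema expressed as a plain dictionary mapping output columns to raw column names. The layout of `OUTPUT_SCHEMA`, the column names and the helpers `format_records` and `to_csv` are assumptions made for this example, not the module's actual configuration format.

```python
import csv
import io

# Hypothetical content of an external config.py: the output schema maps
# each output column to the name used in the raw national data.
OUTPUT_SCHEMA = {
    "id": "ID",
    "name": "Nom",
    "city": "Ville",
}


def format_records(records, schema=OUTPUT_SCHEMA):
    """Rename raw columns to the output schema; missing values become ''."""
    return [
        {out: rec.get(raw, "") for out, raw in schema.items()}
        for rec in records
    ]


def to_csv(rows, schema=OUTPUT_SCHEMA):
    """Serialise formatted rows to CSV text, using the schema's column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(schema), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Raw record with one column ("extra") that is not part of the schema.
raw = [{"ID": "1", "Nom": "Clinique A", "Ville": "Lyon", "extra": "x"}]
formatted = format_records(raw)
csv_text = to_csv(formatted)
print(csv_text)
```

With the schema held in one place, columns not listed in it are dropped and the output column order follows the schema rather than the raw data.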
- Various geocoding options, including GISCO. The default coder is GISCO, but you can also use a different geocoder with an appropriate key.
- Automated translation,
- ...
Software resources/dependencies
- Packages for data handling: pandas.
- Packages for geocoding: geopy, pyproj and happygisco.
- Package for JSON formatting: geojson.
- Package for translations: googletrans.