- Make requirement for ckanext-harvest optional (#10)
- Adds new migration script option `contributor-id-migrate` to add the contributorID to existing manually maintained datasets
- Adds SHACL validation support to the triplestore ckan command
- Introduce the possibility to validate the dataset graph with SHACL when updating the dataset and to save the validation result in a triplestore
- Handle requests exceptions if the triplestore endpoint is not reachable
- Also deletes data in triplestore when dataset is deleted in CKAN
- Adds the triplestore ckan command
- Improve logging messages in duplicate detection
- Improve logging when updating data in the triplestore
- Remove pinning version for cryptography dependency. Version >=3.3.1 is working again.
- Improve exception handling when updating data in the triplestore
- Pin version for cryptography dependency to avoid build errors with version >=3.3
- Implemented: Add harvested data into a triplestore
- Avoid crashes of the fetch consumer in case deletion harvest objects are corrupted
- Fixed problem with python dependency 'pycountry' that caused the build to fail.
- When remote datasets without resources/distributions are rejected (`resources_required`), any local version of the dataset is deleted if present.
- Fix line endings to match .gitattributes
- Fix harvester plugin docs (#11)
- Update requirement ckanext-dcat to version 1.1.0
- Catch exception if 'email-validator' is not available in older CKAN versions
- Remove patch disabling SSL verification for older Python 2.7 versions
- Adds support for the different VCARD representations for DCAT.contactPoint
- Update version for requirements ckanext-harvest and ckanext-dcat
- Remove the restriction to a specific version of CKAN
- Fix in RDF profile: Remove prefix "mailto:" from values in fields containing an email address in method parse_dataset
- Change in DCAT-AP.de RDF harvester: Remove validator 'email-validator' from create/update package schema
- Improve logic of the duplicate detection and add deletion of older duplicates within the duplicate detection
- Map older licenses in resources from DCAT-AP.de version v1.0 to the latest version v1.0.2
- Improve comparing dates with and without time zone information used by the duplicate detection
- Add different implementation for cleaning tags/keywords
- Add harvest source configuration `resources_required`, which logs and skips all datasets without distributions (CKAN resources)
- Fix possible error in logging message when setting default license
- Add support for class FOAF.Agent as rdf:type in dcatde:originator, dcatde:maintainer, dct:contributor and dct:creator
- Set default license (`http://dcat-ap.de/def/licenses/other-closed`) in the resources of a dataset if no license is provided and write a log entry with additional information about the harvest source, dataset and resource at the info level. Introduce configuration parameter `ckanext.dcatde.harvest.default_license` for defining the default license.
- Serialize dcatde:contributorID as type UriRef if the value is a URI, otherwise as Literal
- Rename environment names for internal ci/cd pipeline
- Update ckanext-dcat to v0.0.9
- Update ckanext-harvest to v1.1.4
- Remove patches (Fixes #6)
- Delete requirements subfolder which contained pre-built wheels
- Add supervisor config for harvesting `gather_consumer` and `fetch_consumer`
- Add cronjob scripts to run and clear harvest jobs. These scripts are used with GovData and were previously included in ckanext-govdatade.
- Add support for dct:type in dcatde:originator, dcatde:maintainer, dct:contributor and dct:creator
- The profile and examples now use the DCAT-AP.de v1.0.1 Namespace
- Renamed `legalbasisText` to `legalBasis` and `geocodingText` to `geocodingDescription`
- Added logic to parse older DCAT-AP.de Namespaces
- Improved dct:format and dcat:mediaType handling
- Improved selecting of the default language
- Fix problem with not deleting metadata without guid while harvesting
- Fix handling of downloadURL and accessURL
- Select title, description and names in the default language if available
- Fix error in graph_from_dataset() if no contactPoint exists in the graph
- Updated the examples for the licenses in CKAN and the license mapping to DCAT-AP.de v1.0.1
- Updated the example for the RDF endpoint to DCAT-AP.de v1.0.1
- Added patch for DCAT harvester so that it uses the default `_get_user_name` logic of ckanext-harvest
- Added patch for ckanext-harvest so that the default dataset name suffix is configurable
- OGD `metadata_original_id` is now mapped to `dct:identifier` instead of `adms:identifier`
- Added new migration script option `adms-id-migrate` to fix existing DCAT-AP.de datasets
- Correctly set `metadata_harvested_portal` for the custom RDF Harvester
- Avoid an invalid RDF graph caused by whitespace in URIRef values by removing whitespace before adding URIRef objects to the graph
- Added DCAT-AP.de specific RDF Harvester
- Added the dependency on ckanext-harvest
- Harmonized the version between the other CKAN-Plugins of GovData
- Initial version of the CKAN plugin
- Extends the output mapping with the DCAT-AP.de specific fields
- Contains a script (CKAN paster command) to create CKAN groups from the DCAT-AP categories
- Contains a migration script (CKAN paster command) to migrate the datasets in the CKAN database from OGD to DCAT-AP.de structure
- Contains a shell script to purge the CKAN groups representing the OGD categories