This project is a prototype for crowdsourcing the quality assessment of linked data. New tasks for the crowd are created in a streaming manner, rather than in the batches typically used in such scenarios. The prototype solves one specific problem and will be generalized later.
The first use case is the deduplication of resources representing bike-racks in the Trento area. The Trento administration would like to know how many bike-racks exist. The discovery of bike-racks is crowdsourced. When a crowdworker creates a new entry for a bike-rack, a unique-id-service checks this resource against the existing resources. If that service is not confident which existing resource the bike-rack corresponds to, it sends a request to a web service that this project implements. The payload of the request consists of the ID of the bike-rack resource and a list of candidates that could be the same resource.
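As an illustration of the request payload described above, here is a minimal sketch in Python. The JSON field names (`id`, `candidates`), the class name, and the example IDs are assumptions made for this example and are not taken from the project's actual interface.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LinkEvaluationRequest:
    """Hypothetical shape of the request sent by the unique-id-service."""
    resource_id: str       # ID of the newly created bike-rack resource
    candidates: List[str]  # IDs of existing resources that might be the same rack


def parse_request(body: str) -> LinkEvaluationRequest:
    """Parse a JSON request body; the field names are assumptions."""
    data = json.loads(body)
    candidates = list(data["candidates"])
    if not candidates:
        # Without candidates there is nothing for the crowd to compare.
        raise ValueError("a link evaluation request needs at least one candidate")
    return LinkEvaluationRequest(resource_id=data["id"], candidates=candidates)


example = parse_request('{"id": "rack-42", "candidates": ["rack-7", "rack-13"]}')
print(example.resource_id, example.candidates)
```

Rejecting an empty candidate list up front is a design choice of this sketch: a request without candidates would produce a task the crowdworker cannot answer.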
The resources are enriched with properties from a SPARQL endpoint, which enables the crowdworker to decide which resources are identical. The service then creates a Pybossa task through the Pybossa RESTful API. The following image shows how an enriched task is presented to a crowdworker:
- Install the Pybossa development server.
- Install sbt 1.0.
- Clone the repository.
- Define the environment in application.conf.

cd rdf-crowdsourced-quality-checker/bike-rack-linker
sbt test
The properties are queried with the Java-RDF Mapper module.
- to Broker
- linkEvaluationRequest
- publishResult
- to Quadstore
- queryProperties
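The queryProperties call and the Pybossa task creation described earlier could be sketched roughly as follows. This is a Python illustration under stated assumptions, not the project's implementation (which builds with sbt and queries through the Java-RDF Mapper module). The function names, the `resource` and `candidates` keys inside `info`, the example URIs, and the base URL are hypothetical. The `POST /api/task` endpoint with `project_id` and `info` fields follows the Pybossa REST API.

```python
import json
import urllib.request


def build_properties_query(resource_uri: str) -> str:
    """Return a SPARQL query selecting every property/value pair of one resource."""
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in resource_uri for ch in '<>"{}|^`\\ '):
        raise ValueError("character not allowed in an IRI: " + resource_uri)
    return "SELECT ?property ?value WHERE { <" + resource_uri + "> ?property ?value }"


def build_task_request(base_url, api_key, project_id, resource, candidates):
    """Build (but do not send) the POST request that creates one Pybossa task."""
    payload = {
        "project_id": project_id,
        # The structure of "info" is free-form in Pybossa; these keys are assumptions.
        "info": {"resource": resource, "candidates": candidates},
    }
    return urllib.request.Request(
        url=f"{base_url}/api/task?api_key={api_key}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


query = build_properties_query("http://example.org/bike-rack/42")
request = build_task_request(
    "http://localhost:5000",  # assumed address of a local Pybossa development server
    "YOUR_API_KEY",
    1,
    {"uri": "http://example.org/bike-rack/42", "properties": {"capacity": "10"}},
    [{"uri": "http://example.org/bike-rack/7", "properties": {"capacity": "8"}}],
)
print(query)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. The sketch stops short of that so it can be run without a Pybossa server.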
todo
Are on Trello
- Sequence Diagram
- Pybossa
- tombatossals (David Rubert)
- Bike-Racks in Italy Ckan
- LinkedGeoData mappings
- OpenStreetMap bicycle amenity
- GeoFabrik Trento area
- Forbidden characters - Fiware-Orion
- amaxilat/orion-client: Java Client for the Orion Context Broker Publish/Subscribe Context Broker GE
- fiware/orion - Docker Hub
Pybossa won't start after halt:
sudo -u redis /usr/bin/redis-server /etc/redis/sentinel.conf --sentinel