CARML is an implementation of the RML mapping specification, with extensions to process streams. It can be used to convert non-RDF data like XML, JSON or CSV to RDF.
This project creates a web service around the CARML RML Engine. This facilitates using carml as a mapping engine from non-Java/JVM projects. Via the HTTP API, one can send mappings and sources with a POST to the service and get the resulting triples back.
At Zazuko, we use the service to scale RDF conversion of millions of XML files by integrating the carml service in our linked data pipelining framework barnard59. The step implementing this service can be found here.
If you are looking for a command-line tool, you might want to check out carml-jar.
This project provides two flavors:
- a WAR to use in a stock Tomcat
- a stand-alone service which uses Apache Meecrowave
To build this project you need a standard Maven setup:
mvn clean package
This generates both the Meecrowave bundle and the drop-in WAR.
The results are available in:
war/target/war-1.0.0-SNAPSHOT.war
service/target/meecrowave-meecrowave-distribution.zip
Copy the WAR into the Tomcat webapps directory. The zip distribution contains a Meecrowave instance that can be started with bin/meecrowave.sh run
The WAR has a test endpoint at service/test; the Meecrowave instance has its test endpoint at /test
The service at / (Meecrowave) or /service/ (WAR) expects a multipart/form-data POST with the following fields:
- mapping: a Turtle-based R2RML mapping file
- source: the source file; the supported formats are XML, CSV and JSON, indicated by the content type
Headers
- The service supports content negotiation through the Accept header to determine the result format. If none is provided, it returns text/turtle.
To process a mapping from the command line the following curl command can be used:
curl -F mapping=@mapping.ttl -F source=@source.xml -H "Accept: text/turtle" http://localhost:8080/
Where:
- mapping.ttl is a valid R2RML mapping file
- source.xml is an XML file that is described by the mapping.ttl
- text/turtle is the requested output format
- http://localhost:8080 is the URI where the service is listening
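For clients that cannot shell out to curl, the same request can be assembled by hand. The following is a minimal Python sketch using only the standard library. The helper names encode_multipart and build_request, the part filenames, and the part content types (text/turtle for the mapping, whatever the caller passes for the source) are assumptions for illustration, not values confirmed by the service.

```python
import urllib.request
import uuid


def encode_multipart(parts):
    """Encode (name, filename, content_type, data) tuples as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    body = bytearray()
    for name, filename, content_type, data in parts:
        body += f"--{boundary}\r\n".encode("ascii")
        body += (
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'
        ).encode("utf-8")
        body += f"Content-Type: {content_type}\r\n\r\n".encode("ascii")
        body += data
        body += b"\r\n"
    # The closing delimiter carries two trailing dashes.
    body += f"--{boundary}--\r\n".encode("ascii")
    return bytes(body), f"multipart/form-data; boundary={boundary}"


def build_request(url, mapping, source, source_type, accept="text/turtle"):
    """Build (but do not send) a POST carrying the mapping and source fields."""
    body, content_type = encode_multipart([
        # Assumed filenames and mapping content type; only the field
        # names "mapping" and "source" come from the service description.
        ("mapping", "mapping.ttl", "text/turtle", mapping),
        ("source", "source", source_type, source),
    ])
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "Accept": accept},
    )
```

Passing the returned object to urllib.request.urlopen sends the request. Note that urlopen raises urllib.error.HTTPError for a 400 response, and the error body can be read from that exception with read().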
The service returns either an RDF file in the requested format with a 200 OK status code, or an error report following Problem Details for HTTP APIs with a 400 status code.
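A client therefore has two outcomes to handle. The sketch below shows one way to do that in Python. It assumes the error report uses the JSON serialization of Problem Details (with the standard title and detail members) and that the RDF body is UTF-8 text. The names MappingFailed and interpret_response are hypothetical.

```python
import json


class MappingFailed(Exception):
    """Raised when the service answers with a Problem Details report."""

    def __init__(self, title, detail=None):
        super().__init__(f"{title}: {detail}" if detail else title)
        self.title = title
        self.detail = detail


def interpret_response(status, body):
    """Return the RDF text for a 200 response, raise MappingFailed for a 400."""
    if status == 200:
        # Assumption: the requested RDF serialization is UTF-8 text.
        return body.decode("utf-8")
    if status == 400:
        # Assumption: the problem report is JSON; "title" and "detail"
        # are optional members defined by Problem Details for HTTP APIs.
        problem = json.loads(body.decode("utf-8"))
        raise MappingFailed(problem.get("title", "Bad Request"), problem.get("detail"))
    raise RuntimeError(f"unexpected status code {status}")
```

Keeping the status handling in a pure function like this makes it independent of the HTTP client in use, so it can be exercised without a running service.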
The RML spec supports file-based sources by default, and CARML extends this to use streams. This service expects a logical source that declares a stream named 'stdin'.
Example:
PREFIX rml: <http://semweb.mmlab.be/ns/rml#>
PREFIX carml: <http://carml.taxonic.com/carml/>
PREFIX rr: <http://www.w3.org/ns/r2rml#>
PREFIX ql: <http://semweb.mmlab.be/ns/ql#>
<#person>
  a rr:TriplesMap;
  rml:logicalSource [
    rml:source [
      a carml:Stream;
      carml:streamName "stdin"
    ];
    rml:referenceFormulation ql:JSONPath;
    rml:iterator "$.characters[*]"
  ].
If you are using the XRM plugin, set the mapping output to carml and use stdin instead of file names. The plugin will produce this mapping for you.