RDF Format
- InputFormat: com.thinkaurelius.faunus.formats.edgelist.rdf.RDFInputFormat
The Semantic Web community is one of the original promoters of the graph as an approach to data modeling. Their efforts have led to the development of the RDF format. While there are many RDF formats, an RDF file is (conceptually) composed of triples whereby a subject is connected to an object by a predicate. For instance:
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .
In this way, RDF is an edge list format. Faunus, on the other hand, makes use of an adjacency list in its representation. As such, the RDFInputFormat provided by Faunus is a MapReduce job that converts an edge list into an adjacency list.
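The edge-list-to-adjacency-list conversion described above can be illustrated with a minimal in-memory sketch. This is not the Faunus MapReduce job; the shortened names, the extra sample triples, and the choice to record both incoming and outgoing edges per vertex are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical edge list: one (subject, predicate, object) triple per edge.
triples = [
    ("hercules", "father", "jupiter"),
    ("hercules", "mother", "alcmene"),
    ("jupiter", "brother", "neptune"),
]


def to_adjacency_list(edges):
    """Group edges by vertex so that each vertex carries its incident edges."""
    adjacency = defaultdict(lambda: {"out": [], "in": []})
    for subject, predicate, obj in edges:
        # Assumption: each edge is recorded on both of its endpoints.
        adjacency[subject]["out"].append((predicate, obj))
        adjacency[obj]["in"].append((predicate, subject))
    return dict(adjacency)


adjacency = to_adjacency_list(triples)
for vertex, incident in adjacency.items():
    print(vertex, incident)
```

In an edge list, the edges of a single vertex may be scattered anywhere in the file; the grouping step is what brings them together under one vertex record.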
faunus.input.format.rdf.format
There are numerous RDF formats. Faunus currently supports the following formats.
- rdf-xml
- n-triples
- turtle
- n3
- trix
- trig
NOTE: Faunus makes use of LineRecordReader to read statements from an RDF file. If a line (\n) does not contain a complete legal RDF fragment, then an exception is thrown by the RDF parser.
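The following sketch shows why the line-based constraint in the note matters. It is not the parser Faunus uses; the regular expression is a simplified, assumed shape for a single N-Triples statement, and incomplete_lines is a hypothetical helper that reports which lines would not stand alone as a complete statement.

```python
import re

# Simplified shape of one complete statement: subject, predicate, object, final dot.
# Assumption: this covers only URI and plain/typed literal objects, not full N-Triples.
STATEMENT = re.compile(
    r'^<[^>]+>\s+<[^>]+>\s+(<[^>]+>|"[^"]*"(\^\^<[^>]+>)?)\s*\.$'
)


def incomplete_lines(text):
    """Return 1-based numbers of non-empty lines that are not a complete statement."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if line.strip() and not STATEMENT.match(line.strip())
    ]


one_per_line = (
    "<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .\n"
    '<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#age> "32"^^<http://www.w3.org/2001/XMLSchema#int> .\n'
)
wrapped = (
    "<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father>\n"
    "    <http://thinkaurelius.com#jupiter> .\n"
)

print(incomplete_lines(one_per_line))
print(incomplete_lines(wrapped))
```

The second input is the same statement wrapped across two lines; read one line at a time, neither half is a complete statement.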
faunus.input.format.rdf.literal-as-property
There are two types of triples to be aware of: one in which a URI connects to a URI, and one in which a URI connects to a literal.
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#age> "32"^^<http://www.w3.org/2001/XMLSchema#int> .
If the above Faunus property is set to true, then the Hercules vertex has an age property with an integer value of 32.
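As a rough illustration of the behavior when the property is true, the sketch below turns the URI-to-literal triple above into a vertex property. The regular expression and the literal_property helper are hypothetical and handle only the triple shape shown here; they are not the Faunus implementation.

```python
import re

# Matches a URI-to-literal triple with an optional datatype (simplified, assumed shape).
LITERAL_TRIPLE = re.compile(
    r'^<([^>]+)>\s+<([^>]+)>\s+"([^"]*)"(?:\^\^<([^>]+)>)?\s*\.$'
)
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"


def literal_property(line):
    """Return (subject, predicate, value) for a URI-to-literal triple, else None."""
    match = LITERAL_TRIPLE.match(line.strip())
    if match is None:
        return None
    subject, predicate, lexical, datatype = match.groups()
    # Only xsd:int is converted in this sketch; other literals stay as strings.
    value = int(lexical) if datatype == XSD_INT else lexical
    return subject, predicate, value


line = (
    "<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#age> "
    '"32"^^<http://www.w3.org/2001/XMLSchema#int> .'
)
subject, key, value = literal_property(line)
vertex = {"uri": subject, key: value}
print(vertex)
```

A URI-to-URI triple does not match the pattern, so the helper returns None for it and the triple would be handled as an edge instead.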
faunus.input.format.rdf.use-localname
The theoretically infinite RDF graph is embedded within the infinite address space of URIs. To leverage this infinite space, a vertex is specified using a URI. In many situations, the full URI is not required and as such, if the above property is set to true, then
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .
generates vertices with names hercules and jupiter connected by a father edge.
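A minimal sketch of local-name extraction follows. The rule used here (take everything after the last '#', or after the last '/' when there is no '#') is an assumption for illustration; the exact splitting rule Faunus applies is not stated in this document.

```python
def local_name(uri):
    """Return the part of a URI after the last '#', else after the last '/'."""
    # Assumed rule: prefer a fragment separator, then fall back to a path separator.
    for separator in ("#", "/"):
        if separator in uri:
            return uri.rsplit(separator, 1)[1]
    return uri


triple = (
    "http://thinkaurelius.com#hercules",
    "http://thinkaurelius.com#father",
    "http://thinkaurelius.com#jupiter",
)
short_triple = tuple(local_name(term) for term in triple)
print(short_triple)
```

Shortening names this way trades global uniqueness for readability: two different URIs that share a local name would collapse to the same name.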
faunus.input.format.rdf.as-properties
RDF is a triple format. As such, there are no properties, only vertices and edges. In some situations, an object URI should be treated as a property of the vertex. For instance, when http://www.w3.org/1999/02/22-rdf-syntax-ns#type is specified in the String list of the property above, then the triple
<http://thinkaurelius.com#hercules> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://thinkaurelius.com#demigod> .
yields a Hercules vertex with a type property whose value is demigod. A typical setting for this property is below.
faunus.input.format.rdf.as-properties=http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2000/01/rdf-schema#label
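The effect of this setting can be sketched as follows. The apply_triple helper and the dictionary-based vertex store are hypothetical; the sketch keeps full URIs and only shows how a comma-separated list of predicates might decide whether a triple becomes a property or an edge.

```python
# The comma-separated value from the setting above.
AS_PROPERTIES = (
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type,"
    "http://www.w3.org/2000/01/rdf-schema#label"
)
property_predicates = set(AS_PROPERTIES.split(","))


def apply_triple(subject, predicate, obj, vertices, edges):
    """Store the triple as a vertex property if its predicate is listed, else as an edge."""
    vertex = vertices.setdefault(subject, {})
    if predicate in property_predicates:
        vertex[predicate] = obj
    else:
        # Unlisted predicates keep the object as its own vertex joined by an edge.
        vertices.setdefault(obj, {})
        edges.append((subject, predicate, obj))


vertices, edges = {}, []
apply_triple(
    "http://thinkaurelius.com#hercules",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://thinkaurelius.com#demigod",
    vertices,
    edges,
)
apply_triple(
    "http://thinkaurelius.com#hercules",
    "http://thinkaurelius.com#father",
    "http://thinkaurelius.com#jupiter",
    vertices,
    edges,
)
print(vertices)
print(edges)
```

With this input, the type triple adds a property to the Hercules vertex and creates no demigod vertex, while the father triple still produces an edge to Jupiter.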