Supplementary Materials for our ESWC 2017 Submission

This repository contains the supplementary materials used to reproduce the experiments submitted to the 14th Extended Semantic Web Conference (ESWC 2017).

NB: This repository is in progress – Please check back later.

Dataset

The dataset used in our experiments consists of open-access computer science articles retrieved from the 2016 CORE dataset dump.

GATE Pipeline

This is the text mining pipeline used to extract relevant information from the dataset. The pipeline requires Java version 8.0 or later and GATE version 3.0 or later.

When the pipeline loads successfully, you should see the following sequence of processing resources in the GATE Developer interface:

License

The text mining pipeline is distributed under the terms of the GNU LGPL v3.0. You can find a copy of the license in the pipeline folder.

Knowledge Base

The provided knowledge base contains all the extracted entities from our dataset of 100 computer science articles. The semantic triples are expressed using the Resource Description Framework (RDF) syntax.

Vocabularies

The triples in our knowledge base use a number of (linked open) vocabularies:

Vocabulary/Ontology	Namespace in KB	Description	URL
Publication Ontology (PUBO)	pubo	Used to express the relations between documents and their annotations, as well as the inter-relations among them.	http://lod.semanticsoftware.info/pubo/pubo
SALT Rhetorical Ontology (SALT)	sro	Used for describing rhetorical elements (claims and contributions) of documents.	http://salt.semanticauthoring.org/ontologies/sro
Resource Description Framework (RDF)	rdf	Formal representation of semantic triples.	http://www.w3.org/1999/02/22-rdf-syntax-ns
RDF Schema (RDFS)	rdfs	Schema vocabulary used in our knowledge base, as well as in the DBpedia ontology.	http://www.w3.org/2000/01/rdf-schema
Content Ontology	cnt	Used for representing the literal (verbatim) content of extracted annotations.	http://www.w3.org/2011/content
DBpedia Ontology	dbpedia	Used for grounding documents' topics with DBpedia ontology resources.	http://dbpedia.org/resource/

Publishing the knowledge base through Fuseki

The knowledge base can be published with Apache Jena Fuseki, which can serve RDF data over HTTP.

You need to download the Fuseki server first. Follow the instructions on the download page for installation. The experiments described in our paper are based on Fuseki version 2.0.0.
The knowledge base triples generated in our experiments are located in the knowledge-base folder. Unzip the triples.zip file to your disk and publish it through Fuseki using either of the following approaches:

a) You can create an empty dataset with Fuseki and upload the .nq file using the "upload files" tab.

b) Alternatively, you can use Apache Jena's tdbloader command to load the triples into a TDB directory on your disk and publish that directory through Fuseki:

/JENA_INSTALLATION/bin/tdbloader --loc=/PATH/TO/TRIPLESTORE /PATH/TO/triples.nq
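Publishing the loaded directory can be sketched as follows. This is a minimal sketch, not part of this repository: the `FUSEKI_HOME` path, the dataset name `/ds`, and the helper name `serve_tdb` are assumptions chosen for illustration. It relies on the `--loc` option of the `fuseki-server` script, which points Fuseki at an existing TDB directory.

```shell
#!/bin/sh
# Assumptions (not from this repository): install path and dataset name.
FUSEKI_HOME="${FUSEKI_HOME:-/FUSEKI_INSTALLATION}"
DATASET="${DATASET:-/ds}"

# Serve an existing TDB directory (as populated by tdbloader).
# Fuseki opens the location given by --loc and exposes it under $DATASET.
serve_tdb() {
  "$FUSEKI_HOME/fuseki-server" --loc="$1" "$DATASET"
}
```

Calling `serve_tdb /PATH/TO/TRIPLESTORE` starts the server in the foreground. With Fuseki's default settings, the dataset should then be reachable under `http://localhost:3030/ds`.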
After publishing the knowledge base, verify that all of the triples are uploaded. A triple count on the knowledge base should return 1,712,452 triples.
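One way to run the triple count from the command line is sketched below. The endpoint URL, the dataset name `ds`, and the helper names are assumptions, not part of this repository. The query counts triples in the default graph and in all named graphs, because an N-Quads file may place triples in either. If your store is configured so that the default graph is the union of the named graphs, this query would count triples twice; in that case a plain `SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }` is the better check.

```shell
#!/bin/sh
# Assumption: Fuseki on its default port with a hypothetical dataset "ds".
ENDPOINT="${ENDPOINT:-http://localhost:3030/ds/query}"
EXPECTED=1712452   # triple count stated in this README

# Counts triples in the default graph plus any named graphs.
QUERY='SELECT (COUNT(*) AS ?n) WHERE { { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } }'

# Read a SPARQL CSV result on stdin (header row, then one value row)
# and print the value without the trailing carriage return.
parse_count() {
  tr -d '\r' | sed -n '2p'
}

# Send the query to the endpoint and print the resulting count.
fetch_count() {
  curl -s "$ENDPOINT" -H 'Accept: text/csv' --data-urlencode "query=$QUERY" | parse_count
}

# Compare a count against the expected value.
check_count() {
  if [ "$1" = "$EXPECTED" ]; then
    echo "OK: $1 triples"
  else
    echo "MISMATCH: got $1, expected $EXPECTED"
    return 1
  fi
}
```

With the server running, `check_count "$(fetch_count)"` reports whether the published knowledge base matches the expected size.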

License

The knowledge base (triples.zip) is distributed under the terms of the Creative Commons Attribution License v4.0. You can find a copy of the license in the knowledge-base folder.
