ekoi/bridge-service

Dataverse Bridge to Digital Archive Repository (DAR)

Dataverse Bridge

Overview

The bridge service was developed to transfer datasets from a Dataverse instance to another Digital Archive Repository (DAR). At DANS, we will use the bridge to transfer datasets from DataverseNL to our long-term archive, EASY. The transfer uses the SWORD protocol (v2.0). DANS created a plug-in for EASY, but other plug-ins can be created for the bridge service to transfer datasets from Dataverse to a repository of your choice. The sections below provide more details.

Architecture

Plugins System

For modularity, flexibility, and separation of concerns, the Dataverse Bridge application uses a simple plugin architecture. The Java reflection API allows runtime type introspection and dynamic code loading, so the bridge-service can call a plugin (e.g. bridge-plugin-easy) without knowing its details in advance.
More details on how to create a new plugin can be found in the section "Creating a plugin".
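
The reflection-based loading described above can be sketched as follows. This is a minimal illustration, not the bridge-service code: Action, EchoAction and PluginLoader are hypothetical names (the real contract is the IAction interface of bridge-plugin, whose methods are not shown here), and the class is resolved from the current classpath rather than from a plugin jar.

```java
// Hypothetical stand-in for the plugin contract (the real one is IAction in bridge-plugin).
interface Action {
    String execute(String datasetId);
}

// Hypothetical plugin implementation; the host only knows it by its class name.
class EchoAction implements Action {
    @Override
    public String execute(String datasetId) {
        return "archived:" + datasetId;
    }
}

class PluginLoader {
    // Instantiates a plugin from its fully qualified class name via reflection.
    static Action load(String className) {
        try {
            Class<?> type = Class.forName(className);
            // asSubclass fails fast if the class does not implement the contract.
            return type.asSubclass(Action.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load plugin " + className, e);
        }
    }

    public static void main(String[] args) {
        // In the bridge-service the class name would come from the plugin configuration.
        Action action = load(EchoAction.class.getName());
        System.out.println(action.execute("doi:10.5072/example"));
    }
}
```

Because the host only refers to the Action type, a new plugin can be added without recompiling the host; only the configured class name changes.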

The application consists of the following parts:

bridge-plugin

The bridge-plugin module mainly consists of interfaces that the bridge-service uses to identify and treat all plugins the same way. Every plugin must implement these interfaces.

bridge-service

The bridge-service is the host application of the Dataverse Bridge. It does not depend on any plugin implementation; it relies only on the plugin interface. To load the plugin for the desired Digital Archive Repository, the bridge-service must know the class name of that plugin, which it reads from the supplied plugin configuration. The bridge-service was generated with the swagger-codegen project, which uses the OpenAPI Specification to generate the server stub.

The underlying library that integrates Swagger with Spring Boot is Springfox.

Bridge API

Changes needed in the Dataverse code

To enable the "Archive" button on the Dataverse side, additional XHTML files, Java files and settings are needed.

Archive button

Archive popup

Database Settings

  • :DataverseBridgeConf

Create a JSON file (e.g. dvn.json) that contains the bridge URL and the alias of the user group that has permission to transfer datasets to another repository.

{
    "dataverse-bridge-url": "http://localhost:8592/api",
    "user-group": "SWORD",
    "conf": [
        {
            "darName": "EASY",
            "dvBaseMetadataXml": "https://test.dataverse.nl/api/datasets/export?exporter=ddi&persistentId="
        }
    ]
}
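
Load the file into the Dataverse database settings (the @ prefix makes curl send the file contents rather than the literal path):

curl -X PUT -d @/path-to/dvn.json http://localhost:8080/api/admin/settings/:DataverseBridgeConf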

Dataverse Role Setting

To archive a dataset with the Dataverse Bridge, the following conditions must be met:

  • The user should be part of a group named 'SWORD'.
  • The user should have an admin-role for the dataverse that contains the dataset that is going to be archived.

Configuration

Create a group 'SWORD' (alias 'SWORD') in the dataverse root. Add the users that should have permission to transfer a dataset to this group.
It is not necessary to give this group permissions (a role) on any dataverse or dataset level. It is also not necessary to create a new role for this group.

Setting up the bridge service

There are two ways to set up the bridge service. You can use the Quick start option, or do it step by step.

Quick start

Download bridge-quickstart and unzip it in a folder.
To start the service, run start.sh in a terminal.
To shut it down, run shutdown.sh.

This Quick start shows how to deploy the dataverse bridge by using the default properties.

Step by step

With this method you generate the service step by step, which lets you configure it as you like (e.g. modify the path, DAR target, or port).

How to generate:

The Dataverse Bridge is built on Spring Boot, which provides a set of starter POMs and includes an embedded servlet container.
The following command shows how to go from an OpenAPI spec (dataverse-bridge-api) to a generated Spring Boot server stub.

swagger-codegen generate -i dataverse-bridge-api.yaml -l spring -o . -c dataverse-bridge-config.json \
    --import-mappings Archiving=nl.knaw.dans.dataverse.bridge.service.db.domain.ArchivingAuditLog

Every time you run the code generation tool, it overwrites the code it generated previously. To leave certain files intact, list them in a .swagger-codegen-ignore file, which works just like a .gitignore file.

.swagger-codegen-ignore

README.md
pom.xml
src/main/java/nl/knaw/dans/dataverse/bridge/service/*
src/main/resources/application.properties

Start the Bridge Application

Prerequisites:

  • Java 8
    The Dataverse Bridge application is built for Java 8 and up.
  • Application properties:
    The Dataverse Bridge application loads its properties from the application properties file located in the config directory of the current working directory.
    The following properties must be set in the application-dev.properties file:
################### Database Configuration ##########################
spring.datasource.url=jdbc:hsqldb:file:./database/bridgedb;sql.syntax_pgs=true
spring.datasource.username=sa
spring.datasource.password=

'./database/bridgedb' means that the file-based HSQLDB database will be created in the database directory of the current working directory.

################### JavaMail Configuration ##########################
bridge.apps.support.email.from=
bridge.apps.support.email.send.to=
spring.mail.host=


################# Apps Configuration ##############################
bridge.apikey=
bridge.temp.dir.bags=/path/bagit-temp/bags

To launch the service in another environment, use profile-specific properties files, e.g. application-act.properties for an acceptance server or application-prod.properties for a production server.
Select the profile with a -D argument: for example, -Dspring.profiles.active=prod launches the Dataverse Bridge application with application-prod.properties.
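
The naming convention behind profile-specific files can be illustrated with a short sketch. Spring Boot performs this lookup itself (and also merges the profile file with any base properties), so the code below is only a simplified illustration of how a profile name maps to a file in the config directory; ProfileProperties and its methods are hypothetical names, and the "dev" fallback in main is an assumption made for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

class ProfileProperties {
    // Maps a profile name to its file name, e.g. "prod" -> "application-prod.properties".
    static String fileNameFor(String profile) {
        return "application-" + profile + ".properties";
    }

    // Loads only the profile-specific file from the given config directory.
    static Properties load(Path configDir, String profile) {
        Properties props = new Properties();
        Path file = configDir.resolve(fileNameFor(profile));
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Assumption for this example: fall back to "dev" when no profile is given.
        String profile = System.getProperty("spring.profiles.active", "dev");
        System.out.println(Paths.get("config").resolve(fileNameFor(profile)));
    }
}
```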

Config

  • Log Directory

The location of the log directory is configured in the application properties. By default, logs are written to the 'logs' directory of the current working directory.

################### Logging Configuration ##########################
logging.path=./logs

  • Digital Archive Repository Target

The dar-target-conf directory holds the configuration that contains the name and target URL of the Digital Archive Repository. The configuration must be provided as a JSON file, e.g. easy-dev.json:

{
    "dar-name":"EASY",
    "iri":"http://deasy.dans.knaw.nl/sword2/collection/1"
}

Dar Target Conf

  • Plugins Directory

Put your plugin in this directory. The plugin must follow the structure described in the section "Creating a plugin".

Plugin Directory Structure

Starting the application

Start the service as a simple Java application.
The application-dev.properties file is used, as indicated by the -D argument: -Dspring.profiles.active=dev

java -Dspring.profiles.active=dev -jar target/bridge-service-0.5.0.jar

To start in debug mode:

java -Dspring.profiles.active=dev -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5105 -jar bridge-service-0.5.0.jar

You can view the API documentation in swagger-ui by pointing your browser to http://localhost:8592/api

Stopping the application

To shut down the Dataverse Bridge application:

curl -X POST 'http://localhost:8582/api/admin/shutdown'

Creating a plugin

The plugin relies on XSLT, which can transform an XML file from one format to another. In this case, it transforms the Dataverse XML metadata (DDI, Dublin Core), or even Dataverse metadata in JSON format, into the metadata format of the target Digital Archive Repository. Plugins must implement the IAction interface of the bridge-plugin.
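
The XSLT step described above could look roughly like the sketch below, which uses the javax.xml.transform API from the JDK. It is an illustration only: the MetadataTransformer class, the inline stylesheet and the element names (record, titl, dataset, title) are made up for the example and are not the stylesheets shipped with bridge-plugin-easy.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class MetadataTransformer {
    // Hypothetical stylesheet: maps a simplified source record to a simplified target element.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/record\">"
      + "<dataset><title><xsl:value-of select=\"titl\"/></title></dataset>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML document and returns the result as text.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<record><titl>My data</titl></record>", XSL));
    }
}
```

In a real plugin the stylesheet would be read from the plugin's xsl directory instead of an inline string, and the source document would be the metadata exported by Dataverse.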

Include this dependency in your pom.xml to obtain the 0.5-SNAPSHOT version of the bridge-plugin:

<dependency>
    <groupId>nl.knaw.dans.dataverse.bridge.plugin</groupId>
    <artifactId>bridge-plugin</artifactId>
    <version>0.5-SNAPSHOT</version>
</dependency>

After creating the jar file from the project, the plugin can be uploaded to the bridge-service as a zip file, or deployed directly in the plugins directory of the bridge-service. The plugin must have the following directory structure:

easy (directory, must be in lowercase)
easy.json (JSON file that describes the plugin, see the example below)
-- lib (directory where the plugin.jar is located)
-- xsl (directory where the xsl files are located)
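
A deployment script or test could check this layout before the plugin is handed to the bridge-service. The sketch below is a hypothetical helper, not part of the bridge-service; it assumes that the descriptor is named after the plugin directory and sits inside it next to lib and xsl (so easy/easy.json, easy/lib, easy/xsl).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class PluginLayoutCheck {
    // Returns a list of problems; an empty list means the layout matches the assumed structure.
    static List<String> check(Path pluginDir) {
        List<String> problems = new ArrayList<>();
        Path fileName = pluginDir.getFileName();
        if (fileName == null || !Files.isDirectory(pluginDir)) {
            problems.add("not a plugin directory: " + pluginDir);
            return problems;
        }
        String name = fileName.toString();
        if (!name.equals(name.toLowerCase(Locale.ROOT))) {
            problems.add("directory name must be lowercase: " + name);
        }
        // Assumption: the descriptor is <name>.json inside the plugin directory.
        if (!Files.isRegularFile(pluginDir.resolve(name + ".json"))) {
            problems.add("missing descriptor: " + name + ".json");
        }
        if (!Files.isDirectory(pluginDir.resolve("lib"))) {
            problems.add("missing lib directory");
        }
        if (!Files.isDirectory(pluginDir.resolve("xsl"))) {
            problems.add("missing xsl directory");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Hypothetical default path, used when no argument is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : "plugins/easy");
        check(dir).forEach(System.out::println);
    }
}
```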

Plugin Directory Structure

An example of easy.json

{
  "dar-name": "EASY",
  "action-class-name": "nl.knaw.dans.dataverse.bridge.plugin.dar.easy.EasyIngestAction",
  "action-class-url": "lib/bridge-plugin-easy-0.5-SNAPSHOT-jar-with-dependencies.jar",
  "xsl":
        [
            {
                "xsl-name":"dataset.xml",
                "xsl-url":"xsl/dvn-ddi2ddm-dataset.xsl"
            },
            {
                "xsl-name":"files.xml",
                "xsl-url":"xsl/dvn-ddi2ddm-files.xsl"
            }
        ]
}

The bridge-plugin-easy is the implementation of a bridge-plugin for ingesting data into the EASY archive. This plugin transforms the Dataverse metadata file in DDI format into the metadata files required by EASY: 'dataset.xml' and 'files.xml'. The transformation follows the 'Depositing in EASY with SWORD v2.0' requirements document.
