Helper for ReDBox-Mint. Reads ReDBox and Mint OAI-PMH RIF-CS portals via a network connection and generates a static web page for each RIF-CS record. This allows the metadata to be exposed to the internet via static web pages whilst the ReDBox-Mint web applications are not.

grantj-re3/FlindersRedbox-rif2website

FlindersRedbox-rif2website

Purpose

"ReDBox is a metadata registry application for describing research data. The Mint is an name-authority and vocabulary service that complements ReDBox." See http://www.redboxresearchdata.com.au/. The purpose of this script is to read the ReDBox and Mint OAI-PMH RIF-CS (XML) feeds via a network connection and generate a static web page for each registryObject element (i.e. each RIF-CS record). Each static web page is generated by extracting information from the XML according to rules specified in a rule file.
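As a rough illustration of the idea (not the script's actual code), the sketch below applies a small rule set to each registryObject element and renders one HTML table per record. The rule format, the helper names (`render_record`, `render_all`, `escape_html`) and the sample record are assumptions made for this example; the real rule files are described in the INSTALL file and the rule files themselves.

```ruby
require 'rexml/document'

# RIF-CS namespace, bound to the prefix "r" for the XPath rules below.
NS = { 'r' => 'http://ands.org.au/standard/rif-cs/registryObjects' }.freeze

# Hypothetical rule set: each rule pairs a table label with an XPath
# evaluated relative to one registryObject element.
RULES = [
  ['Key',         'r:key'],
  ['Name',        'r:collection/r:name/r:namePart'],
  ['Description', 'r:collection/r:description']
].freeze

HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Escape text so it can be placed safely inside HTML.
def escape_html(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

# Apply every rule to one record; rules that match nothing produce no row.
def render_record(record, rules)
  rows = rules.map do |label, xpath|
    node = REXML::XPath.first(record, xpath, NS)
    next if node.nil?
    value = node.text.to_s.strip
    "<tr><th>#{escape_html(label)}</th><td>#{escape_html(value)}</td></tr>"
  end
  "<table>\n#{rows.compact.join("\n")}\n</table>"
end

# Return one HTML fragment per registryObject found anywhere in the XML.
def render_all(xml, rules)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//r:registryObject', NS).map do |record|
    render_record(record, rules)
  end
end

# Made-up sample record, for illustration only.
SAMPLE = <<~XML
  <registryObjects xmlns="http://ands.org.au/standard/rif-cs/registryObjects">
    <registryObject group="Example University">
      <key>example/1</key>
      <collection type="dataset">
        <name type="primary"><namePart>Rainfall &amp; runoff data</namePart></name>
      </collection>
    </registryObject>
  </registryObjects>
XML

render_all(SAMPLE, RULES).each { |html| puts html }
```

In this sketch a rule that finds nothing (here the Description rule) is silently skipped, so one rule set can be shared by records that carry different optional elements.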

Notes

  • Designed for, and tested on, the ReDBox and Mint dev-handle build.

  • In order for handles to point to these pages, each data source template (eg. Mint home/harvest/Parties_People.json) should have its urlTemplate set like this:

    "urlTemplate": "http://MY_STATIC_PAGES_VHOST/MY_PATH/[[OID]].html",

    At the time of writing, all Mint urlTemplates are:

    "urlTemplate": "http://MY_STATIC_PAGES_VHOST/md/m/[[OID]].html",

    and ReDBox urlTemplate (in home/harvest/workflows/dataset.json) is:

    "urlTemplate": "http://MY_STATIC_PAGES_VHOST/md/r/[[OID]].html",

  • Because the source information comes from the ReDBox-Mint OAI-PMH portals, which are reachable over a network, this script (and therefore the destination website) can run on a host other than the ReDBox or Mint servers.
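To make the urlTemplate convention in the notes above concrete, here is a minimal sketch of how a handle target URL and the matching static file path line up once the `[[OID]]` token is replaced. The helper names, the root directory and the OID value are hypothetical; only the template strings come from the notes.

```ruby
# Replace the [[OID]] token in a urlTemplate with a record's OID.
def expand_url_template(template, oid)
  raise ArgumentError, 'template has no [[OID]] token' unless template.include?('[[OID]]')
  template.gsub('[[OID]]', oid)
end

# Build the local path of the static page for that OID.
# root_dir and repo_subdir are assumptions chosen to mirror the URL path.
def static_page_path(root_dir, repo_subdir, oid)
  File.join(root_dir, repo_subdir, "#{oid}.html")
end

mint_template = 'http://MY_STATIC_PAGES_VHOST/md/m/[[OID]].html'
oid = '0123456789abcdef0123456789abcdef' # made-up OID for illustration

puts expand_url_template(mint_template, oid)
puts static_page_path('/var/www/static', 'md/m', oid)
```

The point of the sketch is that the path component of the urlTemplate (md/m for Mint, md/r for ReDBox) must match the directory the static pages are written to, otherwise the handles will resolve to missing pages.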

Application environment

Read the INSTALL file.

Installation

Read the INSTALL file.

Features

  • The following OAI-PMH harvest methods are permitted:

    • The first harvest of ReDBox (or Mint) must be a full harvest (--full-harvest)
    • Subsequent harvests of ReDBox (or Mint) may optionally be incremental harvests (--incr-harvest). Incremental harvests use the OAI-PMH from argument to obtain all new and updated records since the specified from-datestamp. The summary page (discussed below) is for all records even if an incremental harvest is used (provided a full harvest has been performed in the past and there are no 'gaps' in the incremental harvest datestamps).
  • Use the RIF-CS key to look up the Fascinator OID:

    • with redirect (from Handle.net) 1 level deep
  • Store Fascinator OIDs in a local cache in order to bypass the (Handle.net) lookups above. This gives a large performance improvement (about 80 times faster on our test system).

  • If there is more than one OAI-PMH page of records, iterate through all pages by using the resumption token

  • Use the following config files (with hash elements which can be overwritten):

    • main (containing RIFCS URL, target root dir, target html-template, user-agent)
    • multiple rule files according to the user's preference, eg. perhaps one per record type (eg. collection, party) and subtype (eg. dataset, person)
  • log file (eg. errors, warnings)

  • HtmlHelper class

  • Allow one invocation for Mint and another for Redbox.

  • Use a replacement token for the XPath PRIMARY RECORD TYPE so that the user can reference other rule sets, eg. ActivityProjectRules = PartyPersonRules

  • Allow the program to determine which RIF-CS records will be processed based on which *Rules arrays exist.

  • Security checks before running eval().

  • Convert URLs into hyperlinks.

  • Ensure shell script will run as a cronjob.

  • Cope with ReDBox-Mint being offline.

  • Make a handle-landing page for each retired record.

  • Make a rule to show the OID.

  • Make a rule to show the ANDS "Registry View" URL for record. ANDS Services say this is no longer possible since RDA Release 10.

  • Make a rule to show the RDA URL for record.

  • Split dest dir by redbox/mint repo; may be necessary to avoid an OID namespace clash!

  • Creates a summary page (eg. index.html) which points to all static pages created by this script.

  • Use common html-template for both summary page and individual pages.

  • Allow selected output table rows to be highlighted (eg. with bold or italic text).

  • Ruby source code produces rdoc documentation.

  • Rules have been written for record types:

    • party-person
    • activity-project
    • collection-* (but not complete)

    No rules have been written for service-*
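The harvesting features listed above (full versus incremental harvest, and paging with the resumption token) can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's implementation: the function names are invented, the metadataPrefix value `rif` is an assumption about the portal's configuration, and the HTTP fetch is passed in as a block so that the loop itself stays independent of the network.

```ruby
require 'uri'
require 'rexml/document'

# OAI-PMH 2.0 namespace, bound to the prefix "o" for XPath lookups.
OAI_NS = { 'o' => 'http://www.openarchives.org/OAI/2.0/' }.freeze

# Build one ListRecords URL. In OAI-PMH a resumptionToken is an exclusive
# argument, so metadataPrefix and from are sent on the first request only.
def list_records_url(base_url, metadata_prefix: 'rif', from: nil, token: nil)
  params = { 'verb' => 'ListRecords' }
  if token
    params['resumptionToken'] = token
  else
    params['metadataPrefix'] = metadata_prefix # assumed prefix
    params['from'] = from if from              # incremental harvest only
  end
  "#{base_url}?#{URI.encode_www_form(params)}"
end

# Return the resumption token in a response, or nil when it is absent or empty.
def resumption_token(xml)
  node = REXML::XPath.first(REXML::Document.new(xml), '//o:resumptionToken', OAI_NS)
  token = node && node.text.to_s.strip
  token.nil? || token.empty? ? nil : token
end

# Fetch every page of records. The block receives a URL and returns the
# response body; pass from: for an incremental harvest, omit it for a full one.
def harvest(base_url, from: nil, &fetch)
  pages = []
  token = nil
  loop do
    xml = fetch.call(list_records_url(base_url, from: from, token: token))
    pages << xml
    token = resumption_token(xml)
    break if token.nil?
  end
  pages
end

# Possible usage against a live portal (URL is a placeholder):
#   require 'net/http'
#   pages = harvest('http://MY_REDBOX_HOST/MY_OAI_PATH', from: '2014-01-01') do |url|
#     Net::HTTP.get(URI(url))
#   end
```

Passing the fetch step as a block is a design choice for this sketch: it lets the paging logic be exercised with canned responses, and it gives one place to add the user-agent header, logging or offline handling mentioned in the feature list.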

Todo

  • Consider untarring images/css in ruby via config file (perhaps using minitar gem).
  • Consider instructing the web browser not to cache the page.

Acknowledgement

The development of this software was a component of a larger Flinders University (http://www.flinders.edu.au/) project funded by the Australian National Data Service (ANDS).
