"ReDBox is a metadata registry application for describing research data. The Mint is an name-authority and vocabulary service that complements ReDBox." See http://www.redboxresearchdata.com.au/. The purpose of this script is to read ReDBox and Mint OAI-PMH RIF-CS (XML) via a network connection and generate a static web page for each registryObject element (ie. RIF-CS record). Each static web page is generated by extracting XML information from rules specified in a file.
-
Designed for, and tested on, the ReDBox and Mint dev-handle build.
-
In order for handles to point to these pages, each data source template (eg. Mint home/harvest/Parties_People.json) should have its urlTemplate set like:
"urlTemplate": "http://MY_STATIC_PAGES_VHOST/MY_PATH/[[OID]].html",
At the time of writing, all Mint urlTemplates are:
"urlTemplate": "http://MY_STATIC_PAGES_VHOST/md/m/[[OID]].html",
and ReDBox urlTemplate (in home/harvest/workflows/dataset.json) is:
"urlTemplate": "http://MY_STATIC_PAGES_VHOST/md/r/[[OID]].html",
-
Because the source information comes from the ReDBox and Mint OAI-PMH portals, and is therefore available over a network, this script (and hence the destination website) can run on a host other than the ReDBox or Mint servers.
Read the INSTALL file.
-
The following OAI-PMH harvest methods are permitted:
- The first harvest of ReDBox (or Mint) must be a full harvest (--full-harvest)
- Subsequent harvests of ReDBox (or Mint) may optionally be incremental harvests (--incr-harvest). Incremental harvests use the OAI-PMH from argument to obtain all new and updated records since the specified from-datestamp. The summary page (discussed below) is for all records even if an incremental harvest is used (provided a full harvest has been performed in the past and there are no 'gaps' in the incremental harvest datestamps).
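
As a rough illustration of how the two harvest methods differ at the OAI-PMH level, the sketch below builds a ListRecords request URL with and without the from argument. The endpoint URL, the `rif` metadata prefix and the helper name `list_records_url` are assumptions made for this example, not values taken from the script.

```ruby
require 'uri'

# Hypothetical helper: build an OAI-PMH ListRecords request URL.
# A full harvest omits 'from'; an incremental harvest supplies a datestamp.
def list_records_url(base_url, metadata_prefix:, from: nil)
  params = { 'verb' => 'ListRecords', 'metadataPrefix' => metadata_prefix }
  params['from'] = from if from
  uri = URI.parse(base_url)
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

base = 'http://redbox.example.org/oai' # assumed endpoint, for illustration only

# Full harvest: every record the repository exposes.
puts list_records_url(base, metadata_prefix: 'rif')

# Incremental harvest: only records new or updated since the datestamp.
puts list_records_url(base, metadata_prefix: 'rif', from: '2013-01-01T00:00:00Z')
```

If the from-datestamp of one incremental harvest is later than the time the previous harvest ran, records changed in between are missed, which is the 'gap' mentioned above.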
-
Use the RIF-CS key to look up the Fascinator OID:
- with redirect (from Handle.net) 1 level deep
-
Store Fascinator OIDs in a local cache in order to bypass the (Handle.net) lookups above. This gives a large performance improvement (about 80 times faster on our test system).
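The caching idea can be sketched as follows. This is a minimal example, not the script's actual cache: the class name `OidCache`, the JSON file format and the stand-in lookup are all assumptions. The slow lookup only runs when the RIF-CS key is not already in the cache.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical cache mapping RIF-CS keys to Fascinator OIDs, persisted as JSON.
class OidCache
  def initialize(path)
    @path = path
    @map = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  end

  # Return the cached OID, or run the block (the slow lookup) and remember it.
  def fetch(key)
    return @map[key] if @map.key?(key)
    oid = yield(key)
    @map[key] = oid if oid
    oid
  end

  # Write the cache to disk so the next run can reuse it.
  def save
    File.write(@path, JSON.generate(@map))
  end
end

Dir.mktmpdir do |dir|
  cache = OidCache.new(File.join(dir, 'oid_cache.json'))
  slow_lookup = ->(key) { "oid-for-#{key}" } # stand-in for the Handle.net request
  cache.fetch('example-key', &slow_lookup)    # performs the lookup
  cache.fetch('example-key', &slow_lookup)    # served from the cache
  cache.save
end
```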
-
If there is more than one OAI-PMH page of records, iterate through all pages by using the resumption token.
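A minimal sketch of this paging loop is shown below. It assumes a hypothetical callable, `fetch_page`, which takes a resumption token (nil for the first request) and returns the response XML as a string; how the real script issues its requests is not shown here. The loop stops when the response has no resumptionToken element or the element is empty.

```ruby
require 'rexml/document'

# Yield each OAI-PMH response page as a parsed document.
# fetch_page is a hypothetical callable: token (or nil) -> XML string.
def each_oai_page(fetch_page)
  token = nil
  loop do
    doc = REXML::Document.new(fetch_page.call(token))
    yield doc
    # Match on the local element name so XML namespaces do not matter.
    element = doc.root.find_first_recursive { |e| e.name == 'resumptionToken' }
    token = element && element.text.to_s.strip
    break if token.nil? || token.empty?
  end
end

# Canned two-page response standing in for a real repository.
pages = {
  nil  => '<OAI-PMH><ListRecords><record>1</record>' \
          '<resumptionToken>t1</resumptionToken></ListRecords></OAI-PMH>',
  't1' => '<OAI-PMH><ListRecords><record>2</record>' \
          '<resumptionToken/></ListRecords></OAI-PMH>'
}
each_oai_page(->(token) { pages.fetch(token) }) do |doc|
  puts doc.root.elements['ListRecords/record'].text
end
```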
-
Use the following config files (with hash elements which can be overwritten):
- main (containing RIFCS URL, target root dir, target html-template, user-agent)
- multiple rule-files according to the user's preference, eg. perhaps 1 per record type (eg. collection, party) and subtype (eg. dataset, person)
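
One way to picture "hash elements which can be overwritten" is a default hash merged with per-invocation overrides. The sketch below is an assumption about the general approach only; every key name and value in it is hypothetical.

```ruby
# Hypothetical defaults; the real config keys and values may differ.
DEFAULT_CONFIG = {
  rifcs_url: 'http://mint.example.org/oai',
  dest_root_dir: '/var/www/md',
  html_template: 'template.html',
  user_agent: 'static-page-generator'
}.freeze

# Overrides win; keys that are not overridden keep their default value.
def effective_config(overrides)
  DEFAULT_CONFIG.merge(overrides)
end

config = effective_config(dest_root_dir: '/var/www/md/r')
puts config[:dest_root_dir] # overridden value
puts config[:user_agent]    # default value
```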
-
log file (eg. errors, warnings)
-
HtmlHelper class
-
Allow one invocation for Mint and another for ReDBox.
-
Use a replacement token for the XPath PRIMARY RECORD TYPE so that the user can reference other rulesets. Eg. ActivityProjectRules = PartyPersonRules
-
Allow the program to determine which RIF-CS records will be processed based on which *Rules arrays exist.
-
Security checks before running eval().
-
Convert URLs into hyperlinks.
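A minimal sketch of URL-to-hyperlink conversion follows. The method name `linkify` and the URL pattern are assumptions for illustration and are not taken from the script's HtmlHelper class. The surrounding text and the URL are HTML-escaped so that characters such as & and < remain safe in the generated page.

```ruby
require 'cgi'

# Simplified pattern: http(s) URLs up to whitespace, angle bracket or quote.
URL_PATTERN = %r{https?://[^\s<>"]+}

# Hypothetical helper: wrap each URL in an anchor and escape everything else.
def linkify(text)
  result = +''
  pos = 0
  text.scan(URL_PATTERN) do
    match = Regexp.last_match
    result << CGI.escapeHTML(text[pos...match.begin(0)])
    url = CGI.escapeHTML(match[0])
    result << %(<a href="#{url}">#{url}</a>)
    pos = match.end(0)
  end
  result << CGI.escapeHTML(text[pos..-1])
end

puts linkify('See http://example.org/a?b=1&c=2 <now>')
```

This simple pattern treats trailing punctuation (for example a full stop directly after a URL) as part of the link, so a production rule would likely need a stricter pattern.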
-
Ensure shell script will run as a cronjob.
-
Cope with ReDBox-Mint being offline.
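One possible way to cope with an unreachable repository is to rescue the common network errors, log a warning and leave the existing static pages untouched. The sketch below is an assumption about the approach, not the script's actual behaviour; the method name `harvest_safely` and the list of rescued errors are hypothetical.

```ruby
require 'socket'
require 'timeout'

# Errors that typically indicate the repository cannot be reached.
NETWORK_ERRORS = [
  SocketError, Errno::ECONNREFUSED, Errno::EHOSTUNREACH, Timeout::Error
].freeze

# Run the harvest block; return true on success, false if the repository
# appears to be offline (in which case previously generated pages are kept).
def harvest_safely(log = $stderr)
  yield
  true
rescue *NETWORK_ERRORS => e
  log.puts "WARNING: harvest skipped, repository unreachable (#{e.class}: #{e.message})"
  false
end

ok = harvest_safely { raise Errno::ECONNREFUSED } # simulate an offline server
puts ok ? 'harvest completed' : 'harvest skipped'
```

Returning a status rather than raising lets a cron-driven wrapper exit cleanly and try again at the next scheduled run.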
-
Make a handle-landing page for each retired record.
-
Make a rule to show the OID.
-
Make a rule to show the ANDS "Registry View" URL for record. ANDS Services say this is no longer possible since RDA Release 10.
-
Make a rule to show the RDA URL for record.
-
Split dest dir by redbox/mint repo; may be necessary to avoid an OID namespace clash!
-
Creates a summary page (eg. index.html) which points to all static pages created by this script.
-
Use common html-template for both summary page and individual pages.
-
Allow selected output table rows to be highlighted (eg. with bold or italic text).
-
Ruby source code produces rdoc documentation.
-
Rules have been written for record types:
- party-person
- activity-project
- collection-* (but not complete)
No rules have been written for service-*
- Consider untarring images/css in ruby via config file (perhaps using minitar gem).
- Consider instructing the web browser not to cache the page.
The development of this software was a component of a larger [Flinders University](http://www.flinders.edu.au/) project funded by the Australian National Data Service (ANDS).