Using the converter

Kirk Hess edited this page Mar 24, 2017 · 2 revisions

XSLT Processor

In the simplest case, you can invoke an XSLT processor with the main stylesheet (xsl/marc2bibframe2.xsl) as the first argument, and an XML file containing MARCXML as the second:

xsltproc xsl/marc2bibframe2.xsl test/data/marc.xml

Converter parameters

The converter supports three optional parameters:

  • baseuri - the URI stem for generated entities. Default is http://example.org/, which will result in minting URIs like http://example.org/<record ID>#Work

  • idfield - the field of the MARC record that contains the record ID, used in minting URIs as above. Default is 001. If idfield refers to a MARC data field rather than a MARC control field, a subfield can also be given, e.g. 035a (the default subfield is a). Note that the stylesheets have no built-in facility for URI-encoding, so the record ID is used in the minted URI as-is.

  • serialization - the RDF serialization to be used for output. Currently only rdfxml is supported (the default).
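To make the idfield convention concrete, here is a minimal Python sketch of how a value such as 035a could be split into a tag and a subfield. The function name parse_idfield is hypothetical, and the logic only illustrates the rules described above (control fields 001-009 take no subfield, data fields default to subfield a). It is not the stylesheet's actual implementation.

```python
from typing import Optional, Tuple


def parse_idfield(idfield: str = "001") -> Tuple[str, Optional[str]]:
    """Split an idfield value such as "035a" into (tag, subfield).

    Hypothetical helper: it mirrors the documented convention only.
    """
    tag, subfield = idfield[:3], idfield[3:]
    if len(tag) != 3 or not tag.isdigit():
        raise ValueError(f"expected a three-digit MARC tag, got {idfield!r}")
    if tag < "010":
        # Control fields (001-009) carry no subfields.
        return tag, None
    # Data fields: fall back to subfield "a" when none is given.
    return tag, subfield or "a"
```

Under these assumptions, "035" and "035a" resolve to the same tag and subfield, while the default "001" resolves to a control field with no subfield.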

Different XSLT processors have different syntaxes for passing parameters. For xsltproc, the syntax is:

xsltproc --stringparam baseuri http://mylibrary.org/ xsl/marc2bibframe2.xsl test/data/marc.xml
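When the converter is driven from a script, the same command line can be assembled programmatically. The sketch below is one possible way to do that in Python; the helper name build_xsltproc_command is an assumption made for illustration, and only the --stringparam flag and the file paths come from the examples above.

```python
from typing import Dict, List, Optional


def build_xsltproc_command(
    source: str,
    params: Optional[Dict[str, str]] = None,
    stylesheet: str = "xsl/marc2bibframe2.xsl",
) -> List[str]:
    """Return an xsltproc argument list with one --stringparam per parameter."""
    command = ["xsltproc"]
    for name, value in (params or {}).items():
        command += ["--stringparam", name, value]
    # xsltproc expects the stylesheet first, then the MARCXML input file.
    command += [stylesheet, source]
    return command


cmd = build_xsltproc_command(
    "test/data/marc.xml",
    {"baseuri": "http://mylibrary.org/", "idfield": "035a"},
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True, capture_output=True) to capture the RDF/XML output, assuming xsltproc is installed and the command is run from the root of the converter checkout.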

For Metaproxy integration, the converter parameters can be passed to the stylesheets using the <param> element in the YAZ configuration:

<xslt stylesheet="xsl/marc2bibframe2.xsl">
  <param name="baseuri" value="http://mylibrary.org/"/>
</xslt>

Converter configuration

Some elements of the conversion can be configured using XML files in the conf directory. Currently, this only includes language mappings for elements generated by 880 tags, and subject thesaurus mappings for MADSRDF elements generated by 6XX tags.

Converter design

The main stylesheet of the XSLT converter application, xsl/marc2bibframe2.xsl, uses push processing to process the fields of each MARC record and build the two main elements it generates, a bf:Work and a bf:Instance. In addition, the fields are pushed through to generate a bflc:adminMetaData property of the bf:Work and to generate bf:hasItem properties of the bf:Instance.

Elements in the resulting RDF/XML document that are not blank nodes or nodes with statically determined URIs are given newly minted URIs constructed from the stem of the baseuri parameter (default http://example.org/), the record ID of the MARC record (by default the value of the 001 field), and a hash URI for the new element. For elements that are not the main bf:Work or bf:Instance element generated by the record, the hash URI is constructed from the element class, the field number, and the position of the field in the MARC record, e.g.:

http://example.org/13600108#Agent100-12
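The URI pattern can be expressed as a small function. The following Python sketch is an illustration under stated assumptions, not the converter's code: mint_uri and its parameter names are hypothetical, and the fragment layout (class, field number, hyphen, position) is inferred from the example URI above.

```python
from typing import Optional


def mint_uri(
    record_id: str,
    element_class: str,
    field_tag: Optional[str] = None,
    position: Optional[int] = None,
    baseuri: str = "http://example.org/",
) -> str:
    """Build a hash URI from the base URI stem, the record ID and a fragment."""
    fragment = element_class
    if field_tag is not None and position is not None:
        # Secondary elements: class + field number + "-" + field position.
        fragment = f"{element_class}{field_tag}-{position}"
    # The record ID is inserted verbatim; no URI-encoding is applied.
    return f"{baseuri}{record_id}#{fragment}"


print(mint_uri("13600108", "Agent", "100", 12))
print(mint_uri("13600108", "Work"))
```

Like the stylesheets, this sketch does not URI-encode the record ID, so any escaping would have to be done by the caller.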

The templates that match the MARC fields are defined in stylesheets included by the main stylesheet. Utility templates are in the utils.xsl stylesheet, and templates for matching control subfields are in the ConvSpec-ControlSubfields.xsl stylesheet. Configuration information is read into variables using the document() function.

Where possible, the templates that implement each document in the specifications are kept in a stylesheet with the same name as that document, for easier maintenance.
