A draft standard for formatting metadata schema config files

Metadata configuration files may contain five top-level sections: Schema Files, Properties, Classes, Work Types, and Namespaces. Each is interpreted separately and serves a separate function. The general structure of a metadata config file is therefore:

schema_files:
  ** other metadata schema files to load go here**
properties:
  ** property definitions go here**
classes:
  ** class definitions go here**
work_types:
  ** work type properties assigned here**
namespaces:
  ** additional RDF namespaces defined here**

Schema Files Section

When a metadata file is loaded, the schema_files section is interpreted first. This section contains a list of filenames to load. Each is assumed to be another metadata schema file formatted according to these same rules. The files are loaded in order, and each file may override settings defined by previously loaded files. The order of loading may therefore matter: files loaded later take precedence.

A simple schema files section might look like this:

schema_files:
  - 'metadata/shared_core_schema.yml'
  - 'metadata/samvera_geospatial_properties.yml'

The metadata schemas from all the loaded files are merged together. Deciding exactly how this merge should take place is the main source of complexity in this gem.

Many settings may be overridden by other files, but some (such as the predicate associated with a property) can never be changed once defined. A file therefore raises an error if it contradicts an earlier file about a given property's predicate, or if it reuses a previously defined predicate for a different property.
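
The merge rules above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions, not the gem's actual implementation: the method name merge_properties and the error class PredicateConflictError are hypothetical, and each schema is assumed to be a plain hash as produced by parsing the YAML.

```ruby
# Hypothetical sketch of the merge rules; not the gem's real code.
class PredicateConflictError < StandardError; end

# Merge the `properties` hashes of several schema files, in load order.
# Later files override earlier attribute values, but a predicate may
# neither change nor be claimed by a second property.
def merge_properties(schemas)
  merged = {}
  owners = {} # predicate => name of the property that first claimed it

  schemas.each do |schema|
    (schema['properties'] || {}).each do |name, attrs|
      attrs ||= {}
      existing = merged[name] || {}
      predicate = attrs['predicate']

      if predicate
        if existing['predicate'] && existing['predicate'] != predicate
          raise PredicateConflictError,
                "#{name}: predicate is already #{existing['predicate']}"
        end
        owner = owners[predicate]
        if owner && owner != name
          raise PredicateConflictError,
                "#{predicate} is already used by #{owner}"
        end
        owners[predicate] = name
      end

      merged[name] = existing.merge(attrs) # later values win
    end
  end

  merged
end
```

Passing the parsed files in the same order as they appear under schema_files gives the "later files take precedence" behavior, since Hash#merge keeps the value from its argument when both hashes define a key.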

Properties Section

The properties section contains metadata properties to be used in the system. Each property consists of a unique property name and a set of attributes that define how that property behaves. The overall structure of the properties section looks like:

properties:

 first_property_name:
   predicate: DC:something  
   other_attribute1: true
   other_attribute2: false

 second_property_name:
   predicate: DC:somethingelse  
   other_attribute1: false

Property Attributes

Property attributes may behave differently depending on the context in which they are used. Some attributes are required and others are optional:

predicate Required, unique
definition Required, unique
usage_note
usage_warning
label
range
vocabularies (not yet implemented) This would hold a list of controlled vocabularies to associate with the property.
input This can specify which view partial to use for the edit form.
multiple (true/false) Whether multiple values are allowed for this property. Defaults to true.
facet (true/false) Whether this property should be indexed as a facet. Defaults to false.
required (true/false) Whether this property should be required.
primary (true/false) Whether this property is displayed on the short list of important properties on the edit form.
hidden (true/false) Whether this property is completely hidden from all display and indexing. Defaults to false.
work_title (true/false) Whether this property should be displayed as the primary title of the work. Defaults to false.
display_type (true/false) Whether this property should be displayed as the type of the work. Defaults to false.
display_groups This attribute may contain a list of arbitrary groups that this item may belong to for display reasons. Local installations may want to use these to customize views without hard-coding schema information into view files.

Defaults

A property called "default" or "defaults" may be defined. If the default property defines an attribute that is not defined by another property, that other property will use the default property's attribute instead.
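
As a rough illustration of the fallback described above, the following Ruby sketch parses a small properties section and fills in missing attributes from the default property. The helper name apply_defaults and the sample property names are assumptions made for this example; the gem may resolve defaults differently.

```ruby
require 'yaml'

# Sample config, invented for this example.
config = YAML.safe_load(<<~YAML)
  properties:
    default:
      multiple: true
      facet: false
    title:
      predicate: DC:title
      multiple: false
    subject:
      predicate: DC:subject
      facet: true
YAML

DEFAULT_KEYS = %w[default defaults].freeze

# Hypothetical helper: returns the real properties with any attribute
# they do not define filled in from the "default"/"defaults" entry.
def apply_defaults(properties)
  defaults = DEFAULT_KEYS.map { |key| properties[key] }.compact.reduce({}, :merge)
  properties
    .reject { |name, _| DEFAULT_KEYS.include?(name) }
    .transform_values { |attrs| defaults.merge(attrs || {}) }
end

resolved = apply_defaults(config['properties'])
# title keeps its own `multiple: false` and inherits `facet: false`;
# subject inherits `multiple: true` and keeps its own `facet: true`.
```

In this sketch a property's own attributes always win, and the default entry itself is dropped from the result so that it is never treated as a real property.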

Work Types Section

The work types section specifies which properties are associated with each work type, and allows each work type to override attributes of those properties. In the future, this code could dynamically define work types based on this schema; if that happens, more work type attributes would need to be specified in this section.

An example work types section might look like:

work_types:
  generic_work:
    properties:
      title:
      description:
        primary: true
        required: true
      creator:
        primary: true
        required: true
      contributor:
      rights_license:
  magazine:
    properties:
      title:
      description:
        label: "Summary"
        primary: true
        required: true
      creator:
        primary: true
        required: true
      publisher:
      date_published:
      scale:
      contributor:
      rights_license:
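
One way to picture how a work type's overrides combine with the shared property definitions is the Ruby sketch below. It is a minimal illustration, not the gem's code: properties_for is a hypothetical helper, and both arguments are assumed to be the parsed properties and work_types hashes. Note that an entry such as `title:` with nothing under it parses to nil in YAML, which the sketch treats as "no overrides".

```ruby
# Hypothetical helper: resolve the effective attributes of every property
# assigned to one work type.
def properties_for(work_type, properties, work_types)
  assigned = work_types.fetch(work_type).fetch('properties', {}) || {}

  assigned.each_with_object({}) do |(name, overrides), resolved|
    base = properties.fetch(name) # KeyError if the property was never defined
    resolved[name] = base.merge(overrides || {}) # work type values win
  end
end

# Sample data, invented for this example.
properties = {
  'title'       => { 'predicate' => 'DC:title' },
  'description' => { 'predicate' => 'DC:description', 'label' => 'Description' }
}
work_types = {
  'magazine' => {
    'properties' => {
      'title'       => nil,
      'description' => { 'label' => 'Summary', 'required' => true }
    }
  }
}

magazine = properties_for('magazine', properties, work_types)
# magazine['description']['label'] is "Summary"; title is unchanged.
```

The sketch does not guard against a work type trying to override a predicate; a real implementation would presumably reject that, since predicates cannot change once defined.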

Classes Section

The classes section defines classes of RDF objects that are saved in Fedora but are not part of PCDM (i.e. not Hyrax Works). These often correspond to entries in remote authority systems, and may contain local customizations.

An example classes section might look like:

classes:

  Agent:
    parent: "ActiveTriples::Resource"
    rdf_label: "::RDF::Vocab::FOAF.Agent"
    properties: 
      local_label:
        definition: "If set, this is a local label for this agent to be indexed and displayed instead of any label fetched from remote authority"
        predicate: 'skos:altLabel'

Namespaces Section

The namespaces section simply allows us to define namespaces that are not already defined in the RDF-vocab gem used by Hyrax. Most of the important ones are pre-defined, but sometimes we might need one that isn't.

An example namespaces section might be:

namespaces:
  edm: "http://www.europeana.eu/schemas/edm/"
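
To show what a locally defined namespace makes possible, here is a small Ruby sketch that expands a prefixed predicate such as edm:rights into a full URI. The helper expand_predicate is hypothetical, and the case-insensitive prefix lookup is an assumption of this example; the gem's real lookup, including how it consults the pre-defined vocabularies, may work differently.

```ruby
# Hypothetical helper: expand "prefix:local" using a hash of namespaces
# as parsed from the namespaces section.
def expand_predicate(prefixed_name, namespaces)
  prefix, local = prefixed_name.to_s.split(':', 2)
  base = namespaces.fetch(prefix.to_s.downcase) do
    raise ArgumentError, "unknown namespace prefix: #{prefix.inspect}"
  end
  "#{base}#{local}"
end

namespaces = { 'edm' => 'http://www.europeana.eu/schemas/edm/' }
uri = expand_predicate('edm:rights', namespaces)
# uri is "http://www.europeana.eu/schemas/edm/rights"
```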
