A draft standard for formatting metadata schema config files
Metadata configuration files may contain five top-level sections: Schema Files, Properties, Classes, Work Types, and Namespaces. Each is interpreted separately and serves a separate function. The general structure of a metadata config file is therefore:
schema_files:
  # other metadata schema files to load go here
properties:
  # property definitions go here
classes:
  # class definitions go here
work_types:
  # work type properties assigned here
namespaces:
  # additional RDF namespaces defined here
When a metadata file is loaded, the schema_files section is interpreted first. This section contains a list of filenames to load. Each is assumed to be another metadata schema file formatted according to these same rules. The files are loaded in order, and each file may override settings defined by previously loaded files. The load order can therefore matter: files loaded later take precedence.
A simple schema files section might look like this:
schema_files:
  - 'metadata/shared_core_schema.yml'
  - 'metadata/samvera_geospatial_properties.yml'
The metadata schemas from all the loaded files are merged together. Deciding exactly how this merge should take place is the main source of complexity in this gem.
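The "later files take precedence" rule can be illustrated with a recursive hash merge. This is only a minimal sketch: the `deep_merge` helper is hypothetical, the two in-memory documents stand in for files listed under schema_files, and the gem's actual merge rules may be more involved.

```ruby
require 'yaml'

# Recursively merge two hashes; on conflict the value from `later` wins.
# (Hypothetical helper, not necessarily how the gem merges schemas.)
def deep_merge(earlier, later)
  earlier.merge(later) do |_key, old_val, new_val|
    if old_val.is_a?(Hash) && new_val.is_a?(Hash)
      deep_merge(old_val, new_val)
    else
      new_val
    end
  end
end

# Stand-in for the first file in schema_files.
core = YAML.safe_load(<<~SCHEMA)
  properties:
    title:
      predicate: DC:title
      label: Title
      multiple: true
SCHEMA

# Stand-in for a file loaded later; it overrides a single attribute.
local = YAML.safe_load(<<~SCHEMA)
  properties:
    title:
      multiple: false
SCHEMA

# Fold the documents together in load order.
merged = [core, local].reduce({}) { |acc, doc| deep_merge(acc, doc) }
puts merged['properties']['title'].inspect
```

Attributes the later file does not mention (here `predicate` and `label`) survive from the earlier file, while `multiple` takes the later value.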
Many settings may be overridden by other files, but others (such as the predicate associated with a property) can never be changed once defined. Any file that contradicts another about a given property's predicate, or that reuses a previously defined predicate, therefore causes an error.
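One way to picture the predicate rule is a check that runs across all loaded schemas before they are merged. The sketch below is an assumption about how such a check could look. `PredicateConflictError` and `check_predicates` are hypothetical names, and it interprets "reuse" as a second property claiming a predicate that another property already owns.

```ruby
# Hypothetical error raised when files disagree about predicates.
class PredicateConflictError < StandardError; end

# Walk the schemas in load order and record each property's predicate.
# Raises if a property's predicate changes, or if a predicate is claimed
# by a second property.
def check_predicates(schemas)
  by_property = {}   # property name => predicate
  by_predicate = {}  # predicate => property name

  schemas.each do |schema|
    (schema['properties'] || {}).each do |name, attrs|
      predicate = (attrs || {})['predicate']
      next if predicate.nil? # later files may omit the predicate entirely

      if by_property.key?(name) && by_property[name] != predicate
        raise PredicateConflictError,
              "#{name}: predicate already set to #{by_property[name]}"
      end

      owner = by_predicate[predicate]
      if owner && owner != name
        raise PredicateConflictError, "#{predicate} is already used by #{owner}"
      end

      by_property[name] = predicate
      by_predicate[predicate] = name
    end
  end

  by_property
end

first = { 'properties' => { 'title' => { 'predicate' => 'DC:title' } } }
relabel = { 'properties' => { 'title' => { 'label' => 'Name' } } }
puts check_predicates([first, relabel]).inspect
```

A later file that only changes `label` passes the check; one that sets a different predicate for `title`, or assigns `DC:title` to another property, raises.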
The properties section contains the metadata properties to be used in the system. Each property consists of a unique property name and a set of attributes that define how that property behaves. The overall structure of the properties section looks like this:
properties:
  first_property_name:
    predicate: DC:something
    other_attribute1: true
    other_attribute2: false
  second_property_name:
    predicate: DC:somethingelse
    other_attribute1: false
Property attributes behave differently depending on the context in which they are used. Some attributes are required and others are optional. The available attributes are:
- predicate: Required, unique.
- definition: Required, unique.
- usage_note
- usage_warning
- label
- range
- vocabularies (not yet implemented): This would hold a list of controlled vocabularies to associate with the property.
- input: This can specify which view partial to use for the edit form.
- multiple (true/false): Whether multiple values are allowed for this property. Defaults to true.
- facet (true/false): Whether this property should be indexed as a facet. Defaults to false.
- required (true/false): Whether this property should be required.
- primary (true/false): Whether this property is displayed on the short list of important properties on the edit form.
- hidden (true/false): Whether this property is completely hidden from all display and indexing. Defaults to false.
- work_title (true/false): Whether this property should be displayed as the primary title of the work. Defaults to false.
- display_type (true/false): Whether this property should be displayed as the type of the work. Defaults to false.
- display_groups: This attribute may contain a list of arbitrary groups that this item may belong to for display reasons. Local installations may want to use these to customize views without hard-coding schema information into view files.
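Several of the attributes above state a default value. As a rough sketch, those documented defaults could be applied to a property like this. The constant and method names are hypothetical, and attributes with no stated default (such as `required` and `primary`) are deliberately left unset.

```ruby
# Defaults taken from the attribute list above. Attributes without a
# documented default are not included.
DOCUMENTED_DEFAULTS = {
  'multiple' => true,
  'facet' => false,
  'hidden' => false,
  'work_title' => false,
  'display_type' => false
}.freeze

# Return the property's attributes with documented defaults filled in.
# Values given explicitly on the property always win.
def with_documented_defaults(attrs)
  DOCUMENTED_DEFAULTS.merge(attrs || {})
end

title = with_documented_defaults({ 'predicate' => 'DC:title', 'facet' => true })
puts title.inspect
```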
A property called "default" or "defaults" may be defined. If the default property defines an attribute that is not defined by another property, that other property will use the default property's attribute instead.
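The fallback to a "default" property can be sketched as a shallow merge in which each property's own attributes win over the default's. The method name `apply_default_property` is hypothetical, and the sketch assumes the default entry itself is not emitted as a real property.

```ruby
# Fill in attributes a property does not define from the "default"
# (or "defaults") entry. The default entry is dropped from the result.
def apply_default_property(properties)
  defaults = properties['default'] || properties['defaults'] || {}

  properties.each_with_object({}) do |(name, attrs), resolved|
    next if %w[default defaults].include?(name)

    resolved[name] = defaults.merge(attrs || {})
  end
end

properties = {
  'default' => { 'multiple' => false, 'facet' => true },
  'title' => { 'predicate' => 'DC:title' },
  'subject' => { 'predicate' => 'DC:subject', 'multiple' => true }
}

puts apply_default_property(properties).inspect
```

Here `title` picks up both attributes from the default property, while `subject` keeps its own `multiple: true` and inherits only `facet`.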
The work types section specifies which properties are associated with each work type, and allows that work type to override attributes of those properties. In the future, this code could dynamically define work types based on this schema; if that ever happens, more work type attributes would need to be specified in this section.
An example work types section might look like:
work_types:
  generic_work:
    properties:
      title:
      description:
        primary: true
        required: true
      creator:
        primary: true
        required: true
      contributor:
      rights_license:
  magazine:
    properties:
      title:
      description:
        label: "Summary"
        primary: true
        required: true
      creator:
        primary: true
        required: true
      publisher:
      date_published:
      scale:
      contributor:
      rights_license:
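To make the override behavior concrete, here is a minimal sketch of resolving the effective attributes for one work type. The `properties_for` helper and the `DC:title` / `DC:description` predicates are illustrative assumptions, not part of the gem's documented API.

```ruby
require 'yaml'

# Return the properties assigned to a work type, with the work type's
# overrides layered on top of the base definitions from `properties`.
def properties_for(schema, work_type)
  assigned = schema.dig('work_types', work_type, 'properties') || {}

  assigned.each_with_object({}) do |(name, overrides), resolved|
    base = schema.dig('properties', name) || {}
    resolved[name] = base.merge(overrides || {}) # a bare `title:` parses as nil
  end
end

schema = YAML.safe_load(<<~SCHEMA)
  properties:
    title:
      predicate: DC:title
      label: Title
    description:
      predicate: DC:description
      label: Description
  work_types:
    generic_work:
      properties:
        title:
        description:
          primary: true
    magazine:
      properties:
        title:
        description:
          label: Summary
          primary: true
SCHEMA

puts properties_for(schema, 'magazine')['description'].inspect
puts properties_for(schema, 'generic_work')['description'].inspect
```

The magazine work type shows `description` with the label "Summary", while generic_work keeps the base label; both keep the base predicate.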
The classes section defines classes of RDF objects that are saved in Fedora but are not part of PCDM (i.e., not Hyrax Works). These often correspond to entries in remote authority systems, and may contain local customizations.
An example classes section might look like:
classes:
  Agent:
    parent: "ActiveTriples::Resource"
    rdf_label: "::RDF::Vocab::FOAF.Agent"
    properties:
      local_label:
        definition: "If set, this is a local label for this agent to be indexed and displayed instead of any label fetched from remote authority"
        predicate: 'skos:altLabel'
The namespaces section simply allows us to define namespaces that are not already defined in the RDF-vocab gem used by Hyrax. Most of the important ones are pre-defined, but sometimes we might need one that isn't.
An example namespaces section might be:
namespaces:
  edm: "http://www.europeana.eu/schemas/edm/"
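As a rough illustration of what a custom namespace enables, the sketch below expands a prefixed predicate into a full URI using the namespaces section. `expand_predicate` is a hypothetical helper and `edm:hasType` is only an example value; how the gem actually resolves predicates (including prefixes already known to RDF-vocab) is not shown here.

```ruby
require 'yaml'

# Expand "prefix:local_name" into a full URI using the namespaces hash.
# Raises if the prefix was not declared in the namespaces section.
def expand_predicate(predicate, namespaces)
  prefix, local_name = predicate.split(':', 2)
  base = namespaces.fetch(prefix) do
    raise ArgumentError, "unknown namespace prefix: #{prefix}"
  end
  "#{base}#{local_name}"
end

config = YAML.safe_load(<<~SCHEMA)
  namespaces:
    edm: "http://www.europeana.eu/schemas/edm/"
SCHEMA

puts expand_predicate('edm:hasType', config['namespaces'])
```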