BAlter edited this page Apr 27, 2021 · 5 revisions

Solr Index

Finding Aids is powered by Blacklight, which means it is backed by Solr. A Solr index allows for robust searching and quick retrieval of large data sets. How perfect for special collections finding aids! These finding aids, exported by Archivists' Toolkit or ArchivesSpace or [INSERT YOUR ARCHIVE TOOL HERE] as EAD (Encoded Archival Description) XML documents, can be massive files with varying search and styling needs. Using the solr_ead gem, we index these documents in Solr in a form Blacklight can read easily, so we benefit from all of its built-in freebies.

EAD Repository

A separate GitHub repository houses the actual EADs in all their original exported glory. We can schedule a full reindex of the EADs into Solr. We can also schedule jobs to reindex only the EADs that have changed, or only those changed within a certain period of time.

Indexing EAD files

Index/Reindex a single EAD or a whole directory

rake findingaids:ead:index EAD=findingaids_eads/archives/adler.xml
rake findingaids:ead:index EAD=findingaids_eads/archives

Reindex only the files in the data repository that have changed since the last commit

rake findingaids:ead:reindex_changed

Reindex only the files in the data repository that have changed since last week

rake findingaids:ead:reindex_changed_since_last_week

Reindex only the files in the data repository that have changed since yesterday

rake findingaids:ead:reindex_changed_since_yesterday

Reindex only the files in the data repository that have changed since X days ago

rake findingaids:ead:reindex_changed_since_days_ago[days]
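The page does not show how the "changed since" tasks decide which files to reindex. One plausible approach, sketched here purely as an illustration, is to ask git which EAD files were touched in the given window. The helper names, the `.xml` filter, and the default `findingaids_eads` checkout path are assumptions, not the project's actual implementation.

```ruby
require 'open3'

# Hypothetical helper: reduce the raw output of
# `git log --name-only --pretty=format:` to a unique list of EAD paths.
def changed_eads_from_git_output(output)
  output.each_line
        .map(&:strip)
        .reject(&:empty?)
        .select { |path| path.end_with?('.xml') } # assumption: EADs are .xml files
        .uniq
end

# Hypothetical helper: ask git which files changed in the last `days` days.
# Integer(days) rejects non-numeric input before it reaches the command line.
def changed_eads_since(days, repo_dir: 'findingaids_eads')
  args = ['git', '-C', repo_dir, 'log', '--name-only',
          '--pretty=format:', "--since=#{Integer(days)} days ago"]
  stdout, status = Open3.capture2(*args)
  raise "git log failed in #{repo_dir}" unless status.success?

  changed_eads_from_git_output(stdout)
end
```

Deleted files also appear in `git log --name-only`, so a real task would have to decide whether to remove them from the index rather than reindex them.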

Delete from index

Warning: The following task will delete everything in the Solr index.

Never do this in production.

rake findingaids:ead:clean

To delete all records from the index do the following in the Rails console:

indexer = SolrEad::Indexer.new
indexer.solr.delete_by_query("*:*")

Or, to delete only the records matching a query:

indexer.solr.delete_by_query("repository_s:fales")
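The string passed to delete_by_query is ordinary Solr query syntax, so a value containing characters such as `:` or `*` needs escaping before it is interpolated into the query. A minimal sketch of a helper that builds a `field:value` query safely follows. The names `solr_escape` and `delete_query_for` are hypothetical, and the field name in the usage comment is an assumption; use whichever field your schema defines.

```ruby
# Characters with special meaning in the standard Solr query parser, plus whitespace.
SOLR_SPECIAL = /[+\-&|!(){}\[\]^"~*?:\\\/\s]/

# Hypothetical helper: backslash-escape every special character in a value.
def solr_escape(value)
  value.to_s.gsub(SOLR_SPECIAL) { |char| "\\#{char}" }
end

# Hypothetical helper: build a "field:value" query with the value escaped.
def delete_query_for(field, value)
  "#{field}:#{solr_escape(value)}"
end

# In the Rails console (assumes the solr_ead gem is loaded):
#   indexer = SolrEad::Indexer.new
#   indexer.solr.delete_by_query(delete_query_for("repository_s", "fales"))
#   indexer.solr.commit
```

Depending on how Solr's autoCommit is configured, the explicit commit may or may not be needed before the deletion becomes visible to searches.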

Custom document definition

SolrEad allows for the definition of a CustomDocument, which overrides the default terminology used when converting an EAD into a Solr document (the terminology is written in OM, or Opinionated Metadata, format). Our CustomDocument can be found in lib/findingaids/custom_document.rb, with further formatting done in lib/findingaids/record.rb.

See the solr_ead documentation for more information on custom documents.

Component indexing and searching

EAD XML documents contain separate components, denoted by <c> elements. If so configured, SolrEad indexes each component as its own Solr document, with a reference back to its parent EAD. As with the CustomDocument, a CustomComponent can be defined; ours is at lib/findingaids/custom_component.rb.
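To make the idea of component documents concrete, here is a rough sketch using only Ruby's bundled REXML library. It is not how solr_ead does it. It walks an EAD, emits one hash per `<c>` element, and records the parent's `<eadid>` on each. The hash keys, the id scheme (eadid plus the component's `id` attribute), and the assumption of an EAD without an XML namespace are all illustrative choices.

```ruby
require 'rexml/document'

# Hypothetical sketch: one hash per <c> element, each pointing back to its parent EAD.
# Assumes the EAD has no default XML namespace.
def component_docs(ead_xml)
  doc = REXML::Document.new(ead_xml)
  ead_id = REXML::XPath.first(doc, '//eadid')&.text.to_s.strip

  REXML::XPath.match(doc, '//c').map do |c|
    # Only look at this component's own <did>, not those of nested components.
    title = REXML::XPath.first(c, 'did/unittitle')&.text.to_s.strip
    {
      'id'     => "#{ead_id}#{c.attributes['id']}", # illustrative id scheme
      'ead_id' => ead_id,                           # reference back to the parent EAD
      'title'  => title
    }
  end
end
```

Because nested `<c>` elements are matched too, a series and the files inside it each become their own entry, all sharing the same `ead_id`.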