Creating a new TEI file
All TEI files should start with the following three lines:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="https://raw.githubusercontent.com/msdesc/consolidated-tei-schema/master/msdesc.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="https://raw.githubusercontent.com/msdesc/consolidated-tei-schema/master/msdesc.rng" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
These can be copied and pasted verbatim and will not need updating when the schema is updated.
In an XML-aware editor such as Oxygen, these lines cause validation errors to be underlined in red as you work on the file.
Ensure all files are validated before committing them to this repository.
The fourth line is usually the opening tag of the root element, which should look like this:
<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="manuscript_UNIQUENUMBER">
The UNIQUENUMBER must be replaced with a number which is unique across all the Fihrist collections. That is needed because the xml:id attribute is what gives manuscripts their persistent URLs on the new web site, instead of the transitory URLs the old web site used to generate.
Batches of manuscript IDs have been pre-allocated to member institutions and are kept in the identifiers folder. Follow the instructions in the readme in that folder.
Note, if you want to commit new files before they are ready to be published on the Fihrist web site, you can comment out the manuscript ID, which will prevent that record from being included when the Fihrist web site is next re-indexed. For example:
<TEI xmlns="http://www.tei-c.org/ns/1.0"><!-- xml:id="manuscript_123456"-->
Just remember to change it back when you do wish it to be published, e.g.:
<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="manuscript_123456">
It is not absolutely essential, but each msItem element can be given an xml:id attribute. By convention, the value is the filename (without the .xml) followed by -itemX (or -itemX-itemY for an msItem nested inside another msItem, or -itemX-itemY-itemZ when triply nested, etc.). In multi-part manuscripts, the part number should also be included.
Examples:
<msItem xml:id="MS_Marsh_538-item2">
<msItem xml:id="WMS_Arabic_420-part1-item1-item4">
The first example is the second msItem in MS_Marsh_538.xml. The second example is the fourth child msItem within the first msItem in the first msPart in WMS_Arabic_420.xml.
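To show how such IDs relate to nesting, here is a minimal sketch of the second example in context. It is a fragment rather than a schema-valid record: the titles are placeholders, sibling items are elided in comments, and other required children of the msPart are left out.

```xml
<msPart>
  <!-- msIdentifier and other required children omitted for brevity -->
  <msContents>
    <!-- first msItem in the first msPart -->
    <msItem xml:id="WMS_Arabic_420-part1-item1">
      <title>Placeholder title of the containing work</title>
      <!-- child items 1 to 3 would appear here -->
      <!-- fourth child msItem of item1 -->
      <msItem xml:id="WMS_Arabic_420-part1-item1-item4">
        <title>Placeholder title of the nested work</title>
      </msItem>
    </msItem>
  </msContents>
</msPart>
```

Each level of nesting appends one more -itemX segment, so the ID alone tells you where the item sits in the file.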
The xml:id attribute of an msItem element only identifies the catalogued description of an instance of a work in one manuscript, within its TEI record. The "work_UNIQUENUMBER" IDs that end the URLs of work pages on the Fihrist web site identify works as intellectual entities. These are assigned to abstract works in the authority files, which only the central Fihrist editor should modify. When cataloguing the instance of a work in a manuscript, first check whether the same work in another manuscript is already in Fihrist by searching the web site. Try alternative titles or transliterations. If you find a match, copy its "work_UNIQUENUMBER" from the URL into the key attribute of the title child element(s) of the msItem in your TEI record (e.g. if cataloguing a copy of Gulistān, use key="work_20962").
If you cannot find a match, the work you are cataloguing is probably the first instance of it in any of the manuscripts in Fihrist. If so, create a blank key attribute (key="") in the title. After you have committed and pushed your record, a new entry will be generated for the work, which will have its own unique "work_UNIQUENUMBER" ID. The Fihrist editor will review it (to ensure it is indeed a unique work) and plug the new ID into the key attribute(s) in your record.
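As an illustration, the two cases might look like the fragment below. The filename MS_Example_1 and the second title are hypothetical placeholders; only the Gulistān key comes from the guidance above.

```xml
<msContents>
  <!-- Work already in Fihrist: key copied from the work page URL -->
  <msItem xml:id="MS_Example_1-item1">
    <title key="work_20962">Gulistān</title>
  </msItem>
  <!-- Work not found in Fihrist: blank key, to be filled in by the Fihrist editor -->
  <msItem xml:id="MS_Example_1-item2">
    <title key="">Placeholder title of a newly catalogued work</title>
  </msItem>
</msContents>
```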
Not all works need keys. If, for example, you are cataloguing a book of one hundred short poems, you can choose to catalogue each poem as a child msItem, each with a title, but only add a key attribute to the title of the msItem for the whole poetry collection. In that case, only one authority entry will be created.
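A sketch of that arrangement, assuming a hypothetical file MS_Example_2.xml and placeholder titles, with most of the poems elided:

```xml
<msItem xml:id="MS_Example_2-item1">
  <!-- Only the collection as a whole gets a key (blank here, as for a new work) -->
  <title key="">Placeholder title of the poetry collection</title>
  <msItem xml:id="MS_Example_2-item1-item1">
    <!-- No key attribute: no authority entry is created for this poem -->
    <title>Placeholder title of the first poem</title>
  </msItem>
  <msItem xml:id="MS_Example_2-item1-item2">
    <title>Placeholder title of the second poem</title>
  </msItem>
  <!-- remaining poems follow the same pattern -->
</msItem>
```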
Person keys work the same way as work keys, except that the key attribute should be created in author or editor elements (or, in contexts other than the originators of works, persName elements). As well as searching the Fihrist web site for the person's name, also search VIAF, trying alternative versions of the name or different transliterations. If you find the person in VIAF, you can put the VIAF ID into the key attribute, prefixed by "person_" (e.g. for the poet Saʻdī, use key="person_100206721").
If you cannot find a match, the person probably hasn't been mentioned in any of the manuscripts already in Fihrist. If so, create a blank key attribute (key="").
Not all persons need keys. If, for example, you are recording a given name mentioned in a text, which is not and can never be further identified, you can choose not to add a key attribute to the persName. In that case, no authority entry will be created for that name.
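The three person cases could be sketched as follows. The filename MS_Example_1, the placeholder names, and the use of a note to hold the unkeyed persName are assumptions made for illustration only.

```xml
<msContents>
  <msItem xml:id="MS_Example_1-item1">
    <!-- Person found in VIAF: VIAF ID prefixed with person_ -->
    <author key="person_100206721">Saʻdī</author>
    <title key="work_20962">Gulistān</title>
  </msItem>
  <msItem xml:id="MS_Example_1-item2">
    <!-- Person not found in Fihrist or VIAF: blank key -->
    <author key="">Placeholder name of an author</author>
    <title key="">Placeholder title</title>
    <!-- Unidentifiable given name: no key attribute, so no authority entry -->
    <note>Mentions <persName>Placeholder given name</persName>.</note>
  </msItem>
</msContents>
```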
Subject keys are similar to work and person keys, except that you must use subjects from the Library of Congress Subject Headings (LCSH). Search there and paste the LCSH ID into the key attribute of the term element, prefixed by "subject_" (e.g. for the topic of cheese-making, use key="subject_sh99005888"). Blank keys should not be used in term elements.
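A minimal sketch of a keyed subject term is shown below. The surrounding textClass and keywords wrappers are an assumption about where the term sits in the record, and the label text is a placeholder; copy the heading wording from the LCSH record itself and follow existing Fihrist records for placement.

```xml
<textClass>
  <keywords>
    <!-- LCSH ID prefixed with subject_; never leave this key blank -->
    <term key="subject_sh99005888">Cheese-making</term>
  </keywords>
</textClass>
```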
If the manuscript being described has been wholly or partially digitized and is available online, you can add the following inside the additional section, after the adminInfo:
<surrogates>
    <bibl type="digital-facsimile" subtype="____">
        <ref target="___________">
            <title>___________</title>
        </ref>
        <note>(___________)</note>
    </bibl>
</surrogates>
The subtype attribute should be either "full" or "partial". The target attribute should be the persistent URL of the digital surrogate. The title should be the name of the service hosting the image (e.g. "Digital Bodleian", "Manchester Digital Collections", etc.). The note should be either "full digital facsimile" or a description of the extent of a partial digitization (e.g. "miniature paintings only", "single sample image", etc.). If there are multiple digital surrogates at different URLs, add more bibl tags in the same surrogates element.
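For instance, a manuscript with one full and one partial surrogate might be recorded as in this sketch. Both URLs are hypothetical placeholders on example.org and must be replaced with the real persistent URLs.

```xml
<surrogates>
    <!-- Full digitization; the target URL is a hypothetical placeholder -->
    <bibl type="digital-facsimile" subtype="full">
        <ref target="https://example.org/placeholder-persistent-url-1">
            <title>Digital Bodleian</title>
        </ref>
        <note>(full digital facsimile)</note>
    </bibl>
    <!-- Partial digitization hosted elsewhere; the URL is again a placeholder -->
    <bibl type="digital-facsimile" subtype="partial">
        <ref target="https://example.org/placeholder-persistent-url-2">
            <title>Manchester Digital Collections</title>
        </ref>
        <note>(miniature paintings only)</note>
    </bibl>
</surrogates>
```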
The TEI schema has more detailed documentation, which is available here:
https://msdesc.github.io/consolidated-tei-schema/msdesc.html
A lot of the code snippets are examples from western medieval manuscripts, but the principles should be similar.