Canonicalized TEI
A post-processed version of the TEI encoded according to these guidelines
The TEI described in these guidelines is highly customized by means of an ODD (One Document Does it all) specification. Some of these customizations, although documented in the ODD and in the derived schema, may make the data less interchangeable.
The Beta maṣāḥǝft project maintains a script, the post.xsl transformation, which makes the data more canonical and which others can use to the same effect. It relies on a list of the project's editors and on a canonicalized taxonomy.
We can start from the latter. The taxonomy maintained with the data has the following format.
<category>
<desc>Special Manuscripts</desc>
<category>
<catDesc>GoldenGospel</catDesc>
</category>
<category>
<catDesc>miniatureCollection</catDesc>
</category>
</category>
Example 1
This is not what the TEI Guidelines describe. It is a much smaller version, without @xml:id, in which the string actually used for reference is the content of <catDesc>, which corresponds to the name of the TEI file for that concept in the database. The canonicalized taxonomy looks instead like the following example.
<category>
<desc>Special Manuscripts</desc>
<category xml:id="GoldenGospel" corresp="https://betamasaheft.eu/authority-files/GoldenGospel/main">
<catDesc>Golden Gospel</catDesc>
</category>
<category xml:id="miniatureCollection" corresp="https://betamasaheft.eu/authority-files/miniatureCollection/main">
<catDesc>Miniature Collection</catDesc>
</category>
</category>
Example 2
Here the values of <catDesc> are moved to an @xml:id, and the element content is replaced with the <title> of the corresponding file. Additionally, a @corresp with the URL of the landing page for that concept in the app is provided. When this taxonomy is included in the post-processed TEI, the file can point to these values internally. So, for example, <term[@key]>, which in the edited file takes its values from a list in the schema reproduced from the taxonomy shown in Example 1, can be changed to <term[@ana]>, whose value is a fragment URI (e.g. #GoldenGospel).
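The taxonomy changes described above can be sketched in Python. This is only an illustration under stated assumptions, not the project's post.xsl: the function names, the `titles` dictionary (a stand-in for reading the <title> of each authority file) and the landing-page URL pattern (modelled on Example 2) are all hypothetical, and the elements are un-namespaced as in the examples on this page.

```python
import xml.etree.ElementTree as ET

# Assumed landing-page pattern, modelled on the @corresp values in Example 2.
AUTHORITY_URL = "https://betamasaheft.eu/authority-files/{id}/main"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def canonicalize_taxonomy(root, titles):
    """Move each <catDesc> string to @xml:id and replace it with a title.

    `titles` is a hypothetical lookup table standing in for the <title>
    of the authority file named by the original <catDesc> string.
    """
    for category in root.iter("category"):
        cat_desc = category.find("catDesc")  # direct children only
        if cat_desc is None or not cat_desc.text:
            continue  # grouping category that only carries a <desc>
        ident = cat_desc.text.strip()
        category.set(XML_ID, ident)
        category.set("corresp", AUTHORITY_URL.format(id=ident))
        cat_desc.text = titles.get(ident, ident)
    return root


def key_to_ana(term):
    """Turn <term key="X"/> into <term ana="#X"/> (a fragment URI)."""
    key = term.attrib.pop("key", None)
    if key is not None:
        term.set("ana", "#" + key)
    return term


edited = ET.fromstring(
    "<category><desc>Special Manuscripts</desc>"
    "<category><catDesc>GoldenGospel</catDesc></category>"
    "<category><catDesc>miniatureCollection</catDesc></category>"
    "</category>"
)
titles = {
    "GoldenGospel": "Golden Gospel",
    "miniatureCollection": "Miniature Collection",
}
canonical = canonicalize_taxonomy(edited, titles)
term = key_to_ana(ET.fromstring('<term key="GoldenGospel"/>'))
```

A real TEI file declares the TEI namespace, so the element names passed to `iter` and `find` would need to be namespace-qualified there.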
In Beta maṣāḥǝft both versions can be viewed and obtained. By pointing simply at the ID of the file followed by .xml (e.g. https://betamasaheft.eu/BNFet32.xml), the user obtains the file encoded according to these guidelines. When /tei/ is prepended (e.g. https://betamasaheft.eu/tei/BNFet32.xml), the TEI file encoded according to these guidelines is first transformed with post.xsl and then presented.
To stay with the example above, the edited file will not contain the <taxonomy> at all, as the values are replicated in the schema as values for the @key attribute of <term> or of other elements and attributes. However, in the post-processed file, the entire taxonomy will be included, in the canonicalized form described above.
Another important difference concerns URIs. In the edited TEI file, we want them to be as short as possible. For Beta maṣāḥǝft identifiers we use the plain ID of the file. This is to be interpreted using the @xml:base attribute, which contains the base URL of the app and allows an interpreter to build the full URI of the corresponding web resource, which can in turn be used to retrieve any of the content types provided (HTML, XML, or an RDF representation or centred graph). In the post-processed version of the TEI, this @xml:base is not needed, because all these pointers are spelt out in full.
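A minimal sketch of how a plain ID relates to the full URIs mentioned above. The function names are hypothetical, and the value of @xml:base is an assumption inferred from the URLs quoted on this page.

```python
from urllib.parse import urljoin

# Assumed value of @xml:base, inferred from the URLs quoted on this page.
XML_BASE = "https://betamasaheft.eu/"


def expand_id(pointer, base=XML_BASE):
    """Resolve a plain file ID against the base URL.

    Pointers that are already absolute URIs are returned unchanged.
    """
    return urljoin(base, pointer)


def edited_xml_url(file_id, base=XML_BASE):
    """URL of the file as encoded according to these guidelines."""
    return urljoin(base, file_id + ".xml")


def post_processed_xml_url(file_id, base=XML_BASE):
    """URL of the same file after transformation with post.xsl."""
    return urljoin(base, "tei/" + file_id + ".xml")


print(expand_id("NAR0001gwelt"))
print(edited_xml_url("BNFet32"))
print(post_processed_xml_url("BNFet32"))
```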
Similarly, URIs of external resources are entered in the edited XML using prefixes. These are documented in the <listPrefixDef>, which is likewise not physically present in the file but is included in it.
<listPrefixDef>
<prefixDef ident="bm" matchPattern="([a-zA-Z0-9]+)" replacementPattern="https://www.zotero.org/groups/358366/ethiostudies/items/tag/bm:$1">
</prefixDef>
<prefixDef ident="betmas" matchPattern="([a-zA-Z0-9\.\-]+)" replacementPattern="https://betamasaheft.eu/$1">
</prefixDef>
<prefixDef ident="iha" matchPattern="([a-zA-Z0-9\.\-]+)" replacementPattern="http://islhornafr.eu//$1">
</prefixDef>
<prefixDef ident="ethiocal" matchPattern="([a-zA-Z0-9]+)" replacementPattern="https://raw.githubusercontent.com/BetaMasaheft/BetMas/master/BetMas/calendars/ethiopian.xml#$1">
</prefixDef>
<prefixDef ident="pleiades" matchPattern="(\d{5,8})" replacementPattern="https://pleiades.stoa.org/places/$1">
</prefixDef>
<prefixDef ident="sdc" matchPattern="([a-zA-Z0-9]+)" replacementPattern="https://w3id.org/sdc/ontology#$1">
</prefixDef>
<prefixDef ident="wd" matchPattern="([a-zA-Z0-9]+)" replacementPattern="https://www.wikidata.org/entity/$1">
</prefixDef>
<prefixDef ident="snap" matchPattern="([a-zA-Z]+)" replacementPattern="http://data.snapdrgn.net/ontology/snap#$1">
</prefixDef>
<prefixDef ident="saws" matchPattern="([a-zA-Z]+)" replacementPattern="http://purl.org/saws/ontology#$1">
</prefixDef>
<prefixDef ident="skos" matchPattern="([a-za-zA-Z]+)" replacementPattern="http://www.w3.org/2004/02/skos/core#$1">
</prefixDef>
<prefixDef ident="gn" matchPattern="([a-zA-Z0-9]+)" replacementPattern="http://www.geonames.org/ontology#$1">
</prefixDef>
<prefixDef ident="dcterms" matchPattern="([a-zA-Z]+)" replacementPattern="http://purl.org/dc/terms/$1">
</prefixDef>
<prefixDef ident="dc" matchPattern="([a-zA-Z]+)" replacementPattern="http://purl.org/dc/terms/$1">
</prefixDef>
<prefixDef ident="lawd" matchPattern="([a-zA-Z]+)" replacementPattern="http://lawd.info/ontology/$1">
</prefixDef>
<prefixDef ident="syriaca" matchPattern="([a-zA-Z\-]+)" replacementPattern="http://syriaca.org/documentation/relations.html#$1">
</prefixDef>
<prefixDef ident="agrelon" matchPattern="([a-zA-Z]+)" replacementPattern="http://d-nb.info/standards/elementset/agrelon.owl#$1">
</prefixDef>
<prefixDef ident="rel" matchPattern="([a-zA-Z]+)" replacementPattern="http://purl.org/vocab/relationship/$1">
</prefixDef>
<prefixDef ident="em" matchPattern="(\d+)" replacementPattern="https://www.eagle-network.eu/voc/material/lod/$1">
</prefixDef>
<prefixDef ident="eo" matchPattern="(\d+)" replacementPattern="https://www.eagle-network.eu/voc/objtyp/lod/$1">
</prefixDef>
<prefixDef ident="ew" matchPattern="(\d+)" replacementPattern="https://www.eagle-network.eu/voc/writing/lod/$1">
</prefixDef>
<prefixDef ident="ic" matchPattern="([a-zA-Z0-9]+)" replacementPattern="http://iconclass.org/$1">
</prefixDef>
<prefixDef ident="ecrm" matchPattern="([a-zA-Z0-9]+)" replacementPattern="http://erlangen-crm.org/current/$1">
</prefixDef>
<prefixDef ident="foaf" matchPattern="([a-zA-Z0-9]+)" replacementPattern="http://xmlns.com/foaf/0.1/$1">
</prefixDef>
</listPrefixDef>
Example 3
The included TEI fragment contains statements like the following.
<prefixDef ident="wd" matchPattern="([a-zA-Z0-9]+)" replacementPattern="https://www.wikidata.org/entity/$1">
</prefixDef>
Example 4
This information, which is edited once for all files, is used in post-processing to reconstruct the full URI from a prefixed pointer. For example, the following relation in the edited file
<relation name="skos:broadMatch" active="NAR0001gwelt" passive="betmas:LandGrant"></relation>
Example 5
will appear in the post-processed file as
<relation name="skos:broadMatch" ref="http://www.w3.org/2004/02/skos/core#broadMatch" active="https://betamasaheft.eu/NAR0001gwelt" passive="https://betamasaheft.eu/LandGrant"></relation>
Example 6
This also makes the <prefixDef> elements unnecessary, so they are removed from the post-processed file.
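The expansion of prefixed pointers can be illustrated with a short Python sketch. It is not the post.xsl code: the function names are hypothetical, only three of the definitions from Example 3 are copied in, and the pointer `wd:Q42` is an arbitrary illustration.

```python
import re
import xml.etree.ElementTree as ET

# Three definitions copied from Example 3; the real list is longer.
LIST_PREFIX_DEF = ET.fromstring(r"""
<listPrefixDef>
  <prefixDef ident="betmas" matchPattern="([a-zA-Z0-9\.\-]+)"
             replacementPattern="https://betamasaheft.eu/$1"/>
  <prefixDef ident="wd" matchPattern="([a-zA-Z0-9]+)"
             replacementPattern="https://www.wikidata.org/entity/$1"/>
  <prefixDef ident="skos" matchPattern="([a-za-zA-Z]+)"
             replacementPattern="http://www.w3.org/2004/02/skos/core#$1"/>
</listPrefixDef>
""")


def load_prefixes(list_prefix_def):
    """Map each @ident to its compiled pattern and replacement string."""
    return {
        d.get("ident"): (re.compile(d.get("matchPattern")),
                         d.get("replacementPattern"))
        for d in list_prefix_def.iter("prefixDef")
    }


def expand(pointer, prefixes):
    """Expand 'prefix:local' to a full URI; return other values unchanged."""
    prefix, sep, local = pointer.partition(":")
    if not sep or prefix not in prefixes:
        return pointer  # no known prefix, e.g. a plain ID or an http URI
    pattern, replacement = prefixes[prefix]
    match = pattern.fullmatch(local)
    if match is None:
        return pointer  # local part does not fit the declared pattern
    return replacement.replace("$1", match.group(1))


prefixes = load_prefixes(LIST_PREFIX_DEF)
print(expand("skos:broadMatch", prefixes))
print(expand("betmas:LandGrant", prefixes))
print(expand("wd:Q42", prefixes))
```

Substituting `$1` by hand rather than through `re.sub` sidesteps the difference between the `$1` syntax of the TEI replacement pattern and Python's `\1` backreferences.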
The values of @who in <change> and of @calendar in <date> are also governed by lists of values in the schema. In the first case, we want them instead to be, more canonically, pointers to the @xml:id of an <editor> or <respStmt>, and the post-processed file is transformed accordingly. Similarly, for @calendar a list of <calendar> elements with @xml:id is added, and the attribute values are turned into references to those elements by prepending a #.
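A small Python sketch of the attribute rewriting just described. It assumes that @who is turned into a local pointer in the same way as @calendar, by prepending a #; the function names and the sample values `XY` and `ethiopian` are hypothetical, and the elements are un-namespaced as in the examples on this page.

```python
import xml.etree.ElementTree as ET


def as_fragment(value):
    """Prefix a bare identifier with '#' so that it reads as a local pointer."""
    return value if value.startswith("#") else "#" + value


def canonicalize_local_pointers(root):
    """Rewrite @calendar on <date> and @who on <change> as '#' pointers.

    Returns the sorted calendar identifiers found, i.e. the ones for which
    a <calendar> element with that @xml:id would have to be added.
    """
    calendars = set()
    for date in root.iter("date"):
        value = date.get("calendar")
        if value:
            calendars.add(value.lstrip("#"))
            date.set("calendar", as_fragment(value))
    for change in root.iter("change"):
        value = change.get("who")
        if value:
            change.set("who", as_fragment(value))
    return sorted(calendars)


sample = ET.fromstring(
    '<TEI><change who="XY" when="2019-07-03"/>'
    '<date calendar="ethiopian">1840</date></TEI>'
)
needed_calendars = canonicalize_local_pointers(sample)
```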
Another change made in post-processing is that empty pointers are populated, so that a repository reference, for example, also gets text content.
<repository ref="INS0344AksumS"></repository>
Example 7
will become
<repository ref="https://betamasaheft.eu/INS0344AksumS">ʾAksum Ṣǝyon</repository>
Example 8
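The step from Example 7 to Example 8 could be sketched as follows. The `LABELS` dictionary is a hypothetical stand-in for looking up the name of the record in the data, and the base URL is an assumption inferred from Example 8.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "https://betamasaheft.eu/"  # assumed base URL of the app
# Hypothetical lookup table standing in for a query on the record's name.
LABELS = {"INS0344AksumS": "ʾAksum Ṣǝyon"}


def populate_pointer(element, labels=LABELS, base=XML_BASE):
    """Give an empty element a label and spell out its @ref in full."""
    ref = element.get("ref")
    if ref is None:
        return element
    is_empty = len(element) == 0 and not (element.text or "").strip()
    if is_empty and ref in labels:
        element.text = labels[ref]
    element.set("ref", urljoin(base, ref))
    return element


repo = populate_pointer(
    ET.fromstring('<repository ref="INS0344AksumS"></repository>')
)
print(ET.tostring(repo, encoding="unicode"))
```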
Similarly, a <bibl> that contains only a <ptr> with @target in the edited TEI is expanded in the more canonical version to include the entire TEI rendering of the Zotero record pointed to.
<bibl>
<ptr target="bm:Rueppell1840Reise"></ptr>
</bibl>
Example 9
will be seen in the post-processed TEI as
<bibl corresp="http://zotero.org/groups/358366/items/BRDJ9JFF" type="book">
<title level="m">Reise in Abyssinien</title>
<author>
<forename>Eduard</forename>
<surname>Rüppell</surname>
</author>
<pubPlace>Frankfurt am Main</pubPlace>
<publisher>
gedruckt auf Kosten des Verfassers, und in Commission bei Siegmund Schmerber
</publisher>
<date>1840</date>
<biblScope unit="volume">II</biblScope>
<note type="url">https://archive.org/details/reiseinabyssinie02rupp</note>
</bibl>
Example 10
<locus> is also populated with some text when it has none and only the attributes are given.
<locus from="1r" to="4v"></locus>
Example 11
becomes
<locus from="1r" to="4v">ff. 1r-4v </locus>
Example 12
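A sketch of how the text in Example 12 could be derived from the attributes in Example 11. It covers only the @from and @to case shown on this page; how other attribute combinations are rendered is not described here, so the hypothetical function below leaves them alone.

```python
import xml.etree.ElementTree as ET


def populate_locus(locus):
    """Fill an empty <locus> with a folio range built from @from and @to."""
    start, end = locus.get("from"), locus.get("to")
    has_text = bool((locus.text or "").strip())
    if start and end and not has_text:
        # The trailing space follows the output shown in Example 12.
        locus.text = f"ff. {start}-{end} "
    return locus


filled = populate_locus(ET.fromstring('<locus from="1r" to="4v"></locus>'))
print(ET.tostring(filled, encoding="unicode"))
```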
Revisions of this page
- Pietro Maria Liuzzo on 2019-07-03: first version of guidelines from Wiki