Epigraf

Epigraf is used in the interacademic project The German Inscriptions for creating editions of medieval inscriptions. The aim of the project is to collect and edit all Latin and German inscriptions from the Middle Ages and the early modern period up to the year 1650. As things stand today, the collection covers Germany and Austria as well as South Tyrol.

The results of the collection are published in printed volumes. A volume contains either the inscriptions of one or more urban or rural districts or the inscriptions of individual towns. Digital reproductions of these volumes are available on the German Inscriptions Online (DIO) portal. We are constantly working on providing volumes as structured data dumps that can be used for data analyses.

Volume	Articles	Quality	EpiDoc	RDF	Version	Published
DI 007 Braunschweig	73	Original data	di7_epidoc.zip	di7_ttl.zip di7_rdf_xml.zip di7_jsonld.zip	1.0	2025-05-18
DI 077 Greifswald	448	Original data	di77_epidoc.zip	di77_ttl.zip di77_rdf_xml.zip di77_jsonld.zip	1.0	2025-05-18
DI 102 Stralsund	457	Original data	di102_epidoc.zip	di102_ttl.zip di102_rdf_xml.zip di102_jsonld.zip	1.0	2025-05-18
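
The archive names in the table appear to follow a simple pattern: the volume number without leading zeros, followed by a format suffix. The sketch below derives a file name from a volume label under that assumption. The function name archive_name and the set of suffixes are illustrative and only reflect the three volumes listed above; other volumes may be named differently.

```python
import re

# Format suffixes as they appear in the table above (assumed to be stable).
KNOWN_FORMATS = ("epidoc", "ttl", "rdf_xml", "jsonld")


def archive_name(volume_label: str, fmt: str) -> str:
    """Build an archive file name such as 'di7_epidoc.zip' from 'DI 007 Braunschweig'."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    # Capture the volume number and drop its leading zeros.
    match = re.fullmatch(r"DI\s+0*(\d+)(?:\s+.*)?", volume_label.strip())
    if match is None:
        raise ValueError(f"unexpected volume label: {volume_label!r}")
    return f"di{match.group(1)}_{fmt}.zip"


if __name__ == "__main__":
    for label in ("DI 007 Braunschweig", "DI 077 Greifswald", "DI 102 Stralsund"):
        print(label, "->", [archive_name(label, f) for f in KNOWN_FORMATS])
```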

Don't hesitate to get in touch with us. We are happy to hear about your use cases.

Please respect the licences included in the data dumps. Where possible, data is published under CC BY 4.0, which allows you to use, edit and remix the data as long as you properly credit the source.

Citation: The German Inscriptions (2025). Structured Data Dumps (Version 1.0) [Data set]. Epigraf. https://epigraf.inschriften.net/pages/data

Data published on epigraf.inschriften.net varies in quality. Our IRIs follow a typing scheme that allows the data type to be determined directly from the identifiers:

Original data (EPI-Artikel): Books produced directly with Epigraf are well structured. The projects are typed as epi, the articles as epi-article.
Legacy data (DIO-Artikel): The project The German Inscriptions started long before the advent of computers, digital data formats and content management systems. For a large part of this data, we generated digital versions on the DIO portal, and the data is being reconstructed and published on Epigraf step by step. The projects are typed as dio, the articles as dio-article.
Placeholders (DI-Artikel): Data that is not published on Epigraf but is referenced by other data is represented by placeholders. The placeholders do not contain any useful data, just the article number and the title of the project or article. The projects are typed as di, the articles as di-article.
Collections (Bände): Some articles represent a whole volume. They are typed as collection.
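
The typing scheme above can be turned into a small lookup. The following is a minimal sketch that maps a type label to its data quality and filters out placeholders. It assumes the type label has already been extracted from an identifier. How the label is embedded in an IRI is not specified here, so no IRI parsing is attempted, and the names QUALITY_BY_TYPE, data_quality and without_placeholders are illustrative.

```python
# Type labels as listed above, mapped to the data quality they indicate.
QUALITY_BY_TYPE = {
    "epi": "original data",
    "epi-article": "original data",
    "dio": "legacy data",
    "dio-article": "legacy data",
    "di": "placeholder",
    "di-article": "placeholder",
    "collection": "collection",
}


def data_quality(type_label: str) -> str:
    """Return the data quality for a type label, or 'unknown' if it is not listed."""
    return QUALITY_BY_TYPE.get(type_label.strip().lower(), "unknown")


def without_placeholders(type_labels):
    """Keep only labels that do not denote placeholder data."""
    return [t for t in type_labels if data_quality(t) != "placeholder"]


if __name__ == "__main__":
    labels = ["epi-article", "dio-article", "di-article", "collection"]
    for label in labels:
        print(label, "->", data_quality(label))
    print(without_placeholders(labels))
```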

For guest users, we hide placeholder data, rearrange the article sections for better readability, and limit the visible fields according to copyright regulations. All data is published in printed books first. It is made available in structured formats on the DIO portal and on Epigraf after a moving copyright wall of two years.

EpiDoc is a data format used to digitally encode editions of ancient documents. It uses a subset of the Text Encoding Initiative (TEI) format and is widely used in the domain of epigraphy. The published data contains a selection of fields:

Title, author and identifier information of the article.
Transcription and translation of the inscription text according to the Leiden conventions.
Object description including dates, locations, object types, manufacturing techniques, materials and preservation state.
Commentary including information about the type of text and the languages used.
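
As an illustration of how such fields might be read, the sketch below parses a small EpiDoc-style document with Python's standard library. The sample XML is hypothetical and much simpler than the files in the dumps. It only relies on the TEI namespace and on the EpiDoc convention of marking text divisions with div elements typed as edition and translation. The exact structure of the published files may differ.

```python
import xml.etree.ElementTree as ET

# The TEI namespace used by EpiDoc documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample document, not taken from the data dumps.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Example grave slab</title>
      </titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="edition"><ab>Anno domini <expan><abbr>m</abbr><ex>illesimo</ex></expan></ab></div>
      <div type="translation"><p>In the year of the Lord one thousand</p></div>
    </body>
  </text>
</TEI>"""


def read_article(xml_text: str) -> dict:
    """Extract the title, edition text and translation from an EpiDoc string."""
    root = ET.fromstring(xml_text)

    def div_text(div_type: str) -> str:
        div = root.find(f".//tei:div[@type='{div_type}']", TEI_NS)
        if div is None:
            return ""
        # Join all nested text nodes and normalize whitespace.
        return " ".join("".join(div.itertext()).split())

    return {
        "title": root.findtext(".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS),
        "edition": div_text("edition"),
        "translation": div_text("translation"),
    }


if __name__ == "__main__":
    print(read_article(SAMPLE))
```

Note that flattening the edition to plain text discards the markup that encodes abbreviations and other editorial interventions, so this approach suits quick inspection rather than faithful reproduction of the transcription.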

Data following the Resource Description Framework (RDF) is structured as triples, each forming a statement that consists of a subject, a predicate and an object. Epigraf stores data in the Relational Article Model. From this data, triples are generated according to the following mapping.

We use selected vocabularies to model statements about articles that describe inscriptions and objects:

The RDF base vocabulary and schema.org vocabulary are used to describe the structure of the articles and basic attributes of objects and inscriptions.
The CIDOC Conceptual Reference Model (CRM) provides basic vocabulary for the field of cultural heritage as documented by museums, libraries and archives.
In the first version of our dumps, most triple objects are literals. Transcriptions follow the Leiden conventions. We are working on integrating authority data from controlled vocabularies.
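
To make the idea of generating triples from relational data concrete, here is a minimal sketch. It is not Epigraf's actual mapping: the row layout, the base IRI https://example.org/ and the choice of schema.org Article and name are assumptions made for illustration. Each triple is written out as one N-Triples line, with literal objects quoted and escaped.

```python
from typing import NamedTuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA_ARTICLE = "https://schema.org/Article"
SCHEMA_NAME = "https://schema.org/name"


class Triple(NamedTuple):
    subject: str          # IRI of the resource the statement is about
    predicate: str        # IRI of the property
    obj: str              # IRI or literal value
    obj_is_literal: bool  # True if obj is a plain string literal


def row_to_triples(row: dict, base_iri: str = "https://example.org/") -> list:
    """Turn one hypothetical article row into triples (illustrative mapping only)."""
    subject = f"{base_iri}{row['type']}/{row['id']}"
    return [
        Triple(subject, RDF_TYPE, SCHEMA_ARTICLE, False),
        Triple(subject, SCHEMA_NAME, row["title"], True),
    ]


def to_ntriples(triple: Triple) -> str:
    """Serialize a triple as a single N-Triples line."""
    if triple.obj_is_literal:
        escaped = (
            triple.obj.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        obj = f'"{escaped}"'
    else:
        obj = f"<{triple.obj}>"
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


if __name__ == "__main__":
    row = {"type": "epi-article", "id": 1, "title": "Example grave slab"}
    for t in row_to_triples(row):
        print(to_ntriples(t))
```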

The RDF data is serialized in Turtle (TTL), RDF/XML and JSON-LD format. All archives contain the same set of triples. Internationalized Resource Identifiers (IRIs) are used for unique and persistent identification of the resources. At the time of writing, the IRI resolver is only available to registered users. We need to resolve some technical issues before it can be opened to the public.
