Powered by <TEI:TOK>
Maarten Janssen, 2014-

TEITOK Help Pages

Using XML Templates

TEITOK was initially designed to work with existing annotated text in TEI format, with the files generated by whatever method the user had already established for creating XML files. However, to make it easier to create new XML files, TEITOK can also generate them from an XML template. In that case, creating a TEITOK annotated file consists of the following steps:

  1. Create a new file based on a template: the interface asks for all the fields that need to be filled in and creates the XML file, with a teiHeader based on the template and an empty text.
  2. Transcribe the raw text using shorthand notation, that is, abbreviations of the kind traditionally used in palaeography, standing in for the markup that occurs frequently in the documents.
  3. Once the transcription is (provisionally) complete, the shorthand is converted into full TEI XML format, and the individual words in the text are marked as items (tokenization).
  4. Once the text is in tokenized XML, it is annotated further, either by hand, by clicking on the words whose annotation needs to be changed or added, or by running a part-of-speech tagger to assign labels to words automatically and then correcting the tagger's errors.
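
To make the workflow concrete, here is a minimal Python sketch of steps 1, 3 and 4. It is an illustration only and not TEITOK's own code: the template, the `{title}` and `{author}` placeholders, the function names, the `w-N` id scheme and the `pos` attribute are all assumptions made for this example. The shorthand conversion in steps 2 and 3 is left out because the shorthand notation is not described on this page.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical template: the placeholders {title} and {author} are
# invented for this sketch and are not TEITOK's template syntax.
TEMPLATE = """<TEI>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>{title}</title>
        <author>{author}</author>
      </titleStmt>
    </fileDesc>
  </teiHeader>
  <text/>
</TEI>"""


def create_from_template(fields):
    """Step 1: fill in the header fields; the <text> element stays empty."""
    safe = {key: escape(value) for key, value in fields.items()}
    return ET.fromstring(TEMPLATE.format(**safe))


def tokenize(root, raw_text):
    """Step 3 (simplified): wrap every word and punctuation mark in <tok>."""
    text = root.find("text")
    forms = re.findall(r"\w+|[^\w\s]", raw_text)
    for number, form in enumerate(forms, start=1):
        # The id scheme "w-N" is an assumption of this sketch.
        tok = ET.SubElement(text, "tok", {"id": f"w-{number}"})
        tok.text = form
        tok.tail = " "
    return root


def annotate(root, tok_id, pos):
    """Step 4 (by hand): set a part-of-speech label on a single token."""
    for tok in root.iter("tok"):
        if tok.get("id") == tok_id:
            tok.set("pos", pos)
            return tok
    raise KeyError(tok_id)


root = create_from_template({"title": "Letter 1", "author": "Unknown"})
tokenize(root, "Hello, world.")
annotate(root, "w-3", "NOUN")
tokens = [tok.text for tok in root.iter("tok")]
print(ET.tostring(root, encoding="unicode"))
```

In this sketch the annotation is stored as an attribute on each token element, so a manual correction or a tagger run only has to update attributes and never touches the transcribed text itself.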
