Session Information


Encoding the Writing Process: TEI Between Research and Computational Use

Abstract

The Text Encoding Initiative (TEI) has long been used in digital humanities to encode manuscripts and historical documents, primarily focusing on textual products. More recently, TEI has been applied to the encoding of the writing process itself (Bekius, 2023), opening new possibilities for integrating genetic criticism, writing studies, and process-oriented research.

As an open and extensible XML-based markup language, TEI is a promising candidate for encoding not only manuscripts but also born-digital writing processes, shifting the focus from documents to writing sessions and dynamic trajectories of text production. Such an approach opens up new applications, including the visualization of writing dynamics (e.g. through tools such as Keystroke Loxensis, part of the eXtant toolkit; Bekius, 2024) and the creation of structured datasets for computational analysis and artificial intelligence systems.

Even though TEI could ensure interoperability across projects and disciplines, its complexity and verbosity raise concerns when it is applied to large-scale or fine-grained writing process data, such as keystroke logs. Encoding long writing sessions at the micro level raises problems of overlapping elements and is also time-consuming and cognitively demanding.

This roundtable explores this tension by asking whether TEI can realistically function as a standard for writing process research, and under what conditions. Key questions for discussion include: Is TEI suited to representing writing dynamics captured through log files? What alternatives or hybrid solutions might exist? Can parts of the encoding process be automated? A central focus will be the selection problem: which process data are actually relevant to encode, particularly when studying creativity in writing?
An additional perspective from computer science will consider whether TEI-based representations of writing processes can function as inputs for artificial agents designed to reproduce an author’s writing style and creative dynamics.

Bekius, Lamyk. (2023). Behind the Computer Screens: The use of keystroke logging for genetic criticism applied to born-digital works of literature. [PhD Dissertation Antwerp University & University of Amsterdam]. https://pure.uva.nl/ws/files/139150661/thesis.pdf

Bekius, Lamyk. (2024). ‘Nanogenetic econarratology: where narratology meets keystroke logging data’, in Van Hulle, Dirk (ed.), Genetic Narratology: Analysing Narrative Across Versions. Cambridge: Open Book Publishers.

Workgroup on Genetic Editions. (2010). ‘An Encoding Model for Genetic Editions’. https://tei-c.org/Vault/TC/tcw19.html