Literary Machine Translation Project

Project Title

Literary Machine Translation to Produce Translations that Reflect Translators’ Style and Generate Retranslations

Funding Agency

The Scientific and Technological Research Council of Turkey (TÜBİTAK)

Project Team

Principal Investigator: Mehmet Şahin

Researchers: Ena Hodzik, Tunga Güngör

Post-doctoral researcher: Sabri Gürses

Graduate students: Zeynep Yirmibeşoğlu, Harun Dallı, Olgun Dursun

Undergraduate students: Duru Eroğlu, Mert Alpsoy, Ayşegül Hatayoğlu, Ezgi Gülek

Project Duration

15 November 2021 – 15 July 2024

Summary

Machine translation (MT) has become a valuable tool for translating a wide variety of text types. Researchers in countries that produce key research on the use of MT have recently started to test its efficiency in the translation of creative texts. Researchers in Turkey, however, have so far paid little attention to the use of corpus tools for analyzing the style of literary translations or to the use of MT for translating literary works into Turkish.

This project aims to use MT to translate literary texts, whose style has been defined using corpus tools, into Turkish. By extension, we aim to determine the extent to which an MT system can reproduce a translator’s style if it is trained on a corpus of previous translations by the same translator. We also intend to build MT systems that will generate a distinctive Turkish version of a source text based on a corpus of previous translations by different translators. We will adopt an interdisciplinary approach that integrates translation studies, computer engineering, and corpus linguistics. Our research questions are as follows:

  1. What are the distinctive measurable characteristics of a translator’s style?
  2. To what extent can a customized MT system reflect a translator’s style when the system is trained on previous translations by the same translator?
  3. To what extent can a customized MT system trained on existing translations of a given text generate a distinctive retranslation?

We will first identify the major characteristics of Turkish translators’ style by analyzing a corpus of their translations. We will select renowned translators who are no longer living and focus on source texts in English, Russian, German, or French. Our model will be adapted from Baker’s (2000) corpus analysis of translator style and Leech and Short’s (1981) literary style analysis.
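As a rough illustration of what measurable characteristics of style can look like, the sketch below computes two indicators widely used in corpus stylistics: type-token ratio and mean sentence length. It is a minimal Python sketch and not the project's analysis model; the function names `turkish_lower` and `style_profile`, the regex-based tokenization, and the choice of indicators are assumptions made for illustration.

```python
import re


def turkish_lower(text: str) -> str:
    """Lowercase text using Turkish casing rules for dotted and dotless I."""
    # Python's default lower() maps "I" to "i" and "İ" to "i" plus a combining
    # dot, which is wrong for Turkish, so both letters are handled explicitly.
    return text.replace("İ", "i").replace("I", "ı").lower()


def style_profile(text: str) -> dict:
    """Compute a few simple style indicators for one translated text."""
    # Crude sentence split on sentence-final punctuation; a real study would
    # use a proper sentence splitter (assumption: this is good enough here).
    sentences = [s for s in re.split(r"[.!?…]+", text) if s.strip()]
    # \w+ matches Unicode letters in Python 3, so Turkish characters are kept.
    tokens = re.findall(r"\w+", turkish_lower(text))
    types = set(tokens)
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": len(types) / len(tokens) if tokens else 0.0,
        "mean_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
    }
```

Calling `style_profile` on each translation by the same translator and comparing the resulting values across translators is one simple way to look for consistent stylistic tendencies. Note that the raw type-token ratio depends on text length, so texts of very different sizes are not directly comparable without some form of standardization.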

Second, we will create a customized MT engine, use it to produce new translations, and then analyze them for evidence of the translators’ style. We will use both statistical machine translation (SMT) and neural machine translation (NMT) models and evaluate the differences between them. For SMT, we will use the standard model, which comprises a translation model, a language model, and a reordering model, and analyze the effect of adding more components. For NMT, we will use two state-of-the-art models: the sequence-to-sequence model and the transformer model.
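In phrase-based SMT, component models such as these are typically combined in a log-linear fashion: each candidate translation receives a weighted sum of the log scores of its components, and the highest-scoring candidate is selected. The Python sketch below shows that combination only. The function names, the feature names (`translation`, `language`, `reordering`), and the dictionary layout are hypothetical, and nothing here reflects the project's actual engine or weights.

```python
import math


def loglinear_score(features: dict, weights: dict) -> float:
    """Weighted sum of log feature values for one candidate translation.

    `features` maps a component name to a positive probability-like score;
    `weights` maps the same names to their (tunable) weights.
    """
    return sum(weights[name] * math.log(value) for name, value in features.items())


def best_candidate(candidates: list, weights: dict) -> dict:
    """Return the candidate with the highest log-linear score."""
    return max(candidates, key=lambda c: loglinear_score(c["features"], weights))
```

Adding a further component, as the project plans to investigate, amounts in this sketch to adding one more key to `features` and `weights`; changing a weight shifts how much that component influences which candidate wins.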

The final stage of the project will focus on creating retranslations with our customized MT engines. We will investigate how MT systems trained on a corpus of existing translations of a source text can be used to generate a retranslation of the same text. We will also investigate the problems that the use of plagiarized versions in training MT systems would entail. To do this, we will use the analysis model proposed by Şahin et al. (2018), where similarities between retranslations are analyzed both quantitatively and qualitatively.
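To give a sense of what a quantitative comparison of retranslations might involve, the sketch below measures how many word n-grams two translations share. This is a generic overlap measure (Jaccard similarity over word trigrams) chosen for illustration; it is not the analysis model of Şahin et al. (2018), and the function names and the default `n=3` are assumptions.

```python
def word_ngrams(text: str, n: int = 3) -> set:
    """Return the set of word n-grams in a text (whitespace tokenization)."""
    # Case normalization and punctuation handling are deliberately left out;
    # a real comparison would need language-aware preprocessing.
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard_similarity(text_a: str, text_b: str, n: int = 3) -> float:
    """Share of n-grams common to both texts, from 0.0 (none) to 1.0 (all)."""
    grams_a = word_ngrams(text_a, n)
    grams_b = word_ngrams(text_b, n)
    if not grams_a and not grams_b:
        return 0.0  # both texts are shorter than n words
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

An unusually high overlap between two supposedly independent translations of the same source text would be a signal worth examining qualitatively, since a score alone cannot distinguish plagiarism from legitimate convergence on the same wording.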

The findings of the project will provide a better understanding of the benefits and drawbacks of MT in the context of literary translation. The project may outline a new translation process with implications for both the publishing sector and translation education. Our project will contribute to efforts to preserve Turkey’s national and cultural heritage by creating a method for reproducing the authorial style of prominent deceased translators. The future of translation will surely be shaped by developments in MT and artificial intelligence. The project will shed light on the trajectory that the translation profession will follow and help spark a discussion of critical issues like ethics, copyright, remuneration, and the sustainability of the translation profession.