When Jae Hyun Lee—a media scholar at Seoul National University whose work on artificial intelligence and algorithmic society explores the philosophical and cultural implications of digital technologies—published “Machine Translation and Benjamin: Pure Language” in 2019, transformer-based language models were still emerging technologies.1 Six years later, after ChatGPT became a household name and countless think-pieces about AI ensued, Lee’s article reads as a kind of road map. It reminds us that to understand today’s large language models (LLMs) we must first understand machine translation, because the very architecture that powers models such as ChatGPT emerged from neural machine translation research. Reading Lee’s article through this lens can clarify both the technological development of LLMs and the centuries-old cultural ambitions pertaining to universal language.
Lee’s article centres on Google’s Neural Machine Translation system (GNMT), introduced in 2016, but to understand his analysis fully we should briefly examine what came before and after. The immediate precedent for GNMT emerged two years earlier, when Google researchers unveiled a sequence-to-sequence (seq2seq) system: two linked neural networks, one that encodes a source sentence into a high-dimensional vector—a mathematical representation of semantics in coordinate space—and another that decodes that vector into a sentence in the target language.2 This implied that meaning, as it is conveyed through orthographic writing, could be stripped of its language-specific form and stored in an abstract coordinate space—a development that seemed to echo the goal long pursued by philosophers and logicians who dreamt of a universal representation of language.
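To make the encoder-decoder idea concrete, here is a minimal sketch in PyTorch; the toy vocabulary sizes, dimensions, and random inputs are illustrative assumptions rather than details of Google’s system. It shows how one network compresses a source sentence into a single vector and a second network generates target-language tokens from that vector.

```python
# A minimal sequence-to-sequence sketch (illustrative only, not Google's system):
# the encoder compresses a source sentence into one fixed vector, and the decoder
# generates target-language token scores conditioned on that vector.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src_ids):
        # src_ids: (batch, src_len) integer token ids
        _, hidden = self.rnn(self.embed(src_ids))
        return hidden  # (1, batch, hidden_dim): the "sentence vector"

class Decoder(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt_ids, hidden):
        # tgt_ids: (batch, tgt_len); hidden: the encoder's final state
        output, _ = self.rnn(self.embed(tgt_ids), hidden)
        return self.out(output)  # (batch, tgt_len, vocab_size) token scores

# Toy usage: "translate" a batch containing one 5-token source sentence.
encoder, decoder = Encoder(), Decoder()
src = torch.randint(0, 1000, (1, 5))
tgt = torch.randint(0, 1000, (1, 7))
sentence_vector = encoder(src)          # language-agnostic representation
logits = decoder(tgt, sentence_vector)  # scores over the target vocabulary
print(sentence_vector.shape, logits.shape)
```

The fixed sentence vector returned by the encoder is the crudest version of the coordinate-space representation of meaning described above: everything the decoder produces must pass through it.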
GNMT carried the idea further. Instead of keeping separate models for every language pair, the team built a single shared model whose encoder could handle multiple languages simultaneously. The system operates by prepending a simple language code, such as <2ko> for Korean or <2ja> for Japanese, to the source sentence to signal which target language the decoder should generate.3 As English, Japanese, and Korean sentences all pass through the same learned vector space, GNMT achieved the first credible zero-shot translation, rendering Japanese directly into Korean despite never being trained on that specific language pair.
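At the level of training data, the mechanism is surprisingly plain. The sketch below, using a hypothetical helper and a toy corpus (both assumptions for illustration, not Google’s actual data), shows the convention of prepending a target-language token, and what a zero-shot request looks like: a language pair the model was never shown directly.

```python
# A sketch of the multilingual data convention described above: a target-language
# token is prepended to the source text, so one shared model learns which language
# to produce. The helper name and the toy corpus are illustrative assumptions.
def add_target_tag(source_text: str, target_lang: str) -> str:
    """Prepend a <2xx> token telling the shared model which language to emit."""
    return f"<2{target_lang}> {source_text}"

# Pairs of this kind, all pivoting through English, are what the model is trained on ...
training_pairs = [
    (add_target_tag("Hello", "ja"), "こんにちは"),
    (add_target_tag("こんにちは", "en"), "Hello"),
    (add_target_tag("Hello", "ko"), "안녕하세요"),
    (add_target_tag("안녕하세요", "en"), "Hello"),
]

# ... while a zero-shot request asks for a pairing never seen in training:
# Japanese source, Korean target, relying only on the shared vector space.
zero_shot_input = add_target_tag("こんにちは", "ko")
print(zero_shot_input)  # "<2ko> こんにちは"
```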
For Lee, this shared space resembles Walter Benjamin’s reine Sprache—an immanent pure language revealed through the act of translation. Benjamin’s concept is reminiscent of, yet fundamentally different from, earlier conceptions of universal language from Descartes onwards, which sought to impose a constructed philosophical language rather than to reveal the linguistic universality immanent within the relationality of all languages. For instance, the seventeenth-century Scottish scholar George Dalgarno proposed a compositional approach that aimed to identify the fundamental semantic components of any concept or object and to represent them through combinations of signs corresponding to those components. The intermediary language NUDE, developed by the botanist and early machine translation researcher R. H. Richens in 1956, operated on similar principles, using semantic primitives—irreducible units of meaning—represented by letters.4 This allowed computers to paraphrase sentences by breaking them down into their basic semantic elements. More recent projects, such as the Universal Networking Language launched in 1996, have continued this quest for a ‘perfect language’, as Umberto Eco would call it.5 Such histories of universal language provide important context for Lee’s question of whether Google had “at last created Descartes’ universal language through technology”. All these projects have sought to hand-code an objective representation of the linguistic patterns that constitute meaning. What GNMT demonstrated through its zero-shot translation capability was the automation of this process: instead of a hand-coded interlingua, the system learned to extract one from swathes of multilingual data. This emergent interlingua aligns more closely with Benjamin’s vision of pure language as something discovered through the act of translation rather than imposed through philosophical analysis.
Introduced in 2017, the transformer architecture retained the fundamental encoder-decoder structure that powered both the seq2seq model and GNMT, while replacing sequential, recurrent processing with parallel self-attention mechanisms for greater computational efficiency. This design subsequently forked into specialised variants: encoder-only models like BERT for reading and classification, and decoder-only models like the GPT series for text generation, while translation systems retained the full encoder-decoder structure. Retrieval-augmented and multimodal models operate on similar principles: they encode documents, images, or web snippets into vector representations before generating text from these encoded forms. Despite these architectural variations, the core insight that emerged from GNMT—that meaning can be captured in learned vector spaces that exist between rather than within languages—continues to underpin how contemporary AI systems represent and manipulate semantic content.
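For readers curious what the parallel self-attention mechanism replaces the recurrence with, the following NumPy sketch, using random toy weights as a stand-in for learned parameters, computes scaled dot-product self-attention: every token is compared with every other token at once, and each output vector is a weighted mixture of the whole sentence.

```python
# A compact sketch of scaled dot-product self-attention, the core operation of the
# transformer. Dimensions and random weights are toy assumptions for illustration.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model) token embeddings; returns contextualised vectors."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v      # queries, keys, values for every token
    scores = q @ k.T / np.sqrt(k.shape[-1])  # pairwise similarity between tokens
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
    return weights @ v                        # each output mixes all positions at once

rng = np.random.default_rng(0)
d_model = 8
tokens = rng.normal(size=(5, d_model))        # a toy 5-token "sentence"
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(tokens, w_q, w_k, w_v).shape)  # (5, 8)
```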
Lee’s essay reminds us that this shared semantic space, far from being neutral, inherits the cultural ambitions and epistemological assumptions of earlier universal language projects. In this light, Benjamin’s insight becomes newly relevant. Perhaps the question that arises from this connection between machine translation and Benjamin is not whether machines can access pure language, but whether something akin to pure language was always already machinic—a relational space between languages rather than a perfect language itself. What GNMT and its successors may have revealed is not so much artificial intelligence as the fundamentally relational aspect of meaning that Benjamin glimpsed in his theory of translation.
References
Eco, Umberto. The Search for the Perfect Language. 1st ed. Oxford: Blackwell, 1995.
Johnson, Melvin, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, et al. “Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation.” November 14, 2016. arXiv:1611.04558 [cs.CL].
Lee, Jae Hyun. “Gigyebeonyeokgwa benyamin: sunsu eoneo” [Machine translation and Benjamin: pure language]. In Ingong jineung gisul bipyeong [Technology criticism of artificial intelligence], 39–57. Seoul: Communication Books, 2019.
Léon, Jacqueline. “From Universal Languages to Intermediary Languages in Machine Translation: The Work of the Cambridge Language Research Unit (1955–1970).” In History of Linguistics 2002: Selected Papers from the Ninth International Conference on the History of the Language Sciences, 27-30 August 2002, São Paulo – Campinas, edited by Eduardo Guimarães and Diana Luz Pessoa de Barros, 123–132. Amsterdam: John Benjamins Publishing Company, 2007.
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. “Sequence to Sequence Learning with Neural Networks.” In Advances in Neural Information Processing Systems 27 (NIPS 2014), 3104–3112. 2014.
Notes
1. Jae Hyun Lee, “Gigyebeonyeokgwa benyamin: sunsu eoneo” (Machine translation and Benjamin: pure language), in Ingong jineung gisul bipyeong (Technology criticism of artificial intelligence) (Seoul: Communication Books, 2019), 39–57. This article was originally published as a chapter in his 2019 book, which unpacked different aspects of AI, each through the lens of a specific thinker.
2. Ilya Sutskever, Oriol Vinyals, and Quoc V. Le, “Sequence to Sequence Learning with Neural Networks,” in Advances in Neural Information Processing Systems 27 (NIPS 2014), 3104–3112.
3. Melvin Johnson et al., “Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation” (November 14, 2016), arXiv:1611.04558.
4. Jacqueline Léon, “From Universal Languages to Intermediary Languages in Machine Translation: The Work of the Cambridge Language Research Unit (1955–1970),” in History of Linguistics 2002: Selected Papers from the Ninth International Conference on the History of the Language Sciences, 27–30 August 2002, São Paulo – Campinas, ed. Eduardo Guimarães and Diana Luz Pessoa de Barros (Amsterdam: John Benjamins Publishing Company, 2007), 125.
5. Umberto Eco, The Search for the Perfect Language, 1st ed. (Oxford: Blackwell, 1995).