Artificial Intelligence and Bible Translation
Today the Bible is being translated into thousands of different languages. Some of these, like Chinese or Arabic, are spoken by hundreds of millions of people or more. Others are local dialects that are known primarily in the small communities that use them. Most of this translation work is done using software designed specifically for Bible translation. Some of this software can process human language much as traditional software handles data, using a variety of techniques collectively termed Natural Language Processing (NLP).
NLP has often been done using statistical models that can run on typical computers. More advanced software, such as Google Search, Google Translate, and ChatGPT, uses complex language tools and algorithms called neural models to process and generate human language. These generally require high-end servers, and analyze vast amounts of text data to create powerful Large Language Models (LLMs) using methods that are often called “artificial intelligence” or AI. (Artificial intelligence is often loosely defined and implies human-like understanding. This article uses the terms NLP and LLM, which are more specific and precise.) LLMs have dramatically improved the power of NLP, promising new ways to make the translation process more efficient, ensure the quality of Bible translations, and help researchers create high-quality resources for Bible translators.
NLP tools can associate the words in a translation with the corresponding words in the Bible’s original Greek or Hebrew, a process known as “alignment.” These alignments can be used to:
■ Perform many kinds of checks for translation consistency
■ Associate dictionary entries, images, articles, and maps with the original text
■ Create draft translations that can be used as a starting point for a translation
■ Provide rich linguistic analysis of a translation and the choices made in the translation process
■ Identify unclear, unnatural, or inaccurate passages within a translation
■ Produce exegetical resources to support translation work
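To make the idea of alignment concrete, here is a minimal sketch in Python. It is not how Paratext or any other tool named in this article works; it is a deliberately simple statistical approach (a Dice co-occurrence score), and the three phrase pairs are toy data invented for illustration. The function names `train_dice` and `align` are likewise hypothetical.

```python
from collections import Counter
from itertools import product

# Toy parallel data (invented for illustration): transliterated Greek words
# paired with a simplified English rendering of the same phrase.
pairs = [
    (["theos", "agape", "estin"], ["god", "is", "love"]),
    (["theos", "phos", "estin"], ["god", "is", "light"]),
    (["logos", "aletheia", "estin"], ["word", "is", "truth"]),
]

def train_dice(pairs):
    """Return a scoring function based on how often words co-occur."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_set, t_set = set(src), set(tgt)
        src_count.update(s_set)               # phrases containing each source word
        tgt_count.update(t_set)               # phrases containing each translated word
        joint.update(product(s_set, t_set))   # phrases containing both words

    def score(s, t):
        # Dice coefficient: 1.0 when the two words always appear together.
        denom = src_count[s] + tgt_count[t]
        return 2 * joint[(s, t)] / denom if denom else 0.0

    return score

def align(src, tgt, score):
    """Link each translated word to its best-scoring source word."""
    return [(t, max(src, key=lambda s: score(s, t))) for t in tgt]

score = train_dice(pairs)
print(align(pairs[0][0], pairs[0][1], score))
# prints [('god', 'theos'), ('is', 'estin'), ('love', 'agape')]
```

With so little text, words that only ever appear together cannot be told apart: in the third toy phrase, "word" scores equally well against "logos" and "aletheia." Real alignment tools rely on far more text and more sophisticated models, and their output still benefits from human review.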
Using NLP for some of these purposes is not new. For example, Paratext, the most widely used Bible translation tool, began using NLP in 2007. Within Paratext, the “Biblical Terms” tool uses NLP to help translation teams ensure that key biblical terms like “salvation,” “sanctification,” and “redemption” are translated accurately and consistently throughout the Bible. Paratext also uses the alignment process to associate external resources with the original biblical text, and has an “Interlinearizer” that can provide glosses for a translation.
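A consistency check of this kind can be sketched in a few lines of Python once alignments exist. The sketch below is an assumption about how such a check might look, not Paratext's actual implementation; the function name `rendering_report`, the verse labels, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict

def rendering_report(alignments, key_terms):
    """List key terms that are rendered in more than one way.

    alignments: iterable of (verse, source_term, rendering) tuples.
    key_terms:  set of source terms the team wants to track.
    """
    by_term = defaultdict(Counter)
    for verse, term, rendering in alignments:
        if term in key_terms:
            by_term[term][rendering.lower()] += 1
    # Keep only the terms with two or more distinct renderings.
    return {term: dict(counts) for term, counts in by_term.items() if len(counts) > 1}

# Invented sample data: verse labels are placeholders, not real references.
sample = [
    ("verse-1", "soteria", "salvation"),
    ("verse-2", "soteria", "Salvation"),
    ("verse-3", "soteria", "deliverance"),
    ("verse-4", "apolytrosis", "redemption"),
    ("verse-5", "apolytrosis", "redemption"),
]

print(rendering_report(sample, {"soteria", "apolytrosis"}))
# prints {'soteria': {'salvation': 2, 'deliverance': 1}}
```

A flagged term is not necessarily an error, since context can justify different renderings. The report only tells the translation team where to look.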
Researchers are now exploring new ways to leverage modern NLP. For instance, several translation organizations are exploring first drafts generated entirely with NLP. SIL International’s AQuA project (www.ai.sil.org/Projects/AQuA) trains NLP models using the judgments of expert translation consultants to identify passages that are unclear, inaccurate, or unnatural. AQuA is not always reliable; sometimes the passages it flags are fine. But if AQuA flags a passage as problematic, then the odds are good that the passage has problems that should be discussed. Other groups are exploring a variety of ways to do linguistic analysis of translations using NLP to support Bible translation and create exegetical resources.
NLP software may improve quality and efficiency, but Bible translation is best done by human beings, using software as a tool. Producing a first draft is only about 10 percent of the work involved in creating a translation, whether that draft is produced by humans or NLP. Even if software could create a perfect translation—and it cannot—the process of translating the Bible into a local language creates a sense of ownership and community, bringing together the translation team and the group they serve, while also preparing that community to study and teach using the translation that is produced. If the group for whom a translation is intended does not feel connected to it, they may never use it.
Software and resources are best used in ways that help translation teams work together. Such teams are usually diverse, composed of people who bring different cultures, skills, knowledge, and experience to the task. A team may include native speakers of the target language who lack training in the biblical languages, alongside experienced scholars and translators who have no background in the target language.
Modern LLMs like ChatGPT are powerful NLP systems that can write essays or web page content, but they are not optimized for Bible translation. For one thing, they are not trained on trusted translation resources. But developers can create high-quality output by providing good linguistic data, commentaries, and other reliable reference works as input, configuring software so that it does not use less acceptable sources. Additionally, although such models do not have good support for most of the world’s languages, an existing LLM can be “fine-tuned” for a new language, providing useful results with only a small amount of text from that language.
The translation world also faces the same challenges that other developers face in preparing LLMs for use in the NLP context. LLMs can sometimes make things up, provide wildly inconsistent answers, or provide answers in forms that are hard to use. Together with the rest of the software world, we are learning how to overcome these challenges in Bible translation.
Above all, while NLP can rapidly create results that are often good, these results still need to be verified and edited by human beings before they can be considered trustworthy. If this is kept in mind, NLP can be used in a wide variety of ways to improve the quality of Bible translation and make it more efficient. This article only scratches the surface; the entire field of NLP is progressing very rapidly, and so are the applications to Bible translation. The possibilities are vast, and we are only beginning to explore them.