When Old Meets New: How Do We Decode Ancient Texts with AI and Machine Learning?

Unraveling cryptic ancient languages with rising technologies.

If you have been to Egypt and visited heritage sites such as the Karnak Temple or the Tomb of Queen Nefertari, you may have caught a glimpse of Egyptian hieroglyphs. Hieroglyphs are one of the oldest and most complicated writing systems in history. Decoding hieroglyphic inscriptions is challenging because many have been damaged over the millennia and are now fragmentary or illegible. Yet because many ancient peoples used inscriptions to document their lives, deciphering these first-hand records helps us understand the history of ancient civilizations.

Egyptian hieroglyphs are not the only ancient texts academics strive to decipher; several others remain undeciphered or, at least to most of us, unfamiliar. Modern technologies, namely artificial intelligence (AI) and machine learning (a branch of AI in which algorithms learn patterns from data rather than following hand-written rules), can support historians’ efforts to restore and read ancient texts. This will enable us to connect with some of the greatest ancient civilizations of all time. Here are some examples:

Google’s Fabricius

Google’s Arts and Culture team never fails to take us by surprise. After dropping Art Selfie (an application that matches selfies to classic art pieces) back in 2018, the team kicked off another craze by unveiling Fabricius, an interactive Egyptian hieroglyph translation tool built on machine learning. Named after the father of epigraphy (the study of ancient inscriptions), Georg Fabricius, the tool enables users to understand Egyptian hieroglyphs.

Image courtesy of Google Arts & Culture

Unless you’re an academic specializing in epigraphy, you may not know hieroglyphs very well. Therefore, before you play with hieroglyphs, Google walks you through the writing system in six steps, covering its origins and the main stages of studying hieroglyphs. Everyone can be an Egyptologist in this “history” lesson because users get a chance to try reading and translating a set of hieroglyphic signs themselves.

Fabricius comprises three interactive sections: learn, play and work. While the first two sections are mainly for the general public, the last section is designed for a more professional user base.

In the play section, users can type in very basic words, such as “hello”, “happy” and “thank you”, and let the system render them in hieroglyphs. The “play” function can only handle elementary expressions, but that is by design: the translation in this part is just for fun.

The Fabricius workbench, on the other hand, is aimed at professional use. Academics can upload their images of hieroglyphs to the system, which uses automated machine learning to compare the drawings against more than 800 different hieroglyphs.
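
To give a feel for what comparing a drawing against a set of known signs involves, here is a minimal Python sketch. It is only a toy: the three 5x5 templates and the pixel-overlap score are assumptions made up for illustration, whereas Fabricius relies on trained machine learning models and a far larger sign list.

```python
# Toy illustration only: Fabricius uses trained machine learning models,
# not this simple pixel-overlap comparison.

# Hypothetical 5x5 binary templates standing in for real hieroglyph images.
TEMPLATES = {
    "vertical_stroke": [
        "..#..",
        "..#..",
        "..#..",
        "..#..",
        "..#..",
    ],
    "horizontal_stroke": [
        ".....",
        ".....",
        "#####",
        ".....",
        ".....",
    ],
    "circle": [
        ".###.",
        "#...#",
        "#...#",
        "#...#",
        ".###.",
    ],
}


def inked_cells(grid):
    """Return the set of (row, col) positions that contain ink ('#')."""
    return {
        (r, c)
        for r, row in enumerate(grid)
        for c, cell in enumerate(row)
        if cell == "#"
    }


def similarity(grid_a, grid_b):
    """Jaccard overlap of inked cells: 1.0 is identical, 0.0 is disjoint."""
    a, b = inked_cells(grid_a), inked_cells(grid_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_matches(drawing, templates=TEMPLATES):
    """Rank template names from most to least similar to the drawing."""
    scores = {name: similarity(drawing, grid) for name, grid in templates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# A slightly damaged circle: one cell of the outline is missing.
damaged = [
    ".###.",
    "#...#",
    "#....",
    "#...#",
    ".###.",
]

best_name, best_score = rank_matches(damaged)[0]
print(best_name, round(best_score, 2))
```

Even with a piece missing, the damaged drawing still overlaps most with the circle template, which is the intuition behind matching imperfect copies of an inscription to a catalogue of signs.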

Currently, the workbench is designed for desktops but not mobile devices. If Google launches a mobile version of the workbench someday, Egyptologists will be able to decode the complicated hieroglyphs conveniently with just their phones or tablets.

DeepMind’s Ithaca

Reviving ancient texts is more difficult when the text itself is incomplete, as in the case of the Rosetta Stone. When the Rosetta Stone (a stele inscribed with hieroglyphic script at the top, Demotic in the middle and ancient Greek at the bottom) was discovered in 1799, it was damaged and incomplete. Because the top of the stone had broken off, only 14 lines of hieroglyphs survived, which made translation challenging at the time.

The Rosetta Stone 
Image courtesy of the British Museum

More than two centuries later, we can finally see a breakthrough. Tech conglomerate Alphabet’s subsidiary DeepMind partnered with the University of Venice, the University of Oxford and the Athens University of Economics and Business to roll out an AI platform called Ithaca. It can automatically fill in the gaps in ancient Greek texts. The team fed Ithaca with over 63,000 transcribed ancient Greek inscriptions to enable the system to identify the letter and word patterns and their connections with phrases.

Restoring a piece of ancient Greek text with Ithaca
Image courtesy of DeepMind

In a study conducted by DeepMind, historians working on their own restored fragmented ancient texts with 25% accuracy. Ithaca on its own achieved 62% accuracy on the same task, roughly two and a half times the historians’ figure. DeepMind’s research team is currently building other versions of Ithaca that can handle other ancient languages in the future.
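
The idea of learning letter patterns from transcribed texts and then using them to fill gaps can be sketched in a few lines of Python. This is not how Ithaca itself works (it is a far more sophisticated neural network); the sketch below swaps in a simple character-count model, and the three-line English corpus, the `?` gap marker and the function names are all invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny invented stand-in corpus (English placeholders, not real inscriptions).
CORPUS = [
    "the council and the people decided",
    "the people honoured the council",
    "the council decided to honour the people",
]


def train(corpus, context=3):
    """Count which character follows each `context`-character prefix."""
    model = defaultdict(Counter)
    for line in corpus:
        padded = " " * context + line
        for i in range(context, len(padded)):
            model[padded[i - context:i]][padded[i]] += 1
    return model


def restore(text, model, context=3, gap="?"):
    """Fill each gap marker, left to right, with the most frequent next character."""
    chars = list(" " * context + text)
    for i in range(context, len(chars)):
        if chars[i] == gap:
            prefix = "".join(chars[i - context:i])
            candidates = model.get(prefix)
            if candidates:  # leave the gap untouched if the prefix was never seen
                chars[i] = candidates.most_common(1)[0][0]
    return "".join(chars[context:])


model = train(CORPUS)
print(restore("the cou?cil and the peo?le", model))
```

The more transcribed text such a model sees, the more prefixes it can recognize, which is one reason a large collection of inscriptions matters for this kind of restoration.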

MIT’s AI algorithm

While machine learning is a powerful aid for historians decoding ancient texts, it is not always applicable because of how it works. To train a model, researchers need to feed the algorithm data so that it can learn to recognize patterns. Generally, the more relevant data the algorithm receives, the more accurate and reliable its results. There is no “golden rule” for how much data is required to train a decent model, but the rule of thumb is that a model needs to see as many examples as possible before it performs well.

However, not all ancient texts are well preserved, so the amount of data (such as drawings or copies of the texts) available to an algorithm cannot be guaranteed, which makes it difficult to generate accurate outputs. This is where an AI algorithm developed by researchers at MIT comes into play.

Rather than requiring large amounts of training data in the lost language, the algorithm can work out much of a lost language by analyzing the features of known languages. Since some words are related to each other across languages, identifying the writing patterns of known languages enables the algorithm to fill in gaps in ancient texts based on basic language structure.

For example, suppose the algorithm has learned the basic sentence structure of modern English (subject, verb and object). When it encounters the sentence “[missing part] sits on the tree”, it will fill the gap with a noun (e.g., parrot) rather than a verb, making the sentence logical and comprehensible.
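
The missing-subject example can be written out as a toy Python sketch. The mini-lexicon, the `[missing]` marker and the single word-order rule below are assumptions made for illustration only; they are not taken from the MIT system, which learns such regularities rather than having them typed in.

```python
GAP = "[missing]"

# Hypothetical mini-lexicon: each word is tagged with one part of speech.
LEXICON = {
    "parrot": "NOUN",
    "monkey": "NOUN",
    "sits": "VERB",
    "climbs": "VERB",
    "quickly": "ADVERB",
}


def candidates_for_gap(tokens, lexicon=LEXICON):
    """Suggest words for the gap, assuming subject-verb-object word order."""
    gap_index = tokens.index(GAP)
    next_word = tokens[gap_index + 1] if gap_index + 1 < len(tokens) else None
    prev_word = tokens[gap_index - 1] if gap_index > 0 else None

    if lexicon.get(next_word) == "VERB":
        wanted = "NOUN"  # the slot right before a verb is the subject
    elif lexicon.get(prev_word) == "NOUN":
        wanted = "VERB"  # the slot right after the subject is the verb
    else:
        return []  # this toy knows no rule for any other position

    return sorted(word for word, tag in lexicon.items() if tag == wanted)


sentence = [GAP, "sits", "on", "the", "tree"]
print(candidates_for_gap(sentence))
```

Knowing only the sentence structure narrows the gap down to nouns; choosing between "parrot" and "monkey" would need further evidence, such as how often each word appears near "tree".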

The algorithm has been tested on Linear B (a syllabic script used to write an early form of Greek) and on Ugaritic (an extinct Semitic language related to Hebrew). On Linear B, the model correctly translated 67.3% of cognates into their Greek equivalents. Although the current MIT algorithm can only process words and phrases, it opens a new door for historians to study ancient texts with far less effort than before.
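
Matching cognates, that is, related words across languages, can be illustrated with a much simpler technique than the one the MIT researchers used. The Python sketch below pairs each word with its nearest neighbour by plain edit distance, using modern German and English as stand-ins for a lost language and a known relative. The word lists and function names are chosen purely for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def closest_cognate(word, known_words):
    """Pick the known word with the smallest edit distance to `word`."""
    return min(known_words, key=lambda known: edit_distance(word, known))


# German stands in for the "lost" language, English for the known relative.
unknown_words = ["vater", "mutter", "nacht", "haus", "milch"]
known_words = ["father", "mother", "night", "house", "milk"]

matches = {word: closest_cognate(word, known_words) for word in unknown_words}
print(matches)
```

Real decipherment is much harder: the lost script usually uses different symbols from the known language, and lookalike words can mislead a purely spelling-based match, which is why the actual research relies on learned models rather than a fixed distance measure.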

Beyond restoring gaps in ancient texts, AI tools like Ithaca can also date inscriptions to within 30 years of the dates proposed by historians and identify their place of origin with 71% accuracy. Since provenance and language are closely associated, historians can combine the different types of data obtained with AI to decode ancient texts more accurately. In a broader sense, it is no exaggeration to say that AI has bridged modern and ancient civilizations. What’s next?

Header image courtesy of Unsplash

