When Old Meets New: How Do We Decode Ancient Texts with AI and Machine Learning?


Unraveling cryptic ancient languages with rising technologies.

If you have been to Egypt and visited heritage sites such as the Karnak Temple or the Tomb of Queen Nefertari, you may have caught a glimpse of Egyptian hieroglyphs. Hieroglyphs are one of the oldest and most complicated writing forms in history. Decoding hieroglyphic inscriptions is challenging because many have been damaged over time and are now fragmentary or illegible. At the same time, since many ancient people used inscriptions to document their lives, deciphering these first-hand records will help us understand the history of ancient civilizations.

Egyptian hieroglyphs are not the only ancient texts academics strive to decipher; several others remain undeciphered or, at least to most of us, unfamiliar. Modern technologies, namely artificial intelligence (AI) and machine learning (a branch of AI in which systems learn patterns from data rather than following hand-written rules), can support historians’ efforts to recover different ancient texts. This will enable us to connect with some of the greatest ancient civilizations of all time. Here are some examples:

Google’s Fabricius

Google’s Arts and Culture team never fails to take us by surprise. After dropping Art Selfie (an application that matches selfies to classic art pieces) back in 2018, the team kicked off another craze by unveiling Fabricius, an interactive Egyptian hieroglyph translation tool built on machine learning. Named after the father of epigraphy (the study of ancient inscriptions), Georg Fabricius, the tool enables users to understand Egyptian hieroglyphs.

Image courtesy of Google Arts & Culture

Unless you’re an academic specializing in epigraphy, you may not know hieroglyphs very well. Therefore, before you play with hieroglyphs, Google walks you through the writing system in six steps, covering its origin and the major steps involved in studying hieroglyphs. Everyone can be an Egyptologist in this “history” lesson, because users get to try their hand at reading and translating a set of hieroglyphic signs.

Fabricius comprises three interactive sections: learn, play and work. While the first two sections are mainly for the general public, the last section is designed for a more professional user base.

In the “play” section, users can input some very basic words, such as “hello”, “happy” and “thank you”, and let the system render them in hieroglyphs. The “play” function can only handle elementary expressions, but that is by design: the translation in this part is for fun only.

The Fabricius workbench, on the other hand, is aimed at professional use. With automated machine learning, academics can upload their images of hieroglyphs to the system, where the drawings are automatically compared to over 800 different hieroglyphs.
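To make the idea of comparing a drawing against a catalogue of known signs concrete, here is a minimal sketch. It is not how Fabricius works internally (the tool relies on a trained machine learning model); it swaps in a much simpler nearest-template comparison on tiny black-and-white grids. The sign names, the grids and the functions `jaccard` and `closest_sign` are all hypothetical and exist only for illustration.

```python
def jaccard(a, b):
    """Overlap between two equally sized binary grids (1 = ink, 0 = blank)."""
    ink_a = {(r, c) for r, row in enumerate(a) for c, v in enumerate(row) if v}
    ink_b = {(r, c) for r, row in enumerate(b) for c, v in enumerate(row) if v}
    union = ink_a | ink_b
    # Two empty grids are treated as identical.
    return len(ink_a & ink_b) / len(union) if union else 1.0


def closest_sign(drawing, templates):
    """Return the name of the template that overlaps most with the drawing."""
    return max(templates, key=lambda name: jaccard(drawing, templates[name]))


# Hypothetical 3x3 "signs"; real hieroglyph images are far larger and richer.
TEMPLATES = {
    "vertical_stroke": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "horizontal_stroke": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
    "corner": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

# A vertical stroke with one stray mark, standing in for a damaged sign.
drawing = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]

print(closest_sign(drawing, TEMPLATES))
```

A learned model replaces the fixed overlap score with features it discovers from many labeled examples, which is what lets it cope with hand-drawn variation and damage.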

Currently, the workbench is designed for desktops but not mobile devices. If Google launches a mobile version of the workbench someday, Egyptologists will be able to decode the complicated hieroglyphs conveniently with just their phones or tablets.

DeepMind’s Ithaca

Reviving ancient texts is more difficult when the text itself is incomplete, as in the case of the Rosetta Stone. When the Rosetta Stone (a stele inscribed with hieroglyphic script at the top, Demotic in the middle and ancient Greek at the bottom) was discovered in 1799, it was damaged and incomplete. Since the stone was broken at the top, only 14 lines of hieroglyphs were left, making translation challenging at the time.

The Rosetta Stone 
Image courtesy of the British Museum

More than two centuries later, we can finally see a breakthrough. Tech conglomerate Alphabet’s subsidiary DeepMind partnered with the University of Venice, the University of Oxford and the Athens University of Economics and Business to roll out an AI platform called Ithaca. It can automatically fill in the gaps in ancient Greek texts. The team fed Ithaca with over 63,000 transcribed ancient Greek inscriptions to enable the system to identify the letter and word patterns and their connections with phrases.

Restoring a piece of ancient Greek text with Ithaca
Image courtesy of DeepMind
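The idea of learning letter patterns from transcribed texts and using them to fill a gap can be illustrated with a toy example. Ithaca itself is a deep neural network trained on Greek inscriptions; the sketch below is a deliberately simple stand-in that counts which character most often appears between two neighbors. The English corpus, the `?` gap marker and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_context_counts(texts):
    """Count which character appears between each (left, right) pair of neighbors."""
    counts = defaultdict(Counter)
    for text in texts:
        padded = f" {text} "
        for left, mid, right in zip(padded, padded[1:], padded[2:]):
            counts[(left, right)][mid] += 1
    return counts


def restore_gaps(damaged, counts, gap="?"):
    """Fill each single-character gap with the most frequent character for its context."""
    padded = list(f" {damaged} ")
    for i, ch in enumerate(padded):
        if ch == gap:
            candidates = counts.get((padded[i - 1], padded[i + 1]))
            if candidates:  # leave the gap as-is if the context was never seen
                padded[i] = candidates.most_common(1)[0][0]
    return "".join(padded)[1:-1]


# Hypothetical training corpus standing in for transcribed inscriptions.
corpus = [
    "the council and the people decided",
    "the people of athens",
    "decided by the council",
]
counts = train_context_counts(corpus)

print(restore_gaps("the coun?il", counts))
```

A count table like this only looks at the two adjacent characters. A neural model can weigh much longer context, which matters when several letters in a row are missing.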

Working without such tools, historians typically restore fragmented ancient texts with about 25% accuracy. A study conducted by DeepMind indicates that Ithaca achieves an accuracy of 62% in restoring damaged ancient texts, roughly two and a half times the historians’ rate. DeepMind’s research team is currently building other versions of Ithaca that can handle more ancient texts in the future.

MIT’s AI algorithm

While machine learning appears to be an indispensable aid for historians decoding ancient texts, it’s not always applicable because of how it works. To train a machine learning model, technicians need to feed the algorithm data so that it can recognize patterns. In general, the more data the researchers provide, the more accurate and reliable the model’s results. There is no “golden rule” for how much data is required to train a decent algorithm, but the rule of thumb is that the model should see as many examples as possible before it can perform well.

However, since not all ancient texts are well preserved, the amount of data (such as drawings or copies of the texts) available to an algorithm cannot be guaranteed, making it difficult to generate accurate outputs. This is where an AI algorithm developed at MIT, a machine learning model that trains the computer to work largely on its own, comes into play.

Without being given any example translations, the algorithm can decipher a lost language largely on its own by analyzing the features of existing languages around the world. Since some words are related to each other across languages, identifying the writing patterns of known languages enables the algorithm to fill in gaps in ancient texts based on basic language structure.

Say the algorithm has learned the basic sentence structure of modern English (i.e., subject, verb and object). When it encounters the sentence “[missing part] sits on the tree”, it will fill the gap with a noun (e.g., parrot) rather than a verb, so that the sentence is logical and comprehensible.

The algorithm has been tested on two dead writing systems and languages: Linear B (a syllabic script used to write an early form of Greek) and Ugaritic (an extinct Semitic language related to Hebrew). The model was shown to translate 67.3% of cognates accurately. Although the current MIT algorithm can only process words and phrases, it opens a new door for historians to study ancient texts with far less effort than before.
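Cognate matching, that is, pairing words in a lost language with related words in a known one, can be sketched in a few lines. The article does not describe the MIT model’s internals, so the example below uses plain edit distance as a stand-in rather than the researchers’ method. The word lists are illustrative toy transliterations, not attested data, and `edit_distance` and `match_cognates` are hypothetical helper names.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def match_cognates(lost_words, known_words):
    """Pair each lost-language word with the closest known-language word."""
    return {
        word: min(known_words, key=lambda k: edit_distance(word, k))
        for word in lost_words
    }


# Toy transliterations, invented for illustration only.
lost = ["mlk", "klb", "ym"]
known = ["malk", "kalb", "yam", "bayt"]

for lost_word, guess in match_cognates(lost, known).items():
    print(lost_word, "->", guess)
```

Matching each word independently, as done here, lets two lost words claim the same known word. A real decipherment system has to resolve such conflicts across the whole vocabulary and learn which sound changes are plausible.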

Beyond transcribing and filling in gaps within ancient texts, AI tools like Ithaca can even date inscriptions to within 30 years of their accepted dates and identify their provenance with 71% accuracy. Since provenance and language are closely associated, historians can take the different types of data obtained with AI into account and decode ancient texts more accurately. In a broader sense, it is no exaggeration to say that AI has bridged modern and ancient civilizations. What’s next?


Header image courtesy of Unsplash

