Sign Language Translation: Why It Is Hard to Translate Hand Gestures to Spoken Words

Sign language translation technology is on its way.

According to the Ethnologue guide, there are 7,151 spoken languages in the world. That number may be surprising, but did you know that there are also more than 300 sign languages worldwide, with American Sign Language (ASL) among the most widely used?

As of 2021, about 430 million people worldwide live with disabling hearing loss, and that number is projected to exceed 700 million by 2050. Such figures suggest that more people may need to learn sign language to communicate with deaf people in the near future. Some may feel discouraged from learning it, however, since mastering a sign language takes time and extensive practice. This is where sign language translation comes into play. Yet despite strenuous efforts to develop sign language translation technology, the task remains hard to perfect. Read on to learn more about the obstacles facing sign language translation and how newly invented tech solutions address them.

Why are sign languages difficult to translate?

There is no universal sign language

As mentioned earlier, there are over 300 sign languages worldwide. From a linguistic perspective, sign languages are like spoken languages in that they are distinct and not mutually intelligible. To put it simply, an ASL user cannot understand British Sign Language (BSL) and vice versa, because each sign language developed within its own region and culture.

Since sign language translation remains relatively experimental, no system or device yet translates ASL to BSL or lets users translate a sign language into any foreign language. Researchers around the world, however, are developing systems that translate their own regions’ sign languages.

For instance, researchers from the Complex Software Lab (a software engineering lab at University College Dublin, Ireland) have put together a new artificial intelligence (AI)-based technology that can translate Irish Sign Language (ISL) into spoken words. According to the team behind the AI, about 70% of communication in sign language comes from facial expressions, so they also use computer vision and deep learning to capture facial expressions for more accurate translation. We will delve into the other elements of sign language later in this article.

Besides ISL translation, researchers from the Australian video consultation company Coviu have developed a web application that uses machine learning to translate the letters of the alphabet signed in Auslan (the national sign language of Australia). The team started by building their own Auslan alphabet image dataset, containing photos of different people signing each letter. Those photos served as training data, “teaching” the computer to match each sign with the corresponding letter. When signers sign in front of the computer’s webcam, the trained model translates the Auslan letters in real time.
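The training idea described above can be illustrated with a very small sketch. This is not Coviu's code: the three-number feature vectors are hypothetical stand-ins for real photos, and a simple nearest-centroid classifier replaces whatever machine learning model the team actually used. The function names `train` and `predict` are made up for illustration.

```python
from collections import defaultdict
from math import dist


def train(samples):
    """Average the feature vectors of each letter into one centroid."""
    grouped = defaultdict(list)
    for features, letter in samples:
        grouped[letter].append(features)
    return {
        letter: tuple(sum(column) / len(column) for column in zip(*vectors))
        for letter, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the letter whose centroid is closest to the new features."""
    return min(centroids, key=lambda letter: dist(centroids[letter], features))


# Hypothetical toy data: each tuple stands in for features taken from one photo.
training_photos = [
    ((0.9, 0.1, 0.1), "A"),
    ((0.8, 0.2, 0.0), "A"),
    ((0.1, 0.9, 0.2), "B"),
    ((0.2, 0.8, 0.1), "B"),
]

centroids = train(training_photos)
webcam_frame = (0.85, 0.15, 0.05)  # features from a new, unseen frame
letter = predict(centroids, webcam_frame)
```

The more photos of different people each letter has, the better its averaged centroid represents how that letter is typically signed, which is why the dataset included many signers.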

Understanding individual signs is not enough

Except for short replies or small talk (e.g., “yes,” “no,” “sure,” “how’s it,” etc.), we normally communicate in complete sentences. The same applies to sign language, so a sign language translation device or program should be able to translate not only individual signs but also complete sentences.

Back in 2016, two sophomores from the University of Washington won the Lemelson-MIT Student Prize for inventing SignAloud, a pair of sign language translation gloves. Sensors inside the gloves measure hand position and movement. The sensor data is sent via Bluetooth to a nearby computer, which compares the hand movements with the gestures it knows. If the data matches a gesture in the system, the corresponding word or phrase is read aloud through a speaker. While the idea of translating sign language via gloves is undoubtedly groundbreaking, translating only individual words or phrases is far from enough.
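The matching step described above might look roughly like the sketch below. It is an illustration only, not SignAloud's actual algorithm: the gesture templates, the five finger-bend readings per hand and the `MATCH_THRESHOLD` value are all invented for the example and are not real sign language data.

```python
from math import dist

# Hypothetical templates: one bend value per finger, thumb to pinky
# (0.0 = straight, 1.0 = fully bent). Not real sign language data.
GESTURE_TEMPLATES = {
    "hello": (0.0, 0.0, 0.0, 0.0, 0.0),
    "yes": (1.0, 1.0, 1.0, 1.0, 1.0),
    "I love you": (0.0, 0.0, 1.0, 1.0, 0.0),
}
MATCH_THRESHOLD = 0.5  # assumed cut-off; readings farther away count as no match


def match_gesture(reading):
    """Return the closest known gesture, or None if nothing is close enough."""
    best = min(GESTURE_TEMPLATES, key=lambda word: dist(GESTURE_TEMPLATES[word], reading))
    if dist(GESTURE_TEMPLATES[best], reading) <= MATCH_THRESHOLD:
        return best
    return None


recognized = match_gesture((0.1, 0.0, 0.9, 1.0, 0.1))  # slightly noisy reading
unknown = match_gesture((0.5, 0.5, 0.5, 0.5, 0.5))     # ambiguous half-bent hand
```

In a real device the returned word would be passed to a text-to-speech step, and a `None` result would simply produce no audio.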

Three years later, in 2019, researchers from Michigan State University rolled out a deep learning-based sensing device called DeepASL, which can translate complete ASL sentences without requiring users to pause after each sign. To train the algorithm, the team collected approximately 7,000 samples covering 56 common ASL words and 100 sentences. The DeepASL device is powered by Leap Motion (a small motion-tracking sensor), which maps and tracks hand movements through its cameras.
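To see why recognizing a sentence without pauses is harder than recognizing single signs, consider a recognizer that labels every frame of sensor data. A common post-processing technique in continuous recognition is to merge runs of identical labels and drop a special "blank" label that marks transition movements. The sketch below shows only that step, with made-up frame labels; it is not a description of DeepASL's internals.

```python
from itertools import groupby

BLANK = "-"  # assumed marker for frames where no sign is recognized


def collapse_frames(frame_labels):
    """Merge consecutive duplicate labels, then remove blank frames."""
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != BLANK]


# Hypothetical per-frame output for one continuously signed sentence.
frames = ["-", "I", "I", "I", "-", "WANT", "WANT", "-", "-", "DRINK", "DRINK", "DRINK"]
sentence = collapse_frames(frames)
```

Because blanks are removed only after merging, a sign that is deliberately repeated (for example `["GO", "GO", "-", "GO"]`) still comes out as two separate signs.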

Sign languages involve more than just gestures

Since we mainly see signers using their hands, we may think sign language is only about hand gestures. In truth, sign language comprises three elements: hand gestures, body movements and facial expressions. All of these help signers express meaning; raising the eyebrows, for example, can turn a phrase into a question. The importance of these non-manual cues means that researchers must consider more than hand gestures when developing their applications; otherwise, the translation can never be accurate.
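The eyebrow example can be made concrete with a toy sketch. Here the same hand signs produce a statement or a question depending on a facial cue. The `SignedPhrase` class and its `eyebrows_raised` flag are hypothetical; a real system would have to detect the cue from video and handle far more than punctuation.

```python
from dataclasses import dataclass


@dataclass
class SignedPhrase:
    glosses: list          # hand signs written as English glosses
    eyebrows_raised: bool  # hypothetical output of a face-tracking step


def to_english(phrase):
    """Join the glosses and choose punctuation from the facial cue."""
    text = " ".join(phrase.glosses).capitalize()
    return text + ("?" if phrase.eyebrows_raised else ".")


statement = to_english(SignedPhrase(["you", "hungry"], eyebrows_raised=False))
question = to_english(SignedPhrase(["you", "hungry"], eyebrows_raised=True))
```

A translator that tracked only the hands would see identical input in both cases and could not tell the two meanings apart.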

SignAll, a Hungary-based startup that strives to enable spontaneous communication between deaf and hearing people, has put together a device that translates sign language using computer vision and natural language processing. To unpack the technical terms: computer vision means enabling computers to process and analyze visual information (e.g., videos and pictures) using cameras and image-processing software. Natural language processing is the use of technology, such as AI, to understand text and spoken words as humans do.

For translation, signers wear gloves and sign in front of three cameras that capture their hand gestures, facial expressions and body movements. This data is sent to a central computer, which processes it and transcribes complete ASL sentences for hearing people. In the other direction, the computer can also transcribe speech for deaf people using natural language processing.

Speech to signs—sign language translation goes both ways

Researchers from the Complex Software Lab at University College Dublin also recognized that sign language translation should serve both deaf people and those who can hear. Their technology therefore not only transcribes sign language into spoken language but also works the other way around. When the software translates speech to ISL, signers see an avatar signing on their screen. To translate signing to speech, signers only need to sign in front of a Microsoft Kinect (a motion-sensing device) and let the system transcribe for hearing people.

While many people welcome and advocate sign language translation technology, some remain skeptical, arguing that sign language is too sophisticated to be translated accurately. We should, however, appreciate the time and effort researchers have invested in building the technology and their attempts to enable effective communication between deaf and hearing people.

Header image courtesy of Freepik

