Do you want to talk to C-3PO from Star Wars? Don’t worry; the day is probably just around the corner.
Human language is complicated, especially for computers. While technology has enabled advanced computer systems, such as artificial intelligence (AI), to tackle other complex human problems (mathematics, for example), we may never fully crack human language for our computer friends. Machine translation is considered an AI-complete problem (solving such a problem requires the AI to be as intelligent, or as good at the task, as a human). While some AIs can hold basic conversations with humans using technologies such as natural language processing (NLP), they break down in ambiguous situations, such as when context is missing or pronouns are vague. But why can’t a computer understand human language completely?
There are many reasons for this. One of them is that most languages, particularly when spoken, do not always follow their prescriptive rules (the more formal grammatical rules you learned in English class). As a result, it may be hard to set explicit rules for AIs to follow. Take the usage of “whom” in formal and informal English: “whom” is expected in a formal context, but it is acceptable to use “who” in its place in an informal, spoken context. While this may seem pretty straightforward, the question remains: How can a computer distinguish a formal context from an informal one?
There is often a blurry line between what is right and what is wrong in a language. Besides formality, there are other problems that may cause a computer to misunderstand you. What are they?
Ambiguities in language
While you may not realize it, many of our conversations are somewhat ambiguous. Sarcasm is one example. Though sarcasm is easier to spot with the help of other modalities, such as facial expressions, most current general-purpose AIs cannot read such cues. With only the words to go on, it may be hard for them to work out a phrase’s true meaning.
Consider this: You just failed a task, and you said, “Oh, I am so good at this!” This is you being sarcastic. Although we, as humans, can understand this easily, it may be hard for AIs to realize that this is sarcasm. In this scenario, recognizing the sarcasm depends on knowing whether you have failed the task. If you had in fact completed the task, there would be no sarcasm. But failure is often subjective, and deciding whether you have failed the task is yet another problem the AI needs to handle.
While there are solutions to this, some of them are not reliable. For example, one proposed solution is to detect cue words that signal sarcasm. However, if the cue words are not in its database, the AI may fail to notice the sarcasm.
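To make the weakness concrete, here is a minimal sketch of cue-word detection in Python. It is only an illustration: the cue list and the function name `looks_sarcastic` are invented for this example and are not taken from any real system.

```python
import re

# Hypothetical cue-phrase "database", invented purely for illustration.
SARCASM_CUES = {"yeah right", "oh great", "so good at this"}


def looks_sarcastic(utterance: str, cues: set[str] = SARCASM_CUES) -> bool:
    """Return True if the utterance contains any known cue phrase."""
    # Lowercase and drop punctuation so "Oh, great!" can match "oh great".
    normalized = re.sub(r"[^a-z\s]", "", utterance.lower())
    # Collapse repeated whitespace left behind by the removed punctuation.
    normalized = " ".join(normalized.split())
    return any(cue in normalized for cue in cues)


# Flagged, because the phrase happens to be in the cue list.
print(looks_sarcastic("Oh, I am so good at this!"))

# Missed, even if said right after a failure: no listed cue appears.
print(looks_sarcastic("Well, that went perfectly."))
```

Note that the first sentence is flagged whether or not the speaker actually failed the task, so a detector like this can be wrong in both directions: it misses sarcasm that uses unlisted words, and it flags sincere statements that happen to contain a cue.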
Implicatures in language
Language is full of implicatures (hidden meanings that you need contextual background to fully understand), especially in daily conversations. Often, we want to make a request sound more polite, and using implicatures can serve that purpose. In fact, you probably use them every day. So, what are they?
They are a relatively simple concept. Take, for instance, the sentence “It is really hot in here.” When we hear this, we automatically understand that the speaker is feeling hot and that we should turn on the fan or open the windows to cool the room. We understand what the speaker implies and act on it; this response is known as a perlocutionary effect.
But think about this: Is the request to open the window or turn on the fan explicitly mentioned by the speaker? No! This is another reason why we cannot fully communicate with AIs. Implicatures are not spoken in words, so the AI never receives that information. Even humans sometimes overlook implicatures, let alone AIs.
Can an advanced AI learn and master a language?
Present technology, such as Siri, simply converts your speech into text, and the AI then analyzes that text. However, text alone may be insufficient for interpreting meaning. Indeed, the famous linguist Michael Halliday suggested that language is just one of the many resources used to establish meaning. Other resources, such as body language, also play an essential role.
Several recent AI models take this theory into account and analyze other modalities, such as graphics, audio and even motion, to form a comprehensive meaning. The mobile watch designed at MIT is a great example. The device analyzes not only the text of what is said but also physiological signals, such as your pulse, and body language to gauge the tone of the speaker. While the device is still in the experimental stages, it is a step closer to solving AI’s language learning problem.
It seems that we are moving closer to a “completed AI”. The mathematician and computer scientist John von Neumann is often credited with first applying the term “singularity” to technology, referring to a point beyond which it is impossible to predict what technology will be like. The time after the singularity is like a blank canvas: everything is possible after that point. And perhaps the day when we can have a C-3PO capable of understanding all the modalities and forming a comprehensive meaning is just around the corner.