
2021 is a banner year for AI natural language processing

The AI company acquired by Google is trying to master the natural language processing that machines use to understand human language

By Shalini Verma

Posted: Wed Dec 29 2021, 22:31

Universities have faced challenges in assessing their students remotely. The practical difficulties of running exams remotely have given rise to interesting alternatives such as open-book exams and research-based essay writing. The quality of submissions is likely to improve when students can look up references in books.

The same goes for artificial intelligence. If an AI system can consult a memory bank for references instead of memorizing our languages, its output might be better. This is what DeepMind claims to have achieved with RETRO. Having gained credibility by teaching AI to learn games like Go on its own and to predict complex protein structures, DeepMind, the AI company acquired by Google, is now trying to master the natural language processing that machines use to understand human language. Pre-trained language models produce text by predicting the words and phrases that should come next in their response.

DeepMind’s RETRO is a model whose performance is enhanced by an external resource – a massive body of text of some 2,000 billion words. To put that in perspective, it would take 175 lifetimes of continuous reading to get through it.

When the model generates text, it looks up this external resource to make its predictions more accurate. The researchers claim that such a model design makes it easier to understand how the AI makes inferences and to detect bias. Training also costs less, which makes the approach more accessible to organizations.
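The retrieval idea can be illustrated with a toy sketch in Python. This is a hypothetical illustration, not DeepMind’s actual RETRO implementation: a simple string-similarity lookup stands in for the neural retriever, and a made-up three-passage database stands in for the 2,000-billion-word corpus.

```python
# Toy sketch of retrieval-augmented generation (illustrative only):
# before "generating", the model looks up the closest passage in an
# external text database and conditions its output on it.
from difflib import SequenceMatcher

# Hypothetical external memory bank (RETRO's is trillions of tokens).
DATABASE = [
    "Go is an abstract strategy board game invented in ancient China.",
    "Proteins fold into complex three-dimensional structures.",
    "Natural language processing lets machines interpret human language.",
]

def retrieve(query: str, db: list[str]) -> str:
    """Return the database passage most similar to the query."""
    return max(db, key=lambda p: SequenceMatcher(None, query, p).ratio())

def generate(prompt: str) -> str:
    """Toy stand-in for a decoder that attends over retrieved text."""
    context = retrieve(prompt, DATABASE)
    return f"Based on: {context}"

print(generate("How do machines understand human language?"))
```

Because the knowledge lives in the database rather than in the model’s parameters, the “model” itself can stay small – which is the intuition behind RETRO’s efficiency claims.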

Natural language processing has been a daunting task for machines, given our complex languages. Billions of research dollars have been invested in language models.

Last year, OpenAI’s GPT-3 language model demonstrated that a computer can respond in complex and meaningful sentences. Language models had tended to falter outside a narrow scope. GPT-3 demonstrated that if a language model is scaled up enough, it can be versatile. The greater the number of internal configuration parameters or values, the higher the accuracy of the model. Other large organizations have developed their own large models. There has been a kind of gold rush to create bigger and better general-purpose language models loaded with billions of parameters.

At the end of 2021, NVIDIA and Microsoft developed the Megatron-Turing NLG 530B model, which is trained on the entire English Wikipedia, 63 million English press articles, 38 GB of Reddit discussions, GitHub, books from Project Gutenberg, etc. With a whopping 530 billion parameters, the model is fully trained to make inferences. Notably, this is a threefold increase over the 175 billion parameters of GPT-3, leaving the other major language models far behind. Google and the Beijing Academy of Artificial Intelligence have built models exceeding a trillion parameters.

While large language models are all the rage, they are becoming too heavy to train and to serve downstream AI applications such as digital assistants. They also consume vast amounts of data, computing power and energy. Researchers are still limited by an insufficient number of training examples. Imagine trying to train such a model to identify bank fraud. It would take forever to label conversations that involve a new kind of fraud.

AI research teams are trying to get around this problem. RETRO from DeepMind is one such attempt, with just 7 billion parameters. Another approach is few-shot learning, as successfully demonstrated by GPT-3, which uses a very small set of labeled examples to steer the model. It was transformational because GPT-3 could be prompted with as few as 16 examples.
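Few-shot prompting can be sketched in a few lines of Python. This is a hypothetical, minimal illustration (the example texts and labels are invented): a handful of labeled examples are placed directly in the prompt, and the model is then asked to continue the pattern for a new, unlabeled input.

```python
# Toy sketch of few-shot prompting: the "training" happens in the
# prompt itself, with no update to the model's parameters.
EXAMPLES = [
    ("I love this product", "positive"),
    ("Terrible service", "negative"),
    ("Absolutely wonderful", "positive"),
]

def build_few_shot_prompt(examples, new_input: str) -> str:
    """Format labeled examples, then append the unlabeled query."""
    lines = [f"Text: {text}\nLabel: {label}" for text, label in examples]
    lines.append(f"Text: {new_input}\nLabel:")  # model completes this line
    return "\n\n".join(lines)

prompt = build_few_shot_prompt(EXAMPLES, "What a dreadful experience")
print(prompt)
```

A large pre-trained model given such a prompt tends to complete the final “Label:” line in keeping with the pattern, which is why a handful of examples can substitute for thousands of labeled training records.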

Few-shot learning is being explored by Meta (formerly Facebook) for content moderation on its social media platforms, which requires rapid policy enforcement. As harmful content continues to evolve, Meta struggles to find enough labeled content. It has deployed a new AI model that is first trained on the massive generic text corpora available for free. It is then trained on policy data that was labeled previously. Finally, it is trained on a concise text describing a new policy.

It uses a new framework called Entailment as Few-Shot Learner. Entailment means establishing a logical consequence between sentences. For example, if “the car is accelerating” is true, then “the car is moving” must also be true. Simply put, the AI tool can recognize a hateful message because it understands the policy that the content violates. The tool has been used to quickly detect and remove hate speech and messages questioning vaccines.
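The entailment reframing can be illustrated with a toy Python sketch. Everything here is hypothetical – the policy names, the cue phrases, and above all the keyword check, which stands in for the neural entailment model Meta actually uses: each policy becomes a hypothesis, and content violates the policy if the content entails that hypothesis.

```python
# Toy illustration of "policy violation as entailment" (illustrative
# only): a crude keyword match stands in for a neural entailment model
# that would score premise/hypothesis pairs.
POLICY_CUES = {
    "hate speech": ["hate", "despise"],
    "vaccine misinformation": ["vaccines are a hoax"],
}

def entails(premise: str, cue_phrases: list[str]) -> bool:
    """Stand-in for a neural entailment scorer over sentence pairs."""
    premise = premise.lower()
    return any(cue in premise for cue in cue_phrases)

def violated_policies(content: str) -> list[str]:
    """Return every policy whose hypothesis the content entails."""
    return [policy for policy, cues in POLICY_CUES.items()
            if entails(content, cues)]

print(violated_policies("I hate this group of people"))
```

The appeal of the real entailment approach is that adding a new policy only requires writing its hypothesis as text, rather than collecting a fresh labeled dataset.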

2021 has been a banner year for natural language processing, as organizations vie for bigger and better models. The written word has never been as essential as it is today to making AI useful to society. Will there be a finish line in this race for the supremacy of natural language models? Not anytime soon. But AI will certainly become smarter, a better helper, and more accessible to ordinary people. Each of these efforts will be a springboard that moves the needle forward, if only a little. Ultimately, we can expect to use a more discreet intelligent machine, one that goes unnoticed until it breaks down, much like electricity.

Shalini Verma is CEO of PIVOT technologies, a Dubai-based cognitive innovation company. She tweets @shaliniverma1.