2023 Author: Bryan Walter | [email protected]. Last modified: 2023-05-21 22:24
Yandex presented YaLM, a neural network algorithm for generating texts, and the Zeliboba service based on it. The service predicts the next word in a sentence and, thanks to this, can write short texts from a few words entered by the user. The language model underlying Zeliboba was trained on several terabytes of Russian-language texts, including Wikipedia articles, news articles, and social media posts.
From the editor
Soon after the post was published, Yandex closed access to the service. "As a result of an error, we opened a demo version of the technology ahead of time; it is still in internal testing. We will definitely tell you more about everything, but a little later," said a company representative.
Update: On June 17, Yandex reopened access to the service with minor changes. It was renamed "Balaboba" and now accepts complaints from users about offensive generated text. The company also claims that the service rejects requests about politics, religion, and other sensitive topics, but apparently the developers only blocked responses to the names of a few prominent politicians and to words such as "rally".
Natural language processing has seen significant progress in the past few years, largely thanks to the Transformer, a neural network architecture developed in 2017 by researchers at Google. The Transformer architecture is best known from the GPT family of models developed by OpenAI. Starting with GPT-2, the quality of generated text became so high that the developers, fearing misuse, initially decided not to release the full model to the public, limiting themselves to a simplified version.
The quality of text generated by such neural networks depends on many factors, but largely on the number of parameters in the network. GPT-2 had one and a half billion; GPT-3, presented last year, has 175 billion, which allows it, given just a few examples, to perform a variety of text tasks, including writing poetry, answering questions, and translating.
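"A few examples" here refers to so-called few-shot prompting: instead of retraining the network, task examples are placed directly in the input text, and the model is asked to continue it. A minimal sketch of how such a prompt is assembled is shown below; the question/answer format and the translation examples are invented for illustration and are not taken from YaLM or GPT-3.

```python
# Sketch of few-shot prompting: demonstration pairs plus a final query
# are joined into one text, which a language model would then continue.
# The format and examples here are hypothetical.

def build_few_shot_prompt(examples, query):
    """Join (input, output) example pairs and a final query into one prompt."""
    lines = [f"Q: {source}\nA: {target}" for source, target in examples]
    lines.append(f"Q: {query}\nA:")  # the model completes this last answer
    return "\n\n".join(lines)

examples = [
    ("cat", "кот"),
    ("dog", "собака"),
]
prompt = build_few_shot_prompt(examples, "house")
print(prompt)
```

The model never sees an explicit instruction to translate; it infers the task from the pattern of the examples and generates the continuation after the final `A:`.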
Russian developers are also building neural networks based on the Transformer architecture but trained on Russian-language texts. At the end of 2020, Sberbank developers published a Russian-language version of GPT-3 with 760 million parameters, and now Yandex has presented its own Russian-language Transformer model.
Like GPT-3, Yandex's YaLM (Yet another Language Model) algorithm is trained to predict the next word in a text sequence. By chaining the predicted words, the model can extend a given word or sentence with several more words or sentences. To demonstrate how the algorithm works, the developers created the Zeliboba service, in which the user enters text and receives a continuation from the algorithm. Zeliboba also lets the user choose the style of the generated text, for example, toasts, captions for Instagram posts, or "lad" quotes.
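The generation loop described above can be sketched with a toy model: a next-word predictor is applied repeatedly, and each predicted word is fed back as context for the next prediction. YaLM uses a Transformer for the prediction step; the bigram-count "model" and the tiny corpus below are invented purely to illustrate the autoregressive loop.

```python
from collections import defaultdict, Counter

# Toy autoregressive generation: count which word follows which in a corpus,
# then repeatedly append the most frequent continuation (greedy decoding).
# Real models like YaLM or GPT replace the counts with a Transformer.

def train_bigrams(corpus):
    counts = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_words=5):
    out = [start]
    for _ in range(max_words):
        candidates = counts.get(out[-1])
        if not candidates:
            break  # no known continuation for the last word
        # greedy choice: take the most frequent next word
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

corpus = "the cat sat on the mat the cat ran"
model = train_bigrams(corpus)
print(generate(model, "the"))
```

Large models sample from a probability distribution over the whole vocabulary rather than always taking the single most frequent word, which is what makes their continuations varied instead of repetitive.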
An example of how the algorithm works
The developers note that the algorithm adapts to a new style very quickly: it needs only five to a few dozen examples in the target style.
The authors of the algorithm created several models that differ in the number of parameters, from 1 to 13 billion (Zeliboba uses the version with 3 billion parameters). All of them were trained on a variety of Russian-language texts, including Wikipedia articles, books, social media posts, and news items. The developers used both their own and publicly available datasets, including Taiga and RDT.
A Yandex representative told N + 1 that the company will soon announce which other services are planned to use the new text generation algorithm; other details about possible uses of the YaLM family were not disclosed.
An example of how the algorithm works
In Russian SuperGLUE, the benchmark for Russian-language natural language processing models, the YaLM algorithm (the version with one billion parameters) took second place, behind a model from the developer neverrixx.
In recent months, Transformer-based neural networks with trillions of parameters have begun to appear. The first such network was presented by developers from Google, and local media recently reported on the creation of such an algorithm in China.