Author: Bryan Walter. Last modified: 2023-05-21 22:24
Researchers at OpenAI have created an algorithm that allows a robotic hand to solve a Rubik's cube by flipping and rotating it with its fingers. The algorithm was first trained by trial and error in a virtual environment and then transferred to a real device. According to the OpenAI blog, the robotic hand solves the cube 20 percent of the time for the most complex configurations, which require 26 turns, and 60 percent of the time for configurations requiring 15 turns.
In the field of hand-like robotic manipulators, developers mainly focus on prostheses and remote-controlled humanoid robots. The hardware itself is often already quite dexterous and capable of complex manipulations, but the control algorithms still lag behind the electromechanical component. To improve these algorithms, some companies focus on their own applied problems, while researchers often tackle “toy” problems that are difficult to apply in practice. However, solving them often gives rise to technologies that can later be applied in many areas.
In 2017, developers at the non-profit organization OpenAI set out to solve a Rubik's cube with a single robotic hand. In 2018, they showed an intermediate result: they had taught the robotic hand to rotate a cube to a desired orientation up to 50 times in a row. The researchers have now shown that they have reached the final goal using similar algorithms and training principles.
Because the authors' goal was an algorithm for dexterous object manipulation rather than for solving the puzzle itself, they used an existing implementation of Kociemba's two-phase algorithm to compute the moves of the solution. For hardware, they used the commercially available Shadow Dexterous Hand.
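To make the role of the solver concrete: cube solvers typically report a solution as a string of face turns in standard notation, where a letter names a face, an apostrophe means a counterclockwise turn, and a 2 means a half turn. The sketch below parses such a string into steps a controller could work through one at a time. Whether the researchers' pipeline used this exact format is an assumption, and the function name `parse_moves` is made up for this illustration.

```python
FACES = "URFDLB"  # up, right, front, down, left, back

# Suffix -> number of clockwise quarter turns of the face.
QUARTER_TURNS = {"": 1, "2": 2, "'": 3}


def parse_moves(solution: str):
    """Turn a string such as "R U2 F'" into (face, clockwise quarter turns) pairs."""
    moves = []
    for token in solution.split():
        face, suffix = token[0], token[1:]
        if face not in FACES or suffix not in QUARTER_TURNS:
            raise ValueError(f"unrecognized move: {token!r}")
        moves.append((face, QUARTER_TURNS[suffix]))
    return moves


print(parse_moves("R U2 F'"))  # [('R', 1), ('U', 2), ('F', 3)]
```

A counterclockwise turn is stored as three clockwise quarter turns, so every step has the same form.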
The system consists of two main parts. The first is a convolutional neural network responsible for visual perception of the cube. It takes three images of the hand holding the cube, captured from different angles, and uses them to estimate the position of the cube and the rotation angles of its faces. The second is a recurrent neural network with a long short-term memory (LSTM) architecture. It receives the output of the first network and, together with the solution sequence computed by Kociemba's algorithm, generates a sequence of finger movements.
Diagram of the algorithms
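The two-part design described above can be read as a simple data flow: three images go into a vision model, its estimate and the next solver move go into a stateful policy, and joint targets come out. The sketch below shows only that flow. The classes are hypothetical stubs with toy arithmetic in place of the real networks, and every name in it (`VisionModel`, `RecurrentPolicy`, `control_step`, `num_joints`) is an assumption made for this illustration, not the researchers' code.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

FACES = "URFDLB"  # standard face letters: up, right, front, down, left, back


@dataclass
class CubeEstimate:
    position: Tuple[float, float, float]  # cube position in space
    face_angles: List[float]              # rotation angle of each of the six faces


class VisionModel:
    """Placeholder for the convolutional network (hypothetical interface)."""

    def estimate(self, images: Sequence[object]) -> CubeEstimate:
        if len(images) != 3:
            raise ValueError("expected three camera images")
        # A real model would run the images through a CNN; this stub returns zeros.
        return CubeEstimate(position=(0.0, 0.0, 0.0), face_angles=[0.0] * 6)


class RecurrentPolicy:
    """Placeholder for the LSTM policy: it carries state between calls."""

    def __init__(self, num_joints: int):
        self.hidden = [0.0] * num_joints  # toy substitute for the LSTM state

    def act(self, estimate: CubeEstimate, next_move: str) -> List[float]:
        # Toy update: mix the previous state with a number derived from the inputs.
        signal = sum(estimate.face_angles) + FACES.index(next_move[0])
        self.hidden = [0.9 * h + 0.1 * signal for h in self.hidden]
        return list(self.hidden)  # one target value per finger joint


def control_step(vision, policy, images, next_move):
    """One perception-action step: images -> cube estimate -> joint targets."""
    return policy.act(vision.estimate(images), next_move)
```

Calling `control_step` twice with the same inputs returns different targets, because the policy keeps state between steps, which is the property a recurrent network contributes.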
As in the previous work, the researchers trained the algorithms not on a set of real robotic hands but on copies of them in a virtual environment. This made it possible, first, to parallelize training and thereby speed it up and, second, to improve the quality of the algorithms by varying the parameters of the environment. The algorithms learned by trial and error, and each time they reached a threshold success rate, the environment automatically changed its parameters, for example the size and mass of the cube. This forced the algorithm to adapt to the new conditions again. This approach is what made it possible to transfer the algorithms to a real robotic hand without having to simulate every aspect of the interaction between the cube and the hand with perfect accuracy.
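The training loop described above, where the environment changes its parameters once a success threshold is reached, can be sketched in a few lines. This is a toy model under stated assumptions, not the researchers' implementation: the parameter values, the threshold of 0.8, the `simulate_attempt` function, and the fixed `skill` increment that stands in for real learning are all invented for the illustration.

```python
import random


class RandomizedParameter:
    """A simulation parameter sampled from a range that can widen over time."""

    def __init__(self, nominal, step, max_delta):
        self.nominal = nominal
        self.step = step
        self.max_delta = max_delta
        self.delta = 0.0  # start with no randomization

    def sample(self, rng):
        return rng.uniform(self.nominal - self.delta, self.nominal + self.delta)

    def widen(self):
        self.delta = min(self.delta + self.step, self.max_delta)


def simulate_attempt(skill, deviation, rng):
    """Hypothetical stand-in for one simulated attempt at the task."""
    p_success = max(0.05, min(0.95, 0.5 + skill - deviation))
    return rng.random() < p_success


def train(num_rounds=200, episodes_per_round=50, threshold=0.8, seed=0):
    rng = random.Random(seed)
    # Illustrative values, relative to the real cube (1.0 = unchanged).
    params = {
        "cube_size": RandomizedParameter(nominal=1.0, step=0.05, max_delta=0.5),
        "cube_mass": RandomizedParameter(nominal=1.0, step=0.05, max_delta=0.5),
    }
    skill = 0.0
    for _ in range(num_rounds):
        successes = 0
        for _ in range(episodes_per_round):
            # Each episode gets its own randomly perturbed environment.
            sampled = {name: p.sample(rng) for name, p in params.items()}
            deviation = sum(
                abs(sampled[name] - p.nominal) / p.max_delta
                for name, p in params.items()
            ) / len(params)
            successes += simulate_attempt(skill, deviation, rng)
            skill += 0.001  # placeholder for a real learning update
        # Once the success rate passes the threshold, make the task harder.
        if successes / episodes_per_round >= threshold:
            for p in params.values():
                p.widen()
    return params, skill
```

The point of the design is that difficulty follows competence: the ranges widen only after the current ones have been mastered, so the policy sees progressively more varied cubes.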
OpenAI recently presented other notable work in machine learning. Its researchers created neural network agents that learned to play hide and seek on their own. During training, the two teams repeatedly discovered new winning strategies, and one of these strategies exploits a feature of the virtual environment that the authors overlooked during development.