DeepMind introduced a machine learning system for generating code from a text description of a task

DeepMind, known for its work in artificial intelligence and for building neural networks that play computer and board games at a human level, has presented AlphaCode, a machine learning system for code generation that can take part in programming competitions on the Codeforces platform and achieve an average result. A key feature is its ability to generate code in Python or C++ from a problem statement written in English.

To test the system, 10 new Codeforces competitions with more than 5000 participants were selected, all held after training of the machine learning model had been completed. Its results on the tasks placed AlphaCode approximately in the middle of the ranking for these competitions (54.3%). The estimated overall rating of AlphaCode was 1238 points, which puts it in the top 28% of all Codeforces participants who have competed at least once in the last 6 months. The project is noted to be at an early stage of development. Future plans are to improve the quality of the generated code and to develop AlphaCode towards systems that assist in writing code, or application development tools that people without programming skills can use.

The project uses the Transformer neural network architecture combined with sampling and filtering methods to generate a wide variety of candidate programs that correspond to the natural-language text. After filtering, clustering, and ranking, the best working candidates are picked from the generated stream of options and checked for the correct result (each competition task specifies example input data and the result the program should produce for that input).
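The filter, cluster, and rank stages can be illustrated with a minimal Python sketch. It is not DeepMind's implementation: the article does not say how candidates are clustered or ranked, so grouping by identical behavior on extra probe inputs and ranking by cluster size are assumptions made here for illustration. Candidate programs are modelled as plain Python callables, and the names `run_safely` and `select_candidates` are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in: a "candidate program" is modelled as a callable
# that maps an input string to an output string.
Candidate = Callable[[str], str]


def run_safely(candidate: Candidate, data: str) -> str:
    """Run a candidate; any exception is reported as the output '<error>'."""
    try:
        return candidate(data)
    except Exception:
        return "<error>"


def select_candidates(
    candidates: Sequence[Candidate],
    example_input: str,
    expected_output: str,
    probe_inputs: Sequence[str],
    limit: int,
) -> List[Candidate]:
    """Return at most `limit` candidates after filtering, clustering, ranking."""
    # Filtering: keep only candidates that reproduce the example
    # given in the problem statement.
    passing = [
        c for c in candidates
        if run_safely(c, example_input).strip() == expected_output.strip()
    ]

    # Clustering (assumed criterion): candidates that produce identical
    # outputs on every probe input land in the same group.
    clusters: Dict[Tuple[str, ...], List[Candidate]] = defaultdict(list)
    for c in passing:
        signature = tuple(run_safely(c, p) for p in probe_inputs)
        clusters[signature].append(c)

    # Ranking (assumed criterion): larger clusters first, then take
    # one representative from each of the top clusters.
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [group[0] for group in ranked[:limit]]
```

For a toy task such as "double the number", two correct candidates and one that merely adds 2 all pass the example `2 -> 4`, but probing with other inputs separates them, and the larger (correct) cluster is ranked first.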

For initial pre-training of the machine learning system, the code available in public GitHub repositories was used. Once the initial model was prepared, a fine-tuning phase followed, based on a collection of problems and example solutions from the Codeforces, CodeChef, HackerEarth, AtCoder and Aizu competitions. In total, 715 GB of code from GitHub and more than a million example solutions to typical competition problems were used for training. Before code generation, the task text went through a normalization phase, in which everything superfluous was removed and only the significant parts were kept.

Source: opennet.ru
