Yandex is opening a machine learning residency program for experienced back-end developers. If you have written a lot of C++/Python and want to apply that experience to ML, we will teach you how to do practical research and assign you experienced mentors. You will work on key Yandex services and gain skills in areas such as linear models and gradient boosting, recommender systems, and neural networks for image, text and sound analysis. You will also learn how to evaluate your models correctly using offline and online metrics.
The program lasts one year, during which participants will work in the Yandex Machine Intelligence and Research Department and attend lectures and seminars. Participation is paid and is a full-time job: 40 hours per week, starting July 1 this year.
Now in more detail: what kind of audience we are looking for, what the workflow will be like, and, more generally, how a back-end specialist can switch to a career in ML.
Focus
Many companies have residency programs, including, for example, Google and Facebook. They are mainly aimed at junior and mid-level professionals who want to move toward ML research. Our program is for a different audience. We invite back-end developers who already have solid experience and know for certain that they need to shift their competencies toward ML and gain the practical skills of solving industrial machine learning problems, rather than the skills of a scientist. This does not mean that we do not support young researchers. For them, we have organized a separate program -
Where residents will work
In the Machine Intelligence and Research Department, we develop project ideas ourselves. The main sources of inspiration are scientific literature, articles, and trends in the research community. My colleagues and I analyze what we read and look for ways to improve or extend the methods proposed by scientists. Each of us takes into account their own area of knowledge and interests, and formulates a task based on the directions they consider important. A project idea is usually born at the intersection of external research results and our own competencies.
This system is good because it largely solves the technological problems of Yandex services before they even arise. When a service runs into a problem, its representatives come to us, most likely to pick up technologies we have already prepared, which then only need to be applied correctly in the product. If something is not ready, we can at least quickly recall where to "start digging" and which articles to search for a solution. As you know, the scientific approach is to stand on the shoulders of giants.
What you will do
At Yandex, and even specifically in our department, all relevant areas of ML are being developed. Our task is to improve the quality of a wide variety of products, which is an incentive to try everything new. In addition, new services appear regularly. So the lecture program covers all the key (well-established) areas of machine learning in industrial development. When compiling my part of the course, I drew on my experience of teaching at the School of Data Analysis (ShAD), as well as on the materials and work of other ShAD teachers. I know that my colleagues did the same.
In the first months, training under the course program will take approximately 30% of your working time, and later about 10%. However, it is important to understand that working with the ML models themselves will still take about a quarter of the time spent on all the related processes. These include preparing the back end, obtaining data, writing a pipeline to preprocess it, optimizing code, adapting to specific hardware, and so on. An ML engineer is, if you like, a full-stack developer (only with a stronger bias toward machine learning) who can solve a problem from start to finish. Even with a finished model, you will probably need to do a number of further things: parallelize its execution across several machines, and prepare an implementation in the form of an API endpoint, a library, or a component of the service itself.
Student choice
If you have the impression that the best way to become an ML engineer is to work as a back-end developer first, that is not the case. Enrolling in ShAD without real experience in developing services, completing the training, and becoming extremely in demand on the market is a great option. Many specialists at Yandex reached their current positions this way. If a company is ready to offer you a job in ML immediately after graduation, it is probably worth accepting that offer too. Try to join a good team with an experienced mentor, and get ready to learn a lot.
What usually prevents you from doing ML
A back-end developer who wants to become an ML engineer has, leaving the residency program aside, two paths to choose from.
The first is to study through an educational course.
The second is to take part in production projects where you need to implement one ML algorithm or another. However, there are very few such projects on the IT development market: most tasks do not use machine learning. Even in banks that are actively exploring ML opportunities, only a few people work on data analysis. If you were not able to join one of these teams, the only options left are to start your own project (where you will most likely set your own deadlines, which has little in common with real production tasks) or to start competing on Kaggle.
Indeed, team up with other community members and try yourself in competitions
I have described two possible paths: training through educational programs and training "in combat", for example on Kaggle. The residency program combines the two. You can expect lectures and seminars at the ShAD level, as well as real production projects.
Source: habr.com