How I Organized Machine Learning Training at NSU

My name is Sasha, and I love machine learning as well as human learning. I currently oversee educational programs at the Computer Science Center and a bachelor's program in data analysis at St. Petersburg State University. Before that, I worked as an analyst at Yandex, and even earlier as a researcher doing mathematical modeling at the ICT SB RAS.

In this post, I want to tell you what came out of the idea of launching machine learning training for students and graduates of Novosibirsk State University, and for everyone else.

I have long wanted to organize a special course to prepare for competitions in data analysis on Kaggle and other platforms. This seemed like a great idea:

  • Students and anyone else interested get to apply theoretical knowledge in practice and gain experience solving problems in public competitions.
  • Students who place near the top in such competitions make NSU more attractive to applicants, students and graduates. The same thing happens with competitive programming training.
  • Such a special course complements and extends fundamental knowledge: participants implement machine learning models on their own and often join teams that compete at the global level.
  • Other universities have already conducted such training, so I hoped for the success of the special course at NSU as well.

Launch

Novosibirsk Academgorodok is very fertile ground for such undertakings: it has students, graduates and teachers of the Computer Science Center and of strong technical faculties (for example FIT, MMF and FF), strong support from the NSU administration, an active ODS community, and experienced engineers and analysts from various IT companies. Around the same time, we learned about the grant program from Botan Investments: the fund supports teams that show good results in competitive ML.

We found a classroom at NSU for weekly meetings, created a Telegram chat, and launched on October 1 together with students and graduates of the CS center. 19 people came to the first lesson, and six of them became regular participants. In total, 31 people came to a meeting at least once during the academic year.

First results

We got to know each other, exchanged experience, and discussed competitions and a rough plan for the future. We quickly realized that fighting for places in data analysis competitions is regular, exhausting work, similar to an unpaid full-time job, but very interesting and exciting 🙂 One of the participants, Kaggle Master Maxim, advised us to first advance in the competitions individually, and only a few weeks later to form teams, taking the public score into account. We did just that! At the face-to-face sessions, we discussed models, scientific articles and the intricacies of Python libraries, and solved problems together.

The results of the fall semester were three silver medals in two Kaggle competitions, TGS Salt Identification and PLAsTiCC Astronomical Classification, plus third place in the CFT typo-correction competition, which brought our first prize money (being "in the money", as experienced Kagglers say).

Another very important indirect result of the special course was the launch and configuration of the VKI NSU cluster. Its computing power has significantly improved our competitive life: 40 CPUs, 755 GB RAM, 8 NVIDIA Tesla V100 GPUs.

Before that, we survived as best we could: we ran computations on personal laptops and desktops, in Google Colab and in Kaggle Kernels. One team even had a homemade script that automatically saved the model and restarted any computation that had stopped because of the time limit.
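The team's actual script is not shown here, so the following is only a minimal sketch of the general save-and-resume idea. The function names, the JSON checkpoint format and the toy "weight" update are all hypothetical stand-ins for a real model and training step.

```python
import json
import os
import tempfile
import time


def save_checkpoint(path, state):
    """Write state atomically so an interrupted save never corrupts the file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename within the same directory


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"epoch": 0, "weight": 0.0}


def train(path, total_epochs, time_budget_s):
    """Train until done or out of time; return True once all epochs are finished."""
    started = time.monotonic()
    state = load_checkpoint(path)  # resume from wherever the last session stopped
    while state["epoch"] < total_epochs:
        if time.monotonic() - started >= time_budget_s:
            return False  # out of time: the last finished epoch is already on disk
        # Hypothetical placeholder for one real training epoch.
        state["weight"] += 0.5
        state["epoch"] += 1
        save_checkpoint(path, state)
    return True
```

A restart wrapper can then be as simple as calling `train("ckpt.json", 100, 3600)` in a loop until it returns True. On a hosted notebook, the "loop" would be whatever relaunches the session after the limit is hit.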

In the spring semester, we continued to get together, share successful finds and talk about our competition solutions. New interested participants began to join us. During the spring semester, we took one gold, three silvers and nine bronzes in eight Kaggle competitions (PetFinder, Santander, Gendered Pronoun Resolution, Whale Identification, Quora, Google Landmarks and others), a bronze in the Recco challenge, third place in the Changellenge>>Cup, and first place (again in the money) in the machine learning track of the Yandex programming championship.

What the trainees say

Mikhail Karchevsky
“I am very glad that such activities are held in Siberia, because I believe that taking part in competitions is the fastest way to master ML. The hardware for such competitions is rather expensive to buy on your own, but here you can try out ideas for free.”

Kirill Brodt
“Before the ML training appeared, I did not really take part in competitions, with the exception of training and Hindu competitions: I did not see the point, since I already worked in ML and was familiar with the field. In the first semester I attended as a listener. Starting from the second semester, as soon as computing resources appeared, I thought, why not participate? And it got me hooked. The task, data and metrics have been invented and prepared for you: just take them and use the full power of ML, trying out state-of-the-art models and techniques. If it weren’t for the training and, last but not least, the computing resources, I wouldn’t have started participating any time soon.”

Andrey Shevelev
“Face-to-face ML training helped me find like-minded people, with whom I managed to deepen my knowledge of machine learning and data analysis. It is also a great option for those who do not have much free time to analyze competitions and immerse themselves in them on their own, but still want to stay in the loop.”

Join us

Competitions on Kaggle and other platforms hone practical skills and quickly lead to interesting work in data science. People who have gone through a difficult competition together often become colleagues and go on to solve real work problems together just as successfully. This happened with us too: Mikhail Karchevsky and a teammate went to work at the same company on a recommender system.

Over time, we plan to expand this activity with scientific publications and participation in machine learning conferences. Join us as participants or experts in Novosibirsk - write to me or Kirill. Organize similar trainings in your cities and universities.

Here is a little cheat sheet to help you get started:

  1. Think about a convenient place and time for regular classes. Once or twice a week is optimal.
  2. Write to potentially interested participants about the first meeting. First of all, these are students of technical universities and ODS participants.
  3. Start a chat to discuss current affairs: Telegram, VK, WhatsApp or any other messenger convenient for most.
  4. Keep a public lesson plan and a list of competitions and participants, and track the results.
  5. At nearby universities, research institutes or companies, find free computing power or grants for it.
  6. PROFIT!

Source: www.habr.com