Mikhail Salosin. Golang meeting. Using Go in the backend of the Look+ application

Mikhail Salosin (hereinafter - MS): - Hi all! My name is Mikhail. I work as a back-end developer at MC2 Software, and I will talk about using Go in the back end of the Look+ mobile application.

Does anyone here like hockey?

Then this app is for you. It is available for Android and iOS and is used to watch broadcasts of various sporting events, both live and recorded. The application also offers statistics, text broadcasts, conference and tournament tables, and other information useful to fans.

The application also has video moments: you can watch the key moments of matches (goals, fights, shootouts, etc.). If you don't feel like watching the whole broadcast, you can watch only the most interesting parts.

What was used in development?

The main part was written in Go: the API that mobile clients communicate with, and a service that sends push notifications to mobile phones. We also had to write our own ORM, which we may talk about someday. Some small services are written in Go as well: resizing and uploading images for the editors' side ...

We used Postgres (PostgreSQL) as the database. The interface for the editors was written in Ruby on Rails using the ActiveAdmin gem. Ruby is also used to import statistics from a statistics provider.

For API system tests, we used Python's unittest. Memcached is used for throttling payment API calls, Chef for configuration management, and Zabbix for collecting and monitoring internal system statistics. Graylog2 collects logs, and Slate serves the API documentation for clients.

Protocol selection

The first problem we faced was choosing the protocol for the interaction between the backend and mobile clients, based on the following points:

The most important requirement is that the data on the clients must be updated in real time. That is, everyone who is currently watching the broadcast should receive updates almost instantly.
To simplify things, we assumed that data synchronized with clients is never deleted, only hidden using special flags.
Rare requests (such as statistics, lineups, team statistics) are served by regular GET requests.
Plus, the system had to comfortably withstand 100 thousand users at the same time.

Based on this, we had two protocol options:

WebSockets. But we didn't need a channel from the client to the server. We only needed to send updates from the server to the client, so WebSockets would be a redundant option.
Server-Sent Events (SSE) turned out to be just right: it is quite simple and satisfies, in principle, everything we need.

Server-Sent Events

A few words about how this thing works ...

It works on top of an HTTP connection. The client sends a request, the server responds with Content-Type: text/event-stream and does not close the connection with the client, but continues to write data to it.

Data can be sent in any format agreed with the clients. In our case, the event field carried the name of the changed structure (person, player), and the data field carried JSON with the new, changed fields for the player.
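
The message format described above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from the talk: the formatEvent helper, the playerUpdate type and its fields are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// playerUpdate is a hypothetical model holding only the changed fields.
type playerUpdate struct {
	ID    int `json:"id"`
	Goals int `json:"goals"`
}

// formatEvent builds one Server-Sent Event: an "event:" line with the name
// of the changed structure, a "data:" line with its JSON, and a blank line
// that terminates the event.
func formatEvent(event string, v any) (string, error) {
	data, err := json.Marshal(v) // compact JSON, so it fits on one data line
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("event: %s\ndata: %s\n\n", event, data), nil
}

func main() {
	msg, err := formatEvent("player", playerUpdate{ID: 7, Goals: 2})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Print(msg)
}
```

The blank line after the data line is what tells an SSE reader that the event is complete.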

Now about how the interaction itself works.

First, the client determines when it last synchronized with the service: it looks into its local database and finds the date of the last change recorded there.
It sends a request with that date.
In response, we send it all the updates that have occurred since that date.
After that, it connects to the live channel and keeps the connection open for as long as it needs these updates.

We send it a list of changes: if someone scores a goal, we change the score of the match; if someone gets injured, that is also sent in real time. Thus, clients instantly receive up-to-date data in the match event feed. To let the client know that the server is still alive, we send a timestamp every 15 seconds, so the client knows that everything is in order and there is no need to reconnect.

How is a live connection serviced?

First, we create a buffered channel that will receive updates.
After that, we subscribe this channel to receive updates.
We set the correct header so that the client knows that everything is OK.
We send the first ping: we simply write the current timestamp to the connection.
After that, we read from the channel in a loop until the update channel is closed. The channel periodically receives either the current timestamp or changes, which we then write to the open connection.
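
The five steps above might look roughly like this as an HTTP handler. It is a sketch under assumptions, not the handler from the slides: the hub interface, liveHandler, oneShotHub and the use of pre-formatted string messages are invented for the example, and the exit on request cancellation is an addition that the talk does not mention. The buffer size of 100 follows the figure given later in the Q&A.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// hub is the minimal subscription interface the handler needs (hypothetical).
type hub interface {
	Subscribe(ch chan string)
	Unsubscribe(ch chan string)
}

// liveHandler streams updates to one client until its channel is closed
// or the client goes away.
func liveHandler(h hub) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}

		// Steps 1 and 2: a buffered channel for this connection, subscribed.
		updates := make(chan string, 100)
		h.Subscribe(updates)
		defer h.Unsubscribe(updates)

		// Step 3: the header that tells the client a stream follows.
		w.Header().Set("Content-Type", "text/event-stream")
		w.WriteHeader(http.StatusOK)

		// Step 4: the first ping, which is just the current timestamp.
		fmt.Fprintf(w, "event: ping\ndata: %d\n\n", time.Now().Unix())
		flusher.Flush()

		// Step 5: forward everything from the channel until it is closed.
		for {
			select {
			case msg, open := <-updates:
				if !open {
					return
				}
				fmt.Fprint(w, msg)
				flusher.Flush()
			case <-r.Context().Done(): // assumption: stop when the client disconnects
				return
			}
		}
	}
}

// oneShotHub stands in for the real fan-out: it delivers a single update
// and closes the channel, which ends the handler loop.
type oneShotHub struct{ msg string }

func (h oneShotHub) Subscribe(ch chan string) { ch <- h.msg; close(ch) }
func (h oneShotHub) Unsubscribe(chan string)  {}

// demo runs the handler against an in-memory recorder.
func demo() (contentType, body string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/live", nil)
	liveHandler(oneShotHub{msg: "event: match\ndata: {\"id\":1}\n\n"})(rec, req)
	return rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	contentType, body := demo()
	fmt.Println(contentType)
	fmt.Print(body)
}
```

Flushing after every write matters here: without it, small events can sit in the response buffer instead of reaching the client immediately.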

The first problem we ran into here was this: for each connection opened with a client, we created a timer that ticked every 15 seconds. So with 6 thousand connections open on one machine (one API server), 6 thousand timers were created, and the machine could not hold the required load. The problem was not obvious to us, but we got a little help and fixed it.

As a result, the ping now arrives through the same channel as the updates.

Accordingly, there is only one timer, which ticks every 15 seconds.

There are several auxiliary functions here: sending the header, the ping, and the structure itself. That is, the name of the table (person, match, season) and the record itself are transmitted.

Mechanism for sending updates

Now a little about where the changes come from. We have several people, editors, who watch the broadcast in real time. They create all the events: someone was sent off, someone got injured, some kind of replacement ...

Through the CMS, the data gets into the database. The database then uses the Listen/Notify mechanism to notify the API servers, and the API servers send this information on to the clients. Thus only a few servers are connected to the database, and it carries no special load, because clients never interact with the database directly.

PostgreSQL: Listen/Notify

The Listen/Notify mechanism in Postgres lets you notify subscribers that something has changed, for example that a record has been created in the database. To do this, we wrote a simple trigger and function.

When a record is inserted or changed, we call the notify function on the data_updates channel, passing the name of the table and the ID of the record that was inserted or changed.
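
A possible shape for such a function and a per-table trigger is sketched below, with the SQL kept as Go strings so it could be applied from a migration step. Only the data_updates channel name comes from the talk. The function name notify_data_update, the "<table>,<id>" payload format, the id column and the triggerSQL helper are assumptions.

```go
package main

import "fmt"

// notifyFunctionSQL defines a trigger function that publishes "<table>,<id>"
// on the data_updates channel. The function name, the payload format and
// the id column are assumptions made for this sketch.
const notifyFunctionSQL = `
CREATE OR REPLACE FUNCTION notify_data_update() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify('data_updates', TG_TABLE_NAME || ',' || NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;`

// triggerSQL returns the statement that attaches the function to one table.
// The table name is interpolated as-is, so it must come from trusted code.
func triggerSQL(table string) string {
	return fmt.Sprintf(
		"CREATE TRIGGER %[1]s_notify AFTER INSERT OR UPDATE ON %[1]s "+
			"FOR EACH ROW EXECUTE PROCEDURE notify_data_update();", table)
}

func main() {
	fmt.Println(notifyFunctionSQL)
	for _, table := range []string{"person", "player"} {
		fmt.Println(triggerSQL(table))
	}
}
```

Sending only the table name and ID keeps the payload tiny, which fits the payload size limit discussed in the Q&A.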

For every table that needs to be synchronized with the client, we define a trigger that calls this function after a record is inserted or updated.

How does the API subscribe to these changes?

A Fanout mechanism is created, which sends messages to the clients. It collects all client channels and distributes the updates it receives across these channels.

Here the pq library (lib/pq, a third-party Postgres driver for Go, not part of the standard library) connects to the database, says that it wants to listen to the data_updates channel, and checks that the connection is open and everything is fine. Error checking is omitted to save space (skipping it in real code is risky).

Next, we asynchronously start a Ticker that will send a ping every 15 seconds, and begin listening to the channel we subscribed to. If we get a ping, we publish the ping. If we receive a record, we publish that record to all subscribers of this Fanout.
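
The loop described above can be sketched with the standard library alone. To keep the example runnable without a database, notifications arrive on a plain string channel. In the real service they would presumably come from the Notify channel of a pq.Listener, and the record would be loaded from the database before publishing. The update type, parsePayload, listen and the "<table>,<id>" payload format are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// update is what gets published: either a ping or a reference to a changed
// record (the real service would publish the loaded record instead).
type update struct {
	Table string
	ID    int64
	Ping  int64
}

// parsePayload splits a "<table>,<id>" notification payload (assumed format).
func parsePayload(p string) (update, error) {
	table, idStr, ok := strings.Cut(p, ",")
	if !ok {
		return update{}, fmt.Errorf("malformed payload %q", p)
	}
	id, err := strconv.ParseInt(idStr, 10, 64)
	if err != nil {
		return update{}, err
	}
	return update{Table: table, ID: id}, nil
}

// listen forwards pings and notifications to publish until the
// notifications channel is closed. One ticker serves all subscribers.
func listen(notifications <-chan string, ticks <-chan time.Time, publish func(update)) {
	for {
		select {
		case t := <-ticks:
			publish(update{Ping: t.Unix()})
		case p, ok := <-notifications:
			if !ok {
				return
			}
			u, err := parsePayload(p)
			if err != nil {
				continue // this sketch simply skips malformed payloads
			}
			publish(u)
		}
	}
}

// demo feeds two payloads through the loop and collects what is published.
func demo() []update {
	notifications := make(chan string, 2)
	notifications <- "person,42"
	notifications <- "garbage"
	close(notifications)

	var got []update
	listen(notifications, nil, func(u update) { got = append(got, u) })
	return got
}

func main() {
	// In a service, ticks would be time.NewTicker(15 * time.Second).C and
	// publish would be the fan-out's Publish method.
	fmt.Printf("%+v\n", demo())
}
```

Because the ping is published through the same path as the updates, the number of timers no longer depends on the number of open connections.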

How does fan-out work?

In Russian, this translates as "splitter". We have one object that registers subscribers who want to receive some kind of updates. As soon as an update arrives at this object, it distributes the update to all its subscribers. Simple enough.

How it is implemented in Go:

There is a structure, synchronized with a mutex. It has a field that stores the state of the Fanout's connection to the database (that is, whether it is currently listening and will receive updates), as well as the set of all subscribed channels: a map whose keys are channels and whose values are empty structs (the value is not actually used).

Two methods, Connected and Disconnected, tell the Fanout that the connection to the database has been established or has been broken. In the second case, all clients must be disconnected and told that they can no longer listen to anything and should reconnect, because their connection has been closed.

There is also a Subscribe method that adds the channel to the "listeners".

There is the Unsubscribe method, which removes the channel from the listeners if the client is disconnected, as well as the Publish method, which allows you to send a message to all subscribers.
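
Put together, a structure with these methods could look something like the sketch below. It is not the code from the slides: the field names, the string message type and the bool returned by Subscribe are assumptions. Publish drops and closes a subscriber whose buffer is full, which matches the behaviour described later in the Q&A.

```go
package main

import (
	"fmt"
	"sync"
)

// Fanout distributes messages to all subscribed channels.
type Fanout struct {
	mu        sync.Mutex
	connected bool                     // is the database listener up?
	subs      map[chan string]struct{} // set of subscriber channels
}

func NewFanout() *Fanout {
	return &Fanout{subs: make(map[chan string]struct{})}
}

// Connected marks the database listener as up.
func (f *Fanout) Connected() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.connected = true
}

// Disconnected marks the listener as down and closes every subscriber
// channel so that clients reconnect.
func (f *Fanout) Disconnected() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.connected = false
	for ch := range f.subs {
		close(ch)
		delete(f.subs, ch)
	}
}

// Subscribe adds ch to the listeners. Refusing while disconnected is an
// assumption of this sketch.
func (f *Fanout) Subscribe(ch chan string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if !f.connected {
		return false
	}
	f.subs[ch] = struct{}{}
	return true
}

// Unsubscribe removes ch from the listeners; it is safe to call twice.
func (f *Fanout) Unsubscribe(ch chan string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.subs, ch)
}

// Publish sends msg to every subscriber without blocking. A subscriber
// whose buffer is full is closed and removed.
func (f *Fanout) Publish(msg string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	for ch := range f.subs {
		select {
		case ch <- msg:
		default:
			close(ch)
			delete(f.subs, ch)
		}
	}
}

// demo shows a fast subscriber receiving both messages and a slow one
// being dropped once its one-slot buffer is full.
func demo() (fast []string, slowClosed bool) {
	f := NewFanout()
	f.Connected()
	fastCh := make(chan string, 100)
	slowCh := make(chan string, 1)
	f.Subscribe(fastCh)
	f.Subscribe(slowCh)

	f.Publish("a")
	f.Publish("b")   // slowCh is full here, so it is closed and removed
	f.Disconnected() // closes fastCh

	for m := range fastCh {
		fast = append(fast, m)
	}
	<-slowCh // drain the one buffered message
	_, open := <-slowCh
	return fast, !open
}

func main() {
	fast, slowClosed := demo()
	fmt.Println(fast, slowClosed)
}
```

Only the Fanout ever closes a channel, and it removes the channel from the map under the same lock, so a channel cannot be closed twice.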

Question: What is being transmitted on this channel?

MS: - Either the model that has changed or a ping (essentially just a number, an integer).

MS: - You can send anything, any structure, publish it - it just turns into JSON and that's it.

MS: - We receive a notification from Postgres that contains the name of the table and the identifier. Using the table name and the identifier, we fetch the record we need and send this structure for publication.

Infrastructure

What does it look like in terms of infrastructure? We have 7 physical servers: one is fully dedicated to the database, and the other six run virtual machines. There are 6 copies of the API, and each API virtual machine runs on a separate physical server for reliability.

We have two frontends with Keepalived installed to improve availability, so that one frontend can replace the other if something goes wrong. There are also two copies of the CMS.

There is also a statistics importer. There is a DB Slave from which backups are periodically made. There is Pigeon Pusher, the application that sends push notifications to clients, as well as infrastructure things: Zabbix, Graylog2 and Chef.

In fact, this infrastructure is redundant, because 100 thousand users can be served with fewer servers. But the hardware was there, so we used it (we were told we could, so why not).

Pros of Go

After working on this application, the following obvious advantages of Go came to light.

A great http library. You can build quite a lot with it right out of the box.
Channels, which let us implement the mechanism for sending notifications to clients very easily.
The wonderful race detector, which let us eliminate several critical bugs on the staging infrastructure. Everything that runs on staging is compiled with the -race flag, so we can see on staging what potential problems we have.
Minimalism and simplicity of the language.

We are looking for developers! If anyone is interested, you are welcome.

Questions

Question from the audience (hereinafter - Q): - I think you missed one important point regarding Fan-out. Do I understand correctly that when you send a response to a client, you block if the client doesn't want to read?

MS: - No, we do not block. First, all of this sits behind nginx, so there are no problems with slow clients. Second, each client has a buffered channel that can hold up to a hundred updates ... If we cannot write to the channel, we remove it: if we see that the channel is blocked, we simply close it, and the client will reconnect if there is a problem. So, in principle, no blocking occurs here.

Q: - Couldn't you send the record itself through Listen/Notify right away, rather than the table name and identifier?

MS: - Listen/Notify has a limit of about 8 KB on the payload it sends (8000 bytes in the default configuration). In principle, it would be possible to send the record if we were dealing with a small amount of data, but it seems to me that this way [as we do] is simply more reliable. The restriction is in Postgres itself.

Q: - Do clients receive updates on matches they are not interested in?

MS: - In general, yes. As a rule, 2-3 matches run in parallel, and even that is quite rare. If a client is watching something, it is usually the match that is currently in progress. Also, the client has a local database where all these updates are stored, so even without an Internet connection the client can view all past matches for which it has updates. In effect, we synchronize our server database with the client's local database so that it can work offline.

Q: - Why did you write your own ORM?

Alexey (one of the developers of Look+): - At that time (it was a year ago) there were fewer ORMs than there are now. What I dislike most about the majority of ORMs out there is that they work on empty interfaces. That is, the methods in these ORMs are ready to accept anything: a structure, a pointer to a structure, a number, something completely irrelevant ...

Our ORM generates structures from the data model by itself. As a result, all methods are concrete, do not use reflection, and so on. They accept structs and expect to work with exactly the structs they are given.

Q: - How many people were involved?

MS: - At the initial stage, two people. We started around June, the main part (the first version) was ready in August, and the release was in September.

Q: - Where you describe SSE, you do not use a timeout. Why is that?

MS: - To be honest, SSE is an HTML5 protocol: as far as I understand, the SSE standard is designed for communicating with browsers. It has additional features that let browsers reconnect (and so on), but we did not need them, because our clients could implement any logic for connecting and receiving information. What we built was not exactly SSE, but something similar to SSE. It is not the protocol itself.
There was no need. As far as I understand, the clients implemented the connection mechanism almost from scratch. They basically didn't care.

Q: - What additional utilities did you use?

MS: - We used go vet and golint most actively to keep the style consistent, as well as gofmt. Nothing else was used.

Q: - What did you use for debugging?

MS: - Debugging by and large went with the help of tests. We did not use any debugger, GOP.

Q: - Can you go back to the slide where the Publish function is implemented? Do single-letter variable names bother you?

MS: - No. They have a fairly "narrow" scope. They are not used anywhere except here (except for the internals of this class), and it is very compact - it takes only 7 lines.

Q: - Somehow it's still not intuitive...

MS: - No, no, it's real code! It's not about style. It's just a utilitarian, very small class with only 3 fields inside ...

MS: – By and large, all the data that is synchronized with clients (seasonal matches, players) does not change. Roughly speaking, if we make some other kind of sport in which we need to change the match, we will simply take everything into account in the new version of the client, and the old versions of the client will be banned.

Q: - Did you use any third-party packages for managing dependencies?

MS: We used go dep.

Q: - The title of the talk mentioned something about video, but there was no video in the talk.

MS: - No, there is nothing about video in the title. "Look+" is simply what the app is called.

Q: - You said that you are streaming for clients?..

MS: - We did not deal with video streaming. That was done entirely by Megafon. Yes, I forgot to mention that the application belongs to Megafon.

MS: – Go – for sending all data – by score, by match events, statistics… Go is the whole backend for the application. The client needs to know from somewhere what link to use for the player so that the user can watch the match. We have links to videos and streams that are prepared.

Source: habr.com