We wrote an API and broke XML (twice)

The first MoySklad API appeared 10 years ago. Since then we have been maintaining existing versions of the API and developing new ones, and several versions have already been retired.

This article covers a lot: how the API was created, why a cloud service needs one, what it gives users, which rakes we managed to step on, and what we want to do next.

My name is Oleg Alekseev (oalexeev). I am the technical director and co-founder of MoySklad.

Why make an API for a service

Our clients are tens of thousands of entrepreneurs who actively use cloud solutions: banking, online stores, inventory accounting, CRM. Once you connect to one, it is hard to stop. The fifth, eighth, or tenth service does make an entrepreneur's work easier, but users still transfer data between these cloud services by hand. Work turns into a nightmare.

The obvious solution is to let users move data between cloud services themselves, for example by exporting data as files that can then be uploaded to the desired service. The files usually have to be adapted to each service's format. This is fairly simple manual work, but it becomes harder and harder as the number of services grows.

So the next step is an API. With it, the cloud service benefits by linking several services at one point. The resulting ecosystem attracts new customers through the additional capabilities, and a product with new functionality becomes more profitable and more useful.

Publishing your own programming interfaces also brings in third-party salespeople: programmers who learn about your product through the API. They start building solutions on top of it and earn money by automating their customers' tasks.

The MoySklad accounting system is built on simple processes. The core is working with primary documents: accepting and shipping goods and producing business reports based on those documents. There is also data transfer, for example sending data to cloud accounting and receiving it from banking systems or retail outlets. We also work with online stores: we receive product information and send back stock balances.

The first MoySklad API

Over the 10 years that MoySklad has had an API, we have accumulated all sorts of integrations that let us exchange data, work with banks, make payments, and use external telephony.

In the first year, we made it possible to export any data in XML format. Back then it was much clearer and more familiar for users to keep data offline rather than in some kind of cloud, so we gave them that option. The export was triggered manually from the interface, so it could not yet be called an API.

At the same time, we began working with the company Rusagro. They already used a "grown-up" ERP for production and sales planning, but the loading of vehicles at their factories was automated in MoySklad. That gave us the first rudiments of a real API: our service and the ERP exchanged data by sending one large file containing all types of documents.

This is a good option for batch data exchange, but the documents' dependencies had to be transferred along with them: information about goods, counterparties, and warehouses. Such a dump is not hard to generate on export, but it is rather hard to parse on import, because everything arrives in one package: both new documents and ones that already exist.

The first XML API did not last long: two years later we began to rebuild it. We had made several mistakes at the very start when designing the programming interface.

How the XML API was made: an illustration by one of our architects. By the way, look out for his articles.

Here are our main mistakes:

  1. The JAXB annotations were placed directly on the entity beans. We use Hibernate to talk to the database, and the JAXB markup was added to those same beans. This mistake surfaced almost immediately: any change to the data structure meant we either had to urgently notify everyone using the API or build workarounds to stay compatible with the previous structure.
  2. The API grew as a kind of add-on, and at first we never defined what part of the product it was. We did not even consider whether the API was important or whether we needed to maintain backward compatibility for its first clients. At one point API users made up about 5% of an already small user base, and nobody paid attention to them. The universal filtering we built back then led customers to start using us as a backend. It was not GraphQL, but something similar in spirit: it worked through a large number of query string parameters. Users found such a powerful tool hard to resist and began sending requests to us directly from the UI of their online stores. This was an unpleasant surprise, because providing that kind of service calls for different billing and a different understanding of the API as a product.
  3. Because the API was not developed as a core product, its documentation was produced and published as an afterthought, by reverse engineering. This seems simple and convenient, but it contradicts contract-first development, where a component has a predefined specification: the developer implements it according to that specification and the task, the component is tested, and the client receives a product that matches the analyst's design. Reverse engineering, by contrast, puts a product on the market that simply exists: with workarounds, strange decisions, and reinvented wheels instead of the functionality that was actually needed.
  4. The whole flow of API requests could only be analyzed through Nginx or application server logs. That made it impossible to single out subject areas; the most we could do was break requests down by user and subscriber. Without a way to register applications or clients, the situation cannot really be analyzed. This problem affected the development of the API the least; it is more about understanding which parts of the API are in demand.
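The first mistake in the list is easy to show in miniature. What follows is only a sketch in Python, not the Java/JAXB/Hibernate stack described above, and every class and field name in it is hypothetical. The idea is to keep the persistence model and the public API shape as two separate types, with one mapping function between them, so that a schema change does not automatically become an API change.

```python
from dataclasses import dataclass, asdict


@dataclass
class ProductEntity:
    """Hypothetical persistence model; free to change with the database schema."""
    id: int
    name: str
    purchase_price_kopecks: int  # internal storage detail (assumed for illustration)
    internal_flags: int          # never meant to leave the server


@dataclass(frozen=True)
class ProductV1:
    """Hypothetical public API shape; changes only with a new API version."""
    id: int
    name: str
    price: float


def to_api_v1(entity: ProductEntity) -> ProductV1:
    # The mapping is the only place that knows both shapes, so a schema
    # change is absorbed here instead of leaking out to API clients.
    return ProductV1(
        id=entity.id,
        name=entity.name,
        price=entity.purchase_price_kopecks / 100,
    )


entity = ProductEntity(id=7, name="Tea", purchase_price_kopecks=12550, internal_flags=3)
payload = asdict(to_api_v1(entity))  # what would be serialized for the client
```

If the storage field is later renamed or its unit changes, only `to_api_v1` needs to be touched, and clients of the hypothetical v1 format see no difference.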

Attempt number two: REST API

In 2010, we tried to build a data exchange with the online accounting service BukhSoft. It did not take off. But the integration produced a full-fledged API: a REST exchange service without liberties such as exposing operations as RPC calls. All communication with the API was brought to the standard REST style: the request URL contains the name of the entity, and the operation performed on it is specified by the HTTP method. We also added filtering by the time an entity was last updated, which let users build replication with their own systems.
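The REST convention described here can be sketched with a small in-memory dispatcher. This is an illustration under assumptions, not the actual MoySklad API: the entity name `customerorder`, the `updatedFrom` parameter, and the record layout are all hypothetical. The path names the entity, the HTTP method selects the operation, and a filter by update time lets a client fetch only what changed since its last sync.

```python
from datetime import datetime
from typing import Optional

# Hypothetical in-memory store: entity name -> {id: record}.
STORE = {
    "customerorder": {
        1: {"id": 1, "name": "Order 1", "updated": datetime(2010, 5, 1, 12, 0)},
        2: {"id": 2, "name": "Order 2", "updated": datetime(2010, 6, 1, 9, 30)},
    }
}


def handle(method: str, path: str, params: Optional[dict] = None,
           body: Optional[dict] = None):
    """Dispatch a request: the path names the entity, the method names the operation."""
    params = params or {}
    parts = [p for p in path.split("/") if p]
    records = STORE.setdefault(parts[0], {})
    record_id = int(parts[1]) if len(parts) > 1 else None

    if method == "GET" and record_id is None:
        rows = list(records.values())
        # Hypothetical "updatedFrom" filter: only records changed since the
        # client's last sync, which is what makes replication practical.
        if "updatedFrom" in params:
            since = datetime.fromisoformat(params["updatedFrom"])
            rows = [r for r in rows if r["updated"] >= since]
        return 200, rows
    if method == "GET":
        return (200, records[record_id]) if record_id in records else (404, None)
    if method == "PUT" and record_id is not None:
        records[record_id] = {**(body or {}), "id": record_id, "updated": datetime.now()}
        return 200, records[record_id]
    if method == "DELETE" and record_id is not None:
        if records.pop(record_id, None) is None:
            return 404, None
        return 204, None
    return 405, None  # method not supported for this path


# A replicating client asks only for what changed after its last sync.
status, changed = handle("GET", "/customerorder", {"updatedFrom": "2010-05-15T00:00:00"})
```

A client that stores the time of its last successful sync and passes it as the filter value on the next run only ever downloads the delta.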

In the same year, an API appeared for exporting warehouse and stock balances. The most valuable parts of the system became available to users through the API: the exchange of primary documents and the calculated data on balances and the cost of goods.

In December 2015, RetailCRM published the first third-party library for accessing our API. It came into fairly active use, and while the popularity of the service as a whole was growing, the load on the API grew faster than the load on the web interface. One day, growth turned into a spike.

This spike, marked by the arrow on the left of the load graph, brought the server behind our API to a complete stupor. It took us a week to figure out what exactly was generating the load. It turned out to be those same requests forwarded to our API from customers' front ends. About 50 customers were consuming everything. That was when we recognized one of our mistakes: the complete absence of limits.

As a result, we introduced a limit on the number of simultaneous requests: a single account could have no more than two requests open at the same time. That is enough to work in replication mode for batch data exchange. Those who wanted to use us as a backend were, from that moment on, pushed to follow the pricing plans more closely, since they had to build work across several accounts into their software.
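A limit of this kind can be sketched as a per-account counter of open requests. This is a minimal illustration, not the service's actual implementation: the class names are hypothetical, and rejecting the extra request immediately (rather than queueing it) is an assumption made here for simplicity.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager

MAX_PARALLEL_PER_ACCOUNT = 2  # the limit named in the text


class TooManyRequests(Exception):
    """Raised when an account already has the maximum number of open requests."""


class ParallelLimiter:
    def __init__(self, limit: int = MAX_PARALLEL_PER_ACCOUNT):
        self._limit = limit
        self._active = defaultdict(int)  # account id -> number of open requests
        self._lock = threading.Lock()

    @contextmanager
    def request(self, account_id: str):
        # Reject immediately instead of queueing, so a client that floods
        # the API cannot hold server threads while it waits.
        with self._lock:
            if self._active[account_id] >= self._limit:
                raise TooManyRequests(account_id)
            self._active[account_id] += 1
        try:
            yield
        finally:
            with self._lock:
                self._active[account_id] -= 1


limiter = ParallelLimiter()
with limiter.request("acme"), limiter.request("acme"):
    try:
        with limiter.request("acme"):  # third simultaneous request
            third_allowed = True
    except TooManyRequests:
        third_allowed = False
```

In an HTTP server the exception would typically be translated into a 429 Too Many Requests response; a replicating client that works sequentially never reaches the limit.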

Putting things in order

By 2014, demand for the existing API had become an important part of the business, and the API itself accounted for the largest share of the data exchanged with customers. In 2015, we launched a project to clean up the API. We chose JSON instead of XML as the format and designed the new API around the requirements identified while running the previous version:

  1. Version management. Versioning lets us develop a new version without affecting existing applications or breaking the user experience.
  2. The ability for users to see metadata in the very response they receive.
  3. The ability to exchange large documents. A document with more than 4-5 thousand positions becomes a problem for the server: a long transaction and a long HTTP request. We built a special mechanism that lets you update a document in parts and manage its individual positions by sending them to the server.
  4. Replication tools, carried over from the previous version.
  5. Load limits, the legacy of a rake we stepped on in the previous version. We introduced limits on the number of requests per time period, the number of parallel requests, and the number of requests from a single IP address.
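The third point, updating a large document in parts, can be sketched from the client's side. This is an illustration under assumptions: the batch size of 1000, the operation names, and the `send` callable standing in for an HTTP client are all hypothetical and not taken from the MoySklad API.

```python
from typing import Callable, Dict, Iterator, List


def chunk_positions(positions: List[Dict], chunk_size: int = 1000) -> Iterator[List[Dict]]:
    """Split a document's positions into batches small enough for one request."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(positions), chunk_size):
        yield positions[start:start + chunk_size]


def upload_document(header: Dict, positions: List[Dict],
                    send: Callable[[str, Dict], None]) -> int:
    """Create the document first, then append positions batch by batch.

    `send` is a hypothetical stand-in for an HTTP client: it takes
    (operation, payload), and each call represents one short request.
    Returns the number of requests made.
    """
    send("create_document", header)
    requests_made = 1
    for batch in chunk_positions(positions):
        send("add_positions", {"document": header["id"], "positions": batch})
        requests_made += 1
    return requests_made


# Record what would be sent instead of performing real network calls.
sent = []
positions = [{"product": i, "quantity": 1} for i in range(4500)]
count = upload_document(
    {"id": "doc-1"},
    positions,
    lambda op, payload: sent.append((op, len(payload.get("positions", [])))),
)
```

Each request stays short and each server-side transaction stays small, at the cost of the document being briefly incomplete between batches.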

Since then, we have released two minor versions of the API and launched several specialized APIs, but the overall approach has remained the same. The updated exchange format and new architecture made it possible to fix flaws in the API much faster.

MoySklad API today

Today, the MoySklad API solves many problems:

  • data exchange with online stores, accounting systems, and banks;
  • retrieval of calculated data and reports;
  • use as a backend for client applications: our mobile applications and desktop point-of-sale app work through the API;
  • notifications about data changes in MoySklad (webhooks);
  • telephony;
  • loyalty systems.

Using the API, our CEO Askar Rakhimberdiev (rhino) wrote a Telegram bot in four hours that fetches stock balances through the API: github.com/arahimberdiev/com-lognex-telegram-moysklad-stock

Now for the dry numbers.

Here are our statistics for the old REST API:

  • 400 companies;
  • 600 users;
  • 2 million requests per day;
  • 200 GB/day of outgoing traffic.

And here is where we are now across all the MoySklad APIs:

  • more than 70 integrations (some of them can be viewed here www.moysklad.ru/integratsii);
  • 8500 companies;
  • 12 thousand users;
  • 46 million requests per day;
  • 2 TB/day of outgoing traffic.

What's next

API development plans are under active discussion. We try to take into account the operational experience that users share with us. Not everything can be done at once, but a new version of the API is not far off, with more convenient metadata and a less sprawling structure, OAuth for authentication, and an API for applications embedded in the interface.

You can follow the news on a dedicated site for developers of MoySklad integrations: dev.moysklad.ru.

Source: habr.com
