Vela → smart cache for time series and more

In fintech, we often have to process massive amounts of currency exchange rate data. The data comes from different sources, and each source has its own idea of how to extrapolate rates for tomorrow, the day after tomorrow, next month, and even the next three years. If anyone could predict rates correctly, it would be time to close the business and simply change money back and forth. Some sources are more trustworthy; others deliver complete garbage with an occasional near-correct value, and only for exotic pairs. Our job is to sift through these tens of thousands of values per second and decide what exactly to show to customers. We need to filter the single correct value out of a ton of dirt and silt, like flamingos do at dinner.

A special distinguishing feature of flamingos is a massive downward-curved beak, with which they filter food from water or silt.
 — Wiki

This is how the Vela library was born. It stores a state cache for multiple series of values at given time intervals. Under the hood, it filters out bad and outdated data on the fly and provides access to the latest N validated values for each key (a currency pair, in our case).

Let's say we collect rates for three pairs of currencies. The simplest definition Vela to store the current state would look something like this:

defmodule Pairs do
  use Vela,
    eurusd: [sorter: &Kernel.<=/2],
    eurgbp: [limit: 3, errors: 1],
    eurcad: [validator: Pairs]

  @behaviour Vela.Validator

  @impl Vela.Validator
  def valid?(:eurcad, rate), do: rate > 0
end

Updating values

The function Vela.put/3 sequentially does the following:

  • calls the validator on the value, if one is defined (see chapter Validation below);
  • adds the value to the series of valid values if the validation succeeded, or to the service series :__errors__ otherwise;
  • sorts the series if a sorter is defined for the given key, or simply puts the value at the head of the list (LIFO, see chapter Sorting below);
  • trims the series according to the :limit parameter passed at creation;
  • returns the updated Vela structure.

iex|1 > pairs = %Pairs{}
iex|2 > Vela.put(pairs, :eurcad, 1.0)
#⇒ %Pairs{..., eurcad: [1.0], ...}
iex|3 > Vela.put(pairs, :eurcad, -1.0)
#⇒ %Pairs{__errors__: [eurcad: -1.0], ...}
iex|4 > pairs |> Vela.put(:eurusd, 2.0) |> Vela.put(:eurusd, 1.0)
#⇒ %Pairs{... eurusd: [1.0, 2.0]}
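
The steps listed above can be sketched in plain Elixir without the library. This is only an illustration of the described pipeline, not Vela's actual implementation: the module name PutSketch, the plain map used as state, and the default limit of 5 are assumptions made for the example.

```elixir
defmodule PutSketch do
  # Hypothetical stand-in for the validate / store / sort / trim pipeline.
  # `opts` mirrors the per-series keys used in the text.
  def put(state, serie, value, opts \\ []) do
    validator = Keyword.get(opts, :validator, fn _value -> true end)
    # ASSUMPTION: the default limit of 5 is chosen for this sketch only.
    limit = Keyword.get(opts, :limit, 5)

    if validator.(value) do
      # No sorter: the new value simply goes to the head of the list (LIFO).
      values = [value | Map.get(state, serie, [])]

      values =
        case Keyword.fetch(opts, :sorter) do
          {:ok, sorter} -> Enum.sort(values, sorter)
          :error -> values
        end

      # Trim the series to the configured limit.
      Map.put(state, serie, Enum.take(values, limit))
    else
      # Rejected values are collected under the service key.
      Map.update(state, :__errors__, [{serie, value}], &[{serie, value} | &1])
    end
  end
end

state =
  %{}
  |> PutSketch.put(:eurusd, 2.0, sorter: &Kernel.<=/2)
  |> PutSketch.put(:eurusd, 1.0, sorter: &Kernel.<=/2)
  |> PutSketch.put(:eurcad, -1.0, validator: &(&1 > 0))

# state is now %{eurusd: [1.0, 2.0], __errors__: [eurcad: -1.0]}
```

The real library keeps the per-series options in the module definition, so callers pass only the key and the value.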

Vela also implements Access, so you can update values with any of the standard Kernel functions for deep updates of structures: Kernel.get_in/2, Kernel.put_in/3, Kernel.update_in/3, Kernel.pop_in/2, and Kernel.get_and_update_in/3.
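
To show what implementing Access buys, here is a minimal sketch of a struct that supports these Kernel functions. The module AccessSketch and its fields are hypothetical, and the callbacks are the simplest ones that work; Vela's own implementation may differ.

```elixir
defmodule AccessSketch do
  @behaviour Access

  defstruct eurusd: [], eurcad: []

  # Structs are maps underneath, so the Map functions can do the work.
  @impl Access
  def fetch(%__MODULE__{} = cache, key), do: Map.fetch(cache, key)

  @impl Access
  def get_and_update(%__MODULE__{} = cache, key, fun),
    do: Map.get_and_update(cache, key, fun)

  # Popping resets the series to an empty list instead of deleting the key,
  # which would break the struct.
  @impl Access
  def pop(%__MODULE__{} = cache, key),
    do: {Map.get(cache, key), Map.put(cache, key, [])}
end

cache = %AccessSketch{eurusd: [1.1, 1.2]}
head = get_in(cache, [:eurusd]) |> hd()
cache = update_in(cache, [:eurusd], &[1.3 | &1])
{popped, emptied} = pop_in(cache, [:eurusd])
```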

Validation

The validator can be defined as:

  • an external function with one argument (&MyMod.my_fun/1), which receives only the value to validate;
  • an external function with two arguments (&MyMod.my_fun/2), which receives the pair serie, value to validate;
  • a module that implements Vela.Validator;
  • the configuration parameter threshold and, optionally, compare_by; see chapter Comparison below.

If the validation is successful, the value is added to the list under the corresponding key; otherwise the tuple {serie, value} is sent to :__errors__.
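
The first three validator shapes can be told apart with guards. The sketch below is an assumption about how such a dispatch could look, not the library's code; ValidatorSketch and PositiveRate are hypothetical names, and the threshold form is left out.

```elixir
defmodule ValidatorSketch do
  # One-argument function: receives only the value.
  def valid?(validator, _serie, value) when is_function(validator, 1),
    do: validator.(value)

  # Two-argument function: receives the series name and the value.
  def valid?(validator, serie, value) when is_function(validator, 2),
    do: validator.(serie, value)

  # Module: expected to export valid?/2, as in the Pairs example.
  def valid?(module, serie, value) when is_atom(module),
    do: module.valid?(serie, value)
end

defmodule PositiveRate do
  # Hypothetical module validator with the valid?/2 shape used in the text.
  def valid?(_serie, rate), do: is_number(rate) and rate > 0
end

by_value = ValidatorSketch.valid?(&is_float/1, :eurusd, 1.1)
by_serie = ValidatorSketch.valid?(fn serie, v -> serie == :eurcad and v > 0 end, :eurusd, 1.1)
by_module = ValidatorSketch.valid?(PositiveRate, :eurcad, -1.0)
# by_value is true; by_serie and by_module are false
```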

Comparison

The values stored in these series can be anything. To teach Vela to compare them, pass the compare_by parameter in the series definition (unless the values can be compared with the standard Kernel.</2); this parameter must be of type (Vela.value() -> number()). By default it is just & &1.

You can also pass a comparator parameter in the series definition to calculate delta values (min/max); for example, by passing Date.diff/2 as the comparator, you get correct deltas for dates.

Another convenient option is the threshold parameter, which defines the maximum allowed ratio of a new value to the {min, max} interval. Because it is given as a percentage, the check does not use comparator but still uses compare_by. For example, to set a threshold for datetimes, specify compare_by: &DateTime.to_unix/1 (to get an integer value) and threshold: 1; new values are then accepted only if they fall within the ±band interval around the current values.
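
The text does not spell out how the band is computed, so the following is one possible reading, stated as an assumption: the band is the width of the current {min, max} interval multiplied by threshold, with threshold treated as a plain fraction. ThresholdSketch is a hypothetical module and does not reproduce Vela's formula.

```elixir
defmodule ThresholdSketch do
  def within?(values, new_value, threshold, compare_by \\ & &1)

  # Nothing to compare against yet: accept the value.
  def within?([], _new_value, _threshold, _compare_by), do: true

  def within?(values, new_value, threshold, compare_by) do
    {min, max} = values |> Enum.map(compare_by) |> Enum.min_max()
    # ASSUMPTION: the band is a fraction of the current spread.
    band = (max - min) * threshold
    candidate = compare_by.(new_value)
    candidate >= min - band and candidate <= max + band
  end
end

seen = [~U[2024-01-01 00:00:00Z], ~U[2024-01-01 00:10:00Z]]

# The spread is 600 seconds, so with threshold 1 the band is 600 seconds too.
near = ThresholdSketch.within?(seen, ~U[2024-01-01 00:15:00Z], 1, &DateTime.to_unix/1)
far = ThresholdSketch.within?(seen, ~U[2024-01-01 01:00:00Z], 1, &DateTime.to_unix/1)
# near is true, far is false
```

Under this reading a series with a single value has a zero-width band, so a real implementation needs some rule for that case.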

Finally, you can use Vela.equal?/2 to compare two caches. If the values define a function equal?/2 or compare/2, these functions are used for the comparison; otherwise we simply fall back to ==/2.
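
The fallback order for a single pair of values could be sketched as follows. This illustrates the rule as described, using only standard Elixir; EqualSketch is a hypothetical module, and restricting the lookup to structs of the same type is an assumption of the sketch.

```elixir
defmodule EqualSketch do
  # Two structs of the same type: prefer the module's own equal?/2,
  # then compare/2, then structural equality.
  def equal?(%mod{} = a, %mod{} = b) do
    cond do
      exported?(mod, :equal?) -> mod.equal?(a, b)
      exported?(mod, :compare) -> mod.compare(a, b) == :eq
      true -> a == b
    end
  end

  # Anything else is compared with ==/2.
  def equal?(a, b), do: a == b

  defp exported?(mod, fun),
    do: Code.ensure_loaded?(mod) and function_exported?(mod, fun, 2)
end

sets_equal = EqualSketch.equal?(MapSet.new([1, 2]), MapSet.new([2, 1]))
dates_equal = EqualSketch.equal?(~D[2024-01-01], ~D[2024-01-02])
plain_equal = EqualSketch.equal?(1.5, 1.5)
# sets_equal goes through MapSet.equal?/2, dates_equal through Date.compare/2
```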

Getting values

Handling the current state usually starts with a call to Vela.purge/1, which removes obsolete values (if the validator is tied to timestamps). You can then call Vela.slice/1, which returns a keyword list with series names as keys and the first, current value of each series.

You can also use get_in/2 and pop_in/2 for low-level access to the values in each series.
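
The purge-then-slice flow can be sketched on a plain map. ReadSketch, the {rate, timestamp} tuples, and the 30-unit freshness window are all assumptions made for the illustration; the library keeps the validator in the module definition, which is why its purge needs no validator argument here in the text.

```elixir
defmodule ReadSketch do
  # Drop values that no longer pass the validator (for example, stale ones).
  def purge(state, validator) do
    Map.new(state, fn {serie, values} ->
      {serie, Enum.filter(values, &validator.(serie, &1))}
    end)
  end

  # Keyword list of {serie, head}; series left empty by the purge are skipped.
  def slice(state) do
    for {serie, [head | _rest]} <- state, do: {serie, head}
  end
end

# Each value is a hypothetical {rate, received_at} tuple.
state = %{eurusd: [{1.1, 95}, {1.08, 40}], eurgbp: [{0.85, 10}]}
now = 100
fresh? = fn _serie, {_rate, at} -> now - at < 30 end

current = state |> ReadSketch.purge(fresh?) |> ReadSketch.slice()
# current is [eurusd: {1.1, 95}]
```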

Application

Vela can be extremely useful as a time series cache held in the state of a process such as a GenServer or Agent. We never want to use stale rate values, so we simply keep a process whose state is handled by Vela, with the validator shown below.

@impl Vela.Validator
def valid?(_key, %Rate{} = rate),
  do: Rate.age(rate) < @death_age

With this validator, Vela.purge/1 quietly removes all obsolete values every time we need data. To access the current values, we simply call Vela.slice/1, and when we need a short exchange rate history (the whole series), we simply return it, already sorted and containing only validated values.
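
A process that holds such a cache might look roughly like the sketch below. It uses an Agent and a plain map instead of Vela so that it runs on its own; RateCacheSketch, the {rate, timestamp} tuples, and the 5-second death age are hypothetical.

```elixir
defmodule RateCacheSketch do
  use Agent

  # ASSUMPTION: values older than 5 seconds are considered dead.
  @death_age 5_000

  def start_link(_opts \\ []), do: Agent.start_link(fn -> %{} end)

  # Store the rate with the time it arrived; the newest value goes first.
  def put(cache, pair, rate, now \\ System.monotonic_time(:millisecond)) do
    Agent.update(cache, fn state ->
      Map.update(state, pair, [{rate, now}], &[{rate, now} | &1])
    end)
  end

  # Purge stale values on every read, then return the newest rate per pair.
  def current(cache, now \\ System.monotonic_time(:millisecond)) do
    Agent.get_and_update(cache, fn state ->
      purged =
        Map.new(state, fn {pair, values} ->
          {pair, Enum.filter(values, fn {_rate, at} -> now - at < @death_age end)}
        end)

      newest = for {pair, [{rate, _at} | _rest]} <- purged, do: {pair, rate}
      {newest, purged}
    end)
  end
end

{:ok, cache} = RateCacheSketch.start_link()
RateCacheSketch.put(cache, :eurusd, 1.08, 0)
RateCacheSketch.put(cache, :eurusd, 1.1, 4_000)
rates = RateCacheSketch.current(cache, 6_000)
# rates is [eurusd: 1.1]; the value received at 0 has been purged
```

Passing the clock as an argument keeps the example deterministic; callers would normally rely on the default.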

Happy time series caching!

Source: habr.com
