Creating a stateful skill for Alice with Yandex.Cloud serverless functions and Python

Let's start with the news. Yesterday Yandex.Cloud announced the launch of its serverless computing service, Yandex Cloud Functions. This means that you write only the code of your service (for example, a web application or a chatbot), and the Cloud itself creates and maintains the virtual machines where it runs, and even replicates them if the load increases. You don't have to think at all, which is very convenient. And you pay only for compute time.

However, some people may not have to pay at all. These are the developers of Alice's external skills, that is, the chatbots built into it. Any developer can write, host and register such a skill, and as of today skills don't even need to be hosted: just upload their code to the cloud in the form of the same serverless function.

But there are a couple of nuances. Firstly, your pet code may require some dependencies, and getting them into the Cloud is not trivial. Secondly, any normal chatbot needs to store the state of the dialog somewhere (which makes it stateful); what is the easiest way to do that in a serverless function? Thirdly, how can you write a quick-and-dirty skill for Alice, or even some kind of bot with a non-trivial plot? These nuances are what this article is about.

Mental preparation

For the impatient: I collect the necessary dependencies with a makefile before uploading the function to the cloud, I store the state of the dialog in Yandex Object Storage (it supports the S3 API), and I use my own library, tgalice, to manage the dialog. The result is this demo skill. And now let's go through all of this in a little more detail.

For the slightly less impatient: before diving into my story about complexity, I highly recommend starting simple. Namely, create a single-file echo bot function, then register and test it in the Dialogs developer console; the details are in the Dialogs blog and the cloud documentation. Only then move on.
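To make "a single-file echo bot" concrete, here is a minimal sketch. It assumes the function receives the Alice request as an already parsed dict and must return a dict with `response`, `session` and `version` fields; the function name `handler` and the fallback greeting are my own choices for illustration, so check the Dialogs and Cloud Functions documentation for the exact protocol.

```python
def handler(event, context):
    """Hypothetical entry point: repeat back whatever the user said."""
    # 'command' holds the normalized user utterance in an Alice request.
    command = event.get('request', {}).get('command', '')
    text = command or 'Hello! Say something and I will repeat it.'
    return {
        'version': event.get('version', '1.0'),
        'session': event.get('session'),
        'response': {'text': text, 'end_session': False},
    }
```

With a file named main.py, the entry point would then be given as main.handler.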

If you want to see right away what we are aiming for, ask Alice to turn on the skill "IT horoscope" (in Russian, "Айтишный гороскоп"). The following dialog should take place:

Alice: Hello! You are in the "IT horoscope" skill. Say "Start" to find out what the stars promise you.
User: Start
A: Please state your name.
U: Vasisualy
A: Now tell me the year of your birth. Only four digits, nothing more.
U: second
A: Please try again. State the year of your birth: four digits.
U: 2002
A: Great! Now name the month of your birth.
U: February
A: Excellent! Finally, tell me the day of your birth: just the number, only one or two digits.
U: 18
A: Thank you, Vasisualy! Now we know: you are 17 years old and you are an Aquarius. Well, aren't you lucky! The stars tell you... well, what the stars say you will find out by launching the skill.

The problem is that to maintain even such a simple dialog, you need to remember the user's name and date of birth, and in a serverless environment this is not trivial. Keeping the context in RAM or in a file on disk will not work, because Yandex.Cloud can run the function on several virtual machines at once and switch between them arbitrarily. You have to use some kind of external storage. I chose Object Storage as a fairly inexpensive and simple store that lives right inside Yandex.Cloud (and is therefore probably fast). As a free alternative you can try, for example, a free tier of a cloud-hosted MongoDB somewhere far away. Both Object Storage (it supports the S3 interface) and MongoDB have convenient Python wrappers.
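The pattern can be sketched in a few lines: the handler keeps nothing between calls, and loads and saves the dialog context through a storage object on every message. The class and function names here are hypothetical, and the in-memory dict merely stands in for a real external store such as Object Storage or MongoDB.

```python
import json


class InMemoryStorage:
    """Stand-in for an external store; a real one would live outside the function."""

    def __init__(self):
        self._blobs = {}

    def get(self, key):
        raw = self._blobs.get(key)
        return json.loads(raw) if raw is not None else {}

    def set(self, key, value):
        # Serialize to JSON, as one would before writing to a remote store.
        self._blobs[key] = json.dumps(value)


def handle(user_id, text, storage):
    """Stateless handler: all context comes from and goes back to storage."""
    state = storage.get(user_id)
    if 'name' in state:
        reply = 'Hello again, {}!'.format(state['name'])
    elif state.get('asked_name'):
        state['name'] = text
        reply = 'Nice to meet you, {}!'.format(text)
    else:
        state['asked_name'] = True
        reply = 'Please state your name.'
    storage.set(user_id, state)
    return reply
```

Because the state is keyed by user and re-read on each call, it does not matter which virtual machine happens to serve the next message.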

Another problem is that accessing Object Storage, MongoDB, or any other database or data store requires external dependencies, which have to be uploaded to Yandex Functions together with your function code. And I would like this to be convenient. Alas, it cannot be made completely convenient (as on Heroku), but you can get some basic comfort by writing a script that builds the environment (a makefile).

How to start the horoscope skill

  1. Get ready: log in to some Linux machine. In principle, you can probably work on Windows too, but then you will have to fiddle with running the makefile. In any case, you will need Python 3.6 or later installed.
  2. Clone the example horoscope skill from GitHub.
  3. Register in Yandex.Cloud: https://cloud.yandex.ru
  4. Create two buckets in Object Storage: one with any name, {BUCKET NAME}, and one called tgalice-test-cold-storage (this second name is currently hardcoded in main.py in my example). The first bucket is needed only for deployment, the second for storing dialog states.
  5. Create a service account, give it the editor role, and get static credentials for it, {KEY ID} and {KEY VALUE}; we will use them to write the dialog state. All this is needed so that a function in Yandex.Cloud can access storage in Yandex.Cloud. Someday, I hope, authorization will become automatic, but for now this is how it is.
  6. (Optional) Install the yc command-line interface. You can also create a function through the web interface, but the CLI is good because new features appear in it sooner.
  7. Now you can actually build the dependencies: from the folder with the skill example, run make all on the command line. A bunch of libraries (mostly unnecessary, as usual) will be installed in the dist folder.
  8. Manually upload the dist.zip archive obtained in the previous step to Object Storage (into the {BUCKET NAME} bucket). If you like, you can also do this from the command line, for example with the AWS CLI.
  9. Create a serverless function via the web interface or with the yc utility. For the utility, the command looks like this:

yc serverless function version create \
    --function-name=horoscope \
    --environment=AWS_ACCESS_KEY_ID={KEY ID},AWS_SECRET_ACCESS_KEY={KEY VALUE} \
    --runtime=python37 \
    --package-bucket-name={BUCKET NAME} \
    --package-object-name=dist.zip \
    --entrypoint=main.alice_handler \
    --memory=128M \
    --execution-timeout=3s

When manually creating a function, all parameters are filled in the same way.

Now the function you created can be tested through the developer console, and then the skill can be finalized and published.

What's under the hood

The makefile actually contains a fairly simple script that installs the dependencies and packs them into the archive dist.zip, something like this:

mkdir -p dist/
pip3 install -r requirements.txt --target dist/ 
cp main.py dist/main.py
cp form.yaml dist/form.yaml
cd dist && zip --exclude '*.pyc' -r ../dist.zip ./*
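For those who cannot easily run a makefile (on Windows, for example), the archiving step can be approximated with the Python standard library. This is only a sketch of an equivalent, not part of the example repository; the function name build_zip is my own, and the archive must be written outside the folder being packed.

```python
import os
import zipfile


def build_zip(src_dir, zip_path):
    """Pack everything under src_dir into zip_path, skipping *.pyc files."""
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(src_dir):
            for file_name in files:
                if file_name.endswith('.pyc'):
                    continue
                full_path = os.path.join(root, file_name)
                # Store paths relative to src_dir so main.py sits at the archive root.
                archive.write(full_path, os.path.relpath(full_path, src_dir))
```

After installing the dependencies into dist/ (pip's --target option does that) and copying main.py and form.yaml there, calling build_zip('dist', 'dist.zip') should produce an archive with the same layout as the zip command above.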

The rest is a few simple tools wrapped in the tgalice library. The process of filling in the user's data is described by the config form.yaml:

form_name: 'horoscope_form'
start:
  regexp: 'старт|нач(ать|ни)'
  suggests:
    - Старт
fields:
  - name: 'name'
    question: Пожалуйста, назовите своё имя.
  - name: 'year'
    question: Теперь скажите мне год вашего рождения. Только четыре цифры, ничего лишнего.
    validate_regexp: '^[0-9]{4}$'
    validate_message: Пожалуйста, попробуйте ещё раз. Назовите год вашего рождения - четыре цифры.
  - name: 'month'
    question: Замечательно! Теперь назовите месяц вашего рождения.
    options:
      - январь
     ...
      - декабрь
    validate_message: То, что вы назвали, не похоже на месяц. Пожалуйста, назовите месяц вашего рождения, без других слов.
  - name: 'day'
    question: Отлично! Наконец, назовите мне дату вашего рождения - только число, всего одна или две цифры.
    validate_regexp: '[0123]?\d$'
    validate_message: Пожалуйста, попробуйте ещё раз. Вам нужно назвать число своего рождения (например, двадцатое); это одна или две цифры.
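To see what such validation patterns accept, here is a small check using the year pattern from the config and the day pattern written with \d. How tgalice actually applies validate_regexp is not shown in this article; the sketch assumes the pattern is matched from the start of the text (re.match), and the helper is_valid is hypothetical.

```python
import re

YEAR_PATTERN = r'^[0-9]{4}$'
DAY_PATTERN = r'[0123]?\d$'


def is_valid(pattern, text):
    """Assumed semantics: the pattern must match from the start of the text."""
    return re.match(pattern, text) is not None


print(is_valid(YEAR_PATTERN, '2002'))    # True
print(is_valid(YEAR_PATTERN, 'second'))  # False
print(is_valid(DAY_PATTERN, '18'))       # True
print(is_valid(DAY_PATTERN, '2002'))     # False
# The day pattern has no leading ^, so re.search would accept the trailing "02":
print(re.search(DAY_PATTERN, '2002') is not None)  # True
```

The last line shows why the anchoring matters: an unanchored search would let a four-digit year pass as a day of the month.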

A Python class takes over the work of parsing this config and computing the final result:

class CheckableFormFiller(tgalice.dialog_manager.form_filling.FormFillingDialogManager):
    SIGNS = {
        'январь': 'Козерог',
        ...
    }

    def handle_completed_form(self, form, user_object, ctx):
        response = tgalice.dialog_manager.base.Response(
            text='Спасибо, {}! Теперь мы знаем: вам {} лет, и вы {}. \n'
                 'Вот это вам, конечно, повезло! Звёзды говорят вам: {}'.format(
                form['fields']['name'],
                2019 - int(form['fields']['year']),
                self.SIGNS[form['fields']['month']],
                random.choice(FORECASTS),
            ),
            user_object=user_object,
        )
        return response

More precisely, the base class FormFillingDialogManager takes care of filling in the "form", and the child class's method handle_completed_form says what to do once the form is complete.
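The division of labor can be illustrated with a toy form filler that does not use tgalice at all. Everything here (FIELDS, step, on_complete, the shape of the state dict) is a hypothetical simplification rather than the library's real implementation: one function walks through the fields and validates answers, and a separate hook decides what to say at the end.

```python
import re

FIELDS = [
    {'name': 'name', 'question': 'Please state your name.'},
    {'name': 'year', 'question': 'Now tell me the year of your birth.',
     'validate_regexp': r'^[0-9]{4}$',
     'validate_message': 'Please try again: four digits.'},
]


def on_complete(fields):
    """Plays the role of handle_completed_form: build the final answer."""
    return 'Thank you, {}! You were born in {}.'.format(fields['name'], fields['year'])


def step(state, text):
    """Process one user message; return (reply, new_state, done)."""
    fields = dict(state.get('fields', {}))
    index = state.get('index')
    if index is None:
        # The form has not started yet: ask the first question.
        return FIELDS[0]['question'], {'index': 0, 'fields': fields}, False
    field = FIELDS[index]
    pattern = field.get('validate_regexp')
    if pattern and not re.match(pattern, text):
        # Invalid answer: stay on the same field and repeat the request.
        return field['validate_message'], {'index': index, 'fields': fields}, False
    fields[field['name']] = text
    index += 1
    new_state = {'index': index, 'fields': fields}
    if index == len(FIELDS):
        return on_complete(fields), new_state, True
    return FIELDS[index]['question'], new_state, False
```

The state returned by step is a plain dict, so it can be serialized to JSON and stored between messages.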

In addition to this main flow of the dialog, the user must also be greeted, given help on the "help" command, and let out of the skill on the "exit" command. tgalice has a template for this as well, so the whole dialog manager is assembled from pieces:

dm = tgalice.dialog_manager.CascadeDialogManager(
    tgalice.dialog_manager.GreetAndHelpDialogManager(
        greeting_message=DEFAULT_MESSAGE,
        help_message=DEFAULT_MESSAGE,
        exit_message='До свидания, приходите в навык "Айтишный гороскоп" ещё!'
    ),
    CheckableFormFiller('form.yaml', default_message=DEFAULT_MESSAGE)
)

CascadeDialogManager works simply: it tries to apply each of its components in turn to the current state of the dialog, and picks the first one that is relevant.
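The cascade idea itself fits in a dozen lines. The sketch below is not the tgalice implementation: components here are plain functions (all names hypothetical) that return a reply when they consider themselves relevant and None otherwise, and they look only at the text rather than at the full dialog state.

```python
def make_cascade(*components):
    """Return a responder that asks each component in turn and uses the first reply."""
    def respond(text):
        for component in components:
            reply = component(text)
            if reply is not None:
                return reply
        return None
    return respond


def help_component(text):
    return 'Say "Start" to begin.' if text.strip().lower() == 'help' else None


def exit_component(text):
    return 'Goodbye!' if text.strip().lower() == 'exit' else None


def fallback_component(text):
    # Always relevant, so it must come last in the cascade.
    return 'Sorry, I did not understand that.'


respond = make_cascade(help_component, exit_component, fallback_component)
```

The order of the components is the priority order, which is why the catch-all goes at the end.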

In response to each message, the dialog manager returns a Python object, Response, which can then be converted into plain text or into a message for Alice or Telegram, depending on where the bot is running; it also contains the updated state of the dialog, which needs to be saved. All this machinery is handled by another class, DialogConnector, so the actual script for launching the skill on Yandex Functions looks like this:

...
session = boto3.session.Session()
s3 = session.client(
    service_name='s3',
    endpoint_url='https://storage.yandexcloud.net',
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'],
    region_name='ru-central1',
)
storage = tgalice.session_storage.S3BasedStorage(s3_client=s3, bucket_name='tgalice-test-cold-storage')
connector = tgalice.dialog_connector.DialogConnector(dialog_manager=dm, storage=storage)
alice_handler = connector.serverless_alice_handler

As you can see, most of this code creates a connection to the S3 interface of Object Storage. How exactly this connection is used, you can read in the tgalice code.
The last line creates the function alice_handler, the one we told Yandex.Cloud to call when we set the parameter --entrypoint=main.alice_handler.
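As a rough idea of what a storage class can do with such a client, here is a sketch of a JSON key-value store on top of the two S3 client calls get_object and put_object. It is not the tgalice code: the class S3JsonStorage is hypothetical, and FakeS3Client is an in-memory stand-in that lets the example run without credentials or a network.

```python
import io
import json


class S3JsonStorage:
    """Store one JSON document per key in a bucket (hypothetical sketch)."""

    def __init__(self, client, bucket):
        self.client = client
        self.bucket = bucket

    def get(self, key):
        try:
            body = self.client.get_object(Bucket=self.bucket, Key=key)['Body'].read()
        except self.client.exceptions.NoSuchKey:
            return {}  # unknown user: start with an empty state
        return json.loads(body.decode('utf-8'))

    def set(self, key, value):
        data = json.dumps(value).encode('utf-8')
        self.client.put_object(Bucket=self.bucket, Key=key, Body=data)


class FakeS3Client:
    """In-memory stand-in mimicking the two client calls used above."""

    class exceptions:
        class NoSuchKey(Exception):
            pass

    def __init__(self):
        self._objects = {}

    def get_object(self, Bucket, Key):
        if (Bucket, Key) not in self._objects:
            raise self.exceptions.NoSuchKey(Key)
        return {'Body': io.BytesIO(self._objects[(Bucket, Key)])}

    def put_object(self, Bucket, Key, Body):
        self._objects[(Bucket, Key)] = Body
```

In the real skill, the client created by session.client(...) would be passed instead of the fake one; whether a missing object surfaces as NoSuchKey on Object Storage is an assumption worth checking.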

That, in fact, is all. Makefiles for building, S3-compatible Object Storage for storing context, and the Python library tgalice. Together with serverless functions and the expressiveness of Python, this is enough to develop a skill the way a sane person would.

You may ask why tgalice had to be created at all. All the boring code that moves JSON from request to response and from storage to memory and back lives in it. It also contains a regular-expression matcher, a function for recognizing that an inflected form of "February" is similar to "February", and other poor man's NLU. My idea is that this should already be enough to sketch skill prototypes in yaml files without being too distracted by technical details.
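As an illustration of this kind of poor man's NLU, here is a sketch that maps an inflected Russian month name such as "февраля" to its dictionary form "февраль". It uses difflib from the standard library, which is a technique I swapped in for the example and not necessarily what tgalice does; the function closest_month and the cutoff value are assumptions.

```python
import difflib

MONTHS = ['январь', 'февраль', 'март', 'апрель', 'май', 'июнь',
          'июль', 'август', 'сентябрь', 'октябрь', 'ноябрь', 'декабрь']


def closest_month(text, cutoff=0.6):
    """Return the month whose spelling is closest to text, or None if nothing is close."""
    matches = difflib.get_close_matches(text.strip().lower(), MONTHS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(closest_month('февраля'))  # февраль
print(closest_month('hello'))    # None
```

Character-level similarity like this is crude, but for a closed list of options such as month names it is often good enough for a prototype.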

If you want more serious NLU, you can bolt Rasa or DeepPavlov onto your skill, but setting them up will take some extra fiddling, especially on serverless. If you don't feel like coding at all, you should use a visual builder such as Aimylogic. When creating tgalice, I had some kind of middle path in mind. Let's see what comes of it.

Well, now join the Alice skills developer chat, read the documentation, and create amazing skills!

Source: habr.com
