How to delegate simple reports to a robot. We write a bot in Python and Google BigQuery

How to delegate simple reports to a robot. We write a bot in Python and Google BigQuery

Do you have tasks that repeat day after day, week after week? For example, writing reports. You request data, analyze, visualize (make graphs, charts), and then send it to your boss. But what if all this is automated?

In this tutorial, we will create a bot for Telegram that will help automate reporting. And the coolest thing is that the whole program will consist of only 50 lines of code! If you are creating a bot for Telegram for the first time, then you should also read this one post.

Skillbox recommends: Practical course Python developer from scratch.

We remind you: for all readers of "Habr" - a discount of 10 rubles when enrolling in any Skillbox course using the "Habr" promotional code.

Let's start

Installing libraries

We will use google-cloud-bigquery to get data from Google BigQuery. Matplotlib, Numpy ΠΈ pandas help you visualize the data. python-telegram-bot will send the finished data to Telegram.

pip3 install google-cloud-bigquery matplotlib numpy pandas python-telegram-bot

Connecting the Google BigQuery API

If we want to use the service, we need to connect the Google BigQuery API. For this we go to Google Developers Console and create a new project (or select an existing one).

In the control panel, select ENABLE APIS AND SERVICES and look for the BigQuery API.

How to delegate simple reports to a robot. We write a bot in Python and Google BigQuery

Select Enable to connect the API.

How to delegate simple reports to a robot. We write a bot in Python and Google BigQuery

Create an account key

Let's go again to Google Developers Console, select the tab Credentials, Create credentials and Service account key.

Then - New service account, and in the Service account name field, enter a name.

From the Role drop-down list, select Project > Owner, then Create.

How to delegate simple reports to a robot. We write a bot in Python and Google BigQuery

The file that will automatically load is called creds.json.

We expose GOOGLE_APPLICATION_CREDENTIALS, specifying the path to creds.json in the terminal.

export GOOGLE_APPLICATION_CREDENTIALS='[PATH_TO_CREDS.JSON]'

If everything went well, it's time to start writing the program.

Creating an application

For the tutorial, we will use data from bigquery-public-data.stackoverflow, for our report, we will choose the number of daily publications.

Everything is quite simple.

Query the table -> Visualize the data -> Save the visualization -> Send the image

Let's create one function to define each thread.

Query to BigQuery

First, let's import the library.

from google.cloud import bigquery

We create a function called query_to_bigquery, where the parameter is query.

def query_to_bigquery(query):
    client = bigquery.Client()
    query_job = client.query(query)
    result = query_job.result()
    dataframe = result.to_dataframe()
    return dataframe

This function will return the query as a data frame.

Visualizing data

To solve this problem, choose matplotlib.

import matplotlib.pyplot as plt

We need five parameters, where x is the x-axis data, x_label is the title for the axis, y is the y-axis data, y_label is the title for the axis, and title is the title of the entire visualization.

def visualize_bar_chart(x, x_label, y, y_label, title):
    plt.title(title)
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    index = np.arange(len(x))
    plt.xticks(index, x, fontsize=5, rotation=30)
    plt.bar(index, y)
    return plt

Saving the image

Now let's use two functions to create a visualization and save it.

We will send the number of posts published daily. First, we write a request.

query = """
        SELECT DATE(creation_date) date, COUNT(*) total_posts
        FROM `bigquery-public-data.stackoverflow.post_history`
        GROUP BY 1
        HAVING date > DATE_SUB('2018-12-02', INTERVAL 14 DAY)
        ORDER BY 1
        """

The query helps collect data for two weeks starting December 2, 2018.

We use this date because 2018-12-02 is the latest data recorded in bigquery-public-data.stackoverflow.post_history, otherwise you can use CURRENT_DATE() to get the newest data.

We call the query_to_bigquery function to get the data.

dataframe = query_to_bigquery(query)

We then use the date data column for the x-axis and the total_posts column for the y-axis.

x = dataframe['date'].tolist()
y = dataframe['total_posts'].tolist()

Visualize using the visualize_bar_chart function and save as an image.

plt = visualize_bar_chart(x=x, x_label='Date', y=y, y_label='Total Posts', title='Daily Posts')
plt.savefig('viz.png')

We wrap this code in a function called get_and_save_image.

def get_and_save_image():
    query = """
            SELECT DATE(creation_date) date, COUNT(*) total_posts
            FROM `bigquery-public-data.stackoverflow.post_history`
            GROUP BY 1
            HAVING date > DATE_SUB('2018-12-02', INTERVAL 14 DAY)
            ORDER BY 1
            """
    dataframe = query_to_bigquery(query)  
    x = dataframe['date'].tolist()
    y = dataframe['total_posts'].tolist()
    plt = visualize_bar_chart(x=x, x_label='Date', y=y, y_label='Total Posts', title='Daily Posts')
    plt.savefig('viz.png')

Sending an image

In order to send a report to the addressee, you need to know the chat_id parameter.

Use userinfobot and type /start. The bot responds with the necessary information, the chat_id is contained in the id field.

Now we create the send_image function. It will use the get_and_save_image function to get and save the image. And then we send everything to the right contact.

def send_image(bot, update):
    get_and_save_image()
    chat_id = 'CHAT_ID_RECEIVER'
    bot.send_photo(chat_id=chat_id, photo=open('viz.png','rb'))

Main program

Finally, we create another function, main, to start the application. Don't forget to change YOUR_TOKEN for the bot.

Remember: this program will send the image automatically at the time you specify. For example, we will send a report at nine in the morning every day.

def main():
    updater = Updater('YOUR_TOKEN')
    updater.job_queue.run_daily(send_image, time=datetime.datetime.strptime('9:00AM', '%I:%M%p').time(), days=(0,1,2,3,4,5,6))
    updater.start_polling()
    updater.idle()
 
if __name__ == '__main__':
    main()

As a result, our application will look like this:

from google.cloud import bigquery
from telegram.ext import Updater
 
import matplotlib.pyplot as plt
import numpy as np
import datetime
 
def query_to_bigquery(query):
    client = bigquery.Client()
    query_job = client.query(query)
    result = query_job.result()
    dataframe = result.to_dataframe()
    return dataframe
 
def visualize_bar_chart(x, x_label, y, y_label, title):
    plt.title(title)
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    index = np.arange(len(x))
    plt.xticks(index, x, fontsize=5, rotation=30)
    plt.bar(index, y)
    return plt
 
def get_and_save_image():
    query = """
            SELECT DATE(creation_date) date, COUNT(*) total_posts
            FROM `bigquery-public-data.stackoverflow.post_history`
            GROUP BY 1
            HAVING date > DATE_SUB('2018-12-02', INTERVAL 14 DAY)
            ORDER BY 1
            """
    dataframe = query_to_bigquery(query)  
    x = dataframe['date'].tolist()
    y = dataframe['total_posts'].tolist()
    plt = visualize_bar_chart(x=x, x_label='Date', y=y, y_label='Total Posts', title='Daily Posts')
    plt.savefig('viz.png')
 
def send_image(bot, update):
    get_and_save_image()
    chat_id = 'CHAT_ID_RECEIVER'
    bot.send_photo(chat_id=chat_id, photo=open('viz.png', 'rb'))
 
def main():
    updater = Updater('YOUR_TOKEN')
    updater.job_queue.run_daily(send_image, time=datetime.datetime.strptime('9:00AM', '%I:%M%p').time(), days=(0,1,2,3,4,5,6))
    updater.start_polling()
    updater.idle()
 
if __name__ == '__main__':
main()

We save the file and call it main.py.

Launch the application by typing the following command in the terminal:

python3 main.py

All is ready. Now we have a robot consisting of 50 lines of code that generates reports without our intervention.

Let's check the bot henceby selecting the /send command.

How to delegate simple reports to a robot. We write a bot in Python and Google BigQuery

You can get the finished code at my github.

Skillbox recommends:

Source: habr.com

Add a comment