ื›ื™ืฆื“ ืœื”ืืฆื™ืœ ื“ื•ื—ื•ืช ืคืฉื•ื˜ื™ื ืœืจื•ื‘ื•ื˜. ื›ืชื™ื‘ืช ื‘ื•ื˜ ื‘-Python ื•ื‘-Google BigQuery

ื›ื™ืฆื“ ืœื”ืืฆื™ืœ ื“ื•ื—ื•ืช ืคืฉื•ื˜ื™ื ืœืจื•ื‘ื•ื˜. ื›ืชื™ื‘ืช ื‘ื•ื˜ ื‘-Python ื•ื‘-Google BigQuery

ื”ืื ื™ืฉ ืœืš ืžืฉื™ืžื•ืช ืฉื—ื•ื–ืจื•ืช ืขืœ ืขืฆืžืŸ ื™ื•ื ืื—ืจ ื™ื•ื, ืฉื‘ื•ืข ืื—ืจ ืฉื‘ื•ืข? ืœืžืฉืœ, ื›ืชื™ื‘ืช ื“ื•ื—ื•ืช. ืืชื” ืžื‘ืงืฉ ื ืชื•ื ื™ื, ืžื ืชื—ื™ื ืื•ืชื, ืžื“ืžื™ื™ื ื™ื ืื•ืชื (ื™ื•ืฆืจื™ื ื’ืจืคื™ื, ืชืจืฉื™ืžื™ื), ื•ืื– ืฉื•ืœื—ื™ื ืื•ืชื ืœื‘ื•ืก ืฉืœืš. ืื‘ืœ ืžื” ืื ื›ืœ ื–ื” ื”ื™ื” ืื•ื˜ื•ืžื˜ื™?

ื‘ืžื“ืจื™ืš ื–ื” ื ื™ืฆื•ืจ ื‘ื•ื˜ ืœื˜ืœื’ืจื ืฉื™ืขื–ื•ืจ ืœื”ืคื•ืš ืืช ื”ื“ื™ื•ื•ื— ืœืื•ื˜ื•ืžื˜ื™. ื•ื”ื“ื‘ืจ ื”ื›ื™ ืžื’ื ื™ื‘ ื”ื•ื ืฉื”ืชื•ื›ื ื™ืช ื›ื•ืœื” ืชื”ื™ื” ืžื•ืจื›ื‘ืช ืž-50 ืฉื•ืจื•ืช ืงื•ื“ ื‘ืœื‘ื“! ืื ืืชื” ื™ื•ืฆืจ ื‘ื•ื˜ ืขื‘ื•ืจ ื˜ืœื’ืจื ื‘ืคืขื ื”ืจืืฉื•ื ื”, ืขืœื™ืš ืœืงืจื•ื ื’ื ืืช ื–ื” ืœื”ืฆื™ื‘.

Skillbox ืžืžืœื™ืฆื”: ืงื•ืจืก ืžืขืฉื™ ืžืคืชื— Python ืžืืคืก.

ืื ื• ืžื–ื›ื™ืจื™ื: ืœื›ืœ ืงื•ืจืื™ Habr - ื”ื ื—ื” ืฉืœ 10 ืจื•ื‘ืœ ื‘ืขืช ื”ืจืฉืžื” ืœื›ืœ ืงื•ืจืก Skillbox ื‘ืืžืฆืขื•ืช ืงื•ื“ ื”ื”ื˜ื‘ื” ืฉืœ Habr.

ื‘ื•ืื• ื ืชื—ื™ืœ

ื”ืชืงื ืช ืกืคืจื™ื•ืช

ืื ื—ื ื• ื ืฉืชืžืฉ google-cloud-bigquery ื›ื“ื™ ืœืงื‘ืœ ื ืชื•ื ื™ื ืž-Google BigQuery. matplotlib, ืงื”ื•ืช ะธ ื“ื•ื‘ื™ ืคื ื“ื” ื™ืขื–ื•ืจ ืœืš ืœื“ืžื™ื™ืŸ ืืช ื”ื ืชื•ื ื™ื ืฉืœืš. ืคื™ืชื•ืŸ-ื˜ืœื’ืจื-ื‘ื•ื˜ ื™ืฉืœื— ืืช ื”ื ืชื•ื ื™ื ื”ืžื•ื’ืžืจื™ื ืœื˜ืœื’ืจื.

pip3 ืœื”ืชืงื™ืŸ ืืช google-cloud-bigquery matplotlib numpy pandas python-telegram-bot

ื—ื™ื‘ื•ืจ Google BigQuery API

ืื ืื ื—ื ื• ืจื•ืฆื™ื ืœื”ืฉืชืžืฉ ื‘ืฉื™ืจื•ืช, ืื ื—ื ื• ืฆืจื™ื›ื™ื ืœื—ื‘ืจ ืืช Google BigQuery API. ื›ื“ื™ ืœืขืฉื•ืช ื–ืืช ืื ื—ื ื• ื”ื•ืœื›ื™ื ืœ Google Developers Console ื•ืฆื•ืจ ืคืจื•ื™ืงื˜ ื—ื“ืฉ (ืื• ื‘ื—ืจ ืงื™ื™ื).

ื‘ืœื•ื— ื”ื‘ืงืจื”, ื‘ื—ืจ ENABLE APIS AND SERVICES ื•ื—ืคืฉ ืืช BigQuery API.

ื›ื™ืฆื“ ืœื”ืืฆื™ืœ ื“ื•ื—ื•ืช ืคืฉื•ื˜ื™ื ืœืจื•ื‘ื•ื˜. ื›ืชื™ื‘ืช ื‘ื•ื˜ ื‘-Python ื•ื‘-Google BigQuery

ื‘ื—ืจ ื”ืคืขืœ ื›ื“ื™ ืœื—ื‘ืจ ืืช ื”-API.

ื›ื™ืฆื“ ืœื”ืืฆื™ืœ ื“ื•ื—ื•ืช ืคืฉื•ื˜ื™ื ืœืจื•ื‘ื•ื˜. ื›ืชื™ื‘ืช ื‘ื•ื˜ ื‘-Python ื•ื‘-Google BigQuery

ืฆื•ืจ ืžืคืชื— ื—ืฉื‘ื•ืŸ

ื‘ื•ื ื ืœืš ืฉื•ื‘ Google Developers Console, ื‘ื—ืจ ื‘ื›ืจื˜ื™ืกื™ื™ื” ืื™ืฉื•ืจื™ื, ืฆื•ืจ ืื™ืฉื•ืจื™ื ื•ืžืคืชื— ื—ืฉื‘ื•ืŸ ืฉื™ืจื•ืช.

ืœืื—ืจ ืžื›ืŸ - ื—ืฉื‘ื•ืŸ ืฉื™ืจื•ืช ื—ื“ืฉ, ื•ื”ื–ืŸ ืืช ื”ืฉื ื‘ืฉื“ื” ืฉื ื—ืฉื‘ื•ืŸ ืฉื™ืจื•ืช.

ืžื”ืจืฉื™ืžื” ื”ื ืคืชื—ืช ืชืคืงื™ื“, ื‘ื—ืจ ืคืจื•ื™ืงื˜ > ื‘ืขืœื™ื ื•ืœืื—ืจ ืžื›ืŸ ืฆื•ืจ.

ื›ื™ืฆื“ ืœื”ืืฆื™ืœ ื“ื•ื—ื•ืช ืคืฉื•ื˜ื™ื ืœืจื•ื‘ื•ื˜. ื›ืชื™ื‘ืช ื‘ื•ื˜ ื‘-Python ื•ื‘-Google BigQuery

ื”ืงื•ื‘ืฅ ืฉื™ื•ืจื“ ืื•ื˜ื•ืžื˜ื™ืช ื ืงืจื creds.json.

ื”ื’ื“ืจ GOOGLE_APPLICATION_CREDENTIALS, ืฆื™ื•ืŸ ื”ื ืชื™ื‘ ืืœ creds.json ื‘ื˜ืจืžื™ื ืœ.

ื™ื™ืฆื•ื GOOGLE_APPLICATION_CREDENTIALS='[PATH_TO_CREDS.JSON]'

ืื ื”ื›ืœ ื”ืœืš ื›ืฉื•ืจื”, ื”ื’ื™ืข ื”ื–ืžืŸ ืœื”ืชื—ื™ืœ ื‘ื›ืชื™ื‘ืช ื”ืชื•ื›ื ื™ืช.

ืฆื•ืจ ืืคืœื™ืงืฆื™ื”

ืขื‘ื•ืจ ื”ืžื“ืจื™ืš ื ืฉืชืžืฉ ื‘ื ืชื•ื ื™ื ืž-bigquery-public-data.stackoverflow, ืขื‘ื•ืจ ื”ื“ื•ื— ืฉืœื ื• ื ื‘ื—ืจ ืืช ืžืกืคืจ ื”ืคืจืกื•ืžื™ื ื”ื™ื•ืžื™ื™ื.

ื”ื›ืœ ื“ื™ ืคืฉื•ื˜.

ืฉืื™ืœืชื” ื‘ื˜ื‘ืœื” -> ื”ื“ืžื™ื™ืŸ ืืช ื”ื ืชื•ื ื™ื -> ืฉืžื•ืจ ืืช ื”ื”ื“ืžื™ื” -> ืฉืœื— ืืช ื”ืชืžื•ื ื”

ื‘ื•ืื• ื ื™ืฆื•ืจ ืคื•ื ืงืฆื™ื” ืื—ืช ื›ื“ื™ ืœื”ื’ื“ื™ืจ ื›ืœ ืฉืจืฉื•ืจ.

ืฉืื™ืœืชื” ืœ-BigQuery

ืจืืฉื™ืช ืื ื• ืžื™ื™ื‘ืื™ื ืืช ื”ืกืคืจื™ื™ื”.

ืž-google.cloud ื™ื™ื‘ื•ื โ€‹โ€‹bigquery

ืื ื• ื™ื•ืฆืจื™ื ืคื•ื ืงืฆื™ื” ื‘ืฉื query_to_bigquery, ื›ืืฉืจ ื”ืคืจืžื˜ืจ ื”ื•ื query.

def query_to_bigquery(query):
    client = bigquery.Client()
    query_job = client.query(query)
    result = query_job.result()
    dataframe = result.to_dataframe()
    return dataframe

ืคื•ื ืงืฆื™ื” ื–ื• ืชื—ื–ื™ืจ ืืช ื”ื‘ืงืฉื” ื›ืžืกื’ืจืช ื ืชื•ื ื™ื.

ื”ื“ืžื™ื™ืช ื ืชื•ื ื™ื

ื›ื“ื™ ืœืคืชื•ืจ ื‘ืขื™ื” ื–ื•, ื‘ื—ืจ matplotlib.

ื™ื‘ื•ื matplotlib.pyplot ื›- plt

ืื ื—ื ื• ืฆืจื™ื›ื™ื ื—ืžื™ืฉื” ืคืจืžื˜ืจื™ื, ื›ืืฉืจ x ื”ื•ื ื ืชื•ื ื™ ืฆื™ืจ ื”-x, x_label ื”ื™ื ื”ื›ื•ืชืจืช ืฉืœ ื”ืฆื™ืจ, y ื”ื™ื ื ืชื•ื ื™ ืฆื™ืจ ื”-y, y_label ื”ื™ื ื”ื›ื•ืชืจืช ืฉืœ ื”ืฆื™ืจ, ื•ื›ื•ืชืจืช ื”ื™ื ื”ื›ื•ืชืจืช ืฉืœ ื”ื”ื“ืžื™ื” ื›ื•ืœื”.

def visualize_bar_chart(x, x_label, y, y_label, title):
    plt.title(title)
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    index = np.arange(len(x))
    plt.xticks(index, x, fontsize=5, rotation=30)
    plt.bar(index, y)
    return plt

ืฉืžื•ืจ ืืช ื”ืชืžื•ื ื”

ื›ืขืช ื ืฉืชืžืฉ ื‘ืฉืชื™ ืคื•ื ืงืฆื™ื•ืช ื›ื“ื™ ืœื™ืฆื•ืจ ื”ื“ืžื™ื” ื•ืœืฉืžื•ืจ ืื•ืชื”.

ืื ื• ื ืฉืœื— ืืช ืžืกืคืจ ื”ืคื•ืกื˜ื™ื ื”ืžืชืคืจืกืžื™ื ืžื“ื™ ื™ื•ื. ืจืืฉื™ืช ื ื›ืชื•ื‘ ื‘ืงืฉื”.

query = """
        SELECT DATE(creation_date) date, COUNT(*) total_posts
        FROM `bigquery-public-data.stackoverflow.post_history`
        GROUP BY 1
        HAVING date > DATE_SUB('2018-12-02', INTERVAL 14 DAY)
        ORDER BY 1
        """

ื”ืฉืื™ืœืชื” ืขื•ื–ืจืช ื‘ืื™ืกื•ืฃ ื ืชื•ื ื™ื ืœืžืฉืš ืฉื‘ื•ืขื™ื™ื ื”ื—ืœ ืžื”-2 ื‘ื“ืฆืžื‘ืจ 2018.

ืื ื• ืžืฉืชืžืฉื™ื ื‘ืชืืจื™ืš ื–ื” ืžื›ื™ื•ื•ืŸ ืฉ-2018-12-02 ื”ื•ื ื”ื ืชื•ื ื™ื ื”ืขื“ื›ื ื™ื™ื ื‘ื™ื•ืชืจ ืฉืชื•ืขื“ื• ื‘-bigquery-public-data.stackoverflow.post_history, ื‘ืžืงืจื™ื ืื—ืจื™ื ืืชื” ื™ื›ื•ืœ ืœื”ืฉืชืžืฉ ื‘-CURRENT_DATE() ื›ื“ื™ ืœืงื‘ืœ ืืช ื”ื ืชื•ื ื™ื ื”ื—ื“ืฉื™ื ื‘ื™ื•ืชืจ.

ื”ืชืงืฉืจ ืœืคื•ื ืงืฆื™ื” query_to_bigquery ื›ื“ื™ ืœืงื‘ืœ ืืช ื”ื ืชื•ื ื™ื.

dataframe = query_to_bigquery(query)

ืœืื—ืจ ืžื›ืŸ ืื ื• ืžืฉืชืžืฉื™ื ื‘ืขืžื•ื“ืช ื ืชื•ื ื™ ื”ืชืืจื™ืš ืขื‘ื•ืจ ืฆื™ืจ ื”-x, ื•ื‘ืขืžื•ื“ืช total_posts ืขื‘ื•ืจ ืฆื™ืจ ื”-y.

x = dataframe['date'].tolist()
y = dataframe['total_posts'].tolist()

ืื ื• ืžื“ืžื™ื™ื ื™ื ืื•ืชื• ื‘ืืžืฆืขื•ืช ื”ืคื•ื ืงืฆื™ื” visualize_bar_chart ื•ืฉื•ืžืจื™ื ืื•ืชื• ื›ืชืžื•ื ื”.

plt = visualize_bar_chart(x=x, x_label='Date', y=y, y_label='Total Posts', title='Daily Posts')
plt.savefig('viz.png')โ€Ž

ืื ื• ืขื•ื˜ืคื™ื ืืช ื”ืงื•ื“ ื”ื–ื” ื‘ืคื•ื ืงืฆื™ื” ืฉื ืงืจืืช get_and_save_image.

def get_and_save_image():
    query = """
            SELECT DATE(creation_date) date, COUNT(*) total_posts
            FROM `bigquery-public-data.stackoverflow.post_history`
            GROUP BY 1
            HAVING date > DATE_SUB('2018-12-02', INTERVAL 14 DAY)
            ORDER BY 1
            """
    dataframe = query_to_bigquery(query)  
    x = dataframe['date'].tolist()
    y = dataframe['total_posts'].tolist()
    plt = visualize_bar_chart(x=x, x_label='Date', y=y, y_label='Total Posts', title='Daily Posts')
    plt.savefig('viz.png')

ืฉืœื— ืชืžื•ื ื”

ืขืœ ืžื ืช ืœืฉืœื•ื— ื“ื•ื— ืœื ืžืขืŸ, ืขืœื™ืš ืœื“ืขืช ืืช ื”ืคืจืžื˜ืจ chat_id.

ื”ืฉืชืžืฉ userinfobot ื•ื”ืงืœื“ /start. ื”ื‘ื•ื˜ ืžื’ื™ื‘ ืขื ื”ืžื™ื“ืข ื”ื“ืจื•ืฉ, chat_id ื›ืœื•ืœ ื‘ืฉื“ื” id.

ืขื›ืฉื™ื• ื‘ื•ืื• ื ื™ืฆื•ืจ ืืช ื”ืคื•ื ืงืฆื™ื” send_image. ื–ื” ื™ืฉืชืžืฉ ื‘ืคื•ื ืงืฆื™ื” get_and_save_image ื›ื“ื™ ืœืื—ื–ืจ ื•ืœืฉืžื•ืจ ืืช ื”ืชืžื•ื ื”. ื•ืื– ืื ื—ื ื• ืฉื•ืœื—ื™ื ื”ื›ืœ ืœืื™ืฉ ื”ืงืฉืจ ื”ื ื›ื•ืŸ.

def send_image(bot, update):
    get_and_save_image()
    chat_id = 'CHAT_ID_RECEIVER'
    bot.send_photo(chat_id=chat_id, photo=open('viz.png','rb'))

ืชื•ื›ื ื” ืจืืฉื™ืช

ืœื‘ืกื•ืฃ, ืื ื• ื™ื•ืฆืจื™ื ืคื•ื ืงืฆื™ื” ื ื•ืกืคืช, ืจืืฉื™ืช, ืœื”ืคืขืœืช ื”ืืคืœื™ืงืฆื™ื”. ืืœ ืชืฉื›ื— ืœืฉื ื•ืช ืืช YOUR_TOKEN ืขื‘ื•ืจ ื”ื‘ื•ื˜.

ื–ื›ื•ืจ: ืชื•ื›ื ื™ืช ื–ื• ืชืฉืœื— ืืช ื”ืชืžื•ื ื” ื‘ืื•ืคืŸ ืื•ื˜ื•ืžื˜ื™ ื‘ื–ืžืŸ ืฉืชืฆื™ื™ืŸ. ืœืžืฉืœ, ื ืฉืœื— ื“ื•ื— ื›ืœ ื™ื•ื ื‘ืชืฉืข ื‘ื‘ื•ืงืจ.

def main():
    updater = Updater('YOUR_TOKEN')
    updater.job_queue.run_daily(send_image, time=datetime.datetime.strptime('9:00AM', '%I:%M%p').time(), days=(0,1,2,3,4,5,6))
    updater.start_polling()
    updater.idle()
 
if __name__ == '__main__':
    main()

ื›ืชื•ืฆืื” ืžื›ืš, ื”ื™ื™ืฉื•ื ืฉืœื ื• ื™ื™ืจืื” ื›ืš:

from google.cloud import bigquery
from telegram.ext import Updater
 
import matplotlib.pyplot as plt
import numpy as np
import datetime
 
def query_to_bigquery(query):
    client = bigquery.Client()
    query_job = client.query(query)
    result = query_job.result()
    dataframe = result.to_dataframe()
    return dataframe
 
def visualize_bar_chart(x, x_label, y, y_label, title):
    plt.title(title)
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    index = np.arange(len(x))
    plt.xticks(index, x, fontsize=5, rotation=30)
    plt.bar(index, y)
    return plt
 
def get_and_save_image():
    query = """
            SELECT DATE(creation_date) date, COUNT(*) total_posts
            FROM `bigquery-public-data.stackoverflow.post_history`
            GROUP BY 1
            HAVING date > DATE_SUB('2018-12-02', INTERVAL 14 DAY)
            ORDER BY 1
            """
    dataframe = query_to_bigquery(query)  
    x = dataframe['date'].tolist()
    y = dataframe['total_posts'].tolist()
    plt = visualize_bar_chart(x=x, x_label='Date', y=y, y_label='Total Posts', title='Daily Posts')
    plt.savefig('viz.png')
 
def send_image(bot, update):
    get_and_save_image()
    chat_id = 'CHAT_ID_RECEIVER'
    bot.send_photo(chat_id=chat_id, photo=open('viz.png', 'rb'))
 
def main():
    updater = Updater('YOUR_TOKEN')
    updater.job_queue.run_daily(send_image, time=datetime.datetime.strptime('9:00AM', '%I:%M%p').time(), days=(0,1,2,3,4,5,6))
    updater.start_polling()
    updater.idle()
 
if __name__ == '__main__':
main()

ืฉืžื•ืจ ืืช ื”ืงื•ื‘ืฅ ื•ืงืจื ืœื• main.py.

ืื ื• ืžืคืขื™ืœื™ื ืืช ื”ืืคืœื™ืงืฆื™ื” ืขืœ ื™ื“ื™ ื”ื–ื ืช ื”ืคืงื•ื“ื” ื‘ื˜ืจืžื™ื ืœ:

python3 main.py

ื”ื›ืœ ืžื•ื›ืŸ. ื›ืขืช ื™ืฉ ืœื ื• ืจื•ื‘ื•ื˜ ื”ืžื•ืจื›ื‘ ืž-50 ืฉื•ืจื•ืช ืงื•ื“ ืฉืžืคื™ืง ื“ื•ื—ื•ืช ืœืœื ื”ืชืขืจื‘ื•ืชื ื•.

ื‘ื•ื ื ื‘ื“ื•ืง ืืช ื”ื‘ื•ื˜ ืžื›ืืŸืขืœ ื™ื“ื™ ื‘ื—ื™ืจื” ื‘ืคืงื•ื“ื” /send.

ื›ื™ืฆื“ ืœื”ืืฆื™ืœ ื“ื•ื—ื•ืช ืคืฉื•ื˜ื™ื ืœืจื•ื‘ื•ื˜. ื›ืชื™ื‘ืช ื‘ื•ื˜ ื‘-Python ื•ื‘-Google BigQuery

ืืชื” ื™ื›ื•ืœ ืœืงื‘ืœ ืืช ื”ืงื•ื“ ื”ืžื•ื’ืžืจ ื‘ GitHub ืฉืœื™.

Skillbox ืžืžืœื™ืฆื”:

ืžืงื•ืจ: www.habr.com

ื”ื•ืกืคืช ืชื’ื•ื‘ื”