Retrieving Amplitude data via API

Introduction

Amplitude has proven itself as a product analytics tool thanks to its simple event setup and flexible visualizations. Still, you often need to build your own attribution model, group users, or assemble a dashboard in another BI system. That kind of work is only possible with the raw event data from Amplitude. This article shows how to obtain that data with minimal programming knowledge.

Prerequisites

  1. A project in Amplitude in which events are already set up correctly and statistics are being collected on them
  2. Python installed (I work with version 3.8.3), which the prospective reader can already use at least at a basic level

Instructions

Step 1. Obtaining the API key and secret key

To export data, you first need to obtain an API key and a secret key.

You can find them by following this path:

  1. "Manage Data" (located in the bottom-left corner of the screen)
  2. Select the project from which the data will be exported and open it
  3. In the project menu that opens, select "Project settings"
  4. Find the API key and secret key strings, copy them, and store them in a safe place.

Instead of clicking through, you can follow a direct link, which in general looks like this:
analytics.amplitude.com/$$$$$$$/manage/project/********/settings,
where $$$$$$$ is your organization's Amplitude login and ******** is the project number

Step 2. Checking that the required libraries are present

The good news is that you almost certainly already have these libraries, either installed by default or downloaded earlier, but you should check. The full list of libraries I used at the time of writing (versions are given in parentheses where relevant):

  1. requests (2.10.0) - sending the request through the API to get the data
  2. pandas (1.0.1) - reading json, creating a dataframe, and then writing it to a file
  3. zipfile - extracting files from the archive received through the API
  4. gzip - unpacking json files from .gz
  5. os - getting the list of files from the unpacked archive
  6. time - optional, for measuring the script's running time
  7. tqdm - optional, for conveniently monitoring file-processing progress
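
One possible way to run this check is sketched below. It uses the standard library's importlib.util.find_spec, which looks a module up without importing it. The helper name missing_modules and the REQUIRED list are my own illustration, not part of any library.

```python
from importlib.util import find_spec

# Libraries used in this article; requests, pandas and tqdm are third-party.
REQUIRED = ["requests", "pandas", "zipfile", "gzip", "os", "time", "tqdm"]


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Install these first, for example with pip:", ", ".join(missing))
    else:
        print("All required libraries are available")
```

If the list comes back empty, everything needed for the script is already in place.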

Step 3. Writing the data export script

Hint: the complete export script is at the end of the article; if you like, you can take it right away and refer to the step-by-step explanations as needed.

Step 3.1. Importing libraries

We import all the libraries listed in step 2.

# Import libraries
import requests
import pandas as pd
import zipfile
import gzip
import os
import time
from tqdm import tqdm

Step 3.2. Sending the request to Amplitude

We record the moment the script starts and store it in the variable a.

startdate and enddate define the period for which data is exported and are embedded in the text of the request sent to the Amplitude server; besides the date, you can also specify the hour by changing the value after 'T' in the request.

api_key and secret_key correspond to the values obtained in step 1; for security reasons, I show random sequences here instead of my own.

a = time.time()
# Start and end date parameters
startdate = '20200627'
enddate = '20200628'

api_key = 'kldfg844203rkwekfjs9234'
secret_key = '094tfjdsfmw93mxwfek'
# Send the request to Amplitude
response = requests.get('https://amplitude.com/api/2/export?start='+startdate+'T0&end='+enddate+'T0', auth = (api_key, secret_key))
# Stop here if Amplitude returned an HTTP error instead of an archive
response.raise_for_status()
print('1. Request sent')
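
To make the hour part after 'T' easier to vary, the two query parameters can be built by a small helper. This is only a sketch: the function name export_params is hypothetical, and the two-digit hour form (YYYYMMDDTHH) is my assumption about what the Export API accepts, whereas the request above uses a single digit.

```python
def export_params(start_date, end_date, start_hour=0, end_hour=0):
    """Build the `start` and `end` query parameters for the export request.

    Dates are 'YYYYMMDD' strings; hours are integers from 0 to 23.
    """
    for hour in (start_hour, end_hour):
        if not 0 <= hour <= 23:
            raise ValueError("hour must be between 0 and 23")
    return {
        "start": "{}T{:02d}".format(start_date, start_hour),
        "end": "{}T{:02d}".format(end_date, end_hour),
    }


if __name__ == "__main__":
    params = export_params("20200627", "20200627", start_hour=9, end_hour=18)
    print(params)  # {'start': '20200627T09', 'end': '20200627T18'}
```

The resulting dictionary could then be passed to requests.get through its params argument, together with auth = (api_key, secret_key), instead of concatenating the URL by hand.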

Step 3.3. Downloading the archive with the data

We come up with a name for the archive and store it in the filename variable. For my own convenience, I include the period and a note that this is Amplitude data. Then we write the response received from Amplitude into the archive.

# Download the archive with the data
filename = 'period_since'+startdate+'to'+enddate+'_amplitude_data'
with open(filename + '.zip', "wb") as code:
    code.write(response.content)
print('2. Archive with files downloaded successfully')
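
The step above writes whatever the server returned, even if it is an error message rather than an archive. A minimal sketch of a safer variant follows. It checks the bytes with zipfile.is_zipfile before saving; the helper name save_archive is my own.

```python
import io
import zipfile


def save_archive(content, path):
    """Write `content` (bytes) to `path`, but only if it is a zip archive."""
    if not zipfile.is_zipfile(io.BytesIO(content)):
        raise ValueError("response body is not a zip archive")
    with open(path, "wb") as archive:
        archive.write(content)
    return path
```

In the script this would be called as save_archive(response.content, filename + '.zip').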

Step 3.4. Extracting the files to a folder on your computer

The zipfile library comes into play to extract the files. In the third line, be careful to enter your own path to wherever it is most convenient for you to extract them.

# Extract the files to a folder on the computer
with zipfile.ZipFile(filename + '.zip', 'r') as z:
    z.extractall(path = os.path.join(r'C:\Users\...', filename))
print('3. Archive with files extracted and written to folder ' + filename)
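
If you would rather not hard-code a path, the archive can be extracted into a folder next to it. The sketch below assumes that default is acceptable; the function name extract_archive and its target_dir parameter are hypothetical.

```python
import os
import zipfile


def extract_archive(archive_path, target_dir=None):
    """Extract a zip archive and return the folder it was extracted into.

    If `target_dir` is omitted, a folder named after the archive
    (without the .zip extension) is used.
    """
    if target_dir is None:
        target_dir = os.path.splitext(archive_path)[0]
    with zipfile.ZipFile(archive_path, "r") as archive:
        archive.extractall(path=target_dir)
    return target_dir
```

Called as extract_archive(filename + '.zip'), it returns the folder name, which can be reused in the next step.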

Step 3.5. Converting the json

After extracting the files from the archive, you need to convert the json files, which are in .gz format, and write them into a dataframe for further work.

Note that here you again need to change the path to your own, and replace 000000 with your project number from Amplitude (or manually open the path where the archive was extracted and look at the name of the folder inside).

In order:

We store the directory in a variable, get the list of files from the directory, create an empty dataframe, call time.sleep(1) so that tqdm displays properly, and inside the loop open the .gz files and immediately use pandas to read the json and fill the dataframe.

# Convert the json to an ordinary tabular format
directory = os.path.join(r'C:\Users\...', filename, '000000')
files = os.listdir(directory)
amplitude_dataframe = pd.DataFrame()
print('File processing progress:')
time.sleep(1)
for i in tqdm(files):
    with gzip.open(os.path.join(directory, i)) as f:
        add = pd.read_json(f, lines = True)
    amplitude_dataframe = pd.concat([amplitude_dataframe, add])
time.sleep(1)
print('4. JSON files from the archive converted and written to the dataframe')
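
As an alternative to typing the project number by hand, the extracted folder can be walked to find every .gz file wherever it sits. The sketch below uses only the standard library and assumes each file holds one JSON object per line; the function name read_events is my own.

```python
import gzip
import json
import os


def read_events(root_dir):
    """Collect events from every .gz file found under `root_dir`.

    Each file is assumed to hold one JSON object per line.
    """
    events = []
    for folder, _subfolders, names in os.walk(root_dir):
        for name in sorted(names):
            if not name.endswith(".gz"):
                continue
            with gzip.open(os.path.join(folder, name), "rt", encoding="utf-8") as f:
                for line in f:
                    if line.strip():  # skip blank lines
                        events.append(json.loads(line))
    return events
```

The resulting list of dictionaries can be handed to pd.DataFrame(events) if you want to continue in pandas.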

Step 3.6. Writing the dataframe to Excel

Exporting to Excel is only an example here. In many cases it is more convenient to work with the resulting dataframe inside Python or to load the data into storage.

You will also need to replace the output path here with your own.

# Write the resulting table to an Excel file
amplitude_dataframe.to_excel(os.path.join(r'C:\Users\...', filename + '.xlsx'), index=False)
print('5. Dataframe written to file ' + filename)
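
Since Excel is only one possible destination, here is a rough sketch of writing events to CSV with the standard library instead. It assumes the events are available as a list of dictionaries and stores nested values as JSON text; the function name write_csv and its parameters are hypothetical.

```python
import csv
import json


def write_csv(events, path, fields):
    """Write a list of event dicts to a CSV file, one row per event."""
    with open(path, "w", newline="", encoding="utf-8") as out:
        # extrasaction="ignore" drops keys that are not listed in `fields`.
        writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        for event in events:
            row = dict(event)
            for key, value in row.items():
                # Nested values (dicts, lists) are stored as JSON text.
                if isinstance(value, (dict, list)):
                    row[key] = json.dumps(value)
            writer.writerow(row)
    return path
```

When working with the dataframe directly, pandas offers amplitude_dataframe.to_csv for the same purpose.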

Step 3.7. Measuring the script's running time

We record the current time in the variable b, compute the difference and the number of minutes, and print the total minutes. This is the last step.

b = time.time()
diff = b-a
minutes = diff//60
print('Code execution took: {:.0f} minute(s)'.format(minutes))
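
Because the number of minutes is rounded down, any run shorter than a minute is reported as 0. A small sketch that also keeps the seconds is shown below; the helper name format_elapsed is my own, and time.perf_counter is used here simply as an alternative clock to time.time.

```python
import time


def format_elapsed(seconds):
    """Format a duration in seconds as 'M min S s'."""
    minutes, secs = divmod(int(seconds), 60)
    return "{} min {} s".format(minutes, secs)


if __name__ == "__main__":
    start = time.perf_counter()
    time.sleep(0.1)  # stand-in for the real work
    print("Code execution took:", format_elapsed(time.perf_counter() - start))
```

For example, format_elapsed(125.7) gives '2 min 5 s'.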

Conclusion

You can display the table and start working with it by calling the amplitude_dataframe variable, into which the data was written. It will have about 50 columns, of which in 80% of cases you will use: event_type - the event name, event_properties - the event parameters, event_time - the event time, uuid - the client id, user_properties - the client parameters. These are the ones to start with. When you compare figures from your own calculations with the indicators on Amplitude's dashboards, keep in mind that the system uses its own methodology for counting unique clients, funnels, and so on, so be sure to read the Amplitude documentation before doing so.
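
To illustrate narrowing an event down to the fields listed above, here is a minimal sketch on a plain dictionary. The sample event is invented purely for illustration, and the names KEY_FIELDS and keep_key_fields are hypothetical.

```python
KEY_FIELDS = ["event_type", "event_properties", "event_time", "uuid", "user_properties"]


def keep_key_fields(event, fields=KEY_FIELDS):
    """Return a copy of `event` restricted to `fields` (missing ones become None)."""
    return {field: event.get(field) for field in fields}


if __name__ == "__main__":
    # Hypothetical event, invented purely for illustration.
    sample = {
        "event_type": "purchase",
        "event_time": "2020-06-27 10:15:00",
        "uuid": "example-uuid",
        "event_properties": {"price": 10},
        "user_properties": {"plan": "free"},
        "platform": "iOS",
    }
    print(keep_key_fields(sample))
```

On the dataframe itself the same selection can be written as amplitude_dataframe[KEY_FIELDS], provided all five columns are present.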

Thank you for your attention! You can now export raw event data from Amplitude and make full use of it in your work.

The complete script:

# Import libraries
import requests
import pandas as pd
import zipfile
import gzip
import os
import time
from tqdm import tqdm
a = time.time()
# Start and end date parameters
startdate = '20200627'
enddate = '20200628'

api_key = 'd988fddd7cfc0a8a'
secret_key = 'da05cf1aeb3a361a61'
# Send the request to Amplitude
response = requests.get('https://amplitude.com/api/2/export?start='+startdate+'T0&end='+enddate+'T0', auth = (api_key, secret_key))
# Stop here if Amplitude returned an HTTP error instead of an archive
response.raise_for_status()
print('1. Request sent')

# Download the archive with the data
filename = 'period_since'+startdate+'to'+enddate+'_amplitude_data'
with open(filename + '.zip', "wb") as code:
    code.write(response.content)
print('2. Archive with files downloaded successfully')

# Extract the files to a folder on the computer
with zipfile.ZipFile(filename + '.zip', 'r') as z:
    z.extractall(path = os.path.join(r'C:\Users\...', filename))
print('3. Archive with files extracted and written to folder ' + filename)

# Convert the json to an ordinary tabular format
directory = os.path.join(r'C:\Users\...', filename, '000000')
files = os.listdir(directory)
amplitude_dataframe = pd.DataFrame()
print('File processing progress:')
time.sleep(1)
for i in tqdm(files):
    with gzip.open(os.path.join(directory, i)) as f:
        add = pd.read_json(f, lines = True)
    amplitude_dataframe = pd.concat([amplitude_dataframe, add])
time.sleep(1)
print('4. JSON files from the archive converted and written to the dataframe')

# Write the resulting table to an Excel file
amplitude_dataframe.to_excel(os.path.join(r'C:\Users\...', filename + '.xlsx'), index=False)
print('5. Dataframe written to file ' + filename)

b = time.time()
diff = b-a
minutes = diff//60
print('Code execution took: {:.0f} minute(s)'.format(minutes))

Source: www.habr.com
