Retrieving Amplitude data via the API

Introduction

Amplitude has proven itself as a product analytics tool thanks to its simple event setup and flexible visualizations. Even so, teams often need to set up a custom model, cluster users, or build a dashboard in another BI system. All of this is only possible with raw event data from Amplitude. This article explains how to get that data with minimal programming knowledge.

Prerequisites

  1. A project in Amplitude in which events are already configured correctly and data is being collected for them
  2. Python installed (I am working with version 3.8.3), which the reader can already use at least at a basic level

Instructions

Step 1: Get the API key and the secret key

To download data, you first need to obtain an API key and a secret key.

You can find them by following this path:

  1. "Manage data" (in the lower-left corner of the screen)
  2. Select the project you want to download data from and open it
  3. In the project menu that opens, select "Project settings"
  4. Find the API-key and secret-key strings, copy them, and store them in a safe place.

Instead of clicking through, you can follow a link, which generally looks like this:
analytics.amplitude.com/$$$$$$$/manage/project/******/settings,
where $$$$$$$ is your organization's login ID in Amplitude and ****** is the project number.
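
One way to keep the keys out of the script itself is to read them from environment variables. The sketch below is an assumption, not part of the original script: the variable names AMPLITUDE_API_KEY and AMPLITUDE_SECRET_KEY and the helper load_credentials are hypothetical.

```python
import os


def load_credentials(environ=os.environ):
    """Return (api_key, secret_key) read from environment variables.

    AMPLITUDE_API_KEY and AMPLITUDE_SECRET_KEY are hypothetical names
    chosen for this sketch; use whatever names fit your setup.
    """
    try:
        return environ["AMPLITUDE_API_KEY"], environ["AMPLITUDE_SECRET_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None
```

With this in place, the key assignments in the script could become api_key, secret_key = load_credentials().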

Step 2: Check that the required libraries are installed

The good news is that you almost certainly already have these libraries, either installed by default or downloaded earlier, but it is worth checking. Here is the full list of libraries I used at the time of writing (versions in parentheses where applicable):

  1. requests (2.10.0) - sends the request through the API to get the data
  2. pandas (1.0.1) - reads the json, builds a dataframe, and then writes it to a file
  3. zipfile - extracts files from the archive received through the API
  4. gzip - unpacks json files from .gz
  5. os - gets the list of files from the unpacked archive
  6. time - optional, measures the script's run time
  7. tqdm - optional, for convenient monitoring of file-processing progress
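
A quick way to run this check is to ask Python whether each module can be found, without importing it. This is a minimal sketch; the helper name find_missing_modules is hypothetical, and it relies only on importlib.util.find_spec from the standard library.

```python
import importlib.util

# Modules used by the script in this article.
REQUIRED_MODULES = ["requests", "pandas", "zipfile", "gzip", "os", "time", "tqdm"]


def find_missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available")
```

Anything reported as missing can then be installed with pip before moving on.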

Step 3: Write the data download script

Note: The full download script is at the end of the article. If you prefer, you can take it right away and refer to the step-by-step explanations as needed.

Step 3.1. Importing the libraries

We import all the libraries listed in step two.

# Import libraries
import requests
import pandas as pd
import zipfile
import gzip
import os
import time
from tqdm import tqdm

Step 3.2. Sending the request to Amplitude

We record the moment the script starts and store it in the variable a.

startdate and enddate define the period for which data is downloaded and are embedded in the text of the request sent to the Amplitude server. In addition to the date, you can specify the hour by changing the value after 'T' in the request.

api_key and secret_key correspond to the values obtained in the first step; for security reasons, I show random strings here instead of my own.

a = time.time()
# Start and end date parameters
startdate = '20200627'
enddate = '20200628'

api_key = 'kldfg844203rkwekfjs9234'
secret_key = '094tfjdsfmw93mxwfek'
# Send the request to Amplitude
response = requests.get('https://amplitude.com/api/2/export?start='+startdate+'T0&end='+enddate+'T0', auth = (api_key, secret_key))
print('1. Request sent')
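
To make the hour after 'T' explicit, the URL can be assembled by a small helper. This is only a sketch: build_export_url is a hypothetical name, and zero-padding the hour to two digits is an assumption of this example, whereas the script above passes a single 'T0'.

```python
def build_export_url(start_date, end_date, start_hour=0, end_hour=0):
    """Build the export URL for dates given as 'YYYYMMDD' strings.

    Hours are zero-padded to two digits here, which is an assumption of
    this sketch; the article's script passes a single 'T0'.
    """
    for hour in (start_hour, end_hour):
        if not 0 <= hour <= 23:
            raise ValueError(f"hour must be between 0 and 23, got {hour}")
    return (
        "https://amplitude.com/api/2/export"
        f"?start={start_date}T{start_hour:02d}&end={end_date}T{end_hour:02d}"
    )
```

The result can be passed to requests.get together with auth = (api_key, secret_key), exactly as in the script.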

Step 3.3. Downloading the data archive

We come up with a name for the file and store it in the filename variable. For convenience, I include the period and indicate that this is Amplitude data. Next, we write the response received from Amplitude to that file.

# Download the archive with the data
filename = 'period_since'+startdate+'to'+enddate+'_amplitude_data'
with open(filename + '.zip', "wb") as code:
    code.write(response.content)
print('2. Archive with files downloaded successfully')
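
The script writes whatever came back to disk, so a failed request (wrong keys, for example) would leave a file that is not a valid archive. A minimal sketch of a guard is shown below; save_archive is a hypothetical helper, and it assumes the object passed in has the status_code and content attributes of a requests response.

```python
def save_archive(response, path):
    """Write the response body to `path`, but only for a 200 status code."""
    if response.status_code != 200:
        raise RuntimeError(
            f"Export request failed with status {response.status_code}"
        )
    with open(path, "wb") as archive:
        archive.write(response.content)
    return path
```

In the script it would be called as save_archive(response, filename + '.zip').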

Step 3.4. Extracting the files into a folder on your computer

The zipfile library helps extract the files. Pay attention to the third line and specify the path to the extraction location you prefer.

# Extract the files into a folder on the computer
z = zipfile.ZipFile(filename + '.zip', 'r')
z.extractall(path = os.path.join(r'C:\Users\...', filename))  # raw string keeps the backslashes literal
print('3. Archive with files extracted and written to folder ' + filename)

Step 3.5. Converting the JSON

After extracting the files from the archive, you need to convert the json files, which are in .gz format, and write them into a dataframe for further work.

Note that here you again need to change the path to your own, and replace 000000 with your project number from Amplitude (or manually open the path where the archive was extracted and look at the name of the folder inside).
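
Instead of looking up the inner folder by hand, its name can be discovered in code. This is a sketch under one assumption taken from the description above: the extraction folder contains exactly one subfolder, named after the project number. The helper name find_project_dir is hypothetical.

```python
import os


def find_project_dir(extract_root):
    """Return the path of the single subfolder inside `extract_root`.

    Assumes the extracted archive contains exactly one folder (named
    after the project number), as described in the article.
    """
    subdirs = [
        entry for entry in sorted(os.listdir(extract_root))
        if os.path.isdir(os.path.join(extract_root, entry))
    ]
    if len(subdirs) != 1:
        raise RuntimeError(f"Expected one folder in {extract_root}, found {subdirs}")
    return os.path.join(extract_root, subdirs[0])
```

The returned path can be used as the directory value in place of a hard-coded project number.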

In order:

We save the directory in a variable, get the list of files in that directory, create an empty dataframe, call time.sleep(1) so that tqdm works correctly, open the .gz files in a loop, immediately read the json with pandas, and fill the dataframe.

# Convert the json to a regular tabular format
directory = os.path.join(r'C:\Users\...', filename, '000000')
files = os.listdir(directory)
amplitude_dataframe = pd.DataFrame()
print('File processing progress:')
time.sleep(1)
for i in tqdm(files):
    with gzip.open(os.path.join(directory, i)) as f:
        add = pd.read_json(f, lines = True)
    amplitude_dataframe = pd.concat([amplitude_dataframe, add])
time.sleep(1)
print('4. JSON files from the archive converted and written to the dataframe')
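
Calling pd.concat inside the loop copies the accumulated dataframe on every iteration, which can get slow for long periods. A common alternative, sketched here as a variation rather than the author's method, is to collect the per-file dataframes in a list and concatenate once. The helper name load_events is hypothetical, and pandas is asked to decompress the files itself through compression="gzip".

```python
import os

import pandas as pd


def load_events(directory):
    """Read every .gz file in `directory` as gzipped JSON lines into one dataframe."""
    frames = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".gz"):
            continue  # skip anything that is not a gzipped export file
        path = os.path.join(directory, name)
        frames.append(pd.read_json(path, lines=True, compression="gzip"))
    if not frames:
        return pd.DataFrame()  # nothing to read: return an empty table
    return pd.concat(frames, ignore_index=True)
```

ignore_index=True gives the combined table one continuous row index instead of repeating 0, 1, 2, ... for each file.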

Step 3.6. Writing the dataframe to Excel

The export to Excel is just an example here. In many cases it is more convenient to work with the resulting dataframe in Python or to put the data into storage.

You will also need to replace the file path here with your own.

# Write the resulting table to an Excel file
amplitude_dataframe.to_excel(os.path.join(r'C:\Users\...', filename+'.xlsx'), index=False)
print('5. Dataframe written successfully to file ' + filename)
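
An Excel worksheet holds at most 1,048,576 rows, and raw event exports can exceed that. The sketch below falls back to CSV when the table would not fit; export_dataframe and its max_rows parameter are hypothetical names introduced for this example.

```python
import pandas as pd

# Maximum number of rows on one Excel worksheet (header row included).
EXCEL_MAX_ROWS = 1_048_576


def export_dataframe(df, base_path, max_rows=EXCEL_MAX_ROWS):
    """Write `df` to base_path + '.xlsx', or to '.csv' if it has too many rows."""
    if len(df) + 1 > max_rows:  # +1 accounts for the header row
        path = base_path + ".csv"
        df.to_csv(path, index=False)
    else:
        path = base_path + ".xlsx"
        df.to_excel(path, index=False)
    return path
```

The function returns the path it actually wrote, so the final message can report the right file name.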

Step 3.7. Measuring the script's run time

We store the current time in the variable b, compute the difference and the number of minutes, and print the total minutes. This is the last step.

b = time.time()
diff = b-a
minutes = diff//60
print('Code execution took: {:.0f} minute(s)'.format(minutes))
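
Because diff//60 drops the remainder, a run shorter than a minute is reported as 0 minutes. If seconds matter, a small variation (not part of the original script; format_elapsed is a hypothetical name) is to split the duration with divmod:

```python
def format_elapsed(seconds):
    """Format a duration in seconds as minutes and whole seconds."""
    minutes, rest = divmod(int(seconds), 60)
    return f"{minutes} min {rest} s"
```

It would be called as print('Code execution took: ' + format_elapsed(b - a)).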

Conclusion

You can access the table and start working with it through the amplitude_dataframe variable, where the data is stored. It will have about 50 columns, of which you will use these 80% of the time: event_type (event name), event_properties (event parameters), event_time (event time), uuid (client ID), and user_properties (client parameters). These are the ones to start with. When comparing your own calculations with the figures in Amplitude dashboards, remember that the system uses its own methodology for counting unique clients, funnels, and so on, so be sure to consult the Amplitude documentation before doing so.
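
The event_properties and user_properties columns hold one nested object per row, which is awkward to filter on. As a starting point, the sketch below spreads such a column into separate columns with pd.json_normalize. It assumes each cell is a Python dict (anything else is treated as empty), and expand_properties is a hypothetical helper name.

```python
import pandas as pd


def expand_properties(df, column="event_properties"):
    """Return `df` with the dicts in `column` spread into separate columns."""
    # Replace anything that is not a dict (for example a missing value) with {}.
    records = [value if isinstance(value, dict) else {} for value in df[column]]
    expanded = pd.json_normalize(records).add_prefix(column + ".")
    expanded.index = df.index  # align with the original rows
    return df.drop(columns=[column]).join(expanded)
```

Each new column is named after the original column plus the property key, so properties from event_properties and user_properties do not collide.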

Thank you for your attention! Now you can export raw event data from Amplitude and make full use of it in your work.

Full script:

# Import libraries
import requests
import pandas as pd
import zipfile
import gzip
import os
import time
from tqdm import tqdm
a = time.time()
# Start and end date parameters
startdate = '20200627'
enddate = '20200628'

api_key = 'd988fddd7cfc0a8a'
secret_key = 'da05cf1aeb3a361a61'
# Send the request to Amplitude
response = requests.get('https://amplitude.com/api/2/export?start='+startdate+'T0&end='+enddate+'T0', auth = (api_key, secret_key))
print('1. Request sent')

# Download the archive with the data
filename = 'period_since'+startdate+'to'+enddate+'_amplitude_data'
with open(filename + '.zip', "wb") as code:
    code.write(response.content)
print('2. Archive with files downloaded successfully')

# Extract the files into a folder on the computer
z = zipfile.ZipFile(filename + '.zip', 'r')
z.extractall(path = os.path.join(r'C:\Users\...', filename))  # raw string keeps the backslashes literal
print('3. Archive with files extracted and written to folder ' + filename)

# Convert the json to a regular tabular format
directory = os.path.join(r'C:\Users\...', filename, '000000')
files = os.listdir(directory)
amplitude_dataframe = pd.DataFrame()
print('File processing progress:')
time.sleep(1)
for i in tqdm(files):
    with gzip.open(os.path.join(directory, i)) as f:
        add = pd.read_json(f, lines = True)
    amplitude_dataframe = pd.concat([amplitude_dataframe, add])
time.sleep(1)
print('4. JSON files from the archive converted and written to the dataframe')

# Write the resulting table to an Excel file
amplitude_dataframe.to_excel(os.path.join(r'C:\Users\...', filename+'.xlsx'), index=False)
print('5. Dataframe written successfully to file ' + filename)

b = time.time()
diff = b-a
minutes = diff//60
print('Code execution took: {:.0f} minute(s)'.format(minutes))

Source: www.hab.com
