Introduction
Amplitude has proven itself as a product analytics tool thanks to its easy event setup and flexible visualizations. Often, though, you need to build your own attribution model, cluster users, or build a dashboard in another BI system. That is only possible with the raw event data from Amplitude. This article shows how to obtain that data with minimal programming knowledge.
Prerequisites
- A project in Amplitude in which events are already configured correctly and statistics are being collected on them
- Python installed (I use version 3.8.3), which the reader is expected to be able to work with at least at a basic level.
Instructions
Step 1. Obtaining the API key and secret key
To export data, you first need an API key and a secret key.
You can find them by following this path:
- "Manage data" (at the bottom left of the screen)
- Select the project from which the data will be exported and open it
- In the project menu that opens, select "Project settings"
- Find the API key and secret key strings, copy them, and store them in a safe place.
Instead of clicking through, you can follow a direct link, which in general looks like this:
analytics.amplitude.com/$$$$$$$$/manage/project/******/settings
where $$$$$$$$ is your organization's Amplitude login and ****** is the project number.
Step 2. Checking that the required libraries are installed
The good news is that you almost certainly already have these libraries, either installed by default or downloaded earlier, but you should check. Here is the full list of libraries I used at the time of writing (versions are given in parentheses where relevant):
- requests (2.10.0) - sending a request via the API to receive the data
- pandas (1.0.1) - reading json, creating a dataframe, and writing it to a file
- zipfile - extracting files from the archive received via the API
- gzip - unpacking json files from .gz
- os - getting the list of files from the unpacked archive
- time - optional, measuring the script's run time
- tqdm - optional, for conveniently monitoring file processing progress
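One way to run this check, as a minimal sketch: the standard-library module importlib.metadata (available from Python 3.8) reports the installed version of a package and raises PackageNotFoundError when the package is missing. The helper name installed_versions is illustrative and not part of any library. zipfile, gzip, os, and time ship with Python itself and need no check.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Third-party libraries used by the script in this article.
    for name, ver in installed_versions(["requests", "pandas", "tqdm"]).items():
        if ver is None:
            print(name + ": not installed (try: pip install " + name + ")")
        else:
            print(name + ": " + ver)
```

A missing package shows up as "not installed", so you know what to install before running the main script.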
Step 3. Writing the data loading script
Note: the full download script is at the end of the article; if you like, you can take it right away and refer back to the step-by-step explanations as needed.
Step 3.1. Importing libraries
We import all the libraries listed in the second step.
# Import libraries
import requests
import pandas as pd
import zipfile
import gzip
import os
import time
from tqdm import tqdm
Step 3.2. Sending a request to Amplitude
First we record the moment the script starts and store it in the variable a.
`startdate` and `enddate` define the period for which data is exported and are embedded in the text of the request sent to the Amplitude server. Besides the date, you can also specify the hour by changing the value after 'T' in the request.
api_key and secret_key correspond to the values obtained in the first step; for security reasons, I use random sequences here instead of my own keys.
a = time.time()
# Start and end date parameters
startdate = '20200627'
enddate = '20200628'
api_key = 'kldfg844203rkwekfjs9234'
secret_key = '094tfjdsfmw93mxwfek'
# Send the request to Amplitude
response = requests.get('https://amplitude.com/api/2/export?start='+startdate+'T0&end='+enddate+'T0', auth = (api_key, secret_key))
print('1. Request sent')
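To illustrate the value after 'T', here is a small sketch that assembles the same request URL with an explicit hour. The helper name build_export_url and its hour parameters are hypothetical; the endpoint and the start/end query parameters are the ones used in the script above. The hour is zero-padded to two digits on the assumption that the server accepts the form YYYYMMDDTHH.

```python
def build_export_url(startdate, enddate, start_hour=0, end_hour=0):
    """Return the export URL for the given dates (YYYYMMDD) and hours (0-23)."""
    for hour in (start_hour, end_hour):
        if not 0 <= hour <= 23:
            raise ValueError("hour must be between 0 and 23")
    return (
        "https://amplitude.com/api/2/export"
        "?start={}T{:02d}&end={}T{:02d}".format(
            startdate, start_hour, enddate, end_hour
        )
    )


# Example: request hours 00 to 23 of 27 June 2020.
url = build_export_url("20200627", "20200627", start_hour=0, end_hour=23)
print(url)
```

The resulting string can be passed to requests.get together with auth = (api_key, secret_key), exactly as in the script.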
Step 3.3. Downloading the archive with the data
We come up with a name for the archive and store it in the filename variable. For my own convenience, the name states the period and indicates that this is Amplitude data. Then we write the response received from Amplitude to the archive.
# Download the archive with the data
filename = 'period_since'+startdate+'to'+enddate+'_amplitude_data'
with open(filename + '.zip', "wb") as code:
    code.write(response.content)
print('2. Archive with files downloaded successfully')
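The script writes whatever the server returned, so a failed request (for example, with wrong keys) would leave you with a file that is not a valid archive. A possible guard, sketched below with a hypothetical helper looks_like_zip, checks the bytes before saving them. In the real script the input would be response.content, and response.status_code could be inspected first; here an in-memory archive stands in for the response so the example runs without network access.

```python
import io
import zipfile


def looks_like_zip(payload):
    """Return True if the given bytes form a readable zip archive."""
    return zipfile.is_zipfile(io.BytesIO(payload))


# Stand-in for response.content: a tiny archive built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example.json", '{"event_type": "demo"}')
good_payload = buffer.getvalue()
bad_payload = b"Unauthorized"  # made-up example of an error body

for payload in (good_payload, bad_payload):
    if looks_like_zip(payload):
        print("archive OK, safe to write to disk")
    else:
        print("not a zip archive, inspect the response instead of saving it")
```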
Step 3.4. Extracting the files into a folder on your computer
The zipfile library comes into play to help extract the files. Pay attention to the third line and enter your own path, wherever it is convenient for you to extract the files.
# Extract the files into a folder on the computer
z = zipfile.ZipFile(filename + '.zip', 'r')
z.extractall(path = os.path.join(r'C:\Users\...', filename))
print('3. Archive with files extracted and written to folder ' + filename)
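If you would rather not hard-code a Windows path, the extraction step can also be sketched with a path built relative to a folder of your choice. The function name extract_archive and the temporary folder below are illustrative; a small archive is created on the fly so the sketch runs as-is.

```python
import os
import tempfile
import zipfile


def extract_archive(zip_path, target_dir):
    """Extract zip_path into target_dir and return the extracted member names."""
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(path=target_dir)
        return archive.namelist()


# Demo with a throwaway archive; in the script, zip_path would be filename + '.zip'.
work_dir = tempfile.mkdtemp()
demo_zip = os.path.join(work_dir, "demo.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("000000/part-0.json", "{}")

members = extract_archive(demo_zip, os.path.join(work_dir, "extracted"))
print(members)
```

Opening the archive in a with block also closes the file handle once extraction is done.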
Step 3.5. Converting json
After extracting the files from the archive, you need to convert the json files stored in .gz format and write them to a dataframe for further work.
Please note that here you again need to change the path to your own, and instead of 000000 write your project number from Amplitude (or manually open the path where the archive was extracted and look at the name of the folder inside).
In order:
We write the directory to a variable, get the list of files from the directory, create an empty dataframe, call time.sleep(1) so that tqdm displays correctly, and inside the loop open the .gz files and immediately use pandas to read the json and fill the dataframe.
# Convert json to a regular tabular format
directory = os.path.join(r'C:\Users\...', filename, '000000')
files = os.listdir(directory)
amplitude_dataframe = pd.DataFrame()
print('File processing progress:')
time.sleep(1)
for i in tqdm(files):
    with gzip.open(os.path.join(directory, i)) as f:
        add = pd.read_json(f, lines = True)
    amplitude_dataframe = pd.concat([amplitude_dataframe, add])
time.sleep(1)
print('4. JSON files from the archive successfully converted and written to the dataframe')
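Two parts of this step can be sketched without hard-coded values: finding the project folder inside the extracted archive instead of typing 000000 by hand, and seeing what a single .gz file contains. The sketch below uses only the standard library (json instead of pandas) and a made-up sample file; the helper names find_project_dir and read_events are hypothetical. It assumes, as the script does, that each line of a file is a separate JSON object.

```python
import gzip
import json
import os
import tempfile


def find_project_dir(extract_dir):
    """Return the single subfolder of extract_dir (named after the project number)."""
    subdirs = [d for d in os.listdir(extract_dir)
               if os.path.isdir(os.path.join(extract_dir, d))]
    if len(subdirs) != 1:
        raise RuntimeError("expected exactly one project folder, found: %r" % subdirs)
    return os.path.join(extract_dir, subdirs[0])


def read_events(gz_path):
    """Read a gzip-compressed file with one JSON object per line into a list of dicts."""
    with gzip.open(gz_path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Build a tiny stand-in for an extracted archive (sample data, not real events).
extract_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(extract_dir, "000000"))
sample = os.path.join(extract_dir, "000000", "sample.json.gz")
with gzip.open(sample, "wt", encoding="utf-8") as f:
    f.write('{"event_type": "open"}\n{"event_type": "click"}\n')

project_dir = find_project_dir(extract_dir)
events = []
for name in sorted(os.listdir(project_dir)):
    events.extend(read_events(os.path.join(project_dir, name)))
print(len(events), events[0]["event_type"])
```

A list of dicts like this can be turned into a dataframe in one call with pd.DataFrame(events), which avoids growing the dataframe file by file.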
Step 3.6. Writing the dataframe to Excel
Writing to Excel here is just an example. In many cases it is more convenient to keep working with the resulting dataframe inside Python or to load the data into a storage system.
You will also need to replace the output path here with your own.
# Write the resulting table to an Excel file
amplitude_dataframe.to_excel(os.path.join(r'C:\Users\...', filename+'.xlsx'), index=False)
print('5. Dataframe successfully written to file ' + filename)
Step 3.7. Measuring the script's run time
We record the current time in the variable b, compute the difference and the number of minutes, and print the total number of minutes. This is the last step.
b = time.time()
diff = b-a
minutes = diff//60
print('Code execution took: {:.0f} minute(s)'.format(minutes))
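Since diff//60 drops the remainder, a run shorter than a minute is reported as 0 minutes. A small variation, sketched here with a hypothetical helper format_elapsed, keeps the seconds as well by using divmod.

```python
import time


def format_elapsed(seconds):
    """Format a duration in seconds as 'M min S s'."""
    minutes, rest = divmod(int(seconds), 60)
    return "{} min {} s".format(minutes, rest)


start = time.time()
time.sleep(0.1)  # stands in for the work done by the script
print("Code execution took: " + format_elapsed(time.time() - start))
print(format_elapsed(125))  # 2 min 5 s
```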
Conclusion
You can access the table and start working with it through the amplitude_dataframe variable, into which the data was written. It will have about 50 columns, of which in 80% of cases you will use: event_type (event name), event_properties (event parameters), event_time (event time), uuid (client id), and user_properties (client parameters). Those are the ones to start with. When you compare figures from your own calculations with the metrics on Amplitude dashboards, keep in mind that the system uses its own methodology for calculating unique clients, funnels, and so on, so be sure to read the Amplitude documentation before doing so.
Thank you for your attention! Now you can export raw event data from Amplitude and make full use of it in your work.
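As a first exercise with those columns, the sketch below counts events per event_type and pulls one parameter out of event_properties. To stay runnable without an export at hand, it works on a few made-up records shaped like rows of the export (the event names, times, and the price parameter are invented for illustration). With pandas, the first question maps to amplitude_dataframe['event_type'].value_counts().

```python
from collections import Counter

# Made-up records shaped like rows of the export; all values are invented.
events = [
    {"event_type": "open_app", "event_time": "2020-06-27 10:00:00",
     "event_properties": {}},
    {"event_type": "purchase", "event_time": "2020-06-27 10:05:00",
     "event_properties": {"price": 10}},
    {"event_type": "purchase", "event_time": "2020-06-27 11:00:00",
     "event_properties": {"price": 5}},
]

# How many times each event occurred.
counts = Counter(e["event_type"] for e in events)

# event_properties is a nested dictionary; sum one (hypothetical) parameter from it.
revenue = sum(e["event_properties"].get("price", 0)
              for e in events if e["event_type"] == "purchase")

print(counts.most_common())
print(revenue)
```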
Full script:
# Import libraries
import requests
import pandas as pd
import zipfile
import gzip
import os
import time
from tqdm import tqdm
a = time.time()
# Start and end date parameters
startdate = '20200627'
enddate = '20200628'
api_key = 'd988fddd7cfc0a8a'
secret_key = 'da05cf1aeb3a361a61'
# Send the request to Amplitude
response = requests.get('https://amplitude.com/api/2/export?start='+startdate+'T0&end='+enddate+'T0', auth = (api_key, secret_key))
print('1. Request sent')
# Download the archive with the data
filename = 'period_since'+startdate+'to'+enddate+'_amplitude_data'
with open(filename + '.zip', "wb") as code:
    code.write(response.content)
print('2. Archive with files downloaded successfully')
# Extract the files into a folder on the computer
z = zipfile.ZipFile(filename + '.zip', 'r')
z.extractall(path = os.path.join(r'C:\Users\...', filename))
print('3. Archive with files extracted and written to folder ' + filename)
# Convert json to a regular tabular format
directory = os.path.join(r'C:\Users\...', filename, '000000')
files = os.listdir(directory)
amplitude_dataframe = pd.DataFrame()
print('File processing progress:')
time.sleep(1)
for i in tqdm(files):
    with gzip.open(os.path.join(directory, i)) as f:
        add = pd.read_json(f, lines = True)
    amplitude_dataframe = pd.concat([amplitude_dataframe, add])
time.sleep(1)
print('4. JSON files from the archive successfully converted and written to the dataframe')
# Write the resulting table to an Excel file
amplitude_dataframe.to_excel(os.path.join(r'C:\Users\...', filename+'.xlsx'), index=False)
print('5. Dataframe successfully written to file ' + filename)
b = time.time()
diff = b-a
minutes = diff//60
print('Code execution took: {:.0f} minute(s)'.format(minutes))
Source: www.habr.com
