Building a home library with Notion and Python

I have long wondered how best to organize the books in my electronic library. In the end I settled on this setup, which counts pages automatically and has a few other nice touches. If you are interested, read on.

Part 1. Dropbox

All my books live in Dropbox. I split everything into 4 categories: Textbooks ("Учебники"), Reference, Fiction ("Художественное") and Non-fiction ("Нехудожественное"). Reference books, however, do not go into the table.

Most of the books are .epub, the rest are .pdf. So the final solution has to cover both formats somehow.

My book paths look like this:

/Книги/Нехудожественное/Новое/Дизайн/Юрий Гордон/Книга про буквы от А до Я.epub 

If the book is fiction, the topic level (that is, "Дизайн" in the example above) is dropped.
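The folder layout above can be turned into fields mechanically. Here is a minimal sketch, assuming the library root is `/Книги` and the folder levels are category, state, optional topic and author. The function name `describe_book` and the dictionary keys are hypothetical and not part of the original script:

```python
from pathlib import PurePosixPath

# Hypothetical library root, taken from the example path above
BOOKS_ROOT = PurePosixPath("/Книги")


def describe_book(full_path):
    """Split a book path into category, state, topic, author and title."""
    path = PurePosixPath(full_path)
    parts = path.relative_to(BOOKS_ROOT).parts[:-1]  # folders only, no file name

    info = {"title": path.stem, "format": path.suffix.lstrip(".")}
    names = ["category", "state", "topic", "author"]
    if parts and parts[0] == "Художественное":
        # Fiction has no topic level: category / state / author
        names = ["category", "state", "author"]

    info.update(zip(names, parts))
    return info


book = describe_book(
    "/Книги/Нехудожественное/Новое/Дизайн/Юрий Гордон/Книга про буквы от А до Я.epub"
)
print(book["topic"], "|", book["author"], "|", book["title"])
```

`PurePosixPath` is used only so that the example behaves the same on every operating system; it never touches the disk.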

I decided not to bother with the Dropbox API, since I already have the Dropbox app syncing the folder to my machine. So the plan is: take the books from the folder, run each one through the word counter, and add it to Notion.

Part 2. Adding a row

The table itself should look like this. NOTE: it is better to name the columns in Latin letters.


We will use the unofficial Notion API because, at the time of writing, the official one had not been released yet.


Open Notion, press Ctrl + Shift + J, go to Application -> Cookies, copy the value of token_v2 and call it TOKEN. Then open the page that holds the library table and copy its link. Call it NOTION.

Then we write the code that connects to Notion.
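Since token_v2 is a session cookie, it may be safer to keep it out of the script itself. Here is a minimal sketch that reads both values from environment variables. The variable names `NOTION_TOKEN` and `NOTION_URL` and the helper `load_settings` are assumptions for illustration, not something the original script defines:

```python
import os


def load_settings(env=os.environ):
    """Read the Notion token and the table link from environment variables."""
    # NOTION_TOKEN and NOTION_URL are hypothetical variable names
    token = env.get("NOTION_TOKEN", "")
    url = env.get("NOTION_URL", "")
    if not token or not url:
        raise RuntimeError("Set NOTION_TOKEN and NOTION_URL before running")
    return token, url
```

With this in place, `TOKEN, NOTION = load_settings()` at the top of the script gives the two names used in the rest of the code.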

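from notion.client import NotionClient  # unofficial notion-py package
client = NotionClient(token_v2=TOKEN)
database = client.get_collection_view(NOTION)
current_rows = database.default_query().execute()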

Next, let's write a function that adds a row to the table.

def add_row(path, file, words_count, pages_count, hours):
    row = database.collection.add_row()
    row.title = file

    # Drop empty parts produced by leading or trailing slashes
    tags = [tag for tag in path.split("/") if tag]

    if len(tags) >= 1:
        row.what = tags[0]

    if len(tags) >= 2:
        row.state = tags[1]

    if len(tags) >= 3:
        if tags[0] == "Художественное":
            row.author = tags[2]

        elif tags[0] == "Нехудожественное":
            row.tags = tags[2]

        elif tags[0] == "Учебники":
            row.tags = tags[2]

    if len(tags) >= 4:
        row.author = tags[3]

    row.hours = hours
    row.pages = pages_count
    row.words = words_count

Here is what happens. The first line adds a new row to the table. Next, we split the path on "/" to get the tags. Tags are things like "Fiction", "Design", the author's name and so on. Then we fill in all the fields the table needs.

Part 3. Counting words, hours and other fun stuff

This is the harder task. As we remember, there are two formats: epub and pdf. With epub things are straightforward, because the text is almost certainly there. With pdf they are not: the file may consist entirely of embedded page images.

So for PDF the word count is an estimate: take the number of pages and multiply it by a constant (the average number of words per page).

Here it is:

def get_words_count(pages_number):
    return pages_number * WORDS_PER_PAGE

For an A4 page, WORDS_PER_PAGE is roughly 300.

Now let's write the function that counts pages. We will use PyPDF2.

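import os
from PyPDF2 import PdfFileReader  # older PyPDF2 API; 3.x replaces it with PdfReader and len(reader.pages)

def get_pdf_pages_number(path, filename):
    # Close the file once the page count has been read
    with open(os.path.join(path, filename), 'rb') as pdf_file:
        return PdfFileReader(pdf_file).getNumPages()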

Next comes the page counter for epub. We use epub_converter. Here we open the book, convert it to lines, and count the words in each line.

def get_epub_pages_number(path, filename):
    book = open_book(os.path.join(path, filename))
    lines = convert_epub_to_lines(book)
    words_count = 0

    for line in lines:
        # split() with no argument ignores repeated spaces and empty lines
        words_count += len(line.split())

    return round(words_count / WORDS_PER_PAGE)
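The same counting idea can be tried on plain strings without any EPUB library. The sketch below is illustrative only: `estimate_pages` and the sample lines are made up for the example, and it uses `str.split()` with no argument so that double spaces and blank lines do not add phantom words:

```python
WORDS_PER_PAGE = 300  # same rough constant as for PDFs


def estimate_pages(lines):
    """Estimate word and page counts from an iterable of text lines."""
    # str.split() with no argument collapses runs of whitespace
    words_count = sum(len(line.split()) for line in lines)
    return words_count, round(words_count / WORDS_PER_PAGE)


# Hypothetical sample: 100 lines of six words each (note the double space),
# followed by one blank line
sample = ["the quick  brown fox jumps over"] * 100 + [""]
words, pages = estimate_pages(sample)
print(words, pages)  # 600 2
```

Splitting on a single space with `split(" ")` would report 701 words for the same sample, because every double space and the blank line each produce an empty string.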

Now let's compute the time. We take our word count and divide it by the reading speed.

def get_reading_time(words_count):
    # Minutes -> hours, rounded to one decimal place
    return round(words_count / WORDS_PER_MINUTE / 60, 1)
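A quick worked example of the arithmetic may help. The article does not give a value for WORDS_PER_MINUTE, so the 200 words per minute below is an assumption, and the 300-page book is hypothetical:

```python
WORDS_PER_PAGE = 300      # rough A4 figure from the article
WORDS_PER_MINUTE = 200    # assumed reading speed; measure your own


def get_reading_time(words_count):
    """Reading time in hours, rounded to one decimal place."""
    return round(words_count / WORDS_PER_MINUTE / 60, 1)


pages = 300                        # hypothetical book length
words = pages * WORDS_PER_PAGE     # 90000 words
print(get_reading_time(words))     # 7.5
```

So 90,000 words at 200 words per minute is 450 minutes, or 7.5 hours.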

Part 4. Putting all the parts together

We need to walk through every path in our books folder. For each file, check whether the book is already in Notion: if it is, there is no need to create a row.
Otherwise, determine the file type and count the words accordingly. Finally, add the book.

Here is the code we end up with:

for root, subdirs, files in os.walk(BOOKS_DIR):
    if len(files) > 0 and check_for_excusion(root):
        for file in files:
            # Split "name.ext" once, at the last dot
            filename, extension = os.path.splitext(file)
            filetype = extension.lstrip(".").lower()
            local_root = root.replace(BOOKS_DIR, "")

            print("Dir: {}, file: {}".format(local_root, file))

            if not check_for_existence(filename):
                if filetype == "pdf":
                    count = get_pdf_pages_number(root, file)

                elif filetype == "epub":
                    count = get_epub_pages_number(root, file)

                else:
                    # Skip anything that is not a book (e.g. hidden system files)
                    continue

                words_count = get_words_count(count)
                hours = get_reading_time(words_count)
                print("Pages: {}, Words: {}, Hours: {}".format(count, words_count, hours))
                add_row(local_root, filename, words_count, count, hours)
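The loop calls check_for_excusion, which the article never shows. Below is one possible sketch, assuming it returns True for folders that should be processed and False for excluded ones, such as the reference books that stay out of the table. The folder name in `EXCLUDED_DIRS` is a guess and should be replaced with your own:

```python
import os

# Hypothetical folder name for the reference category
EXCLUDED_DIRS = {"Справочники"}


def check_for_excusion(root):
    """Return True if the folder should be processed, False if it is excluded."""
    parts = os.path.normpath(root).split(os.sep)
    return not any(part in EXCLUDED_DIRS for part in parts)
```

Matching on whole path components rather than substrings keeps a folder whose name merely contains an excluded word from being skipped by accident.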

And the function that checks whether a book has already been added looks like this:

def check_for_existence(filename):
    for row in current_rows:
        # An empty title is a substring of every name, so skip blank rows
        if row.title and (row.title in filename or filename in row.title):
            return True

    return False
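Substring matching is forgiving, but it can also produce false positives when one title is contained in another. Here is a sketch of a stricter alternative that compares whole titles, ignoring case and surrounding spaces. The `Row` tuple only stands in for Notion rows, and the function name and sample titles are hypothetical:

```python
from collections import namedtuple

# Stand-in for Notion rows; only the title attribute matters here
Row = namedtuple("Row", "title")
sample_rows = [Row("It"), Row("Книга про буквы от А до Я"), Row("")]


def check_for_existence_exact(filename, rows):
    """Exact, case-insensitive title match instead of a substring match."""
    known = {row.title.strip().lower() for row in rows if row.title}
    return filename.strip().lower() in known


print(check_for_existence_exact("Книга про буквы от А до Я", sample_rows))  # True
print(check_for_existence_exact("Italian Cooking", sample_rows))            # False
```

With substring matching, "Italian Cooking" would count as already added because the title "It" occurs inside it.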

Conclusion

Thanks to everyone who read this article. I hope it helps you read more :)

Source: www.habr.com
