Building a home library with Notion and Python

I have always been interested in the best way to organize the books in my electronic library. In the end I arrived at this setup, with automatic calculation of page counts and other goodies. Anyone interested, read on.

Part 1. Dropbox

All my books live on Dropbox. I split everything into 4 categories: Textbooks (Учебники), Reference, Fiction (Художественное), Non-fiction (Нехудожественное). I do not add reference books to the table, though.

Most of the books are .epub, the rest are .pdf. So the final solution has to handle both formats.

My paths to books look like this:

/Книги/Нехудожественное/Новое/Дизайн/Юрий Гордон/Книга про буквы от А до Я.epub 

If the book is fiction, the section level ("Дизайн", i.e. "Design", in the example above) is dropped.

I decided not to bother with the Dropbox API, since I already have the Dropbox app syncing the folder to disk. So the plan is: take the books from the folder, run each book through a word counter, and add it to Notion.

Part 2. Adding a row

The table itself should look roughly like this, with one column per field the script sets below: what, state, author, tags, hours, pages and words. NOTE: it is better to name the columns in Latin characters.


We will use the unofficial Notion API, because the official one has not been released yet.


Open Notion in the browser, press Ctrl + Shift + J, go to Application -> Cookies, copy the value of token_v2 and call it TOKEN. Then go to the page that holds the library table and copy its link. Call it NOTION.

Then we write the code that connects to Notion.

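from notion.client import NotionClient  # assuming the notion-py package, whose API these calls match

client = NotionClient(token_v2=TOKEN)
database = client.get_collection_view(NOTION)
current_rows = database.default_query().execute()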

Next, let's write a function that adds a row to the table.

def add_row(path, file, words_count, pages_count, hours):
    # Create an empty row in the Notion table and name it after the file.
    row = database.collection.add_row()
    row.title = file

    # Folder names double as tags: category / state / section / author.
    tags = path.split("/")

    if len(tags) >= 1:
        row.what = tags[0]

    if len(tags) >= 2:
        row.state = tags[1]

    if len(tags) >= 3:
        # Fiction has no section level, so the third folder is already the author.
        if tags[0] == "Художественное":
            row.author = tags[2]

        elif tags[0] in ("Нехудожественное", "Учебники"):
            row.tags = tags[2]

    if len(tags) >= 4:
        row.author = tags[3]

    row.hours = hours
    row.pages = pages_count
    row.words = words_count

What is going on here? The first line adds a new row to the table. Next, we split the path on "/" and get the tags. Tags here means things like "Художественное" (Fiction), "Дизайн" (Design), the author's name, and so on. Then we set all the required fields of the row.
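
The mapping from path to columns can be tried out without touching Notion. The sketch below uses a hypothetical helper, parse_book_path, which is not part of the original script: it mirrors the branches of add_row but returns a plain dict. Dropping empty path parts first is an assumption about how the path is passed in (a leading "/" would otherwise shift every tag by one position).

```python
def parse_book_path(path):
    """Map a folder path to the columns add_row fills (hypothetical helper)."""
    # Drop empty parts so a leading or trailing "/" does not shift the tags.
    tags = [part for part in path.split("/") if part]
    fields = {}

    if len(tags) >= 1:
        fields["what"] = tags[0]
    if len(tags) >= 2:
        fields["state"] = tags[1]
    if len(tags) >= 3:
        if tags[0] == "Художественное":
            fields["author"] = tags[2]
        elif tags[0] in ("Нехудожественное", "Учебники"):
            fields["tags"] = tags[2]
    if len(tags) >= 4:
        fields["author"] = tags[3]

    return fields


# The folder part of the example path from Part 1.
print(parse_book_path("/Нехудожественное/Новое/Дизайн/Юрий Гордон"))
```

Keeping the parsing separate from the Notion calls makes it easy to check a new folder layout before any rows are created.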

Part 3. Counting words, hours and other delights

This task is harder. As we remember, we have two formats: epub and pdf. With epub everything is clear, since the words are certainly in there. With pdf it is less clear: the file may consist of nothing but scanned images glued together.

So our function for counting words in a PDF works like this: take the number of pages and multiply it by a constant (the average number of words per page).

Here it is:

def get_words_count(pages_number):
    return pages_number * WORDS_PER_PAGE

WORDS_PER_PAGE for an A4 page is roughly 300.

Now let's write a function that counts pages. We will use PyPDF2.

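import os
from PyPDF2 import PdfFileReader  # PyPDF2 before 3.0; later releases renamed it to PdfReader

def get_pdf_pages_number(path, filename):
    # Open in binary mode and close the file once the page count is read.
    with open(os.path.join(path, filename), 'rb') as f:
        return PdfFileReader(f).getNumPages()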

Next, we write the page counter for epub. We use epub_converter. Here we take the book, convert it to lines, and count the words in each line.

def get_epub_pages_number(path, filename):
    book = open_book(os.path.join(path, filename))
    lines = convert_epub_to_lines(book)
    words_count = 0

    for line in lines:
        # split() with no argument ignores runs of spaces, so empty strings are not counted as words.
        words_count += len(line.split())

    return round(words_count / WORDS_PER_PAGE)
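
For comparison, here is a minimal sketch of the same idea using only the standard library. It is not the author's method: it swaps epub_converter for zipfile and html.parser, relying on the fact that an EPUB is a ZIP archive of (X)HTML files. The names count_epub_words and _TextExtractor are made up for this example, and the count includes every text node, such as page titles.

```python
import io
import zipfile
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an (X)HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def count_epub_words(epub_file):
    """Count whitespace-separated words in every (X)HTML file of an EPUB."""
    total = 0
    with zipfile.ZipFile(epub_file) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            parser = _TextExtractor()
            parser.feed(archive.read(name).decode("utf-8", errors="ignore"))
            parser.close()  # flush any text still buffered by the parser
            total += len(" ".join(parser.chunks).split())
    return total


# Demo on a tiny in-memory archive; a real call would pass a path to an .epub file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("chapter1.xhtml", "<html><body><p>one two  three</p><p>four</p></body></html>")
buffer.seek(0)
print(count_epub_words(buffer))
```

Dividing the result by WORDS_PER_PAGE and rounding gives the same kind of page estimate as get_epub_pages_number.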

Now let's calculate the time. We take our beloved word count and divide it by your reading speed.

def get_reading_time(words_count):
    return round(((words_count / WORDS_PER_MINUTE) / 60) * 10) / 10
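
To make the arithmetic concrete, here is a small worked example. WORDS_PER_PAGE is the article's value; WORDS_PER_MINUTE is not given in the text, so 250 is only an assumed reading speed to be replaced with your own.

```python
WORDS_PER_PAGE = 300      # the article's estimate for an A4 page
WORDS_PER_MINUTE = 250    # assumed reading speed; substitute your own


def get_words_count(pages_number):
    return pages_number * WORDS_PER_PAGE


def get_reading_time(words_count):
    # Words -> minutes -> hours, rounded to one decimal place.
    return round(((words_count / WORDS_PER_MINUTE) / 60) * 10) / 10


words = get_words_count(300)      # 300 pages * 300 words = 90000 words
hours = get_reading_time(words)   # 90000 / 250 = 360 minutes = 6.0 hours
print(words, hours)
```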

Part 4. Connecting all the parts

We need to walk every path inside our books folder. For each file, check whether the book is already in Notion: if it is, there is no need to create a row.
Then we determine the file type and, depending on it, count the words. Finally, we add the book.

Here is the code we end up with:

for root, subdirs, files in os.walk(BOOKS_DIR):
    # check_for_excusion is not shown in the article; presumably it filters out excluded folders.
    if len(files) > 0 and check_for_excusion(root):
        for file in files:
            # Split "name.ext" into the name and the extension without the dot.
            filename, extension = os.path.splitext(file)
            filetype = extension[1:]
            local_root = root.replace(BOOKS_DIR, "")

            print("Dir: {}, file: {}".format(local_root, file))

            if not check_for_existence(filename):
                if filetype == "pdf":
                    count = get_pdf_pages_number(root, file)

                elif filetype == "epub":
                    count = get_epub_pages_number(root, file)

                else:
                    # Neither pdf nor epub: skip it instead of feeding it to the epub parser.
                    continue

                words_count = get_words_count(count)
                hours = get_reading_time(words_count)
                print("Pages: {}, Words: {}, Hours: {}".format(count, words_count, hours))
                add_row(local_root, filename, words_count, count, hours)
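
The walking and file-type detection can be tested separately from Notion. The sketch below is one possible way to isolate that step; iter_books and SUPPORTED are hypothetical names, and the folder layout in the demo is made up. It uses os.path.relpath so that the relative folder does not depend on whether the base directory ends with a separator.

```python
import os
import tempfile

SUPPORTED = {".pdf", ".epub"}


def iter_books(books_dir):
    """Yield (relative_dir, name, extension) for each supported book file."""
    for root, _subdirs, files in os.walk(books_dir):
        relative_dir = os.path.relpath(root, books_dir)
        for file in sorted(files):
            name, extension = os.path.splitext(file)
            if extension.lower() in SUPPORTED:
                yield relative_dir, name, extension.lower()


# Demo on a throwaway folder; the layout is made up for illustration.
with tempfile.TemporaryDirectory() as books_dir:
    shelf = os.path.join(books_dir, "Fiction", "New")
    os.makedirs(shelf)
    for file in ("novel.epub", "scan.PDF", "notes.txt"):
        open(os.path.join(shelf, file), "w").close()
    found = list(iter_books(books_dir))

print(found)
```

Files with other extensions, such as notes.txt here, are simply skipped.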

And the function that checks whether a book has already been added looks like this:

def check_for_existence(filename):
    # A book counts as already added if its file name and a row title contain one another.
    for row in current_rows:
        if row.title in filename or filename in row.title:
            return True

    return False
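
One caveat of substring matching is that an empty string is contained in every string, so a single row with a blank title would make every book look already added. Below is a hedged sketch of a stricter check over a plain list of titles. The name is_known is hypothetical, and ignoring case and surrounding spaces is an extra assumption, not something the original does.

```python
def is_known(filename, titles):
    """Return True if filename and some existing title contain one another."""
    needle = filename.strip().lower()
    for title in titles:
        candidate = title.strip().lower()
        # An empty string is a substring of everything, so skip blank values.
        if not candidate or not needle:
            continue
        if candidate in needle or needle in candidate:
            return True
    return False


titles = ["Книга про буквы от А до Я", ""]
print(is_known("Книга про буквы от А до Я", titles))   # True
print(is_known("Another book", titles))                # False
```

With the Notion rows from above, the list could be built as [row.title for row in current_rows].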

Conclusion

Thanks to everyone who read this article. I hope it helps you read more :)

Source: www.habr.com
