Building a home library with Notion and Python

I have always been interested in the best way to organize the books in my e-library. In the end I settled on the setup described here, with automatic page counting and other goodies. Anyone interested is welcome to read on.

Part 1. Dropbox

All my books live in Dropbox. I split everything into 4 categories: textbooks, reference, fiction and non-fiction. Reference books, however, I do not add to the table.

Most of the books are .epub, the rest are .pdf. So the final solution has to cover both formats.

My paths to books look like this (category, state, genre, author, file):

/Книги/Нехудожественное/Новое/Дизайн/Юрий Гордон/Книга про буквы от А до Я.epub 

If the book is fiction, the genre level (that is, "Дизайн" in the example above) is omitted.

I decided not to bother with the Dropbox API, since I have their desktop app, which keeps the folder in sync. So the plan is this: we take the books from the synced folder, run each book through a word counter, and add it to Notion.

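Part 2. Adding a row

The table needs the columns that the code below fills in: title, what, state, author, tags, hours, pages and words. IMPORTANT: it is better to give the columns Latin names, since the code accesses them as attributes such as row.what.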


We will use the unofficial Notion API, because the official one had not been released yet at the time of writing.


Open Notion in the browser, press Ctrl + Shift + J to open the developer tools, go to Application -> Cookies, copy the value of token_v2 and call it TOKEN. Then open the page that holds the library table and copy its link. We call it NOTION.

Then we write the code that connects to Notion.

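from notion.client import NotionClient  # unofficial notion-py package
client = NotionClient(token_v2=TOKEN)
database = client.get_collection_view(NOTION)
current_rows = database.default_query().execute()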
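
Since token_v2 is a session cookie for the whole account, it may be safer to keep it out of the script itself. A minimal sketch, assuming two hypothetical environment variables, NOTION_TOKEN_V2 and NOTION_LIBRARY_URL (neither name comes from the article or from Notion):

```python
import os


def load_notion_settings(env=None):
    """Return (TOKEN, NOTION) read from environment variables.

    NOTION_TOKEN_V2 and NOTION_LIBRARY_URL are hypothetical names chosen
    for this sketch; they are not defined by Notion or by the article.
    """
    env = os.environ if env is None else env
    names = ("NOTION_TOKEN_V2", "NOTION_LIBRARY_URL")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return env["NOTION_TOKEN_V2"], env["NOTION_LIBRARY_URL"]
```

With this in place, the script could start with TOKEN, NOTION = load_notion_settings() before creating the client.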

Next, let's write a function that adds a row to the table.

def add_row(path, file, words_count, pages_count, hours):
    row = database.collection.add_row()
    row.title = file

    # Strip a leading or trailing "/" so that tags[0] is always the category.
    tags = path.strip("/").split("/")

    if len(tags) >= 1:
        row.what = tags[0]

    if len(tags) >= 2:
        row.state = tags[1]

    if len(tags) >= 3:
        if tags[0] == "Художественное":
            row.author = tags[2]

        elif tags[0] == "Нехудожественное":
            row.tags = tags[2]

        elif tags[0] == "Учебники":
            row.tags = tags[2]

    if len(tags) >= 4:
        row.author = tags[3]

    row.hours = hours
    row.pages = pages_count
    row.words = words_count

What happens here: in the first line of the function we add a new row to the table. Next, we split the path on "/" and get the tags: the category (for example "Художественное", fiction), the genre (for example "Дизайн", design), the author, and so on. Then we set all the fields of the row that we need.
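
The mapping from folders to fields is easier to check when it is separated from the Notion call. The following is a sketch only: parse_book_path is a hypothetical helper that mirrors the branches of add_row and returns a plain dict instead of writing to a row.

```python
FICTION = "Художественное"  # category folder names as used in add_row
WITH_GENRE = ("Нехудожественное", "Учебники")


def parse_book_path(path):
    """Map a relative folder path to the fields that add_row fills in."""
    tags = [part for part in path.split("/") if part]
    fields = {}
    if len(tags) >= 1:
        fields["what"] = tags[0]
    if len(tags) >= 2:
        fields["state"] = tags[1]
    if len(tags) >= 3:
        # Fiction has no genre level, so the third folder is the author.
        if tags[0] == FICTION:
            fields["author"] = tags[2]
        elif tags[0] in WITH_GENRE:
            fields["tags"] = tags[2]
    if len(tags) >= 4:
        fields["author"] = tags[3]
    return fields


print(parse_book_path("/Нехудожественное/Новое/Дизайн/Юрий Гордон"))
# {'what': 'Нехудожественное', 'state': 'Новое', 'tags': 'Дизайн', 'author': 'Юрий Гордон'}
```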

Part 3. Counting words, hours and other goodies

This is a harder task. As we remember, we have two formats: epub and pdf. With epub everything is clear, because the words are actually in the file. With pdf it is less clear: the file may consist of nothing but scanned images glued together.

So for PDF our word-count function will work like this: we take the number of pages and multiply it by a constant (the average number of words per page).

Here it is:

def get_words_count(pages_number):
    return pages_number * WORDS_PER_PAGE

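WORDS_PER_PAGE for an A4 page is roughly 300.

Now let's write a function that counts pages. We will use PyPDF2. Note that PdfFileReader and getNumPages are the names from PyPDF2 before version 3.0; from 3.0 on the equivalents are PdfReader and len(reader.pages).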

import os
from PyPDF2 import PdfFileReader

def get_pdf_pages_number(path, filename):
    with open(os.path.join(path, filename), 'rb') as pdf_file:  # closed automatically
        return PdfFileReader(pdf_file).getNumPages()

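Next, we will write something to count pages in an epub. We use epub_converter (the open_book and convert_epub_to_lines functions below match the utils module of the epub-conversion package on PyPI, so that is presumably the library meant). Here we take the book, convert it into lines, and count the words in each line.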

def get_epub_pages_number(path, filename):
    book = open_book(os.path.join(path, filename))
    lines = convert_epub_to_lines(book)
    words_count = 0

    for line in lines:
        # split() with no argument ignores empty strings from repeated spaces
        words_count += len(line.split())

    return round(words_count / WORDS_PER_PAGE)
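
Splitting on a single space character counts an empty line as one word and every extra space as another, whereas split() with no argument does not. If the lines returned by the converter still contain markup (an assumption, not verified here), the tags would be counted as words too. A rough sketch of a counter that handles both cases; count_words and TAG_RE are hypothetical names:

```python
import re

TAG_RE = re.compile(r"<[^>]+>")  # crude tag matcher, enough for a sketch


def count_words(lines):
    """Count whitespace-separated words, ignoring anything inside <...>."""
    total = 0
    for line in lines:
        text = TAG_RE.sub(" ", line)
        # split() with no argument drops the empty strings that
        # repeated or leading spaces would otherwise produce
        total += len(text.split())
    return total
```

The page estimate would then be round(count_words(lines) / WORDS_PER_PAGE), as in the function above.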

Now let's calculate the time. We take the word count and divide it by your reading speed, WORDS_PER_MINUTE.

def get_reading_time(words_count):
    # Minutes of reading converted to hours, rounded to one decimal place.
    return round(words_count / WORDS_PER_MINUTE / 60, 1)
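
To see how the two constants interact, here is a worked example for a 320-page PDF. WORDS_PER_PAGE is the article's value; WORDS_PER_MINUTE = 200 is only an assumed reading speed, since the article does not state one.

```python
WORDS_PER_PAGE = 300    # the article's estimate for an A4 page
WORDS_PER_MINUTE = 200  # assumed reading speed; replace with your own


def get_words_count(pages_number):
    return pages_number * WORDS_PER_PAGE


def get_reading_time(words_count):
    # minutes -> hours, rounded to one decimal place
    return round(words_count / WORDS_PER_MINUTE / 60, 1)


pages = 320
words = get_words_count(pages)
print(words, get_reading_time(words))  # 96000 8.0
```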

Part 4. Putting it all together

We need to walk through every path in our books folder and check whether the book is already in Notion: if it is, there is no need to create a row. Then we determine the file type and, depending on it, count the words. Finally, we add the book.

This is the code we ended up with:

for root, subdirs, files in os.walk(BOOKS_DIR):
    if len(files) > 0 and check_for_excusion(root):
        for file in files:
            # splitext cuts at the last dot only, so dots inside a title survive
            filename, extension = os.path.splitext(file)
            filetype = extension.lstrip(".").lower()
            local_root = root.replace(BOOKS_DIR, "")

            # Skip books that already have a row in Notion
            if check_for_existence(filename):
                continue

            print("Dir: {}, file: {}".format(local_root, file))

            if filetype == "pdf":
                count = get_pdf_pages_number(root, file)
            elif filetype == "epub":
                count = get_epub_pages_number(root, file)
            else:
                # Neither of the two supported formats (e.g. a system file)
                continue

            words_count = get_words_count(count)
            hours = get_reading_time(words_count)
            print("Pages: {}, Words: {}, Hours: {}".format(count, words_count, hours))
            add_row(local_root, filename, words_count, count, hours)
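
The loop calls check_for_excusion, which the article does not show. Judging by how it is used, it returns True for folders that should be processed. Below is a hypothetical stand-in, assuming the only goal is to skip certain top-level categories (such as the reference books, which are not added to the table). The name is_included, its parameters and the folder names in any call are assumptions, and the real function may work differently.

```python
import os


def is_included(root, books_dir, excluded_categories):
    """Return True if `root` is not inside an excluded top-level category.

    Hypothetical stand-in for the article's check_for_excusion, whose
    body is not shown; the real function may work differently.
    """
    relative = os.path.relpath(root, books_dir)
    top_level = relative.split(os.sep)[0]
    return top_level not in excluded_categories
```

Unlike check_for_excusion(root), this sketch takes the base folder and the excluded names explicitly, which keeps it free of global state and easy to test.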

And the function that checks whether a book has already been added looks like this:

def check_for_existence(filename):
    # A book counts as present if its name contains a row title or vice versa.
    return any(
        row.title in filename or filename in row.title
        for row in current_rows
    )
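
The substring test above is deliberately loose: a file called "Python" is treated as present if any row title merely contains that word. If the row titles always come from file names, as they do in add_row, an exact comparison on normalized titles may be enough. A sketch under that assumption; normalize_title and build_title_index are hypothetical helpers:

```python
import unicodedata


def normalize_title(title):
    """Normalize Unicode form, case and surrounding whitespace."""
    return unicodedata.normalize("NFC", title).strip().casefold()


def build_title_index(titles):
    """Build a set so that each lookup avoids scanning every row."""
    return {normalize_title(title) for title in titles}


existing = build_title_index(["Книга про буквы от А до Я"])
print(normalize_title(" книга про буквы от а до я ") in existing)  # True
```

In the script the index could be built once with build_title_index(row.title for row in current_rows). The NFC step matters when file names arrive in decomposed Unicode form, which some file systems produce.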

Conclusion

Thanks to everyone who read this article. I hope it helps you read more :)

Source: www.habr.com
