Building a home library with Notion and Python

I have always been interested in the best way to organize the books in my electronic library. In the end I settled on this setup, which automatically calculates page counts and a few other nice extras. Details follow for anyone interested.

Part 1. Dropbox

All my books live in Dropbox. Everything is sorted into 4 categories: Textbooks, Reference, Fiction, Non-fiction. I do not add the reference books to the table, though.

Most of the books are .epub, the rest are .pdf. So the final solution has to handle both formats somehow.

My book paths look like this:

/Книги/Нехудожественное/Новое/Дизайн/Юрий Гордон/Книга про буквы от А до Я.epub 

If the book is fiction, the topic folder ("Дизайн", that is "Design", in the example above) is omitted.

I decided not to bother with the Dropbox API, since I already have their app, which syncs the folder to disk. So the plan is: take the books from the folder, run each one through a word counter, and add it to Notion.

Part 2. Adding a row

The table needs one column for each property the code sets: title, what, state, tags, author, pages, words and hours. NOTE: it is better to give the columns Latin-script names.

I will use the unofficial Notion API, because at the time of writing the official one had not been released yet.

Open Notion, press Ctrl + Shift + J, go to Application -> Cookies, copy the value of token_v2 and call it TOKEN. Then open the page that holds the library table and copy its link. Call it NOTION.

Then we write the code that connects to Notion.

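from notion.client import NotionClient  # notion-py, the unofficial client

client = NotionClient(token_v2=TOKEN)
database = client.get_collection_view(NOTION)
current_rows = database.default_query().execute()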

Next, let's write a function that adds a row to the table.

def add_row(path, file, words_count, pages_count, hours):
    # Create an empty row in the Notion table and use the file name as its title.
    row = database.collection.add_row()
    row.title = file

    # Folder names become tags; empty pieces (e.g. from a leading "/") are dropped.
    tags = [tag for tag in path.split("/") if tag]

    if len(tags) >= 1:
        row.what = tags[0]

    if len(tags) >= 2:
        row.state = tags[1]

    if len(tags) >= 3:
        if tags[0] == "Художественное":
            # Fiction has no topic folder, so the third level is the author.
            row.author = tags[2]

        elif tags[0] in ("Нехудожественное", "Учебники"):
            row.tags = tags[2]

    if len(tags) >= 4:
        row.author = tags[3]

    row.hours = hours
    row.pages = pages_count
    row.words = words_count

Here is what happens. The first line adds a new row to the table. Next, we split the path on "/" to get the tags: the kind of book ("Fiction"), the topic ("Design"), the author, and so on. Then we fill in all the required fields of the row.
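
The tag logic can be tried out without touching Notion. Below is a minimal sketch, assuming a hypothetical helper `parse_tags` (not part of the original script) that returns a plain dict instead of writing to a row. It follows the same folder rules as `add_row`.

```python
def parse_tags(path):
    """Map a library-relative path to table fields (hypothetical helper)."""
    # Drop empty pieces so a leading or trailing "/" does not shift the tags.
    tags = [part for part in path.split("/") if part]
    fields = {}

    if len(tags) >= 1:
        fields["what"] = tags[0]
    if len(tags) >= 2:
        fields["state"] = tags[1]
    if len(tags) >= 3:
        if tags[0] == "Художественное":
            # Fiction has no topic folder: the third level is the author.
            fields["author"] = tags[2]
        elif tags[0] in ("Нехудожественное", "Учебники"):
            fields["tags"] = tags[2]
    if len(tags) >= 4:
        fields["author"] = tags[3]
    return fields


print(parse_tags("/Нехудожественное/Новое/Дизайн/Юрий Гордон"))
```

For the non-fiction path this should yield the category, the state, the topic and the author. A fiction path with three levels yields the author directly.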

Part 3. Counting words, hours and other delights

This task is harder. As we remember, there are two formats: epub and pdf. With epub things are clear, since the text is almost certainly there. With pdf it is less clear: the file may consist of nothing but scanned images.

So our word-counting function for PDF will work like this: take the number of pages and multiply it by a constant (the average number of words per page).

Here it is:

def get_words_count(pages_number):
    return pages_number * WORDS_PER_PAGE

WORDS_PER_PAGE for an A4 page is roughly 300.
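
As a quick worked example of the estimate, here is a self-contained sketch. The 250-page book is a made-up input, and the constant is the approximate A4 figure mentioned above.

```python
WORDS_PER_PAGE = 300  # rough average for an A4 page


def get_words_count(pages_number):
    # Estimate only: scanned PDFs give us a page count, not the actual text.
    return pages_number * WORDS_PER_PAGE


# A hypothetical 250-page scanned PDF:
print(get_words_count(250))  # 75000
```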

Now let's write a function that counts the pages. We will use PyPDF2.

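import os
from PyPDF2 import PdfFileReader  # PyPDF2 < 3.0; newer releases use PdfReader and len(reader.pages)

def get_pdf_pages_number(path, filename):
    with open(os.path.join(path, filename), 'rb') as pdf_file:  # closed after counting
        return PdfFileReader(pdf_file).getNumPages()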

Next, we write the counterpart that counts pages in an epub. We use epub_converter. Here we take the book, convert it into lines, and count the words in each line.

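# open_book and convert_epub_to_lines are assumed to come from the epub_conversion package.
from epub_conversion.utils import open_book, convert_epub_to_lines

def get_epub_pages_number(path, filename):
    book = open_book(os.path.join(path, filename))
    lines = convert_epub_to_lines(book)
    words_count = 0

    for line in lines:
        # split() without arguments ignores repeated spaces and empty lines.
        words_count += len(line.split())

    return round(words_count / WORDS_PER_PAGE)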
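
One detail worth checking is how each line is split into words. `line.split(" ")` produces empty strings for repeated spaces and for empty lines, while `line.split()` with no argument splits on any run of whitespace. A small sketch with made-up sample lines and a hypothetical helper `count_words`:

```python
def count_words(lines):
    """Count whitespace-separated words across an iterable of text lines."""
    return sum(len(line.split()) for line in lines)


sample = ["Call me  Ishmael.", "", "Some years ago"]

# split(" ") keeps the empty strings produced by double spaces and empty lines.
naive = sum(len(line.split(" ")) for line in sample)

print(count_words(sample))  # 6
print(naive)                # 8
```

On real books the difference is small compared with the error of the 300-words-per-page estimate, but it costs nothing to avoid.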

Now let's calculate the time. We take the resulting word count and divide it by your reading speed.

def get_reading_time(words_count):
    return round(((words_count / WORDS_PER_MINUTE) / 60) * 10) / 10
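
To make the arithmetic concrete, here is the same function as a runnable sketch. The value of `WORDS_PER_MINUTE` is an assumption for illustration (the article does not state one), and the 100,000-word book is made up.

```python
WORDS_PER_MINUTE = 250  # assumed reading speed; substitute your own


def get_reading_time(words_count):
    # words -> minutes -> hours, rounded to one decimal place
    return round(((words_count / WORDS_PER_MINUTE) / 60) * 10) / 10


# 100000 words / 250 wpm = 400 minutes, i.e. about 6.7 hours
print(get_reading_time(100_000))  # 6.7
```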

Part 4. Putting all the pieces together

We need to walk every path inside the books folder and check whether each book is already in Notion: if it is, there is no need to create a row.
Then we determine the file type and, depending on it, count the words. Finally, we add the book.

This is the code we end up with:

for root, subdirs, files in os.walk(BOOKS_DIR):
    # check_for_excusion is not shown in this article; it decides whether a folder is processed.
    if len(files) > 0 and check_for_excusion(root):
        for file in files:
            # Split "Title.epub" into the title and the extension without the dot.
            filename, extension = os.path.splitext(file)
            filetype = extension[1:].lower()
            local_root = root.replace(BOOKS_DIR, "")

            # Ignore anything that is not a book, e.g. hidden system files.
            if filetype not in ("pdf", "epub"):
                continue

            # Only books without a row in the table are processed.
            if not check_for_existence(filename):
                print("Dir: {}, file: {}".format(local_root, file))

                if filetype == "pdf":
                    count = get_pdf_pages_number(root, file)

                else:
                    count = get_epub_pages_number(root, file)

                words_count = get_words_count(count)
                hours = get_reading_time(words_count)
                print("Pages: {}, Words: {}, Hours: {}".format(count, words_count, hours))
                add_row(local_root, filename, words_count, count, hours)
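
Splitting a file name into a title and a type is easy to check in isolation. Below is a minimal sketch around `os.path.splitext`, which only splits off the last suffix. The helper name `split_book_name` is hypothetical, and the second and third file names are made up.

```python
import os


def split_book_name(file):
    """Return (title, filetype) for a file name; filetype is lower-case, no dot."""
    title, extension = os.path.splitext(file)
    return title, extension[1:].lower()


print(split_book_name("Книга про буквы от А до Я.epub"))
print(split_book_name("Vol. 2.PDF"))
print(split_book_name(".DS_Store"))
```

Dots inside the title survive, upper-case extensions are normalized, and a hidden file such as `.DS_Store` comes back with an empty file type, which makes it easy to skip.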

The function that checks whether a book has already been added looks like this:

def check_for_existence(filename):
    for row in current_rows:
        if row.title in filename:
            return True

        elif filename in row.title:
            return True

    return False
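
The same substring check can be exercised on a plain list of titles. The sketch below uses a hypothetical helper `title_exists` that takes the titles as an argument instead of reading Notion rows. Skipping empty titles is an added assumption: an empty string is a substring of every file name, so a blank row would otherwise make every book look like a duplicate.

```python
def title_exists(filename, titles):
    """Return True if filename and any existing title contain one another."""
    for title in titles:
        if not title:
            # An empty title is a substring of everything; skip it.
            continue
        if title in filename or filename in title:
            return True
    return False


existing = ["Книга про буквы от А до Я", ""]

print(title_exists("Книга про буквы от А до Я", existing))  # True
print(title_exists("Another book", existing))               # False
```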

Conclusion

Thanks to everyone who read this article. I hope it helps you read more :)

Source: www.habr.com
