Creating a home library with Notion and Python

I've always wondered how best to organize the books in my electronic library. In the end I settled on this setup, with automatic page counting and other goodies. Anyone interested is welcome to read on below the cut.

Part 1. Dropbox

All my books live in Dropbox. I've split everything into 4 categories: textbooks, reference, fiction, and non-fiction. I don't add reference books to the table, though.

Most of the books are .epub, the rest are .pdf. So the final solution has to cover both formats somehow.

My paths to the books look roughly like this:

/Книги/Нехудожественное/Новое/Дизайн/Юрий Гордон/Книга про буквы от А до Я.epub

If the book is fiction, the category level ("Дизайн", i.e. "Design", in the example above) is omitted.

I decided not to bother with the Dropbox API, since I already have their app syncing the folder. So the plan is this: we take the books from the folder, run each one through a word counter, and add it to Notion.

Part 2. Adding a row

The table itself should look roughly like this: columns for the title, category (what), state, author, tags, hours, pages, and words. Note: it's best to give the columns Latin names.

We'll use the unofficial Notion API, since the official one has not been released yet.

Open Notion, press Ctrl + Shift + J, go to Application -> Cookies, copy the value of token_v2 and call it TOKEN. Then open the page that holds the library table and copy its link. Call it NOTION.

Next, we write the code that connects to Notion.

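from notion.client import NotionClient  # unofficial notion-py package
client = NotionClient(token_v2=TOKEN)
database = client.get_collection_view(NOTION)
current_rows = database.default_query().execute()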

Then we write a function that adds a row to the table.

def add_row(path, file, words_count, pages_count, hours):
    row = database.collection.add_row()
    row.title = file

    # Drop empty parts left by leading or trailing slashes in the path.
    tags = [tag for tag in path.split("/") if tag]

    if len(tags) >= 1:
        row.what = tags[0]

    if len(tags) >= 2:
        row.state = tags[1]

    if len(tags) >= 3:
        if tags[0] == "Художественное":
            row.author = tags[2]

        elif tags[0] == "Нехудожественное":
            row.tags = tags[2]

        elif tags[0] == "Учебники":
            row.tags = tags[2]

    if len(tags) >= 4:
        row.author = tags[3]

    row.hours = hours
    row.pages = pages_count
    row.words = words_count

What happens here: the first line adds a new row to the table. Next, we split the path on "/" to get the tags: things like "Fiction" or "Design", who the author is, and so on. Then we fill in all the required fields of the row.
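The path-to-fields mapping can be tried out without touching Notion. The sketch below is only an illustration: `parse_book_path` is a hypothetical helper that is not part of the original script. It follows the same branching as add_row but returns a plain dict instead of writing to a row, and it additionally drops empty path parts.

```python
def parse_book_path(path):
    """Map a folder path to table fields (hypothetical helper, mirrors add_row)."""
    # Drop empty parts produced by leading or trailing slashes.
    tags = [tag for tag in path.split("/") if tag]
    fields = {}

    if len(tags) >= 1:
        fields["what"] = tags[0]   # top-level category
    if len(tags) >= 2:
        fields["state"] = tags[1]  # e.g. "Новое" (new)
    if len(tags) >= 3:
        if tags[0] == "Художественное":
            # Fiction has no subject level, so the third part is the author.
            fields["author"] = tags[2]
        elif tags[0] in ("Нехудожественное", "Учебники"):
            fields["tags"] = tags[2]
    if len(tags) >= 4:
        fields["author"] = tags[3]

    return fields


print(parse_book_path("/Нехудожественное/Новое/Дизайн/Юрий Гордон"))
```

Keeping the parsing separate from the Notion call makes it easy to check a few paths by hand before any rows are created.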

Part 3. Counting words, hours, and other delights

This task is harder. As you'll recall, we have two formats: epub and pdf. With epub everything is clear, since the words are definitely in there. With PDF it's less clear: the file may simply consist of pasted-in images.

So our word-counting function for PDF works like this: we take the number of pages and multiply it by a constant (the average number of words per page).

Here it is:

def get_words_count(pages_number):
    return pages_number * WORDS_PER_PAGE

WORDS_PER_PAGE for an A4 page is roughly 300.

Now let's write a function that counts pages. We'll use PyPDF2.

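import os
from PyPDF2 import PdfFileReader  # PyPDF2 < 3.0; newer releases use PdfReader and len(reader.pages)

def get_pdf_pages_number(path, filename):
    with open(os.path.join(path, filename), 'rb') as pdf_file:
        return PdfFileReader(pdf_file).getNumPages()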

Next, we write a function that counts pages in an EPUB. We use epub_conversion. Here we take the book, convert it to lines, and count the words in each line.

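from epub_conversion.utils import open_book, convert_epub_to_lines

def get_epub_pages_number(path, filename):
    book = open_book(os.path.join(path, filename))
    lines = convert_epub_to_lines(book)
    words_count = 0

    for line in lines:
        # split() without arguments ignores repeated spaces and empty lines.
        words_count += len(line.split())

    return round(words_count / WORDS_PER_PAGE)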

Now let's calculate the time. We take our beloved word count, divide it by your reading speed in words per minute (WORDS_PER_MINUTE), and convert the result to hours.

def get_reading_time(words_count):
    return round(((words_count / WORDS_PER_MINUTE) / 60) * 10) / 10
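To make the arithmetic concrete, here is a small worked example. WORDS_PER_MINUTE = 200 is an assumed value, since the article does not state the one it uses; substitute your own reading speed.

```python
WORDS_PER_PAGE = 300    # from the article: roughly one A4 page
WORDS_PER_MINUTE = 200  # assumption: replace with your own reading speed


def get_words_count(pages_number):
    return pages_number * WORDS_PER_PAGE


def get_reading_time(words_count):
    # words -> minutes -> hours, rounded to one decimal place
    return round(((words_count / WORDS_PER_MINUTE) / 60) * 10) / 10


words = get_words_count(240)     # 240 pages * 300 words = 72000 words
hours = get_reading_time(words)  # 72000 / 200 = 360 minutes = 6.0 hours
print(words, hours)
```

Under these assumptions a 240-page PDF comes out at about six hours of reading.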

Part 4. Putting all the pieces together

We need to walk every path in our books folder and check whether each book already exists in Notion: if it does, there's no need to create a row.
Then we determine the file type and count the words accordingly. Finally, we add the book.

This is the code we end up with:

for root, subdirs, files in os.walk(BOOKS_DIR):
    if len(files) > 0 and check_for_excusion(root):
        for file in files:
            # Split "Book.epub" into the name and the lowercase extension.
            filename, extension = os.path.splitext(file)
            filetype = extension.lstrip(".").lower()
            local_root = root.replace(BOOKS_DIR, "")

            if not check_for_existence(filename):
                print("Dir: {}, file: {}".format(local_root, file))

                if filetype == "pdf":
                    count = get_pdf_pages_number(root, file)

                elif filetype == "epub":
                    count = get_epub_pages_number(root, file)

                else:
                    # Skip anything that is neither a PDF nor an EPUB.
                    continue

                words_count = get_words_count(count)
                hours = get_reading_time(words_count)
                print("Pages: {}, Words: {}, Hours: {}".format(count, words_count, hours))
                add_row(local_root, filename, words_count, count, hours)
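The loop derives the category path with root.replace(BOOKS_DIR, ""), which leaves a leading separator unless BOOKS_DIR ends with one. A possible alternative, shown here only as a sketch, is os.path.relpath. The helper name `iter_book_files` and the temporary folder layout are made up for the demonstration.

```python
import os
import tempfile


def iter_book_files(books_dir):
    """Yield (relative_dir, file_name) for every file under books_dir."""
    for root, _subdirs, files in os.walk(books_dir):
        relative_dir = os.path.relpath(root, books_dir)
        if relative_dir == ".":
            relative_dir = ""  # files lying directly in books_dir
        for file_name in sorted(files):
            # Normalize separators so the result can be split on "/".
            yield relative_dir.replace(os.sep, "/"), file_name


# Build a tiny throwaway library to show the result.
with tempfile.TemporaryDirectory() as books_dir:
    author_dir = os.path.join(books_dir, "Fiction", "New", "Author")
    os.makedirs(author_dir)
    open(os.path.join(author_dir, "Book.epub"), "w").close()
    found = list(iter_book_files(books_dir))

print(found)
```

The relative directory has no leading slash, so splitting it on "/" yields the category, state, and author directly.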

And the function that checks whether a book has already been added looks like this:

def check_for_existence(filename):
    for row in current_rows:
        if row.title in filename:
            return True

        elif filename in row.title:
            return True

    return False
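Substring matching is forgiving, but it has two edge cases worth knowing: a row with an empty title matches every file (an empty string is contained in any string), and a short title such as "It" matches any file name that contains those letters. A stricter variant, sketched below, compares normalized titles exactly. This swaps in a different technique from the one above; `normalize` and `title_exists` are hypothetical names, and the titles are passed in as plain strings rather than Notion rows.

```python
def normalize(title):
    # Lowercase and collapse runs of whitespace so small differences don't matter.
    return " ".join(title.lower().split())


def title_exists(existing_titles, filename):
    """Return True if filename equals one of the titles after normalizing."""
    # Ignore blank titles so an empty row cannot match anything.
    known = {normalize(title) for title in existing_titles if title.strip()}
    return normalize(filename) in known


titles = ["War and Peace", "", "It"]
print(title_exists(titles, "war  and peace"))   # True
print(title_exists(titles, "Italian Cooking"))  # False
```

The trade-off is that exact matching will not recognize a book whose file name differs from the stored title by more than case or spacing.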

Conclusion

Thanks to everyone who read this article. I hope it helps you read more :)

Source: www.habr.com