For a long time I have been looking for the best way to organize the books in my electronic library. In the end I settled on this setup, with automatic page counting and a few other goodies. Everyone who is interested is welcome under the cut.
Part 1. Dropbox
All my books live in Dropbox. I sort everything into four categories: Textbooks, Reference, Fiction and Non-fiction. I do not add reference books to the table, though.
Most of the books are .epub, the rest are .pdf. So the final solution has to handle both formats somehow.
My book paths look like this:
/Книги/Нехудожественное/Новое/Дизайн/Юрий Гордон/Книга про буквы от А до Я.epub
If the book is fiction, the category level ("Дизайн", i.e. Design, in the path above) is omitted.
I decided not to bother with the Dropbox API, since I already have their application syncing the folder. So the plan is: take the books from the folder, run each book through a word counter, and add it to Notion.
Part 2. Adding a row
The table itself needs a column for every property the script fills in: title, what, state, author, tags, hours, pages and words. ATTENTION: it is better to name the columns in Latin characters.
We will use the unofficial Notion API, because the official one has not been released yet.
Go to Notion, press Ctrl + Shift + J, open Application -> Cookies, copy token_v2 and call it TOKEN. Then go to the page that holds the library table and copy its link. Call it NOTION.
Then we write the code that connects to Notion.
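from notion.client import NotionClient

client = NotionClient(token_v2=TOKEN)  # TOKEN is the token_v2 cookie value
database = client.get_collection_view(NOTION)  # NOTION is the link to the table
current_rows = database.default_query().execute()  # rows already in the table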
Next, let's write a function that adds a row to the table.
def add_row(path, file, words_count, pages_count, hours):
    row = database.collection.add_row()
    row.title = file

    # Drop empty segments so a leading or trailing "/" does not shift the fields
    tags = [tag for tag in path.split("/") if tag]

    if len(tags) >= 1:
        row.what = tags[0]   # top-level category

    if len(tags) >= 2:
        row.state = tags[1]  # e.g. "Новое"

    if len(tags) >= 3:
        if tags[0] == "Художественное":
            # Fiction has no category level, so the third segment is the author
            row.author = tags[2]
        elif tags[0] == "Нехудожественное":
            row.tags = tags[2]
        elif tags[0] == "Учебники":
            row.tags = tags[2]

    if len(tags) >= 4:
        row.author = tags[3]

    row.hours = hours
    row.pages = pages_count
    row.words = words_count
What is happening here? In the first line we add a new row to the table. Then we split the path on "/" to get the tags: the top-level category such as "Художественное" (Fiction), the subject such as "Дизайн" (Design), the author, and so on. Finally we fill in all the fields of the table that we need.
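The path-to-fields mapping can be tried on its own, without touching Notion. The sketch below is only an illustration: parse_tags is a hypothetical helper that is not part of the original script, and it returns a plain dict instead of writing to a row. It assumes the path is relative to the library root, and the author name in the usage note is a placeholder.

```python
def parse_tags(path):
    """Map a library-relative directory path to table fields (hypothetical helper)."""
    # Ignore empty segments produced by a leading or trailing slash
    tags = [part for part in path.split("/") if part]
    fields = {}
    if len(tags) >= 1:
        fields["what"] = tags[0]
    if len(tags) >= 2:
        fields["state"] = tags[1]
    if len(tags) >= 3:
        if tags[0] == "Художественное":
            # Fiction paths skip the category level: third segment is the author
            fields["author"] = tags[2]
        elif tags[0] in ("Нехудожественное", "Учебники"):
            fields["tags"] = tags[2]
    if len(tags) >= 4:
        fields["author"] = tags[3]
    return fields
```

For the example path from Part 1, parse_tags("/Нехудожественное/Новое/Дизайн/Юрий Гордон") fills what, state, tags and author, while a fiction path such as "/Художественное/Новое/Автор" fills only what, state and author.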
Part 3. Counting words, hours and other delights
This is a much harder task. As we remember, we have two formats: epub and pdf. With epub everything is clear, since the words are definitely there. With pdf nothing is clear: the file may consist of nothing but pasted-in images.
So our function for counting words in a PDF works like this: take the number of pages and multiply it by a constant (the average number of words per page).
Here it is:
def get_words_count(pages_number):
    # Estimate only: pages times the average number of words per page
    return pages_number * WORDS_PER_PAGE
For an A4 page, WORDS_PER_PAGE is roughly 300.
Now let's write a function that counts the pages of a PDF, using PdfFileReader:
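import os
from PyPDF2 import PdfFileReader  # PyPDF2 < 3.0 naming

def get_pdf_pages_number(path, filename):
    with open(os.path.join(path, filename), 'rb') as pdf_file:  # close the file when done
        return PdfFileReader(pdf_file).getNumPages()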
Next, we will write the function that counts pages in an EPUB, using open_book and convert_epub_to_lines:
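from epub_conversion.utils import open_book, convert_epub_to_lines

def get_epub_pages_number(path, filename):
    book = open_book(os.path.join(path, filename))
    lines = convert_epub_to_lines(book)
    words_count = 0
    for line in lines:
        # split() without arguments ignores repeated spaces and empty lines
        words_count += len(line.split())
    # Convert words to "virtual" pages so both formats share the same pipeline
    return round(words_count / WORDS_PER_PAGE)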
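The word counting itself can be checked on plain strings, with no EPUB involved. This is a minimal sketch under stated assumptions: count_words and estimate_pages are hypothetical names that do not appear in the original script, and the value 300 is the article's rough figure for an A4 page.

```python
WORDS_PER_PAGE = 300  # the article's rough figure for an A4 page

def count_words(lines):
    """Count whitespace-separated words across an iterable of text lines."""
    # str.split() with no argument collapses runs of whitespace and
    # returns an empty list for an empty line
    return sum(len(line.split()) for line in lines)

def estimate_pages(words_count):
    """Convert a word count into an approximate page count."""
    return round(words_count / WORDS_PER_PAGE)

sample = ["Hello  world", "", "one two three"]
print(count_words(sample))                         # words counted with split()
print(sum(len(line.split(" ")) for line in sample))  # same lines with split(" ")
```

Splitting on a literal " " also counts the empty strings between double spaces and on blank lines, so it reports more words than the text contains. That is why the sketch uses split() with no argument.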
Now let's compute the reading time. We take the word count and divide it by your reading speed.
def get_reading_time(words_count):
    # Minutes -> hours, rounded to one decimal place
    return round(((words_count / WORDS_PER_MINUTE) / 60) * 10) / 10
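To see how the two estimates chain together, here is a small worked example. WORDS_PER_PAGE comes from the article. The value 200 for WORDS_PER_MINUTE is purely an assumption for illustration, so substitute your own reading speed.

```python
WORDS_PER_PAGE = 300    # from the article: rough figure for an A4 page
WORDS_PER_MINUTE = 200  # assumption for this example only

def get_words_count(pages_number):
    return pages_number * WORDS_PER_PAGE

def get_reading_time(words_count):
    # Minutes -> hours, rounded to one decimal place
    return round(((words_count / WORDS_PER_MINUTE) / 60) * 10) / 10

pages = 300
words = get_words_count(pages)   # 300 * 300 = 90000 words
hours = get_reading_time(words)  # 90000 / 200 = 450 minutes = 7.5 hours
print("Pages: {}, Words: {}, Hours: {}".format(pages, words, hours))
```

Under these assumptions a 300-page book comes out at 90000 words and 7.5 hours of reading.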
Part 4. Putting all the parts together
We need to walk every path in our books folder and check whether each book is already in Notion: if it is, there is no need to create a row.
Then we determine the file type and, depending on it, count the number of words. Finally, we add the book to the table.
This is the code we end up with:
for root, subdirs, files in os.walk(BOOKS_DIR):
    if len(files) > 0 and check_for_excusion(root):
        for file in files:
            # Split "name.ext" into the name and the extension without the dot
            filename, extension = os.path.splitext(file)
            filetype = extension[1:].lower()
            # Path relative to the library root; its segments become the tags
            local_root = root.replace(BOOKS_DIR, "")

            if not check_for_existence(filename):
                print("Dir: {}, file: {}".format(local_root, file))

                if filetype == "pdf":
                    count = get_pdf_pages_number(root, file)
                elif filetype == "epub":
                    count = get_epub_pages_number(root, file)
                else:
                    continue  # skip files that are neither pdf nor epub

                words_count = get_words_count(count)
                hours = get_reading_time(words_count)
                print("Pages: {}, Words: {}, Hours: {}".format(count, words_count, hours))
                add_row(local_root, filename, words_count, count, hours)
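The traversal can also be separated from the Notion calls, which makes it easy to dry-run on any folder. The sketch below is one possible arrangement, not the author's code: iter_books and BOOK_TYPES are hypothetical names. It uses os.path.relpath instead of string replacement, so the relative directory never starts with a slash.

```python
import os

BOOK_TYPES = {"pdf", "epub"}  # the two formats the article handles

def iter_books(books_dir):
    """Yield (relative_dir, filename, filetype) for every book under books_dir."""
    for root, _subdirs, files in os.walk(books_dir):
        relative_dir = os.path.relpath(root, books_dir)
        if relative_dir == ".":
            relative_dir = ""  # files directly in the library root
        # Normalize separators so the result splits on "/" on every OS
        relative_dir = relative_dir.replace(os.sep, "/")
        for file in sorted(files):
            filename, extension = os.path.splitext(file)
            filetype = extension[1:].lower()
            if filetype in BOOK_TYPES:
                yield relative_dir, filename, filetype
```

Looping over iter_books(BOOKS_DIR) and printing each tuple shows which rows would be created before anything is written to the table.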
And the function that checks whether a book has already been added looks like this:
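def check_for_existence(filename):
    for row in current_rows:
        if not row.title:
            continue  # an empty title is a substring of every filename
        # Match in either direction, because the title and the filename can differ slightly
        if row.title in filename or filename in row.title:
            return True
    return False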
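Substring matching is forgiving, but it can also report a match for a short filename that merely appears inside a longer title. The sketch below works on a plain list of titles so that both behaviours can be compared. matches_loosely mirrors the check above, while matches_exactly is a stricter alternative that I am assuming here, not something from the original script.

```python
def matches_loosely(filename, titles):
    """Substring match in either direction, as in the check above."""
    for title in titles:
        if not title:
            continue  # skip empty titles, they would match anything
        if title in filename or filename in title:
            return True
    return False

def matches_exactly(filename, titles):
    """Stricter variant (an assumption): case-insensitive equality."""
    wanted = filename.strip().lower()
    return any(title.strip().lower() == wanted for title in titles)

titles = ["Книга про буквы от А до Я", ""]
print(matches_loosely("Книга", titles))  # the short name is inside the title
print(matches_exactly("Книга", titles))  # no title equals the short name
```

Which variant fits better depends on how closely the file names follow the titles already in the table.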
Conclusion
Thanks to everyone who read this article. I hope it helps you read more :)
Source: www.habr.com