Release of the Tesseract 5.2 text recognition system

The Tesseract 5.2 optical text recognition system has been released. It supports recognition of UTF-8 characters and text in more than 100 languages, including Russian, Kazakh, Belarusian and Ukrainian. Results can be saved as plain text or in the HTML (hOCR), ALTO (XML), PDF and TSV formats. The system was originally created in 1985-1995 at a Hewlett Packard laboratory; in 2005 the code was opened under the Apache license and development continued with the participation of Google employees. The project's source code is distributed under the Apache 2.0 license.

Tesseract includes a console utility and the libtesseract library for embedding OCR functionality into other applications. Third-party GUIs that support Tesseract include gImageReader, VietOCR and YAGF. Two recognition engines are provided: a classic one that recognizes text at the level of individual character patterns, and a newer one based on a machine learning system built on an LSTM recurrent neural network, which is optimized for recognizing whole lines and gives a significant increase in accuracy. Ready-made trained models have been published for 123 languages. To improve performance, modules that use OpenMP and the AVX2, AVX, AVX512F, NEON or SSE4.1 SIMD instructions are provided.
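As a rough illustration of the console utility described above, the following Python sketch assembles a `tesseract` command line that selects a language, one of the two engines and several output formats. The helper name `build_tesseract_command` and the file names are hypothetical. The sketch assumes the usual `tesseract imagename outputbase [options] [configfile...]` form, with `--oem 0` selecting the classic engine and `--oem 1` the LSTM engine, and it assumes the `txt`, `hocr`, `alto`, `pdf` and `tsv` output configs are installed with the language data. It only builds the argument list and does not run anything.

```python
from typing import List, Sequence

# Output config names assumed to be installed alongside the language data.
KNOWN_FORMATS = ("txt", "hocr", "alto", "pdf", "tsv")

# Engine modes as passed to the --oem option (0 = classic, 1 = LSTM).
ENGINE_MODES = {"legacy": 0, "lstm": 1}


def build_tesseract_command(
    image: str,
    output_base: str,
    lang: str = "eng",
    engine: str = "lstm",
    formats: Sequence[str] = ("txt",),
) -> List[str]:
    """Return an argv list for the tesseract utility (hypothetical helper)."""
    if engine not in ENGINE_MODES:
        raise ValueError(f"unknown engine: {engine!r}")
    unknown = [f for f in formats if f not in KNOWN_FORMATS]
    if unknown:
        raise ValueError(f"unknown output formats: {unknown}")
    cmd = [
        "tesseract", image, output_base,
        "-l", lang,
        "--oem", str(ENGINE_MODES[engine]),
    ]
    cmd.extend(formats)  # output config names are placed after all options
    return cmd


if __name__ == "__main__":
    # Example: Russian + Ukrainian text, LSTM engine, hOCR and PDF output.
    argv = build_tesseract_command("scan.png", "scan", "rus+ukr", "lstm", ("hocr", "pdf"))
    print(" ".join(argv))
```

The resulting list could be handed to `subprocess.run` on a machine where Tesseract and the relevant language data are installed. The classic engine also needs language data that still contains the legacy model.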

Key changes in Tesseract 5.2:

  • Added optimizations implemented with Intel AVX512F instructions.
  • The C API now provides a function for initializing tesseract with a machine learning model loaded from memory.
  • Added the invert_threshold parameter, which sets the threshold for inverting text lines. The default value is 0.7. To disable inversion, set the value to 0.
  • Improved handling of very large documents on 32-bit hosts.
  • Uses of std::regex functions have been replaced with std::string functions.
  • Improved build scripts for Autotools, CMake and continuous integration systems.
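
The invert_threshold setting from the list above can be passed like any other configuration variable. Below is a minimal sketch, assuming the console utility's `-c VAR=VALUE` option is used to set it. The helper `with_config_vars` and the file names are hypothetical, and the code only builds the argument list without running Tesseract.

```python
from typing import Dict, List


def with_config_vars(base_cmd: List[str], variables: Dict[str, object]) -> List[str]:
    """Append "-c NAME=VALUE" pairs to a tesseract argv list (hypothetical helper)."""
    cmd = list(base_cmd)  # copy so the caller's list is not modified
    for name, value in variables.items():
        cmd += ["-c", f"{name}={value}"]
    return cmd


# Base command without output config names; any such names should be appended last.
base = ["tesseract", "scan.png", "scan", "-l", "eng"]

# 0.7 is the default named in the change list; 0 disables inversion.
default_cmd = with_config_vars(base, {"invert_threshold": 0.7})
no_invert_cmd = with_config_vars(base, {"invert_threshold": 0})

if __name__ == "__main__":
    print(" ".join(no_invert_cmd))
```

Output config names such as `pdf`, if used, are expected to follow all options, so they would be appended after the `-c` pairs.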

    Source: opennet.ru
