Quick Draw Doodle Recognition: how to make friends with R, C++ and neural networks

Hey Habr!

Last fall, Kaggle hosted a competition to classify hand-drawn pictures, Quick Draw Doodle Recognition, in which, among others, a team of R users took part: Artem Klevtsov, Philipp Upravitelev and Andrey Ogurtsov. We will not describe the competition in detail; that has already been done in a recent publication.

This time medal farming did not work out, but we gained a lot of valuable experience, so I would like to tell the community about some of the most interesting and useful things, both for Kaggle and for everyday work. Among the topics discussed: the hard life without OpenCV, JSON parsing (these examples look at integrating C++ code into R scripts or packages using Rcpp), parameterization of scripts, and dockerization of the final solution. All code from this post, in a form suitable for execution, is available in the repository.

Contents:

  1. Efficiently loading data from CSV into MonetDB
  2. Preparing batches
  3. Iterators for unloading batches from the database
  4. Selecting a model architecture
  5. Script parameterization
  6. Dockerization of scripts
  7. Using multiple GPUs on Google Cloud
  8. Instead of a conclusion

1. Efficiently loading data from CSV into the MonetDB database

The data in this competition is provided not as ready-made images but as 340 CSV files (one file per class) containing JSONs with point coordinates. Connecting the points with lines gives a final image of 256x256 pixels. Each record also has a label indicating whether the picture was correctly recognized by the classifier used when the dataset was collected, a two-letter code of the author's country of residence, a unique identifier, a timestamp, and a class name that matches the file name. The simplified version of the original data weighs 7.4 GB in the archive and about 20 GB after unpacking; the full data takes 240 GB after unpacking. The organizers ensured that both versions reproduce the same drawings, which means the full version is redundant. In any case, storing 50 million images as graphic files or as arrays was immediately deemed impractical, and we decided to merge all CSV files from the archive train_simplified.zip into a database, with images of the required size generated "on the fly" for each batch.
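
A rough back-of-the-envelope estimate shows why storing pre-rendered images was ruled out. The sketch below assumes uncompressed 256x256 rasters with a single channel and one byte per pixel; these storage assumptions are ours for illustration, and 50 million is the approximate count mentioned above.

```r
# Assumed: 8-bit, single-channel, uncompressed 256 x 256 rasters.
n_images <- 50e6                # approximate number of drawings
bytes_per_image <- 256 * 256    # one byte per pixel
total_tb <- n_images * bytes_per_image / 1e12
total_tb
```

Under these assumptions the rasters alone come to more than 3 TB, against roughly 20 GB for the unpacked CSV files.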

A well-proven system, MonetDB, was chosen as the DBMS, namely its implementation for R in the form of the MonetDBLite package. The package includes an embedded version of the database server and lets you start the server directly from an R session and work with it there. Creating a database and connecting to it are done with a single command:

con <- DBI::dbConnect(drv = MonetDBLite::MonetDBLite(), Sys.getenv("DBDIR"))

We need to create two tables: one for all the data, the other for service information about the uploaded files (useful if something goes wrong and the process has to be resumed after several files have been loaded):

Creating tables

if (!DBI::dbExistsTable(con, "doodles")) {
  DBI::dbCreateTable(
    con = con,
    name = "doodles",
    fields = c(
      "countrycode" = "char(2)",
      "drawing" = "text",
      "key_id" = "bigint",
      "recognized" = "bool",
      "timestamp" = "timestamp",
      "word" = "text"
    )
  )
}

if (!DBI::dbExistsTable(con, "upload_log")) {
  DBI::dbCreateTable(
    con = con,
    name = "upload_log",
    fields = c(
      "id" = "serial",
      "file_name" = "text UNIQUE",
      "uploaded" = "bool DEFAULT false"
    )
  )
}

The fastest way to load data into the database is to copy the CSV files directly with the SQL command COPY OFFSET 2 INTO tablename FROM path USING DELIMITERS ',','\n','"' NULL AS '' BEST EFFORT, where tablename is the table name and path is the path to the file. While working with the archive, we discovered that the built-in unzip implementation in R does not work correctly with a number of files from the archive, so we used the system unzip (via the parameter getOption("unzip")).

Function for writing to the database

#' @title Extract and load files
#'
#' @description
#' Extract CSV files from a ZIP archive and load them into the database
#'
#' @param con Database connection object (class `MonetDBEmbeddedConnection`).
#' @param tablename Name of the table in the database.
#' @param zipfile Path to the ZIP archive.
#' @param filename Name of the file inside the ZIP archive.
#' @param preprocess Preprocessing function applied to the extracted file.
#'   Must take a single argument `data` (a `data.table` object).
#'
#' @return `TRUE`.
#'
upload_file <- function(con, tablename, zipfile, filename, preprocess = NULL) {
  # Check the arguments
  checkmate::assert_class(con, "MonetDBEmbeddedConnection")
  checkmate::assert_string(tablename)
  checkmate::assert_string(filename)
  checkmate::assert_true(DBI::dbExistsTable(con, tablename))
  checkmate::assert_file_exists(zipfile, access = "r", extension = "zip")
  checkmate::assert_function(preprocess, args = c("data"), null.ok = TRUE)

  # Extract the file
  path <- file.path(tempdir(), filename)
  unzip(zipfile, files = filename, exdir = tempdir(), 
        junkpaths = TRUE, unzip = getOption("unzip"))
  on.exit(unlink(path))

  # Apply the preprocessing function
  if (!is.null(preprocess)) {
    .data <- data.table::fread(file = path)
    .data <- preprocess(data = .data)
    data.table::fwrite(x = .data, file = path, append = FALSE)
    rm(.data)
  }

  # SQL query that imports the CSV (note the escaped \n and " inside the R string)
  sql <- sprintf(
    "COPY OFFSET 2 INTO %s FROM '%s' USING DELIMITERS ',','\\n','\"' NULL AS '' BEST EFFORT",
    tablename, path
  )
  # Execute the query
  DBI::dbExecute(con, sql)

  # Record the successful upload in the service table
  DBI::dbExecute(con, sprintf("INSERT INTO upload_log(file_name, uploaded) VALUES('%s', true)",
                              filename))

  return(invisible(TRUE))
}

If you need to transform the table before writing it to the database, it is enough to pass a function that transforms the data in the preprocess argument.
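
As an illustration of what such a function might look like, here is a minimal sketch. The name keep_recognized and the filtering rule are hypothetical and not taken from the original solution; the only hard requirement, imposed by the argument check in upload_file, is that the single argument is called data.

```r
# Hypothetical preprocessing function: keep only the drawings that the
# original classifier recognized. The single argument must be named `data`.
keep_recognized <- function(data) {
  keep <- as.logical(data$recognized)   # handles TRUE as well as "True"/"TRUE"
  data[which(keep), ]
}

# Toy input standing in for one extracted CSV file.
toy <- data.frame(
  key_id = c(1, 2, 3),
  recognized = c("True", "False", "True"),
  word = "cat",
  stringsAsFactors = FALSE
)
filtered <- keep_recognized(toy)
nrow(filtered)
```

It would then be passed as preprocess = keep_recognized in the call to upload_file.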

Code for sequentially loading the data into the database:

Writing data to the database

# List of files to write
files <- unzip(zipfile, list = TRUE)$Name

# List of exclusions, in case some of the files have already been loaded
to_skip <- DBI::dbGetQuery(con, "SELECT file_name FROM upload_log")[[1L]]
files <- setdiff(files, to_skip)

if (length(files) > 0L) {
  # Start the timer
  tictoc::tic()
  # Progress bar
  pb <- txtProgressBar(min = 0L, max = length(files), style = 3)
  for (i in seq_along(files)) {
    upload_file(con = con, tablename = "doodles", 
                zipfile = zipfile, filename = files[i])
    setTxtProgressBar(pb, i)
  }
  close(pb)
  # Stop the timer
  tictoc::toc()
}

# 526.141 sec elapsed - copying SSD->SSD
# 558.879 sec elapsed - copying USB->SSD

The data loading time may vary depending on the speed of the drive used. In our case, reading and writing within one SSD, or from a flash drive (source file) to an SSD (database), takes less than 10 minutes.

It takes a few more seconds to create a column with an integer class label and an index column (ORDERED INDEX) with row numbers, by which observations will be sampled when creating batches:

Creating additional columns and an index

message("Generate labels")
invisible(DBI::dbExecute(con, "ALTER TABLE doodles ADD label_int int"))
invisible(DBI::dbExecute(con, "UPDATE doodles SET label_int = dense_rank() OVER (ORDER BY word) - 1"))

message("Generate row numbers")
invisible(DBI::dbExecute(con, "ALTER TABLE doodles ADD id serial"))
invisible(DBI::dbExecute(con, "CREATE ORDERED INDEX doodles_id_ord_idx ON doodles(id)"))
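
For intuition, the label assignment performed by the first UPDATE can be reproduced in memory with base R. This is only an illustration on a toy vector; string ordering in R depends on the locale and may differ from the database collation for unusual class names.

```r
# Toy class names standing in for the `word` column.
word <- c("cat", "airplane", "cat", "zebra", "airplane")
# Zero-based dense rank: position of each word among the sorted unique words.
classes <- sort(unique(word))
label_int <- match(word, classes) - 1L
label_int
```

Each distinct class gets one integer, starting from 0, which is the form expected by most neural network frameworks.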

To solve the problem of creating a batch on the fly, we needed to achieve the maximum speed of extracting random rows from the table doodles. For this we used 3 tricks. The first was to reduce the size of the type that stores the observation ID. In the original data set, the type required to store the ID is bigint, but the number of observations makes it possible to fit their identifiers, equal to the ordinal number, into the type int. The search is much faster in this case. The second trick was to use ORDERED INDEX; we arrived at this decision empirically, after going through all the available options. The third was to use parameterized queries. The essence of the method is to execute the command PREPARE once and then reuse the prepared expression when creating a bunch of queries of the same type. In practice, however, the gain compared with a simple SELECT turned out to be within the range of statistical error.
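
The claim that row numbers fit into int is easy to check: a 32-bit signed integer (the int type in MonetDB, and also R's integer) holds values up to 2,147,483,647. A quick sanity check, using the approximate figure of 50 million observations mentioned earlier:

```r
n_obs <- 50e6                      # approximate number of observations
int_max <- .Machine$integer.max    # largest 32-bit signed integer
fits_in_int <- n_obs <= int_max
headroom <- int_max / n_obs        # how many times over the IDs would fit
```

The row numbers use only a small fraction of the int range, so nothing is lost by dropping bigint.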

The process of uploading the data consumes no more than 450 MB of RAM. In other words, the described approach lets you handle datasets weighing tens of gigabytes on almost any budget hardware, including some single-board devices, which is pretty cool.

All that remains is to measure the speed of retrieving (random) data and to evaluate the scaling when sampling batches of different sizes:

Database benchmark

library(ggplot2)

set.seed(0)
# Connect to the database
con <- DBI::dbConnect(MonetDBLite::MonetDBLite(), Sys.getenv("DBDIR"))

# Function that prepares the query on the server side
prep_sql <- function(batch_size) {
  sql <- sprintf("PREPARE SELECT id FROM doodles WHERE id IN (%s)",
                 paste(rep("?", batch_size), collapse = ","))
  res <- DBI::dbSendQuery(con, sql)
  return(res)
}

# Function that fetches the data
# Note: `n` (the total number of rows in doodles) must be defined beforehand
fetch_data <- function(rs, batch_size) {
  ids <- sample(seq_len(n), batch_size)
  res <- DBI::dbFetch(DBI::dbBind(rs, as.list(ids)))
  return(res)
}

# Run the benchmark
res_bench <- bench::press(
  batch_size = 2^(4:10),
  {
    rs <- prep_sql(batch_size)
    bench::mark(
      fetch_data(rs, batch_size),
      min_iterations = 50L
    )
  }
)
# Benchmark columns to display
cols <- c("batch_size", "min", "median", "max", "itr/sec", "total_time", "n_itr")
res_bench[, cols]

#   batch_size      min   median      max `itr/sec` total_time n_itr
#        <dbl> <bch:tm> <bch:tm> <bch:tm>     <dbl>   <bch:tm> <int>
# 1         16   23.6ms  54.02ms  93.43ms     18.8        2.6s    49
# 2         32     38ms  84.83ms 151.55ms     11.4       4.29s    49
# 3         64   63.3ms 175.54ms 248.94ms     5.85       8.54s    50
# 4        128   83.2ms 341.52ms 496.24ms     3.00      16.69s    50
# 5        256  232.8ms 653.21ms 847.44ms     1.58      31.66s    50
# 6        512  784.6ms    1.41s    1.98s     0.740       1.1m    49
# 7       1024  681.7ms    2.72s    4.06s     0.377      2.16m    49

ggplot(res_bench, aes(x = factor(batch_size), y = median, group = 1)) +
  geom_point() +
  geom_line() +
  ylab("median time, s") +
  theme_minimal()

DBI::dbDisconnect(con, shutdown = TRUE)
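
To make the scaling easier to read, the median times from the table above can be divided by the batch size. The numbers below are copied by hand from the printed output (with 1.41s and 2.72s converted to milliseconds), so this is a reading aid rather than a new measurement.

```r
# Median fetch times from the benchmark output, in milliseconds.
batch_size <- 2^(4:10)
median_ms <- c(54.02, 84.83, 175.54, 341.52, 653.21, 1410, 2720)
# Time spent per retrieved row.
per_row_ms <- median_ms / batch_size
round(per_row_ms, 2)
```

The per-row cost stays between roughly 2.5 and 3.4 ms across all batch sizes, so the retrieval time grows approximately linearly with the batch size.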

2. Preparing batches

The whole batch preparation process consists of the following steps:

  1. Parsing several JSONs containing vectors of strings with point coordinates.
  2. Drawing colored lines from the point coordinates on an image of the required size (for example, 256×256 or 128×128).
  3. Converting the resulting images into a tensor.

In the competition's Python kernels, the problem was solved mainly with OpenCV. One of the simplest and most obvious analogues in R would look like this:

Implementing JSON to tensor conversion in R

r_process_json_str <- function(json, line.width = 3, 
                               color = TRUE, scale = 1) {
  # Parse the JSON
  coords <- jsonlite::fromJSON(json, simplifyMatrix = FALSE)
  tmp <- tempfile()
  # Remove the temporary file when the function exits
  on.exit(unlink(tmp))
  png(filename = tmp, width = 256 * scale, height = 256 * scale, pointsize = 1)
  # Empty plot
  plot.new()
  # Size of the plot window
  plot.window(xlim = c(256 * scale, 0), ylim = c(256 * scale, 0))
  # Line colors
  cols <- if (color) rainbow(length(coords)) else "#000000"
  for (i in seq_along(coords)) {
    lines(x = coords[[i]][[1]] * scale, y = coords[[i]][[2]] * scale, 
          col = cols[i], lwd = line.width)
  }
  dev.off()
  # Convert the image into a 3-dimensional array
  res <- png::readPNG(tmp)
  return(res)
}

r_process_json_vector <- function(x, ...) {
  res <- lapply(x, r_process_json_str, ...)
  # Combine the 3-dimensional image arrays into a 4-dimensional tensor
  res <- do.call(abind::abind, c(res, along = 0))
  return(res)
}

Drawing is performed with standard R tools and saved to a temporary PNG file stored in RAM (on Linux, R's temporary directories are located in /tmp, which is mounted in RAM). The file is then read as a three-dimensional array with numbers ranging from 0 to 1. This is important because a more conventional BMP would be read into a raw array with hex color codes.

Let's test the result:

zip_file <- file.path("data", "train_simplified.zip")
csv_file <- "cat.csv"
unzip(zip_file, files = csv_file, exdir = tempdir(), 
      junkpaths = TRUE, unzip = getOption("unzip"))
tmp_data <- data.table::fread(file.path(tempdir(), csv_file), sep = ",", 
                              select = "drawing", nrows = 10000)
arr <- r_process_json_str(tmp_data[4, drawing])
dim(arr)
# [1] 256 256   3
plot(magick::image_read(arr))

The batch itself will be formed as follows:

res <- r_process_json_vector(tmp_data[1:4, drawing], scale = 0.5)
str(res)
 # num [1:4, 1:128, 1:128, 1:3] 1 1 1 1 1 1 1 1 1 1 ...
 # - attr(*, "dimnames")=List of 4
 #  ..$ : NULL
 #  ..$ : NULL
 #  ..$ : NULL
 #  ..$ : NULL
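
What abind::abind(..., along = 0) does here can be shown with base R alone: the images are stacked along a new leading dimension, giving an array of shape [batch, height, width, channels]. A minimal sketch on toy 4x4x3 arrays (the sizes and fill values are illustrative only):

```r
# Two toy "images": one filled with 0, one filled with 1.
imgs <- list(array(0, dim = c(4, 4, 3)), array(1, dim = c(4, 4, 3)))
n <- length(imgs)
# Concatenate with the image index as the last dimension...
stacked <- array(unlist(imgs), dim = c(4, 4, 3, n))
# ...then move that index to the front.
batch <- aperm(stacked, c(4, 1, 2, 3))
dim(batch)
```

After the permutation, batch[k, , , ] is the k-th image, which matches the layout printed by str(res) above.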

This implementation seemed suboptimal to us, since forming large batches takes an indecently long time, and we decided to draw on the experience of our colleagues and use the powerful OpenCV library. At that time there was no ready-made package for R (there is none now either), so a minimal implementation of the required functionality was written in C++ and integrated into the R code using Rcpp.

To solve the problem, the following packages and libraries were used:

  1. OpenCV for working with images and drawing lines. We used pre-installed system libraries and header files, as well as dynamic linking.

  2. xtensor for working with multidimensional arrays and tensors. We used the header files included in the R package of the same name. The library lets you work with multidimensional arrays in both row-major and column-major order.

  3. ndjson for parsing JSON. This library is used by xtensor automatically if it is present in the project.

  4. RcppThread for organizing multi-threaded processing of a vector of JSON strings. We used the header files provided by this package. Unlike the more popular RcppParallel, this package has, among other things, a built-in mechanism for interrupting loops.
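
The storage-order remark in item 2 matters because R stores arrays column by column, while OpenCV matrices are stored row by row. A small base R illustration of the two orders for the same matrix (a toy example, not part of the original code):

```r
m <- matrix(1:6, nrow = 2)      # R fills the matrix column by column
col_major <- as.vector(m)       # memory order used by R
row_major <- as.vector(t(m))    # order a row-major library would use
col_major
row_major
```

Copying a buffer between the two worlds without accounting for this difference silently transposes the image.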

It should be noted that xtensor turned out to be a godsend: besides its extensive functionality and high performance, its developers proved to be quite responsive and answered questions promptly and in detail. With their help we were able to implement the conversion of OpenCV matrices into xtensor tensors, as well as a way to combine 3-dimensional image tensors into a 4-dimensional tensor of the correct dimensions (the batch itself).

Materials for learning Rcpp, xtensor and RcppThread

https://thecoatlessprofessor.com/programming/unofficial-rcpp-api-documentation

https://docs.opencv.org/4.0.1/d7/dbd/group__imgproc.html

https://xtensor.readthedocs.io/en/latest/

https://xtensor.readthedocs.io/en/latest/file_loading.html#loading-json-data-into-xtensor

https://cran.r-project.org/web/packages/RcppThread/vignettes/RcppThread-vignette.pdf

To compile files that use system header files and dynamic linking with libraries installed on the system, we used the plugin mechanism implemented in the Rcpp package. To find paths and flags automatically, we used the popular Linux utility pkg-config.

Implementation of the Rcpp plugin for using the OpenCV library

Rcpp::registerPlugin("opencv", function() {
  # Possible package names
  pkg_config_name <- c("opencv", "opencv4")
  # Binary of the pkg-config utility
  pkg_config_bin <- Sys.which("pkg-config")
  # Check that the utility is present on the system
  checkmate::assert_file_exists(pkg_config_bin, access = "x")
  # Check that an OpenCV config file for pkg-config exists
  check <- sapply(pkg_config_name, 
                  function(pkg) system(paste(pkg_config_bin, pkg)))
  if (all(check != 0)) {
    stop("OpenCV config for the pkg-config not found", call. = FALSE)
  }

  pkg_config_name <- pkg_config_name[check == 0]
  list(env = list(
    PKG_CXXFLAGS = system(paste(pkg_config_bin, "--cflags", pkg_config_name), 
                          intern = TRUE),
    PKG_LIBS = system(paste(pkg_config_bin, "--libs", pkg_config_name), 
                      intern = TRUE)
  ))
})

As a result of the plugin's work, the following values will be substituted during compilation:

Rcpp:::.plugins$opencv()$env

# $PKG_CXXFLAGS
# [1] "-I/usr/include/opencv"
#
# $PKG_LIBS
# [1] "-lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_video -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_viz -lopencv_surface_matching -lopencv_text -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo -lopencv_imgproc -lopencv_core"

The implementation code for parsing JSON and generating a batch to pass to the model is given under the spoiler. First, add the local project directory to the search path for header files (needed for ndjson):

Sys.setenv("PKG_CXXFLAGS" = paste0("-I", normalizePath(file.path("src"))))

Implementation of JSON to tensor conversion in C++

// [[Rcpp::plugins(cpp14)]]
// [[Rcpp::plugins(opencv)]]
// [[Rcpp::depends(xtensor)]]
// [[Rcpp::depends(RcppThread)]]

#include <xtensor/xjson.hpp>
#include <xtensor/xadapt.hpp>
#include <xtensor/xview.hpp>
#include <xtensor-r/rtensor.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <Rcpp.h>
#include <RcppThread.h>

// Type aliases
using RcppThread::parallelFor;
using json = nlohmann::json;
using points = xt::xtensor<double,2>;     // Point coordinates extracted from JSON
using strokes = std::vector<points>;      // Point coordinates extracted from JSON
using xtensor3d = xt::xtensor<double, 3>; // Tensor for storing an image matrix
using xtensor4d = xt::xtensor<double, 4>; // Tensor for storing a set of images
using rtensor3d = xt::rtensor<double, 3>; // Wrapper for export to R
using rtensor4d = xt::rtensor<double, 4>; // Wrapper for export to R

// Static constants
// Image size in pixels
const static int SIZE = 256;
// Line type
// See https://en.wikipedia.org/wiki/Pixel_connectivity#2-dimensional
const static int LINE_TYPE = cv::LINE_4;
// Line width in pixels
const static int LINE_WIDTH = 3;
// Resize algorithm
// https://docs.opencv.org/3.1.0/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121
const static int RESIZE_TYPE = cv::INTER_LINEAR;

// Template for converting an OpenCV matrix into a tensor
template <typename T, int NCH, typename XT=xt::xtensor<T,3,xt::layout_type::column_major>>
XT to_xt(const cv::Mat_<cv::Vec<T, NCH>>& src) {
  // Shape of the target tensor
  std::vector<int> shape = {src.rows, src.cols, NCH};
  // Total number of elements in the array
  size_t size = src.total() * NCH;
  // Convert cv::Mat to xt::xtensor
  XT res = xt::adapt((T*) src.data, size, xt::no_ownership(), shape);
  return res;
}

// Convert JSON into a list of point coordinates
strokes parse_json(const std::string& x) {
  auto j = json::parse(x);
  // The parsing result must be an array
  if (!j.is_array()) {
    throw std::runtime_error("'x' must be JSON array.");
  }
  strokes res;
  res.reserve(j.size());
  for (const auto& a: j) {
    // Each element of the array must be a 2-dimensional array
    if (!a.is_array() || a.size() != 2) {
      throw std::runtime_error("'x' must include only 2d arrays.");
    }
    // Extract the vector of points
    auto p = a.get<points>();
    res.push_back(p);
  }
  return res;
}

// Draw the lines
// Colors are in HSV
cv::Mat ocv_draw_lines(const strokes& x, bool color = true) {
  // Source matrix type
  auto stype = color ? CV_8UC3 : CV_8UC1;
  // Resulting matrix type
  auto dtype = color ? CV_32FC3 : CV_32FC1;
  auto bg = color ? cv::Scalar(0, 0, 255) : cv::Scalar(255);
  auto col = color ? cv::Scalar(0, 255, 220) : cv::Scalar(0);
  cv::Mat img = cv::Mat(SIZE, SIZE, stype, bg);
  // Number of lines
  size_t n = x.size();
  for (const auto& s: x) {
    // Number of points in the line
    size_t n_points = s.shape()[1];
    // `i + 1 < n_points` avoids unsigned underflow when a stroke is empty
    for (size_t i = 0; i + 1 < n_points; ++i) {
      // Start point of the segment
      cv::Point from(s(0, i), s(1, i));
      // End point of the segment
      cv::Point to(s(0, i + 1), s(1, i + 1));
      // Draw the segment
      cv::line(img, from, to, col, LINE_WIDTH, LINE_TYPE);
    }
    if (color) {
      // Change the line color
      col[0] += 180 / n;
    }
  }
  if (color) {
    // Convert the color representation to RGB
    cv::cvtColor(img, img, cv::COLOR_HSV2RGB);
  }
  // Convert the representation to float32 with the range [0, 1]
  img.convertTo(img, dtype, 1 / 255.0);
  return img;
}

// Process the JSON and obtain a tensor with the image data
xtensor3d process(const std::string& x, double scale = 1.0, bool color = true) {
  auto p = parse_json(x);
  auto img = ocv_draw_lines(p, color);
  if (scale != 1) {
    cv::Mat out;
    cv::resize(img, out, cv::Size(), scale, scale, RESIZE_TYPE);
    cv::swap(img, out);
    out.release();
  }
  xtensor3d arr = color ? to_xt<double,3>(img) : to_xt<double,1>(img);
  return arr;
}

// [[Rcpp::export]]
rtensor3d cpp_process_json_str(const std::string& x, 
                               double scale = 1.0, 
                               bool color = true) {
  xtensor3d res = process(x, scale, color);
  return res;
}

// [[Rcpp::export]]
rtensor4d cpp_process_json_vector(const std::vector<std::string>& x, 
                                  double scale = 1.0, 
                                  bool color = false) {
  size_t n = x.size();
  size_t dim = floor(SIZE * scale);
  size_t channels = color ? 3 : 1;
  xtensor4d res({n, dim, dim, channels});
  parallelFor(0, n, [&x, &res, scale, color](int i) {
    xtensor3d tmp = process(x[i], scale, color);
    auto view = xt::view(res, i, xt::all(), xt::all(), xt::all());
    view = tmp;
  });
  return res;
}

This code should be placed in the file src/cv_xt.cpp and compiled with the command Rcpp::sourceCpp(file = "src/cv_xt.cpp", env = .GlobalEnv); it also requires nlohmann/json.hpp from the repository. The code is split into several functions:

  • to_xt - a template function that converts an image matrix (cv::Mat) into an xt::xtensor tensor;

  • parse_json - parses a JSON string, extracts the point coordinates and packs them into a vector;

  • ocv_draw_lines - draws multicolored lines from the resulting vector of points;

  • process - combines the functions above and adds the ability to scale the resulting image;

  • cpp_process_json_str - a wrapper around process that exports the result to an R object (a multidimensional array);

  • cpp_process_json_vector - a wrapper that applies process to every element of a string vector in multithreaded mode.

To draw multicolored lines, the HSV color model is used, followed by conversion to RGB. Let's test the result:

arr <- cpp_process_json_str(tmp_data[4, drawing])
dim(arr)
# [1] 256 256   3
plot(magick::image_read(arr))
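
The hue stepping performed inside ocv_draw_lines can be reproduced in a few lines of base R. This is only a sketch: stroke_hues and stroke_colors are hypothetical helper names that are not part of the article's code, and the constants mirror the C++ above (OpenCV's 8-bit hue range 0..180, saturation 255, value 220).

```r
# Hypothetical helper: hue assigned to each of n strokes, mirroring
# `col[0] += 180 / n` (integer division) in ocv_draw_lines.
stroke_hues <- function(n) {
  stopifnot(n >= 1)
  step <- 180 %/% n          # integer division, as in the C++ code
  (seq_len(n) - 1) * step    # hue of stroke 1, 2, ..., n on the 0..180 scale
}

# Convert OpenCV-style hues to R color strings; grDevices::hsv() expects [0, 1].
stroke_colors <- function(n) {
  grDevices::hsv(h = stroke_hues(n) / 180, s = 1, v = 220 / 255)
}

stroke_hues(3)
stroke_colors(3)
```

One consequence of the integer division is that a drawing with more than 180 strokes gets a step of 0, so all of its strokes share one hue.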

Comparison of the speed of the R and C++ implementations

res_bench <- bench::mark(
  r_process_json_str(tmp_data[4, drawing], scale = 0.5),
  cpp_process_json_str(tmp_data[4, drawing], scale = 0.5),
  check = FALSE,
  min_iterations = 100
)
# Benchmark columns to show
cols <- c("expression", "min", "median", "max", "itr/sec", "total_time", "n_itr")
res_bench[, cols]

#   expression                min     median       max `itr/sec` total_time  n_itr
#   <chr>                <bch:tm>   <bch:tm>  <bch:tm>     <dbl>   <bch:tm>  <int>
# 1 r_process_json_str     3.49ms     3.55ms    4.47ms      273.      490ms    134
# 2 cpp_process_json_str   1.94ms     2.02ms    5.32ms      489.      497ms    243

library(ggplot2)
# Run the measurement
res_bench <- bench::press(
  batch_size = 2^(4:10),
  {
    .data <- tmp_data[sample(seq_len(.N), batch_size), drawing]
    bench::mark(
      r_process_json_vector(.data, scale = 0.5),
      cpp_process_json_vector(.data,  scale = 0.5),
      min_iterations = 50,
      check = FALSE
    )
  }
)

res_bench[, cols]

#    expression   batch_size      min   median      max `itr/sec` total_time n_itr
#    <chr>             <dbl> <bch:tm> <bch:tm> <bch:tm>     <dbl>   <bch:tm> <int>
#  1 r                   16   50.61ms  53.34ms  54.82ms    19.1     471.13ms     9
#  2 cpp                 16    4.46ms   5.39ms   7.78ms   192.      474.09ms    91
#  3 r                   32   105.7ms 109.74ms 212.26ms     7.69        6.5s    50
#  4 cpp                 32    7.76ms  10.97ms  15.23ms    95.6     522.78ms    50
#  5 r                   64  211.41ms 226.18ms 332.65ms     3.85      12.99s    50
#  6 cpp                 64   25.09ms  27.34ms  32.04ms    36.0        1.39s    50
#  7 r                  128   534.5ms 627.92ms 659.08ms     1.61      31.03s    50
#  8 cpp                128   56.37ms  58.46ms  66.03ms    16.9        2.95s    50
#  9 r                  256     1.15s    1.18s    1.29s     0.851     58.78s    50
# 10 cpp                256  114.97ms 117.39ms 130.09ms     8.45       5.92s    50
# 11 r                  512     2.09s    2.15s    2.32s     0.463       1.8m    50
# 12 cpp                512  230.81ms  235.6ms 261.99ms     4.18      11.97s    50
# 13 r                 1024        4s    4.22s     4.4s     0.238       3.5m    50
# 14 cpp               1024  410.48ms 431.43ms 462.44ms     2.33      21.45s    50

ggplot(res_bench, aes(x = factor(batch_size), y = median, 
                      group =  expression, color = expression)) +
  geom_point() +
  geom_line() +
  ylab("median time, s") +
  theme_minimal() +
  scale_color_discrete(name = "", labels = c("cpp", "r")) +
  theme(legend.position = "bottom") 


As you can see, the speedup turned out to be very significant, and it is not possible to catch up with the C++ code by parallelizing the R code.
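
To put a number on the speedup, the median timings printed above can be divided directly. This is worked arithmetic on the figures from the table, with nothing re-measured; the variable names are chosen here for illustration.

```r
# Median timings copied from the bench::press() output above, in milliseconds.
batch_size <- 2^(4:10)
median_r   <- c(53.34, 109.74, 226.18, 627.92, 1180, 2150, 4220)
median_cpp <- c(5.39, 10.97, 27.34, 58.46, 117.39, 235.6, 431.43)

speedup <- data.frame(
  batch_size = batch_size,
  speedup    = round(median_r / median_cpp, 1)
)
speedup
```

The ratio stays between roughly 8 and 11 across all batch sizes. For a single string the medians were 3.55 ms and 2.02 ms, a ratio of about 1.8, which suggests that much of the gain on vectors comes from the multithreaded loop, although the benchmark does not separate the two effects.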

3. Iterators for unloading batches from the database

R has a well-deserved reputation as a language for processing data that fits in RAM, while Python is characterized more by iterative data processing, which makes it easy and natural to implement out-of-core computations (computations that use external memory). A classic example that matters to us in the context of the problem described is deep neural networks trained by gradient descent, where the gradient is approximated at each step from a small portion of the observations, a mini-batch.

Deep learning frameworks written in Python have special classes that implement iterators over data: tables, images in folders, binary formats, and so on. You can use the ready-made options or write your own for specific tasks. In R we can use all the features of the Python library keras, with its various backends, through the package of the same name, which in turn works on top of the reticulate package. The latter deserves a separate long article; it not only lets you run Python code from R, but also lets you pass objects between R and Python sessions, automatically performing all the necessary type conversions.

We got rid of the need to keep all the data in RAM by using MonetDBLite, and all the "neural network" work is done by the original Python code, so we only have to write an iterator over the data, since nothing ready-made exists for this situation in either R or Python. There are essentially only two requirements for it: it must return batches in an endless loop, and it must preserve its state between iterations (in R the latter is implemented in the simplest way, with closures). Previously it was necessary to explicitly convert R arrays into numpy arrays inside the iterator, but the current version of the keras package does this itself.
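
The two requirements, an endless loop and state preserved between calls, can be illustrated with a closure in base R. This is a minimal sketch under simplifying assumptions: make_batch_iterator is a hypothetical name, it works on a plain vector of ids rather than a database, and it does not reshuffle between passes.

```r
# Hypothetical helper: returns a function that yields index batches forever.
make_batch_iterator <- function(ids, batch_size) {
  stopifnot(length(ids) >= batch_size)
  n_batches <- length(ids) %/% batch_size   # an incomplete trailing batch is dropped
  i <- 1                                    # state kept in the enclosing environment
  function() {
    if (i > n_batches) {
      i <<- 1                               # start a new pass over the data
    }
    start <- (i - 1) * batch_size + 1
    batch <- ids[start:(start + batch_size - 1)]
    i <<- i + 1                             # `<<-` updates the captured counter
    batch
  }
}

next_batch <- make_batch_iterator(ids = 1:7, batch_size = 3)
next_batch()  # 1 2 3
next_batch()  # 4 5 6
next_batch()  # wraps around: 1 2 3
```

The counter i lives in the environment created by the call to make_batch_iterator, so every call to next_batch sees the value left by the previous one without any global variables.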

The iterator for training and validation data turned out as follows:

Iterator for training and validation data

train_generator <- function(db_connection = con,
                            samples_index,
                            num_classes = 340,
                            batch_size = 32,
                            scale = 1,
                            color = FALSE,
                            imagenet_preproc = FALSE) {
  # Argument checks
  checkmate::assert_class(db_connection, "DBIConnection")
  checkmate::assert_integerish(samples_index)
  checkmate::assert_count(num_classes)
  checkmate::assert_count(batch_size)
  checkmate::assert_number(scale, lower = 0.001, upper = 5)
  checkmate::assert_flag(color)
  checkmate::assert_flag(imagenet_preproc)

  # Shuffle so that batch indices can be taken and discarded in order
  dt <- data.table::data.table(id = sample(samples_index))
  # Assign batch numbers
  dt[, batch := (.I - 1L) %/% batch_size + 1L]
  # Keep only complete batches and index by batch
  dt <- dt[, if (.N == batch_size) .SD, keyby = batch]
  # Initialize the counter
  i <- 1
  # Number of batches
  max_i <- dt[, max(batch)]

  # Prepare the statement used to fetch a batch
  sql <- sprintf(
    "PREPARE SELECT drawing, label_int FROM doodles WHERE id IN (%s)",
    paste(rep("?", batch_size), collapse = ",")
  )
  res <- DBI::dbSendQuery(db_connection, sql)

  # Analog of keras::to_categorical
  to_categorical <- function(x, num) {
    n <- length(x)
    m <- numeric(n * num)
    m[x * n + seq_len(n)] <- 1
    dim(m) <- c(n, num)
    return(m)
  }

  # Closure
  function() {
    # Start a new epoch
    if (i > max_i) {
      dt[, id := sample(id)]
      data.table::setkey(dt, batch)
      # Reset the counter
      i <<- 1
      max_i <<- dt[, max(batch)]
    }

    # IDs of the rows to fetch
    batch_ind <- dt[batch == i, id]
    # Fetch the data
    batch <- DBI::dbFetch(DBI::dbBind(res, as.list(batch_ind)), n = -1)

    # Increment the counter
    i <<- i + 1

    # Parse JSON and prepare the array
    batch_x <- cpp_process_json_vector(batch$drawing, scale = scale, color = color)
    if (imagenet_preproc) {
      # Rescale from the interval [0, 1] to the interval [-1, 1]
      batch_x <- (batch_x - 0.5) * 2
    }

    batch_y <- to_categorical(batch$label_int, num_classes)
    result <- list(batch_x, batch_y)
    return(result)
  }
}

The function takes as input a variable holding the database connection, the numbers of the rows to use, the number of classes, the batch size, the scale (scale = 1 corresponds to rendering images of 256x256 pixels, scale = 0.5 to 128x128 pixels), a color flag (color = FALSE renders in grayscale, while with color = TRUE each stroke is drawn in a new color) and a preprocessing flag for networks pretrained on ImageNet. The latter is needed to rescale pixel values from the interval [0, 1] to the interval [-1, 1], which was used when training the models supplied with keras.
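
The effect of the scale and imagenet_preproc arguments can be checked with two one-line helpers. This is a sketch only: image_side and rescale_pixels are hypothetical names, and the base size of 256 is taken from the dimensions reported for scale = 1.

```r
# Side length of the rendered image, mirroring `floor(SIZE * scale)` in the
# C++ code; 256 is the base size implied by the 256x256 output at scale = 1.
image_side <- function(scale, base_size = 256) {
  floor(base_size * scale)
}

# Map pixel values from [0, 1] to [-1, 1], as done when imagenet_preproc = TRUE.
rescale_pixels <- function(x) {
  (x - 0.5) * 2
}

image_side(c(1, 0.5))          # 256 128
rescale_pixels(c(0, 0.5, 1))   # -1 0 1
```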

The outer function contains argument type checks, a data.table with randomly shuffled row numbers from samples_index and batch numbers, a counter and the maximum number of batches, as well as an SQL expression for unloading data from the database. In addition, we defined a fast analog of the function keras::to_categorical() inside it. We used almost all of the data for training, leaving half a percent for validation, so the epoch size was limited by the steps_per_epoch parameter when calling keras::fit_generator(), and the condition if (i > max_i) only ever fired for the validation iterator.
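
The batch numbering done with data.table in the outer function can be mirrored in base R to see what it produces. This is an illustrative sketch, with assign_batches as a hypothetical name; the real code shuffles the ids first and works by reference.

```r
# Hypothetical base-R equivalent of the batch-numbering steps in train_generator().
assign_batches <- function(ids, batch_size) {
  # Rows 1..batch_size get batch 1, the next batch_size rows get batch 2, ...
  batch <- (seq_along(ids) - 1L) %/% batch_size + 1L
  # Keep only batches that contain exactly batch_size rows
  full <- batch %in% which(tabulate(batch) == batch_size)
  data.frame(id = ids[full], batch = batch[full])
}

assign_batches(ids = c(10, 20, 30, 40, 50), batch_size = 2)
```

With five ids and a batch size of 2, the fifth id would form an incomplete third batch and is dropped, so every batch bound to the prepared statement has exactly batch_size placeholders filled.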

In the inner function, the row indices for the next batch are retrieved, the records are unloaded from the database while the batch counter is incremented, the JSON is parsed (by the function cpp_process_json_vector(), written in C++) and the arrays corresponding to the images are created. Then one-hot vectors with the class labels are created, and the arrays with pixel values and labels are combined into a list, which is the return value. To speed things up, we used index creation in data.table tables and modification by reference; without these data.table "tricks" it is quite hard to imagine working effectively with any significant amount of data in R.
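
The one-hot helper relies on R's column-major storage, which is easiest to see on a tiny input. The function below repeats the logic of the to_categorical() helper from train_generator(); the three sample labels are made up for illustration.

```r
# Same logic as the to_categorical() helper inside train_generator().
to_categorical <- function(x, num) {
  n <- length(x)
  m <- numeric(n * num)
  # R stores matrices column by column, so element (row r, column k) of an
  # n-row matrix lives at position (k - 1) * n + r; labels are 0-based, hence x * n.
  m[x * n + seq_len(n)] <- 1
  dim(m) <- c(n, num)
  m
}

labels <- c(2, 0, 1)          # 0-based class labels for three samples
to_categorical(labels, num = 4)
#      [,1] [,2] [,3] [,4]
# [1,]    0    0    1    0
# [2,]    1    0    0    0
# [3,]    0    1    0    0
```

A single vectorized assignment fills the whole matrix, which avoids building it row by row for every batch.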

The results of speed measurements on a Core i5 laptop are as follows:

Iterator benchmark

library(Rcpp)
library(keras)
library(ggplot2)

source("utils/rcpp.R")
source("utils/keras_iterator.R")

con <- DBI::dbConnect(drv = MonetDBLite::MonetDBLite(), Sys.getenv("DBDIR"))

ind <- seq_len(DBI::dbGetQuery(con, "SELECT count(*) FROM doodles")[[1L]])
num_classes <- DBI::dbGetQuery(con, "SELECT max(label_int) + 1 FROM doodles")[[1L]]

# Indices for the training set
train_ind <- sample(ind, floor(length(ind) * 0.995))
# Indices for the validation set
val_ind <- ind[-train_ind]
rm(ind)
# Scale factor
scale <- 0.5

# Run the measurement
res_bench <- bench::press(
  batch_size = 2^(4:10),
  {
    it1 <- train_generator(
      db_connection = con,
      samples_index = train_ind,
      num_classes = num_classes,
      batch_size = batch_size,
      scale = scale
    )
    bench::mark(
      it1(),
      min_iterations = 50L
    )
  }
)
# Benchmark columns to show
cols <- c("batch_size", "min", "median", "max", "itr/sec", "total_time", "n_itr")
res_bench[, cols]

#   batch_size      min   median      max `itr/sec` total_time n_itr
#        <dbl> <bch:tm> <bch:tm> <bch:tm>     <dbl>   <bch:tm> <int>
# 1         16     25ms  64.36ms   92.2ms     15.9       3.09s    49
# 2         32   48.4ms 118.13ms 197.24ms     8.17       5.88s    48
# 3         64   69.3ms 117.93ms 181.14ms     8.57       5.83s    50
# 4        128  157.2ms 240.74ms 503.87ms     3.85      12.71s    49
# 5        256  359.3ms 613.52ms 988.73ms     1.54       30.5s    47
# 6        512  884.7ms    1.53s    2.07s     0.674      1.11m    45
# 7       1024     2.7s    3.83s    5.47s     0.261      2.81m    44

ggplot(res_bench, aes(x = factor(batch_size), y = median, group = 1)) +
    geom_point() +
    geom_line() +
    ylab("median time, s") +
    theme_minimal()

DBI::dbDisconnect(con, shutdown = TRUE)

If you have enough RAM, you can seriously speed up the database by moving it into that same RAM (32 GB is enough for our task). On Linux the /dev/shm partition is mounted by default and takes up to half of the RAM capacity. You can allocate more by editing /etc/fstab to get an entry such as tmpfs /dev/shm tmpfs defaults,size=25g 0 0. Be sure to reboot and check the result by running the command df -h.

The iterator for test data looks much simpler, since the test dataset fits entirely in RAM:

Iterator for test data

test_generator <- function(dt,
                           batch_size = 32,
                           scale = 1,
                           color = FALSE,
                           imagenet_preproc = FALSE) {

  # Argument checks
  checkmate::assert_data_table(dt)
  checkmate::assert_count(batch_size)
  checkmate::assert_number(scale, lower = 0.001, upper = 5)
  checkmate::assert_flag(color)
  checkmate::assert_flag(imagenet_preproc)

  # Assign batch numbers
  dt[, batch := (.I - 1L) %/% batch_size + 1L]
  data.table::setkey(dt, batch)
  i <- 1
  max_i <- dt[, max(batch)]

  # Closure
  function() {
    batch_x <- cpp_process_json_vector(dt[batch == i, drawing], 
                                       scale = scale, color = color)
    if (imagenet_preproc) {
      # Rescale from the interval [0, 1] to the interval [-1, 1]
      batch_x <- (batch_x - 0.5) * 2
    }
    result <- list(batch_x)
    i <<- i + 1
    return(result)
  }
}

4. Choosing a model architecture

The first architecture used was mobilenet v1, whose features are discussed in this post. It is included in keras as standard and, accordingly, is available in the package of the same name for R. But when we tried to use it with single-channel images, a strange thing turned up: the input tensor must always have the dimensions (batch, height, width, 3), that is, the number of channels cannot be changed. There is no such limitation in Python, so we rushed ahead and wrote our own implementation of this architecture, following the original article (without the dropout that is present in the keras version):

Mobilenet v1 architecture

library(keras)

top_3_categorical_accuracy <- custom_metric(
    name = "top_3_categorical_accuracy",
    metric_fn = function(y_true, y_pred) {
         metric_top_k_categorical_accuracy(y_true, y_pred, k = 3)
    }
)

layer_sep_conv_bn <- function(object, 
                              filters,
                              alpha = 1,
                              depth_multiplier = 1,
                              strides = c(2, 2)) {

  # NB! depth_multiplier !=  resolution multiplier
  # https://github.com/keras-team/keras/issues/10349

  layer_depthwise_conv_2d(
    object = object,
    kernel_size = c(3, 3), 
    strides = strides,
    padding = "same",
    depth_multiplier = depth_multiplier
  ) %>%
  layer_batch_normalization() %>% 
  layer_activation_relu() %>%
  layer_conv_2d(
    filters = filters * alpha,
    kernel_size = c(1, 1), 
    strides = c(1, 1)
  ) %>%
  layer_batch_normalization() %>% 
  layer_activation_relu() 
}

get_mobilenet_v1 <- function(input_shape = c(224, 224, 1),
                             num_classes = 340,
                             alpha = 1,
                             depth_multiplier = 1,
                             optimizer = optimizer_adam(lr = 0.002),
                             loss = "categorical_crossentropy",
                             metrics = c("categorical_crossentropy",
                                         top_3_categorical_accuracy)) {

  inputs <- layer_input(shape = input_shape)

  outputs <- inputs %>%
    layer_conv_2d(filters = 32, kernel_size = c(3, 3), strides = c(2, 2), padding = "same") %>%
    layer_batch_normalization() %>% 
    layer_activation_relu() %>%
    layer_sep_conv_bn(filters = 64, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 128, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 128, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 256, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 256, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 1024, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 1024, strides = c(1, 1)) %>%
    layer_global_average_pooling_2d() %>%
    layer_dense(units = num_classes) %>%
    layer_activation_softmax()

    model <- keras_model(
      inputs = inputs,
      outputs = outputs
    )

    model %>% compile(
      optimizer = optimizer,
      loss = loss,
      metrics = metrics
    )

    return(model)
}

The disadvantages of this approach are obvious. We want to test a lot of models, and we do not want to rewrite each architecture by hand. We were also deprived of the opportunity to use the weights of models pretrained on ImageNet. As usual, studying the documentation helped. The function get_config() returns a description of the model in a form suitable for editing (base_model_conf$layers is a regular R list), and the function from_config() performs the reverse conversion to a model object:

base_model_conf <- get_config(base_model)
base_model_conf$layers[[1]]$config$batch_input_shape[[4]] <- 1L
base_model <- from_config(base_model_conf)

Now it is not difficult to write a universal function that returns any of the models supplied with keras, with or without weights trained on ImageNet:

Function for loading ready-made architectures

get_model <- function(name = "mobilenet_v2",
                      input_shape = NULL,
                      weights = "imagenet",
                      pooling = "avg",
                      num_classes = NULL,
                      optimizer = keras::optimizer_adam(lr = 0.002),
                      loss = "categorical_crossentropy",
                      metrics = NULL,
                      color = TRUE,
                      compile = FALSE) {
  # Argument checks
  checkmate::assert_string(name)
  checkmate::assert_integerish(input_shape, lower = 1, upper = 256, len = 3)
  checkmate::assert_count(num_classes)
  checkmate::assert_flag(color)
  checkmate::assert_flag(compile)

  # Get the constructor from the keras package
  model_fun <- get0(paste0("application_", name), envir = asNamespace("keras"))
  # Check that the object exists in the package
  if (is.null(model_fun)) {
    stop("Model ", shQuote(name), " not found.", call. = FALSE)
  }

  base_model <- model_fun(
    input_shape = input_shape,
    include_top = FALSE,
    weights = weights,
    pooling = pooling
  )

  # If the image is not in color, change the input dimensionality
  if (!color) {
    base_model_conf <- keras::get_config(base_model)
    base_model_conf$layers[[1]]$config$batch_input_shape[[4]] <- 1L
    base_model <- keras::from_config(base_model_conf)
  }

  predictions <- keras::get_layer(base_model, "global_average_pooling2d_1")$output
  predictions <- keras::layer_dense(predictions, units = num_classes, activation = "softmax")
  model <- keras::keras_model(
    inputs = base_model$input,
    outputs = predictions
  )

  if (compile) {
    keras::compile(
      object = model,
      optimizer = optimizer,
      loss = loss,
      metrics = metrics
    )
  }

  return(model)
}

When single-channel images are used, no pretrained weights are used. This could be fixed: use the function get_weights() to obtain the model weights as a list of R arrays, change the dimensions of the first element of this list (by taking one color channel or by averaging all three), and then load the weights back into the model with the function set_weights(). We never added this functionality, because at that stage it was already clear that working with color images was more productive.
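
The reshaping step described above, which we did not implement, might look roughly like this in base R. This is a sketch under stated assumptions: to_single_channel is a hypothetical helper, the kernel is assumed to be laid out as (height, width, input channels, filters), and a toy array stands in for real weights.

```r
# Hypothetical helper: collapse the input-channel axis of a first-layer
# convolution kernel shaped (height, width, in_channels, filters).
to_single_channel <- function(kernel, how = c("mean", "first")) {
  how <- match.arg(how)
  d <- dim(kernel)
  stopifnot(length(d) == 4)
  if (how == "mean") {
    out <- apply(kernel, c(1, 2, 4), mean)   # average over the channel axis
  } else {
    out <- kernel[, , 1, ]                   # keep only the first channel
  }
  array(out, dim = c(d[1], d[2], 1, d[4]))   # restore the 4-d layout
}

# Toy stand-in for real weights: 3x3 kernel, 3 input channels, 2 filters.
kernel <- array(seq_len(3 * 3 * 3 * 2), dim = c(3, 3, 3, 2))
dim(to_single_channel(kernel))
```

In a real pipeline the first element of the list returned by get_weights() would be passed through such a helper before calling set_weights() on the single-channel model; whether averaging or picking one channel works better was not tested here.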

We carried out most of the experiments with mobilenet versions 1 and 2, as well as resnet34. More modern architectures such as SE-ResNeXt performed well in this competition. Unfortunately, we did not have ready-made implementations at our disposal, and we did not write our own (but we definitely will).

5. Parameterization of scripts

For convenience, all the code for starting training was designed as a single script, parameterized with docopt as follows:

doc <- '
Usage:
  train_nn.R --help
  train_nn.R --list-models
  train_nn.R [options]

Options:
  -h --help                   Show this message.
  -l --list-models            List available models.
  -m --model=<model>          Neural network model name [default: mobilenet_v2].
  -b --batch-size=<size>      Batch size [default: 32].
  -s --scale-factor=<ratio>   Scale factor [default: 0.5].
  -c --color                  Use color lines [default: FALSE].
  -d --db-dir=<path>          Path to database directory [default: Sys.getenv("db_dir")].
  -r --validate-ratio=<ratio> Validate sample ratio [default: 0.995].
  -n --n-gpu=<number>         Number of GPUs [default: 1].
'
args <- docopt::docopt(doc)

The docopt package is an implementation of http://docopt.org/ for R. With its help, scripts are launched with simple commands such as Rscript bin/train_nn.R -m resnet50 -c -d /home/andrey/doodle_db or ./bin/train_nn.R -m resnet50 -c -d /home/andrey/doodle_db, if the file train_nn.R is executable (this command starts training the resnet50 model on three-color images of 128x128 pixels; the database must be located in the folder /home/andrey/doodle_db). The learning rate, the optimizer type and any other configurable parameters can be added to the list. While preparing this publication, it turned out that the mobilenet_v2 architecture from the current version of keras cannot be used in R because of changes that the R package has not taken into account; we are waiting for a fix.

This approach made it possible to speed up experiments with different models significantly compared to the more traditional launching of scripts in RStudio (we note the tfruns package as a possible alternative). But the main advantage is the ability to easily manage the launch of scripts in Docker or simply on a server, without installing RStudio for that.

6. Dockerizing the scripts

We used Docker to keep the model-training environment portable between team members and to deploy it quickly in the cloud. If you want to get acquainted with this tool, which is still fairly unusual for an R programmer, you can start with this series of publications or with a video course.

Docker lets you build your own images from scratch or use other images as a base for your own. After looking at the available options, we concluded that installing the NVIDIA drivers, CUDA + cuDNN and the Python libraries makes up a fairly bulky part of the image, so we decided to take the official image tensorflow/tensorflow:1.12.0-gpu as the base and add the required R packages on top of it.

The final Dockerfile looked like this:

Dockerfile

FROM tensorflow/tensorflow:1.12.0-gpu

MAINTAINER Artem Klevtsov <[email protected]>

SHELL ["/bin/bash", "-c"]

ARG LOCALE="en_US.UTF-8"
ARG APT_PKG="libopencv-dev r-base r-base-dev littler"
ARG R_BIN_PKG="futile.logger checkmate data.table rcpp rapidjsonr dbi keras jsonlite curl digest remotes"
ARG R_SRC_PKG="xtensor RcppThread docopt MonetDBLite"
ARG PY_PIP_PKG="keras"
ARG DIRS="/db /app /app/data /app/models /app/logs"

RUN source /etc/os-release && \
    echo "deb https://cloud.r-project.org/bin/linux/ubuntu ${UBUNTU_CODENAME}-cran35/" > /etc/apt/sources.list.d/cran35.list && \
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9 && \
    add-apt-repository -y ppa:marutter/c2d4u3.5 && \
    add-apt-repository -y ppa:timsc/opencv-3.4 && \
    apt-get update && \
    apt-get install -y locales && \
    locale-gen ${LOCALE} && \
    apt-get install -y --no-install-recommends ${APT_PKG} && \
    ln -s /usr/lib/R/site-library/littler/examples/install.r /usr/local/bin/install.r && \
    ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r && \
    ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r && \
    echo 'options(Ncpus = parallel::detectCores())' >> /etc/R/Rprofile.site && \
    echo 'options(repos = c(CRAN = "https://cloud.r-project.org"))' >> /etc/R/Rprofile.site && \
    apt-get install -y $(printf "r-cran-%s " ${R_BIN_PKG}) && \
    install.r ${R_SRC_PKG} && \
    pip install ${PY_PIP_PKG} && \
    mkdir -p ${DIRS} && \
    chmod 777 ${DIRS} && \
    rm -rf /tmp/downloaded_packages/ /tmp/*.rds && \
    rm -rf /var/lib/apt/lists/*

COPY utils /app/utils
COPY src /app/src
COPY tests /app/tests
COPY bin/*.R /app/

ENV DBDIR="/db"
ENV CUDA_HOME="/usr/local/cuda"
ENV PATH="/app:${PATH}"

WORKDIR /app

VOLUME /db
VOLUME /app

CMD bash

For convenience, the packages used are listed in variables; most of the scripts we wrote are copied into the container at build time. We also changed the command shell to /bin/bash so that the contents of /etc/os-release are easy to use. This removed the need to hard-code the OS version.

In addition, we wrote a small bash script that launches a container with various commands. These can be, for example, the neural network training scripts previously placed inside the container, or a command shell for debugging and monitoring the container:

Script for launching the container

#!/bin/sh

DBDIR=${PWD}/db
LOGSDIR=${PWD}/logs
MODELDIR=${PWD}/models
DATADIR=${PWD}/data
ARGS="--runtime=nvidia --rm -v ${DBDIR}:/db -v ${LOGSDIR}:/app/logs -v ${MODELDIR}:/app/models -v ${DATADIR}:/app/data"

if [ -z "$1" ]; then
    CMD="Rscript /app/train_nn.R"
elif [ "$1" = "bash" ]; then
    ARGS="${ARGS} -ti"
else
    CMD="Rscript /app/train_nn.R $@"
fi

docker run ${ARGS} doodles-tf ${CMD}

If this bash script is run without parameters, the script train_nn.R is called inside the container with default values; if the first positional argument is "bash", the container starts interactively with a command shell. In all other cases, the values of the positional arguments are substituted: CMD="Rscript /app/train_nn.R $@".

Note that the directories with the source data and the database, as well as the directory for saving trained models, are mounted into the container from the host system, so the results of the scripts are accessible without any extra manipulation.

7. Using multiple GPUs on Google Cloud

One of the features of the competition was very noisy data (see the title picture, borrowed from @Leigh.plt in the ODS Slack). Large batches help to combat this, and after experiments on a PC with 1 GPU we decided to learn how to train models on several GPUs in the cloud. We used GoogleCloud (a good guide to the basics) because of its large selection of available configurations, reasonable prices and the $300 bonus. Out of greed, I ordered a 4xV100 instance with an SSD and a ton of RAM, and that was a big mistake. Such a machine eats money quickly; you can go broke experimenting without a proven pipeline. For learning purposes it is better to take a K80. The large amount of RAM did come in handy, though: the cloud SSD did not impress with its performance, so the database was moved to /dev/shm.

The most interesting part is the code fragment responsible for using multiple GPUs. First, the model is created on the CPU using a context manager, just like in Python:

with(tensorflow::tf$device("/cpu:0"), {
  model_cpu <- get_model(
    name = model_name,
    input_shape = input_shape,
    weights = weights,
    metrics = c(top_3_categorical_accuracy),
    compile = FALSE
  )
})

Then the uncompiled (this is important) model is copied to the given number of available GPUs, and only after that is it compiled:

model <- keras::multi_gpu_model(model_cpu, gpus = n_gpu)
keras::compile(
  object = model,
  optimizer = keras::optimizer_adam(lr = 0.0004),
  loss = "categorical_crossentropy",
  metrics = c(top_3_categorical_accuracy)
)

We did not manage to implement, for several GPUs, the classic technique of freezing all layers except the last one, training the last layer, then unfreezing and retraining the whole model.
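For reference, here is a minimal single-device sketch of that freeze-then-unfreeze technique. It is not the authors' code: the tiny sequential network stands in for a real architecture, the learning rates are arbitrary, and the calls assume the keras R package of the same generation as the rest of the article (newer releases name the optimizer argument learning_rate rather than lr).

```r
library(keras)

# Hypothetical stand-in network; a real run would use a pretrained architecture.
model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = c(8)) %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 3, activation = "softmax")

n_layers <- length(model$layers)

# Stage 1: freeze every layer except the last one, then compile.
# Changes to the trainable flags only take effect after compile().
freeze_weights(model, from = 1, to = n_layers - 1)
compile(
  object = model,
  optimizer = optimizer_adam(lr = 1e-3),
  loss = "categorical_crossentropy"
)
frozen_flags <- vapply(model$layers, function(l) l$trainable, logical(1))
# ... train the last layer with fit() here ...

# Stage 2: unfreeze everything and recompile before retraining the whole model,
# typically with a smaller learning rate.
unfreeze_weights(model)
compile(
  object = model,
  optimizer = optimizer_adam(lr = 1e-4),
  loss = "categorical_crossentropy"
)
unfrozen_flags <- vapply(model$layers, function(l) l$trainable, logical(1))
# ... continue training all layers with fit() here ...
```

The model has to be recompiled after each change of the trainable flags, which is the step that has to be reconciled with the create-on-CPU, copy, then compile order used for several GPUs.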

Training was monitored without tensorboard; we limited ourselves to recording logs and saving models with informative names after each epoch:

Callbacks

# Log file name template
log_file_tmpl <- file.path("logs", sprintf(
  "%s_%d_%dch_%s.csv",
  model_name,
  dim_size,
  channels,
  format(Sys.time(), "%Y%m%d%H%M%OS")
))
# Model file name template
model_file_tmpl <- file.path("models", sprintf(
  "%s_%d_%dch_{epoch:02d}_{val_loss:.2f}.h5",
  model_name,
  dim_size,
  channels
))

callbacks_list <- list(
  keras::callback_csv_logger(
    filename = log_file_tmpl
  ),
  keras::callback_early_stopping(
    monitor = "val_loss",
    min_delta = 1e-4,
    patience = 8,
    verbose = 1,
    mode = "min"
  ),
  keras::callback_reduce_lr_on_plateau(
    monitor = "val_loss",
    factor = 0.5, # halve the learning rate
    patience = 4,
    verbose = 1,
    min_delta = 1e-4,
    mode = "min"
  ),
  keras::callback_model_checkpoint(
    filepath = model_file_tmpl,
    monitor = "val_loss",
    save_best_only = FALSE,
    save_weights_only = FALSE,
    mode = "min"
  )
)
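To make the naming scheme concrete, the base R sketch below shows what the model file template expands to and how the learning rate shrinks under factor = 0.5. The values of model_name, dim_size and channels, the epoch number and the val_loss are illustrative assumptions, and the two sub() calls only imitate the placeholder substitution that Keras performs when it saves a checkpoint.

```r
# Illustrative values only; the real ones come from the script's arguments.
model_name <- "resnet50"
dim_size <- 128L
channels <- 3L

# Same sprintf() pattern as in the callbacks listing.
model_file_tmpl <- file.path("models", sprintf(
  "%s_%d_%dch_{epoch:02d}_{val_loss:.2f}.h5",
  model_name,
  dim_size,
  channels
))
print(model_file_tmpl)

# Keras fills in {epoch:02d} and {val_loss:.2f} at save time.
# Imitate that for a hypothetical epoch 3 with val_loss = 0.8712.
example_file <- sub("{epoch:02d}", sprintf("%02d", 3L), model_file_tmpl, fixed = TRUE)
example_file <- sub("{val_loss:.2f}", sprintf("%.2f", 0.8712), example_file, fixed = TRUE)
print(example_file)

# Learning rate after 0, 1, 2 and 3 plateau-triggered reductions,
# starting from the 0.0004 passed to optimizer_adam().
lr_after <- 0.0004 * 0.5^(0:3)
print(lr_after)
```

Because save_best_only is FALSE, one such file is written per epoch, so the validation loss in the file name is what lets you pick the best checkpoint afterwards.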

8. Instead of a conclusion

A number of the problems we ran into have not been overcome yet:

  • in keras there is no ready-made function for automatically searching for the optimal learning rate (an analogue of lr_finder in the fast.ai library); with some effort, third-party implementations can be ported to R, for example, this one;
  • as a consequence of the previous point, we could not select the right learning rate when using several GPUs;
  • there is a shortage of modern neural network architectures, especially ones pre-trained on imagenet;
  • there is no one cycle policy and no discriminative learning rates (cosine annealing was implemented at our request, thanks skeydan).

Useful things we learned from this competition:

  • On relatively low-powered hardware, you can work painlessly with decent volumes of data (many times the size of RAM). The data.table package saves memory by modifying tables in place, which avoids copying them, and when used correctly it almost always shows the highest speed among all the tools for scripting languages that we know of. Storing data in a database means that in many cases you do not have to think about squeezing the entire dataset into RAM at all.
  • Slow functions in R can be replaced with fast ones in C++ using the Rcpp package. If you also use RcppThread or RcppParallel, you get cross-platform multi-threaded implementations, so there is no need to parallelize the code at the R level.
  • The Rcpp package can be used without serious knowledge of C++; the required minimum is outlined here. Header files for a number of cool C++ libraries such as xtensor are available on CRAN, which means an infrastructure is taking shape for projects that integrate ready-made high-performance C++ code into R. Syntax highlighting and a static C++ code analyzer in RStudio are an additional convenience.
  • docopt lets you run self-contained scripts with parameters. This is convenient on a remote server, including under Docker. In RStudio it is inconvenient to run many hours of neural network training experiments, and installing the IDE on the server itself is not always justified.
  • Docker ensures code portability and reproducibility of results between developers with different versions of the OS and libraries, as well as ease of execution on servers. The entire training pipeline can be launched with a single command.
  • Google Cloud is a budget-friendly way to experiment on expensive hardware, but configurations need to be chosen carefully.
  • Measuring the speed of individual code fragments is very useful, especially when combining R and C++, and with the bench package it is also very easy.
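As an illustration of the last point, here is a minimal sketch of timing two alternatives with bench::mark(). The function sum_loop and the input vector are hypothetical examples, not code from the project; an Rcpp function could be benchmarked against its R counterpart in the same way.

```r
library(bench)

# Hypothetical slow implementation: an explicit R-level loop.
sum_loop <- function(x) {
  total <- 0
  for (value in x) {
    total <- total + value
  }
  total
}

x <- as.numeric(1:100000)

# bench::mark() runs each expression repeatedly and, by default,
# also checks that all expressions return equal results.
timings <- bench::mark(
  loop = sum_loop(x),
  builtin = sum(x)
)
print(timings)
```

The printed table reports, among other columns, the minimum and median time and the memory allocated by each expression, which is usually enough to decide whether a rewrite in C++ is worth the effort.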

Overall, the experience was very rewarding, and we continue to work on resolving some of the issues raised.

Source: www.habr.com
