Quick Draw Doodle Recognition: how to make R, C++ and neural networks friends

Hi Habr!

Last fall, Kaggle hosted the Quick Draw Doodle Recognition competition on classifying hand-drawn images, in which, among others, a team of R users took part: Artem Klevtsov, Philipp Upravitelev and Andrey Ogurtsov. We will not describe the competition in detail here; that has already been done in a recent publication.

We had no luck with medal farming this time, but a lot of valuable experience was gained, so I would like to tell the community about a number of interesting and useful things from Kaggle and from everyday work. Among the topics covered: the hard life without OpenCV, JSON parsing (the examples look at integrating C++ code into R scripts or packages using Rcpp), parameterization of scripts and dockerization of the final solution. All the code from this post, in a form suitable for running, is available in the repository.

Contents:

  1. Efficiently loading data from CSV into MonetDB
  2. Preparing batches
  3. Iterators for unloading batches from the database
  4. Choosing the model architecture
  5. Parameterization of scripts
  6. Dockerization of scripts
  7. Using multiple GPUs on Google Cloud
  8. Instead of a conclusion

1. Efficiently loading data from CSV into the MonetDB database

The data in this competition was not provided as ready-made images but as 340 CSV files (one file per class) containing JSONs with point coordinates. By connecting these points with lines, we get a final image of 256x256 pixels. Each record also carries a flag indicating whether the picture was correctly recognized by the classifier used while the data was being collected, a two-letter code of the country of residence of the picture's author, a unique identifier, a timestamp, and a class name matching the file name. The simplified version of the original data weighs 7.4 GB in the archive and about 20 GB after unpacking; the full data takes up 240 GB after unpacking. The organizers guaranteed that both versions reproduce the same drawings, so the full version would have been redundant for us. In any case, storing 50 million images as graphic files or as arrays was immediately judged unprofitable, and we decided to merge all the CSV files from the train_simplified.zip archive into a database, with subsequent generation of images of the required size "on the fly" for each batch.
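
To make the format concrete: the drawing field of each record holds a JSON array of strokes, each stroke being a pair of coordinate lists. A minimal sketch in R with a made-up example string (not taken from the dataset) looks like this:

# Hypothetical value of the 'drawing' field: two strokes,
# each given as a pair [x-coordinates, y-coordinates]
doodle_json <- '[[[50, 120, 200], [100, 105, 110]], [[10, 60], [180, 80]]]'
strokes <- jsonlite::fromJSON(doodle_json, simplifyMatrix = FALSE)
length(strokes)    # number of strokes: 2
strokes[[1]][[1]]  # x-coordinates of the first stroke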

A well-proven DBMS was chosen for this: MonetDB, namely its R implementation in the form of the MonetDBLite package. The package includes an embedded version of the database server and lets you pick the server up directly from an R session and work with it there. Creating a database and connecting to it is done with a single command:

con <- DBI::dbConnect(drv = MonetDBLite::MonetDBLite(), Sys.getenv("DBDIR"))

We will need to create two tables: one for all the data, the other for service information about the uploaded files (useful if something goes wrong and the process has to be resumed after loading several files):

Creating the tables

if (!DBI::dbExistsTable(con, "doodles")) {
  DBI::dbCreateTable(
    con = con,
    name = "doodles",
    fields = c(
      "countrycode" = "char(2)",
      "drawing" = "text",
      "key_id" = "bigint",
      "recognized" = "bool",
      "timestamp" = "timestamp",
      "word" = "text"
    )
  )
}

if (!DBI::dbExistsTable(con, "upload_log")) {
  DBI::dbCreateTable(
    con = con,
    name = "upload_log",
    fields = c(
      "id" = "serial",
      "file_name" = "text UNIQUE",
      "uploaded" = "bool DEFAULT false"
    )
  )
}

The fastest way to load the data into the database turned out to be copying the CSV files directly with the SQL command COPY OFFSET 2 INTO tablename FROM path USING DELIMITERS ',','\n','"' NULL AS '' BEST EFFORT, where tablename is the table name and path is the path to the file. While working with the archive, it was discovered that the built-in unzip implementation in R does not handle a number of files from the archive correctly, so we used the system unzip (via the getOption("unzip") parameter).
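
The post does not show how exactly the system utility was selected; one way to point R's unzip() at it (an assumption, adjust to your system) is:

# Use the system unzip binary instead of R's built-in implementation
options(unzip = unname(Sys.which("unzip")))
getOption("unzip")  # e.g. "/usr/bin/unzip"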

Function for writing the data to the database

#' @title Extracting and loading files
#'
#' @description
#' Extracts CSV files from a ZIP archive and loads them into the database
#'
#' @param con Database connection object (class `MonetDBEmbeddedConnection`).
#' @param tablename Name of the table in the database.
#' @param zipfile Path to the ZIP archive.
#' @param filename Name of the file inside the ZIP archive.
#' @param preprocess Preprocessing function that will be applied to the extracted file.
#'   Must accept a single argument `data` (a `data.table` object).
#'
#' @return `TRUE`.
#'
upload_file <- function(con, tablename, zipfile, filename, preprocess = NULL) {
  # Argument checks
  checkmate::assert_class(con, "MonetDBEmbeddedConnection")
  checkmate::assert_string(tablename)
  checkmate::assert_string(filename)
  checkmate::assert_true(DBI::dbExistsTable(con, tablename))
  checkmate::assert_file_exists(zipfile, access = "r", extension = "zip")
  checkmate::assert_function(preprocess, args = c("data"), null.ok = TRUE)

  # Extract the file
  path <- file.path(tempdir(), filename)
  unzip(zipfile, files = filename, exdir = tempdir(), 
        junkpaths = TRUE, unzip = getOption("unzip"))
  on.exit(unlink(file.path(path)))

  # Apply the preprocessing function
  if (!is.null(preprocess)) {
    .data <- data.table::fread(file = path)
    .data <- preprocess(data = .data)
    data.table::fwrite(x = .data, file = path, append = FALSE)
    rm(.data)
  }

  # Query to the DB to import the CSV
  sql <- sprintf(
    "COPY OFFSET 2 INTO %s FROM '%s' USING DELIMITERS ',','\\n','\"' NULL AS '' BEST EFFORT",
    tablename, path
  )
  # Execute the query
  DBI::dbExecute(con, sql)

  # Record the successful upload in the service table
  DBI::dbExecute(con, sprintf("INSERT INTO upload_log(file_name, uploaded) VALUES('%s', true)",
                              filename))

  return(invisible(TRUE))
}

If you need to transform the table before writing it to the database, it is enough to pass a function that transforms the data in the preprocess argument.
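
For example, a hypothetical preprocess function that keeps only the drawings flagged as correctly recognized could be passed like this (the filter itself is purely illustrative and was not part of our pipeline):

# Hypothetical preprocessing: keep only correctly recognized drawings
keep_recognized <- function(data) {
  data[recognized == TRUE]
}

upload_file(con = con, tablename = "doodles",
            zipfile = zipfile, filename = "cat.csv",
            preprocess = keep_recognized)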

Code for loading the data into the database sequentially:

Writing the data to the database

# List of files to write
files <- unzip(zipfile, list = TRUE)$Name

# Skip list, in case some of the files have already been uploaded
to_skip <- DBI::dbGetQuery(con, "SELECT file_name FROM upload_log")[[1L]]
files <- setdiff(files, to_skip)

if (length(files) > 0L) {
  # Start the timer
  tictoc::tic()
  # Progress bar
  pb <- txtProgressBar(min = 0L, max = length(files), style = 3)
  for (i in seq_along(files)) {
    upload_file(con = con, tablename = "doodles", 
                zipfile = zipfile, filename = files[i])
    setTxtProgressBar(pb, i)
  }
  close(pb)
  # Stop the timer
  tictoc::toc()
}

# 526.141 sec elapsed - copying SSD->SSD
# 558.879 sec elapsed - copying USB->SSD

The data loading time may vary depending on the speed characteristics of the drive used. In our case, reading and writing within a single SSD, or from a flash drive (the source file) to an SSD (the DB), takes less than 10 minutes.

It takes a few more seconds to create a column with an integer class label and an index column (ORDERED INDEX) with the row numbers by which observations will be sampled when forming batches:

Creating additional columns and an index

message("Generate labels")
invisible(DBI::dbExecute(con, "ALTER TABLE doodles ADD label_int int"))
invisible(DBI::dbExecute(con, "UPDATE doodles SET label_int = dense_rank() OVER (ORDER BY word) - 1"))

message("Generate row numbers")
invisible(DBI::dbExecute(con, "ALTER TABLE doodles ADD id serial"))
invisible(DBI::dbExecute(con, "CREATE ORDERED INDEX doodles_id_ord_idx ON doodles(id)"))

To solve the problem of forming batches on the fly, we needed to achieve the maximum speed of extracting random rows from the doodles table. For this we used three tricks. The first was to reduce the size of the type that stores the observation identifiers. In the original dataset the identifier requires a bigint, but the number of observations makes it possible to fit their identifiers, equal to the ordinal number, into the int type; lookups are much faster in this case. The second trick was to use ORDERED INDEX; we arrived at this decision empirically, after trying all the available options. The third was to use parameterized queries. The essence of this method is to execute the PREPARE command once and then reuse the prepared expression when issuing a large number of identical queries, although in practice the gain compared to a simple SELECT turned out to be within statistical error.
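
For reference, the difference between the two query styles looks roughly like this (a sketch; the ids are hypothetical, and the prepared form is the one used in the benchmark below):

ids <- sample.int(1e6, 128)  # hypothetical identifiers of one batch

# Plain SELECT: the id list is pasted into the query text every time
q <- sprintf("SELECT drawing, label_int FROM doodles WHERE id IN (%s)",
             paste(ids, collapse = ","))
batch <- DBI::dbGetQuery(con, q)

# Parameterized query: PREPARE once, then only bind new ids on each call
rs <- DBI::dbSendQuery(con, sprintf(
  "PREPARE SELECT drawing, label_int FROM doodles WHERE id IN (%s)",
  paste(rep("?", length(ids)), collapse = ",")
))
batch <- DBI::dbFetch(DBI::dbBind(rs, as.list(ids)))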

The data loading process consumes no more than 450 MB of RAM. In other words, the described approach lets you move datasets weighing tens of gigabytes on almost any budget hardware, including some single-board devices, which is rather nice.

All that remains is to measure the speed of fetching (random) data and to evaluate how it scales when sampling batches of different sizes:

Database benchmark

library(ggplot2)

set.seed(0)
# Connect to the database
con <- DBI::dbConnect(MonetDBLite::MonetDBLite(), Sys.getenv("DBDIR"))
# Total number of rows in the table (used when sampling random ids below)
n <- DBI::dbGetQuery(con, "SELECT count(*) FROM doodles")[[1L]]

# Function to prepare the query on the server side
prep_sql <- function(batch_size) {
  sql <- sprintf("PREPARE SELECT id FROM doodles WHERE id IN (%s)",
                 paste(rep("?", batch_size), collapse = ","))
  res <- DBI::dbSendQuery(con, sql)
  return(res)
}

# Function to fetch the data
fetch_data <- function(rs, batch_size) {
  ids <- sample(seq_len(n), batch_size)
  res <- DBI::dbFetch(DBI::dbBind(rs, as.list(ids)))
  return(res)
}

# Run the benchmark
res_bench <- bench::press(
  batch_size = 2^(4:10),
  {
    rs <- prep_sql(batch_size)
    bench::mark(
      fetch_data(rs, batch_size),
      min_iterations = 50L
    )
  }
)
# Benchmark parameters
cols <- c("batch_size", "min", "median", "max", "itr/sec", "total_time", "n_itr")
res_bench[, cols]

#   batch_size      min   median      max `itr/sec` total_time n_itr
#        <dbl> <bch:tm> <bch:tm> <bch:tm>     <dbl>   <bch:tm> <int>
# 1         16   23.6ms  54.02ms  93.43ms     18.8        2.6s    49
# 2         32     38ms  84.83ms 151.55ms     11.4       4.29s    49
# 3         64   63.3ms 175.54ms 248.94ms     5.85       8.54s    50
# 4        128   83.2ms 341.52ms 496.24ms     3.00      16.69s    50
# 5        256  232.8ms 653.21ms 847.44ms     1.58      31.66s    50
# 6        512  784.6ms    1.41s    1.98s     0.740       1.1m    49
# 7       1024  681.7ms    2.72s    4.06s     0.377      2.16m    49

ggplot(res_bench, aes(x = factor(batch_size), y = median, group = 1)) +
  geom_point() +
  geom_line() +
  ylab("median time, s") +
  theme_minimal()

DBI::dbDisconnect(con, shutdown = TRUE)

[Plot: median random-row fetch time from MonetDBLite as a function of batch size]

2. Preparing batches

The whole batch preparation process consists of the following steps:

  1. Parsing several JSONs containing strokes with point coordinates.
  2. Drawing coloured lines from the point coordinates on an image of the required size (for example, 256×256 or 128×128).
  3. Converting the resulting images into a tensor.

Within the competition's Python kernels, this problem was solved mainly with OpenCV. One of the simplest and most obvious analogues in R would look like this:

Implementation of the JSON-to-tensor conversion in R

r_process_json_str <- function(json, line.width = 3, 
                               color = TRUE, scale = 1) {
  # Parse JSON
  coords <- jsonlite::fromJSON(json, simplifyMatrix = FALSE)
  tmp <- tempfile()
  # Remove the temporary file when the function exits
  on.exit(unlink(tmp))
  png(filename = tmp, width = 256 * scale, height = 256 * scale, pointsize = 1)
  # Empty plot
  plot.new()
  # Plot window size
  plot.window(xlim = c(256 * scale, 0), ylim = c(256 * scale, 0))
  # Line colours
  cols <- if (color) rainbow(length(coords)) else "#000000"
  for (i in seq_along(coords)) {
    lines(x = coords[[i]][[1]] * scale, y = coords[[i]][[2]] * scale, 
          col = cols[i], lwd = line.width)
  }
  dev.off()
  # Convert the image into a 3-dimensional array
  res <- png::readPNG(tmp)
  return(res)
}

r_process_json_vector <- function(x, ...) {
  res <- lapply(x, r_process_json_str, ...)
  # Combine the 3-dimensional image arrays into a 4-dimensional tensor
  res <- do.call(abind::abind, c(res, along = 0))
  return(res)
}

The drawing is done with standard R tools and saved to a temporary PNG stored in RAM (on Linux, R's temporary directories live in /tmp, which is mounted in RAM). This file is then read as a three-dimensional array with values ranging from 0 to 1. This is important because the more conventional BMP would be read as a raw array with hex colour codes.

Let's test the result:

zip_file <- file.path("data", "train_simplified.zip")
csv_file <- "cat.csv"
unzip(zip_file, files = csv_file, exdir = tempdir(), 
      junkpaths = TRUE, unzip = getOption("unzip"))
tmp_data <- data.table::fread(file.path(tempdir(), csv_file), sep = ",", 
                              select = "drawing", nrows = 10000)
arr <- r_process_json_str(tmp_data[4, drawing])
dim(arr)
# [1] 256 256   3
plot(magick::image_read(arr))

[Image: the doodle rendered from the JSON string]

The batch itself is formed as follows:

res <- r_process_json_vector(tmp_data[1:4, drawing], scale = 0.5)
str(res)
 # num [1:4, 1:128, 1:128, 1:3] 1 1 1 1 1 1 1 1 1 1 ...
 # - attr(*, "dimnames")=List of 4
 #  ..$ : NULL
 #  ..$ : NULL
 #  ..$ : NULL
 #  ..$ : NULL

This implementation seemed suboptimal to us, since forming large batches takes indecently long, and we decided to take advantage of our colleagues' experience and use a powerful library: OpenCV. At that time there was no ready-made OpenCV package for R (there is none now), so a minimal implementation of the required functionality was written in C++ and integrated into R code using Rcpp.

The following packages and libraries were used to solve the problem:

  1. OpenCV for working with images and drawing lines. We used pre-installed system libraries and header files, as well as dynamic linking.

  2. xtensor for working with multidimensional arrays and tensors. We used the header files included in the R package of the same name. The library allows you to work with multidimensional arrays in both row-major and column-major order.

  3. ndjson for parsing JSON. This library is used in xtensor automatically if it is present in the project.

  4. RcppThread for organizing multi-threaded processing of the vector of JSON strings. We used the header files provided by this package. Unlike the better-known RcppParallel package it offers, among other things, a built-in loop interruption mechanism.

It is worth noting that xtensor turned out to be a godsend: besides its extensive functionality and high performance, its developers turned out to be very responsive, answering questions promptly and in detail. With their help it was possible to implement the conversion of OpenCV matrices into xtensor tensors, as well as a way to combine 3-dimensional image tensors into a 4-dimensional tensor of the correct dimensionality (the batch itself).

Resources for learning Rcpp, xtensor and RcppThread

https://thecoatlessprofessor.com/programming/unofficial-rcpp-api-documentation

https://docs.opencv.org/4.0.1/d7/dbd/group__imgproc.html

https://xtensor.readthedocs.io/en/latest/

https://xtensor.readthedocs.io/en/latest/file_loading.html#loading-json-data-into-xtensor

https://cran.r-project.org/web/packages/RcppThread/vignettes/RcppThread-vignette.pdf

To compile files that use system header files and dynamic linking against libraries installed in the system, we used the plugin mechanism implemented in the Rcpp package. To find the paths and flags automatically, we used the popular Linux utility pkg-config.

Implementation of the Rcpp plugin for using the OpenCV library

Rcpp::registerPlugin("opencv", function() {
  # Possible package names
  pkg_config_name <- c("opencv", "opencv4")
  # Path to the pkg-config binary
  pkg_config_bin <- Sys.which("pkg-config")
  # Check that the utility is present in the system
  checkmate::assert_file_exists(pkg_config_bin, access = "x")
  # Check that an OpenCV config for pkg-config exists
  check <- sapply(pkg_config_name, 
                  function(pkg) system(paste(pkg_config_bin, pkg)))
  if (all(check != 0)) {
    stop("OpenCV config for the pkg-config not found", call. = FALSE)
  }

  pkg_config_name <- pkg_config_name[check == 0]
  list(env = list(
    PKG_CXXFLAGS = system(paste(pkg_config_bin, "--cflags", pkg_config_name), 
                          intern = TRUE),
    PKG_LIBS = system(paste(pkg_config_bin, "--libs", pkg_config_name), 
                      intern = TRUE)
  ))
})

As a result of the plugin's work, the following values are substituted during compilation:

Rcpp:::.plugins$opencv()$env

# $PKG_CXXFLAGS
# [1] "-I/usr/include/opencv"
#
# $PKG_LIBS
# [1] "-lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_video -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_viz -lopencv_surface_matching -lopencv_text -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo -lopencv_imgproc -lopencv_core"

The implementation code for parsing JSON and forming batches to pass to the model is given under the spoiler. First, add a local project directory to the header search path (it is needed for ndjson):

Sys.setenv("PKG_CXXFLAGS" = paste0("-I", normalizePath(file.path("src"))))

Implementation of the JSON-to-tensor conversion in C++

// [[Rcpp::plugins(cpp14)]]
// [[Rcpp::plugins(opencv)]]
// [[Rcpp::depends(xtensor)]]
// [[Rcpp::depends(RcppThread)]]

#include <xtensor/xjson.hpp>
#include <xtensor/xadapt.hpp>
#include <xtensor/xview.hpp>
#include <xtensor-r/rtensor.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <Rcpp.h>
#include <RcppThread.h>

// Type aliases
using RcppThread::parallelFor;
using json = nlohmann::json;
using points = xt::xtensor<double,2>;     // Point coordinates extracted from JSON
using strokes = std::vector<points>;      // A set of strokes (point coordinates extracted from JSON)
using xtensor3d = xt::xtensor<double, 3>; // Tensor for storing the image matrix
using xtensor4d = xt::xtensor<double, 4>; // Tensor for storing a set of images
using rtensor3d = xt::rtensor<double, 3>; // Wrapper for exporting to R
using rtensor4d = xt::rtensor<double, 4>; // Wrapper for exporting to R

// Static constants
// Image size in pixels
const static int SIZE = 256;
// Line type
// See https://en.wikipedia.org/wiki/Pixel_connectivity#2-dimensional
const static int LINE_TYPE = cv::LINE_4;
// Line width in pixels
const static int LINE_WIDTH = 3;
// Resize algorithm
// https://docs.opencv.org/3.1.0/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121
const static int RESIZE_TYPE = cv::INTER_LINEAR;

// Template for converting an OpenCV matrix into a tensor
template <typename T, int NCH, typename XT=xt::xtensor<T,3,xt::layout_type::column_major>>
XT to_xt(const cv::Mat_<cv::Vec<T, NCH>>& src) {
  // Shape of the target tensor
  std::vector<int> shape = {src.rows, src.cols, NCH};
  // Total number of elements in the array
  size_t size = src.total() * NCH;
  // Convert cv::Mat to xt::xtensor
  XT res = xt::adapt((T*) src.data, size, xt::no_ownership(), shape);
  return res;
}

// Convert JSON into a list of point coordinates
strokes parse_json(const std::string& x) {
  auto j = json::parse(x);
  // The parse result must be an array
  if (!j.is_array()) {
    throw std::runtime_error("'x' must be JSON array.");
  }
  strokes res;
  res.reserve(j.size());
  for (const auto& a: j) {
    // Each array element must be a 2-dimensional array
    if (!a.is_array() || a.size() != 2) {
      throw std::runtime_error("'x' must include only 2d arrays.");
    }
    // Extract the vector of points
    auto p = a.get<points>();
    res.push_back(p);
  }
  return res;
}

// Draw the lines
// Colours in HSV
cv::Mat ocv_draw_lines(const strokes& x, bool color = true) {
  // Source matrix type
  auto stype = color ? CV_8UC3 : CV_8UC1;
  // Target matrix type
  auto dtype = color ? CV_32FC3 : CV_32FC1;
  auto bg = color ? cv::Scalar(0, 0, 255) : cv::Scalar(255);
  auto col = color ? cv::Scalar(0, 255, 220) : cv::Scalar(0);
  cv::Mat img = cv::Mat(SIZE, SIZE, stype, bg);
  // Number of strokes
  size_t n = x.size();
  for (const auto& s: x) {
    // Number of points in the stroke
    size_t n_points = s.shape()[1];
    for (size_t i = 0; i < n_points - 1; ++i) {
      // Start point of the segment
      cv::Point from(s(0, i), s(1, i));
      // End point of the segment
      cv::Point to(s(0, i + 1), s(1, i + 1));
      // Draw the line segment
      cv::line(img, from, to, col, LINE_WIDTH, LINE_TYPE);
    }
    if (color) {
      // Change the line colour
      col[0] += 180 / n;
    }
  }
  if (color) {
    // Convert the colour representation to RGB
    cv::cvtColor(img, img, cv::COLOR_HSV2RGB);
  }
  // Convert to float32 with values in the range [0, 1]
  img.convertTo(img, dtype, 1 / 255.0);
  return img;
}

// Process the JSON and obtain a tensor with the image data
xtensor3d process(const std::string& x, double scale = 1.0, bool color = true) {
  auto p = parse_json(x);
  auto img = ocv_draw_lines(p, color);
  if (scale != 1) {
    cv::Mat out;
    cv::resize(img, out, cv::Size(), scale, scale, RESIZE_TYPE);
    cv::swap(img, out);
    out.release();
  }
  xtensor3d arr = color ? to_xt<double,3>(img) : to_xt<double,1>(img);
  return arr;
}

// [[Rcpp::export]]
rtensor3d cpp_process_json_str(const std::string& x, 
                               double scale = 1.0, 
                               bool color = true) {
  xtensor3d res = process(x, scale, color);
  return res;
}

// [[Rcpp::export]]
rtensor4d cpp_process_json_vector(const std::vector<std::string>& x, 
                                  double scale = 1.0, 
                                  bool color = false) {
  size_t n = x.size();
  size_t dim = floor(SIZE * scale);
  size_t channels = color ? 3 : 1;
  xtensor4d res({n, dim, dim, channels});
  parallelFor(0, n, [&x, &res, scale, color](int i) {
    xtensor3d tmp = process(x[i], scale, color);
    auto view = xt::view(res, i, xt::all(), xt::all(), xt::all());
    view = tmp;
  });
  return res;
}

This code should be placed in the file src/cv_xt.cpp and compiled with the command Rcpp::sourceCpp(file = "src/cv_xt.cpp", env = .GlobalEnv); working with it also requires nlohmann/json.hpp from its repository. The code is split into several functions:

  • to_xt — a templated function for converting an image matrix (cv::Mat) into an xt::xtensor tensor;

  • parse_json — parses a JSON string, extracts the point coordinates and packs them into a vector;

  • ocv_draw_lines — draws multi-coloured lines from the resulting vector of points;

  • process — combines the functions above and also adds the ability to scale the resulting image;

  • cpp_process_json_str — a wrapper over the process function that exports the result into an R object (a multidimensional array);

  • cpp_process_json_vector — a wrapper over cpp_process_json_str that allows processing a vector of strings in multi-threaded mode.

The HSV colour model was used to draw multi-coloured lines, with subsequent conversion to RGB. Let's test the result:

arr <- cpp_process_json_str(tmp_data[4, drawing])
dim(arr)
# [1] 256 256   3
plot(magick::image_read(arr))

[Image: the doodle rendered by the C++ implementation]

Comparison of execution speed in R and C++

res_bench <- bench::mark(
  r_process_json_str(tmp_data[4, drawing], scale = 0.5),
  cpp_process_json_str(tmp_data[4, drawing], scale = 0.5),
  check = FALSE,
  min_iterations = 100
)
# Benchmark parameters
cols <- c("expression", "min", "median", "max", "itr/sec", "total_time", "n_itr")
res_bench[, cols]

#   expression                min     median       max `itr/sec` total_time  n_itr
#   <chr>                <bch:tm>   <bch:tm>  <bch:tm>     <dbl>   <bch:tm>  <int>
# 1 r_process_json_str     3.49ms     3.55ms    4.47ms      273.      490ms    134
# 2 cpp_process_json_str   1.94ms     2.02ms    5.32ms      489.      497ms    243

library(ggplot2)
# Run the benchmark
res_bench <- bench::press(
  batch_size = 2^(4:10),
  {
    .data <- tmp_data[sample(seq_len(.N), batch_size), drawing]
    bench::mark(
      r_process_json_vector(.data, scale = 0.5),
      cpp_process_json_vector(.data,  scale = 0.5),
      min_iterations = 50,
      check = FALSE
    )
  }
)

res_bench[, cols]

#    expression   batch_size      min   median      max `itr/sec` total_time n_itr
#    <chr>             <dbl> <bch:tm> <bch:tm> <bch:tm>     <dbl>   <bch:tm> <int>
#  1 r                   16   50.61ms  53.34ms  54.82ms    19.1     471.13ms     9
#  2 cpp                 16    4.46ms   5.39ms   7.78ms   192.      474.09ms    91
#  3 r                   32   105.7ms 109.74ms 212.26ms     7.69        6.5s    50
#  4 cpp                 32    7.76ms  10.97ms  15.23ms    95.6     522.78ms    50
#  5 r                   64  211.41ms 226.18ms 332.65ms     3.85      12.99s    50
#  6 cpp                 64   25.09ms  27.34ms  32.04ms    36.0        1.39s    50
#  7 r                  128   534.5ms 627.92ms 659.08ms     1.61      31.03s    50
#  8 cpp                128   56.37ms  58.46ms  66.03ms    16.9        2.95s    50
#  9 r                  256     1.15s    1.18s    1.29s     0.851     58.78s    50
# 10 cpp                256  114.97ms 117.39ms 130.09ms     8.45       5.92s    50
# 11 r                  512     2.09s    2.15s    2.32s     0.463       1.8m    50
# 12 cpp                512  230.81ms  235.6ms 261.99ms     4.18      11.97s    50
# 13 r                 1024        4s    4.22s     4.4s     0.238       3.5m    50
# 14 cpp               1024  410.48ms 431.43ms 462.44ms     2.33      21.45s    50

ggplot(res_bench, aes(x = factor(batch_size), y = median, 
                      group =  expression, color = expression)) +
  geom_point() +
  geom_line() +
  ylab("median time, s") +
  theme_minimal() +
  scale_color_discrete(name = "", labels = c("cpp", "r")) +
  theme(legend.position = "bottom") 

[Plot: median batch-preparation time for the R and C++ implementations as a function of batch size]

As you can see, the speedup turned out to be very significant, and it is not possible to catch up with the C++ code by parallelizing the R code.

3. Iterators for unloading batches from the database

R has a well-deserved reputation as a language for processing data that fits in RAM, while Python is more characterized by iterative data processing, which makes it easy and natural to implement out-of-core computations (computations using external memory). A classic example relevant to us in the context of the described problem is deep neural networks trained by gradient descent, where the gradient at each step is approximated on a small portion of the observations, a mini-batch.

Deep learning frameworks written in Python have special classes implementing iterators over data: tables, images in files, binary formats, and so on. You can use the ready-made options or write your own for specific tasks. In R we can take advantage of all the features of the Python library keras with its various backends via the package of the same name, which in turn works on top of the reticulate package. The latter deserves a separate long article; it not only lets you run Python code from R, but also transfers objects between R and Python sessions, automatically performing all the necessary type conversions.
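
As a tiny illustration of what reticulate does (this is not code from our pipeline): a Python module can be imported as an R object, and data passed in either direction is converted automatically:

library(reticulate)

np  <- import("numpy")             # a Python module as an R object
mat <- matrix(runif(6), nrow = 2)  # an ordinary R matrix
np$sum(mat)                        # converted to a numpy array, summed in Python,
                                   # and the result comes back as an R number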

We got rid of the need to keep all the data in RAM by using MonetDBLite, and all the "neural network" work is performed by native Python code, so we only had to write an iterator over the data, since nothing ready-made exists for such a situation in either R or Python. There are essentially just two requirements for it: it must return batches in an endless loop and it must preserve its state between iterations (the latter is implemented in R in the simplest way, with closures). Previously it was also necessary to explicitly convert the R arrays into numpy arrays inside the iterator, but the current version of the keras package does that itself.

The iterator for the training and validation data turned out as follows:

Iterator for the training and validation data

train_generator <- function(db_connection = con,
                            samples_index,
                            num_classes = 340,
                            batch_size = 32,
                            scale = 1,
                            color = FALSE,
                            imagenet_preproc = FALSE) {
  # Argument checks
  checkmate::assert_class(con, "DBIConnection")
  checkmate::assert_integerish(samples_index)
  checkmate::assert_count(num_classes)
  checkmate::assert_count(batch_size)
  checkmate::assert_number(scale, lower = 0.001, upper = 5)
  checkmate::assert_flag(color)
  checkmate::assert_flag(imagenet_preproc)

  # Shuffle so that used batch indices can be taken and dropped in order
  dt <- data.table::data.table(id = sample(samples_index))
  # Assign batch numbers
  dt[, batch := (.I - 1L) %/% batch_size + 1L]
  # Keep only complete batches and set the key
  dt <- dt[, if (.N == batch_size) .SD, keyby = batch]
  # Initialise the counter
  i <- 1
  # Number of batches
  max_i <- dt[, max(batch)]

  # Prepare the statement for unloading the data
  sql <- sprintf(
    "PREPARE SELECT drawing, label_int FROM doodles WHERE id IN (%s)",
    paste(rep("?", batch_size), collapse = ",")
  )
  res <- DBI::dbSendQuery(con, sql)

  # Analogue of keras::to_categorical
  to_categorical <- function(x, num) {
    n <- length(x)
    m <- numeric(n * num)
    m[x * n + seq_len(n)] <- 1
    dim(m) <- c(n, num)
    return(m)
  }

  # Closure
  function() {
    # Start a new epoch
    if (i > max_i) {
      dt[, id := sample(id)]
      data.table::setkey(dt, batch)
      # Reset the counter
      i <<- 1
      max_i <<- dt[, max(batch)]
    }

    # IDs of the rows to fetch
    batch_ind <- dt[batch == i, id]
    # Fetch the data
    batch <- DBI::dbFetch(DBI::dbBind(res, as.list(batch_ind)), n = -1)

    # Increment the counter
    i <<- i + 1

    # Parse JSON and prepare the array
    batch_x <- cpp_process_json_vector(batch$drawing, scale = scale, color = color)
    if (imagenet_preproc) {
      # Rescale from the interval [0, 1] to the interval [-1, 1]
      batch_x <- (batch_x - 0.5) * 2
    }

    batch_y <- to_categorical(batch$label_int, num_classes)
    result <- list(batch_x, batch_y)
    return(result)
  }
}

The function takes as arguments a database connection, the numbers of the rows used, the number of classes, the batch size, the scale (scale = 1 corresponds to rendering 256x256-pixel images, scale = 0.5 to 128x128 pixels), a colour flag (color = FALSE renders in grayscale, whereas with color = TRUE each stroke is drawn in a new colour) and a preprocessing flag for networks pre-trained on imagenet. The latter is needed to scale pixel values from the interval [0, 1] to the interval [-1, 1], which was used when training the supplied keras models.

The outer function contains argument type checks, a data.table with randomly shuffled row numbers from samples_index and batch numbers, a counter and the maximum number of batches, as well as the SQL expression for unloading data from the database. In addition, we defined a fast analogue of the keras::to_categorical() function. We used almost all the data for training, leaving half a percent for validation, so the epoch size was limited by the steps_per_epoch parameter when calling keras::fit_generator(), and the condition if (i > max_i) only ever triggered for the validation iterator.
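
For context, hooking the iterators into keras looked roughly like this (a sketch assuming model, train_ind, val_ind, batch_size and scale are defined as elsewhere in the post; the numeric values are purely illustrative):

train_iter <- train_generator(db_connection = con, samples_index = train_ind,
                              num_classes = num_classes, batch_size = batch_size,
                              scale = scale)
val_iter   <- train_generator(db_connection = con, samples_index = val_ind,
                              num_classes = num_classes, batch_size = batch_size,
                              scale = scale)

model %>% keras::fit_generator(
  generator        = train_iter,
  steps_per_epoch  = 500,   # illustrative value, see the text above
  epochs           = 10,
  validation_data  = val_iter,
  validation_steps = length(val_ind) %/% batch_size
)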

In the inner function, row indices are selected for the next batch, the records are unloaded from the database with the batch counter incremented, the JSON is parsed (the cpp_process_json_vector() function, written in C++) and arrays corresponding to the images are created. Then one-hot vectors with the class labels are created, and the arrays with the pixel values and the labels are combined into a list, which is the return value. To speed things up, we used index creation in data.table tables and modification by reference; without these data.table "chips" it is hard to imagine working effectively with any significant amount of data in R.

The speed measurement results on a Core i5 laptop are as follows:

Iterator benchmark

library(Rcpp)
library(keras)
library(ggplot2)

source("utils/rcpp.R")
source("utils/keras_iterator.R")

con <- DBI::dbConnect(drv = MonetDBLite::MonetDBLite(), Sys.getenv("DBDIR"))

ind <- seq_len(DBI::dbGetQuery(con, "SELECT count(*) FROM doodles")[[1L]])
num_classes <- DBI::dbGetQuery(con, "SELECT max(label_int) + 1 FROM doodles")[[1L]]

# Indices for the training sample
train_ind <- sample(ind, floor(length(ind) * 0.995))
# Indices for the validation sample
val_ind <- ind[-train_ind]
rm(ind)
# Scale factor
scale <- 0.5

# Run the benchmark
res_bench <- bench::press(
  batch_size = 2^(4:10),
  {
    it1 <- train_generator(
      db_connection = con,
      samples_index = train_ind,
      num_classes = num_classes,
      batch_size = batch_size,
      scale = scale
    )
    bench::mark(
      it1(),
      min_iterations = 50L
    )
  }
)
# Benchmark parameters
cols <- c("batch_size", "min", "median", "max", "itr/sec", "total_time", "n_itr")
res_bench[, cols]

#   batch_size      min   median      max `itr/sec` total_time n_itr
#        <dbl> <bch:tm> <bch:tm> <bch:tm>     <dbl>   <bch:tm> <int>
# 1         16     25ms  64.36ms   92.2ms     15.9       3.09s    49
# 2         32   48.4ms 118.13ms 197.24ms     8.17       5.88s    48
# 3         64   69.3ms 117.93ms 181.14ms     8.57       5.83s    50
# 4        128  157.2ms 240.74ms 503.87ms     3.85      12.71s    49
# 5        256  359.3ms 613.52ms 988.73ms     1.54       30.5s    47
# 6        512  884.7ms    1.53s    2.07s     0.674      1.11m    45
# 7       1024     2.7s    3.83s    5.47s     0.261      2.81m    44

ggplot(res_bench, aes(x = factor(batch_size), y = median, group = 1)) +
    geom_point() +
    geom_line() +
    ylab("median time, s") +
    theme_minimal()

DBI::dbDisconnect(con, shutdown = TRUE)

[Plot: median time for the iterator to produce one batch as a function of batch size]

If you have a sufficient amount of RAM, you can seriously speed up the database by moving it into that same RAM (32 GB is enough for our task). On Linux, the /dev/shm partition is mounted by default, occupying up to half of the RAM capacity. You can allocate more by editing /etc/fstab so that it contains an entry like tmpfs /dev/shm tmpfs defaults,size=25g 0 0. Be sure to reboot and check the result by running the command df -h.

The iterator for the test data is much simpler, since the test dataset fits entirely in RAM:

Iterator for the test data

test_generator <- function(dt,
                           batch_size = 32,
                           scale = 1,
                           color = FALSE,
                           imagenet_preproc = FALSE) {

  # Argument checks
  checkmate::assert_data_table(dt)
  checkmate::assert_count(batch_size)
  checkmate::assert_number(scale, lower = 0.001, upper = 5)
  checkmate::assert_flag(color)
  checkmate::assert_flag(imagenet_preproc)

  # Assign batch numbers
  dt[, batch := (.I - 1L) %/% batch_size + 1L]
  data.table::setkey(dt, batch)
  i <- 1
  max_i <- dt[, max(batch)]

  # Closure
  function() {
    batch_x <- cpp_process_json_vector(dt[batch == i, drawing], 
                                       scale = scale, color = color)
    if (imagenet_preproc) {
      # Rescale from the interval [0, 1] to the interval [-1, 1]
      batch_x <- (batch_x - 0.5) * 2
    }
    result <- list(batch_x)
    i <<- i + 1
    return(result)
  }
}

4. Choosing the model architecture

The first architecture used was mobilenet v1, whose features are discussed in this post. It ships as standard with keras and, accordingly, is available in the R package of the same name. But when trying to use it with single-channel images, a strange thing turned out: the input tensor must always have the shape (batch, height, width, 3), that is, the number of channels cannot be changed. There is no such limitation in Python, so we hurried up and wrote our own implementation of this architecture, following the original paper (without departing from the keras version):

Mobilenet v1 architecture

library(keras)

top_3_categorical_accuracy <- custom_metric(
    name = "top_3_categorical_accuracy",
    metric_fn = function(y_true, y_pred) {
         metric_top_k_categorical_accuracy(y_true, y_pred, k = 3)
    }
)

layer_sep_conv_bn <- function(object, 
                              filters,
                              alpha = 1,
                              depth_multiplier = 1,
                              strides = c(2, 2)) {

  # NB! depth_multiplier !=  resolution multiplier
  # https://github.com/keras-team/keras/issues/10349

  layer_depthwise_conv_2d(
    object = object,
    kernel_size = c(3, 3), 
    strides = strides,
    padding = "same",
    depth_multiplier = depth_multiplier
  ) %>%
  layer_batch_normalization() %>% 
  layer_activation_relu() %>%
  layer_conv_2d(
    filters = filters * alpha,
    kernel_size = c(1, 1), 
    strides = c(1, 1)
  ) %>%
  layer_batch_normalization() %>% 
  layer_activation_relu() 
}

get_mobilenet_v1 <- function(input_shape = c(224, 224, 1),
                             num_classes = 340,
                             alpha = 1,
                             depth_multiplier = 1,
                             optimizer = optimizer_adam(lr = 0.002),
                             loss = "categorical_crossentropy",
                             metrics = c("categorical_crossentropy",
                                         top_3_categorical_accuracy)) {

  inputs <- layer_input(shape = input_shape)

  outputs <- inputs %>%
    layer_conv_2d(filters = 32, kernel_size = c(3, 3), strides = c(2, 2), padding = "same") %>%
    layer_batch_normalization() %>% 
    layer_activation_relu() %>%
    layer_sep_conv_bn(filters = 64, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 128, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 128, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 256, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 256, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 1024, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 1024, strides = c(1, 1)) %>%
    layer_global_average_pooling_2d() %>%
    layer_dense(units = num_classes) %>%
    layer_activation_softmax()

    model <- keras_model(
      inputs = inputs,
      outputs = outputs
    )

    model %>% compile(
      optimizer = optimizer,
      loss = loss,
      metrics = metrics
    )

    return(model)
}

The drawbacks of this approach are obvious. I want to test a lot of models, but on the contrary, I don't want to rewrite every architecture by hand. We were also deprived of the opportunity to use the weights of models pre-trained on imagenet. As usual, studying the documentation helped. The get_config() function returns a description of the model in a form suitable for editing (base_model_conf$layers is a regular R list), and the from_config() function performs the reverse conversion into a model object:

base_model_conf <- get_config(base_model)
base_model_conf$layers[[1]]$config$batch_input_shape[[4]] <- 1L
base_model <- from_config(base_model_conf)

Now it is not difficult to write a universal function for obtaining any of the supplied keras models, with or without weights trained on imagenet:

Function for loading ready-made architectures

get_model <- function(name = "mobilenet_v2",
                      input_shape = NULL,
                      weights = "imagenet",
                      pooling = "avg",
                      num_classes = NULL,
                      optimizer = keras::optimizer_adam(lr = 0.002),
                      loss = "categorical_crossentropy",
                      metrics = NULL,
                      color = TRUE,
                      compile = FALSE) {
  # Argument checks
  checkmate::assert_string(name)
  checkmate::assert_integerish(input_shape, lower = 1, upper = 256, len = 3)
  checkmate::assert_count(num_classes)
  checkmate::assert_flag(color)
  checkmate::assert_flag(compile)

  # Get the object from the keras package
  model_fun <- get0(paste0("application_", name), envir = asNamespace("keras"))
  # Check that the object exists in the package
  if (is.null(model_fun)) {
    stop("Model ", shQuote(name), " not found.", call. = FALSE)
  }

  base_model <- model_fun(
    input_shape = input_shape,
    include_top = FALSE,
    weights = weights,
    pooling = pooling
  )

  # If the image is not colour, change the input dimensionality
  if (!color) {
    base_model_conf <- keras::get_config(base_model)
    base_model_conf$layers[[1]]$config$batch_input_shape[[4]] <- 1L
    base_model <- keras::from_config(base_model_conf)
  }

  predictions <- keras::get_layer(base_model, "global_average_pooling2d_1")$output
  predictions <- keras::layer_dense(predictions, units = num_classes, activation = "softmax")
  model <- keras::keras_model(
    inputs = base_model$input,
    outputs = predictions
  )

  if (compile) {
    keras::compile(
      object = model,
      optimizer = optimizer,
      loss = loss,
      metrics = metrics
    )
  }

  return(model)
}

When using single-channel images, pre-trained weights are not used. This could be fixed: using the get_weights() function, get the model weights as a list of R arrays, change the dimensionality of the first element of that list (by taking a single colour channel or averaging the three), and then load the weights back into the model with the set_weights() function. We never added this functionality, because at that stage it was already clear that it was more productive to work with colour images.
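
A sketch of how that adaptation could look (we did not actually do this; averaging the three colour channels of the first convolution kernel is one common heuristic, and both the position of the kernel in the list and the gray_model object are assumptions here):

# Hypothetical adaptation of imagenet weights to a single-channel input
w <- keras::get_weights(base_model)
# w[[1]] is assumed to be the first convolution kernel of shape (k, k, 3, filters)
d <- dim(w[[1]])
w[[1]] <- array(apply(w[[1]], c(1, 2, 4), mean), dim = c(d[1], d[2], 1, d[4]))
# gray_model: the same architecture built with a 1-channel input
keras::set_weights(gray_model, w)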

We carried out most of the experiments with mobilenet versions 1 and 2, as well as resnet34. More modern architectures such as SE-ResNeXt performed well in this competition. Unfortunately, we did not have ready-made implementations at our disposal, and we did not write our own (but we certainly will).

5. Parameterization of scripts

For convenience, all the code for launching training was designed as a single script, parameterized with docopt as follows:

doc <- '
Usage:
  train_nn.R --help
  train_nn.R --list-models
  train_nn.R [options]

Options:
  -h --help                   Show this message.
  -l --list-models            List available models.
  -m --model=<model>          Neural network model name [default: mobilenet_v2].
  -b --batch-size=<size>      Batch size [default: 32].
  -s --scale-factor=<ratio>   Scale factor [default: 0.5].
  -c --color                  Use color lines [default: FALSE].
  -d --db-dir=<path>          Path to database directory [default: Sys.getenv("db_dir")].
  -r --validate-ratio=<ratio> Validate sample ratio [default: 0.995].
  -n --n-gpu=<number>         Number of GPUs [default: 1].
'
args <- docopt::docopt(doc)

The docopt package is an implementation of http://docopt.org/ for R. With its help, the scripts are launched with simple commands like Rscript bin/train_nn.R -m resnet50 -c -d /home/andrey/doodle_db or ./bin/train_nn.R -m resnet50 -c -d /home/andrey/doodle_db if the file train_nn.R is executable (this command starts training the resnet50 model on three-colour 128x128-pixel images, and the database must be located in the /home/andrey/doodle_db directory). The learning rate, the optimizer type and any other tunable parameters can be added to the list. While preparing the publication, it turned out that the mobilenet_v2 architecture from the current version of keras cannot be used from R because of changes not yet taken into account in the R package; we are waiting for this to be fixed.

This approach made it possible to significantly speed up experiments with different models compared to the more traditional launching of scripts from RStudio (we note the tfruns package as a possible alternative). But the main advantage is the ability to easily manage the launch of scripts in Docker or simply on a server, without installing RStudio for that.

6. Dockerization of scripts

We used Docker to ensure the portability of the model-training environment between team members and for rapid deployment in the cloud. You can start getting acquainted with this tool, which is relatively unusual for an R programmer, with this series of publications or a video course.

Docker allows you both to create your own images from scratch and to use other images as a basis for your own. When analyzing the available options, we concluded that installing the NVIDIA and CUDA+cuDNN drivers and the Python libraries is a fairly voluminous part of the image, and we decided to take the official image tensorflow/tensorflow:1.12.0-gpu as a base, adding the necessary R packages to it.

The final Dockerfile looked like this:

Dockerfile

FROM tensorflow/tensorflow:1.12.0-gpu

MAINTAINER Artem Klevtsov <[email protected]>

SHELL ["/bin/bash", "-c"]

ARG LOCALE="en_US.UTF-8"
ARG APT_PKG="libopencv-dev r-base r-base-dev littler"
ARG R_BIN_PKG="futile.logger checkmate data.table rcpp rapidjsonr dbi keras jsonlite curl digest remotes"
ARG R_SRC_PKG="xtensor RcppThread docopt MonetDBLite"
ARG PY_PIP_PKG="keras"
ARG DIRS="/db /app /app/data /app/models /app/logs"

RUN source /etc/os-release && \
    echo "deb https://cloud.r-project.org/bin/linux/ubuntu ${UBUNTU_CODENAME}-cran35/" > /etc/apt/sources.list.d/cran35.list && \
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9 && \
    add-apt-repository -y ppa:marutter/c2d4u3.5 && \
    add-apt-repository -y ppa:timsc/opencv-3.4 && \
    apt-get update && \
    apt-get install -y locales && \
    locale-gen ${LOCALE} && \
    apt-get install -y --no-install-recommends ${APT_PKG} && \
    ln -s /usr/lib/R/site-library/littler/examples/install.r /usr/local/bin/install.r && \
    ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r && \
    ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r && \
    echo 'options(Ncpus = parallel::detectCores())' >> /etc/R/Rprofile.site && \
    echo 'options(repos = c(CRAN = "https://cloud.r-project.org"))' >> /etc/R/Rprofile.site && \
    apt-get install -y $(printf "r-cran-%s " ${R_BIN_PKG}) && \
    install.r ${R_SRC_PKG} && \
    pip install ${PY_PIP_PKG} && \
    mkdir -p ${DIRS} && \
    chmod 777 ${DIRS} && \
    rm -rf /tmp/downloaded_packages/ /tmp/*.rds && \
    rm -rf /var/lib/apt/lists/*

COPY utils /app/utils
COPY src /app/src
COPY tests /app/tests
COPY bin/*.R /app/

ENV DBDIR="/db"
ENV CUDA_HOME="/usr/local/cuda"
ENV PATH="/app:${PATH}"

WORKDIR /app

VOLUME /db
VOLUME /app

CMD bash

For convenience, the packages used were put into variables; the bulk of the written scripts is copied inside the container during the build. We also changed the command shell to /bin/bash for the convenience of using the contents of /etc/os-release. This removed the need to hard-code the OS version.

In addition, a small bash script was written that allows you to launch a container with various commands. For example, these can be scripts for training neural networks that were previously placed inside the container, or a command shell for debugging and monitoring the container's operation:

Bash script for launching the container

#!/bin/sh

DBDIR=${PWD}/db
LOGSDIR=${PWD}/logs
MODELDIR=${PWD}/models
DATADIR=${PWD}/data
ARGS="--runtime=nvidia --rm -v ${DBDIR}:/db -v ${LOGSDIR}:/app/logs -v ${MODELDIR}:/app/models -v ${DATADIR}:/app/data"

if [ -z "$1" ]; then
    CMD="Rscript /app/train_nn.R"
elif [ "$1" = "bash" ]; then
    ARGS="${ARGS} -ti"
else
    CMD="Rscript /app/train_nn.R $@"
fi

docker run ${ARGS} doodles-tf ${CMD}

If this bash script is run without parameters, the train_nn.R script is called inside the container with default values; if the first positional argument is "bash", the container starts interactively with a command shell. In all other cases, the values of the positional arguments are substituted: CMD="Rscript /app/train_nn.R $@".

It is worth noting that the directories with the source data and the database, as well as the directory for saving trained models, are mounted inside the container from the host system, which lets you access the results of the scripts without unnecessary manipulations.

7. Using multiple GPUs on Google Cloud

One of the features of the competition was very noisy data (see the title picture, borrowed from @Leigh.plt from the ODS slack). Large batches help to fight this, and after experiments on a PC with one GPU, we decided to master training models on several GPUs in the cloud. We used Google Cloud (a good guide to the basics) because of the large selection of available configurations, reasonable prices and the $300 bonus. Out of greed I ordered a 4xV100 instance with an SSD and a ton of RAM, and that was a big mistake. Such a machine eats money quickly; you can go broke experimenting without a proven pipeline. For educational purposes it is better to take a K80. But the large amount of RAM came in handy: the cloud SSD did not impress with its performance, so the database was moved to dev/shm.

Of greatest interest is the code fragment responsible for using multiple GPUs. First, the model is created on the CPU using a context manager, just like in Python:

with(tensorflow::tf$device("/cpu:0"), {
  model_cpu <- get_model(
    name = model_name,
    input_shape = input_shape,
    weights = weights,
    metrics = c(top_3_categorical_accuracy),
    compile = FALSE
  )
})

Then the not-yet-compiled (this is important) model is copied to a given number of available GPUs, and only after that is it compiled:

model <- keras::multi_gpu_model(model_cpu, gpus = n_gpu)
keras::compile(
  object = model,
  optimizer = keras::optimizer_adam(lr = 0.0004),
  loss = "categorical_crossentropy",
  metrics = c(top_3_categorical_accuracy)
)

The classic technique of freezing all layers except the last one, training the last layer, then unfreezing and retraining the whole model could not be implemented for several GPUs.
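
On a single GPU the standard recipe would look something like the sketch below (freeze_weights()/unfreeze_weights() come from the keras R package; the learning rates are illustrative); it is combining this with multi_gpu_model that we failed to do:

# Stage 1: freeze the pre-trained base and train only the classifier head
keras::freeze_weights(base_model)
model %>% keras::compile(
  optimizer = keras::optimizer_adam(lr = 1e-3),
  loss = "categorical_crossentropy"
)
# ... fit for a few epochs ...

# Stage 2: unfreeze everything and fine-tune with a lower learning rate
keras::unfreeze_weights(base_model)
model %>% keras::compile(
  optimizer = keras::optimizer_adam(lr = 1e-4),
  loss = "categorical_crossentropy"
)
# ... continue fitting ...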

Training was monitored without tensorboard; we limited ourselves to writing logs and saving models with informative names after each epoch:

Callbacks

# Log file name template
log_file_tmpl <- file.path("logs", sprintf(
  "%s_%d_%dch_%s.csv",
  model_name,
  dim_size,
  channels,
  format(Sys.time(), "%Y%m%d%H%M%OS")
))
# Model file name template
model_file_tmpl <- file.path("models", sprintf(
  "%s_%d_%dch_{epoch:02d}_{val_loss:.2f}.h5",
  model_name,
  dim_size,
  channels
))

callbacks_list <- list(
  keras::callback_csv_logger(
    filename = log_file_tmpl
  ),
  keras::callback_early_stopping(
    monitor = "val_loss",
    min_delta = 1e-4,
    patience = 8,
    verbose = 1,
    mode = "min"
  ),
  keras::callback_reduce_lr_on_plateau(
    monitor = "val_loss",
    factor = 0.5, # halve the learning rate
    patience = 4,
    verbose = 1,
    min_delta = 1e-4,
    mode = "min"
  ),
  keras::callback_model_checkpoint(
    filepath = model_file_tmpl,
    monitor = "val_loss",
    save_best_only = FALSE,
    save_weights_only = FALSE,
    mode = "min"
  )
)

8. Instead of a conclusion

A number of problems that we encountered have not yet been overcome:

  • keras has no ready-made function for automatically searching for the optimal learning rate (an analogue of lr_finder in the fast.ai library); with some effort, third-party implementations can be ported to R, for example this one;
  • as a consequence of the previous point, it was not possible to select the right training speed when using several GPUs;
  • there is a shortage of modern neural network architectures, especially those pre-trained on imagenet;
  • no one cycle policy or discriminative learning rates (cosine annealing was implemented at our request, thanks skydan).

Useful things learned from this competition:

  • On relatively low-powered hardware, you can work with decent (many times the size of RAM) volumes of data without pain. The data.table package saves memory thanks to in-place modification of tables, which avoids copying them, and when used correctly its capabilities almost always show the highest speed among all the scripting-language tools known to us. Saving the data in a database allows you, in many cases, not to think at all about squeezing the entire dataset into RAM.
  • Slow R functions can be replaced with fast C++ ones using the Rcpp package. If RcppThread or RcppParallel is used in addition, we get cross-platform multi-threaded implementations, so there is no need to parallelize the code at the R level.
  • The Rcpp package can be used without serious knowledge of C++; the required minimum is outlined here. Header files for a number of cool C libraries such as xtensor are available on CRAN, so an infrastructure is forming for implementing projects with ready-made high-performance C++ code in R. An additional convenience is syntax highlighting and a static C++ code analyzer in RStudio.
  • docopt allows you to run self-contained scripts with parameters. This is convenient for use on a remote server, including under docker. In RStudio it is inconvenient to carry out many hours of experiments with training neural networks, and installing the IDE on the server itself is not always justified.
  • Docker ensures code portability and reproducibility of results between developers with different versions of the OS and libraries, as well as ease of execution on servers. The entire training pipeline can be launched with a single command.
  • Google Cloud is a budget-friendly way to experiment on expensive hardware, but you need to choose configurations carefully.
  • Measuring the speed of individual code fragments is very useful, especially when combining R and C++, and with the bench package it is also very easy.

Overall, this experience was very rewarding, and we continue to work on resolving some of the issues raised.

Source: www.habr.com
