Quick Draw Doodle Recognition: how to make friends with R, C++ and neural networks

Hi, Habr!

Last fall, Kaggle hosted a competition on classifying hand-drawn images, Quick Draw Doodle Recognition, in which, among others, a team of R users took part: Artem Klevtsov, Philipp Upravitelev and Andrey Ogurtsov. We will not describe the competition itself here: that has already been done in a recent publication.

This time we did not manage to win a medal, but a lot of valuable experience was gained, so I would like to tell the community about a number of the most interesting and useful things from Kaggle and from everyday work. Among the topics discussed: the hard life without OpenCV, JSON parsing (the examples show how to integrate C++ code into R scripts or packages using Rcpp), parameterization of scripts and dockerization of the final solution. All the code from this post, in a form ready for execution, is available in the repository.

Contents:

  1. Efficiently loading data from CSV into a MonetDB database
  2. Preparing batches
  3. Iterators for unloading batches from the database
  4. Choosing the model architecture
  5. Script parameterization
  6. Dockerization of scripts
  7. Using multiple GPUs in Google Cloud
  8. Instead of a conclusion

1. Efficiently loading data from CSV into a MonetDB database

The data in this competition is provided not in the form of ready-made images, but in the form of 340 CSV files (one file per class) containing JSONs with point coordinates. Connecting these points with lines, we get a final image of 256×256 pixels. Each record also carries a label indicating whether the picture was correctly recognized by the classifier in use at the time the dataset was collected, the two-letter country code of the author's residence, a unique identifier, a timestamp, and a class name matching the file name. The simplified version of the original data weighs 7.4 GB in the archive and about 20 GB after unpacking; the full data after unpacking takes 240 GB. The organizers made sure that both versions reproduced the same drawings, which makes the full version redundant. In any case, storing 50 million images in graphic files or as arrays was immediately judged unprofitable, and we decided to merge all the CSV files from the train_simplified.zip archive into a database, with subsequent generation of images of the required size "on the fly" for each batch.

A well-proven solution was chosen as the DBMS: MonetDB, namely its implementation for R in the MonetDBLite package. The package includes an embedded version of the database server and lets you start the server directly from an R session and work with it there. A database is created and a connection established with a single command:

con <- DBI::dbConnect(drv = MonetDBLite::MonetDBLite(), Sys.getenv("DBDIR"))

We need to create two tables: one for all the data, the other for service information about the loaded files (useful if something goes wrong and the process has to be resumed after loading a few files):

Creating the tables

if (!DBI::dbExistsTable(con, "doodles")) {
  DBI::dbCreateTable(
    con = con,
    name = "doodles",
    fields = c(
      "countrycode" = "char(2)",
      "drawing" = "text",
      "key_id" = "bigint",
      "recognized" = "bool",
      "timestamp" = "timestamp",
      "word" = "text"
    )
  )
}

if (!DBI::dbExistsTable(con, "upload_log")) {
  DBI::dbCreateTable(
    con = con,
    name = "upload_log",
    fields = c(
      "id" = "serial",
      "file_name" = "text UNIQUE",
      "uploaded" = "bool DEFAULT false"
    )
  )
}

The fastest way to load the data into the database was to copy the CSV files directly with the SQL command COPY OFFSET 2 INTO tablename FROM path USING DELIMITERS ',','\n','"' NULL AS '' BEST EFFORT, where tablename is the table name and path is the path to the file. While working with the archive it turned out that the built-in unzip implementation in R does not handle a number of files from the archive correctly, so we used the system unzip (set via the getOption("unzip") option).

Function for writing to the database

#' @title Extract and load files
#'
#' @description
#' Extract CSV files from a ZIP archive and load them into the database
#'
#' @param con Database connection object (class `MonetDBEmbeddedConnection`).
#' @param tablename Name of the table in the database.
#' @param zipfile Path to the ZIP archive.
#' @param filename Name of the file inside the ZIP archive.
#' @param preprocess Preprocessing function that will be applied to the extracted file.
#'   Must accept a single argument `data` (a `data.table` object).
#'
#' @return `TRUE`.
#'
upload_file <- function(con, tablename, zipfile, filename, preprocess = NULL) {
  # Argument checks
  checkmate::assert_class(con, "MonetDBEmbeddedConnection")
  checkmate::assert_string(tablename)
  checkmate::assert_string(filename)
  checkmate::assert_true(DBI::dbExistsTable(con, tablename))
  checkmate::assert_file_exists(zipfile, access = "r", extension = "zip")
  checkmate::assert_function(preprocess, args = c("data"), null.ok = TRUE)

  # Extract the file
  path <- file.path(tempdir(), filename)
  unzip(zipfile, files = filename, exdir = tempdir(), 
        junkpaths = TRUE, unzip = getOption("unzip"))
  on.exit(unlink(file.path(path)))

  # Apply the preprocessing function
  if (!is.null(preprocess)) {
    .data <- data.table::fread(file = path)
    .data <- preprocess(data = .data)
    data.table::fwrite(x = .data, file = path, append = FALSE)
    rm(.data)
  }

  # Query to import the CSV into the DB
  sql <- sprintf(
    "COPY OFFSET 2 INTO %s FROM '%s' USING DELIMITERS ',','\\n','\"' NULL AS '' BEST EFFORT",
    tablename, path
  )
  # Execute the query
  DBI::dbExecute(con, sql)

  # Record the successful upload in the service table
  DBI::dbExecute(con, sprintf("INSERT INTO upload_log(file_name, uploaded) VALUES('%s', true)",
                              filename))

  return(invisible(TRUE))
}

If you need to transform the table before writing it to the database, it is enough to pass in the preprocess argument a function that will transform the data.
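
As an illustration, here is a hypothetical preprocess function of the required shape (a single `data` argument) that keeps only recognized drawings. A plain data.frame stands in for the data.table that upload_file actually reads; the column names follow the table schema above:

```r
# Hypothetical preprocessing function: keep only recognized drawings.
# It has the required signature: a single argument `data`.
keep_recognized <- function(data) {
  data[data$recognized == TRUE, , drop = FALSE]
}

# A plain data.frame stands in for the data.table read by upload_file
d <- data.frame(key_id = 1:3, recognized = c(TRUE, FALSE, TRUE))
d2 <- keep_recognized(data = d)
nrow(d2)  # 2
```

With a live connection this function would be passed as `upload_file(..., preprocess = keep_recognized)`.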

Code for sequentially loading the data into the database:

Loading the data into the database

# List of files to load
files <- unzip(zipfile, list = TRUE)$Name

# Exclusion list, in case some files have already been uploaded
to_skip <- DBI::dbGetQuery(con, "SELECT file_name FROM upload_log")[[1L]]
files <- setdiff(files, to_skip)

if (length(files) > 0L) {
  # Start the timer
  tictoc::tic()
  # Progress bar
  pb <- txtProgressBar(min = 0L, max = length(files), style = 3)
  for (i in seq_along(files)) {
    upload_file(con = con, tablename = "doodles", 
                zipfile = zipfile, filename = files[i])
    setTxtProgressBar(pb, i)
  }
  close(pb)
  # Stop the timer
  tictoc::toc()
}

# 526.141 sec elapsed - copying SSD -> SSD
# 558.879 sec elapsed - copying USB -> SSD

Data loading times can vary depending on the speed characteristics of the drives used. In our case, reading and writing within a single SSD, or from a USB flash drive to the SSD holding the DB, takes less than 10 minutes.

It takes a few more seconds to create a column with an integer class label and an index column (ORDERED INDEX) holding the row numbers by which observations will be sampled when creating batches:

Creating additional columns and an index

message("Generate labels")
invisible(DBI::dbExecute(con, "ALTER TABLE doodles ADD label_int int"))
invisible(DBI::dbExecute(con, "UPDATE doodles SET label_int = dense_rank() OVER (ORDER BY word) - 1"))

message("Generate row numbers")
invisible(DBI::dbExecute(con, "ALTER TABLE doodles ADD id serial"))
invisible(DBI::dbExecute(con, "CREATE ORDERED INDEX doodles_id_ord_idx ON doodles(id)"))

To solve the problem of creating batches on the fly, we needed to achieve the maximum speed of extracting random rows from the doodles table. We used 3 tricks for this. The first was to reduce the dimensionality of the type storing the observation ID. In the original data set, the type required to store the ID is bigint, but the number of observations makes it possible to fit their IDs, equal to the ordinal number, into the int type. Lookup is much faster in this case. The second trick was to use ORDERED INDEX; we arrived at this decision empirically, having tried all the available options. The third was to use parameterized queries. The essence of the method is to execute the PREPARE command once and then reuse the prepared expression when issuing a bunch of queries of the same shape; in fact, the advantage compared to a simple SELECT turned out to be within the margin of statistical error.
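
The third trick can be sketched as follows: the PREPARE statement is built once with as many `?` placeholders as there are rows in a batch, and is then re-bound with new IDs on every call (table and column names here follow the ones used in this post):

```r
# Build a PREPARE statement with one `?` placeholder per sampled row
batch_size <- 4
placeholders <- paste(rep("?", batch_size), collapse = ",")
sql <- sprintf("PREPARE SELECT drawing, label_int FROM doodles WHERE id IN (%s)",
               placeholders)
sql
# "PREPARE SELECT drawing, label_int FROM doodles WHERE id IN (?,?,?,?)"

# With a live connection one would then do (not run here):
# rs <- DBI::dbSendQuery(con, sql)
# DBI::dbFetch(DBI::dbBind(rs, as.list(sample_ids)))
```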

Unloading the data batches consumes no more than 450 MB of RAM. That is, the described approach lets you move datasets weighing tens of gigabytes on almost any budget hardware, including some single-board devices, which is pretty cool.

All that remains is to measure the speed of data retrieval and evaluate the scaling when sampling batches of different sizes:

Database benchmark

library(ggplot2)

set.seed(0)
# Connect to the database
con <- DBI::dbConnect(MonetDBLite::MonetDBLite(), Sys.getenv("DBDIR"))
# Total number of rows in the table
n <- DBI::dbGetQuery(con, "SELECT count(*) FROM doodles")[[1L]]

# Function to prepare the query on the server side
prep_sql <- function(batch_size) {
  sql <- sprintf("PREPARE SELECT id FROM doodles WHERE id IN (%s)",
                 paste(rep("?", batch_size), collapse = ","))
  res <- DBI::dbSendQuery(con, sql)
  return(res)
}

# Function to fetch the data
fetch_data <- function(rs, batch_size) {
  ids <- sample(seq_len(n), batch_size)
  res <- DBI::dbFetch(DBI::dbBind(rs, as.list(ids)))
  return(res)
}

# Run the measurement
res_bench <- bench::press(
  batch_size = 2^(4:10),
  {
    rs <- prep_sql(batch_size)
    bench::mark(
      fetch_data(rs, batch_size),
      min_iterations = 50L
    )
  }
)
# Benchmark parameters
cols <- c("batch_size", "min", "median", "max", "itr/sec", "total_time", "n_itr")
res_bench[, cols]

#   batch_size      min   median      max `itr/sec` total_time n_itr
#        <dbl> <bch:tm> <bch:tm> <bch:tm>     <dbl>   <bch:tm> <int>
# 1         16   23.6ms  54.02ms  93.43ms     18.8        2.6s    49
# 2         32     38ms  84.83ms 151.55ms     11.4       4.29s    49
# 3         64   63.3ms 175.54ms 248.94ms     5.85       8.54s    50
# 4        128   83.2ms 341.52ms 496.24ms     3.00      16.69s    50
# 5        256  232.8ms 653.21ms 847.44ms     1.58      31.66s    50
# 6        512  784.6ms    1.41s    1.98s     0.740       1.1m    49
# 7       1024  681.7ms    2.72s    4.06s     0.377      2.16m    49

ggplot(res_bench, aes(x = factor(batch_size), y = median, group = 1)) +
  geom_point() +
  geom_line() +
  ylab("median time, s") +
  theme_minimal()

DBI::dbDisconnect(con, shutdown = TRUE)

(Figure: median row-fetch time as a function of batch size)

2. Preparing batches

The whole batch preparation process consists of the following steps:

  1. Parsing JSONs containing vectors of strings with point coordinates.
  2. Drawing colored lines from the point coordinates on an image of the required size (for example, 256×256 or 128×128).
  3. Converting the resulting images into a tensor.
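
In the simplest case the steps above can be sketched in a few lines of base R; this toy version (my simplification: nearest-pixel interpolation instead of proper line drawing, no colors) only marks stroke pixels in a binary matrix:

```r
# Toy rasterizer: strokes is a list of list(x, y) coordinate vectors
rasterize_strokes <- function(strokes, size = 256) {
  img <- matrix(0, nrow = size, ncol = size)
  for (s in strokes) {
    x <- s[[1]]; y <- s[[2]]
    for (i in seq_len(length(x) - 1)) {
      # Interpolate enough points to leave no gaps along the segment
      n <- max(abs(x[i + 1] - x[i]), abs(y[i + 1] - y[i])) + 1
      xi <- round(seq(x[i], x[i + 1], length.out = n)) + 1
      yi <- round(seq(y[i], y[i + 1], length.out = n)) + 1
      img[cbind(yi, xi)] <- 1
    }
  }
  img
}

# One horizontal stroke from (0, 0) to (3, 0)
img <- rasterize_strokes(list(list(c(0, 3), c(0, 0))), size = 8)
sum(img)  # 4 pixels drawn
```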

Among the Python kernels in the competition, the problem was solved primarily by means of OpenCV. One of the simplest and most obvious analogues in R looks like this:

Implementing the JSON to tensor conversion in R

r_process_json_str <- function(json, line.width = 3, 
                               color = TRUE, scale = 1) {
  # Parse the JSON
  coords <- jsonlite::fromJSON(json, simplifyMatrix = FALSE)
  tmp <- tempfile()
  # Remove the temporary file when the function exits
  on.exit(unlink(tmp))
  png(filename = tmp, width = 256 * scale, height = 256 * scale, pointsize = 1)
  # Empty plot
  plot.new()
  # Size of the plot window
  plot.window(xlim = c(256 * scale, 0), ylim = c(256 * scale, 0))
  # Line colors
  cols <- if (color) rainbow(length(coords)) else "#000000"
  for (i in seq_along(coords)) {
    lines(x = coords[[i]][[1]] * scale, y = coords[[i]][[2]] * scale, 
          col = cols[i], lwd = line.width)
  }
  dev.off()
  # Convert the image into a 3-dimensional array
  res <- png::readPNG(tmp)
  return(res)
}

r_process_json_vector <- function(x, ...) {
  res <- lapply(x, r_process_json_str, ...)
  # Combine the 3-dimensional image arrays into a 4-dimensional tensor
  res <- do.call(abind::abind, c(res, along = 0))
  return(res)
}

Drawing is done with standard R tools and saved to a temporary PNG stored in RAM (on Linux, R's temporary directories are located in /tmp, which is mounted in RAM). This file is then read as a three-dimensional array with numbers ranging from 0 to 1. This matters because the more conventional BMP would be read into a raw array with hex color codes.

Let's test the result:

zip_file <- file.path("data", "train_simplified.zip")
csv_file <- "cat.csv"
unzip(zip_file, files = csv_file, exdir = tempdir(), 
      junkpaths = TRUE, unzip = getOption("unzip"))
tmp_data <- data.table::fread(file.path(tempdir(), csv_file), sep = ",", 
                              select = "drawing", nrows = 10000)
arr <- r_process_json_str(tmp_data[4, drawing])
dim(arr)
# [1] 256 256   3
plot(magick::image_read(arr))

(Figure: the rendered doodle of a cat)

The batch itself is formed as follows:

res <- r_process_json_vector(tmp_data[1:4, drawing], scale = 0.5)
str(res)
 # num [1:4, 1:128, 1:128, 1:3] 1 1 1 1 1 1 1 1 1 1 ...
 # - attr(*, "dimnames")=List of 4
 #  ..$ : NULL
 #  ..$ : NULL
 #  ..$ : NULL
 #  ..$ : NULL

This implementation seemed suboptimal to us, since forming large batches takes an indecently long time, so we decided to take advantage of our colleagues' experience and use the powerful OpenCV library. At the time there was no ready-made R package for it (there is none now), so a minimal implementation of the required functionality was written in C++, integrated into R code using Rcpp.

To solve the problem, the following packages and libraries were used:

  1. OpenCV for working with images and drawing lines. We used the pre-installed system libraries and header files, as well as dynamic linking.

  2. xtensor for working with multidimensional arrays and tensors. We used the header files included in the R package of the same name. The library lets you work with multidimensional arrays in both row-major and column-major order.

  3. ndjson for parsing JSON. This library is used in xtensor automatically if it is present in the project.

  4. RcppThread for organizing multi-threaded processing of a vector of JSONs. We used the header files provided by this package. Unlike the more popular RcppParallel package, it has, among other things, a built-in loop interruption mechanism.

It is worth noting that xtensor turned out to be a godsend: besides extensive functionality and high performance, its developers proved to be quite responsive, answering questions promptly and in detail. With their help it was possible to implement transformations of OpenCV matrices into xtensor tensors, as well as a way to combine 3-dimensional image tensors into a 4-dimensional tensor of the right dimensions (the batch itself).
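
The batch assembly that the C++ code does with xt::view can be mimicked in base R: allocate the 4-dimensional array once and copy each image into its slice. A sketch with tiny dummy arrays:

```r
# Two dummy 2x2 RGB "images" as 3-d arrays
imgs <- list(array(0, dim = c(2, 2, 3)),
             array(1, dim = c(2, 2, 3)))
# Pre-allocate the 4-d batch tensor: (batch, height, width, channels)
batch <- array(0, dim = c(length(imgs), 2, 2, 3))
for (i in seq_along(imgs)) {
  batch[i, , , ] <- imgs[[i]]
}
dim(batch)  # 2 2 2 3
```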

Materials for learning Rcpp, xtensor and RcppThread

https://thecoatlessprofessor.com/programming/unofficial-rcpp-api-documentation

https://docs.opencv.org/4.0.1/d7/dbd/group__imgproc.html

https://xtensor.readthedocs.io/en/latest/

https://xtensor.readthedocs.io/en/latest/file_loading.html#loading-json-data-into-xtensor

https://cran.r-project.org/web/packages/RcppThread/vignettes/RcppThread-vignette.pdf

To compile files that use system header files and dynamic linking with libraries installed in the system, we used the plugin mechanism implemented in the Rcpp package. To find the paths and flags automatically, we used the popular Linux utility pkg-config.

Implementation of the Rcpp plugin for using the OpenCV library

Rcpp::registerPlugin("opencv", function() {
  # Possible package names
  pkg_config_name <- c("opencv", "opencv4")
  # Binary of the pkg-config utility
  pkg_config_bin <- Sys.which("pkg-config")
  # Check that the utility is available in the system
  checkmate::assert_file_exists(pkg_config_bin, access = "x")
  # Check that a pkg-config file for OpenCV is present
  check <- sapply(pkg_config_name, 
                  function(pkg) system(paste(pkg_config_bin, pkg)))
  if (all(check != 0)) {
    stop("OpenCV config for the pkg-config not found", call. = FALSE)
  }

  pkg_config_name <- pkg_config_name[check == 0]
  list(env = list(
    PKG_CXXFLAGS = system(paste(pkg_config_bin, "--cflags", pkg_config_name), 
                          intern = TRUE),
    PKG_LIBS = system(paste(pkg_config_bin, "--libs", pkg_config_name), 
                      intern = TRUE)
  ))
})

As a result of the plugin's work, the following values are substituted in during compilation:

Rcpp:::.plugins$opencv()$env

# $PKG_CXXFLAGS
# [1] "-I/usr/include/opencv"
#
# $PKG_LIBS
# [1] "-lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_video -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_viz -lopencv_surface_matching -lopencv_text -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo -lopencv_imgproc -lopencv_core"

The code implementing JSON parsing and generation of batches for transfer to the model is given under the spoiler. First, add a local directory to the header file search path (needed for ndjson):

Sys.setenv("PKG_CXXFLAGS" = paste0("-I", normalizePath(file.path("src"))))

Implementation of the JSON to tensor conversion in C++

// [[Rcpp::plugins(cpp14)]]
// [[Rcpp::plugins(opencv)]]
// [[Rcpp::depends(xtensor)]]
// [[Rcpp::depends(RcppThread)]]

#include <xtensor/xjson.hpp>
#include <xtensor/xadapt.hpp>
#include <xtensor/xview.hpp>
#include <xtensor-r/rtensor.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <Rcpp.h>
#include <RcppThread.h>

// Type aliases
using RcppThread::parallelFor;
using json = nlohmann::json;
using points = xt::xtensor<double,2>;     // Point coordinates extracted from JSON
using strokes = std::vector<points>;      // Strokes extracted from JSON
using xtensor3d = xt::xtensor<double, 3>; // Tensor for storing an image matrix
using xtensor4d = xt::xtensor<double, 4>; // Tensor for storing a set of images
using rtensor3d = xt::rtensor<double, 3>; // Wrapper for export to R
using rtensor4d = xt::rtensor<double, 4>; // Wrapper for export to R

// Static constants
// Image size in pixels
const static int SIZE = 256;
// Line type
// See https://en.wikipedia.org/wiki/Pixel_connectivity#2-dimensional
const static int LINE_TYPE = cv::LINE_4;
// Line width in pixels
const static int LINE_WIDTH = 3;
// Resize algorithm
// https://docs.opencv.org/3.1.0/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121
const static int RESIZE_TYPE = cv::INTER_LINEAR;

// Template for converting an OpenCV matrix into a tensor
template <typename T, int NCH, typename XT=xt::xtensor<T,3,xt::layout_type::column_major>>
XT to_xt(const cv::Mat_<cv::Vec<T, NCH>>& src) {
  // Dimensions of the target tensor
  std::vector<int> shape = {src.rows, src.cols, NCH};
  // Total number of elements in the array
  size_t size = src.total() * NCH;
  // Convert cv::Mat to xt::xtensor
  XT res = xt::adapt((T*) src.data, size, xt::no_ownership(), shape);
  return res;
}

// Convert JSON into a list of point coordinates
strokes parse_json(const std::string& x) {
  auto j = json::parse(x);
  // The parsing result must be an array
  if (!j.is_array()) {
    throw std::runtime_error("'x' must be JSON array.");
  }
  strokes res;
  res.reserve(j.size());
  for (const auto& a: j) {
    // Each element of the array must be a 2-dimensional array
    if (!a.is_array() || a.size() != 2) {
      throw std::runtime_error("'x' must include only 2d arrays.");
    }
    // Extract the vector of points
    auto p = a.get<points>();
    res.push_back(p);
  }
  return res;
}

// Drawing the lines
// Colors are in HSV
cv::Mat ocv_draw_lines(const strokes& x, bool color = true) {
  // Source matrix type
  auto stype = color ? CV_8UC3 : CV_8UC1;
  // Target matrix type
  auto dtype = color ? CV_32FC3 : CV_32FC1;
  auto bg = color ? cv::Scalar(0, 0, 255) : cv::Scalar(255);
  auto col = color ? cv::Scalar(0, 255, 220) : cv::Scalar(0);
  cv::Mat img = cv::Mat(SIZE, SIZE, stype, bg);
  // Number of strokes
  size_t n = x.size();
  for (const auto& s: x) {
    // Number of points in the stroke
    size_t n_points = s.shape()[1];
    for (size_t i = 0; i < n_points - 1; ++i) {
      // Start point of the segment
      cv::Point from(s(0, i), s(1, i));
      // End point of the segment
      cv::Point to(s(0, i + 1), s(1, i + 1));
      // Draw the line
      cv::line(img, from, to, col, LINE_WIDTH, LINE_TYPE);
    }
    if (color) {
      // Change the line color
      col[0] += 180 / n;
    }
  }
  if (color) {
    // Convert the color representation to RGB
    cv::cvtColor(img, img, cv::COLOR_HSV2RGB);
  }
  // Convert the representation to float32 with values in [0, 1]
  img.convertTo(img, dtype, 1 / 255.0);
  return img;
}

// Process the JSON and get a tensor with the image data
xtensor3d process(const std::string& x, double scale = 1.0, bool color = true) {
  auto p = parse_json(x);
  auto img = ocv_draw_lines(p, color);
  if (scale != 1) {
    cv::Mat out;
    cv::resize(img, out, cv::Size(), scale, scale, RESIZE_TYPE);
    cv::swap(img, out);
    out.release();
  }
  xtensor3d arr = color ? to_xt<double,3>(img) : to_xt<double,1>(img);
  return arr;
}

// [[Rcpp::export]]
rtensor3d cpp_process_json_str(const std::string& x, 
                               double scale = 1.0, 
                               bool color = true) {
  xtensor3d res = process(x, scale, color);
  return res;
}

// [[Rcpp::export]]
rtensor4d cpp_process_json_vector(const std::vector<std::string>& x, 
                                  double scale = 1.0, 
                                  bool color = false) {
  size_t n = x.size();
  size_t dim = floor(SIZE * scale);
  size_t channels = color ? 3 : 1;
  xtensor4d res({n, dim, dim, channels});
  parallelFor(0, n, [&x, &res, scale, color](int i) {
    xtensor3d tmp = process(x[i], scale, color);
    auto view = xt::view(res, i, xt::all(), xt::all(), xt::all());
    view = tmp;
  });
  return res;
}

This code is saved to the file src/cv_xt.cpp and compiled with the command Rcpp::sourceCpp(file = "src/cv_xt.cpp", env = .GlobalEnv); working with it also requires nlohmann/json.hpp from the repository. The code is divided into several functions:

  • to_xt - a templated function for transforming an image matrix (cv::Mat) into the tensor xt::xtensor;

  • parse_json - a function that parses a JSON string, extracts the point coordinates and packs them into a vector;

  • ocv_draw_lines - draws multicolored lines from the resulting vector of points;

  • process - combines the functions above and also adds the ability to scale the resulting image;

  • cpp_process_json_str - a wrapper over the process function that exports the result as an R object (a multidimensional array);

  • cpp_process_json_vector - a wrapper over cpp_process_json_str that allows processing a vector of strings in multi-threaded mode.
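
A note on the coloring: OpenCV stores hue in the range 0-179, and ocv_draw_lines advances the hue by 180 / n after each stroke, so n strokes are spread over the hue semicircle. The resulting hues can be sketched in R:

```r
# Hues assigned to n consecutive strokes when the start hue is 0
# (the C++ code uses integer division of 180 / n; exact here for n = 4)
stroke_hues <- function(n) {
  (seq_len(n) - 1) * 180 / n
}
stroke_hues(4)  # 0 45 90 135
```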

To draw the multicolored lines, the HSV color model was used, with subsequent conversion to RGB. Let's test the result:

arr <- cpp_process_json_str(tmp_data[4, drawing])
dim(arr)
# [1] 256 256   3
plot(magick::image_read(arr))

(Figure: the rendered doodle of a cat, C++ implementation)

Comparing the speed of the R and C++ implementations

res_bench <- bench::mark(
  r_process_json_str(tmp_data[4, drawing], scale = 0.5),
  cpp_process_json_str(tmp_data[4, drawing], scale = 0.5),
  check = FALSE,
  min_iterations = 100
)
# Benchmark parameters
cols <- c("expression", "min", "median", "max", "itr/sec", "total_time", "n_itr")
res_bench[, cols]

#   expression                min     median       max `itr/sec` total_time  n_itr
#   <chr>                <bch:tm>   <bch:tm>  <bch:tm>     <dbl>   <bch:tm>  <int>
# 1 r_process_json_str     3.49ms     3.55ms    4.47ms      273.      490ms    134
# 2 cpp_process_json_str   1.94ms     2.02ms    5.32ms      489.      497ms    243

library(ggplot2)
# Run the measurement
res_bench <- bench::press(
  batch_size = 2^(4:10),
  {
    .data <- tmp_data[sample(seq_len(.N), batch_size), drawing]
    bench::mark(
      r_process_json_vector(.data, scale = 0.5),
      cpp_process_json_vector(.data,  scale = 0.5),
      min_iterations = 50,
      check = FALSE
    )
  }
)

res_bench[, cols]

#    expression   batch_size      min   median      max `itr/sec` total_time n_itr
#    <chr>             <dbl> <bch:tm> <bch:tm> <bch:tm>     <dbl>   <bch:tm> <int>
#  1 r                   16   50.61ms  53.34ms  54.82ms    19.1     471.13ms     9
#  2 cpp                 16    4.46ms   5.39ms   7.78ms   192.      474.09ms    91
#  3 r                   32   105.7ms 109.74ms 212.26ms     7.69        6.5s    50
#  4 cpp                 32    7.76ms  10.97ms  15.23ms    95.6     522.78ms    50
#  5 r                   64  211.41ms 226.18ms 332.65ms     3.85      12.99s    50
#  6 cpp                 64   25.09ms  27.34ms  32.04ms    36.0        1.39s    50
#  7 r                  128   534.5ms 627.92ms 659.08ms     1.61      31.03s    50
#  8 cpp                128   56.37ms  58.46ms  66.03ms    16.9        2.95s    50
#  9 r                  256     1.15s    1.18s    1.29s     0.851     58.78s    50
# 10 cpp                256  114.97ms 117.39ms 130.09ms     8.45       5.92s    50
# 11 r                  512     2.09s    2.15s    2.32s     0.463       1.8m    50
# 12 cpp                512  230.81ms  235.6ms 261.99ms     4.18      11.97s    50
# 13 r                 1024        4s    4.22s     4.4s     0.238       3.5m    50
# 14 cpp               1024  410.48ms 431.43ms 462.44ms     2.33      21.45s    50

ggplot(res_bench, aes(x = factor(batch_size), y = median, 
                      group =  expression, color = expression)) +
  geom_point() +
  geom_line() +
  ylab("median time, s") +
  theme_minimal() +
  scale_color_discrete(name = "", labels = c("cpp", "r")) +
  theme(legend.position = "bottom") 

(Figure: median batch-formation time of the R and C++ implementations vs. batch size)

As you can see, the speedup turned out to be very significant, and parallelizing R code cannot catch up with the C++ code.

3. Iterators for unloading batches from the database

R has a well-deserved reputation for processing data that fits in RAM, while Python is rather characterized by iterative data processing, which allows you to implement out-of-core computations (computations using external memory) easily and naturally. A classic example relevant for us in the context of the described problem is deep neural networks trained by gradient descent with the gradient approximated at each step by a small portion of the observations, or mini-batch.
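
For readers less familiar with the term, mini-batch gradient descent can be shown on a toy problem: fitting y ≈ w·x + b by estimating the gradient on a random sample of 32 observations at each step (an illustrative base-R sketch, not part of the original solution):

```r
set.seed(42)
n <- 1000
x <- runif(n)
y <- 2 * x + 1 + rnorm(n, sd = 0.1)

w <- 0; b <- 0
lr <- 0.5          # learning rate
batch_size <- 32   # size of the mini-batch

for (step in 1:2000) {
  idx <- sample(n, batch_size)        # random mini-batch
  err <- (w * x[idx] + b) - y[idx]    # residuals on the mini-batch
  # Gradient of the mean squared error, approximated on the mini-batch
  w <- w - lr * mean(err * x[idx])
  b <- b - lr * mean(err)
}
round(c(w = w, b = b), 1)  # close to the true values w = 2, b = 1
```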

Deep learning frameworks written in Python have special classes implementing iterators over data: tables, pictures in folders, binary formats, etc. You can use the ready-made options or write your own for specific tasks. In R we can use all the features of the Python library Keras with its various backends via the package of the same name, which in turn works on top of the reticulate package. The latter deserves a separate long article; it not only lets you run Python code from R, but also transfer objects between R and Python sessions, automatically performing all the necessary type conversions.

We got rid of the need to store all the data in RAM by using MonetDBLite, and all the "neural network" work will be performed by the original code in Python, so we just have to write an iterator over the data, since nothing ready-made exists for such a situation in either R or Python. There are essentially only two requirements for it: it must return batches in an endless loop and save its state between iterations (the latter is implemented in R in the simplest way, via closures). Previously it was required to explicitly convert R arrays into numpy arrays inside the iterator, but the current version of the Keras package does this itself.
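
The two requirements (an endless loop and state kept between calls) reduce to a closure that captures a counter. A minimal sketch, simplified to a vector of IDs with no database:

```r
make_batch_iterator <- function(ids, batch_size) {
  i <- 0L  # state survives between calls in the enclosing environment
  n_batches <- length(ids) %/% batch_size
  function() {
    i <<- i %% n_batches + 1L        # endless loop over the batches
    start <- (i - 1L) * batch_size + 1L
    ids[start:(start + batch_size - 1L)]
  }
}

it <- make_batch_iterator(1:10, batch_size = 4)
it()  # 1 2 3 4
it()  # 5 6 7 8
it()  # 1 2 3 4  (wrapped around; only complete batches are used)
```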

The iterator for the training and validation data turned out as follows:

Iterator for the training and validation data

train_generator <- function(db_connection = con,
                            samples_index,
                            num_classes = 340,
                            batch_size = 32,
                            scale = 1,
                            color = FALSE,
                            imagenet_preproc = FALSE) {
  # Argument checks
  checkmate::assert_class(con, "DBIConnection")
  checkmate::assert_integerish(samples_index)
  checkmate::assert_count(num_classes)
  checkmate::assert_count(batch_size)
  checkmate::assert_number(scale, lower = 0.001, upper = 5)
  checkmate::assert_flag(color)
  checkmate::assert_flag(imagenet_preproc)

  # Shuffle so that batch indices can be taken and dropped in order
  dt <- data.table::data.table(id = sample(samples_index))
  # Assign batch numbers
  dt[, batch := (.I - 1L) %/% batch_size + 1L]
  # Keep only complete batches and set the key
  dt <- dt[, if (.N == batch_size) .SD, keyby = batch]
  # Set up the counter
  i <- 1
  # Number of batches
  max_i <- dt[, max(batch)]

  # Prepare the statement for unloading the data
  sql <- sprintf(
    "PREPARE SELECT drawing, label_int FROM doodles WHERE id IN (%s)",
    paste(rep("?", batch_size), collapse = ",")
  )
  res <- DBI::dbSendQuery(con, sql)

  # Analogue of keras::to_categorical
  to_categorical <- function(x, num) {
    n <- length(x)
    m <- numeric(n * num)
    m[x * n + seq_len(n)] <- 1
    dim(m) <- c(n, num)
    return(m)
  }

  # Closure
  function() {
    # Start a new epoch
    if (i > max_i) {
      dt[, id := sample(id)]
      data.table::setkey(dt, batch)
      # Reset the counter
      i <<- 1
      max_i <<- dt[, max(batch)]
    }

    # IDs for unloading the data
    batch_ind <- dt[batch == i, id]
    # Unload the data
    batch <- DBI::dbFetch(DBI::dbBind(res, as.list(batch_ind)), n = -1)

    # Increment the counter
    i <<- i + 1

    # Parse the JSON and prepare the array
    batch_x <- cpp_process_json_vector(batch$drawing, scale = scale, color = color)
    if (imagenet_preproc) {
      # Rescale from the interval [0, 1] to the interval [-1, 1]
      batch_x <- (batch_x - 0.5) * 2
    }

    batch_y <- to_categorical(batch$label_int, num_classes)
    result <- list(batch_x, batch_y)
    return(result)
  }
}

The function takes a variable with a connection to the database, the numbers of the rows to use, the number of classes, the batch size, the scale (scale = 1 corresponds to rendering 256×256 images, scale = 0.5 to 128×128), a color indicator (with color = FALSE rendering is done in grayscale, with color = TRUE each stroke is drawn in a new color), and a preprocessing indicator for networks pre-trained on imagenet. The latter is needed to rescale pixel values from the interval [0, 1] to the interval [-1, 1], which was used when training the supplied Keras models.

The outer function contains the argument type checks, a data.table with randomly shuffled row numbers from samples_index and their batch numbers, a counter and the maximum number of batches, as well as the SQL expression for unloading data from the database. In addition, we defined inside it a fast analogue of the keras::to_categorical() function. We used almost all of the data for training, leaving half a percent for validation, so the epoch size was limited by the steps_per_epoch parameter when calling keras::fit_generator(), and the condition if (i > max_i) worked only for the validation iterator.
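
The one-hot trick in that analogue relies on R matrices being column-major: for row i and zero-based label x[i], the linear index of element (i, x[i] + 1) is x[i] * n + i. A standalone check:

```r
# Same logic as the to_categorical() analogue defined inside the iterator
to_categorical <- function(x, num) {
  n <- length(x)
  m <- numeric(n * num)
  # Column-major: element (row i, column x[i] + 1) has linear index x[i] * n + i
  m[x * n + seq_len(n)] <- 1
  dim(m) <- c(n, num)
  m
}

m <- to_categorical(c(0L, 2L, 1L), num = 3L)
m
#      [,1] [,2] [,3]
# [1,]    1    0    0
# [2,]    0    0    1
# [3,]    0    1    0
```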

In functione interna, indices ordinis in proximam massam redduntur, monumenta expositae sunt e datorum cum batch contra crescentem, JSON parsing (munus. cpp_process_json_vector(), scripta in C ++) et creant vestes picturis correspondentes. Tunc vector calidus unus cum pittaciis classium creatur, vestit cum pixel valores et pittacia in album, quod est reditus pretii. Ad opus accelerandum, indices creationis in tabulis usi sumus data.table et modificatio per nexum - sine his sarcina "eu" data.table Difficillimum est fingere efficaciter operari cum aliqua notabili copia notitiarum in R.

The results of speed measurements on a Core i5 laptop are as follows:

Testing the iterator

library(Rcpp)
library(keras)
library(ggplot2)

source("utils/rcpp.R")
source("utils/keras_iterator.R")

con <- DBI::dbConnect(drv = MonetDBLite::MonetDBLite(), Sys.getenv("DBDIR"))

ind <- seq_len(DBI::dbGetQuery(con, "SELECT count(*) FROM doodles")[[1L]])
num_classes <- DBI::dbGetQuery(con, "SELECT max(label_int) + 1 FROM doodles")[[1L]]

# Indices for the training sample
train_ind <- sample(ind, floor(length(ind) * 0.995))
# Indices for the validation sample
val_ind <- ind[-train_ind]
rm(ind)
# Scale factor
scale <- 0.5

# Run the measurement
res_bench <- bench::press(
  batch_size = 2^(4:10),
  {
    it1 <- train_generator(
      db_connection = con,
      samples_index = train_ind,
      num_classes = num_classes,
      batch_size = batch_size,
      scale = scale
    )
    bench::mark(
      it1(),
      min_iterations = 50L
    )
  }
)
# Benchmark parameters
cols <- c("batch_size", "min", "median", "max", "itr/sec", "total_time", "n_itr")
res_bench[, cols]

#   batch_size      min   median      max `itr/sec` total_time n_itr
#        <dbl> <bch:tm> <bch:tm> <bch:tm>     <dbl>   <bch:tm> <int>
# 1         16     25ms  64.36ms   92.2ms     15.9       3.09s    49
# 2         32   48.4ms 118.13ms 197.24ms     8.17       5.88s    48
# 3         64   69.3ms 117.93ms 181.14ms     8.57       5.83s    50
# 4        128  157.2ms 240.74ms 503.87ms     3.85      12.71s    49
# 5        256  359.3ms 613.52ms 988.73ms     1.54       30.5s    47
# 6        512  884.7ms    1.53s    2.07s     0.674      1.11m    45
# 7       1024     2.7s    3.83s    5.47s     0.261      2.81m    44

ggplot(res_bench, aes(x = factor(batch_size), y = median, group = 1)) +
    geom_point() +
    geom_line() +
    ylab("median time, s") +
    theme_minimal()

DBI::dbDisconnect(con, shutdown = TRUE)

(figure: median batch preparation time vs. batch size)

If you have enough RAM, you can seriously speed up database operation by moving it into that same RAM (32 GB is enough for our task). In Linux, a tmpfs partition is mounted by default at /dev/shm, occupying up to half of the RAM capacity. You can allocate more by editing /etc/fstab to get a record like tmpfs /dev/shm tmpfs defaults,size=25g 0 0. Be sure to reboot and check the result by running the command df -h.
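For reference, the corresponding commands might look like this (a sketch; the remount requires root privileges, and the fstab record is the one quoted above):

```shell
# Temporarily grow the tmpfs without editing /etc/fstab (root required):
#   mount -o remount,size=25g /dev/shm
# Check the current size and usage of the tmpfs partition:
df -h /dev/shm
```

The remount variant takes effect immediately but does not survive a reboot, which is why the text recommends the fstab record.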

The iterator for the test data looks much simpler, since the test dataset fits entirely into RAM:

Iterator for test data

test_generator <- function(dt,
                           batch_size = 32,
                           scale = 1,
                           color = FALSE,
                           imagenet_preproc = FALSE) {

  # Argument checks
  checkmate::assert_data_table(dt)
  checkmate::assert_count(batch_size)
  checkmate::assert_number(scale, lower = 0.001, upper = 5)
  checkmate::assert_flag(color)
  checkmate::assert_flag(imagenet_preproc)

  # Assign batch numbers
  dt[, batch := (.I - 1L) %/% batch_size + 1L]
  data.table::setkey(dt, batch)
  i <- 1
  max_i <- dt[, max(batch)]

  # Closure
  function() {
    batch_x <- cpp_process_json_vector(dt[batch == i, drawing], 
                                       scale = scale, color = color)
    if (imagenet_preproc) {
      # Rescale from the interval [0, 1] to [-1, 1]
      batch_x <- (batch_x - 0.5) * 2
    }
    result <- list(batch_x)
    i <<- i + 1
    return(result)
  }
}

4. Choosing the model architecture

The first architecture used was mobilenet v1, whose features are discussed in this post. It is included in standard Keras and, accordingly, is available in the R package of the same name. But when trying to use it with single-channel images, a strange thing turned out: the input tensor must always have the dimension (batch, height, width, 3), i.e. the number of channels cannot be changed. There is no such limitation in Python, so we rushed in and wrote our own implementation of this architecture, following the original article (without the dropout that is in the keras version):

Mobilenet v1 architecture

library(keras)

top_3_categorical_accuracy <- custom_metric(
    name = "top_3_categorical_accuracy",
    metric_fn = function(y_true, y_pred) {
         metric_top_k_categorical_accuracy(y_true, y_pred, k = 3)
    }
)

layer_sep_conv_bn <- function(object, 
                              filters,
                              alpha = 1,
                              depth_multiplier = 1,
                              strides = c(2, 2)) {

  # NB! depth_multiplier !=  resolution multiplier
  # https://github.com/keras-team/keras/issues/10349

  layer_depthwise_conv_2d(
    object = object,
    kernel_size = c(3, 3), 
    strides = strides,
    padding = "same",
    depth_multiplier = depth_multiplier
  ) %>%
  layer_batch_normalization() %>% 
  layer_activation_relu() %>%
  layer_conv_2d(
    filters = filters * alpha,
    kernel_size = c(1, 1), 
    strides = c(1, 1)
  ) %>%
  layer_batch_normalization() %>% 
  layer_activation_relu() 
}

get_mobilenet_v1 <- function(input_shape = c(224, 224, 1),
                             num_classes = 340,
                             alpha = 1,
                             depth_multiplier = 1,
                             optimizer = optimizer_adam(lr = 0.002),
                             loss = "categorical_crossentropy",
                             metrics = c("categorical_crossentropy",
                                         top_3_categorical_accuracy)) {

  inputs <- layer_input(shape = input_shape)

  outputs <- inputs %>%
    layer_conv_2d(filters = 32, kernel_size = c(3, 3), strides = c(2, 2), padding = "same") %>%
    layer_batch_normalization() %>% 
    layer_activation_relu() %>%
    layer_sep_conv_bn(filters = 64, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 128, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 128, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 256, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 256, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 512, strides = c(1, 1)) %>%
    layer_sep_conv_bn(filters = 1024, strides = c(2, 2)) %>%
    layer_sep_conv_bn(filters = 1024, strides = c(1, 1)) %>%
    layer_global_average_pooling_2d() %>%
    layer_dense(units = num_classes) %>%
    layer_activation_softmax()

    model <- keras_model(
      inputs = inputs,
      outputs = outputs
    )

    model %>% compile(
      optimizer = optimizer,
      loss = loss,
      metrics = metrics
    )

    return(model)
}

The disadvantages of this approach are obvious. We wanted to test a lot of models, but did not want to rewrite each architecture by hand. We were also deprived of the opportunity to use the weights of models pretrained on imagenet. As usual, studying the documentation helped. The function get_config() lets you get a description of the model in a form suitable for editing (base_model_conf$layers is a regular R list), and the function from_config() performs the reverse conversion to a model object:

base_model_conf <- get_config(base_model)
base_model_conf$layers[[1]]$config$batch_input_shape[[4]] <- 1L
base_model <- from_config(base_model_conf)

Now it is not difficult to write a universal function for loading any of the supplied Keras models, with or without weights trained on imagenet:

Function for loading ready-made architectures

get_model <- function(name = "mobilenet_v2",
                      input_shape = NULL,
                      weights = "imagenet",
                      pooling = "avg",
                      num_classes = NULL,
                      optimizer = keras::optimizer_adam(lr = 0.002),
                      loss = "categorical_crossentropy",
                      metrics = NULL,
                      color = TRUE,
                      compile = FALSE) {
  # Argument checks
  checkmate::assert_string(name)
  checkmate::assert_integerish(input_shape, lower = 1, upper = 256, len = 3)
  checkmate::assert_count(num_classes)
  checkmate::assert_flag(color)
  checkmate::assert_flag(compile)

  # Get the object from the keras package
  model_fun <- get0(paste0("application_", name), envir = asNamespace("keras"))
  # Check that the object exists in the package
  if (is.null(model_fun)) {
    stop("Model ", shQuote(name), " not found.", call. = FALSE)
  }

  base_model <- model_fun(
    input_shape = input_shape,
    include_top = FALSE,
    weights = weights,
    pooling = pooling
  )

  # If the image is not in color, change the input dimension
  if (!color) {
    base_model_conf <- keras::get_config(base_model)
    base_model_conf$layers[[1]]$config$batch_input_shape[[4]] <- 1L
    base_model <- keras::from_config(base_model_conf)
  }

  predictions <- keras::get_layer(base_model, "global_average_pooling2d_1")$output
  predictions <- keras::layer_dense(predictions, units = num_classes, activation = "softmax")
  model <- keras::keras_model(
    inputs = base_model$input,
    outputs = predictions
  )

  if (compile) {
    keras::compile(
      object = model,
      optimizer = optimizer,
      loss = loss,
      metrics = metrics
    )
  }

  return(model)
}

When single-channel images are used, the pretrained weights are not used. This could be fixed: using the function get_weights(), get the model weights as a list of R arrays, change the dimension of the first element of this list (taking one color channel or averaging all three), and then load the weights back into the model with the function set_weights(). We never added this functionality, because by that point it was already clear that it was more productive to work with color pictures.
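The fix described could be sketched roughly as follows. This is a hypothetical helper we never actually wrote, shown only for the array mechanics; the resulting list element would then go back into the model via keras::set_weights():

```r
# Hypothetical sketch of the weight conversion described above: collapse the
# color axis of a (k, k, 3, filters) convolution kernel to (k, k, 1, filters)
collapse_channels <- function(kern) {
  stopifnot(length(dim(kern)) == 4, dim(kern)[3] == 3)
  # sum over the channel axis, then restore the dropped dimension
  array(apply(kern, c(1, 2, 4), sum),
        dim = c(dim(kern)[1], dim(kern)[2], 1L, dim(kern)[4]))
}

kern <- array(1, dim = c(3, 3, 3, 32))
dim(collapse_channels(kern))  # 3 3 1 32
```

Summing (rather than averaging) keeps the activation scale of a grayscale input roughly comparable to the original three-channel input; either choice would need validation in practice.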

We carried out most of the experiments with mobilenet versions 1 and 2, as well as with resnet34. More modern architectures such as SE-ResNeXt performed well in this competition. Unfortunately, we did not have ready-made implementations at our disposal, and we did not write our own (but we definitely will).

5. Parameterizing scripts

For convenience, all the code for launching training was designed as a single script, parameterized with docopt as follows:

doc <- '
Usage:
  train_nn.R --help
  train_nn.R --list-models
  train_nn.R [options]

Options:
  -h --help                   Show this message.
  -l --list-models            List available models.
  -m --model=<model>          Neural network model name [default: mobilenet_v2].
  -b --batch-size=<size>      Batch size [default: 32].
  -s --scale-factor=<ratio>   Scale factor [default: 0.5].
  -c --color                  Use color lines [default: FALSE].
  -d --db-dir=<path>          Path to database directory [default: Sys.getenv("db_dir")].
  -r --validate-ratio=<ratio> Validate sample ratio [default: 0.995].
  -n --n-gpu=<number>         Number of GPUs [default: 1].
'
args <- docopt::docopt(doc)

The docopt package is an implementation of http://docopt.org/ for R. With its help, the script is launched with simple commands like Rscript bin/train_nn.R -m resnet50 -c -d /home/andrey/doodle_db or ./bin/train_nn.R -m resnet50 -c -d /home/andrey/doodle_db if the file train_nn.R is executable (this command will start training the resnet50 model on three-color 128x128 pixel images; the database must be located in the folder /home/andrey/doodle_db). You can add the learning rate, the optimizer type, and any other customizable parameters to the list. While preparing this publication, it turned out that the mobilenet_v2 architecture cannot be used with the current version of Keras from R because of changes not yet accounted for in the R package; we are waiting for this to be fixed.
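One practical detail: docopt returns all option values as strings, so numeric parameters need explicit coercion before use. A sketch (the list literal below merely stands in for the result of docopt::docopt(doc); the option names follow the usage block above):

```r
# docopt returns character values; coerce the numeric options explicitly.
# The list literal stands in for the result of docopt::docopt(doc).
args <- list(
  `batch-size`   = "32",
  `scale-factor` = "0.5",
  `n-gpu`        = "1"
)

batch_size <- as.integer(args[["batch-size"]])
scale      <- as.numeric(args[["scale-factor"]])
n_gpu      <- as.integer(args[["n-gpu"]])
```

Coercing once at the top of the script keeps the rest of the code free of string-to-number conversions.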

This approach made it possible to significantly speed up experiments with different models compared to the more traditional practice of launching scripts in RStudio (we note the tfruns package as a possible alternative). But the main advantage is the ability to easily manage script launches in Docker or simply on a server, without installing RStudio for this.

6. Dockerizing scripts

We used Docker to ensure portability of the environment for training models between team members and for rapid deployment in the cloud. You can start getting acquainted with this tool, which is relatively unusual for an R programmer, with this series of publications or a video course.

Docker lets you both create your own images from scratch and use other images as a basis for your own. Analyzing the available options, we came to the conclusion that installing the NVIDIA and CUDA+cuDNN drivers and the Python libraries is a fairly voluminous part of the image, and decided to take the official image tensorflow/tensorflow:1.12.0-gpu as the basis, adding the necessary R packages to it.

The final Dockerfile looked like this:

Dockerfile

FROM tensorflow/tensorflow:1.12.0-gpu

MAINTAINER Artem Klevtsov <[email protected]>

SHELL ["/bin/bash", "-c"]

ARG LOCALE="en_US.UTF-8"
ARG APT_PKG="libopencv-dev r-base r-base-dev littler"
ARG R_BIN_PKG="futile.logger checkmate data.table rcpp rapidjsonr dbi keras jsonlite curl digest remotes"
ARG R_SRC_PKG="xtensor RcppThread docopt MonetDBLite"
ARG PY_PIP_PKG="keras"
ARG DIRS="/db /app /app/data /app/models /app/logs"

RUN source /etc/os-release && \
    echo "deb https://cloud.r-project.org/bin/linux/ubuntu ${UBUNTU_CODENAME}-cran35/" > /etc/apt/sources.list.d/cran35.list && \
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9 && \
    add-apt-repository -y ppa:marutter/c2d4u3.5 && \
    add-apt-repository -y ppa:timsc/opencv-3.4 && \
    apt-get update && \
    apt-get install -y locales && \
    locale-gen ${LOCALE} && \
    apt-get install -y --no-install-recommends ${APT_PKG} && \
    ln -s /usr/lib/R/site-library/littler/examples/install.r /usr/local/bin/install.r && \
    ln -s /usr/lib/R/site-library/littler/examples/install2.r /usr/local/bin/install2.r && \
    ln -s /usr/lib/R/site-library/littler/examples/installGithub.r /usr/local/bin/installGithub.r && \
    echo 'options(Ncpus = parallel::detectCores())' >> /etc/R/Rprofile.site && \
    echo 'options(repos = c(CRAN = "https://cloud.r-project.org"))' >> /etc/R/Rprofile.site && \
    apt-get install -y $(printf "r-cran-%s " ${R_BIN_PKG}) && \
    install.r ${R_SRC_PKG} && \
    pip install ${PY_PIP_PKG} && \
    mkdir -p ${DIRS} && \
    chmod 777 ${DIRS} && \
    rm -rf /tmp/downloaded_packages/ /tmp/*.rds && \
    rm -rf /var/lib/apt/lists/*

COPY utils /app/utils
COPY src /app/src
COPY tests /app/tests
COPY bin/*.R /app/

ENV DBDIR="/db"
ENV CUDA_HOME="/usr/local/cuda"
ENV PATH="/app:${PATH}"

WORKDIR /app

VOLUME /db
VOLUME /app

CMD bash

For convenience, the packages used were put into variables; the bulk of the scripts is copied inside the container during the build. We also changed the command shell to /bin/bash for the convenience of using the contents of /etc/os-release. This avoided the need to hard-code the OS version.

In addition, a small shell script was written that allows launching a container with various commands. For example, these can be scripts for training neural networks that were previously placed inside the container, or a command shell for debugging and monitoring the operation of the container:

Shell script for launching the container

#!/bin/sh

DBDIR=${PWD}/db
LOGSDIR=${PWD}/logs
MODELDIR=${PWD}/models
DATADIR=${PWD}/data
ARGS="--runtime=nvidia --rm -v ${DBDIR}:/db -v ${LOGSDIR}:/app/logs -v ${MODELDIR}:/app/models -v ${DATADIR}:/app/data"

if [ -z "$1" ]; then
    CMD="Rscript /app/train_nn.R"
elif [ "$1" = "bash" ]; then
    ARGS="${ARGS} -ti"
else
    CMD="Rscript /app/train_nn.R $@"
fi

docker run ${ARGS} doodles-tf ${CMD}

If this shell script is run without parameters, the script train_nn.R will be called inside the container with default values; if the first positional argument is "bash", the container will start interactively with a command shell. In all other cases, the values of the positional arguments are substituted: CMD="Rscript /app/train_nn.R $@".

It is worth noting that the directories with the source data and the database, as well as the directory for saving trained models, are mounted inside the container from the host system, which lets you access the results of the scripts without unnecessary manipulations.

7. Using multiple GPUs on Google Cloud

One of the features of the competition was the very noisy data (see the title picture, borrowed from @Leigh.plt on the ODS slack). Large batches help to combat this, and after experiments on a PC with 1 GPU we decided to train models on several GPUs in the cloud. We used GoogleCloud (a good guide to the basics) because of the large selection of available configurations, reasonable prices and the $300 bonus. Out of greed, I ordered a 4xV100 instance with an SSD and a ton of RAM, and that was a big mistake. Such a machine eats money quickly; without a proven pipeline you can go broke experimenting. For educational purposes, it is better to take a K80. But the large amount of RAM came in handy: the cloud SSD did not impress with its performance, so the database was moved to /dev/shm.

Of greatest interest is the code fragment responsible for using multiple GPUs. First, the model is created on the CPU using a context manager, just as in Python:

with(tensorflow::tf$device("/cpu:0"), {
  model_cpu <- get_model(
    name = model_name,
    input_shape = input_shape,
    weights = weights,
    metrics = c(top_3_categorical_accuracy),
    compile = FALSE
  )
})

Then the uncompiled model (this is important) is copied to the specified number of available GPUs, and only after that is it compiled:

model <- keras::multi_gpu_model(model_cpu, gpus = n_gpu)
keras::compile(
  object = model,
  optimizer = keras::optimizer_adam(lr = 0.0004),
  loss = "categorical_crossentropy",
  metrics = c(top_3_categorical_accuracy)
)

The classic technique of freezing all layers except the last one, training that last layer, then unfreezing and training the whole model could not be made to work with multiple GPUs.

Training was monitored without tensorboard; we limited ourselves to writing logs and saving models with informative names after each epoch:

Callbacks

# Log file name template
log_file_tmpl <- file.path("logs", sprintf(
  "%s_%d_%dch_%s.csv",
  model_name,
  dim_size,
  channels,
  format(Sys.time(), "%Y%m%d%H%M%OS")
))
# Model file name template
model_file_tmpl <- file.path("models", sprintf(
  "%s_%d_%dch_{epoch:02d}_{val_loss:.2f}.h5",
  model_name,
  dim_size,
  channels
))

callbacks_list <- list(
  keras::callback_csv_logger(
    filename = log_file_tmpl
  ),
  keras::callback_early_stopping(
    monitor = "val_loss",
    min_delta = 1e-4,
    patience = 8,
    verbose = 1,
    mode = "min"
  ),
  keras::callback_reduce_lr_on_plateau(
    monitor = "val_loss",
    factor = 0.5, # halve the lr
    patience = 4,
    verbose = 1,
    min_delta = 1e-4,
    mode = "min"
  ),
  keras::callback_model_checkpoint(
    filepath = model_file_tmpl,
    monitor = "val_loss",
    save_best_only = FALSE,
    save_weights_only = FALSE,
    mode = "min"
  )
)

8. In conclusion

A number of the problems we encountered have not yet been overcome:

  • Keras has no ready-made function for automatically searching for the optimal learning rate (an analogue of lr_finder in the fast.ai library); with some effort, third-party implementations can be ported to R, for example this one;
  • as a consequence of the previous point, we failed to pick the right learning rate when using several GPUs;
  • there is a shortage of modern neural network architectures, especially ones pretrained on imagenet;
  • no one-cycle policy and no discriminative learning rates (cosine annealing was implemented at our request, thank you skeydan).
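For illustration, a cosine-annealed learning-rate schedule of the kind mentioned in the last point can be expressed as a plain R function and passed to keras::callback_learning_rate_scheduler(); this is a sketch under our own assumptions, not the implementation referenced above:

```r
# Cosine annealing: decay the learning rate from lr_max to lr_min over
# n_epochs (sketch; not the implementation referenced above)
cosine_lr <- function(epoch, lr = NULL, lr_max = 2e-3, lr_min = 1e-5,
                      n_epochs = 30) {
  # `lr` (the current rate passed in by the callback) is deliberately ignored
  t <- min(epoch, n_epochs) / n_epochs
  lr_min + (lr_max - lr_min) * (1 + cos(pi * t)) / 2
}

# would be attached to training roughly as:
# keras::callback_learning_rate_scheduler(schedule = cosine_lr)
```

The rate starts at lr_max, decays smoothly, and settles at lr_min by epoch n_epochs.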

Useful things learned from this competition:

  • On relatively low-power hardware you can work with decent (many times the size of RAM) volumes of data without pain. The data.table package saves memory thanks to in-place modification of tables, which avoids copying them, and when used correctly its capabilities almost always demonstrate the highest speed among all the tools known to us for scripting languages. Saving the data in a database lets you, in many cases, not think at all about squeezing the whole dataset into RAM.
  • Slow functions in R can be replaced with fast ones in C++ using the Rcpp package. If, in addition, RcppThread or RcppParallel is used, we get cross-platform multi-threaded implementations, so there is no need to parallelize the code at the R level.
  • The Rcpp package can be used without serious knowledge of C++; the required minimum is outlined here. Header files for a number of cool C++ libraries like xtensor are available on CRAN, that is, an infrastructure is forming for implementing projects with ready-made high-performance C++ code in R. An additional convenience is syntax highlighting and a static C++ code analyzer in RStudio.
  • docopt lets you run scripts with parameters. This is convenient for use on a remote server, including under docker. In RStudio it is inconvenient to conduct many-hour experiments with training neural networks, and installing the IDE on the server itself is not always justified.
  • Docker ensures code portability and reproducibility of results between developers with different OS and library versions, as well as ease of execution on servers. You can launch the whole training pipeline with just one command.
  • Google Cloud is a budget-friendly way to experiment on expensive hardware, but you need to choose configurations carefully.
  • Measuring the speed of individual code fragments is very useful, especially when combining R and C++, and with the bench package it is also very easy.

Overall, this experience was very rewarding, and we continue to work on some of the issues raised.

Source: www.habr.com
