Magento 2: importing products directly into the database

В previous article I described the process of importing products into Magento 2 in the usual way - through models and repositories. The conventional method is characterized by a very low data processing speed. On my laptop, it was about one product per second. In this continuation, I consider an alternative way to import a product - by direct writing to the database, bypassing the standard Magento 2 mechanisms (models, factories, repositories). The sequence of steps for importing products can be adapted to any programming language that can work with MySQL.

Disclaimer: Magento has ready-made functionality for data import and, most likely, you will have enough of it. However, if you need more control over the import process, not limited to preparing a CSV file for what is, welcome under cat.

Magento 2: importing products directly into the database

The code resulting from writing both articles can be viewed in the Magento module "flancer32/mage2_ext_demo_import". Here are some restrictions that I followed in order to simplify the demo module code:

  • Products are only created, not updated.
  • One warehouse
  • Only category names are imported, without their structure
  • Data structures conform to version 2.3

JSON for single product import:

{
  "sku": "MVA20D-UBV-3",
  "name": "Заглушка для пломбировки ВА47-29 IEK",
  "desc": "Обеспечение доступа к устройствам ...",
  "desc_short": "Заглушка для пломбировки ВА47-29 IEK предназначена для ...",
  "price": 5.00,
  "qty": 25,
  "categories": ["Категория 1", "Категория 2"],
  "image_path": "mva20d_ubv_3.png"
}

Overview of the main import steps

  • registration of the product itself
  • product and website link
  • basic product attributes (EAV)
  • inventory data (amount of product in stock)
  • media (pictures)
  • association with catalog categories

Product Registration

Basic product information is in catalog_product_entity:

CREATE TABLE `catalog_product_entity` (
  `entity_id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Entity Id',
  `attribute_set_id` smallint(5) unsigned NOT NULL DEFAULT '0' COMMENT 'Attribute Set ID',
  `type_id` varchar(32) NOT NULL DEFAULT 'simple' COMMENT 'Type ID',
  `sku` varchar(64) DEFAULT NULL COMMENT 'SKU',
  `has_options` smallint(6) NOT NULL DEFAULT '0' COMMENT 'Has Options',
  `required_options` smallint(5) unsigned NOT NULL DEFAULT '0' COMMENT 'Required Options',
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation Time',
  `updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update Time',
  PRIMARY KEY (`entity_id`),
  KEY `CATALOG_PRODUCT_ENTITY_ATTRIBUTE_SET_ID` (`attribute_set_id`),
  KEY `CATALOG_PRODUCT_ENTITY_SKU` (`sku`)
)

The minimum required information to create an entry in the product registry:

  • attribute_set_id
  • sku

additional:

  • type_id — if not specified, 'simple' will be used

For direct writing to the database, I use the DB adapter of Magento itself:

function create($sku, $typeId, $attrSetId)
{
    /** @var MagentoFrameworkAppResourceConnection $this->resource */
    /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
    $conn = $this->resource->getConnection();
    $table = $this->resource->getTableName('catalog_product_entity');
    $bind = [
        'sku' => $sku,
        'type_id' => $typeId,
        'attribute_set_id' => $attrSetId
    ];
    $conn->insert($table, $bind);
    $result = $conn->lastInsertId($table);
    return $result;
}

After registering the product in catalog_product_entity it becomes visible in the admin panel, in the product grid (Catalog/Products).

Magento 2: importing products directly into the database

Linking product and website

The connection of the product to the site determines in which stores and in which showcases the product will be available on the front.

function linkToWebsite($prodId, $websiteId)
{
    /** @var MagentoFrameworkAppResourceConnection $this->resource */
    /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
    $conn = $this->resource->getConnection();
    $table = $this->resource->getTableName('catalog_product_website');
    $bind = [
        'product_id' => $prodId,
        'website_id' => $websiteId
    ];
    $conn->insert($table, $bind);
}

Magento 2: importing products directly into the database

Basic Product Attributes

The newly registered product does not yet have a name or description. All this is done through EAV attributes. Here is a list of basic product attributes that are needed in order for the product to be displayed correctly on the front:

  • name
  • price
  • description
  • short_description
  • status
  • tax_class_id
  • url_key
  • visibility

A separate attribute is added to the product like this (the details of obtaining the identifier and type of the attribute by its code are omitted):

public function create($prodId, $attrCode, $attrValue)
{
    $attrId = /* get attribute ID by attribute code */
    $attrType = /* get attribute type [datetime|decimal|int|text|varchar]) by attribute code */
    if ($attrId) {
        /** @var MagentoFrameworkAppResourceConnection $this->resource */
        /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
        $conn = $this->resource->getConnection();
        $tblName = 'catalog_product_entity_' . $attrType;
        $table = $this->resource->getTableName($tblName);
        $bind = [
            'attribute_id' => $attrId,
            'entity_id' => $prodId,
            /* put all attributes to default store view with id=0 (admin) */
            'store_id' => 0,
            'value' => $attrValue
        ];
        $conn->insert($table, $bind);
    }
}

By the attribute code, we determine its id and data type (datetime, decimal, int, text, varchar), then write the data for the administrative storefront to the appropriate table (store_id = 0).

After adding the above attributes to the product, we get the following picture in the admin panel:

Magento 2: importing products directly into the database

inventory data

Starting from version 2.3 in Magento, there are two sets of tables in parallel that store inventory information (product quantity):

  • cataloginventory_*: old structure;
  • inventory_*: new structure (MSI - Multi Source Inventory);

You need to add inventory data to both structures, because the new structure is not yet completely independent of the old one (it is very likely that for default warehouse in the new structure uses a table cataloginventory_stock_status as inventory_stock_1).

catalog inventory_

When deploying Magneto 2.3 we initially have 2 entries in store_website, which corresponds to two sites - administrative and main client:

website_id|code |name        |sort_order|default_group_id|is_default|
----------|-----|------------|----------|----------------|----------|
         0|admin|Admin       |         0|               0|         0|
         1|base |Main Website|         0|               1|         1|

Table cataloginventory_stock we have only one entry:

stock_id|website_id|stock_name|
--------|----------|----------|
       1|         0|Default   |

That is, we have only one “warehouse” in the old structure (stock) and it is linked to the administrative website. Adding new ones via admin sources/stocks in MSI (new structure) does not result in new entries in cataloginventory_stock.

Inventory data about products in the old structure is initially written in tables:

  • cataloginventory_stock_item
  • cataloginventory_stock_status

catalog inventory_stock_item

function createOldItem($prodId, $qty)
{
    $isQtyDecimal = (((int)$qty) != $qty);
    $isInStock = ($qty > 0);
    /** @var MagentoFrameworkAppResourceConnection $this->resource */
    /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
    $conn = $this->resource->getConnection();
    $table = $this->resource->getTableName('cataloginventory_stock_item');
    $bind = [
        'product_id' => $prodId,
        /* we use one only stock in 'cataloginventory' structure by default */
        'stock_id' => 1,
        'qty' => $qty,
        'is_qty_decimal' => $isQtyDecimal,
        'is_in_stock' => $isInStock,
        /* default stock is bound to admin website (see `cataloginventory_stock`) */
        'website_id' => 0
    ];
    $conn->insert($table, $bind);
}

catalog inventory_stock_status

function createOldStatus($prodId, $qty)
{
    $isInStock = ($qty > 0);
    /** @var MagentoFrameworkAppResourceConnection $this->resource */
    /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
    $conn = $this->resource->getConnection();
    $table = $this->resource->getTableName('cataloginventory_stock_status');
    $bind = [
        'product_id' => $prodId,
        /* we use one only stock in 'cataloginventory' structure by default */
        'stock_id' => 1, 
        'qty' => $qty,
        'stock_status' => MagentoCatalogInventoryApiDataStockStatusInterface::STATUS_IN_STOCK,
        /* default stock is bound to admin website (see `cataloginventory_stock`) */
        'website_id' => 0 
    ];
    $conn->insert($table, $bind);
}

inventory_

Initially, the new structure for storing inventory data contains 1 "source"(inventory_source):

source_code|name          |enabled|description   |latitude|longitude|country_id|...|
-----------|--------------|-------|--------------|--------|---------|----------|...|
default    |Default Source|      1|Default Source|0.000000| 0.000000|US        |...|

and one "warehouse"(inventory_stock):

stock_id|name         |
--------|-------------|
       1|Default Stock|

«Source' represents the physical storage for the products (the entry contains the physical coordinates and mailing address). "Squad" is a logical union of several "sources" (inventory_source_stock_link)

link_id|stock_id|source_code|priority|
-------|--------|-----------|--------|
      1|       1|default    |       1|

at the level of which there is a link to the sales channel (inventory_stock_sales_channel)

type   |code|stock_id|
-------|----|--------|
website|base|       1|

Judging by the data structure, various types of sales channels are assumed, but by default only the relationship "stock"-"website» (the link to the website follows the code of the website — base).

One "warehouse» can be linked to several «sources"and one"source" - to several "warehouses” (many-to-many relationship). The exceptions are the defaultsource" and "warehouse". They do not rebind to other entities (restriction at the code level - the error “Cannot save link related to Default Source or Default Stock"). You can read more about the MSI structure in Magento 2 in the article "Warehouse management system using CQRS and Event Sourcing. Design«.

I will use the default configuration and add all inventory information to the source default, which is involved in the sales channel associated with the website with the code base (corresponds to the client part of the store - see store_website):

function createNewItem($sku, $qty)
{
    /** @var MagentoFrameworkAppResourceConnection $this->resource */
    /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
    $conn = $this->resource->getConnection();
    $table = $this->resource->getTableName('inventory_source_item');
    $bind = [
        'source_code' => 'default',
        'sku' => $sku,
        'quantity' => $qty,
        'status' => MagentoInventoryApiApiDataSourceItemInterface::STATUS_IN_STOCK
    ];
    $conn->insert($table, $bind);
}

After adding the inventory data to the product in the admin panel, the following picture is obtained:

Magento 2: importing products directly into the database

Media

When “manually” adding an image to a product through the admin panel, the relevant information is written in the following tables:

  • catalog_product_entity_media_gallery: media registry (images and video files);
  • catalog_product_entity_media_gallery_value: linking media to products and showcases (localization);
  • catalog_product_entity_media_gallery_value_to_entity: bind media only to products (presumably default media content for a product);
  • catalog_product_entity_varchar: the roles in which the image is used are saved here;

and the images themselves are saved in the directory ./pub/media/catalog/product/x/y/Where x и y - the first and second letters of the file name with the image. For example file image.png should be saved as ./pub/media/catalog/product/i/m/image.pngso that the platform can use it as an image when describing products from the catalog.

Register hosted in ./pub/media/catalog/product/ media file (the process of placing the file itself is not considered in this article):

function createMediaGallery($imgPathPrefixed)
{
    $attrId = /* get attribute ID by attribute code 'media_gallery' */
    /** @var MagentoFrameworkAppResourceConnection $this->resource */
    /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
    $conn = $this->resource->getConnection();
    $table = $this->resource->getTableName('catalog_product_entity_media_gallery');
    $bind = [
        'attribute_id' => $attrId,
        'value' => $imgPathPrefixed,
        /* 'image' or 'video' */
        'media_type' => 'image',
        'disabled' => false
    ];
    $conn->insert($table, $bind);
    $result = $conn->lastInsertId($table);
    return $result;
}

When registering a new media file, an identifier is assigned.

Associate the registered media file with the corresponding product for the default showcase:

function createGalleryValue($mediaId, $prodId)
{
    /** @var MagentoFrameworkAppResourceConnection $this->resource */
    /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
    $conn = $this->resource->getConnection();
    $table = $this->resource->getTableName('catalog_product_entity_media_gallery_value');
    $bind = [
        'value_id' => $mediaId,
        /* use admin store view by default */
        'store_id' => 0,
        'entity_id' => $prodId,
        'label' => null,
        /* we have one only image */
        'position' => 1,
        'disabled' => false
    ];
    $conn->insert($table, $bind);
}

We associate the registered media file with the corresponding product without being tied to any showcase. It is not clear where exactly this data is used and why it is impossible to access the data of the previous table, but this table exists and the data is written to it when an image is added to the product. Therefore, like this.

function createGalleryValueToEntity($mediaId, $prodId)
{
    /** @var MagentoFrameworkAppResourceConnection $this->resource */
    /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
    $conn = $this->resource->getConnection();
    $table = $this->resource->getTableName('catalog_product_entity_media_gallery_value_to_entity');
    $bind = [
        'value_id' => $mediaId,
        'entity_id' => $prodId
    ];
    $conn->insert($table, $bind);
}

catalog_product_entity_varchar

The media file can be used with different roles (in parentheses is the code of the corresponding attribute):

  • Base(image)
  • Small Image (small_image)
  • Thumbnail(thumbnail)
  • Swatch Image(swatch_image)

Binding roles to a media file just happens in catalog_product_entity_varchar. The binding code is the same as the code in the "Basic Product Attributes«.

After adding an image to the product in the admin panel, it looks like this:

Magento 2: importing products directly into the database

Categories

The main tables that contain data by category:

  • catalog_category_entity: register of categories;
  • catalog_category_product: linking products and categories;
  • catalog_category_entity_*: EAV attribute values;

Initially, in an empty Magento application, the category registry contains 2 categories (I shortened the column names: crtcreated_at, updupdated_at):

entity_id|attribute_set_id|parent_id|crt|upd|path|position|level|children_count|
---------|----------------|---------|---|---|----|--------|-----|--------------|
        1|               3|        0|...|...|1   |       0|    0|             1|
        2|               3|        1|...|...|1/2 |       1|    1|             0|

The category with id=1 is the root of the entire Magento directory and is not available either in the admin or on the front. Category with id=2 (default category) is the root category for the main store of the main site (Main Website Store) created when the application is deployed (see Admin / Stores / All Stores). Moreover, the root category of the store itself on the front is also inaccessible, only its subcategories.

Since the topic of this article is still importing product data, I will not use direct entry into the database when creating categories, but will use the classes provided by Magento itself (models and repositories). Direct entry into the database is used only to link the imported product with the category (the category is matched by its name, the category id is retrieved when matching):

function create($prodId, $catId)
{
    /** @var MagentoFrameworkAppResourceConnection $this->resource */
    /** @var MagentoFrameworkDBAdapterPdoMysql $conn */
    $conn = $this->resource->getConnection();
    $table = $this->resource->getTableName('catalog_category_product');
    $bind = [
        'category_id' => $catId,
        'product_id' => $prodId,
    ];
    $conn->insert($table, $bind);
}

After adding a product association with the categories "Category 1" and "Category 2", the product details in the admin panel look something like this:

Magento 2: importing products directly into the database

Additional Actions

After the data import is complete, you need to perform the following additional steps:

  • data indexing: call in console ./bin/magento indexer:reindex;
  • URL regeneration for products/categories: you can use the extension "elgentos/regenerate-catalog-urls«

Products in the admin panel after performing additional steps:

Magento 2: importing products directly into the database

and at the front:

Magento 2: importing products directly into the database

Summary

The same set of products (10 pieces) as in the previous article is imported at least an order of magnitude faster (1 second versus 10). For a more accurate assessment of the speed, you need a larger number of products - several hundred, and preferably thousands. However, even with such a small input size, it can be concluded that the use of the toolkit provided by Magento (models and repositories) is significant (I emphasize - much!) accelerate the development of the required functionality, but at the same time significantly (I emphasize - much!) reduce the speed at which data enters the database.

As a result, the water turned out to be wet and this is no revelation. However, now I have some code to play with and perhaps draw more interesting conclusions.

Source: habr.com