BLAKE3 1.0 Cryptographic Hash Reference Implementation Released

The reference implementation of the BLAKE3 1.0 cryptographic hash function has been released. BLAKE3 is notable for very high hashing performance while providing security at the level of SHA-3. In a hash generation test on a 16 KB file, BLAKE3 with a 256-bit output outperforms SHA3-256 by 17 times, SHA-256 by 14 times, SHA-512 by 9 times, SHA-1 by 6 times, and BLAKE2b by 5 times. A significant gap remains when processing very large amounts of data: for example, BLAKE3 turned out to be 8 times faster than SHA-256 when calculating a hash for 1 GB of random data. The reference implementation is available in C and Rust and is dual-licensed under CC0 (public domain dedication) and Apache 2.0.

The hash function is designed for applications such as file integrity checking, message authentication, and generating data for cryptographic digital signatures. BLAKE3 is not intended for password hashing, since it is designed to compute hashes as quickly as possible (for passwords, deliberately slow functions such as yescrypt, bcrypt, scrypt, or Argon2 are recommended). The hash function is insensitive to the size of the hashed data and is resistant to collision and preimage attacks.

The algorithm was developed by well-known cryptographers (Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, Zooko Wilcox-O'Hearn). It continues the development of the BLAKE2 algorithm and uses the Bao mechanism to encode its tree of chained blocks. Unlike BLAKE2 (BLAKE2b, BLAKE2s), BLAKE3 offers a single algorithm for all platforms that is not tied to a particular word size or hash size.

The performance improvement was achieved by reducing the number of rounds from 10 to 7 and by hashing the input independently in 1 KB chunks. According to the creators, they found a convincing mathematical argument that 7 rounds are sufficient instead of 10 while maintaining the same level of security (as an analogy, consider blending fruit: after 7 seconds the fruit is already fully blended, and another 3 seconds will not change the consistency of the mixture). At the same time, some researchers express doubts: even if 7 rounds are currently enough to resist all known attacks on hashes, the additional 3 rounds may prove useful if new attacks are discovered in the future.

As for the division into chunks, in BLAKE3 the stream is split into pieces of 1 KB each, and each piece is hashed independently. The hashes of the pieces are then combined in a binary Merkle tree to form one final hash. This separation solves the problem of parallelizing data processing when calculating a hash: for example, 4-lane SIMD instructions can be used to calculate the hashes of 4 chunks simultaneously. Traditional SHA-* hash functions process data sequentially.
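
The chunk-and-combine idea described above can be sketched in a few lines of Python. This is an illustration only, under loud assumptions: it uses BLAKE2s from the standard library as a stand-in for the BLAKE3 compression function, the helper names (`leaf_hash`, `parent_hash`, `tree_hash`) are hypothetical, and the tree layout and domain-separation tags are simplified. It does not produce real BLAKE3 digests.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024  # 1 KB pieces, as described in the article


def leaf_hash(index: int, chunk: bytes) -> bytes:
    # Stand-in for BLAKE3 chunk hashing (assumption): BLAKE2s over a leaf tag,
    # the chunk index, and the chunk data. Each leaf depends only on its own
    # chunk, so leaves can be computed in any order or in parallel.
    return hashlib.blake2s(b"\x00" + index.to_bytes(8, "little") + chunk).digest()


def parent_hash(left: bytes, right: bytes) -> bytes:
    # A parent node hashes the concatenation of its two children, using a
    # different tag so a parent can never be confused with a leaf.
    return hashlib.blake2s(b"\x01" + left + right).digest()


def tree_hash(data: bytes) -> bytes:
    # Split the input into 1 KB chunks (a single empty chunk for empty input).
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)] or [b""]

    # Hash all chunks independently; a thread pool stands in for SIMD lanes.
    with ThreadPoolExecutor() as pool:
        level = list(pool.map(leaf_hash, range(len(chunks)), chunks))

    # Combine pairwise, level by level, until a single root remains.
    # An unpaired node at the end of a level is carried up unchanged.
    while len(level) > 1:
        nxt = [parent_hash(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

Calling `tree_hash(data)` returns a 32-byte digest. The point of the sketch is that the leaf hashes have no data dependency on each other, which is what a sequential construction such as SHA-2 cannot offer.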

BLAKE3 Features:

  • High performance: BLAKE3 is significantly faster than MD5, SHA-1, SHA-2, SHA-3, and BLAKE2.
  • Security, including resistance to the length extension attack that SHA-2 is susceptible to.
  • Rust variants are available that are optimized for SSE2, SSE4.1, AVX2, AVX-512, and NEON instructions.
  • Parallelization of calculations across any number of threads and SIMD lanes.
  • Support for incremental updates and verified streaming.
  • Usable as a PRF, MAC, KDF, and XOF, as well as a regular hash.
  • Single algorithm for all architectures, fast on both x86-64 and 32-bit ARM processors.

The main differences between BLAKE3 and BLAKE2:

  • Using a binary tree structure to achieve unlimited parallelism in hash calculation.
  • Reducing the number of rounds from 10 to 7.
  • Three modes of operation: hashing, keyed hashing (usable in place of HMAC), and key derivation (KDF).
  • No additional overhead for keyed hashing, because the key occupies the space previously used by the parameter block.
  • Built-in operation as an extendable output function (XOF), with output that can be parallelized and seeked.
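
To make the keyed-hashing and XOF modes more concrete, here is a minimal sketch using only the Python standard library. Since BLAKE3 is not part of `hashlib`, the sketch substitutes analogous primitives: keyed BLAKE2b for the keyed-hash (MAC) mode and SHAKE-256 for the extendable-output mode. The function names `mac`, `verify`, and `xof` are hypothetical, and the outputs are not BLAKE3 values; only the shape of the interfaces is being illustrated.

```python
import hashlib
import hmac


def mac(key: bytes, message: bytes) -> bytes:
    # Keyed hashing: the key is an input to the hash itself, so no HMAC
    # wrapper is needed. Keyed BLAKE2b is used here as a stand-in (assumption).
    return hashlib.blake2b(message, key=key, digest_size=32).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    # Compare tags in constant time to avoid leaking where they differ.
    return hmac.compare_digest(mac(key, message), tag)


def xof(message: bytes, length: int) -> bytes:
    # Extendable output: the caller chooses how many bytes to read.
    # SHAKE-256 is used here as a stand-in XOF (assumption).
    return hashlib.shake_256(message).digest(length)
```

With an XOF, a shorter output is a prefix of a longer one for the same input, so `xof(msg, 32)` equals the first 32 bytes of `xof(msg, 64)`. This is why truncated XOF outputs should not be treated as independent values.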

Source: opennet.ru
