OpenCL 3.1

On May 5, the Khronos consortium released the OpenCL 3.1 specification, the latest update to the open standard for cross-platform computing on CPUs, GPUs, DSPs, NPUs, and other accelerators. The release is timed to coincide with the IWOCL 2026 conference and continues the OpenCL 3.x model, in which features are first tested as extensions and then migrated into the standard's mandatory core.

The main change in OpenCL 3.1 is that all conformant implementations must support loading compute kernels in SPIR-V format. SPIR-V serves as a portable intermediate representation (IR) that can be generated, among other ways, with LLVM/Clang and the SPIR-V LLVM Translator. This should simplify the use of OpenCL as a backend for SYCL, chipStar, and specialized compilers, and it also allows kernels to be distributed as precompiled IR rather than as source code.
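Shipping kernels as precompiled IR means the host application reads a binary file and hands it to the runtime. As a minimal sketch of a sanity check an application might run first, the helper below tests whether a buffer looks like a SPIR-V module. The function name `looks_like_spirv` is hypothetical and is not part of OpenCL; the check covers only the buffer size and the SPIR-V magic number in host byte order, and it does not validate the module.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SPIR-V magic number: the first 32-bit word of every SPIR-V module. */
#define SPIRV_MAGIC 0x07230203u

/* Hypothetical helper (not an OpenCL API): returns 1 if the buffer plausibly
 * holds a SPIR-V module in host byte order, 0 otherwise.
 * It checks only the size and the magic word, nothing else. */
static int looks_like_spirv(const void *data, size_t size_bytes)
{
    uint32_t first_word;

    /* A SPIR-V header is five 32-bit words, and a module is a whole
     * number of 32-bit words. */
    if (data == NULL ||
        size_bytes < 5 * sizeof(uint32_t) ||
        size_bytes % sizeof(uint32_t) != 0) {
        return 0;
    }

    /* memcpy avoids alignment assumptions about the caller's buffer. */
    memcpy(&first_word, data, sizeof first_word);
    return first_word == SPIRV_MAGIC;
}
```

A buffer that passes this check would then typically be passed to `clCreateProgramWithIL`, the host API that has accepted IL modules since OpenCL 2.1, followed by `clBuildProgram` as for source programs.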

The OpenCL 3.1 core also brings features important for AI and HPC workloads: subgroups with shuffle/rotate operations and an extended set of types, integer dot products with saturation and accumulation options, new bitwise operations, a recommended local workgroup size query, and a standard device UUID query consistent with Vulkan behavior.
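To illustrate what an integer dot product with accumulation and saturation computes, here is a host-side reference sketch in plain C. It is modeled on the packed 4x8-bit signed variant found in the cl_khr_integer_dot_product extension, but the function name `ref_dot_acc_sat_4x8_ss` is hypothetical, and the article itself does not spell out the exact built-ins or their semantics in OpenCL 3.1.

```c
#include <stdint.h>

/* Extract byte number `lane` (0..3) of a packed word as a signed 8-bit value. */
static int32_t signed_lane(uint32_t packed, int lane)
{
    int32_t v = (int32_t)((packed >> (8 * lane)) & 0xFFu);
    return v >= 128 ? v - 256 : v;
}

/* Hypothetical reference: treats a and b as four packed signed 8-bit lanes,
 * computes their dot product, adds acc, and saturates to the int32 range. */
static int32_t ref_dot_acc_sat_4x8_ss(uint32_t a, uint32_t b, int32_t acc)
{
    int64_t sum = acc; /* 64-bit accumulator so the sum cannot overflow */

    for (int lane = 0; lane < 4; ++lane) {
        sum += (int64_t)signed_lane(a, lane) * signed_lane(b, lane);
    }

    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
```

Packing four 8-bit values into one 32-bit word is what makes this operation attractive for quantized inference: one instruction performs four multiplies and the accumulation, and saturation keeps an overflowing sum from wrapping around.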

Other changes include new language features that no longer require enabling extensions, an improved printf in OpenCL C with support for the z and t length modifiers, clarified semantics for CL_DEVICE_HOST_UNIFIED_MEMORY, the ability to pass a zero size for local memory arguments, and simplified synchronization when an event is found to be in the CL_COMPLETE state.
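For readers unfamiliar with the z and t length modifiers, the sketch below shows their meaning in standard C99 on the host: z applies to `size_t` and t applies to `ptrdiff_t`. It assumes that OpenCL C 3.1 follows the C99 meaning, which the article does not state explicitly, and the helper name `format_index_and_offset` is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a size_t index and a ptrdiff_t offset.
 * %zu prints a size_t, %td prints a ptrdiff_t (both from C99).
 * Returns the number of characters snprintf would have written. */
static int format_index_and_offset(char *out, size_t out_size,
                                   size_t index, ptrdiff_t offset)
{
    return snprintf(out, out_size, "index=%zu offset=%td", index, offset);
}
```

In a kernel, the practical benefit is being able to print values such as the result of `get_global_id` without casting them to a fixed-width type first.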

Work on OpenCL 3.1 implementations is already underway at Arm, Imagination, Intel, and Qualcomm. Among the open source implementations, Khronos specifically mentions rusticl (part of Mesa), PoCL, and CLVK. Compatibility layers that run OpenCL on top of Vulkan and DirectX 12 also continue to develop, which should expand OpenCL availability on systems without native drivers.

Khronos' next areas of development include command buffers for low-level command replay, improvements to unified shared memory, cooperative matrix operations, new AI data types such as low-precision formats, and improvements to external memory and interoperability with Vulkan, DirectX 12, and media pipelines.

Source: linux.org.ru