Release of the GCC 11 compiler suite

After a year of development, the free GCC 11.1 compiler suite has been released, the first significant release in the new GCC 11.x branch. Under the new release numbering scheme, version 11.0 was used during development, and shortly before the release of GCC 11.1 the GCC 12.0 branch was forked, from which the next significant release, GCC 12.1, will be formed.

GCC 11.1 is notable for its transition to the DWARF 5 debug format by default, the default use of the C++17 standard ("-std=gnu++17"), significant improvements in support for the C++20 standard, experimental support for C++23, improvements related to the future C language standard (C2x), and new performance optimizations.

Major changes:

  • The default mode for the C++ language has been switched to the C++17 standard (-std=gnu++17) instead of the previously used C++14. The new C++17 behavior for matching templates that take other templates as a parameter can be selectively disabled (-fno-new-ttp-matching).
  • Added support for hardware acceleration of the AddressSanitizer tool, which detects accesses to freed memory, accesses beyond the boundaries of an allocated buffer and some other classes of memory errors. Hardware acceleration is currently available only for the AArch64 architecture and is aimed at use when compiling the Linux kernel. The flag "-fsanitize=hwaddress" has been added to enable hardware-accelerated AddressSanitizer when building user-space components, and the flag "-fsanitize=kernel-hwaddress" for the kernel.
  • When generating debugging information, the DWARF 5 format is now used by default, which produces debugging data that is 25% more compact than with previous versions. Full support for DWARF 5 requires at least binutils version 2.35.2. The DWARF 5 format is supported in debugging tools since GDB 8.0, valgrind 3.17.0, elfutils 0.172 and dwz 0.14. To generate debugging information in other versions of DWARF, you can use the options "-gdwarf-2", "-gdwarf-3" and "-gdwarf-4".
  • The requirements for compilers that can be used to build GCC have been increased. The compiler must now support the C++11 standard (previously C++98 was required), i.e. if GCC 3.4 was enough to build GCC 10, then at least GCC 4.8 is now required to build GCC 11.
  • The name and location of files for saving dumps, temporary files and additional information necessary for LTO optimization have been changed. Such files are now always saved in the current directory unless the path is explicitly changed via the "-dumpbase", "-dumpdir" and "-save-temps=*" options.
  • Support for the binary format BRIG for use with the HSAIL (Heterogeneous System Architecture Intermediate Language) language has been deprecated and will soon be removed.
  • The capabilities of the ThreadSanitizer mode (-fsanitize=thread), designed to detect race conditions when the same data is shared between threads of a multi-threaded application, have been expanded. The new release adds support for alternative runtimes and environments, as well as support for the KCSAN (Kernel Concurrency Sanitizer) debugging tool, designed to dynamically detect race conditions within the Linux kernel. Added new options "--param tsan-distinguish-volatile" and "--param tsan-instrument-func-entry-exit".
  • Column numbers in diagnostic messages now reflect actual display columns rather than the byte count from the beginning of the line, taking into account multi-byte characters and characters that occupy several positions in the line (for example, the character 🙂 occupies two positions and is encoded in 4 bytes). Likewise, tab characters are now treated as a certain number of spaces (configurable via the -ftabstop option, default 8). The "-fdiagnostics-column-unit=byte" option restores the old behavior, and the "-fdiagnostics-column-origin=" option sets the initial value (numbering from 0 or 1).
  • The vectorizer now takes the entire contents of the function into account and can handle opportunities that cross merge points and back edges in the control-flow graph (CFG).
  • The optimizer can now convert a series of conditional operations that compare the same variable into a switch statement. The switch statement can later be encoded using bit-test instructions (the "-fbit-tests" option has been added to control this conversion).
  • Improved interprocedural optimizations. Added a new IPA-modref pass (-fipa-modref) to track side effects when calling functions and improve the accuracy of the analysis. Improved implementation of the IPA-ICF pass (-fipa-icf), which reduces memory consumption during compilation and increases the number of unified functions for which identical blocks of code are combined. In the IPA-CP (Interprocedural constant propagation) pass, the prediction heuristics have been improved, taking into account known boundaries and features of the loops.
  • In the Link-Time Optimization (LTO) implementation, the bytecode format has been optimized to reduce size and improve processing speed. Peak memory consumption during the linking phase has been reduced.
  • In the optimization mechanism based on the results of code profiling (PGO - Profile-guided optimization), which allows generating more optimal code based on analysis of execution features, the size of files with GCOV data is reduced due to more compact packaging of zero counters. Improved "-fprofile-values" mode by keeping track of more parameters on indirect calls.
  • Work has continued on the implementation of the OpenMP 5.0 (Open Multi-Processing) standard, which defines the API and methods for parallel programming on multi-core and hybrid (CPU+GPU/DSP) systems with shared memory and vectorization units (SIMD). Added initial support for the allocate clause and the ability to use non-rectangular loop nests in OpenMP constructs. Implemented support for the OMP_TARGET_OFFLOAD environment variable.
  • The implementation of the OpenACC 2.6 parallel programming specification, provided for the C, C++ and Fortran languages, has been improved. The specification defines tools for offloading operations to GPUs and specialized processors, such as NVIDIA PTX.
  • For the C family of languages, a new attribute "no_stack_protector" has been implemented, designed to mark functions for which stack protection ("-fstack-protector") should not be enabled. The "malloc" attribute has been extended to identify pairs of calls for allocating and freeing memory (allocator/deallocator). This is used in the static analyzer to identify typical memory errors (memory leaks, use after free, double calls to the free function, etc.) and in the compiler warnings "-Wmismatched-dealloc", "-Wmismatched-new-delete" and "-Wfree-nonheap-object", which report inconsistencies between memory allocation and deallocation operations.
  • New warnings have been added for the C language:
    • "-Wmismatched-dealloc" (enabled by default) - warns about memory deallocation operations that use a pointer that is not compatible with memory allocation functions.
    • "-Wsizeof-array-div" (enabled when "-Wall" is specified) - warns about dividing two sizeof operators if the divisor does not match the size of the array element.
    • "-Wstringop-overread" (enabled by default) - warns about calling a string function that reads data from an area outside the array boundary.
    • "-Wtsan" (enabled by default) - warns about using features (such as std::atomic_thread_fence) that are not supported in ThreadSanitizer.
    • "-Warray-parameter" and "-Wvla-parameter" (enabled when "-Wall" is specified) - warn about redeclaring functions with incompatible declarations of arguments associated with fixed- and variable-length arrays.
    • The "-Wuninitialized" warning now detects attempts to read from uninitialized dynamically allocated memory.
    • The "-Wfree-nonheap-object" warning now detects more cases where memory deallocation functions are called with a pointer not obtained through dynamic memory allocation functions.
    • The "-Wmaybe-uninitialized" warning now detects more cases where pointers that refer to uninitialized memory locations are passed to functions.
  • For the C language, a portion of the new features developed within the framework of the C2X standard has been implemented (enabled by specifying -std=c2x or -std=gnu2x): macros BOOL_MAX and BOOL_WIDTH, optional names for unused parameters in function definitions (as in C++), the attribute "[[nodiscard]]", the preprocessor operator "__has_c_attribute", macros FLT_IS_IEC_60559, DBL_IS_IEC_60559, LDBL_IS_IEC_60559, __STDC_WANT_IEC_60559_EXT__, INFINITY, NAN, FLT_SNAN, DBL_SNAN, LDBL_SNAN, DEC_INFINITY and DEC_NAN, NaN macros for _FloatN, _FloatNx and _DecimalN, and the ability to place labels before declarations and at the end of compound statements.
  • For C++, a portion of the changes and innovations proposed in the C++20 standard has been implemented, including "consteval virtual" functions, pseudo-destructors that end the lifetime of objects, "using enum" declarations and deduction of the array size in "new" expressions.
  • For C++, experimental support has been added for some improvements being developed for the future C++23 standard (-std=c++23, -std=gnu++23, -std=c++2b, -std=gnu++2b). For example, there is now support for the literal suffixes "uz" for size_t values and "z" for values of the corresponding signed type.
  • libstdc++ has improved support for the C++17 standard, including the introduction of std::from_chars and std::to_chars implementations for floating point types. New elements of the C++20 standard have been implemented, including std::bit_cast, std::source_location and the atomic operations wait and notify, as well as elements of the future C++23 standard (std::to_underlying, std::is_scoped_enum). Added experimental support for types for parallel data processing (SIMD, Data-Parallel Types). The implementation of std::uniform_int_distribution has been accelerated.
  • Removed the alpha quality flag from libgccjit, a shared library for embedding a code generator into other processes and using it to organize JIT compilation of bytecode into machine code. Added the ability to build libgccjit for MinGW.
  • Added support for the AArch64 Armv8-R architecture (-march=armv8-r). For AArch64 and ARM architectures, support for the following processors has been added (parameters -mcpu and -mtune): Arm Cortex-A78 (cortex-a78), Arm Cortex-A78AE (cortex-a78ae), Arm Cortex-A78C (cortex-a78c), Arm Cortex-X1 (cortex-x1), Arm Neoverse V1 (neoverse-v1) and Arm Neoverse N2 (neoverse-n2). Fujitsu A64FX (a64fx) and Arm Cortex-R82 (cortex-r82) CPUs, which support only the AArch64 architecture, have also been added.
  • Added support for using Armv8.3-a (AArch64/AArch32), SVE (AArch64), SVE2 (AArch64) and MVE (AArch32 M-profile) SIMD instructions to autovectorize operations performing addition, subtraction, multiplication and variants of addition/subtraction over complex numbers. Added initial support for autovectorization for ARM using the MVE instruction set.
  • For ARM platforms, a full set of compiler built-in C functions (intrinsics) that map to extended vector instructions (SIMD) is provided, covering all NEON instructions documented in the ACLE Q3 2020 specification.
  • Support for gfx908 GPU has been added to the backend for generating code for AMD GPUs based on the GCN microarchitecture.
  • Added support for new processors and new instruction set extensions implemented in them:
    • Intel Sapphire Rapids (-march=sapphirerapids, enables support for the MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, ENQCMD, CLDEMOTE, SERIALIZE, PTWRITE, WAITPKG, TSXLDTRK, AMX-TILE, AMX-INT8, AMX-BF16 and AVX-VNNI instructions).
    • Intel Alderlake (-march=alderlake, enables support for CLDEMOTE, PTWRITE, WAITPKG, SERIALIZE, KEYLOCKER, AVX-VNNI and HRESET instructions).
    • Intel Rocketlake (-march=rocketlake, similar to Ice Lake client without SGX support).
    • AMD Zen 3 (-march=znver3).
  • For IA-32/x86-64 systems based on Intel processors, support for the new processor instructions TSXLDTRK, SERIALIZE, HRESET, UINTR, KEYLOCKER, AMX-TILE, AMX-INT8, AMX-BF16 and AVX-VNNI has been added.
  • Added support for "-march=x86-64-v[234]" flags to select x86-64 architecture levels (v2 - covers SSE4.2, SSSE3, POPCNT and CMPXCHG16B extensions; v3 - AVX2 and MOVBE; v4 - AVX-512).
  • Added support for RISC-V systems with big-endian byte order. Added the "-misa-spec=*" option to select the version of the RISC-V instruction set architecture specification. Added support for AddressSanitizer and for stack protection using canaries.
  • Continued improvement of the “-fanalyzer” static analysis mode, which performs resource-intensive interprocedural analysis of code execution paths and data flows in the program. The mode is capable of detecting problems at the compilation stage, such as double calls to the free() function for one memory area, file descriptor leaks, dereferencing and passing null pointers, accessing freed memory blocks, using uninitialized values, etc. In the new version:
    • The code for tracking the program state has been completely rewritten. Problems with scanning very large C files have been resolved.
    • Added initial C++ support.
    • Memory allocation and deallocation analysis has been abstracted from the specific malloc and free functions, and now supports new/delete and new[]/delete[].
    • Added new warnings: -Wanalyzer-shift-count-negative, -Wanalyzer-shift-count-overflow, -Wanalyzer-write-to-const and -Wanalyzer-write-to-string-literal.
    • Added new debugging options -fdump-analyzer-json and -fno-analyzer-feasibility.
    • The ability to extend the analyzer through GCC plugins has been implemented (for example, a plugin has been prepared to check for incorrect use of the global interpreter lock (GIL) in CPython).
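
The difference between byte-based and display-based column numbers in diagnostics can be illustrated with a small sketch. The helpers display_column and byte_column below are hypothetical and are not part of GCC; the sketch assumes UTF-8 input and, as a simplification, counts every code point as one column, so double-width characters such as 🙂 are not modelled.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not a GCC API): converts a 0-based byte offset within
// a UTF-8 line into a 1-based column number.
// Simplification: every code point counts as one column, so characters that
// occupy two positions are not handled here.
std::size_t display_column(std::string_view line, std::size_t byte_offset,
                           std::size_t tabstop = 8) {
    std::size_t col = 0;  // columns consumed before the target position
    for (std::size_t i = 0; i < byte_offset && i < line.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(line[i]);
        if (b == '\t') {
            col = (col / tabstop + 1) * tabstop;  // advance to the next tab stop
        } else if ((b & 0xC0) != 0x80) {
            ++col;  // count lead bytes only, skip UTF-8 continuation bytes
        }
    }
    return col + 1;
}

// Byte-based column: simply the byte offset, numbered from 1.
std::size_t byte_column(std::size_t byte_offset) {
    return byte_offset + 1;
}
```

For a line that starts with a tab, the character after the tab sits at byte column 2 but at display column 9 with the default tab stop of 8.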
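
The conversion of comparison chains into a switch, mentioned in the list above, applies to code shaped like the first function below. The function names are hypothetical, and the second function is only a hand-written equivalent of the form the optimizer may derive; whether GCC actually performs the transformation for a given chain depends on its internal heuristics and the optimization level.

```cpp
// Hypothetical example: a chain of conditions that all compare the same
// variable against constants.
int classify_if(int c) {
    if (c == ' ')
        return 1;
    else if (c == '\t')
        return 1;
    else if (c == '\n')
        return 2;
    else if (c == '\r')
        return 2;
    else
        return 0;
}

// Hand-written switch with the same behavior. A switch over a small, dense
// set of constants is the kind of construct that can later be lowered to
// bit tests or a jump table.
int classify_switch(int c) {
    switch (c) {
        case ' ':
        case '\t':
            return 1;
        case '\n':
        case '\r':
            return 2;
        default:
            return 0;
    }
}
```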
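
The extended "malloc" attribute described above pairs an allocator with its deallocator. A minimal sketch, assuming GCC 11 or later: counters_create, counters_destroy and the DEALLOCATED_BY macro are hypothetical names introduced for illustration, and the macro expands to nothing on other compilers.

```cpp
#include <cstddef>
#include <cstdlib>

// The attribute form with arguments is assumed to be available only in
// GCC 11 and later; elsewhere the macro is empty and the code still builds.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
#define DEALLOCATED_BY(fn) __attribute__((malloc(fn, 1)))
#else
#define DEALLOCATED_BY(fn)
#endif

// The deallocator must be declared before it is named in the attribute.
void counters_destroy(int* counters);

// Declares that pointers returned by counters_create are to be released by
// passing them as the first argument of counters_destroy.
DEALLOCATED_BY(counters_destroy)
int* counters_create(std::size_t count);

int* counters_create(std::size_t count) {
    // Zero-initialized array of counters; may return nullptr on failure.
    return static_cast<int*>(std::calloc(count, sizeof(int)));
}

void counters_destroy(int* counters) {
    std::free(counters);
}
```

With such a declaration in place, releasing the result of counters_create through a different function is the kind of mismatch that "-Wmismatched-dealloc" is meant to report.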
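
The floating-point overloads of std::from_chars and std::to_chars added to libstdc++ can be used roughly as follows. This is a sketch that assumes a libstdc++ version providing these overloads (GCC 11 or later); parse_double and format_double are hypothetical wrapper names.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical wrapper: parses the whole string as a double, without
// depending on the current locale. Returns nullopt on any error or if
// trailing characters remain.
std::optional<double> parse_double(std::string_view text) {
    double value = 0.0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }
    return value;
}

// Hypothetical wrapper: formats a double using the shortest representation
// that converts back to the same value.
std::string format_double(double value) {
    char buf[64];
    auto [ptr, ec] = std::to_chars(buf, buf + sizeof buf, value);
    if (ec != std::errc{}) {
        return {};
    }
    return std::string(buf, ptr);
}
```

Because std::to_chars without a format argument produces a round-trippable result, parsing the output of format_double with parse_double yields the original value.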

Source: opennet.ru
