Differences and Similarities Between AVX-512 and AVX10

Intel has recently announced two new instruction set extensions: Advanced Performance Extensions (APX) and AVX10. AVX10 will bring unified support for AVX-512 capabilities to both P-cores and E-cores for the first time. This should help Intel overcome a significant problem with the x86 hybrid architecture found in its Alder Lake and Raptor Lake processors, where AVX-512 had to be disabled because the E-cores do not support it.

Here’s a comparison table that highlights the key differences and similarities:

AVX-512 vs AVX10

Feature	AVX-512	AVX10
Support for 512-bit instructions	Yes	Yes, but only on P-cores
Support for 256-bit instructions	Yes	Yes, on both P-cores and E-cores
ISA Superset	N/A	AVX10 is a superset of AVX-512
Vector Register Sizes	512-bit, plus 128-bit and 256-bit with AVX512VL	512-bit for P-cores and 256-bit for E-cores
Support on Current-gen CPUs	Yes	No, only on future chips
Enumeration Methods	Original AVX-512 methods	Simplified methods compared to AVX-512
Future Support	Will be frozen when AVX10 debuts	Will be Intel’s vector ISA of choice moving forward
Additional Features	N/A	New AI data types and conversions, data movement optimizations, and standards support (in AVX10.2)

Note: This table simplifies some of the complexities and nuances of these instruction set architectures.

AVX10: The Future of Intel’s Vector ISA

AVX10 (Advanced Vector Extensions 10) will not be supported on Intel’s current-gen CPUs but is expected to arrive in future chips. Intel has stated that AVX10 will be its vector ISA of choice moving forward for both consumer and server processors.

At its most basic level, AVX10 will allow Intel’s chips that have both E-cores and P-cores to support AVX-512 capabilities across the full chip. The 512-bit instructions can only run on P-cores, while the converged 256-bit AVX10 instructions can run on either P-cores or E-cores.

AVX10: A Superset of AVX-512

The AVX10 ISA is a superset of AVX-512 and includes all of the features of the AVX-512 ISA, for processors with either 256-bit or 512-bit vector register sizes. The converged AVX10 ISA will include “AVX-512 vector instructions with an AVX512VL feature flag, a maximum vector register length of 256 bits, as well as eight 32-bit mask registers and new versions of 256-bit instructions supporting embedded rounding.” This version will run on both P-cores and E-cores.

However, the E-cores will be limited to the converged AVX10’s maximum 256-bit vector length, while P-cores can use 512-bit vectors. This is similar to Arm’s support for variable vector widths with SVE.
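To make the width difference concrete, here is a minimal sketch of the arithmetic involved. The function names are made up for illustration and are not part of any Intel tooling. It shows how many elements fit in a 256-bit versus a 512-bit register, and how many vector operations a loop over a given number of elements would need. Since a mask register holds one bit per lane, the byte-lane count also shows why a 32-bit mask is enough for 256-bit vectors.

```python
def lanes_per_register(vector_bits: int, element_bits: int) -> int:
    """Number of elements of the given width that fit in one vector register."""
    if vector_bits % element_bits != 0:
        raise ValueError("element width must divide the vector width")
    return vector_bits // element_bits


def vector_ops_needed(n_elements: int, vector_bits: int, element_bits: int) -> int:
    """Vector operations needed to touch n_elements once (ceiling division)."""
    lanes = lanes_per_register(vector_bits, element_bits)
    return -(-n_elements // lanes)


if __name__ == "__main__":
    for bits in (256, 512):
        print(
            f"{bits}-bit:",
            "float32 lanes =", lanes_per_register(bits, 32),
            "| byte lanes (mask bits) =", lanes_per_register(bits, 8),
            "| ops for 1000 floats =", vector_ops_needed(1000, bits, 32),
        )
```

A 512-bit register holds twice as many lanes, so the same loop needs roughly half as many vector operations. This is only an instruction count and says nothing about real-world throughput, which also depends on clock speeds and memory bandwidth.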

AVX10 and Performance

Intel claims that existing AVX-512 applications will perform the same with AVX10 as they did with AVX-512, at least at the same vector lengths. AVX2-compiled applications, when recompiled for AVX10, should realize performance gains without the need for additional software tuning. AVX2 applications sensitive to vector register pressure should gain the most, thanks to the 16 additional vector registers (32 instead of 16) and the new instructions.

Transition from AVX-512 to AVX10

Intel will support AVX10 version 1 (AVX10.1) beginning with its sixth-gen Xeon “Granite Rapids” chips, but that generation will only support 512-bit vector instructions, not the new converged 256-bit vector instructions. This first generation will instead serve as the transition from AVX-512 to AVX10.

Chips arriving after Granite Rapids will support AVX10.2, which adds support for the converged 256-bit vector lengths and other new features, like new AI data types and conversions, data movement optimizations, and standards support. All future Xeon processors will continue fully supporting all AVX-512 instructions to ensure that legacy apps function normally.

Simplifying AVX10 Enumeration Methods

To address developer feedback, Intel plans to significantly simplify its AVX10 enumeration methods compared to AVX-512, which relies on a long list of separate feature flags. Intel also intends for each new AVX10 revision to bring enough new instructions and capabilities to merit a version change, thus reducing version and enumeration bloat. Intel will freeze the AVX-512 ISA when AVX10 debuts, and all future use of AVX-512 instructions will occur through the AVX10 ISA.
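As a rough illustration of what version-based enumeration could look like to software, the sketch below decodes two raw CPUID register values into a version number and a list of supported vector lengths. The bit positions are assumptions based on one reading of Intel’s AVX10 architecture specification and should be checked against the current revision: leaf 07H sub-leaf 1 EDX bit 19 for AVX10 presence, and leaf 24H sub-leaf 0 EBX bits 7:0 for the version with bits 16 to 18 for the vector lengths. The function names and the sample register values are hypothetical. Python’s standard library cannot execute CPUID, so the values are passed in as plain integers.

```python
# Assumed bit layout; verify against Intel's AVX10 specification before relying on it.
AVX10_FEATURE_BIT = 19                      # CPUID.(EAX=07H, ECX=01H):EDX bit 19
VERSION_MASK = 0xFF                         # CPUID.(EAX=24H, ECX=00H):EBX bits 7:0
LENGTH_BITS = {128: 16, 256: 17, 512: 18}   # EBX bit position per vector length


def decode_avx10(leaf7_sub1_edx: int, leaf24_sub0_ebx: int) -> dict:
    """Decode raw CPUID register values into an AVX10 capability summary."""
    if not (leaf7_sub1_edx >> AVX10_FEATURE_BIT) & 1:
        return {"supported": False, "version": 0, "vector_lengths": []}
    version = leaf24_sub0_ebx & VERSION_MASK
    lengths = [
        bits
        for bits, position in sorted(LENGTH_BITS.items())
        if (leaf24_sub0_ebx >> position) & 1
    ]
    return {"supported": True, "version": version, "vector_lengths": lengths}


def can_run(info: dict, min_version: int, vector_bits: int) -> bool:
    """True if code built for AVX10.<min_version> at vector_bits can run."""
    return (
        info["supported"]
        and info["version"] >= min_version
        and vector_bits in info["vector_lengths"]
    )


if __name__ == "__main__":
    # Synthetic register values for a hypothetical 256-bit-only AVX10.2 core.
    edx = 1 << 19
    ebx = 2 | (1 << 16) | (1 << 17)
    info = decode_avx10(edx, ebx)
    print(info, can_run(info, 2, 256), can_run(info, 2, 512))
```

The point of the design is that a capability check reduces to one integer comparison plus a vector-length lookup, instead of testing a separate flag for every AVX-512 subset.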

Introduction of APX

In addition to AVX10, Intel also announced the new APX (Advanced Performance Extensions). APX doubles the number of general-purpose registers from 16 to 32, so compilers can keep more values in registers instead of spilling them to memory. Intel claims APX-compiled code contains 10% fewer loads and 20% fewer stores than the same code compiled for an Intel 64 baseline. Register accesses are both faster and consume significantly less dynamic power than complex load and store operations. The new registers are XSAVE-enabled, and their state reuses the 128-byte area that was left unused when Intel abandoned MPX back in 2019.

Conclusion

The introduction of AVX10 and APX marks a significant step in Intel’s evolution, with the company aiming to overcome the challenges it faced with its new x86 hybrid architecture. The new ISAs promise improved performance and efficiency, and their arrival in future chips is eagerly anticipated.

Affiliate Disclosure: Faceofit.com is a participant in the Amazon Services LLC Associates Program. As an Amazon Associate we earn from qualifying purchases.