SPO-Project concepts and tools (Part 2)

In this blog post I will discuss auto-vectorization, SIMD, SVE and SVE2. These are core concepts of software optimization.

Auto-vectorization:

In the early days, computers had a single logic unit that could execute one instruction on one pair of operands at a time. For this reason, programs and programming languages were built around sequential execution. Modern processors, however, can perform many operations at once. Many optimizing compilers perform automatic vectorization: where it is safe to do so, they turn sequential operations, such as the iterations of a loop, into parallel ones. The resulting vector implementation applies one operation to multiple pairs of operands at a time.

On AArch64 systems, auto-vectorization can target three sets of vector instructions: SIMD (Advanced SIMD, also known as Neon), SVE and SVE2.

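For auto-vectorization to take effect, the right compiler flags must be used. In GCC these include -O3, which turns on -ftree-vectorize along with other optimizations; -ftree-vectorize can also be given explicitly.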
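As a rough illustration of the kind of code the compiler can vectorize, here is a minimal sketch of a simple loop. The function name add_arrays and the use of restrict are my own choices for the example, not something taken from the course material.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds two arrays element by element.
 * 'restrict' promises the compiler that the arrays do not overlap,
 * which makes it easier for it to vectorize the loop. */
void add_arrays(int32_t *restrict dst,
                const int32_t *restrict a,
                const int32_t *restrict b,
                size_t n)
{
    /* Every iteration is independent of the others, so several
     * elements can be processed by one vector instruction. */
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

If this is saved as vec_add.c (a hypothetical file name), it could be compiled with gcc -O3 -fopt-info-vec -c vec_add.c. The -fopt-info-vec option asks GCC to report which loops it vectorized.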

For a more detailed explanation with code examples, you can follow this link

SIMD:

SIMD is one type of extension used for auto-vectorization. It stands for Single Instruction, Multiple Data. As the name suggests, it processes multiple data elements with a single instruction, instead of the conventional sequential approach where one element is processed at a time. SIMD operations cannot be used when the data elements need to be processed in different ways, because the same operation is applied to every element.
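To make the idea concrete, here is a small sketch that uses the vector extension supported by GCC and Clang instead of relying on auto-vectorization. The type name v4i32 and the function name simd_add4 are made up for this example. On AArch64 the addition would typically become a single Neon instruction, although that depends on the compiler and the flags.

```c
#include <stdint.h>
#include <string.h>

/* GCC/Clang vector extension: a 16-byte vector holding four
 * 32-bit integers (the type name is an assumption for this sketch). */
typedef int32_t v4i32 __attribute__((vector_size(16)));

/* Adds four pairs of integers with one vector operation. */
void simd_add4(int32_t dst[4], const int32_t a[4], const int32_t b[4])
{
    v4i32 va, vb, vsum;

    /* Copy the inputs into vector variables. */
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);

    /* One '+' in the source, applied to all four lanes at once. */
    vsum = va + vb;

    memcpy(dst, &vsum, sizeof vsum);
}
```

Note that all four lanes receive the same operation. If each element needed a different operation, this approach would not apply.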

We can build a program for an Armv8 system using the following command (followed by the name of the source file):

gcc -g -O3 -c -march=armv8-a

For a more detailed explanation with examples of the operations, you can follow this link

SVE and SVE2:

SVE stands for Scalable Vector Extension. SVE2 is its Armv9 successor, which at the time of writing is practically not available on any system. SVE is a newer SIMD instruction set, added as an extension to AArch64, that allows a flexible vector length: each hardware implementation chooses its own vector length, from 128 to 2048 bits in multiples of 128 bits, and the same code runs on all of them. SVE2 can be seen as combining the capabilities of SVE and Neon, so it covers more functional domains in terms of data-level parallelism.

The main difference between SVE and SVE2 is the functional coverage of the instruction set. While SVE was designed for HPC and ML applications, SVE2 extends data processing to domains beyond these applications.
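The flexible vector length is easier to see in code. Below is a minimal sketch of a vector-length-agnostic loop written with the SVE intrinsics from arm_sve.h. The function name sve_add_arrays is my own, and the plain C fallback is only there so that the file still builds when the compiler is not targeting SVE.

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Adds two arrays of n 32-bit integers. */
void sve_add_arrays(int32_t *dst, const int32_t *a, const int32_t *b,
                    int64_t n)
{
#if defined(__ARM_FEATURE_SVE)
    /* svcntw() returns how many 32-bit lanes fit in one vector on
     * the machine running the code, so no length is hard-coded. */
    for (int64_t i = 0; i < n; i += (int64_t)svcntw()) {
        /* The predicate enables only the lanes where i + lane < n,
         * which handles the tail without a separate scalar loop. */
        svbool_t pg = svwhilelt_b32(i, n);
        svint32_t va = svld1_s32(pg, a + i);
        svint32_t vb = svld1_s32(pg, b + i);
        svst1_s32(pg, dst + i, svadd_s32_x(pg, va, vb));
    }
#else
    /* Scalar fallback for builds without SVE support. */
    for (int64_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
#endif
}
```

The same binary should work whether the hardware uses 128-bit vectors or longer ones, because the loop step and the predicate are both worked out at run time.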

To use the SVE capability, the following command should be used:

gcc -g -O3 -c -march=armv8-a+sve

To use the SVE2 capability, the following command should be used:

gcc -g -O3 -c -march=armv8-a+sve2

One interesting thing to note is that on older systems that do not support SVE or SVE2, we can use an emulator to run only the programs that require that capability. The emulator is started with the following command, followed by the path to the compiled program:

qemu-aarch64

For further detailed learning you can follow this link
