Andy Thomason is a code performance specialist with many years of experience in graphics, data and the game industry.
Autovectorisation enables us to write portable loops that use SIMD instructions without using intrinsics or inline assembly. GPUs do this when they compile kernels, but to get the right preconditions for autovectorisation in C++ we have to jump through a few hoops.
In particular, we will talk about alias analysis, loop pragmas, intrinsics idioms, AVX-512, vectorisable maths functions and data access patterns.
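As a rough illustration of the kind of hoop involved, here is a minimal sketch of a loop written so that a compiler has a fair chance of vectorising it. The function names `saxpy` and `saxpy_vec` are hypothetical and not taken from the talk. `__restrict` is a compiler extension (accepted by GCC, Clang and MSVC) rather than standard C++, and the pragmas are compiler-specific hints. Whether SIMD code is actually generated depends on the compiler, the optimisation level and the target flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: y[i] = a * x[i] + y[i].
// __restrict (a non-standard extension) promises the compiler that x and y
// do not overlap, which removes the aliasing doubt that often blocks
// autovectorisation.
void saxpy(float a, const float* __restrict x, float* __restrict y,
           std::size_t n) {
    // Compiler-specific loop hints; each branch is only seen by the
    // compiler that understands it.
#if defined(__clang__)
#pragma clang loop vectorize(enable)
#elif defined(__GNUC__)
#pragma GCC ivdep
#endif
    for (std::size_t i = 0; i < n; ++i) {
        // Contiguous, unit-stride access with no cross-iteration dependency.
        y[i] = a * x[i] + y[i];
    }
}

// Convenience wrapper over std::vector (also hypothetical).
void saxpy_vec(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    saxpy(a, x.data(), y.data(), x.size());
}
```

One way to check the result is to build with optimisation enabled (for example `-O3` on GCC or Clang) and inspect the generated assembly, or to ask the compiler for a vectorisation report (`-fopt-info-vec` on GCC, `-Rpass=loop-vectorize` on Clang).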