2.0 release candidate (C++03 version)
The library supports the following architectures and instruction sets:
- x86, x86-64: SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, FMA3, FMA4, AVX-512F, XOP
- ARM, ARM64: NEON
- PowerPC: Altivec
Supported compilers:
- C++11 version:
  - GCC: 4.8-5.3
  - Clang: 3.3-3.8
  - MSVC: 2013
  - ICC: 2013, 2015
- C++98 version:
  - GCC: 4.4-5.3
  - Clang: 3.3-3.8
  - MSVC: 2013
  - ICC: 2013, 2015
Clang 3.3 is not supported on ARM. MSVC and ICC are only supported on x86 and
x86-64.
Newer versions of the aforementioned compilers will generally work with either
the C++11 or the C++98 version of the library. Older versions of these compilers
will generally work with the C++98 version of the library.
Changes since 1.0:
- Expression template-based backend. It is used only for functions that may
  benefit from micro-optimizations (e.g. when several instructions can be merged
  into one).
- Support for vectors much longer than the native vector type. The only
  limitation is that the length must be a power of 2. The widest available
  instructions are used for the particular vector type.
- Visual Studio and Intel Compiler support
- AVX-512F, Altivec and NEONv2 support
- Vector initialization is simplified, for example: `int32<8> v = make_uint(2);`
  or `int* p = ...; v = load(p);`.
- The curiously recurring template pattern (CRTP) is used to categorize vector
  types. Function templates no longer need to be written for each vector
  type or combination of types; instead, an appropriate vector category may
  be used.
- Each vector type can be explicitly constructed from any other vector
  with the same size.
- Most functions accept a much wider range of vector type combinations. For
  example, bitwise functions accept any two vectors of the same size.
- If different vector types are used as arguments to such functions, the
  return type is computed as if one or both of the arguments were "promoted"
  according to certain rules. For example, `int32 + int32 --> int32`, whereas
  `uint32 + int32 --> uint32`, and `uint32 + float32 --> float32`. See
  simdpp/types/tag.h for more information.
- API break: `int128` and `int256` types have been removed. On some
  architectures such as AVX512 it's more efficient to have different physical
  representations for vectors with different element widths. E.g. 8-bit integer
  elements would use 256-bit vectors and 32-bit integer elements would use
  512-bit vectors.
- API break: `basic_int##` types have been removed. The CRTP-based type
  categorization and promotion rules make a second, inheritance-based vector
  categorization system impossible. In the majority of cases `basic_int##` can
  be straightforwardly replaced with `uint##`.
- API break: `{vector type}::make_const`, `{vector type}::zero` and
  `{vector type}::ones` have been removed to simplify the library. Use the new
  `make_int`, `make_uint`, `make_float`, `make_zero` and `make_ones` free
  functions that produce a construct expression.
- API break: the `broadcast` family of functions has been renamed to `splat`.
- API break: the `permute` family of functions has been renamed to `permute2`
  and `permute4` depending on the number of template arguments taken.
- API break: value conversion functions such as `to_float32x4` have been renamed
  and now return a vector with the same number of elements as the source
  vector.
- API break: SIMDPP_USER_ARCH_INFO now accepts any expression, not only a
  function.
- API break: unsigned conversions have been renamed to `to_uintXX` to reduce
  confusion.
- API break: saturated add and sub are now called `add_sat` and `sub_sat`.
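
The CRTP-based categorization and the promotion rules listed above can be illustrated with a small standalone sketch. It does not use the library's headers. The names `any_vec32`, `int32v`, `uint32v`, `float32v`, `promotion_rank` and `promote` are hypothetical and exist only for this example, and the ranking (int, then uint, then float) is a simplification of the three sample rules quoted above; the actual rules are in simdpp/types/tag.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical category: "any vector of N 32-bit elements".
// V is the concrete vector type (the CRTP parameter).
template<std::size_t N, class V>
struct any_vec32 {
    const V& self() const { return static_cast<const V&>(*this); }
};

// Hypothetical concrete vector types that join the category via CRTP.
template<std::size_t N>
struct int32v : any_vec32<N, int32v<N>> {
    typedef std::int32_t element_type;
    element_type d[N];
};
template<std::size_t N>
struct uint32v : any_vec32<N, uint32v<N>> {
    typedef std::uint32_t element_type;
    element_type d[N];
};
template<std::size_t N>
struct float32v : any_vec32<N, float32v<N>> {
    typedef float element_type;
    element_type d[N];
};

// Simplified promotion: the argument with the higher rank decides the result.
template<class V> struct promotion_rank;
template<std::size_t N> struct promotion_rank<int32v<N>>   { static const int value = 0; };
template<std::size_t N> struct promotion_rank<uint32v<N>>  { static const int value = 1; };
template<std::size_t N> struct promotion_rank<float32v<N>> { static const int value = 2; };

template<class A, class B>
struct promote {
    typedef typename std::conditional<
        (promotion_rank<A>::value >= promotion_rank<B>::value), A, B>::type type;
};

// A single function template covers every combination of category members
// that have the same element count N.
template<std::size_t N, class A, class B>
typename promote<A, B>::type add(const any_vec32<N, A>& a, const any_vec32<N, B>& b)
{
    typedef typename promote<A, B>::type R;
    typedef typename R::element_type E;
    R r;
    for (std::size_t i = 0; i < N; ++i)
        r.d[i] = static_cast<E>(a.self().d[i]) + static_cast<E>(b.self().d[i]);
    return r;
}

// Usage: the return type follows the three sample rules from the notes.
inline void example_usage()
{
    int32v<4> i; uint32v<4> u; float32v<4> f;
    for (std::size_t k = 0; k < 4; ++k) { i.d[k] = 1; u.d[k] = 2u; f.d[k] = 0.5f; }

    static_assert(std::is_same<decltype(add(i, i)), int32v<4>>::value,
                  "int32 + int32 --> int32");
    static_assert(std::is_same<decltype(add(u, i)), uint32v<4>>::value,
                  "uint32 + int32 --> uint32");
    static_assert(std::is_same<decltype(add(u, f)), float32v<4>>::value,
                  "uint32 + float32 --> float32");
    assert(add(u, f).d[0] == 2.5f);
}
```

Because `add` takes the category type rather than a concrete vector type, one template serves all combinations, which is the reason a parallel inheritance-based hierarchy such as `basic_int##` is no longer needed.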
No further significant API changes are planned.
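
The support for vectors longer than the native vector type, mentioned in the changes above, can be pictured as an array of native vectors processed one after another. The following is a minimal standalone sketch of that idea, not the library's implementation. `native_f32x4`, `native_add` and `long_f32` are hypothetical names, and the "native" register is emulated with a plain array of four floats.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a native 128-bit register with four float lanes.
struct native_f32x4 {
    float lane[4];
};

// Stand-in for a single native add instruction.
inline native_f32x4 native_add(const native_f32x4& a, const native_f32x4& b)
{
    native_f32x4 r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

// A "long" vector of N floats stored as N / 4 native vectors.
template<std::size_t N>
struct long_f32 {
    static_assert(N >= 4 && (N & (N - 1)) == 0,
                  "length must be a power of 2 and at least one native vector");
    static const std::size_t native_count = N / 4;
    native_f32x4 part[native_count];
};

// One operation on a long vector becomes native_count native operations.
template<std::size_t N>
long_f32<N> add(const long_f32<N>& a, const long_f32<N>& b)
{
    long_f32<N> r;
    for (std::size_t i = 0; i < long_f32<N>::native_count; ++i)
        r.part[i] = native_add(a.part[i], b.part[i]);
    return r;
}

// Usage: a 16-element vector is handled as four native vectors.
inline void example_usage()
{
    long_f32<16> a, b;
    for (std::size_t i = 0; i < 16; ++i) {
        a.part[i / 4].lane[i % 4] = static_cast<float>(i);
        b.part[i / 4].lane[i % 4] = 1.0f;
    }
    long_f32<16> r = add(a, b);
    assert(r.part[3].lane[3] == 16.0f);
}
```

The power-of-2 restriction keeps the split into native vectors exact. In this sketch a length such as 12 is rejected at compile time by the `static_assert`.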