XSSE is a C header-only library consisting of macros and inline functions that allow selected SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, and AVX2 SIMD instructions to be used in ARM NEON environments. It supports both x86_64 and ARM NEON, enabling developers to write portable SIMD code without worrying about platform differences. Note: The original SSE (pre-SSE2) is not supported due to its age and limited adoption in modern environments.
- Support for floating-point APIs that rely on __m128, __m128d, __m256, or __m256d is not provided.
- Support for legacy APIs that rely on __m64 is not provided.
- Functions that are difficult to replicate in NEON, such as _mm_stream_si128, are substituted with regular store instructions.
- APIs like _mm_clflush, which cannot be reproduced on NEON, are replaced with no-op macros.
- Some cryptographic APIs are not supported (e.g. _mm256_sha512msg1_epi64).
Recommended: C99 or later
Supported instruction sets:
- SSE2
- SSE3
- SSSE3
- SSE4.1
- SSE4.2
- AVX
- AVX2
Not supported:
- SSE
- AVX512
No special installation is required. Simply add xsse.h from the repository to your project.
#include "xsse.h"
To use the AVX and AVX2 APIs, make sure to include the following header file.
#include "xsse_avx.h"
Since xsse_avx.h includes xsse.h, you don't need to include both header files.
You can write code for NEON using the familiar SSE2 intrinsics, exactly as you would on x86-64.
#include "xsse.h"
#ifdef XSSE2
__m128i a = _mm_set1_epi32(42);
__m128i b = _mm_set1_epi32(10);
__m128i c = _mm_add_epi32(a, b);
#endif
When SSE2 or NEON is available, the XSSE2 macro is automatically defined, enabling platform-aware conditional builds.
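As a rough illustration of such a conditional build, the sketch below adds two int32_t arrays with a vector path guarded by XSSE2 and a scalar path for the tail. The helper name add_i32 is hypothetical and not part of XSSE. The example also assumes that xsse.h is on the include path and that it provides _mm_loadu_si128 and _mm_storeu_si128. The __has_include guard is only there so the sketch still compiles (scalar path only) when the header is missing. In a real project a plain #include "xsse.h" is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: xsse.h is on the include path. The guard below only keeps
   this sketch compilable (scalar path) when the header is absent. */
#if defined(__has_include)
#  if __has_include("xsse.h")
#    include "xsse.h"
#  endif
#endif

/* Hypothetical helper (not part of XSSE): dst[i] = a[i] + b[i]. */
static void add_i32(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && a != NULL && b != NULL);
#ifdef XSSE2
    /* Vector path: four 32-bit lanes per iteration, unaligned loads/stores. */
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_add_epi32(va, vb));
    }
#endif
    /* Scalar path: handles the tail, or everything when XSSE2 is undefined.
       Unsigned arithmetic mirrors the wrap-around of _mm_add_epi32. */
    for (; i < n; i++) {
        dst[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
    }
}
```

Because the scalar loop starts where the vector loop stopped, the same function works for any n and on any target, whether or not XSSE2 is defined.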
XSSE also supports selected instructions from SSE3, SSSE3, SSE4.1, and SSE4.2. Each instruction set is guarded by its own macro:
#ifdef XSSE3
#endif
#ifdef XSSSE3
#endif
#ifdef XSSE4_1
#endif
#ifdef XSSE4_2
#endif
- ARM: If a macro like XSSE4_2 is defined, it guarantees that XSSE4_1, XSSSE3, XSSE3, and XSSE2 are also defined.
- x86-64: The corresponding XSSE* macros are automatically defined based on the compiler's predefined feature macros, such as __SSE2__, __SSE4_1__, or __SSE4_2__.
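One way to use these macros is to pick the best available implementation at compile time. The sketch below computes a lane-wise maximum of signed 32-bit integers: it uses _mm_max_epi32 when XSSE4_1 is defined, falls back to an SSE2-only compare-and-blend when only XSSE2 is defined, and otherwise runs plain C. The helper name max_i32 is hypothetical, and the example assumes that XSSE provides each of the intrinsics used. As before, the __has_include guard only keeps the sketch compilable without the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: xsse.h is on the include path; otherwise only the scalar
   path below is compiled. */
#if defined(__has_include)
#  if __has_include("xsse.h")
#    include "xsse.h"
#  endif
#endif

/* Hypothetical helper (not part of XSSE): dst[i] = max(a[i], b[i]). */
static void max_i32(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && a != NULL && b != NULL);
#if defined(XSSE4_1)
    /* SSE4.1 tier: a single instruction per four lanes. */
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_max_epi32(va, vb));
    }
#elif defined(XSSE2)
    /* SSE2 tier: build a mask where a > b, then blend with and/andnot/or. */
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        __m128i gt = _mm_cmpgt_epi32(va, vb);          /* all ones where a > b */
        __m128i r  = _mm_or_si128(_mm_and_si128(gt, va),
                                  _mm_andnot_si128(gt, vb)); /* (~gt) & b */
        _mm_storeu_si128((__m128i *)(dst + i), r);
    }
#endif
    /* Scalar tier: tail elements, or everything without SIMD support. */
    for (; i < n; i++) {
        dst[i] = a[i] > b[i] ? a[i] : b[i];
    }
}
```

The #if/#elif order checks the newest instruction set first, so the same source selects the SSE4.1 path wherever XSSE4_1 is defined.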
Similarly, AVX and AVX2 can be used through the following macros.
#ifdef XSSE_AVX
#endif
#ifdef XSSE_AVX2
#endif
- ARM: If XSSE_AVX2 is defined, XSSE_AVX and all of the XSSE* macros described above are also defined.
- x86-64: In the same way as the other XSSE* macros, the appropriate XSSE_AVX* macro is defined whenever the compiler defines __AVX__ or __AVX2__.
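To illustrate how the AVX2 macro can be combined with the 128-bit ones, the sketch below performs a saturating add of unsigned bytes in three tiers: 32 bytes at a time under XSSE_AVX2, 16 bytes at a time under XSSE2, and one byte at a time in plain C. The helper name sat_add_u8 is hypothetical, and the example assumes that xsse_avx.h is on the include path and that XSSE provides _mm256_loadu_si256, _mm256_storeu_si256, _mm256_adds_epu8, and their 128-bit counterparts. The __has_include guard only keeps the sketch compilable without the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: xsse_avx.h is on the include path (it pulls in xsse.h);
   otherwise only the scalar tier below is compiled. */
#if defined(__has_include)
#  if __has_include("xsse_avx.h")
#    include "xsse_avx.h"
#  endif
#endif

/* Hypothetical helper (not part of XSSE): dst[i] = min(a[i] + b[i], 255). */
static void sat_add_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && a != NULL && b != NULL);
#ifdef XSSE_AVX2
    /* 256-bit tier: 32 bytes per iteration. */
    for (; i + 32 <= n; i += 32) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(dst + i), _mm256_adds_epu8(va, vb));
    }
#endif
#ifdef XSSE2
    /* 128-bit tier: picks up remaining blocks of 16 bytes. */
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(va, vb));
    }
#endif
    /* Scalar tier: remaining bytes, or everything without SIMD support. */
    for (; i < n; i++) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

The tiers are independent #ifdef blocks rather than an #if/#elif chain, so each one consumes whatever the wider tier left over.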