
Conversation

Contributor

@DiamonDinoia DiamonDinoia commented Aug 29, 2025

  1. adds the API bitwise_[l|r]shift(...) and rot[l|r](...)
  2. updates the test to use the API
  3. fixes documentation

I implemented xoshiro (vectorized); with the current API it looks like this:

    const auto result = xsimd::rotl(m_state[0] + m_state[3], 23) + m_state[0];
    const auto t = m_state[1] << 17;

    m_state[2] ^= m_state[0];
    m_state[3] ^= m_state[1];
    m_state[1] ^= m_state[2];
    m_state[0] ^= m_state[3];

    m_state[2] ^= t;

    m_state[3] = xsimd::rotl(m_state[3], 45);

    return result;

and with AVX2 achieves 393.06 M samples/s
The new API is:

    const auto result = xsimd::rotl<23>(m_state[0] + m_state[3]) + m_state[0];
    const auto t = xsimd::bitwise_lshift<17>(m_state[1]);

    m_state[2] ^= m_state[0];
    m_state[3] ^= m_state[1];
    m_state[1] ^= m_state[2];
    m_state[0] ^= m_state[3];

    m_state[2] ^= t;

    m_state[3] = xsimd::rotl<45>(m_state[3]);

    return result;

and with AVX2 achieves 423.76 M samples/s

This change yields a 7% speed increase, which makes xoshiro the fastest RNG among the ones I tested. Without it, PCG64 beats it.
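The gain comes from the shift amount being a template parameter rather than a runtime value, so the amount can be validated at compile time and backends can emit immediate-form instructions. A minimal scalar stand-in for the fixed-shift entry point (the name mirrors the PR, but this is not the actual xsimd signature):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Scalar sketch of a fixed-shift API; illustrative only, not xsimd code.
template <int Shift, class T,
          class = typename std::enable_if<std::is_integral<T>::value>::type>
constexpr T bitwise_lshift(T x) noexcept
{
    // Because Shift is a compile-time constant, out-of-range values are
    // rejected at compile time instead of being undefined at run time.
    static_assert(Shift >= 0 && Shift < std::numeric_limits<T>::digits,
                  "shift amount out of range");
    return static_cast<T>(x << Shift);
}
```

An out-of-range call such as `bitwise_lshift<64>(std::uint64_t(1))` then fails to compile rather than silently misbehaving.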

@DiamonDinoia DiamonDinoia changed the title NEW:API Fixed shift, Rotation NEW API: Fixed shift, Rotation Aug 29, 2025
@serge-sans-paille
Contributor

We already have a similar approach for rotate_left and rotate_right, slide_left and slide_right, so that looks good.
We also have insert that has a different API:

    template <class T, class A, size_t I>
    XSIMD_INLINE batch<T, A> insert(batch<T, A> const& x, T val, index<I> pos) noexcept

which makes me wonder if it's a good idea.

Anyway I like your idea, further comments inline.
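The two styles can be contrasted in a self-contained sketch (dummy `get` helpers for illustration, not xsimd code): `insert()` passes the constant through a tag object, while the proposed shifts take it as an explicit template parameter.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-in for xsimd's index<I> tag type.
template <std::size_t I>
struct index {};

// Tag-argument style, as in insert(): the constant rides in a tag object.
template <class T, std::size_t N, std::size_t I>
T get(T const (&arr)[N], index<I>) noexcept
{
    static_assert(I < N, "out of range");
    return arr[I];
}

// Explicit-template-parameter style, as proposed for the shifts.
template <std::size_t I, class T, std::size_t N>
T get(T const (&arr)[N]) noexcept
{
    static_assert(I < N, "out of range");
    return arr[I];
}
```

Call sites read `get(arr, index<2>{})` in the first style and `get<2>(arr)` in the second; both resolve the constant at compile time, so the difference is purely one of API convention.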

Contributor

@serge-sans-paille serge-sans-paille left a comment


Could you also add SSE2's _mm_slli_epi32 and _mm_srai_epi16?
You provided a generic implementation, so other architectures shouldn't be impacted, but I'll still have a look and check whether specific instructions are available.
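For reference, the two SSE2 intrinsics mentioned could slot into the sizeof-based dispatch roughly like this (the wrapper names are hypothetical; only the intrinsics are real):

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h> // SSE2

// Hypothetical wrappers sketching the SSE2 specializations.
inline __m128i lshift_epi32(__m128i x, int count) noexcept
{
    return _mm_slli_epi32(x, count); // logical left shift, 32-bit lanes
}

inline __m128i arith_rshift_epi16(__m128i x, int count) noexcept
{
    return _mm_srai_epi16(x, count); // arithmetic right shift, 16-bit lanes
}
```

Note that _mm_srai_epi16 is an arithmetic shift (it replicates the sign bit), so it would back the signed-integer right-shift path, not the logical one.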

@@ -34,6 +34,11 @@ namespace xsimd
{ return x << y; },
self, other);
}
template <int shift, class A, class T, class /*=typename std::enable_if<std::is_integral<T>::value, void>::type*/>
Contributor


Can the shift be negative? If not, we should use an unsigned type for the parameter.

@@ -183,6 +193,12 @@ namespace xsimd
constexpr auto N = std::numeric_limits<T>::digits;
return (self << other) | (self >> (N - other));
}
template <int count, class A, class T>
Contributor


Same here: we tend to use size_t for this kind of parameter.
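A size_t parameter also lets the zero-count edge case be handled at compile time: in the runtime formula above, other == 0 makes `self >> (N - other)` a full-width shift, which is undefined for scalar integers. A scalar sketch of the size_t variant (illustrative, not xsimd's implementation):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Scalar sketch of a size_t-parameterized rotl. The Count % N reduction
// assumes an unsigned T so that N is the full (power-of-two) bit width,
// and the ternary avoids the undefined full-width shift when the reduced
// count is zero.
template <std::size_t Count, class T>
constexpr T rotl(T x) noexcept
{
    constexpr std::size_t N = std::numeric_limits<T>::digits;
    static_assert((N & (N - 1)) == 0, "width must be a power of two");
    return (Count % N) == 0
        ? x
        : static_cast<T>((x << (Count % N)) | (x >> (N - Count % N)));
}
```

With a compile-time Count, all of this folds away; the runtime batch version has to pay for the guard (or document the precondition) on every call.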

    template <class A, class T, class = typename std::enable_if<std::is_integral<T>::value, void>::type>
    XSIMD_INLINE batch<T, A> bitwise_lshift(batch<T, A> const& self, batch<T, A> const& other, requires_arch<avx2>) noexcept
    {
        XSIMD_IF_CONSTEXPR(sizeof(T) == 4)
        {
            return _mm256_sllv_epi32(self, other);
        }
    #if XSIMD_WITH_AVX512VL
Contributor


I'm not totally fine with this kind of macro guard. The instruction used should depend on the batch architecture (A), not on macros. Someone might want an avx2 kernel within an avx512 context, and I have not yet thought about how to handle instruction sets that extend previous-generation instructions. Could you postpone that particular change?

I'm not saying it's incorrect; it's just that I need to think a bit about the pattern.
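The alternative being pointed at is to let overload resolution on the arch tag pick the instruction, so preprocessor state never decides which kernel runs. A minimal sketch with dummy tags (not xsimd's real arch types):

```cpp
#include <cassert>
#include <string>

// Dummy architecture tags standing in for xsimd's arch hierarchy.
struct avx2 {};
struct avx512 {};

template <class A>
struct requires_arch {};

// Each overload is selected by the tag the caller passes, so an avx2
// kernel can still be requested even when AVX-512 support is compiled in.
inline std::string kernel(requires_arch<avx2>) { return "avx2 path"; }
inline std::string kernel(requires_arch<avx512>) { return "avx512 path"; }
```

With macro guards, by contrast, the avx2 overload itself changes meaning depending on how the translation unit was compiled, which is the pattern the reviewer wants to avoid.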

Contributor Author

@DiamonDinoia DiamonDinoia Aug 30, 2025


I was thinking of adding SSEVL and AVXVL as new archs & register types, then forwarding to those from AVX512. We would also change make_sized_batch_t to return those instead of plain sse/avx. This is useful to me since I use custom-sized batches even with AVX512 to avoid padding. It should make things faster, especially in SSE, as it removes vzeroupper if I am not mistaken.

@@ -223,7 +223,6 @@ namespace xsimd
}
}
}

Contributor


?

Contributor Author


My mistake, a spurious change. I added an API there and then deleted it.

@@ -244,6 +244,17 @@
#define XSIMD_WITH_AVX512DQ 0
#endif

/**
Contributor


This should be in an independent commit, preferably with a test case that catches the scenario we missed.

1. Adds the API bitwise_[l|r]shift<N>(...) and rot[l|r]<N>(...)
2. Updates the test to use the API
3. Updates documentation