Fastest way to get last significant bit position in a ulong (C#)?

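What is the fastest (or at least a very fast) way to get the position of the first set (1) bit in a ulong (C#), scanning from the least significant bit (LSB) toward the most significant bit (MSB)? For ulong i = 18; (10010) that would be 2 (or 1 if we count positions from 0).

The MS C++ compiler has the _BitScanForward64 intrinsic for this task, but the C# compiler has nothing analogous.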

public static UInt64 CountLeadingZeros(UInt64 input)
{
    if (input == 0) return 64;

    UInt64 n = 1;

    if ((input >> 32) == 0) { n = n + 32; input = input << 32; }
    if ((input >> 48) == 0) { n = n + 16; input = input << 16; }
    if ((input >> 56) == 0) { n = n + 8; input = input << 8; }
    if ((input >> 60) == 0) { n = n + 4; input = input << 4; }
    if ((input >> 62) == 0) { n = n + 2; input = input << 2; }
    n = n - (input >> 63);

    return n;
}

I bet this will be faster. From .

A solution using very fast bit operations. Only unsafe code could be faster.

ulong n = 18; // 10010
ulong b = 1;
int p = 0;

for (int i = 0; i < 64; i++)
{
    if ((n & b) == b)
    {
        p = i;
        break;
    }
    b = b << 1;
}

Console.WriteLine(p); // prints 1 (zero-based position)

public static UInt64 CountTrailingZeros(UInt64 input)
{
    if (input == 0) return 64;

    UInt64 n = 0;

    if ((input & 0xFFFFFFFF) == 0) { n = 32; input = input >> 32; }
    if ((input & 0xFFFF) == 0) { n = n + 16; input = input >> 16; }
    if ((input & 0xFF) == 0) { n = n + 8; input = input >> 8; }
    if ((input & 0xF) == 0) { n = n + 4; input = input >> 4; }
    if ((input & 3) == 0) { n = n + 2; input = input >> 2; }
    if ((input & 1) == 0) { ++n; }

    return n;
}

I adapted Michael D. O'Connor's answer to match your question.

static Int32 GetLSBPosition(UInt64 v) {
    UInt64 x = 1;
    for (var y = 0; y < 64; y++) {
        if ((x & v) == x) {
            return y;
        }
        x = x << 1;
    }
    return 0;
}

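Although this is similar to Alexander's answer, this form consistently runs faster, at roughly 46 million operations per second on my machine.

Also, I wrote it zero-based, though personally I think it should be one-based (with 0 meaning that no bit is set), e.g.: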

Assert.Equal(0, GetLSBPosition(0));
Assert.Equal(1, GetLSBPosition(1));
Assert.Equal(1, GetLSBPosition(3));
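For what it's worth, here is a minimal sketch of the one-based variant those assertions describe. The name GetLSBPositionOneBased is hypothetical and not part of the answer's code. It returns 0 only when no bit is set, so an input of 0 is no longer confused with "bit 0 is set".

```csharp
using System;

static class LsbOneBased
{
    // Hypothetical one-based variant of GetLSBPosition:
    // returns 1..64 for the lowest set bit, or 0 if v has no set bit.
    public static int GetLSBPositionOneBased(ulong v)
    {
        ulong x = 1;
        for (int y = 0; y < 64; y++)
        {
            if ((x & v) != 0)
            {
                return y + 1; // convert the zero-based index to one-based
            }
            x <<= 1;
        }
        return 0; // no bit set
    }
}
```

With this version, GetLSBPositionOneBased(18) returns 2, which matches the one-based count in the question.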

Using bitwise operations, the lowest set bit is:

ulong bit = x & ~(x-1);

The original value with its lowest set bit cleared is:

x & (x-1)

So, to enumerate all the set bits:

public static void Main()
{
    ulong x = 13;
    while(x > 0)
    {
        ulong bit = x & ~(x-1);
        x = x & (x-1);

        Console.WriteLine("bit-value {0} is set", bit);
    }
}

Output

bit-value 1 is set
bit-value 4 is set
bit-value 8 is set
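The loop above prints bit values rather than positions. As a rough sketch, the same two expressions can be used to collect zero-based positions instead. The class and method names here are made up for illustration, and the inner loop is deliberately simple rather than fast.

```csharp
using System;
using System.Collections.Generic;

static class SetBits
{
    // Hypothetical helper: returns the zero-based positions of all set bits,
    // lowest first, using x & ~(x - 1) to isolate and x & (x - 1) to clear.
    public static List<int> Positions(ulong x)
    {
        var positions = new List<int>();
        while (x != 0)
        {
            ulong bit = x & ~(x - 1);      // isolate the lowest set bit
            int pos = 0;
            while ((bit >> pos) != 1) pos++; // find its index (simple, not fast)
            positions.Add(pos);
            x &= x - 1;                    // clear the lowest set bit
        }
        return positions;
    }
}
```

For x = 13 (1101) this yields positions 0, 2 and 3, which correspond to the bit values 1, 4 and 8 printed above.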

I measured the performance of all the answers.

The winner is not among them: it is the classic De Bruijn sequence approach.

    private const ulong DeBruijnSequence = 0x37E84A99DAE458F;

    private static readonly int[] MultiplyDeBruijnBitPosition =
    {
        0, 1, 17, 2, 18, 50, 3, 57,
        47, 19, 22, 51, 29, 4, 33, 58,
        15, 48, 20, 27, 25, 23, 52, 41,
        54, 30, 38, 5, 43, 34, 59, 8,
        63, 16, 49, 56, 46, 21, 28, 32,
        14, 26, 24, 40, 53, 37, 42, 7,
        62, 55, 45, 31, 13, 39, 36, 6,
        61, 44, 12, 35, 60, 11, 10, 9,
    };

    /// <summary>
    /// Search the mask data from least significant bit (LSB) to the most significant bit (MSB) for a set bit (1)
    /// using De Bruijn sequence approach. Warning: Will return zero for b = 0.
    /// </summary>
    /// <param name="b">Target number.</param>
    /// <returns>Zero-based position of LSB (from right to left).</returns>
    private static int BitScanForward(ulong b)
    {
        Debug.Assert(b > 0, "Target number should not be zero");
        return MultiplyDeBruijnBitPosition[((ulong)((long)b & -(long)b) * DeBruijnSequence) >> 58];
    }
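To show where the lookup table comes from, here is a small sketch that rebuilds it from the constant. It relies on the defining property of a De Bruijn sequence: for each bit position i, the top six bits of (DeBruijnSequence << i) are a distinct index in 0..63. The class and method names are hypothetical, and this is only an illustration, not a replacement for the precomputed array.

```csharp
using System;

static class DeBruijnTableBuilder
{
    private const ulong DeBruijnSequence = 0x37E84A99DAE458F;

    // Rebuilds the 64-entry table: multiplying by (1UL << i) is the same as
    // shifting the constant left by i, so the top six bits identify i.
    public static int[] BuildTable()
    {
        var table = new int[64];
        for (int i = 0; i < 64; i++)
        {
            int index = (int)((DeBruijnSequence << i) >> 58);
            table[index] = i;
        }
        return table;
    }

    // Same lookup as BitScanForward above, written with unsigned arithmetic.
    // Like the original, it returns 0 for b = 0.
    public static int BitScanForward(ulong b, int[] table)
    {
        unchecked
        {
            ulong isolated = b & (~b + 1); // lowest set bit, i.e. b & -b
            return table[(int)((isolated * DeBruijnSequence) >> 58)];
        }
    }
}
```

The table this produces starts 0, 1, 17, 2, ... just like the hard-coded one, and BitScanForward(18, table) gives 1.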

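The fastest way would be to inject the Bit Scan Forward (bsf) instruction into the generated machine code after the JIT compiler has run, in place of the BitScanForward body, but that takes much more effort.

With the hardware intrinsics introduced in .NET Core 3.0, the fastest solution should be the following (on x64 CPUs with BMI1 support, which Bmi1.X64.IsSupported reports):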

ulong value = 18;
ulong result = System.Runtime.Intrinsics.X86.Bmi1.X64.TrailingZeroCount(value);

Alternatively, the new System.Numerics.BitOperations methods also use hardware intrinsics:

int result2 = System.Numerics.BitOperations.TrailingZeroCount(value);
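A hedged sketch of how the two calls might be combined: use the BMI1 intrinsic when the CPU reports support, and fall back to BitOperations otherwise. The wrapper name and the choice of -1 for a zero input are my own assumptions, not something either API prescribes.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics.X86;

static class LsbIntrinsics
{
    // Hypothetical wrapper: zero-based position of the lowest set bit,
    // or -1 (an assumed convention) when value is 0.
    public static int LowestSetBitPosition(ulong value)
    {
        if (value == 0) return -1;

        if (Bmi1.X64.IsSupported)
        {
            // Direct hardware path; only valid when IsSupported is true.
            return (int)Bmi1.X64.TrailingZeroCount(value);
        }

        // Portable path; the runtime picks an intrinsic or a software fallback.
        return BitOperations.TrailingZeroCount(value);
    }
}
```

In practice BitOperations.TrailingZeroCount alone is probably enough, since it already chooses the hardware path where available. Note that it returns 64 for an input of 0, so the explicit zero check only matters if you want a different sentinel.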