Any better way to implement factorials?

Question

I've been working on a library to handle numbers that exceed the usual 8-byte range of long long.

I'm talking about a few hundred to a few thousand digits.

I have now implemented a factorial function that looks like this:

largeNum factorial(largeNum& input) {
    if (input > one) return (input * factorial(input-one));
    else return one;
}

This gives me good results. 100! took about 5 seconds to compute, and that is already more than 150 digits. The result was correct.

5 seconds is already a long time, though, and 200! would take minutes to compute.

WolframAlpha, for example, can compute 100000! in under 10 seconds.

So there has to be a better way to do this. I've been reading https://en.wikipedia.org/wiki/Factorial about the so-called Gamma function and wondering whether it would be of any help.

Answer 1

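While it is hard to optimize code without seeing the implementation, you can certainly gain some cycles by converting your recursive function into an iterative one, or by helping the compiler do it for you by optimizing tail call.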

largeNum factorial(largeNum input) {  // taken by value: the loop counts input down, so the caller's number must not be touched
    largeNum res = one;
    while (input > one) {
        res *= input;
        input -= one;
    }
    return res;
}

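Of course, this is just a different implementation of the same "middle school" approach to computing factorials. If you are looking for advanced algorithms, here is a page dedicated to comparing various "hard" implementations.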
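As one illustration of what a less naive algorithm can look like, here is a minimal sketch of a balanced product tree (sometimes called binary splitting). It is not taken from the page mentioned above. The names `rangeProduct` and `factorialTree` are made up for this example, and the sketch assumes the number type `T` can be constructed from an unsigned integer and supports `operator*`, which may or may not hold for `largeNum`.

```cpp
#include <cstdint>

// Multiply the integers lo..hi (inclusive) by splitting the range in half,
// so the two operands of each multiplication have a similar number of digits.
// T is a placeholder for a big-number type (assumption: constructible from
// an unsigned integer and multipliable with operator*).
template <typename T>
T rangeProduct(std::uint64_t lo, std::uint64_t hi) {
    if (lo > hi) return T(1);   // empty range contributes the neutral element
    if (lo == hi) return T(lo); // single factor
    std::uint64_t mid = lo + (hi - lo) / 2;
    return rangeProduct<T>(lo, mid) * rangeProduct<T>(mid + 1, hi);
}

// n! as the product of 1..n; n == 0 yields the empty range, i.e. 1.
template <typename T>
T factorialTree(std::uint64_t n) {
    return rangeProduct<T>(1, n);
}
```

With a built-in type this can be tried as `factorialTree<std::uint64_t>(20)`. Note that balancing the operands only pays off when the library's multiplication is faster than the schoolbook method (Karatsuba, for instance). With schoolbook multiplication the total work stays in the same order as the simple loop.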

Answer 2

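You can use threads to speed up the computation, either unix pthread or C++ std::thread, which is also cross-platform. There is only a performance gain when the number is large. Otherwise the gain is not enough to offset the cost of creating the threads.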

Edit: This program uses four threads to compute the factorial.

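After running the program 8 times, the average threaded factorial time was 14 seconds and the average non-threaded factorial time was 18 seconds.
Sample program: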

#include <iostream>
#include <thread>
#include <chrono>
#include "BigInt.h"

void fact(int upper, int lower, BigInt& val)
{
    for (auto i = upper; i >= lower; i--)
    {
        val = val*i;
    }
}

int main()
{
    std::chrono::high_resolution_clock::time_point t1 = std::chrono::high_resolution_clock::now();

    int n = 1000;
    BigInt val1("1"), val2("1"), val3("1"), val4("1");

    std::thread thr1(&fact, n, (3*n)/4, std::ref(val1));
    std::thread thr2(&fact, ((3 * n) / 4) - 1, n/2, std::ref(val2));
    std::thread thr3(&fact, (n / 2)-1, n/4,  std::ref(val3));
    std::thread thr4(&fact, (n/4)-1, 1, std::ref(val4));

    thr1.join();
    thr2.join();
    thr3.join();
    thr4.join();
    auto ans = val1*val2*val3*val4;

    std::chrono::high_resolution_clock::time_point t2 = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::seconds>(t2 - t1).count();
    std::cout << "threaded factorial time: " << duration << "\n";

    t1 = std::chrono::high_resolution_clock::now();
    BigInt ans2("1");
    fact(n, 1, ans2);  // plain call: std::ref is only needed when passing arguments through std::thread
    t2 = std::chrono::high_resolution_clock::now();
    duration = std::chrono::duration_cast<std::chrono::seconds>(t2 - t1).count();
    std::cout << "non threaded factorial time: " << duration;

    return 0;
}

Answer 3

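I disagree with all of these answers and would say that the typical iterative and recursive factorial implementations are naive and expensive for large input values.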

A better way is to use the gamma function (or, better yet, the natural logarithm of the gamma function).

This works because gamma(n) = (n-1)!, or equivalently n! = gamma(n+1).

If you combine this with memoization, you get an efficient solution that works for large arguments.

The natural log of gamma is especially well suited for evaluating combinations and permutations.
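A minimal sketch of the log-gamma idea, using `std::lgamma` from `<cmath>`. The helper names `factorialDigits` and `binomialApprox` are invented for this example. Keep in mind that this works in double-precision floating point, so it yields the magnitude of n! or an approximate combination count, not the exact digits a big-number library would produce.

```cpp
#include <cmath>
#include <cstdint>

// Approximate number of decimal digits of n!, using ln(n!) = lgamma(n + 1).
// Floating-point result: reliable for the magnitude, not for exact digits.
inline std::uint64_t factorialDigits(unsigned n) {
    double log10Fact = std::lgamma(static_cast<double>(n) + 1.0) / std::log(10.0);
    return static_cast<std::uint64_t>(std::floor(log10Fact)) + 1;
}

// Approximate binomial coefficient C(n, k) computed in log space:
// ln C(n, k) = ln n! - ln k! - ln (n - k)!
// The huge intermediate factorials are never formed, so nothing overflows
// as long as the final result itself fits in a double.
inline double binomialApprox(unsigned n, unsigned k) {
    if (k > n) return 0.0;
    double logC = std::lgamma(n + 1.0) - std::lgamma(k + 1.0)
                - std::lgamma(n - k + 1.0);
    return std::exp(logC);
}
```

For example, `factorialDigits(100)` reports the digit count of 100! without ever computing the number, and `std::llround(binomialApprox(10, 3))` recovers a small combination count after rounding.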


c++

math

optimization

factorial