Choosing between GeForce or Quadro GPUs to do machine learning via TensorFlow
Is there any noticeable difference in TensorFlow performance when using a Quadro GPU versus a GeForce GPU?
For example, does TensorFlow use double-precision operations, or anything else that would cause a drop in performance on GeForce cards?
I am about to buy a GPU for TensorFlow and would like to know whether a GeForce is fine. Thanks, and I appreciate your help.
I think the GeForce TITAN cards are great and are widely used in machine learning (ML). In ML, single precision is enough in most cases.
More detail on the performance of the GTX series (currently GeForce 10) can be found on Wikipedia, here.
Other sources around the web support this claim. Here is a quote from doc-ok in 2013 (permalink):
For comparison, an “entry-level” $700 Quadro 4000 is significantly slower than a $530 high-end GeForce GTX 680, at least according to my measurements using several Vrui applications, and the closest performance-equivalent to a GeForce GTX 680 I could find was a Quadro 6000 for a whopping $3660.
Specific to ML, including deep learning, there is a Kaggle forum discussion dedicated to this subject (Dec 2014, permalink), which compares the Quadro, GeForce, and Tesla series:
Quadro GPUs aren't for scientific computation, Tesla GPUs are. Quadro cards are designed for accelerating CAD, so they won't help you to train neural nets. They can probably be used for that purpose just fine, but it's a waste of money.
Tesla cards are for scientific computation, but they tend to be pretty expensive. The good news is that many of the features offered by Tesla cards over GeForce cards are not necessary to train neural networks.
For example, Tesla cards usually have ECC memory, which is nice to have but not a requirement. They also have much better support for double precision computations, but single precision is plenty for neural network training, and they perform about the same as GeForce cards for that.
One useful feature of Tesla cards is that they tend to have a lot more RAM than comparable GeForce cards. More RAM is always welcome if you're planning to train bigger models (or use RAM-intensive computations like FFT-based convolutions).
If you're choosing between Quadro and GeForce, definitely pick GeForce. If you're choosing between Tesla and GeForce, pick GeForce, unless you have a lot of money and could really use the extra RAM.
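As the quote above notes, single precision is plenty for neural-network training, and that is also what TensorFlow gives you by default. If you want to confirm which devices TensorFlow picks up and that your tensors really are float32, a quick check like the following works. This is only a minimal sketch, assuming a TensorFlow 1.x-era install (the version current at the time of this discussion):

```python
import tensorflow as tf
from tensorflow.python.client import device_lib

# GPUs show up as /device:GPU:0, /device:GPU:1, ... regardless of
# whether they are GeForce, Quadro, or Tesla cards.
print(device_lib.list_local_devices())

# TensorFlow defaults to single precision for Python floats, so the
# float32 throughput of a GeForce card is what matters for training.
x = tf.constant(1.0)
print(x.dtype)  # <dtype: 'float32'>
```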
Note: pay attention to the platform you are using and its default precision. For example, here in the CUDA forums (Aug 2016), a developer with two Titan Xs (GeForce line) did not see a performance gain in either their R or Python scripts. This was diagnosed as a result of R defaulting to double precision, for which the new GPUs performed worse than their CPU (a Xeon processor); Tesla GPUs are cited as the best performers for double precision. In that case, converting all numbers to float32 improved performance: on one TITAN X, the run went from 12.437s with nvBLAS to 0.324s with gmatrix+float32s (see the first benchmark). A small TensorFlow sketch of the same float32 idea follows after the quote. Quoting from this forum discussion:
Double precision performance of Titan X is pretty low.
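If your data pipeline hands TensorFlow float64 arrays (as R did in the case above), casting to float32 up front keeps the work in single precision, where GeForce cards are fast. A minimal sketch, assuming the TensorFlow 1.x API and a hypothetical square matrix multiply as the workload:

```python
import numpy as np
import tensorflow as tf

# NumPy (like R) produces double precision by default.
data = np.random.rand(1024, 1024)        # dtype is float64
data32 = data.astype(np.float32)         # cast to single precision

# Build a small graph that stays entirely in float32.
x = tf.placeholder(tf.float32, shape=data32.shape)
y = tf.matmul(x, x)

with tf.Session() as sess:
    result = sess.run(y, feed_dict={x: data32})
    print(result.dtype)                  # float32
```

The explicit .astype call makes the precision choice visible in the code, rather than leaving it to whatever the data source happened to produce.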