Why does the same notebook allocate very different amounts of VRAM in two different environments?

Question

As you can see, this notebook can be trained on Kaggle within Kaggle's 16 GB VRAM limit: https://www.kaggle.com/firefliesqn/g2net-gpu-newbie

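I just tried running the same notebook locally on my RTX 3090 GPU, where I have Torch 1.8 installed, and it allocated about 23.3 GB of VRAM. Why does this happen, and how can I make my local environment behave like Kaggle? Even after reducing the batch size below the one used on Kaggle, the notebook still allocates about 23 GB of VRAM locally.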

On Kaggle I see that torch 1.7 and tensorflow 2.4 are installed. Locally, because I use an RTX 3090, newer versions of torch and tf are recommended, so I installed torch 1.8.1 and tensorflow 2.6.

Answer 1

By default, TensorFlow allocates the maximum available memory it detects.

When using TensorFlow, you can limit memory usage with the following snippet:

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 12GB of memory on the first GPU
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=12288)])
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)

where 12288 = 1024 x 12 (memory_limit is given in MB)
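Since memory_limit is expressed in megabytes, a cap stated in gigabytes has to be multiplied by 1024 first. A minimal sketch of that arithmetic follows; the helper name gib_to_mib is hypothetical and not part of TensorFlow.

```python
def gib_to_mib(gib: float) -> int:
    """Convert a size in GiB to whole MiB, the unit memory_limit is given in."""
    if gib <= 0:
        raise ValueError("memory cap must be positive")
    return int(gib * 1024)


# Hypothetical helper, shown only to make the unit conversion explicit.
print(gib_to_mib(12))  # 12288, the value used in the snippet above
print(gib_to_mib(16))  # 16384, roughly the Kaggle limit from the question
```

The result can be passed as memory_limit in the VirtualDeviceConfiguration call if a different cap is wanted.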

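Another solution (see the discussion below), and the one that worked for the OP, is to enable memory growth. Use this approach only if you do not have a specific upper limit on memory that you want to enforce: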

physical_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

https://www.tensorflow.org/api_docs/python/tf/config/experimental/set_memory_growth

In PyTorch this is even easier:

import torch

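# Use at most 1/2 of the memory of GPU 0 (should allocate an amount very similar to TF)
# set_per_process_memory_fraction is available from torch 1.8 onward
torch.cuda.set_per_process_memory_fraction(0.5, 0)

# The total memory of GPU 0 (in bytes) can then be checked with
total_memory_available = torch.cuda.get_device_properties(0).total_memory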
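To approximate Kaggle's limit rather than a flat half, the fraction can be derived from the card's total memory. The sketch below is only an illustration under stated assumptions: fraction_for_cap is a hypothetical helper, and the 16 GiB target and 24 GiB card size are taken loosely from the question. The torch calls are guarded so the arithmetic still runs on a machine without PyTorch or a GPU.

```python
GIB = 1024 ** 3  # bytes per GiB


def fraction_for_cap(cap_bytes: int, total_bytes: int) -> float:
    """Fraction of device memory matching cap_bytes, never more than 1.0."""
    if cap_bytes <= 0 or total_bytes <= 0:
        raise ValueError("sizes must be positive")
    return min(1.0, cap_bytes / total_bytes)


# Assumed sizes: a 16 GiB cap on a nominal 24 GiB card.
print(round(fraction_for_cap(16 * GIB, 24 * GIB), 3))  # 0.667

try:
    import torch
except ImportError:  # PyTorch not installed: only the arithmetic above runs
    torch = None

if torch is not None and torch.cuda.is_available():
    # Use the real total reported by the driver instead of the nominal size.
    total = torch.cuda.get_device_properties(0).total_memory
    torch.cuda.set_per_process_memory_fraction(fraction_for_cap(16 * GIB, total), 0)
```

This only caps what the PyTorch caching allocator in this process may take; it does not limit TensorFlow, which needs its own setting as shown earlier.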

