Does TensorFlow by default use all available GPUs in the machine?
I have 3 GTX Titan GPUs in my machine. I ran the cifar10_train.py example provided with the CIFAR-10 tutorial and got the following output:
I tensorflow/core/common_runtime/gpu/gpu_init.cc:60] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_init.cc:60] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:127] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_init.cc:137] 0: Y N
I tensorflow/core/common_runtime/gpu/gpu_init.cc:137] 1: N Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:694] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:694] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX TITAN, pci bus id: 0000:84:00.0)
It looks to me like TensorFlow is trying to initialize itself on both devices (gpu0 and gpu1).
My question is: why does it do this on both devices, and is there any way to prevent it? (I want to run as if there were only a single GPU.)
See: Using GPUs
Manual device placement
If you would like a particular operation to run on a device of your choice instead of the one automatically selected for you, you can use with tf.device to create a device context, so that all operations within that context get the same device assignment.
# Creates a graph.
with tf.device('/cpu:0'):
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
You will see that a and b are now assigned to cpu:0. Since no device was explicitly specified for the MatMul operation, the TensorFlow runtime will choose one based on the operation and the available devices (gpu:0 in this example), and will automatically copy tensors between devices if required.
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22. 28.]
[ 49. 64.]]
Earlier answer 2.
See: Using GPUs
Using a single GPU on a multi-GPU system
If you have more than one GPU in your system, the GPU with the lowest ID will be selected by default. If you would like to run on a different GPU, you need to specify the preference explicitly:
# Creates a graph.
with tf.device('/gpu:2'):
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
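The Using GPUs guide also notes that if the device you specify does not exist (for example, /gpu:2 on a machine where TensorFlow only created /gpu:0 and /gpu:1, as in the question's log), running the graph raises InvalidArgumentError. A minimal sketch of the documented workaround, allow_soft_placement, which lets TensorFlow fall back to an existing device:
import tensorflow as tf

# Pin the graph to a GPU that may not exist on this machine.
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# allow_soft_placement lets TensorFlow choose an existing, supported
# device instead of failing when /gpu:2 is unavailable.
sess = tf.Session(config=tf.ConfigProto(
    allow_soft_placement=True,
    log_device_placement=True))
print(sess.run(c))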
Earlier answer 1.
From CUDA_VISIBLE_DEVICES – Masking GPUs:
Does your CUDA application need to target a specific GPU? If you are
writing GPU enabled code, you would typically use a device query to
select the desired GPUs. However, a quick and easy solution for
testing is to use the environment variable CUDA_VISIBLE_DEVICES to
restrict the devices that your CUDA application sees. This can be
useful if you are attempting to share resources on a node or you want
your GPU enabled executable to target a specific GPU.
Environment Variable Syntax      Results
CUDA_VISIBLE_DEVICES=1           Only device 1 will be seen
CUDA_VISIBLE_DEVICES=0,1         Devices 0 and 1 will be visible
CUDA_VISIBLE_DEVICES="0,1"       Same as above, quotation marks are optional
CUDA_VISIBLE_DEVICES=0,2,3       Devices 0, 2, 3 will be visible; device 1 is masked
CUDA will enumerate the visible devices starting at zero. In the last
case, devices 0, 2, 3 will appear as devices 0, 1, 2. If you change
the order of the string to “2,3,0”, devices 2,3,0 will be enumerated
as 0,1,2 respectively. If CUDA_VISIBLE_DEVICES is set to a device that
does not exist, all devices will be masked. You can specify a mix of
valid and invalid device numbers. All devices before the invalid value
will be enumerated, while all devices after the invalid value will be
masked.
To determine the device ID for the available hardware in your system,
you can run NVIDIA’s deviceQuery executable included in the CUDA SDK.
Happy programming!
Chris Mason
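Applied to the question: to make cifar10_train.py run as if the machine had only one GPU, mask the other devices before TensorFlow initializes, e.g. CUDA_VISIBLE_DEVICES=0 python cifar10_train.py from the shell. A minimal sketch of the same thing done in Python (the choice of device 0 is arbitrary; the variable must be set before TensorFlow touches the GPUs):
import os

# Expose only device 0 to CUDA; this must happen before TensorFlow
# initializes its GPU devices.
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

import tensorflow as tf  # now sees a single GPU, enumerated as /gpu:0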