Using TensorFlow when a session is already running on the GPU
I am training a neural network with TensorFlow 2 (GPU) on my local machine, and I would like to run some TensorFlow code in parallel (just loading a model and saving its graph).
I get a CUDA error when loading the model. How can I use TensorFlow 2 to load and save a model on the CPU while another TensorFlow instance is training on the GPU?
132 self._config = config
133 self._hyperparams['feature_extractor'] = self._get_feature_extractor(hyperparams['feature_extractor'])
--> 134 self._input_shape_tensor = tf.constant([input_shape[0], input_shape[1]])
135 self._build(**self._hyperparams)
136 # save parameter dict for serialization
~/.anaconda3/envs/posenet2/lib/python3.7/site-packages/tensorflow_core/python/framework/constant_op.py in constant(value, dtype, shape, name)
225 """
226 return _constant_impl(value, dtype, shape, name, verify_shape=False,
--> 227 allow_broadcast=True)
228
229
~/.anaconda3/envs/posenet2/lib/python3.7/site-packages/tensorflow_core/python/framework/constant_op.py in _constant_impl(value, dtype, shape, name, verify_shape, allow_broadcast)
233 ctx = context.context()
234 if ctx.executing_eagerly():
--> 235 t = convert_to_eager_tensor(value, ctx, dtype)
236 if shape is None:
237 return t
~/.anaconda3/envs/posenet2/lib/python3.7/site-packages/tensorflow_core/python/framework/constant_op.py in convert_to_eager_tensor(value, ctx, dtype)
93 except AttributeError:
94 dtype = dtypes.as_dtype(dtype).as_datatype_enum
---> 95 ctx.ensure_initialized()
96 return ops.EagerTensor(value, ctx.device_name, dtype)
97
~/.anaconda3/envs/posenet2/lib/python3.7/site-packages/tensorflow_core/python/eager/context.py in ensure_initialized(self)
490 if self._default_is_async == ASYNC:
491 pywrap_tensorflow.TFE_ContextOptionsSetAsync(opts, True)
--> 492 self._context_handle = pywrap_tensorflow.TFE_NewContext(opts)
493 finally:
494 pywrap_tensorflow.TFE_DeleteContextOptions(opts)
InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: out of memory
You are loading the model on the GPU, and since the GPU is already being used for training, it runs out of memory. You need to place the loading on the CPU instead. Try loading the model inside
with tf.device('/CPU:0'):
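A minimal sketch of that approach; the tiny model and temporary path below are stand-ins for your own checkpoint, not part of the original answer:

```python
import os
import tempfile

import tensorflow as tf

# Pin loading and saving to the CPU so this process never touches the
# GPU that the other training process is using.
with tf.device('/CPU:0'):
    # Stand-in model; in practice you would call
    # tf.keras.models.load_model on your own saved model here.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    path = os.path.join(tempfile.mkdtemp(), 'model.h5')
    model.save(path)
    restored = tf.keras.models.load_model(path)

print(restored.count_params())  # 4 weights + 1 bias = 5
```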
By default, TensorFlow 2 allocates 90% of GPU:0's memory at startup. If you set
import tensorflow as tf
tf.config.experimental.set_memory_growth(tf.config.experimental.list_physical_devices('GPU')[0], True)
you will be able to use the GPU for both tasks (provided, of course, that your GPU has enough memory for both).
If you want finer control over GPU memory usage, you can create a virtual GPU with a hard-coded video memory size:
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 2 GB of memory on the first GPU
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048)])  # limit in megabytes
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
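As a side note, recent TensorFlow 2 releases expose the same configuration without the experimental namespace; this equivalent sketch is an addition, not part of the original answer:

```python
import tensorflow as tf

# Same 2 GB cap as above, written with the stable TF 2 API names.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(memory_limit=2048)])
    except RuntimeError as e:
        # Must be configured before the GPU has been initialized
        print(e)
```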
It took me a while to find this answer:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
import tensorflow as tf
Starting your code with these lines lets you run your TF code on the CPU (avoiding CUDA altogether is clearly the solution) while a heavy GPU-bound training run is in progress.
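Put together, a minimal sketch; the verification print at the end is an addition for illustration, not part of the original answer:

```python
import os

# Hiding all CUDA devices must happen *before* TensorFlow is imported;
# once TF has initialized CUDA, changing the variable has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import tensorflow as tf

# With CUDA hidden, no GPUs are visible to this process,
# so all ops run on the CPU.
print(tf.config.list_physical_devices('GPU'))  # []
```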