Keras 运行 内存不足的后果
Consequences of Keras running out of memory
如果这个问题在这里不是主题,请随时参考另一个 StackExchange 站点。 :-)
我正在使用 Keras,我的 GPU 内存非常有限(GeForce GTX 970,~4G)。因此,我 运行 内存不足 (OOM) 与 Keras 一起工作,批处理大小设置在一定水平以上。降低批量大小我没有这个问题,但 Keras 输出以下警告:
2019-01-02 09:47:03.173259: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.57GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.211139: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.68GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.268074: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.95GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.685032: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.39GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.732304: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.56GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.850711: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.39GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.879135: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.48GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.963522: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.42GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.984897: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.47GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:04.058733: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.08GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
作为用户,这些警告对我意味着什么?这些性能提升是什么?这是否意味着它只是计算得更快,或者我什至在更好的验证损失方面获得了更好的结果?
在我的设置中,我使用带有 Tensorflow 后端和 tensorflow-gpu==1.8.0 的 Keras。
这意味着训练在速度方面会遇到一些效率损失,因为GPU不能用于某些操作。不过,输的结果应该不会受到影响。
为了避免此问题,最佳做法是减小批大小以有效利用可用的 GPU 内存。
如果这个问题在这里不是主题,请随时参考另一个 StackExchange 站点。 :-)
我正在使用 Keras,我的 GPU 内存非常有限(GeForce GTX 970,~4G)。因此,我 运行 内存不足 (OOM) 与 Keras 一起工作,批处理大小设置在一定水平以上。降低批量大小我没有这个问题,但 Keras 输出以下警告:
2019-01-02 09:47:03.173259: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.57GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.211139: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.68GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.268074: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.95GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.685032: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.39GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.732304: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.56GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.850711: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.39GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.879135: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.48GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.963522: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.42GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:03.984897: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.47GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-01-02 09:47:04.058733: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.08GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
作为用户,这些警告对我意味着什么?这些性能提升是什么?这是否意味着它只是计算得更快,或者我什至在更好的验证损失方面获得了更好的结果?
在我的设置中,我使用带有 Tensorflow 后端和 tensorflow-gpu==1.8.0 的 Keras。
这意味着训练在速度方面会遇到一些效率损失,因为GPU不能用于某些操作。不过,输的结果应该不会受到影响。
为了避免此问题,最佳做法是减小批大小以有效利用可用的 GPU 内存。