net.load_state_dict(torch.load('rnn_x_epoch.net')) not working on cpu
I am using PyTorch to train a neural network. Training and testing on the GPU works fine.
But when I try to load the model parameters on the CPU:
net.load_state_dict(torch.load('rnn_x_epoch.net'))
I get the following error:
RuntimeError: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at torch/csrc/cuda/Module.cpp:51
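I searched for the error, and it is usually attributed to a CUDA driver dependency. But since I am running on the CPU when it occurs, the cause must be something else, or I am missing something.
Since everything works on the GPU, I could simply run it there, but I want to train the network on the GPU, store the parameters, and then load them in CPU mode for prediction.
I am just looking for a way to load the parameters in CPU mode.
I also tried this to load the parameters: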
check = torch.load('rnn_x_epoch.net')
That did not work either.
I tried saving the model parameters in two ways to see whether either would help, but neither did:
1)
checkpoint = {'n_hidden': net.n_hidden,
              'n_layers': net.n_layers,
              'state_dict': net.state_dict(),
              'tokens': net.chars}
with open('rnn_x_epoch.net', 'wb') as f:
    torch.save(checkpoint, f)
2)
torch.save(model.state_dict(), 'rnn_x_epoch.net')
Traceback:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-9-e61f28013b35> in <module>()
----> 1 net.load_state_dict(torch.load('rnn_x_epoch.net'))
/opt/conda/lib/python3.6/site-packages/torch/serialization.py in load(f, map_location, pickle_module)
301 f = open(f, 'rb')
302 try:
--> 303 return _load(f, map_location, pickle_module)
304 finally:
305 if new_fd:
/opt/conda/lib/python3.6/site-packages/torch/serialization.py in _load(f, map_location, pickle_module)
467 unpickler = pickle_module.Unpickler(f)
468 unpickler.persistent_load = persistent_load
--> 469 result = unpickler.load()
470
471 deserialized_storage_keys = pickle_module.load(f)
/opt/conda/lib/python3.6/site-packages/torch/serialization.py in persistent_load(saved_id)
435 if root_key not in deserialized_objects:
436 deserialized_objects[root_key] = restore_location(
--> 437 data_type(size), location)
438 storage = deserialized_objects[root_key]
439 if view_metadata is not None:
/opt/conda/lib/python3.6/site-packages/torch/serialization.py in default_restore_location(storage, location)
86 def default_restore_location(storage, location):
87 for _, _, fn in _package_registry:
---> 88 result = fn(storage, location)
89 if result is not None:
90 return result
/opt/conda/lib/python3.6/site-packages/torch/serialization.py in _cuda_deserialize(obj, location)
68 if location.startswith('cuda'):
69 device = max(int(location[5:]), 0)
---> 70 return obj.cuda(device)
71
72
/opt/conda/lib/python3.6/site-packages/torch/_utils.py in _cuda(self, device, non_blocking, **kwargs)
66 if device is None:
67 device = -1
---> 68 with torch.cuda.device(device):
69 if self.is_sparse:
70 new_type = getattr(torch.cuda.sparse,
self.__class__.__name__)
/opt/conda/lib/python3.6/site-packages/torch/cuda/__init__.py in __enter__(self)
223 if self.idx is -1:
224 return
--> 225 self.prev_idx = torch._C._cuda_getDevice()
226 if self.prev_idx != self.idx:
227 torch._C._cuda_setDevice(self.idx)
RuntimeError: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at torch/csrc/cuda/Module.cpp:51
It could also be that the save/load operations in PyTorch only work in GPU mode, but I am not really convinced of that.
From the PyTorch documentation:
When you call torch.load() on a file which contains GPU tensors, those tensors will be loaded to GPU by default.
To load a model that was saved on the GPU onto the CPU, pass the map_location argument to torch.load with the CPU device as its value, as follows:
# Load all tensors onto the CPU
net.load_state_dict(torch.load('rnn_x_epoch.net', map_location=torch.device('cpu')))
In doing so, the storages underlying the tensors are dynamically remapped to the CPU device using the map_location argument. You can read more in the official PyTorch tutorials.
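As a rough end-to-end illustration of the call above, here is a minimal sketch of the save-then-load round trip. The `nn.Linear` model and the temporary file path are stand-ins chosen for this example, not the asker's actual network, and the script falls back to the CPU for the "training" device when no GPU is present.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in for the trained network; any nn.Module behaves the same way.
net = nn.Linear(4, 2)

# Use the GPU when one is available; otherwise this sketch stays on the CPU.
train_device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net.to(train_device)

# Save only the parameters (variant 2 in the question), to a temporary path.
path = os.path.join(tempfile.mkdtemp(), 'rnn_x_epoch.net')
torch.save(net.state_dict(), path)

# map_location remaps every saved storage to the CPU while the file is read,
# so the tensors are not restored to the CUDA device they were saved from.
state_dict = torch.load(path, map_location=torch.device('cpu'))

cpu_net = nn.Linear(4, 2)
cpu_net.load_state_dict(state_dict)
cpu_net.eval()  # switch to inference mode before predicting

# Collect the device type of every loaded tensor.
loaded_devices = {tensor.device.type for tensor in state_dict.values()}
print(loaded_devices)
```

Without `map_location`, `torch.load` tries to restore each tensor to the device it was saved from, which is the `obj.cuda(device)` call visible in the traceback.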
Alternatively, it can be done as follows:
# Load all tensors onto the CPU, using a function
net.load_state_dict(torch.load('rnn_x_epoch.net', map_location=lambda storage, loc: storage))
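One detail worth noting for the first saving variant in the question: that file holds a dictionary, so the parameters sit under the `'state_dict'` key and the loaded object cannot be passed to `load_state_dict` directly. The sketch below shows one way this might look. The `CharRNN` class is a hypothetical stand-in, since the asker's model definition is not shown, and the temporary path is likewise an assumption.

```python
import os
import tempfile

import torch
import torch.nn as nn


class CharRNN(nn.Module):
    """Hypothetical stand-in for the asker's network (its definition is not shown)."""

    def __init__(self, tokens, n_hidden=8, n_layers=1):
        super().__init__()
        self.chars = tokens
        self.n_hidden = n_hidden
        self.n_layers = n_layers
        self.lstm = nn.LSTM(len(tokens), n_hidden, n_layers, batch_first=True)
        self.fc = nn.Linear(n_hidden, len(tokens))


# Save in the dictionary format from the question (variant 1).
net = CharRNN(tokens=['a', 'b', 'c'])
checkpoint = {'n_hidden': net.n_hidden,
              'n_layers': net.n_layers,
              'state_dict': net.state_dict(),
              'tokens': net.chars}
path = os.path.join(tempfile.mkdtemp(), 'rnn_x_epoch.net')
with open(path, 'wb') as f:
    torch.save(checkpoint, f)

# Load on the CPU, rebuild the model from the stored hyperparameters,
# then pass only the 'state_dict' entry to load_state_dict.
loaded = torch.load(path, map_location='cpu')
restored = CharRNN(loaded['tokens'],
                   n_hidden=loaded['n_hidden'],
                   n_layers=loaded['n_layers'])
restored.load_state_dict(loaded['state_dict'])
restored.eval()  # inference mode for prediction
```

Storing the hyperparameters next to the weights lets the prediction script rebuild the model without hard-coding its size.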