Cannot deploy trained model to Google Cloud Ai-Platform with custom prediction routine: Model requires more memory than allowed
I am trying to deploy a pretrained PyTorch model to AI Platform with a custom prediction routine. After following the instructions described here, the deployment fails with the following error:
ERROR: (gcloud.beta.ai-platform.versions.create) Create Version failed. Bad model detected with error: Model requires more memory than allowed. Please try to decrease the model size and re-deploy. If you continue to have error, please contact Cloud ML.
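The contents of the model folder total 83.89 MB, which is below the 250 MB limit described in the documentation. The only files in the folder are the model's checkpoint file (.pth) and the tarball required by the custom prediction routine.
The command used to create the model version: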
gcloud beta ai-platform versions create pose_pytorch --model pose --runtime-version 1.15 --python-version 3.5 --origin gs://rcg-models/pytorch_pose_estimation --package-uris gs://rcg-models/pytorch_pose_estimation/my_custom_code-0.1.tar.gz --prediction-class predictor.MyPredictor
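Changing the runtime version to 1.14 leads to the same error.
I also tried changing the --machine-type argument to mls1-c4-m2, as Parth suggested, but I still get the same error.
The setup.py file that generates my_custom_code-0.1.tar.gz looks like this: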
from setuptools import setup

setup(
    name='my_custom_code',
    version='0.1',
    scripts=['predictor.py'],
    install_requires=["opencv-python", "torch"]
)
The relevant code snippet from the predictor:
import os

import torch
from google.cloud import storage

# PoseEstimationWithMobileNet and load_state come from the model code (not shown).


class MyPredictor(object):

    def __init__(self, model):
        """Stores artifacts for prediction. Only initialized via `from_path`.
        """
        self._model = model
        self._client = storage.Client()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of MyPredictor using the given path.

        This loads artifacts that have been copied from your model directory in
        Cloud Storage. MyPredictor uses them during prediction.

        Args:
            model_dir: The local directory that contains the trained Keras
                model and the pickled preprocessor instance. These are copied
                from the Cloud Storage model directory you provide when you
                deploy a version resource.

        Returns:
            An instance of `MyPredictor`.
        """
        net = PoseEstimationWithMobileNet()
        checkpoint_path = os.path.join(model_dir, "checkpoint_iter_370000.pth")
        # Load the weights onto the CPU regardless of where they were saved.
        checkpoint = torch.load(checkpoint_path, map_location='cpu')
        load_state(net, checkpoint)
        return cls(net)
In addition, I enabled logging for the model in AI Platform and got the following output:
2019-12-17T09:28:06.208537Z OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
2019-12-17T09:28:13.474653Z WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/google/cloud/ml/prediction/frameworks/tf_prediction_lib.py:48: The name tf.saved_model.tag_constants.SERVING is deprecated. Please use tf.saved_model.SERVING instead.
2019-12-17T09:28:13.474680Z {"textPayload":"","insertId":"5df89fad00073e383ced472a","resource":{"type":"cloudml_model_version","labels":{"project_id":"rcg-shopper","region":"","version_id":"lightweight_pose_pytorch","model_id":"pose"}},"timestamp":"2019-12-17T09:28:13.474680Z","logName":"projects/rcg-shopper/logs/ml.googleapis…
2019-12-17T09:28:13.474807Z WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/google/cloud/ml/prediction/frameworks/tf_prediction_lib.py:50: The name tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY is deprecated. Please use tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY instead.
2019-12-17T09:28:13.474829Z {"textPayload":"","insertId":"5df89fad00073ecd4836d6aa","resource":{"type":"cloudml_model_version","labels":{"project_id":"rcg-shopper","region":"","version_id":"lightweight_pose_pytorch","model_id":"pose"}},"timestamp":"2019-12-17T09:28:13.474829Z","logName":"projects/rcg-shopper/logs/ml.googleapis…
2019-12-17T09:28:13.474918Z WARNING:tensorflow:
2019-12-17T09:28:13.474927Z The TensorFlow contrib module will not be included in TensorFlow 2.0.
2019-12-17T09:28:13.474934Z For more information, please see:
2019-12-17T09:28:13.474941Z * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
2019-12-17T09:28:13.474951Z * https://github.com/tensorflow/addons
2019-12-17T09:28:13.474958Z * https://github.com/tensorflow/io (for I/O related ops)
2019-12-17T09:28:13.474964Z If you depend on functionality not listed there, please file an issue.
2019-12-17T09:28:13.474999Z {"textPayload":"","insertId":"5df89fad00073f778735d7c3","resource":{"type":"cloudml_model_version","labels":{"version_id":"lightweight_pose_pytorch","model_id":"pose","project_id":"rcg-shopper","region":""}},"timestamp":"2019-12-17T09:28:13.474999Z","logName":"projects/rcg-shopper/logs/ml.googleapis…
2019-12-17T09:28:15.283483Z ERROR:root:Failed to import GA GRPC module. This is OK if the runtime version is 1.x
2019-12-17T09:28:16.890923Z Copying gs://cml-489210249453-1560169483791188/models/pose/lightweight_pose_pytorch/15316451609316207868/user_code/my_custom_code-0.1.tar.gz...
2019-12-17T09:28:16.891150Z / [0 files][ 0.0 B/ 8.4 KiB]
2019-12-17T09:28:17.007684Z / [1 files][ 8.4 KiB/ 8.4 KiB]
2019-12-17T09:28:17.009154Z Operation completed over 1 objects/8.4 KiB.
2019-12-17T09:28:18.953923Z Processing /tmp/custom_code/my_custom_code-0.1.tar.gz
2019-12-17T09:28:19.808897Z Collecting opencv-python
2019-12-17T09:28:19.868579Z Downloading https://files.pythonhosted.org/packages/d8/38/60de02a4c9013b14478a3f681a62e003c7489d207160a4d7df8705a682e7/opencv_python-4.1.2.30-cp37-cp37m-manylinux1_x86_64.whl (28.3MB)
2019-12-17T09:28:21.537989Z Collecting torch
2019-12-17T09:28:21.552871Z Downloading https://files.pythonhosted.org/packages/f9/34/2107f342d4493b7107a600ee16005b2870b5a0a5a165bdf5c5e7168a16a6/torch-1.3.1-cp37-cp37m-manylinux1_x86_64.whl (734.6MB)
2019-12-17T09:28:52.401619Z Collecting numpy>=1.14.5
2019-12-17T09:28:52.412714Z Downloading https://files.pythonhosted.org/packages/9b/af/4fc72f9d38e43b092e91e5b8cb9956d25b2e3ff8c75aed95df5569e4734e/numpy-1.17.4-cp37-cp37m-manylinux1_x86_64.whl (20.0MB)
2019-12-17T09:28:53.550662Z Building wheels for collected packages: my-custom-code
2019-12-17T09:28:53.550689Z Building wheel for my-custom-code (setup.py): started
2019-12-17T09:28:54.212558Z Building wheel for my-custom-code (setup.py): finished with status 'done'
2019-12-17T09:28:54.215365Z Created wheel for my-custom-code: filename=my_custom_code-0.1-cp37-none-any.whl size=7791 sha256=fd9ecd472a6a24335fd24abe930a4e7d909e04bdc4cf770989143d92e7023f77
2019-12-17T09:28:54.215482Z Stored in directory: /tmp/pip-ephem-wheel-cache-i7sb0bmb/wheels/0d/6e/ba/bbee16521304fc5b017fa014665b9cae28da7943275a3e4b89
2019-12-17T09:28:54.222017Z Successfully built my-custom-code
2019-12-17T09:28:54.650218Z Installing collected packages: numpy, opencv-python, torch, my-custom-code
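The log above already hints at where the memory goes: pip downloads its dependencies at deploy time, on top of the 83.89 MB model folder. As a rough sketch (the helper name `download_sizes_mb` is made up for illustration, and the regular expression assumes pip's "Downloading <url> (<size>MB)" line format seen in this log), the download sizes can be pulled out of the log lines like this:

```python
import re

# Matches pip's "Downloading <url> (<size>MB)" lines as they appear in the log above.
DOWNLOAD_RE = re.compile(r"Downloading\s+(\S+)\s+\((\d+(?:\.\d+)?)\s*MB\)")


def download_sizes_mb(log_lines):
    """Return {wheel filename: size in MB} for every pip download in the log."""
    sizes = {}
    for line in log_lines:
        match = DOWNLOAD_RE.search(line)
        if match:
            url, size = match.groups()
            # Keep only the file name at the end of the URL.
            sizes[url.rsplit("/", 1)[-1]] = float(size)
    return sizes


# The three download lines copied from the deployment log.
LOG = [
    "2019-12-17T09:28:19.868579Z Downloading https://files.pythonhosted.org/packages/d8/38/60de02a4c9013b14478a3f681a62e003c7489d207160a4d7df8705a682e7/opencv_python-4.1.2.30-cp37-cp37m-manylinux1_x86_64.whl (28.3MB)",
    "2019-12-17T09:28:21.552871Z Downloading https://files.pythonhosted.org/packages/f9/34/2107f342d4493b7107a600ee16005b2870b5a0a5a165bdf5c5e7168a16a6/torch-1.3.1-cp37-cp37m-manylinux1_x86_64.whl (734.6MB)",
    "2019-12-17T09:28:52.412714Z Downloading https://files.pythonhosted.org/packages/9b/af/4fc72f9d38e43b092e91e5b8cb9956d25b2e3ff8c75aed95df5569e4734e/numpy-1.17.4-cp37-cp37m-manylinux1_x86_64.whl (20.0MB)",
]

sizes = download_sizes_mb(LOG)
largest = max(sizes, key=sizes.get)
print("largest download:", largest, sizes[largest], "MB")
print("total downloaded: %.1f MB" % sum(sizes.values()))
```

The torch wheel alone is several times larger than the 250 MB limit mentioned in the question, which is consistent with the explanations given in the answers.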
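I was able to get this working by tweaking setup.py. Basically, install_requires tries to fetch the PyPI-hosted torch package, which is a huge wheel with GPU support built in and exceeds the deployment quota. The following setup.py injects a custom install command that fetches the CPU build of torch from the official PyTorch index.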
from setuptools import setup, find_packages
from setuptools.command.install import install as _install

INSTALL_REQUIRES = ['pillow']

CUSTOM_INSTALL_COMMANDS = [
    # Install torch here.
    [
        'python-default', '-m', 'pip', 'install', '--target=/tmp/custom_lib',
        '-b', '/tmp/pip_builds', 'torch==1.4.0+cpu', 'torchvision==0.5.0+cpu',
        '-f', 'https://download.pytorch.org/whl/torch_stable.html'
    ],
]


class Install(_install):
    def run(self):
        import sys
        if sys.platform == 'linux':
            import subprocess
            import logging
            for command in CUSTOM_INSTALL_COMMANDS:
                logging.info('Custom command: ' + ' '.join(command))
                result = subprocess.run(
                    command, check=True, stdout=subprocess.PIPE
                )
                logging.info(result.stdout.decode('utf-8', 'ignore'))
        _install.run(self)


setup(
    name='predictor',
    version='0.1',
    packages=find_packages(),
    install_requires=INSTALL_REQUIRES,
    cmdclass={'install': Install},
)
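To confirm that a deployment really picked up the CPU build, one option is to look at the local version label of the installed package. This is only a sketch: `is_cpu_build` is a hypothetical helper, and it assumes that the CPU wheels pinned above (`torch==1.4.0+cpu`) report the same `+cpu` suffix in their version string.

```python
def is_cpu_build(version):
    """Return True if a version string carries the '+cpu' local label."""
    # PEP 440 local version labels follow a '+', for example '1.4.0+cpu'.
    _, _, local = version.partition("+")
    return local == "cpu"


# Version strings taken from this thread: the pinned CPU build and the
# PyPI wheel seen in the deployment log.
print(is_cpu_build("1.4.0+cpu"))
print(is_cpu_build("1.3.1"))
```

Inside the predictor, the same check could be logged once at load time with `is_cpu_build(torch.__version__)`, assuming `torch.__version__` exposes the label.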
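After hours of good old trial and error, I came to the same conclusion as @kyamagu: "install_requires tries to fetch the PyPI-hosted torch package, which is a huge wheel with GPU support built in and exceeds the deployment quota."
However, his solution did not work for me. So, after more hours of trial and error (because of missing and incorrect documentation), I came up with this solution:
We need the CPU build of the PyTorch wheel, which is about 100 MB, instead of the roughly 700 MB GPU build that PyPI hosts by default. You can find the CPU wheels here: https://download.pytorch.org/whl/cpu/torch_stable.html
Next, we need to put the wheel in our Cloud Storage bucket and pass its path as part of --package-uris, like this: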
gcloud beta ai-platform versions create v17 \
--model=newest \
--origin=gs://bucket \
--runtime-version=1.15 \
--python-version=3.7 \
--package-uris=gs://bucket/predictor-0.1.tar.gz,gs://bucket/torch-1.3.0+cpu-cp37-cp37m-linux_x86_64.whl \
--prediction-class=predictor.MyPredictor \
--machine-type=mls1-c4-m4
Also, pay attention to the order of the package-uris: the predictor package comes first, and there must be no space after the comma.
Hope this helps. Cheers!
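The ordering and no-space rules for --package-uris are easy to get wrong when the command is typed by hand. A minimal sketch of a helper that assembles the value (the function name `build_package_uris` is an assumption for illustration, not part of gcloud):

```python
def build_package_uris(predictor_uri, *dependency_uris):
    """Join Cloud Storage URIs into a --package-uris value, predictor first."""
    uris = [predictor_uri, *dependency_uris]
    for uri in uris:
        if not uri.startswith("gs://"):
            raise ValueError("not a Cloud Storage URI: %r" % uri)
        if "," in uri or any(ch.isspace() for ch in uri):
            raise ValueError("URI must not contain commas or whitespace: %r" % uri)
    # Comma-separated with no spaces, as required by the command above.
    return ",".join(uris)


value = build_package_uris(
    "gs://bucket/predictor-0.1.tar.gz",
    "gs://bucket/torch-1.3.0+cpu-cp37-cp37m-linux_x86_64.whl",
)
print("--package-uris=" + value)
```

The bucket and file names are the ones from the command above; the helper only validates and joins strings and does not check that the objects exist.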
This is a common problem, and we understand it is a pain point. Please do the following:
1. torchvision has torch as a dependency and, by default, pulls torch from PyPI. When you deploy the model, this happens even if you point to a custom torchvision package for AI Platform, because torchvision as built by the PyTorch team is configured to use torch as a dependency. That torch dependency from PyPI is a 720 MB file because it includes GPU support.
2. To solve #1, build torchvision from source and tell it where to get torch from: point it at the PyTorch download site, where the CPU package is smaller. Rebuild the torchvision binary using the PEP 440 direct references feature. In the torchvision setup.py we have:
pytorch_dep = 'torch'
if os.getenv('PYTORCH_VERSION'):
    pytorch_dep += "==" + os.getenv('PYTORCH_VERSION')
Update setup.py in torchvision to use the direct references feature:
requirements = [
    #'numpy',
    #'six',
    #pytorch_dep,
    'torch @ https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl'
]
* I already did this for you *, so I built 3 wheel files you can use:
gs://dpe-sandbox/torchvision-0.4.0-cp37-cp37m-linux_x86_64.whl (torch 1.2.0, vision 0.4.0)
gs://dpe-sandbox/torchvision-0.4.2-cp37-cp37m-linux_x86_64.whl (torch 1.2.0, vision 0.4.2)
gs://dpe-sandbox/torchvision-0.5.0-cp37-cp37m-linux_x86_64.whl (torch 1.4.0, vision 0.5.0)
These torchvision packages get torch from the PyTorch download site instead of PyPI (example: https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl).
3. Update your model's setup.py when deploying to AI Platform so that it includes neither torch nor torchvision.
4. Redeploy the model as follows:
PYTORCH_VISION_PACKAGE=gs://dpe-sandbox/torchvision-0.5.0-cp37-cp37m-linux_x86_64.whl
gcloud beta ai-platform versions create {MODEL_VERSION} --model={MODEL_NAME} \
--origin=gs://{BUCKET}/{GCS_MODEL_DIR} \
--python-version=3.7 \
--runtime-version={RUNTIME_VERSION} \
--machine-type=mls1-c4-m4 \
--package-uris=gs://{BUCKET}/{GCS_PACKAGE_URI},{PYTORCH_VISION_PACKAGE}\
--prediction-class={MODEL_CLASS}
You can change PYTORCH_VISION_PACKAGE to any of the options I mentioned in #2.
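The direct reference used in the torchvision setup.py above has to URL-encode the "+" in the wheel file name as "%2B". A small sketch of how such a requirement string could be generated for another torch version follows; the helper names are hypothetical, and the file-name pattern is assumed from the single cp37 example URL in this answer, so it may not hold for other Python versions.

```python
from urllib.parse import quote

# Base URL taken from the example wheel link in this answer.
BASE_URL = "https://download.pytorch.org/whl/cpu/"


def cpu_wheel_url(version, py_tag="cp37"):
    """Build the URL of a CPU-only torch wheel (pattern assumed from the cp37 example)."""
    filename = "torch-{v}+cpu-{t}-{t}m-linux_x86_64.whl".format(v=version, t=py_tag)
    # quote() percent-encodes '+' as '%2B' and leaves '-', '.' and '_' alone.
    return BASE_URL + quote(filename)


def direct_reference(version, py_tag="cp37"):
    """Return a 'name @ url' direct reference for a requirements list."""
    return "torch @ " + cpu_wheel_url(version, py_tag)


print(direct_reference("1.4.0"))
```

For version 1.4.0 this reproduces the requirement line shown in the modified torchvision setup.py.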