Install python-tk using apt-get before running TensorFlow on Google Cloud ML
I am building an object detector with Cloud Machine Learning Engine from a Cloud VM instance, following this tutorial (https://cloud.google.com/blog/big-data/2017/06/training-an-object-detector-using-cloud-machine-learning-engine).
I get a module import error when I submit the following training job on Google Cloud Platform:
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%s` \
--job-dir=${YOUR_GCS_BUCKET}/train \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.train \
--region us-central1 \
--config object_detection/samples/cloud/cloud.yml \
-- \
--train_dir=${YOUR_GCS_BUCKET}/train \
--pipeline_config_path=${YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_coco.config
The error is as follows:
...object_detection/utils/visualization_utils.py", line 24, in <module>
import matplotlib.pyplot as plt
ImportError: No module named matplotlib.pyplot
I have already installed matplotlib with pip install. This command runs without error: python2.7 -c 'import matplotlib.pyplot as plt'.
Adding the package name to the REQUIRED_PACKAGES list in setup.py resolved the matplotlib error.
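Since the import succeeds on the VM but fails on the Cloud ML worker, the two environments evidently do not have the same packages installed. A minimal sketch of a check that could be run at the start of a job to report which modules the worker is missing; the helper name missing_modules is hypothetical and not part of the Object Detection API.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported here."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Also covers ModuleNotFoundError on Python 3.
            missing.append(name)
    return missing


# Example: "json" is in the standard library, the second name is made up.
print(missing_modules(["json", "no_such_module_xyz_123"]))
```

Any name this reports on the worker is a candidate for REQUIRED_PACKAGES, provided it is something pip can install.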
For reference, here is my setup.py file:
"""Setup script for object_detection."""
from setuptools import find_packages
from setuptools import setup
import subprocess
subprocess.check_call(['apt-get', 'update'])
subprocess.check_call(['apt-get', 'install', 'python-tk'])
REQUIRED_PACKAGES = ['Pillow>=1.0', 'matplotlib']
setup(
    name='object_detection',
    version='0.1',
    install_requires=REQUIRED_PACKAGES,
    include_package_data=True,
    packages=[p for p in find_packages() if p.startswith('object_detection')],
    description='Tensorflow Object Detection Library',
)
However, even after fixing that, other errors appear, because matplotlib depends on the python-tk package.
ps-replica-0 Could not find a version that satisfies the requirement python-tk (from object-detection==0.1) (from versions: ) ps-replica-0
ps-replica-0 No matching distribution found for python-tk (from object-detection==0.1) ps-replica-0
ps-replica-0 Command '['pip', 'install', '--user', u'object_detection-0.1.tar.gz']' returned non-zero exit status 1 ps-replica-0
ps-replica-0 Module completed; cleaning up. ps-replica-0
However, python-tk/python3-tk is not available as a pip package. To install it, we need to run
sudo apt-get install python-tk
or
sudo apt-get install python3-tk
Google Cloud ML runs Python 2.7, so python-tk needs to be installed before the TensorFlow training program runs.
Can someone help me tell Cloud ML to install python-tk with apt-get before running TensorFlow?
Update_01:
I ran into another set of errors. They appear to be caused by python setup.py egg_info failing, along with this:
Command '['apt-get', 'install', 'python-tk']' returned non-zero exit status 1
The error log is shown below.
Thanks in advance for your help.
ps-replica-2
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-BhSDtP-build/
ps-replica-2
Command '['pip', 'install', '--user', '--upgrade', '--force-reinstall', '--no-deps', u'object_detection-0.1.tar.gz']' returned non-zero exit status 1
The replica ps 0 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-C3hdCp-build/setup.py", line 8, in <module>
subprocess.check_call(['apt-get', 'install', 'python-tk'])
File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['apt-get', 'install', 'python-tk']' returned non-zero exit status 1
The replica ps 1 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-iR0TqP-build/setup.py", line 8, in <module>
subprocess.check_call(['apt-get', 'install', 'python-tk'])
File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['apt-get', 'install', 'python-tk']' returned non-zero exit status 1
The replica ps 2 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-BhSDtP-build/setup.py", line 8, in <module>
subprocess.check_call(['apt-get', 'install', 'python-tk'])
File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['apt-get', 'install', 'python-tk']' returned non-zero exit status 1
To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=640992742297&resource=ml_job%2Fjob_id%2Froot_object_detection_1510462119&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22root_object_detection_1510462119%22"
Job submission command:
gcloud ml-engine jobs submit training `whoami`_object_detection_`date +%s` \
--job-dir=${YOUR_GCS_BUCKET}/train \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.train \
--config object_detection/samples/cloud/cloud.yml \
-- \
--train_dir=${YOUR_GCS_BUCKET}/train \
--pipeline_config_path=${YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_coco.config
Thanks in advance.
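Update_02: Solution
Thanks to @Dennis Liu for the solution. There is no need to install the python-tk package.
Apart from that, one more error comes up, which can be fixed by changing tf.train.get_or_create_global_step() to tf.contrib.framework.get_or_create_global_step() on line 103 of object_detection/builders/optimizer_builder.py. Solution Link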
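The edit above hard-codes one location for the function. A sketch of a more tolerant lookup, assuming (as the fix suggests) that some TensorFlow 1.x builds expose the function under tf.train and others only under tf.contrib.framework. The helper resolve_first is hypothetical, the snippet is written for Python 3, and fake_tf is only a stand-in imitating an older TensorFlow so the example runs without TensorFlow installed.

```python
from types import SimpleNamespace


def resolve_first(root, dotted_paths):
    """Return the first attribute found on `root` among `dotted_paths`."""
    for path in dotted_paths:
        obj = root
        try:
            for part in path.split("."):
                obj = getattr(obj, part)
        except AttributeError:
            continue  # this location does not exist, try the next one
        return obj
    raise AttributeError("none of %r found" % (dotted_paths,))


# Stand-in for a TensorFlow build where only the contrib location exists.
fake_tf = SimpleNamespace(
    train=SimpleNamespace(),
    contrib=SimpleNamespace(
        framework=SimpleNamespace(get_or_create_global_step=lambda: "global_step")
    ),
)

get_step = resolve_first(
    fake_tf,
    ["train.get_or_create_global_step",
     "contrib.framework.get_or_create_global_step"],
)
print(get_step())
```

With a real installation, the imported tf module would be passed as root in place of fake_tf.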
Add the following lines to your setup.py:
import subprocess
subprocess.check_call(['apt-get', 'install', 'python-tk'])
and remove python-tk from REQUIRED_PACKAGES.
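The Update_01 traceback in the question shows this same call exiting with status 1, so the lines may not work unchanged. One possible cause, which is an assumption here and not confirmed by the logs, is that apt-get stops to ask for confirmation; -y is the standard apt-get flag that answers yes automatically. A sketch with a hypothetical helper apt_install, whose runner argument exists only so the commands can be inspected without executing them:

```python
import subprocess


def apt_install(packages, runner=subprocess.check_call):
    """Refresh the package index, then install `packages` without prompting."""
    commands = [
        ["apt-get", "update"],
        ["apt-get", "install", "-y"] + list(packages),
    ]
    for command in commands:
        runner(command)
    return commands


# Dry run: collect the commands instead of executing them.
planned = []
apt_install(["python-tk"], runner=planned.append)
print(planned)
```

In setup.py this would be called as apt_install(['python-tk']) before setup(...). It still depends on the worker allowing apt-get to run with sufficient privileges, which is not verified here.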
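Call matplotlib.use('agg') after importing matplotlib and before importing matplotlib.pyplot.
I changed matplotlib's backend from the Tk-based one (which requires python-tk) to agg, and that did the trick. This is an answer I found elsewhere.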
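A minimal sketch of the backend switch described above, assuming matplotlib is installed; the plotted values are arbitrary. The call to matplotlib.use comes after import matplotlib and before import matplotlib.pyplot. Agg renders to a file or an in-memory buffer and needs neither a display nor Tk.

```python
import io

import matplotlib
matplotlib.use("agg")  # select the backend before pyplot is imported

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Render to an in-memory PNG instead of opening a window.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
print(matplotlib.get_backend())
```

In the Object Detection API the call would go in object_detection/utils/visualization_utils.py, above the import matplotlib.pyplot as plt line that appears in the traceback.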