Trouble installing Torch on Google App Engine
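I built a machine learning API that uses Torch as its ML framework. When I upload the code to Google App Engine, the build runs out of memory.
After some debugging, I found that the problem is caused by installing Torch.
I'm using Torch 1.5.0 and Python 3.7.4.
So how can I fix this error? Maybe I can change something in app.yaml?
Error message: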
Step #1 - "builder": OSError: [Errno 12] Cannot allocate memory
Step #1 - "builder": self.pid = os.fork()
Step #1 - "builder": File "/usr/lib/python2.7/subprocess.py", line 938, in _execute_child
Step #1 - "builder": errread, errwrite)
Step #1 - "builder": File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 346, in _python_version
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 332, in GetCacheKeyRaw
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 109, in GetCacheKeyRaw
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/common/single_layer_image.py", line 60, in GetCacheKey
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 153, in BuildLayer
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/builder.py", line 114, in Build
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__.py", line 54, in main
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__.py", line 65, in <module>
Step #1 - "builder": exec code in run_globals
Step #1 - "builder": File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
Step #1 - "builder": "__main__", fname, loader, pkg_name)
Step #1 - "builder": File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
Step #1 - "builder": Traceback (most recent call last):
This error message does not appear when I leave torch out of my requirements.txt.
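Since the failure depends on whether torch is listed, it can help to confirm exactly what a requirements file pins before deploying. The following is a minimal sketch, not part of the original setup: `pinned_packages` is a hypothetical helper, it only understands plain `name==version` lines, and the `torch==1.5.0` line in the sample is added here purely for illustration.

```python
import re


def pinned_packages(requirements_text):
    """Return {lowercased name: version} for every 'name==version' line."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        match = re.match(r"^([A-Za-z0-9_.\-]+)==(\S+)$", line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


# Sample text; the torch line is an assumption made for this sketch.
sample = """
gunicorn==20.0.4
Flask==1.1.2
torch==1.5.0  # hypothetical line, added for this sketch
"""

pins = pinned_packages(sample)
print("torch" in pins, pins.get("torch"))
```

Running the same function over the real file shows at a glance whether torch is part of the build that fails.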
To reproduce:
app.yaml
runtime: python37
resources:
  memory_gb: 16
  disk_size_gb: 10
requirements.txt
gunicorn==20.0.4
aniso8601==8.0.0
beautifulsoup4==4.9.0
boto3==1.13.3
botocore==1.16.3
bs4==0.0.1
certifi==2020.4.5.1
chardet==3.0.4
click==7.1.2
colorama==0.4.3
docutils==0.15.2
filelock==3.0.12
Flask==1.1.2
Flask-RESTful==0.3.8
googletrans==2.4.0
idna==2.9
itsdangerous==1.1.0
Jinja2==2.11.2
jmespath==0.9.5
joblib==0.14.1
MarkupSafe==1.1.1
numpy==1.18.4
protobuf==3.11.3
python-dateutil==2.8.1
pytz==2020.1
regex==2020.4.4
requests==2.23.0
s3transfer==0.3.3
sacremoses==0.0.43
sentencepiece==0.1.86
six==1.14.0
soupsieve==2.0
tokenizers==0.5.2
tqdm==4.46.0
transformers==2.8.0
urllib3==1.25.9
Werkzeug==1.0.1
main.py
import json

import flask
from flask import Flask, request
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)
production = False

# Import api code

# Create main api 'view'
class main_api(Resource):
    def get(self):
        question = request.args.get('question')
        # Run the script
        # But not necessary for the minimum working test
        return {
            'question': question,
            # 'results': results_from_script,
        }

# Adds resource
api.add_resource(main_api, '/')

# Starts the api
if __name__ == '__main__':
    host = '127.0.0.1'
    port = 8080
    app.run(host=host, port=port, debug=not production)
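The resource above reads a `question` query parameter from a GET request on `/`. To try it against a local run, the URL can be built with the standard library. This is a minimal sketch: `build_query_url` is a hypothetical helper, and the host and port defaults simply mirror the values in main.py.

```python
from urllib.parse import urlencode


def build_query_url(question, host="127.0.0.1", port=8080):
    """Build the GET URL for the root resource with a 'question' parameter."""
    query = urlencode({"question": question})  # percent-encodes the value
    return "http://{}:{}/?{}".format(host, port, query)


print(build_query_url("What is Torch?"))
```

Opening the printed URL in a browser, or passing it to any HTTP client, should return the JSON object with the same question echoed back.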
I fixed this error by switching to the flex environment.
The only thing I had to change was app.yaml:
runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app
runtime_config:
  python_version: 3
manual_scaling:
  instances: 1
resources:
  cpu: 2
  memory_gb: 5
  disk_size_gb: 10
After that, the deployment succeeded.
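The entrypoint binds gunicorn to `$PORT` and imports `main:app`, so the `if __name__ == '__main__'` block in main.py, with its hard-coded port 8080, only runs locally. If you want the local development server to honor the same variable, something like the following could work. It is a sketch under that assumption: `resolve_port` is a hypothetical helper, not part of Flask, gunicorn, or App Engine.

```python
import os


def resolve_port(environ=None, default=8080):
    """Return the port from the PORT variable, or the default if unset or invalid."""
    environ = os.environ if environ is None else environ
    value = environ.get("PORT", "")
    try:
        port = int(value)
    except ValueError:
        # PORT is missing or not a number: fall back to the default.
        return default
    # Reject values outside the valid TCP port range.
    return port if 0 < port < 65536 else default


print(resolve_port({"PORT": "9000"}))
print(resolve_port({}))
```

In main.py this would replace the literal, for example `app.run(host=host, port=resolve_port(), debug=not production)`.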