Docker uwsgi-nginx-flask with joblib, unable to find local function, but works in standalone flask
I get the following error when I try to load a pretrained model via joblib inside a docker container.
web_1 | 2018-02-06 15:11:50,826 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
web_1 | 2018-02-06 15:11:50,828 INFO success: uwsgi entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
web_1 | Traceback (most recent call last):
web_1 | File "./app/main.py", line 23, in <module>
web_1 | svm_detector_reloaded=joblib.load(filename);
web_1 | File "/usr/local/lib/python3.6/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 578, in load
web_1 | obj = _unpickle(fobj, filename, mmap_mode)
web_1 | File "/usr/local/lib/python3.6/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 508, in _unpickle
web_1 | obj = unpickler.load()
web_1 | File "/usr/local/lib/python3.6/pickle.py", line 1050, in load
web_1 | dispatch[key[0]](self)
web_1 | File "/usr/local/lib/python3.6/pickle.py", line 1338, in load_global
web_1 | klass = self.find_class(module, name)
web_1 | File "/usr/local/lib/python3.6/pickle.py", line 1392, in find_class
web_1 | return getattr(sys.modules[module], name)
web_1 | AttributeError: module '__main__' has no attribute 'split_into_lemmas'
web_1 | unable to load app 0 (mountpoint='') (callable not found or import error)
web_1 | *** no app loaded. going in full dynamic mode ***
web_1 | *** uWSGI is running in multiple interpreter mode ***
My main.py looks like:
from flask import Flask
from flask import request
from flask import jsonify
from textblob import TextBlob
import sklearn
import numpy as np
from sklearn.externals import joblib

app = Flask(__name__)

from .api.utils import split_into_lemmas as split_into_lemmas

def split_into_lemmas(message):
    message=message.lower()
    words = TextBlob(message).words
    # for each word, take its "base form" = lemma
    return [word.lemma for word in words]

def tollower(message):
    return message.lower()

filename = '../../data/sms_spam_detector.pkl'
svm_detector_reloaded=joblib.load(filename);

text="Testing"
lowerText=tollower(text)

@app.route('/')
def hello():
    return tollower("Test Test ");

@app.route('/detect/')
def route_detect():
    SMS=request.args.get('SMS')
    if(SMS==None or SMS==''):
        SMS="Test";
    return tollower(SMS);
    # test=[SMS]
    # message= ( svm_detector_reloaded.predict(test)[0])
    # return SMS+" "+message;

if __name__ == "__main__":
    # Only for debugging while developing
    app.run(host='0.0.0.0')
Basically, I downloaded example-flask-package-python3.6.zip from tiangolo/uwsgi-nginx-flask, added a data directory, and modified the Dockerfile and main.py. main.py is pasted above; the Dockerfile looks like:
FROM tiangolo/uwsgi-nginx-flask:python3.6
ENV LISTEN_PORT 8080
EXPOSE 8080
RUN pip3 install numpy TextBlob scikit-learn scipy
COPY ./app /app
COPY ./data /data
Then I copied the prebuilt model (stored via joblib) into the newly created data directory. Everything works fine if I run the code directly with python main.py, but when I issue docker-compose up I get the error above. If I comment out the line svm_detector_reloaded=joblib.load(filename); docker comes up and everything works, except the machine-learning part.
Basically, the function split_into_lemmas, although defined, is not accessible to the unpickled model.
What am I doing wrong here? The model was built following the steps at http://radimrehurek.com/data_science_python; the actual model is built in step 6.
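For context on why this fails: pickle serializes a function by reference, recording only the defining module and the name, not the function's code. A minimal sketch of the effect, using plain pickle (joblib follows the same pickling semantics) and a simplified stand-in for the lemmatizer:

```python
import pickle

# Simplified stand-in for the real TextBlob-based lemmatizer
def split_into_lemmas(message):
    return message.lower().split()

# pickle stores a *reference* -- the defining module plus the attribute
# name -- not the function's bytecode
payload = pickle.dumps(split_into_lemmas)
assert b"split_into_lemmas" in payload

# Loading works here only because the current process still defines the
# function; under uWSGI, __main__ is a different module, so the lookup
# in pickle's find_class() raises AttributeError, as in the traceback above.
restored = pickle.loads(payload)
print(restored("Hello World"))  # -> ['hello', 'world']
```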
OK, I was able to solve it. I got the clue from 3614379. I first created the function split_into_lemmas in a separate module (its own .py file) and imported that module at training time, instead of keeping the function in the main file itself. Then, in my docker instance, I imported the same module as well. That resolved the issue.
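The fix can be sketched as follows. Here lemma_utils is a hypothetical module name standing in for wherever the function actually lives in the project (e.g. app/api/utils.py), and the module file is written to a temp directory only to keep the sketch self-contained:

```python
import pathlib
import pickle
import sys
import tempfile

# Write a hypothetical shared module to a temp dir; in the real project
# this file would sit next to main.py and be copied into the image
tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "lemma_utils.py").write_text(
    "def split_into_lemmas(message):\n"
    "    return message.lower().split()\n"
)
sys.path.insert(0, tmp)

import lemma_utils  # both the training script and main.py do this import

# Training side: the pickle now records "lemma_utils.split_into_lemmas"
# instead of "__main__.split_into_lemmas"
payload = pickle.dumps(lemma_utils.split_into_lemmas)
assert b"lemma_utils" in payload

# Serving side (e.g. inside the docker container): unpickling succeeds in
# any process where lemma_utils is importable
restored = pickle.loads(payload)
print(restored("Ham or Spam"))  # -> ['ham', 'or', 'spam']
```

The key point is that the function's qualified name in the pickle must resolve identically in the training process and in the uWSGI worker, which is guaranteed when it lives in an importable module rather than in __main__.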