Troubleshooting a deploy to heroku - RuntimeError: The reset parameter is False but there is no n_features_in_ attribute. Is this estimator fitted?

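Hello Stack Overflow community. I'm working on a simple project for class that involves hosting a Flask application where users enter values and the app predicts who will win. It runs fine on our local machine using Python Flask, but it has been a challenge on Heroku. Stumped!

The code is neither long nor complex, and we know there is room for improvement.

Our App.py file looks like this: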

import flask
import tensorflow
import sklearn
import pandas as pd
import numpy as np
from tensorflow import keras
from pickle import load
from load import *

model = init()

scaler = load(open('model/scaler.pkl', 'rb'))
print("Awesome, your scaler has been loaded from disk! Cool beans!")

app = flask.Flask(__name__, template_folder='templates')

@app.route('/', methods=['GET', 'POST'])
def main():
    global model 
    if flask.request.method == 'GET':
        return(flask.render_template('main.html'))
    if flask.request.method == 'POST':
        R_Weight = flask.request.form['R_Weight']
        R_Height = flask.request.form['R_Height']
        R_Age = flask.request.form['R_Age']
        B_Weight = flask.request.form['B_Weight']
        B_Height = flask.request.form['B_Height']
        B_Age = flask.request.form['B_Age']
        RPrev = 2.130049
        BPrev = 1.756650
        BStreak = 0.643350
        RStreak = 0.748768
        input_variables = pd.DataFrame([[BPrev, BStreak, B_Age,B_Height,B_Weight,RPrev, RStreak, R_Age,R_Height,R_Weight]],
                                    columns=['BPrev','BStreak','B_Age','B_Height','B_Weight','RPrev','RStreak','R_Age','R_Height','R_Weight'],
                                    dtype=float)
        input_scaled = scaler.transform(input_variables)
        prediction = model.predict(input_scaled)[0][0]
        if np.round(prediction) == 0:
            prediction = "Blue"
        else:
            prediction = "Red"   

        if int(R_Weight) < 20:
            prediction = "Invalid Weight"
        elif int(B_Weight) < 20:
            prediction = "Invalid Weight"
        elif int(R_Height) < 20:
            prediction = "Invalid Height"
        elif int(B_Height) < 20:
            prediction = "Invalid Height"
        elif int(R_Age) < 18:
            prediction = "Invalid Age"
        elif int(B_Age) < 18:
            prediction = "Invalid Age"

        return flask.render_template('main.html',
                                      original_input={'R_Weight' :R_Weight,
                                                    'R_Height' :R_Height,
                                                    'R_Age' :R_Age,
                                                    'B_Weight' :B_Weight,
                                                    'B_Height' :B_Height,
                                                    'B_Age' :B_Age
                                                     },
                                     result=prediction
                                     )

@app.route('/logisticregression')
def logreg():
    return(flask.render_template('00_model_logisticsregression.html'))

@app.route('/gridsearch')
def grisea():
    return(flask.render_template('00_model_gridsearch.html'))

@app.route('/deeplearning')
def deelea():
    return(flask.render_template('01_training_model.html'))

@app.route('/viz1')
def viz1():
    return(flask.render_template('00_data_visualization_a.html'))

@app.route('/viz2')
def viz2():
    return(flask.render_template('00_data_visualization_b.html'))

@app.route('/viz3')
def viz3():
    return(flask.render_template('00_data_visualization_c.html'))

@app.route('/trained')
def traine():
    return(flask.render_template('02_notraining_model.html'))

@app.route('/debugging')
def debugg():
    return(flask.render_template('03_debugger.html'))


if __name__ == '__main__':
    app.run(debug=True)

The error I see in Heroku is:

2020-05-31T19:43:05.000000+00:00 app[api]: Build succeeded
2020-05-31T19:48:29.426002+00:00 app[web.1]: 10.81.161.51 - - [31/May/2020:19:48:29 +0000] "GET / HTTP/1.1" 200 4710 "https://dashboard.heroku.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36"
2020-05-31T19:48:29.432446+00:00 heroku[router]: at=info method=GET path="/" host=bcs-final.herokuapp.com request_id=1df55828-886a-45d1-9605-aafd3414f9e3 fwd="71.143.151.112" dyno=web.1 connect=1ms service=56ms status=200 bytes=4872 protocol=https
2020-05-31T19:48:29.623294+00:00 app[web.1]: 10.81.161.51 - - [31/May/2020:19:48:29 +0000] "GET /reset.css HTTP/1.1" 404 232 "https://bcs-final.herokuapp.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36"
2020-05-31T19:48:29.623792+00:00 heroku[router]: at=info method=GET path="/reset.css" host=bcs-final.herokuapp.com request_id=40816718-e41b-44e9-a3d5-da5ff2030adb fwd="71.143.151.112" dyno=web.1 connect=1ms service=7ms status=404 bytes=400 protocol=https
2020-05-31T19:48:29.625560+00:00 app[web.1]: 10.7.244.15 - - [31/May/2020:19:48:29 +0000] "GET /style.css HTTP/1.1" 404 232 "https://bcs-final.herokuapp.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36"
2020-05-31T19:48:29.627392+00:00 heroku[router]: at=info method=GET path="/style.css" host=bcs-final.herokuapp.com request_id=bd01a28c-7d1e-4dc6-9a85-405b1bdbb8ea fwd="71.143.151.112" dyno=web.1 connect=1ms service=8ms status=404 bytes=400 protocol=https
2020-05-31T19:48:37.727173+00:00 app[web.1]: [2020-05-31 19:48:37,723] ERROR in app: Exception on / [POST]
2020-05-31T19:48:37.727185+00:00 app[web.1]: Traceback (most recent call last):
2020-05-31T19:48:37.727186+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 2447, in wsgi_app
2020-05-31T19:48:37.727186+00:00 app[web.1]: response = self.full_dispatch_request()
2020-05-31T19:48:37.727187+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 1952, in full_dispatch_request
2020-05-31T19:48:37.727215+00:00 app[web.1]: rv = self.handle_user_exception(e)
2020-05-31T19:48:37.727216+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 1821, in handle_user_exception
2020-05-31T19:48:37.727217+00:00 app[web.1]: reraise(exc_type, exc_value, tb)
2020-05-31T19:48:37.727218+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
2020-05-31T19:48:37.727218+00:00 app[web.1]: raise value
2020-05-31T19:48:37.727219+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 1950, in full_dispatch_request
2020-05-31T19:48:37.727219+00:00 app[web.1]: rv = self.dispatch_request()
2020-05-31T19:48:37.727220+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 1936, in dispatch_request
2020-05-31T19:48:37.727220+00:00 app[web.1]: return self.view_functions[rule.endpoint](**req.view_args)
2020-05-31T19:48:37.727221+00:00 app[web.1]: File "/app/app.py", line 36, in main
2020-05-31T19:48:37.727221+00:00 app[web.1]: input_scaled = scaler.transform(input_variables)
2020-05-31T19:48:37.727222+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/sklearn/preprocessing/_data.py", line 794, in transform
2020-05-31T19:48:37.727222+00:00 app[web.1]: force_all_finite='allow-nan')
2020-05-31T19:48:37.727222+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/sklearn/base.py", line 436, in _validate_data
2020-05-31T19:48:37.727223+00:00 app[web.1]: self._check_n_features(X, reset=reset)
2020-05-31T19:48:37.727223+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/sklearn/base.py", line 373, in _check_n_features
2020-05-31T19:48:37.727223+00:00 app[web.1]: "The reset parameter is False but there is no "
2020-05-31T19:48:37.727314+00:00 app[web.1]: RuntimeError: The reset parameter is False but there is no n_features_in_ attribute. Is this estimator fitted?
2020-05-31T19:48:37.728944+00:00 app[web.1]: 10.7.244.15 - - [31/May/2020:19:48:37 +0000] "POST / HTTP/1.1" 500 290 "https://bcs-final.herokuapp.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36"
2020-05-31T19:48:37.730539+00:00 heroku[router]: at=info method=POST path="/" host=bcs-final.herokuapp.com request_id=2ba8ae51-f913-48d0-9b45-78aef407c15e fwd="71.143.151.112" dyno=web.1 connect=1ms service=27ms status=500 bytes=470 protocol=https

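Looking at the code, I can't determine what differs between running your app locally and running it on Heroku. The only reasonable guess is that your local environment has different package versions installed (scikit-learn in particular) than the ones running on Heroku.

In any case, if we focus on what the Heroku version is telling us, your error comes from this line: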
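
One way to test that guess is to log the versions each environment actually imports and compare the local output with the Heroku logs. The following is a minimal sketch; the helper name module_versions is hypothetical, and the module list is an assumption based on the imports in the question (tensorflow could be added to it).

```python
import importlib


def module_versions(module_names):
    """Map each importable module name to its __version__ string.

    Modules that cannot be imported map to None; modules without a
    __version__ attribute map to "unknown".
    """
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Module names assumed from the imports shown in the question.
for name, version in module_versions(["sklearn", "numpy", "pandas", "flask"]).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Running this once locally and once at app startup on Heroku should show whether the two environments agree. If they do not, pinning exact versions in requirements.txt is the usual way to align them.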

# ...
input_scaled = scaler.transform(input_variables) # <--- here
# ...

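The error is basically saying that the estimator, which is actually your scaler (see the transform method in sklearn/preprocessing/_data.py, the frame shown in the traceback), is not fitted.

The message itself is raised in _check_n_features in sklearn/base.py. The part saying there is no n_features_in_ attribute means that the estimator itself, that is, your scaler, lacks n_features_in_. It is not a property of the input_variables DataFrame (called X in the scikit-learn source). That attribute is set as part of the fitting process, so a scaler that was fitted and pickled under an older scikit-learn release that did not yet set it could plausibly trigger the same error when unpickled under a newer release.

The bottom line is to call fit before transform, which may help. Keep in mind that fitting on the single request row replaces whatever statistics the scaler learned from the training data, so treat this as a diagnostic rather than a permanent fix. Something like: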
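
To confirm what the unpickled scaler actually carries, a small check at startup can help. This is a sketch only: describe_estimator is a hypothetical helper, and the SimpleNamespace objects are stand-ins so the example runs without scikit-learn installed. In the app you would pass the scaler loaded from model/scaler.pkl instead.

```python
from types import SimpleNamespace


def describe_estimator(estimator):
    """Summarize the attributes an estimator object carries.

    By scikit-learn convention, attributes learned during fit end with a
    trailing underscore (for example mean_ or n_features_in_).
    """
    learned = sorted(
        name
        for name in vars(estimator)
        if name.endswith("_") and not name.startswith("_")
    )
    return {
        "type": type(estimator).__name__,
        "has_n_features_in_": hasattr(estimator, "n_features_in_"),
        "learned_attributes": learned,
    }


# Stand-ins for illustration only (not real scikit-learn objects).
without_attr = SimpleNamespace(mean_=[0.0], scale_=[1.0])
with_attr = SimpleNamespace(mean_=[0.0], scale_=[1.0], n_features_in_=1)

print(describe_estimator(without_attr))
print(describe_estimator(with_attr))
```

If the real scaler reports learned attributes such as its fitted statistics but no n_features_in_, it was fitted, and the mismatch is more likely between the library version that wrote the pickle and the one reading it.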

input_scaled = scaler.fit(input_variables).transform(input_variables)
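
One caution about the line above: it fits the scaler on the single row being predicted. The sketch below illustrates the effect under the assumption that the pickled scaler standardizes each column (subtract the mean, divide by the standard deviation); the question does not say which scaler class was used. It is a pure-Python stand-in rather than the scikit-learn API, and the training values are made up for illustration.

```python
from statistics import mean, pstdev


def fit_standardizer(rows):
    """Return per-column means and scales learned from the given rows."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A column with zero spread gets scale 1.0 to avoid dividing by zero.
    scales = [pstdev(col) or 1.0 for col in columns]
    return means, scales


def transform(rows, means, scales):
    """Standardize rows with previously learned means and scales."""
    return [[(v - m) / s for v, m, s in zip(row, means, scales)] for row in rows]


# Hypothetical training data: two columns, e.g. age and height.
training = [[20.0, 170.0], [30.0, 180.0], [40.0, 190.0]]
request = [[30.0, 190.0]]

# Scaled with statistics learned from the training data.
scaled_with_training = transform(request, *fit_standardizer(training))

# Scaled after fitting on the request row itself: every value becomes 0.0.
scaled_with_self = transform(request, *fit_standardizer(request))

print(scaled_with_training)
print(scaled_with_self)
```

Under that assumption, any request fitted against itself collapses to all zeros, so the model would see the same input no matter what the user typed. If refitting makes the error disappear, a more durable follow-up is probably to regenerate scaler.pkl from the training data under the same scikit-learn version that Heroku installs.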