Scikit-learn model giving 'LocalOutlierFactor' object has no attribute 'predict' error
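I am new to machine learning. I built and trained an ML model using the scikit-learn library. It works perfectly in a Jupyter notebook, but when I deploy the model to Google Cloud ML and try to serve it with a Python script, it throws an error.
Here is a snippet of my model code: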
Updated:
import pickle

from sklearn.metrics import classification_report, accuracy_score
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# X, Y, outlier_fraction and Fraud come from earlier notebook cells (not shown)

# define a random state
state = 1

classifiers = {
    "Isolation Forest": IsolationForest(max_samples=len(X),
                                        contamination=outlier_fraction,
                                        random_state=state),
    # "Local Outlier Factor": LocalOutlierFactor(
    #     n_neighbors = 20,
    #     contamination = outlier_fraction)
}

# fit the model
n_outliers = len(Fraud)

for i, (clf_name, clf) in enumerate(classifiers.items()):
    # fit the data and tag outliers
    if clf_name == "Local Outlier Factor":
        y_pred = clf.fit_predict(X)
        print("LOF executed")
        scores_pred = clf.negative_outlier_factor_
        # Export the classifier to a file
        with open('model.pkl', 'wb') as model_file:
            pickle.dump(clf, model_file)
    else:
        clf.fit(X)
        scores_pred = clf.decision_function(X)
        y_pred = clf.predict(X)
        print("IF executed")
        # Export the classifier to a file
        with open('model.pkl', 'wb') as model_file:
            pickle.dump(clf, model_file)

    # Reshape the prediction values to 0 for valid and 1 for fraudulent
    y_pred[y_pred == 1] = 0
    y_pred[y_pred == -1] = 1

    n_errors = (y_pred != Y).sum()

    # run classification metrics
    print('{}:{}'.format(clf_name, n_errors))
    print(accuracy_score(Y, y_pred))
    print(classification_report(Y, y_pred))
Here is the output in the Jupyter Notebook:
Isolation Forest:7
0.93
             precision    recall  f1-score   support

          0       0.97      0.96      0.96        94
          1       0.43      0.50      0.46         6

avg / total       0.94      0.93      0.93       100
I deployed this model to Google Cloud ML-Engine and then tried to serve it with the following Python script:
import os
from googleapiclient import discovery
from oauth2client.service_account import ServiceAccountCredentials

credentials = ServiceAccountCredentials.from_json_keyfile_name('Machine Learning 001-dafe42dfb46f.json')

PROJECT_ID = "machine-learning-001-201312"
VERSION_NAME = "v1"
MODEL_NAME = "mlfd"

service = discovery.build('ml', 'v1', credentials=credentials)
name = 'projects/{}/models/{}'.format(PROJECT_ID, MODEL_NAME)
name += '/versions/{}'.format(VERSION_NAME)

data = [[265580, 7, 68728, 8.36, 4.76, 84.12, 79.36, 3346, 1, 11.99, 1.14, 655012, 0.65, 258374, 0, 84.12]]

response = service.projects().predict(
    name=name,
    body={'instances': data}
).execute()

if 'error' in response:
    print(response['error'])
else:
    online_results = response['predictions']
    print(online_results)
Here is the output of this script:
Prediction failed: Exception during sklearn prediction: 'LocalOutlierFactor' object has no attribute 'predict'
LocalOutlierFactor does not have a predict method, only a private _predict method. Here is the rationale from the source:
def _predict(self, X=None):
    """Predict the labels (1 inlier, -1 outlier) of X according to LOF.
    If X is None, returns the same as fit_predict(X_train).
    This method allows to generalize prediction to new observations (not
    in the training set). As LOF originally does not deal with new data,
    this method is kept private.
https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/neighbors/lof.py#L200
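Since the serving side reports a missing predict attribute, it can help to check the exported object for a public predict method before uploading model.pkl. The following is only a minimal sketch: the helper name missing_serving_methods and the two stand-in classes are hypothetical and are not part of scikit-learn or of the Cloud ML tooling. (As a side note, scikit-learn 0.20 and later add a novelty=True option to LocalOutlierFactor that exposes predict for new data; that is outside the 0.19.1 setup discussed here.)

```python
def missing_serving_methods(model, required=("predict",)):
    """Return the names in `required` that `model` does not expose as callables."""
    return [name for name in required
            if not callable(getattr(model, name, None))]


# Hypothetical stand-ins (NOT scikit-learn classes) mimicking the two API
# shapes discussed above: a public predict versus only a private _predict.
class WithPublicPredict:
    def predict(self, X):
        return [1 for _ in X]


class WithPrivatePredictOnly:
    def _predict(self, X=None):
        return []


for candidate in (WithPublicPredict(), WithPrivatePredictOnly()):
    missing = missing_serving_methods(candidate)
    status = "ok to serve" if not missing else "missing: " + ", ".join(missing)
    print(type(candidate).__name__, "->", status)
```

In a real workflow the same helper could be called on the object loaded back from model.pkl with pickle.load, right before deployment.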
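This looks like it may be a Python version issue (although I am not clear why scikit-learn would behave differently between Python 2 and Python 3). I was able to verify locally, on the same machine, that my Python 2 installation reproduces the error above while Python 3 succeeds (both using scikit-learn 0.19.1).
The solution is to specify the Python version when deploying the model (note the last line; if omitted, it defaults to "2.7"):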
gcloud beta ml-engine versions create $VERSION_NAME \
    --model $MODEL_NAME --origin $DEPLOYMENT_SOURCE \
    --runtime-version="1.5" --framework $FRAMEWORK \
    --python-version="3.5"
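To compare the training environment with what is requested at deployment time, it may help to record the interpreter and library versions next to the exported model. This is a small sketch under the assumption that printing them in the notebook is enough; the function name environment_report is made up for illustration.

```python
import sys


def environment_report():
    """Collect the Python version and, if installed, the scikit-learn version."""
    report = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    try:
        import sklearn  # optional: only reported when it can be imported
        report["scikit-learn"] = sklearn.__version__
    except ImportError:
        report["scikit-learn"] = None
    return report


print(environment_report())
```

Running this once with each interpreter (for example the local Python 2 and Python 3 installations mentioned above) shows which combination produced the pickle, so the --python-version flag can be chosen to match.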
Surprisingly, the problem was the runtime version. It is resolved when you recreate the model version as:
gcloud beta ml-engine versions create $VERSION_NAME --model $MODEL_NAME --origin $DEPLOYMENT_SOURCE --runtime-version="1.6" --framework $FRAMEWORK --python-version="3.5"
Use runtime version 1.6 instead of 1.5; that at least turns it into a running model.
I worked on a project that looked almost identical and got the same error. My problem was a typo in the if statement.
Regards,
Lorenz
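The answer above does not say what the typo was. As a hedged sketch, one way to keep a misspelled name from silently sending an estimator down the wrong branch is to branch on what the object can do rather than on a string label. The classes and the function label_training_data below are hypothetical stand-ins, not scikit-learn code.

```python
# Hypothetical stand-ins: one estimator that can only label its own training
# data, and one that can be fitted and then asked to predict.
class FitPredictOnly:
    def fit_predict(self, X):
        return [-1 if x > 10 else 1 for x in X]


class FitThenPredict:
    def fit(self, X):
        return self

    def predict(self, X):
        return [-1 if x > 10 else 1 for x in X]


def label_training_data(clf, X):
    """Label X using predict when the estimator has it, else fit_predict."""
    if hasattr(clf, "predict"):
        clf.fit(X)
        return clf.predict(X)
    return clf.fit_predict(X)


samples = [1, 2, 50]
for clf in (FitPredictOnly(), FitThenPredict()):
    print(type(clf).__name__, label_training_data(clf, samples))
```

With this shape there is no name string to mistype, and an estimator without a public predict never reaches the predict call.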