Why does the score from LinearRegression differ from r2_score in sklearn.metrics?

Ideally I should get the same result from both, since score is nothing but R-squared. But I am not sure why the results differ.

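# Load the California housing dataset
# (the sklearn.datasets.california_housing submodule is no longer importable in recent scikit-learn releases)
from sklearn.datasets import fetch_california_housing
data = fetch_california_housing()
data.data.shape
data.feature_names
data.target_names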

import pandas as pd
house_data = pd.DataFrame(data.data, columns=data.feature_names)
house_data.describe()
house_data['Price'] = data.target


X = house_data.iloc[:, 0:8].values
y = house_data.iloc[:, -1].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.33, random_state = 0)

# Fitting a linear regression model (8 features) to the Training set
from sklearn.linear_model import LinearRegression
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)
# Check R-squared on the test data


from sklearn.metrics import mean_squared_error, r2_score

y_pred = linear_model.predict(X_test)
print(linear_model.score(X_test, y_test))
print(r2_score(y_pred, y_test))

Output

0.5957643114594776
0.34460597952465033

From the documentation: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html

sklearn.metrics.r2_score(y_true, y_pred,...)

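You passed y_true and y_pred in the wrong order: r2_score expects the true values first and the predictions second. If you swap them, you get the correct result.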

print(linear_model.score(X_test, y_test))
print(r2_score(y_test, y_pred))

0.5957643114594777
0.5957643114594777
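
To see why the argument order matters, it may help to compute R-squared by hand. The sketch below is a plain-Python version of the standard definition, 1 - SS_res / SS_tot, where SS_tot is measured around the mean of the first argument. The helper name r2 and the sample values are illustrative assumptions, not part of the question, and the helper is not meant to reproduce everything r2_score does (sample weights, multiple outputs). The residual sum of squares is the same either way, but swapping the arguments changes the denominator, so the result changes.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (illustrative helper)."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: symmetric in its two arguments.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: depends only on the FIRST argument.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up sample values, chosen only for illustration.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

correct = r2(y_true, y_pred)   # true values first, as r2_score expects
swapped = r2(y_pred, y_true)   # arguments reversed, as in the question

print(correct)
print(swapped)
```

The two printed values differ (roughly 0.9486 versus 0.9574 for these numbers). One way to avoid the mix-up is to pass the arguments by keyword, for example r2_score(y_true=y_test, y_pred=y_pred).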