Plotting Logistic Regression Non-Scaled values

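I'm new to Python and to programming in general. I'm taking a class on logistic regression. The code below is correct and plots reasonably well (not very pretty, but good enough):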

# ------ LOGISTIC REGRESSION ------ #

# --- Importing the Libraries --- #

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from matplotlib.colors import ListedColormap

# --- Importing the Dataset --- #

path = '/home/bohrz/Desktop/Programação/Machine Learning/Part 3 - ' \
       'Classification/Section 14 - Logistic Regression/Social_Network_Ads.csv'
dataset = pd.read_csv(path)
X = dataset.iloc[:, 2:4].values
y = dataset.iloc[:, -1].values

# --- Splitting the Dataset into Training and Test set --- #

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

# --- Feature Scaling --- #

sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)

# --- Fitting the Logistic Regression Model to the Dataset --- #

classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)

# --- Predicting the Test set results --- #

y_pred = classifier.predict(X_test)

# --- Making the Confusion Matrix --- #

cm = confusion_matrix(y_test, y_pred)

# --- Visualizing Logistic Regression results --- #

# --- Visualizing the Training set results --- #

X_set_train, y_set_train = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start=X_set_train[:, 0].min(),
                               stop=X_set_train[:, 0].max(), step=0.01),
                     np.arange(start=X_set_train[:, 1].min(),
                               stop=X_set_train[:, 1].max(), step=0.01))

# Building the graph contour based on classification method
Z_train = np.array([X1.ravel(), X2.ravel()]).T
plt.contourf(X1, X2, classifier.predict(Z_train).reshape(X1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))

# Apply limits when outliers are present
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())

# Creating the scatter plot of the Training set results
for i, j in enumerate(np.unique(y_set_train)):
    plt.scatter(X_set_train[y_set_train == j, 0],
                X_set_train[y_set_train == j, 1],
                c=ListedColormap(('red', 'green'))(i), label=j)

plt.title('Logistic Regression (Training set results)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()

My question is: how do I plot the results without the scaling? I tried using the inverse_transform() method in several places in the code, but it didn't help.

Thanks in advance.

Your task is just to keep track of which data is scaled and which is unscaled.

I haven't analyzed your code in detail, but the basic idea is simply: look at where scaled/unscaled values are used and adapt where needed!

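  • A: After training we no longer need the scaled X, so let's transform everything back
  • B: But this kind of plot runs the classifier on an np.meshgrid that is now built from the unscaled data, so we need to apply the transformer again there
  • C: Remark: the mesh-based approach creates a dense grid. If the bounds change while the step size stays the same, memory consumption will crash your PC
    • The step can be tuned (I'm not sure where the original values came from); the plot will change slightly as a result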
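
To make remark C concrete, here is a rough back-of-the-envelope sketch of how the grid size grows once the bounds are in original units. The helper names `grid_points` and `mesh_bytes` and the value ranges are my own assumptions for illustration; they are not read from Social_Network_Ads.csv.

```python
import math


def grid_points(lo, hi, step):
    """Approximate number of values np.arange(lo, hi, step) would produce."""
    return max(0, math.ceil((hi - lo) / step))


def mesh_bytes(x_range, y_range, x_step, y_step, itemsize=8):
    """Approximate memory for the two float64 arrays returned by np.meshgrid."""
    nx = grid_points(x_range[0], x_range[1], x_step)
    ny = grid_points(y_range[0], y_range[1], y_step)
    return 2 * nx * ny * itemsize


# Hypothetical ranges (assumptions, not taken from the dataset):
# scaled features span a few standard deviations,
# unscaled features are ages in years and salaries in currency units.
scaled = mesh_bytes((-2.0, 2.0), (-2.0, 2.0), 0.01, 0.01)
unscaled = mesh_bytes((18, 60), (15_000, 150_000), 0.01, 0.01)

print(f"scaled grid:   {scaled / 1e6:.1f} MB")
print(f"unscaled grid: {unscaled / 1e9:.1f} GB")
```

With these assumed ranges the unscaled grid comes to tens of billions of points, which is why the step has to change along with the bounds.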

So the changes needed are:

A:

y_pred = classifier.predict(X_test)  # YOUR CODE
X_train = sc_X.inverse_transform(X_train) # transform back
X_test = sc_X.inverse_transform(X_test)   # same for the test set
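
If you want to convince yourself that step A really restores the original values, a small round-trip check like the following should do. The four (age, salary) rows are made up for illustration only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up (age, salary) rows, used only for illustration
X = np.array([[19.0, 19_000.0],
              [35.0, 20_000.0],
              [26.0, 43_000.0],
              [47.0, 25_000.0]])

sc_X = StandardScaler()
X_scaled = sc_X.fit_transform(X)               # zero mean, unit variance per column
X_restored = sc_X.inverse_transform(X_scaled)  # back to the original units

# True when the round trip reproduces the input up to floating-point error
round_trip_ok = np.allclose(X_restored, X)
print(round_trip_ok)
```

The important detail is that the same fitted `sc_X` object is used in both directions; a freshly created scaler would not know the training mean and standard deviation.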

C:

X1, X2 = np.meshgrid(np.arange(start=X_set_train[:, 0].min(),
                               stop=X_set_train[:, 0].max(), step=10.),  # !!! was 0.01
                     np.arange(start=X_set_train[:, 1].min(),
                               stop=X_set_train[:, 1].max(), step=0.1))  # !!! was 0.01
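
As an alternative to hand-picking a step per axis, one option is to fix the number of grid points instead. This is a different technique from the `np.arange` calls above (it uses `np.linspace`), and the helper name `make_grid` and the demo values are assumptions of mine, not part of the original code.

```python
import numpy as np


def make_grid(X, n_points=300):
    """Build a mesh over the two columns of X with n_points values per axis."""
    x1 = np.linspace(X[:, 0].min(), X[:, 0].max(), n_points)
    x2 = np.linspace(X[:, 1].min(), X[:, 1].max(), n_points)
    return np.meshgrid(x1, x2)


# Hypothetical unscaled bounds (age in years, salary in currency units)
X_demo = np.array([[18.0, 15_000.0],
                   [60.0, 150_000.0]])

X1, X2 = make_grid(X_demo, n_points=300)
print(X1.shape, X2.shape)
```

The grid then has the same size whether the features are scaled or not, so memory use no longer depends on the units of the data.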

B:

Z_train = np.array([X1.ravel(), X2.ravel()]).T
plt.contourf(X1, X2,
             classifier.predict(sc_X.transform(Z_train)).reshape(X1.shape),  # TRANSFORM Z
             alpha=0.75,
             cmap=ListedColormap(('red', 'green')))
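
A way to avoid tracking scaled and unscaled arrays by hand is to put the scaler and the classifier into one scikit-learn pipeline, so that everything you pass in stays in original units. This is a sketch of that alternative, not what the code above does; the data is synthetic (a stand-in for Age and Estimated Salary) and the labelling rule is invented for the demo.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for (Age, Estimated Salary); not the real dataset
age = rng.uniform(18, 60, size=200)
salary = rng.uniform(15_000, 150_000, size=200)
X = np.column_stack([age, salary])
y = (age / 60 + salary / 150_000 > 1.0).astype(int)  # invented labelling rule

# Manual approach: scale explicitly, then transform the grid before predicting
sc_X = StandardScaler()
classifier = LogisticRegression(random_state=0)
classifier.fit(sc_X.fit_transform(X), y)

# Pipeline approach: the scaler is applied inside fit() and predict()
model = make_pipeline(StandardScaler(), LogisticRegression(random_state=0))
model.fit(X, y)

# Grid in original (unscaled) units
X1, X2 = np.meshgrid(np.linspace(18, 60, 50),
                     np.linspace(15_000, 150_000, 50))
Z = np.array([X1.ravel(), X2.ravel()]).T

manual = classifier.predict(sc_X.transform(Z)).reshape(X1.shape)
piped = model.predict(Z).reshape(X1.shape)

# Both routes should label every grid point identically
same = np.array_equal(manual, piped)
print(same)
```

Either array can be passed to `plt.contourf(X1, X2, ...)` exactly as in step B; the pipeline just removes the chance of forgetting the `sc_X.transform` call.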

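While the original plot showed a straight line with some jaggedness (a fine staircase pattern), we now see something different. I'll leave that to the interested reader (it has to do with scaling!).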