pandas.errors.InvalidIndexError: (slice(None, None, None), None)

Question

This is my linear regression model for salary prediction, written for practice, but the line pl.plot(x_train, model.predict(x_train)) raises an error.

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv('Salary_Data.csv')
x = df[['YearsExperience']]
y = df['Salary']

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.2, random_state=0)

model = LinearRegression().fit(x_train,y_train)

import matplotlib.pyplot as plt
pl = plt
pl.scatter(x_train, y_train)
pl.plot(x_train, model.predict(x_train))
pl.show()

This is the data I am using:

YearsExperience,Salary
1.1,39343.00
1.3,46205.00
1.5,37731.00
2.0,43525.00
2.2,39891.00
2.9,56642.00
3.0,60150.00
3.2,54445.00
3.2,64445.00
3.7,57189.00
3.9,63218.00
4.0,55794.00
4.0,56957.00
4.1,57081.00
4.5,61111.00
4.9,67938.00
5.1,66029.00
5.3,83088.00
5.9,81363.00
6.0,93940.00
6.8,91738.00
7.1,98273.00
7.9,101302.00
8.2,113812.00
8.7,109431.00
9.0,105582.00
9.5,116969.00
9.6,112635.00
10.3,122391.00
10.5,121872.00

This is the error I am facing:

Traceback (most recent call last):
  File "D:\Softwers\Python_3.9\lib\site-packages\pandas\core\indexes\base.py", line 3621, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc      
  File "pandas\_libs\index.pyx", line 142, in pandas._libs.index.IndexEngine.get_loc      
TypeError: '(slice(None, None, None), None)' is an invalid key

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\Project\data scince\main.py", line 16, in <module>
    pl.plot(x_train, model.predict(x_train))
  File "D:\Softwers\Python_3.9\lib\site-packages\matplotlib\pyplot.py", line 2757, in plot
    return gca().plot(
  File "D:\Softwers\Python_3.9\lib\site-packages\matplotlib\axes\_axes.py", line 1632, in plot
    lines = [*self._get_lines(*args, data=data, **kwargs)]
  File "D:\Softwers\Python_3.9\lib\site-packages\matplotlib\axes\_base.py", line 312, in __call__
    yield from self._plot_args(this, kwargs)
  File "D:\Softwers\Python_3.9\lib\site-packages\matplotlib\axes\_base.py", line 487, in _plot_args
    x = _check_1d(xy[0])
  File "D:\Softwers\Python_3.9\lib\site-packages\matplotlib\cbook\__init__.py", line 1327, in _check_1d
    ndim = x[:, None].ndim
  File "D:\Softwers\Python_3.9\lib\site-packages\pandas\core\frame.py", line 3505, in __getitem__
    indexer = self.columns.get_loc(key)
  File "D:\Softwers\Python_3.9\lib\site-packages\pandas\core\indexes\base.py", line 3628, in get_loc
    self._check_indexing_error(key)
  File "D:\Softwers\Python_3.9\lib\site-packages\pandas\core\indexes\base.py", line 5637, in _check_indexing_error
    raise InvalidIndexError(key)
pandas.errors.InvalidIndexError: (slice(None, None, None), None)

Answer 1

This solved my problem, though I don't know why:

x = df.iloc[:, :-1].values
y = df.iloc[:, 1].values

Full code:

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv('Salary_Data.csv')
x = df.iloc[:, :-1].values
y = df.iloc[:, 1].values

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.2, random_state=0)

model = LinearRegression().fit(x_train,y_train)

import matplotlib.pyplot as plt
pl = plt
pl.scatter(x_train, y_train)
pl.plot(x_train, model.predict(x_train))
pl.show()
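
A possible explanation, going only by the traceback in the question: matplotlib's _check_1d evaluates x[:, None], and a DataFrame hands that (slice, None) key to a column lookup, which fails. Calling .values converts the data to a NumPy array, which accepts that key. The sketch below is a minimal check of this, assuming a small inline frame (the first three rows of the question's data) in place of Salary_Data.csv. The exact exception type may differ between pandas versions, so it is caught generically.

```python
import numpy as np
import pandas as pd

# First three rows of the question's data, inlined so no CSV file is needed.
df = pd.DataFrame({"YearsExperience": [1.1, 1.3, 1.5],
                   "Salary": [39343.0, 46205.0, 37731.0]})

x_frame = df[["YearsExperience"]]   # DataFrame, as in the question
x_array = df.iloc[:, :-1].values    # NumPy array, as in this answer

# NumPy accepts the (slice, None) key that appears in the traceback.
array_ndim = x_array[:, None].ndim

# A DataFrame treats the same key as a column label, and the lookup fails.
try:
    x_frame[:, None]
    frame_error = None
except Exception as exc:  # exception type may depend on the pandas version
    frame_error = type(exc).__name__

print(type(x_array).__name__, x_array.shape, array_ndim)
print(frame_error)
```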

Answer 2

You must pass a Series as the x-axis, not a DataFrame:

pl.plot(x_train['YearsExperience'], model.predict(x_train))

# OR

pl.plot(x_train.squeeze(), model.predict(x_train))
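
Both variants turn the one-column DataFrame into a one-dimensional Series before it reaches matplotlib. The sketch below checks that the two give the same result, assuming a three-row stand-in for x_train (the values are taken from the question's data, not from the actual train/test split).

```python
import pandas as pd

# Hypothetical stand-in for x_train: a DataFrame with a single column.
x_train = pd.DataFrame({"YearsExperience": [2.9, 5.1, 3.2]})

by_column = x_train["YearsExperience"]  # select the column by name
by_squeeze = x_train.squeeze()          # drop the length-1 column axis

print(type(by_column).__name__, by_column.ndim)
print(by_squeeze.equals(by_column))
```

Selecting the column by name is the more explicit of the two: squeeze() removes every length-1 axis, so a frame with a single row and a single column would come back as a scalar rather than a Series.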
