How to use the .predict() method in Python for linear regression?

Question

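I estimated a multiple regression model from my data frame. I have three independent variables: month (1 to 36), price, and ad days.

I want to make predictions under changed conditions:

- Forecast values for the next 10 months (37 to 47), with price = 85 and ad days = 4

I fitted my model and tried: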

Time1= np.arange(37,48)
Price1=85
Ads1=4
Lm.predict([Time1,Price1,Ads1])

But it doesn't work.

Thanks

Answer 1

You need a 2D array:

Lm.predict([[Time1,Price1,Ads1]])
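Note that Time1 is itself an array of 11 values, so adding an outer pair of brackets may still not produce a regular 2D array. A minimal sketch, assuming the model was fitted on three columns in the order (month, price, ad days), is to build one row per forecast month and repeat the scalar conditions. The name X_new is introduced here for illustration only; Lm is the fitted model from the question.

```python
import numpy as np

# Forecast months 37..47, as in the question (np.arange excludes the stop value).
Time1 = np.arange(37, 48)

# Repeat the scalar conditions so every month has its own price and ad-day value.
Price1 = np.full(Time1.shape, 85)
Ads1 = np.full(Time1.shape, 4)

# One row per observation, one column per feature: shape (11, 3).
X_new = np.column_stack([Time1, Price1, Ads1])
print(X_new.shape)
print(X_new[:2])

# With the fitted model from the question (Lm), the call would then be:
# Lm.predict(X_new)
```

Each row of X_new is one (month, price, ad days) observation, so predict() should return one forecast per row.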

Answer 2

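Assuming your model was trained on a 2D array without any nested arrays, the problems are:

The input you want to predict on is not 2D.
The variable Time1 is itself an array, so you created a nested array: [Time1,Price1,Ads1]

Your current prediction input looks like this: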

Time1 = np.arange(37,48)
Price1=85
Ads1=4
print([Time1,Price1,Ads1])

which looks like:

[array([37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]), 85, 4]

You can flatten it into a single 2D row like this:

import numpy as np
print(np.concatenate([Time1, [Price1, Ads1]]).reshape(1,-1))

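which looks like this (a single row with 13 columns, so it only matches a model that was trained on 13 features):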

array([[37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 85,  4]])

Answer 3

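First, train the model on training data from past observations. In your case, the training data consists of three independent variables and one dependent variable per observation.

Once the model is trained (and, where applicable, after hyperparameter optimization), you can use it to make predictions.

Sample code (commented):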

import numpy as np
from sklearn.linear_model import LinearRegression

# Sample dummy data

# Independent variables (months 1 to 36, as in the question)
time = np.arange(1, 37)
price = np.random.randint(1, 100, 36)
ads = np.random.randint(1, 10, 36)
# Dependent variable
y = np.random.randn(36)

# Stack into a 36x3 array where each row is an observation
train_X = np.vstack([time, price, ads]).T

# Fit the model
model = LinearRegression().fit(train_X, y)

# Sample observations for which
# a forecast of the dependent variable has to be made
time1 = np.arange(37, 47)
price1 = np.array([85] * len(time1))
ads1 = np.array([4] * len(time1))

# Stack such that each row is an observation
test_X = np.vstack([time1, price1, ads1]).T

# Make the predictions
print(model.predict(test_X))

Output (the values differ between runs because the dummy data is random):

array([0.22189608, 0.2269302 , 0.23196433, 0.23699845, 0.24203257,
       0.24706669, 0.25210081, 0.25713494, 0.26216906, 0.26720318])
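Because the dummy data above is random, the printed numbers cannot be checked against anything. As a rough sanity check, the same steps can be run on data with a known relationship. The sketch below assumes made-up coefficients (true_coef, true_intercept) and a fixed seed, all hypothetical and chosen only so that the predictions can be verified; it is not the asker's data or model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical training data: 36 months with made-up prices and ad days.
time = np.arange(1, 37)
price = rng.integers(50, 100, size=36)
ads = rng.integers(1, 10, size=36)
train_X = np.column_stack([time, price, ads])

# Assumed noise-free relationship, chosen only so the result can be verified.
true_coef = np.array([2.0, -0.5, 3.0])
true_intercept = 10.0
y = train_X @ true_coef + true_intercept

model = LinearRegression().fit(train_X, y)

# Forecast rows: months 37..46 with price fixed at 85 and ad days at 4.
time1 = np.arange(37, 47)
test_X = np.column_stack(
    [time1, np.full(time1.shape, 85), np.full(time1.shape, 4)]
)

predictions = model.predict(test_X)
expected = test_X @ true_coef + true_intercept

print(predictions.shape)                   # one forecast per row of test_X
print(np.allclose(predictions, expected))  # compare with the known relationship
```

The point of the check is the shape contract: predict() takes an array with one row per observation and the same three columns used in fit(), and returns one value per row.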
