Why does my regression model return an intercept even though I set fit_intercept=False?

Question

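I am trying to run a regression on a DataFrame without an intercept, so I set fit_intercept to False, but the following code still produces parameters that include an intercept. Does anyone know why this happens?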

model2 = smf.ols('Y ~ X', data=df_final)
result2 = model2.fit(cov_type = 'HAC', cov_kwds = {'maxlags':5}, fit_intercept= False)
result2.params

Intercept    0.032649
X            0.014521
dtype: float64

Answer 1

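When you fit an OLS model through the formula API, an intercept is added by default. One way to omit the intercept term is to add -1 to the formula: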

import pandas as pd
import numpy as np
import statsmodels.formula.api as smf

df = pd.DataFrame({'X': np.random.randint(0, 100, size=20),
                   'Y': np.random.randint(0, 100, size=20)})

model = smf.ols('Y ~ X - 1', data=df)
result = model.fit()

The fitted model now contains only one parameter (X), as result.params shows:

X    0.691876
dtype: float64
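Applied to the call from the question, the same change might look like the sketch below. The data is a synthetic stand-in for df_final (an assumption, since the original DataFrame is not shown); the HAC covariance options are passed to fit() unchanged, and the intercept is controlled only by the formula.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the question's df_final (hypothetical data).
rng = np.random.default_rng(0)
df_final = pd.DataFrame({"X": rng.normal(size=50)})
df_final["Y"] = 0.5 * df_final["X"] + rng.normal(scale=0.1, size=50)

# "- 1" in the formula drops the intercept; fit() only receives
# the covariance options, with no fit_intercept argument.
model2 = smf.ols("Y ~ X - 1", data=df_final)
result2 = model2.fit(cov_type="HAC", cov_kwds={"maxlags": 5})

param_names = list(result2.params.index)
print(param_names)
```

Writing the formula as 'Y ~ 0 + X' should be an equivalent way to drop the intercept.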

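If you are not using the formula API, the OLS model does not include an intercept by default, so you don't need to worry about it (in that case, an intercept has to be added explicitly to your data if you want one).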

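I'm not sure where you got the fit_intercept parameter from, as I can't find any reference to it in the statsmodels documentation or source code. Maybe you're thinking of linear regression using scikit-learn, which does use that parameter to control the intercept.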
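For comparison, here is a short sketch of how fit_intercept is used in scikit-learn, where it is an argument of the LinearRegression constructor. The data is synthetic and chosen to lie exactly on a line through the origin.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: y is exactly 3 * x, so the line passes through the origin.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 1))
y = 3.0 * X[:, 0]

# fit_intercept=False forces the fitted line through the origin.
reg = LinearRegression(fit_intercept=False).fit(X, y)

intercept = reg.intercept_
slope = reg.coef_[0]
print(intercept, slope)
```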

python

statistics

regression

statsmodels