Multiple linear regression models for a nested dataframe/tibble
I am trying to run a multiple linear regression on a nested data frame.
I have this sample data:
# Dates are quoted so they are kept as text instead of being evaluated as arithmetic
df <- tibble::tribble(~Subcat, ~Date, ~COMM1, ~COMM2, ~UOM, ~AUC_TYPE, ~WINNING_PRICE,
#--|----------|-----|-----|----|---------|-------|
1, "2017-03-07", 40750, 41400, "MT", "English", 35000,
1, "2017-03-15", 40750, 40000, "MT", "English", 35600,
2, "2017-10-16", 41000, 40500, "METER", "Yankee", 56440,
2, "2017-11-06", 41010, 40510, "METER", "Yankee", 52000,
2, "2019-01-26", 50010, 50510, "METER", "English", 50000,
3, "2017-03-07", 40750, 41400, "MT", "English", 56900,
3, "2018-05-26", 50010, 50510, "MT", "English", 47000,
3, "2019-01-21", 40750, 40200, "MT", "English", 56000,
3, "2019-01-21", 40750, 40200, "MT", "English", 55900,
4, "2017-11-08", 37500, 39000, "LTR", "Dynamic Sealbid", 67000,
4, "2017-11-08", 37500, 39000, "LTR", "Dynamic Sealbid", 65900)
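The factor/character variables were converted to dummy variables, and the data were then nested by subcategory.
df2 <- df[, -2] %>% group_by(Subcat) %>% nest()
The output is a nested data frame with a Subcat column and a data column.
I am trying to run a regression model that predicts the winning price for each subcategory, using the following code: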
df2 <- df[, -2] %>% group_by(Subcat) %>% nest() %>%
  mutate(fit = map(data, ~ lm(WINNING_PRICE ~ ., data = .)),
         results = map(fit, augment)) %>%
  unnest()
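It fails with the following output:
Error: Input must be list of vectors
In addition: Warning message:
cols is now required.
Please use cols = c(data, fit, results)
Also, the data frame df2 is not displayed in the console.
I have referred to this query "".
Thanks in advance!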
I think this should work:
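# Fit one model on a single nested data frame
# Note: lm() stops with a contrasts error for any group where AUC_TYPE has only one level
model_fn <- function(df1) {
  lm(WINNING_PRICE ~ AUC_TYPE, data = df1)
}
# Apply it to every element of the 'data' list-column
fitted_bestel <- df2 %>%
  mutate(fit = map(data, model_fn))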
The error comes from the two dots you used (one standing in for all the covariates, one standing in for the data).
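For what it's worth, one way to keep the two dots from being confused is to give the data argument an explicit name. This is only a minimal sketch, not necessarily the fix for the error above. It assumes that just the numeric columns are kept (UOM and AUC_TYPE have a single level in some groups, and lm() cannot build contrasts for a one-level factor), and the names d and fits are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Numeric columns only, copied from the sample data in the question
df <- tibble::tribble(
  ~Subcat, ~COMM1, ~COMM2, ~WINNING_PRICE,
  1, 40750, 41400, 35000,
  1, 40750, 40000, 35600,
  2, 41000, 40500, 56440,
  2, 41010, 40510, 52000,
  2, 50010, 50510, 50000,
  3, 40750, 41400, 56900,
  3, 50010, 50510, 47000,
  3, 40750, 40200, 56000,
  3, 40750, 40200, 55900,
  4, 37500, 39000, 67000,
  4, 37500, 39000, 65900
)

fits <- df %>%
  group_by(Subcat) %>%
  nest() %>%
  ungroup() %>%
  # 'd' is one nested data frame; the dot in the formula means
  # "every other column of d", so the two roles no longer share a symbol
  mutate(fit = map(data, function(d) lm(WINNING_PRICE ~ ., data = d)))
```

With groups this small, some coefficients come back as NA because a group has fewer distinct rows than model terms, so treat the fits as a demonstration of the mechanics only.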
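If you want to model WINNING_PRICE ~ Subcat, I don't think we have to nest at all (first example). If you do need to nest and fit the models on the 'data' column, both terms of the model have to be inside the nested data frame, for example WINNING_PRICE ~ COMM1. Here is one example for each case.
The unnest() error also comes from a change in unnest(): you now have to specify which columns to unnest with the 'cols = ' argument.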
library(tidyverse)

# Dates are quoted: an unquoted 2017-03-07 is evaluated as the subtraction 2017 - 3 - 7
df <- tribble(~Subcat, ~Date, ~COMM1, ~COMM2, ~UOM, ~AUC_TYPE, ~WINNING_PRICE,
#--|----------|-----|-----|----|---------|-------|
1, "2017-03-07", 40750, 41400, "MT", "English", 35000,
1, "2017-03-15", 40750, 40000, "MT", "English", 35600,
2, "2017-10-16", 41000, 40500, "METER", "Yankee", 56440,
2, "2017-11-06", 41010, 40510, "METER", "Yankee", 52000,
2, "2019-01-26", 50010, 50510, "METER", "English", 50000,
3, "2017-03-07", 40750, 41400, "MT", "English", 56900,
3, "2018-05-26", 50010, 50510, "MT", "English", 47000,
3, "2019-01-21", 40750, 40200, "MT", "English", 56000,
3, "2019-01-21", 40750, 40200, "MT", "English", 55900,
4, "2017-11-08", 37500, 39000, "LTR", "Dynamic Sealbid", 67000,
4, "2017-11-08", 37500, 39000, "LTR", "Dynamic Sealbid", 65900)

# Example 1: no nesting needed, Subcat is an ordinary predictor
fit <- lm(WINNING_PRICE ~ Subcat, data = df)
plot(df$Subcat, y = df$WINNING_PRICE)
abline(fit)

# Example 2: fit one model per Subcat on the nested 'data' column
df2 <- df[, -2] %>% group_by(Subcat) %>% nest()   # df[, -2] drops the Date column
df3 <- df2 %>%
  mutate(fit = map(data, ~ lm(WINNING_PRICE ~ COMM1, data = .)),
         results = map(fit, broom::augment))

# unnest() now needs the columns to unnest named via 'cols' (a fairly recent change)
df4 <- df3 %>% unnest(cols = data)
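
If what you are after is the fitted values per row, it is probably the results column rather than data that you want to unnest. Here is a minimal sketch under that assumption, using only the numeric columns from the sample data; fitted_rows is just an illustrative name. The fit column is dropped first because it holds lm objects, which unnest() presumably cannot expand (that would match the "Input must be list of vectors" error). The data column is dropped because augment() already returns the model variables, so unnesting both would, as far as I know, give duplicate column names.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Numeric columns only, copied from the sample data above
df <- tibble::tribble(
  ~Subcat, ~COMM1, ~WINNING_PRICE,
  1, 40750, 35000,
  1, 40750, 35600,
  2, 41000, 56440,
  2, 41010, 52000,
  2, 50010, 50000,
  3, 40750, 56900,
  3, 50010, 47000,
  3, 40750, 56000,
  3, 40750, 55900,
  4, 37500, 67000,
  4, 37500, 65900
)

fitted_rows <- df %>%
  group_by(Subcat) %>%
  nest() %>%
  mutate(fit = map(data, ~ lm(WINNING_PRICE ~ COMM1, data = .x)),
         results = map(fit, broom::augment)) %>%
  # keep only the key and the per-row model output before unnesting
  select(Subcat, results) %>%
  unnest(cols = results) %>%
  ungroup()
```

The result has one row per original observation, with columns such as .fitted and .resid next to Subcat. Expect some NaN diagnostics for groups this small.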