purrr loop: Error: Problem with `mutate()` input `combined_data`. x `x` and `y` must share the same src, set `copy` = TRUE (may be slow)
I tried to create a reproducible example, but frustratingly this one actually works:
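library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

# Nest mtcars by `vs`, keeping the car names as a regular column
my_mtcars <- mtcars %>%
  rownames_to_column('car') %>%
  group_by(vs) %>%
  nest()

# Split each nested tibble into two halves, then join them back together
my_mtcars <- my_mtcars %>%
  mutate(lhs = map(.x = data, ~ .x %>% select(car:drat))) %>%
  mutate(rhs = map(.x = data, ~ .x %>% select(car, wt:carb) %>% rename(model = car))) %>%
  mutate(together_again = map2(.x = lhs, .y = rhs, ~ inner_join(.x, .y, by = c('car' = 'model'))))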
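The above works and shows, in a nutshell, what I am trying to do with my real data. My actual data frame, which contains list columns, cannot be mutated with an inner join. I am hoping that by describing it and showing some anonymized data here, someone can offer a suggestion.
My data frame, `pdata`: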
data
# A tibble: 104 x 7
MONETIZATION_WEEK_COHORT data cut_off clv_obj model prediction training_period_metrics
<date> <list> <int> <list> <list> <list> <list>
1 2020-03-30 <tibble [214,509 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [7,328 × 3]>
2 2020-03-30 <tibble [214,509 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [7,328 × 3]>
3 2020-04-06 <tibble [496,626 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [20,060 × 3]>
4 2020-04-06 <tibble [496,626 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [20,060 × 3]>
5 2020-04-13 <tibble [595,775 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [25,816 × 3]>
6 2020-04-13 <tibble [595,775 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [25,816 × 3]>
7 2020-04-20 <tibble [548,436 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [22,161 × 3]>
8 2020-04-20 <tibble [548,436 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [22,161 × 3]>
9 2020-04-27 <tibble [529,507 × 9]> 7 <named list [2]> <named list [2]> <named list [2]> <tibble [21,113 × 3]>
10 2020-04-27 <tibble [529,507 × 9]> 8 <named list [2]> <named list [2]> <named list [2]> <tibble [21,113 × 3]>
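I am trying to join the predictions with the training-period metrics for each row. Here is a sample of these two fields, both of which are data frames:
The `.y` field in the `map2()` call below: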
pdata$prediction[[1]]$result %>% head(2) %>% glimpse
Rows: 2
Columns: 11
$ Id <chr> "123abc", "def456"
$ period.first <date> 2020-05-21, 2020-05-21
$ period.last <date> 2020-08-26, 2020-08-26
$ period.length <int> 14, 14
$ actual.x <int> 0, 0
$ actual.total.spending <dbl> 0, 0
$ PAlive <dbl> 0.72933712, 0.05683547
$ CET <dbl> 19.2692978, 0.1285307
$ DERT <dbl> 13.37550762, 0.08921192
$ predicted.mean.spending <dbl> 839.648, 1017.683
$ predicted.CLV <dbl> 11230.71800, 90.78944
The `.x` field in the `map2()` call below:
pdata$training_period_metrics[[1]] %>% head(2) %>% glimpse
Rows: 2
Columns: 3
$ S <chr> "abc123", "def456"
$ Transactions <int> 40, 3
$ Total_Spending <dbl> 14660, 1797
I am trying to join them into a new column of the data frame:
pdata %>% mutate(combined_data = map2(.x = training_period_metrics, .y = prediction, ~ inner_join(.x, .y$result, by = c('S' = 'Id'))))
Error: Problem with `mutate()` input `combined_data`.
x `x` and `y` must share the same src, set `copy` = TRUE (may be slow).
ℹ Input `combined_data` is `map2(...)`.
How can I join `prediction$result` and `training_period_metrics` inside my purrr loop?
We can use a condition to perform the join only when both `.x` and `.y` are not `NULL`, and otherwise return `NULL`:
my_mtcars %>%
  mutate(together_again = map2(.x = lhs, .y = rhs,
    ~ if (is.null(unlist(.x)) || is.null(unlist(.y))) list(NULL) else
        inner_join(.x, .y, by = c('car' = 'model'))))
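The same guard can be adapted to the shape of `pdata`. The sketch below is only an illustration under assumptions: `toy` is a hypothetical two-row stand-in for `pdata` with made-up values, and it assumes the error comes from rows where `prediction` (or its `result` element) is `NULL` rather than a data frame, which is a common way to hit this message. It first locates the offending rows, then joins only where both sides are data frames.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical stand-in for pdata: row 2 has a NULL prediction (assumed cause).
toy <- tibble(
  cut_off = c(7L, 8L),
  training_period_metrics = list(
    tibble(S = c("abc123", "def456"), Transactions = c(40L, 3L)),
    tibble(S = c("abc123", "def456"), Transactions = c(40L, 3L))
  ),
  prediction = list(
    list(result = tibble(Id = c("abc123", "def456"), PAlive = c(0.73, 0.06))),
    NULL
  )
)

# Diagnostic: which rows have a prediction whose `result` is not a data frame?
bad_rows <- which(!map_lgl(toy$prediction, ~ is.data.frame(.x$result)))

# Guarded join: only join when both sides are data frames, otherwise keep NULL.
toy_joined <- toy %>%
  mutate(combined_data = map2(
    .x = training_period_metrics, .y = prediction,
    ~ if (is.data.frame(.x) && is.data.frame(.y$result)) {
        inner_join(.x, .y$result, by = c("S" = "Id"))
      } else {
        NULL
      }
  ))
```

Checking `is.data.frame()` instead of `is.null(unlist(...))` also covers elements that are neither `NULL` nor a data frame, such as a captured error object. Running the diagnostic on the real `pdata` should show whether this assumption about the cause holds.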