R - Dplyr - Comparing values from the previous row vs. the current row

I have this data frame:

    year    month    UserID
1   2014    11        3527
2   2014    12        4916
3   2015    1         2445

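and I want to add a "Variation" column using the formula ActualRow/LastRow - 1, that is, the current row's value divided by the previous row's value, minus 1.

This is my code: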

UserID_unicos2 <- UserID_unicos1 %>%
                  mutate(variation=(UserID/lag(UserID) - 1)) %>% 
                  mutate(prev=lag(UserID))

However, it only returns:

    year    month   UserID  variation   prev
1   2014     11      3527      NA        NA
2   2014     12      4916   0.3938191   3527
3   2015      1      2445      NA        NA

As you can see, it only returns a value for 2014-12 and not for 2015-01. Why is that? Thanks.

Here is the dput() output of my data:

structure(list(year = c(2014L, 2014L, 2015L), month = c(11L, 
12L, 1L), UserID = c(3527L, 4916L, 2445L)), .Names = c("year", 
"month", "UserID"), row.names = c(NA, -3L), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), vars = list(year), drop = TRUE, indices = list(
    0:1, 2L), group_sizes = c(2L, 1L), biggest_group_size = 2L, labels = structure(list(
    year = 2014:2015), class = "data.frame", row.names = c(NA, 
-2L), .Names = "year", vars = list(year)))

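According to your dput output, your data is grouped by year (note the "grouped_df" class and vars = list(year)), which is why you see this result. lag() is evaluated within each group, and the 2015 group has only one row, so there is no previous value and you get NA. Try this: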

UserID_unicos1 %>%
  ungroup() %>%
  mutate(variation=(UserID/lag(UserID) - 1),
         prev=lag(UserID))

Also note that you can create both columns in the same mutate call by separating them with a comma.
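
To see the effect of the grouping side by side, here is a minimal, self-contained sketch. It assumes the values from the dput output above and rebuilds them as a plain data frame; the names df, grouped and ungrouped are only illustrative and do not appear in the question.

```r
library(dplyr)

# Rebuild the question's data (values copied from the dput output).
# The names df, grouped and ungrouped are hypothetical.
df <- data.frame(
  year   = c(2014L, 2014L, 2015L),
  month  = c(11L, 12L, 1L),
  UserID = c(3527L, 4916L, 2445L)
)

# Grouped by year, as in the question: lag() restarts in every group,
# so the first row of each year has no previous value.
grouped <- df %>%
  group_by(year) %>%
  mutate(prev = lag(UserID),
         variation = UserID / prev - 1)

# Shows which columns the data is currently grouped by.
group_vars(grouped)

# After ungroup(), lag() runs over the whole table, so 2015-01
# is compared with 2014-12.
ungrouped <- grouped %>%
  ungroup() %>%
  mutate(prev = lag(UserID),
         variation = UserID / prev - 1)

print(grouped)
print(ungrouped)
```

In the grouped result the 2015-01 row still has NA for variation, while in the ungrouped result it is computed from 4916 and 2445. Calling group_vars() on a data frame is a quick way to check whether a leftover group_by() from an earlier step is still in effect.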