Alternative to "lead" function in dplyr package?

Question

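I need to create a column in my data frame where the new column (next.1) holds, for each row i, the value from row i + 1 of the column current. I tried the code below with dplyr, and it did the job on a dummy dataset. However, it does not work on my original data frame. I tried detaching the dplyr package, restarting R, etc., but had no luck. Is there another way to do the same thing without using dplyr?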

month <- c(1:12)
current <- c(20:31)
df <- data.frame(month, current)
df$month <- as.factor(as.character(df$month))

library(dplyr)
df <- df %>% 
  dplyr::mutate(next.1 = lead(current, default = first(current)))

Answer 1

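Here are a few options.

Base R

You can create the new column by dropping the first entry of the current column and then appending that first entry as the last entry (you could append NA instead; I used the first entry only to match your dplyr output).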

df$next.1 <- c(df$current[-1], df$current[1])

Output

   month current next.1
1      1      20     21
2      2      21     22
3      3      22     23
4      4      23     24
5      5      24     25
6      6      25     26
7      7      26     27
8      8      27     28
9      9      28     29
10    10      29     30
11    11      30     31
12    12      31     20
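
If you need this in more than one place, the one-liner above can be wrapped in a small function. The sketch below is only an illustration: lead_base is a hypothetical name, not a function from base R or dplyr, and it assumes a non-negative shift n and a single padding value.

```r
# Hypothetical helper (not part of base R or dplyr): shift a vector forward
# by n positions and pad the tail with `default`.
lead_base <- function(x, n = 1L, default = NA) {
  stopifnot(n >= 0)
  len <- length(x)
  if (n == 0L) return(x)
  if (n >= len) return(rep(default, len))
  c(x[(n + 1L):len], rep(default, n))
}

# Same dummy data as in the question
df <- data.frame(month = 1:12, current = 20:31)
df$next.1 <- lead_base(df$current, default = df$current[1])
```

Leaving out the default argument pads with NA instead of wrapping around to the first value.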

transform (base R)

transform(df, next.1 = c(current[-1], current[1]))

shift from data.table, combined with dplyr

library(dplyr)
library(data.table)

df %>%
  dplyr::mutate(next.1 = data.table::shift(current, n = 1, fill = current[1], type = "lead"))

If you do not want the value 20 in the last row, you can replace df$current[1] with NA (or any other value) in any of the three options.
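
As a minimal sketch of that NA variant in base R, using the dummy data from the question:

```r
# Dummy data from the question
df <- data.frame(month = 1:12, current = 20:31)

# Drop the first entry and pad the end with NA instead of wrapping around
df$next.1 <- c(df$current[-1], NA)

# Inspect the last two rows to see the padded value
tail(df, 2)
```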

Answer 2

This works:

library(dplyr)
df %>% 
  mutate(
         next.1 = lead(current),
         # in case you do not want the last value to be `NA`:
         next.1 = ifelse(is.na(next.1), current + 1, next.1)
         )

Output

   month current next.1
1      1      20     21
2      2      21     22
3      3      22     23
4      4      23     24
5      5      24     25
6      6      25     26
7      7      26     27
8      8      27     28
9      9      28     29
10    10      29     30
11    11      30     31
12    12      31     32
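
Since the question asks for a way to avoid dplyr, the same idea can be sketched in base R. This is an assumed equivalent of the pipeline above, not code from the answer; nxt is just an illustrative temporary name.

```r
# Dummy data from the question
df <- data.frame(month = 1:12, current = 20:31)

# Shift by one position, padding the end with NA (what lead() does by default)
nxt <- c(df$current[-1], NA)

# Fill the trailing NA with current + 1, as in the dplyr version
df$next.1 <- ifelse(is.na(nxt), df$current + 1, nxt)
```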

Tags: iteration, for-loop, r, dplyr