Indexing subgroups in a data frame
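I'm looking for a smart way to index subcategories within a data frame.
I've created a very simple reproducible example below. How would you code the step from input to output, i.e. how would you create the color_id variable?
Many thanks for your thoughts on this!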
input <- data.frame(label = c("red", "red", "blue", "green", "green", "green", "orange"), count = c(2, 2, 1, 3, 3 ,3, 1))
output <- data.frame(label = c("red", "red", "blue", "green", "green", "green", "orange"), count = c(2, 2, 1, 3, 3 ,3, 1), color_id = c(1, 2, 1, 1, 2, 3, 1))
Regards
Using data.table:
library(data.table)
setDT(input)[ , color_id := seq_len(.N), by = label]
    label count color_id
1:    red     2        1
2:    red     2        2
3:   blue     1        1
4:  green     3        1
5:  green     3        2
6:  green     3        3
7: orange     1        1
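For what it's worth, `.N` is the number of rows in the current `by` group, so `seq_len(.N)` counts 1, 2, ... within each label. Below is a minimal sketch of the same idea using `rowid()`, which I believe is available in data.table 1.9.8 and later. The `dt` name is only for illustration and does not come from the question.

```r
library(data.table)

# Rebuild the example data as a data.table (dt is a hypothetical name)
dt <- data.table(
  label = c("red", "red", "blue", "green", "green", "green", "orange"),
  count = c(2, 2, 1, 3, 3, 3, 1)
)

# rowid() numbers the rows within each distinct value of label,
# restarting at 1 whenever a new label is first seen
dt[, color_id := rowid(label)]
```

This avoids the explicit `by =` clause, which can be handy when the counter is built inline as part of a larger expression.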
library(splitstackshape)
# getanID() adds the within-group counter as a column named .id
getanID(input, 'label')
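If you'd rather not load any packages, here is a base R sketch of the same within-group counter using `ave()`. It is an alternative to the answers above, not something they proposed, and the `df` name is mine rather than from the question.

```r
# Example data from the question (df is a hypothetical name)
df <- data.frame(
  label = c("red", "red", "blue", "green", "green", "green", "orange"),
  count = c(2, 2, 1, 3, 3, 3, 1)
)

# ave() applies seq_along() to the row positions of each label group,
# giving a counter that restarts at 1 for every label
df$color_id <- ave(seq_along(df$label), df$label, FUN = seq_along)
```

Unlike `setDT()`, this leaves the object a plain data.frame and keeps the original row order.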