How to generate a new numeric column to represent a sequence of characters in R?
My data is sdf:
s = c("aa", "bb", "cc","cc","cc","cc","aa", "bb", "cc", "bb", "bb", "bb", "bb")
sdf = data.frame(s)
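What I want to do is generate a column that counts up from 1, where the number stays the same across each run of consecutive repeated strings and only increases when the value changes. I can get the following sequence: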
sdf$wrongseq <- seq_len(nrow(sdf))
But how do I get the sequence described above:
rightseq<- c(1, 2, 3,3,3,3,4,5, 6, 7, 7, 7, 7)
sdf = cbind(sdf,rightseq)
# With data.table, rleid() returns a run id that increments whenever the value changes:
library(data.table)
rleid(sdf$s)
#[1] 1 2 3 3 3 3 4 5 6 7 7 7 7
# Base R alternative, if you would rather not load a package:
x = rle(as.character(sdf$s))$lengths  # rle() gives the length of each run of equal consecutive values
# x
# [1] 1 1 4 1 1 1 4
rep(seq_along(x), x)  # repeat each run's index by that run's length
#[1] 1 2 3 3 3 3 4 5 6 7 7 7 7
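Another base R way to get the same ids is to flag every position where the value differs from the one before it and take a cumulative sum of those flags. The following is only a minimal sketch of that idea: run_id() is a hypothetical helper name (not a base R or data.table function), and it assumes the column contains no NA values, since comparing against NA would propagate NA through cumsum().

```r
# Example data from the question
s <- c("aa", "bb", "cc", "cc", "cc", "cc", "aa", "bb", "cc", "bb", "bb", "bb", "bb")
sdf <- data.frame(s, stringsAsFactors = FALSE)

# run_id() is a hypothetical helper name used for illustration only.
# Assumes x has no NA values.
run_id <- function(x) {
  x <- as.character(x)
  n <- length(x)
  if (n == 0L) return(integer(0))
  # TRUE wherever an element differs from its predecessor;
  # the first element always starts a new run.
  changed <- c(TRUE, x[-1L] != x[-n])
  # Each TRUE bumps the counter, so equal neighbours share an id.
  cumsum(changed)
}

sdf$rightseq <- run_id(sdf$s)
sdf$rightseq
#[1] 1 2 3 3 3 3 4 5 6 7 7 7 7
```

This assigns the result straight into the data frame, so no cbind() step is needed. Note that the ids follow runs, not distinct values: the second "aa" gets 4 rather than 1, which is what the question asks for.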