How to manipulate substrings in one column based on indices in another column

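I want to manipulate substrings in one column based on the indices of those substrings, which are stored in another column of the data frame:

Data: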

df_test
                               Turn                              c5                              Turns_split
1 we 're not gon na know the person PNP VBB XX0 VVG TO0 VVI AT0 NN1 we, 're, not, gon, na, know, the, person
2                      great answer                         AJ0 NN1                            great, answer
3                 it 's gon na rain             PNP VBZ VVG TO0 VVI                    it, 's, gon, na, rain
                                c5_split Index
1 PNP, VBB, XX0, VVG, TO0, VVI, AT0, NN1     4
2                               AJ0, NN1      
3                PNP, VBZ, VVG, TO0, VVI     3

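The indices (values 4 and 3) are stored in the column Index; the substrings I want to manipulate are stored in c5, which contains part-of-speech tags. The manipulation concerns two substrings in c5: (i) the substring whose position matches the value in Index and (ii) the substring that follows it, i.e., the one at position Index + 1. The operation I want to perform is to replace the whitespace between these two substrings with the = character. So the desired output in c5 is this: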

df_test$c5
"PNP VBB XX0 VVG=TO0 VVI AT0 NN1" "AJ0 NN1"                         "PNP VBZ VVG=TO0 VVI"

I really don't know how to go about this, so any guidance would be much appreciated.

Reproducible data:

df_test <- structure(list(Turn = c("we 're not gon na know the person", 
"great answer", "it 's gon na rain"), c5 = c("PNP VBB XX0 VVG TO0 VVI AT0 NN1", 
"AJ0 NN1", "PNP VBZ VVG TO0 VVI"), Turns_split = list(c("we", 
"'re", "not", "gon", "na", "know", "the", "person"), c("great", 
"answer"), c("it", "'s", "gon", "na", "rain")), c5_split = list(
    c("PNP", "VBB", "XX0", "VVG", "TO0", "VVI", "AT0", "NN1"), 
    c("AJ0", "NN1"), c("PNP", "VBZ", "VVG", "TO0", "VVI")), Index = list(
    4L, integer(0), 3L)), row.names = c(NA, -3L), class = "data.frame")

Try this:

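for (i in seq_len(nrow(df_test))) {
  idx <- df_test$Index[[i]]
  # Skip rows that have no index (integer(0))
  if (length(idx) == 0) next
  # Split the tag string into individual tags
  s <- unlist(strsplit(df_test$c5[i], split = " "))
  # Join the tag at idx and the tag after it with "="
  s[idx] <- paste0(s[idx], "=", s[idx + 1])
  # Drop the tag that was merged in, then rebuild the string
  df_test$c5[i] <- paste(s[-(idx + 1)], collapse = " ")
}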
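
The same idea can also be sketched without an explicit loop. The helper `join_at_index` below is a hypothetical name, not part of the original answer, and it assumes each row has at most one index. It also leaves the string unchanged when the index points at the last tag, a case the question does not cover. The vectors `c5` and `index` are copies of the question's columns so the block runs on its own:

```r
# Hypothetical helper: join the tag at position idx with the next tag using "=".
join_at_index <- function(tags, idx) {
  # Assumption: no index, or no following tag, means nothing to join
  if (length(idx) == 0 || is.na(idx) || idx < 1 || idx >= length(tags)) {
    return(paste(tags, collapse = " "))
  }
  tags[idx] <- paste0(tags[idx], "=", tags[idx + 1])
  # Remove the tag that was merged into position idx
  paste(tags[-(idx + 1)], collapse = " ")
}

# Copies of the c5 and Index columns from the question
c5 <- c("PNP VBB XX0 VVG TO0 VVI AT0 NN1", "AJ0 NN1", "PNP VBZ VVG TO0 VVI")
index <- list(4L, integer(0), 3L)

# Apply the helper to each (tags, index) pair
result <- mapply(join_at_index, strsplit(c5, " "), index, USE.NAMES = FALSE)
print(result)
```

With the data frame from the question, the same call would be `df_test$c5 <- mapply(join_at_index, strsplit(df_test$c5, " "), df_test$Index, USE.NAMES = FALSE)`, run on the unmodified c5 column.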