Using gsub on all values in a data frame with strings
Suppose I have a data frame with values like this:
df<- c("One", "Two Three", "Four", "Five")
df<-data.frame(df)
df
"One"
"Two Three"
"Four"
"Five"
And I have another data frame, for example:
df2<-c("the park was number one", "I think the park was number two three", "Nah one and two is ok", "tell me about four and five")
df2<-data.frame(df2)
df2
the park was number one
I think the park was number two three
Nah one and two is ok
tell me about four and five
If one of the values in df is found in any of the strings in df2[1], how can I replace it with a word such as "it"?
I would like to end up with this in place of my df2:
df3
the park was number it
I think the park was number it
Nah it and two is ok
tell me about it and it
I know it probably has something to do with:
gsub(df,"it", df2)
but I don't think that's right.
Thanks!
You could do this:
sapply(df$df,function(w) df2$df2 <<- gsub(paste0(w,"|",tolower(w)),"it",df2$df2))
df2
df2
1 the park was number it
2 I think the park was number it
3 Nah it and two is ok
4 tell me about it and it
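The <<- operator makes the function assign to the existing df2 in the enclosing environment (here, the global environment) rather than to a local copy, so each pass of sapply builds on the previous replacement. paste0(w,"|",tolower(w)) builds a pattern that matches either the original value or its lower-case form, which covers the differences in capitalization in your example.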
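If you would rather not rely on <<-, one possible alternative is to join all the values into a single pattern and call gsub once with ignore.case = TRUE. This is only a sketch: the names words, texts, pattern and result are made up for illustration, and it assumes the values contain no regular-expression metacharacters.

```r
# Hypothetical stand-ins for df$df and df2$df2 from the question
words <- c("One", "Two Three", "Four", "Five")
texts <- c("the park was number one",
           "I think the park was number two three",
           "Nah one and two is ok",
           "tell me about four and five")

# Join the values into one alternation: "One|Two Three|Four|Five"
pattern <- paste(words, collapse = "|")

# ignore.case = TRUE matches "One" and "one" alike, so tolower() is not needed
result <- gsub(pattern, "it", texts, ignore.case = TRUE)
result
```

Like the sapply version, this matches substrings, so a value such as "one" would also be replaced inside a longer word like "someone".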
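Note that you should add stringsAsFactors=FALSE to the data frame definitions in the question, so that the columns are character vectors rather than factors. (From R 4.0.0 onward this is already the default.)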
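For reference, the data frames from the question could be defined with stringsAsFactors = FALSE roughly like this. It is a sketch that builds each data frame in a single call instead of the two steps used in the question; the is.character checks are only there to show the effect.

```r
# Build the question's data frames with character columns, not factors
df <- data.frame(df = c("One", "Two Three", "Four", "Five"),
                 stringsAsFactors = FALSE)
df2 <- data.frame(df2 = c("the park was number one",
                          "I think the park was number two three",
                          "Nah one and two is ok",
                          "tell me about four and five"),
                  stringsAsFactors = FALSE)

# Both columns are plain character vectors
is.character(df$df)
is.character(df2$df2)
```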